2. natim-blog


Rémy HUBSCHER  committed 9a01ce9

Add draft circus clustering management on natim.ionyse.com

  • Parent commits 0c022d1
  • Branches default

Files changed (1)

File src/circus_clustering_management.rst

+Circus clustering management
+:date: 2012-09-16 15:28
+:tags: python, circus
+:category: Python
+:author: Rémy Hubscher
+:lang: en
+:status: draft
+During PyConFr 2012, `Jonathan Dorival`_, `Mathieu
+Agopian`_ and `I <me_>`_ spent two days sprinting on `Circus`_.
+Circus is a process & socket manager. It can be used to monitor and
+control processes and sockets.
+At Novapost, we usually launch processes on different (virtual) machines.
+So we wanted `Circus`_ to manage processes launched on different servers.
+.. _`Circus`: http://docs.circus.io/
+.. _`Jonathan Dorival`: http://github.com/jojax/
+.. _`Mathieu Agopian`: http://github.com/magopian/
+.. _`me`: http://github.com/natim/
+We had the chance to discuss this with `Tarek Ziadé`_ and `Alexis
+Metaireau`_ at PyConFr 2012. They have the same needs at Mozilla, so
+we seized the opportunity and brainstormed about our needs. We came to
+the following conclusions:
+.. _`Tarek Ziadé`: http://ziade.org/
+.. _`Alexis Metaireau`: http://blog.notmyidea.org/
+We want
+* A single interface, called ``circusmeta``, to manage processes on
+  different ``circusd`` nodes
+* To manage a single ``circusd`` node or a pool of ``circusd`` nodes
+* To run a new ``circusd`` and automatically be able to manage it
+* To add a new worker on a specific circusd node
+* To add a new worker on a service and let ``circusmeta`` choose
+  which node will start it
+* To have global statistics about the cluster and use them in plugins
+* To run a command on a specific node or on every node
+We don't want
+* To start a new virtual machine
+* To register watchers on an empty ``circusd``
+So after this brainstorming we ended up with this implementation roadmap:
+* Have a default name for the ``circusd`` server but also be able to
+  rename it with the configuration and with a ``circusctl`` command.
+* Modify the ``stats_endpoint`` protocol to prefix stats with the
+  unique name of the ``circusd`` node.
+* Create a socket on ``circusmeta`` that will aggregate every
+  ``circusd stats_endpoint`` into a single socket, based on the pool
+  configuration.
+* Adapt existing circus tools (circus-top, circushttpd) to manage
+  ``circusd`` nodes.
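The prefixing and aggregation steps of the roadmap could look roughly like the sketch below. It is only an illustration: the ``stat.<watcher>.<pid>`` topic layout, the node names and the helper names ``prefix_topic`` and ``aggregate`` are assumptions made for the example, not the actual circus protocol or API, and plain Python data stands in for the real sockets.

```python
def prefix_topic(node_name, topic):
    """Prepend the (hypothetical) node name to a stats topic."""
    return "{}.{}".format(node_name, topic)


def aggregate(streams):
    """Merge per-node stats into a single, prefixed list of messages.

    `streams` maps a node name to an iterable of (topic, payload) pairs,
    standing in for what each node's stats socket would publish.
    """
    merged = []
    # Sort node names so the output order is deterministic.
    for node_name in sorted(streams):
        for topic, payload in streams[node_name]:
            merged.append((prefix_topic(node_name, topic), payload))
    return merged


if __name__ == "__main__":
    pool_stats = {
        "node1": [("stat.web.1234", {"cpu": 1.5})],
        "node2": [("stat.web.4321", {"cpu": 0.5})],
    }
    for topic, payload in aggregate(pool_stats):
        print(topic, payload)
```

Because every message carries the node name, a tool reading the aggregated stream can still tell which ``circusd`` a stat came from.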
+A word about circusmeta
+With that in mind, ``circusmeta`` doesn't need to be a ``server``. It is
+just a tool which will manage a pool of nodes by connecting to each node's
+sockets (stats, endpoint and pubsub).
+So ``circusmeta`` only needs to be running while the pool is being
+accessed, that is, while we use a circus tool on the pool.
+``circusmeta`` will be configured with the list of servers and some information about the strategy it will use when adding watchers on the pool.
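As a rough illustration of such a configuration, the sketch below models a pool as a list of nodes plus a placement strategy. Everything in it is hypothetical: the ``Node`` and ``Pool`` classes, the ``least_loaded`` strategy and the endpoint addresses are assumptions for the example, not part of circus.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    name: str
    endpoint: str  # hypothetical address of the node's circusd endpoint
    watchers: List[str] = field(default_factory=list)


def least_loaded(nodes):
    """Example strategy: pick the node with the fewest watchers (ties by name)."""
    return min(nodes, key=lambda node: (len(node.watchers), node.name))


@dataclass
class Pool:
    nodes: List[Node]
    strategy: Callable = least_loaded

    def add_watcher(self, watcher_name: str, node_name: Optional[str] = None) -> str:
        """Add a watcher on a given node, or let the strategy choose one."""
        if node_name is None:
            node = self.strategy(self.nodes)
        else:
            node = next(n for n in self.nodes if n.name == node_name)
        node.watchers.append(watcher_name)
        return node.name


if __name__ == "__main__":
    pool = Pool([
        Node("node1", "tcp://10.0.0.1:5555"),
        Node("node2", "tcp://10.0.0.2:5555"),
    ])
    print(pool.add_watcher("web"))                        # strategy decides
    print(pool.add_watcher("worker", node_name="node2"))  # explicit node
```

Keeping the strategy as a plain callable means a different placement policy can be swapped in without touching the pool itself.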
+This proposal doesn't change the
+core of ``circus``: there is no master/slave setup or complex architecture
+to configure or understand.
+The only change is that each stat message needs to be identified with
+the node name, so that the same command works for a
+single server or for a pool behind ``circusmeta``.
+Most of the codebase is also already there; we just need some code to take
+one step back and manage a list of nodes in the ``circus`` tools.
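On the consuming side, a tool could then treat a single server as a pool of one node. The sketch below groups stat topics by their node prefix; the ``<node>.stat.<watcher>.<pid>`` layout and the ``group_by_node`` helper are assumptions for illustration only.

```python
from collections import defaultdict


def group_by_node(topics):
    """Group stat topics by the node name that prefixes them.

    Each topic is assumed to look like '<node>.stat.<watcher>.<pid>'.
    """
    grouped = defaultdict(list)
    for topic in topics:
        # Split only on the first dot: the rest is the original topic.
        node_name, _, rest = topic.partition(".")
        grouped[node_name].append(rest)
    return dict(grouped)


if __name__ == "__main__":
    # A single server and a pool go through the same code path.
    print(group_by_node(["node1.stat.web.1234"]))
    print(group_by_node(["node1.stat.web.1234", "node2.stat.web.4321"]))
```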