Supervisor: A System for Allowing the Control of Process State on UNIX

History

7/3/2006: updated for version 2.0
8/30/2006: updated for version 2.1
3/31/2007: updated for version 2.2
8/15/2007: updated for version 3.0

Introduction

The supervisor is a client/server system that allows its users to
control a number of processes on UNIX-like operating systems. It
was inspired by the following:

- It is often inconvenient to need to write "rc.d" scripts for
  every single process instance. rc.d scripts are a great
  lowest-common-denominator form of process
  initialization/autostart/management, but they can be painful to
  write and maintain. Additionally, rc.d scripts cannot
  automatically restart a crashed process and many programs do not
  restart themselves properly on a crash. Supervisord starts
  processes as its subprocesses, and can be configured to
  automatically restart them on a crash. It can also be configured
  to start processes automatically on its own invocation.

- It's often difficult to get accurate up/down status on processes
  on UNIX. Pidfiles often lie. Supervisord starts processes as
  subprocesses, so it always knows the true up/down status of its
  children and can be queried conveniently for this data.

- Users who need to control process state often need only to do
  that. They don't want or need full-blown shell access to the
  machine on which the processes are running. Supervisorctl allows
  a very limited form of access to the machine, essentially
  allowing users to see process status and control
  supervisord-controlled subprocesses by emitting "stop", "start",
  and "restart" commands from a simple shell or web UI.

- Users often need to control processes on many machines.
  Supervisor provides a simple, secure, and uniform mechanism for
  interactively and automatically controlling processes on groups
  of machines.

- Processes which listen on "low" TCP ports often need to be
  started and restarted as the root user (a UNIX misfeature). It's
  usually the case that it's perfectly fine to allow "normal"
  people to stop or restart such a process, but providing them with
  shell access is often impractical, and providing them with root
  access or sudo access is often impossible. It's also (rightly)
  difficult to explain to them why this problem exists. If
  supervisord is started as root, it is possible to allow "normal"
  users to control such processes without needing to explain the
  intricacies of the problem to them.

- Processes often need to be started and stopped in groups,
  sometimes even in a "priority order". It's often difficult to
  explain to people how to do this. Supervisor allows you to
  assign priorities to processes, and allows users to emit commands
  via the supervisorctl client like "start all" and "restart all",
  which start them in the preassigned priority order.

Supported Platforms

Supervisor has been tested and is known to run on Linux (Fedora Core
5, Ubuntu 6), Mac OS X (10.4), Solaris (10 for Intel), and
FreeBSD 6.1. It will likely work fine on most UNIX systems.

Supervisor will not run at all under any version of Windows.

Supervisor requires Python 2.3 or later.

Installing

Run "python setup.py install". This will download and install all
distributions depended upon by supervisor and finally install
supervisor itself. Once that's done, copy the "sample.conf" file
you'll find in the same directory as this file to
/etc/supervisord.conf and modify it to your liking. If you'd rather
not put the supervisord.conf file in /etc, you can place it anywhere
and point supervisord at it via the -c flag when you start it,
e.g. "python supervisord.py -c /path/to/sample.conf" or, if
you use the shell script named "supervisord", "supervisord -c
/path/to/sample.conf".

I make reference below to a "$BINDIR" when explaining how to run
supervisord and supervisorctl. This is the "bindir" directory that
your Python installation has been configured with. For example, for
an installation of Python installed via "./configure
--prefix=/usr/local/python; make; make install", $BINDIR would be
"/usr/local/python/bin". Python interpreters on different platforms
use different $BINDIRs. Look at the output of "setup.py install" if
you can't figure out where yours is.

Installing Without Internet Access

Since "setup.py install" performs downloads of dependent software,
it will not work on machines without internet access. To install to
a machine which is not internet-connected, obtain the following
dependencies on a machine which is internet-connected:

- setuptools (latest) from http://pypi.python.org/pypi/setuptools

- meld3 (0.6) from http://www.plope.com/software/meld3/

- medusa (0.5.4) from http://www.amk.ca/python/code/medusa.html

- elementtree (1.2.6) from http://effbot.org/downloads#elementtree

Then copy these files to removable media and put them on the
target machine. Install each onto the target machine as per its
instructions.

*Note* -- if the machine you're installing on does not have a C
compiler, meld3's "setup.py install" probably won't work because
meld3 uses C extensions. You can either copy the meld3/meld3
directory into your Python's site-packages directory, or build a
binary distribution for your platform on a similar machine that
does have a C compiler (by running "python setup.py bdist") and
ship that over.

Finally, run supervisor's "python setup.py install".

Running Supervisord

To start supervisord, run $BINDIR/supervisord. The resulting
process will daemonize itself and detach from the terminal. It
keeps an operations log at "/tmp/supervisor.log" by default.

You can start supervisord in the foreground by passing the "-n" flag
on its command line. This is useful to debug startup problems.

To change the set of programs controlled by supervisord, edit the
supervisord.conf file and then send the supervisord process a HUP
signal ("kill -HUP") or otherwise restart it. The sample file has
several example program definitions.

Supervisord accepts a number of command-line overrides. Type
'supervisord -h' for an overview.

Running Supervisorctl

To start supervisorctl, run $BINDIR/supervisorctl. A shell will
be presented that will allow you to control the processes that are
currently managed by supervisord. Type "help" at the prompt to get
information about the supported commands.

supervisorctl may also be used for "one time" commands by passing
the command as arguments on the command line, for example
"supervisorctl stop all". When arguments are present, the
interactive shell is not started. Instead, the command is executed
and supervisorctl exits.

If supervisorctl is invoked in interactive mode against a
supervisord that requires authentication, you will be asked for
authentication credentials.

Components

Supervisord

The server piece of the supervisor is named "supervisord". It is
responsible for responding to commands from the client process as
well as restarting crashed or exited processes. It is meant to be
run as the root user in most production setups. NOTE: see
"Security Notes" at the end of this document for caveats!

The server process uses a configuration file. This is typically
located in "/etc/supervisord.conf". This configuration file is a
"Windows-INI" style config file. It is important to keep this
file secure via proper filesystem permissions because it may
contain unencrypted usernames and passwords.

Supervisorctl

The command-line client piece of the supervisor is named
"supervisorctl". It provides a shell-like interface to the
features provided by supervisord. From supervisorctl, a user can
connect to different supervisord processes, get status on the
subprocesses controlled by a supervisord, stop and start
subprocesses of a supervisord, and get lists of running processes
of a supervisord.

The command-line client talks to the server across a UNIX domain
socket or an Internet socket. The server can require that the user
of a client present authentication credentials before it allows
them to run commands. The client process may use the same
configuration file as the server; any configuration file with
a [supervisorctl] section in it will work.

Web Server

A (sparse) web user interface with functionality comparable to
supervisorctl may be accessed via a browser if you start
supervisord against an internet socket. After setting the
configuration file's 'http_port' parameter appropriately, visit the
server URL (e.g. http://localhost:9001/) to view and control
process status through the web interface.

XML-RPC Interface

The same HTTP server which serves the web UI serves up an XML-RPC
interface that can be used to interrogate and control supervisor
and the programs it runs. To use the XML-RPC interface, connect
to supervisor's http port with any XML-RPC client library and run
commands against it. An example of doing this using Python's
xmlrpclib client library::

    import xmlrpclib
    server = xmlrpclib.Server('http://localhost:9001')

Call methods against the supervisor and its subprocesses by using
the 'supervisor' namespace::

    server.supervisor.getState()

You can get a list of methods supported by supervisor's XML-RPC
interface by using the XML-RPC 'system.listMethods' API::

    server.system.listMethods()

You can see help on a method by using the 'system.methodHelp' API
against the method::

    print server.system.methodHelp('supervisor.shutdown')

Supervisor's XML-RPC interface also supports the nascent XML-RPC
multicall API described at
http://www.xmlrpc.com/discuss/msgReader$1208.
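
To make the single-call and multicall forms concrete, here is a
minimal sketch that only builds the request bodies an XML-RPC client
would send, so it runs without a supervisord listening. It uses
Python 3's "xmlrpc.client" module (the module named "xmlrpclib" in
Python 2). The helper names "build_request" and
"build_multicall_request" are made up for this illustration and are
not part of supervisor.

```python
import xmlrpc.client  # this module is called "xmlrpclib" in Python 2


def build_request(method, *params):
    """Return the XML body a client would POST for a single call."""
    return xmlrpc.client.dumps(tuple(params), methodname=method)


def build_multicall_request(calls):
    """Bundle (method, params) pairs into one system.multicall request."""
    batch = [{"methodName": name, "params": list(params)}
             for name, params in calls]
    return xmlrpc.client.dumps((batch,), methodname="system.multicall")


single = build_request("supervisor.getState")
multi = build_multicall_request([
    ("supervisor.getState", ()),
    ("system.methodHelp", ("supervisor.shutdown",)),
])

# Decode the multicall body again to inspect what a server would receive.
decoded_params, decoded_method = xmlrpc.client.loads(multi)
print(decoded_method)
print(decoded_params[0])
```

Against a live server the same batching is usually done with
"xmlrpc.client.MultiCall", which collects calls and sends them as one
'system.multicall' request.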

You can extend supervisor functionality with new XML-RPC API
methods by adding new top-level RPC interfaces as necessary. See
"Configuration File '[rpcinterface:x]' Section Settings" in this
file.

Configuration File '[supervisord]' Section Settings

The supervisord.conf file contains a section named
'[supervisord]' in which global settings for the supervisord process
should be inserted. These are:

'http_port' -- Either a TCP host:port value (e.g. 127.0.0.1:9001)
or a path to a UNIX domain socket (e.g. /tmp/supervisord.sock) on
which supervisor will listen for HTTP/XML-RPC requests.
Supervisorctl itself uses XML-RPC to communicate with supervisord
over this port.

'sockchmod' -- Change the UNIX permission mode bits of the http_port
UNIX domain socket to this value (ignored if using a TCP socket).
Default: 0700.

'sockchown' -- Change the user and group of the socket file to this
value. May be a username (e.g. chrism) or a username and group
separated by a dot (e.g. chrism.wheel). Default: do not change.

'umask' -- The umask of the supervisord process. Default: 022.

'logfile' -- The path to the activity log of the supervisord process.

'logfile_maxbytes' -- The maximum number of bytes that may be
consumed by the activity log file before it is rotated (suffix
multipliers like "KB", "MB", and "GB" can be used in the value).
Set this value to 0 to indicate an unlimited log size. Default:
50MB.
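
As an illustration of the suffix multipliers, the sketch below turns
a value such as "50MB" into a byte count. It assumes the suffixes are
powers of 1024 and case-insensitive, which this document does not
state; "parse_byte_size" is a name invented for the example, not
supervisor's own parser.

```python
import re

# Assumption for this sketch: suffixes are binary multiples (1 KB = 1024 bytes).
_MULTIPLIERS = {"": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_byte_size(value):
    """Convert a string like '50MB', '10kb' or '0' into a number of bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*(KB|MB|GB)?\s*", value, re.IGNORECASE)
    if match is None:
        raise ValueError("not a byte size: %r" % (value,))
    number, suffix = match.groups()
    return int(number) * _MULTIPLIERS[(suffix or "").upper()]


print(parse_byte_size("50MB"))
print(parse_byte_size("0"))  # 0 means "unlimited" for logfile_maxbytes
```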

'logfile_backups' -- The number of backups to keep around resulting
from activity log file rotation. Set this to 0 to indicate an
unlimited number of backups. Default: 10.

'loglevel' -- The logging level, dictating what is written to the
activity log. One of 'critical', 'error', 'warn', 'info', 'debug'
or 'trace'. Note that at log level 'trace', the supervisord log
file will record the stderr/stdout output of its child processes,
which is useful for debugging. Default: info.

'pidfile' -- The location in which supervisord keeps its pid file.

'nodaemon' -- If true, supervisord will start in the foreground
instead of daemonizing. Default: false.

'minfds' -- The minimum number of file descriptors that must be
available before supervisord will start successfully. Default:
1024.

'minprocs' -- The minimum number of process descriptors that must be
available before supervisord will start successfully. Default: 200.

'nocleanup' -- Prevent supervisord from clearing any existing "AUTO"
log files at startup time. Default: false.

'http_username' -- The username required for authentication to our
HTTP server. Default: none.

'http_password' -- The password required for authentication to our
HTTP server. Default: none.

'childlogdir' -- The directory used for AUTO log files. Default:
value of Python's tempfile.gettempdir().

'user' -- If supervisord is run as root, switch users to this UNIX
user account before doing any meaningful processing. This value has
no effect if supervisord is not run as root. Default: do not switch
users.

'directory' -- When supervisord daemonizes, switch to this
directory. Default: do not cd.

'strip_ansi' -- Strip all ANSI escape sequences from process log
files.

'environment' -- A list of key/value pairs in the form
"KEY=val,KEY2=val2" that will be placed in the supervisord process'
environment (and as a result in all of its child processes'
environments). Default: none. **Note** that subprocesses will
inherit the environment variables of the shell used to start
"supervisord" except for the ones overridden here and within the
program's "environment" configuration stanza. See "Subprocess
Environment" below.

'identifier' -- The identifier for this supervisor server, used by
the RPC interface. Default: 'supervisor'.
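
The "KEY=val,KEY2=val2" form used by the 'environment' setting above
can be read as a simple mapping. The sketch below is one way to parse
it, assuming values contain neither commas nor quoting (this document
does not say how such values are handled); "parse_environment" is a
hypothetical helper, not supervisor's own code.

```python
def parse_environment(value):
    """Parse 'KEY=val,KEY2=val2' into a dict.

    Assumption: values contain no commas and no quoting rules apply.
    """
    env = {}
    for pair in value.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing comma or an empty setting
        key, sep, val = pair.partition("=")
        if not sep or not key.strip():
            raise ValueError("expected KEY=val, got %r" % (pair,))
        env[key.strip()] = val.strip()
    return env


print(parse_environment("KEY=val,KEY2=val2"))
```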

Configuration File '[supervisorctl]' Section Settings

The configuration file may contain settings for the supervisorctl
interactive shell program. These options are listed below.

'serverurl' -- The URL that should be used to access the supervisord
server, e.g. "http://localhost:9001". For UNIX domain sockets, use
"unix:///absolute/path/to/file.sock".

'username' -- The username to pass to the supervisord server for use
in authentication (should be the same as 'http_username' in the
supervisord config). Optional.

'password' -- The password to pass to the supervisord server for use
in authentication (should be the same as 'http_password' in the
supervisord config). Optional.

'prompt' -- String used as supervisorctl prompt. Default: supervisor.

Configuration File '[program:x]' Section Settings

The configuration file must contain one or more 'program' sections
in order for supervisord to know which programs it should start and
control. A sample program section has the following structure, the
options of which are described below it::

    [program:foo]
    command=/path/to/foo
    process_name = %(program_name)s
    numprocs=1
    priority=1
    autostart=true
    autorestart=true
    startsecs=1
    startretries=3
    exitcodes=0,2
    stopsignal=TERM
    stopwaitsecs=10
    user=nobody
    redirect_stderr=false
    stdout_logfile=AUTO
    stdout_logfile_maxbytes=50MB
    stdout_logfile_backups=10
    stdout_capturefile=AUTO
    stderr_logfile=AUTO
    stderr_logfile_maxbytes=50MB
    stderr_logfile_backups=10
    stderr_capturefile=AUTO
    environment=A=1,B=2

'[program:foo]' -- The section header, required for each program.
The part after the colon ('foo' here) is an arbitrary descriptive
name for the program being run. It must not include a colon
character or a bracket character.
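
To show how the 'program:x' section headers map to program names,
here is a small sketch that reads such a file with Python 3's
standard "configparser" module. This is an illustration only and not
how supervisord itself loads its configuration. Interpolation is
switched off so that values like "%(program_name)s" are returned
untouched; the second program, "bar", is invented for the example.

```python
import configparser

SAMPLE = """\
[supervisord]
http_port=/tmp/supervisord.sock

[program:foo]
command=/path/to/foo
process_name=%(program_name)s
priority=1

[program:bar]
command=/path/to/bar
"""


def program_sections(text):
    """Return {program name: {option: raw value}} for every [program:x]."""
    # interpolation=None keeps "%(...)s" expressions as plain text.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    programs = {}
    for section in parser.sections():
        kind, _, name = section.partition(":")
        if kind == "program" and name:
            programs[name] = dict(parser[section])
    return programs


programs = program_sections(SAMPLE)
for name in sorted(programs):
    print(name, programs[name]["command"])
```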

'command' -- The command that will be run when this program is
started. The command can be either absolute
(e.g. '/path/to/programname') or relative ('programname'). If it is
relative, the PATH will be searched for the executable. Programs
can accept arguments (e.g. '/path/to/program foo bar'). The
command line can use double quotes to group arguments with spaces
in them to pass to the program (e.g. '/path/to/program/name -p "foo
bar"'). Note that the value of 'command' may include Python string
expressions, e.g. "/path/to/programname --port=80%(process_num)02d"
might expand to "/path/to/programname --port=8000" at runtime.
String expressions are evaluated against a dictionary containing
"group_name", "process_num" and "program_name". **Controlled
programs should themselves not be daemons, as supervisord assumes it
is responsible for daemonizing its subprocesses (see "Nondaemonizing
of Subprocesses" later in this document).**

'process_name' -- A Python string expression that is used to compose
the supervisor process name for this process. You usually don't
need to worry about setting this unless you change 'numprocs'. The
string expression is evaluated against a dictionary that includes
"group_name", "process_num" and "program_name". Default:
%(program_name)s.

'numprocs' -- Supervisor will start as many instances of this
program as named by numprocs. Note that if numprocs > 1, the
'process_name' expression must include '%(process_num)s' (or any
other valid Python string expression that includes 'process_num')
within it. Default: 1.
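
The interaction between 'numprocs' and 'process_name' can be sketched
with ordinary Python "%" string formatting, which is also how the
'command' expressions above expand. The sketch assumes process
numbers start at 0 (suggested by the "--port=8000" example but not
stated outright) and that the group name defaults to the program
name; "process_names" is a hypothetical helper.

```python
def process_names(program_name, numprocs,
                  process_name="%(program_name)s", group_name=None):
    """Expand the process_name expression once per process instance."""
    if numprocs > 1 and "%(process_num)" not in process_name:
        raise ValueError(
            "process_name must include process_num when numprocs > 1")
    names = []
    for process_num in range(numprocs):  # assumption: numbering starts at 0
        names.append(process_name % {
            "group_name": group_name or program_name,  # assumed default
            "program_name": program_name,
            "process_num": process_num,
        })
    return names


print(process_names("foo", 3, "%(program_name)s_%(process_num)02d"))
```

Without 'process_num' in the expression, every instance would expand
to the same name, which is why the requirement exists.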

'priority' -- The relative priority of the program in the start and
shutdown ordering. Lower priorities indicate programs that start
first and shut down last, both at supervisord startup and when
aggregate commands are used in various clients (e.g. "start
all"/"stop all"). Higher priorities indicate programs that start
last and shut down first. Default: 999.
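
The ordering rule can be sketched as a sort on the priority value,
with the stop order being the reverse of the start order. The program
names and priorities below are illustrative, and how supervisor
orders programs that share a priority is not specified here, so the
sketch simply keeps their input order.

```python
def start_order(priorities):
    """Lowest priority value starts first (ties keep input order)."""
    ordered = sorted(priorities.items(), key=lambda item: item[1])
    return [name for name, _ in ordered]


def stop_order(priorities):
    """Programs shut down in the reverse of their start order."""
    return list(reversed(start_order(priorities)))


programs = {"zope1": 2, "zeo": 1, "apache": 999}
print(start_order(programs))
print(stop_order(programs))
```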

'autostart' -- If true, this program will start automatically when
supervisord is started. Default: true.

'autorestart' -- If true, when the program exits (either expectedly
or unexpectedly), supervisor will restart it automatically.
Default: true.

'startsecs' -- The total number of seconds which the program needs
to stay running after a startup to consider the start successful.
If the program does not stay up for this many seconds after it is
started, even if it exits with an "expected" exit code (see
"exitcodes"), the startup will be considered a failure. Set to 0
to indicate that the program needn't stay running for any particular
amount of time. Default: 1.

'startretries' -- The number of consecutive failed start attempts
that supervisord will allow before giving up and putting the process
into the FATAL state (see "Process States"). Default: 3.

'exitcodes' -- The list of 'expected' exit codes for this program.
Supervisor log messages will note if the program exits with an exit
code which is not in this list and a stop of the program has not
been explicitly requested. Default: 0,2.

'stopsignal' -- The signal used to kill the program when a stop is
requested. This can be any of TERM, HUP, INT, QUIT, KILL, USR1, or
USR2. Default: TERM.

'stopwaitsecs' -- The number of seconds to wait for supervisord to
receive a SIGCHLD for the program after the program has been sent
its stopsignal. If this number of seconds elapses before
supervisord receives a SIGCHLD for the process, supervisord will
attempt to kill it with a final SIGKILL. Default: 10.
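
The stop sequence described by 'stopsignal' and 'stopwaitsecs' (send
the signal, wait, then fall back to SIGKILL) can be sketched from an
ordinary parent process with the standard "subprocess" module
(Python 3.3 or later for the "timeout" argument). This is a
simplified, blocking illustration and not supervisord's own
implementation; "stop_process" is a name made up for the example.

```python
import signal
import subprocess
import sys


def stop_process(proc, stopsignal=signal.SIGTERM, stopwaitsecs=10):
    """Send stopsignal, wait up to stopwaitsecs, then escalate to SIGKILL."""
    proc.send_signal(stopsignal)
    try:
        return proc.wait(timeout=stopwaitsecs)
    except subprocess.TimeoutExpired:
        proc.kill()  # final SIGKILL, which the child cannot ignore
        return proc.wait()


# A stand-in child that would otherwise run for a minute.
child = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(60)"])
returncode = stop_process(child, stopwaitsecs=5)
print(returncode)  # a negative value is the number of the terminating signal
```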

'user' -- If supervisord is running as root, this UNIX user account
will be used as the account which runs the program. If supervisord
is not running as root, this option has no effect. Default: do not
switch users.

'redirect_stderr' -- If true, cause the process' stderr output to be
sent back to supervisor on its stdout file descriptor (in UNIX
shell terms, this is the equivalent of executing "/the/program
2>&1"). Default: false.

'stdout_logfile' -- Put process stdout output in this file (and if
redirect_stderr is true, also place stderr output in this file). If
'stdout_logfile' is unset or set to 'AUTO', supervisor will
automatically choose a file location. If this is set to 'NONE',
supervisord will create no log file. AUTO log files and their
backups will be deleted when supervisord restarts. Default: AUTO.

'stdout_logfile_maxbytes' -- The maximum number of bytes that may be
consumed by stdout_logfile before it is rotated (suffix multipliers
like "KB", "MB", and "GB" can be used in the value). Set this value
to 0 to indicate an unlimited log size. Default: 50MB.

'stdout_logfile_backups' -- The number of stdout_logfile backups to
keep around resulting from process stdout log file rotation. Set
this to 0 to indicate an unlimited number of backups. Default: 10.

'stdout_capturefile' -- The file written to when the process is in
"stdout capture mode" (see "Capture Mode and Process Communication
Events" later in this document). May be a file path, NONE, or AUTO.
Default: AUTO.

'stderr_logfile' -- Put process stderr output in this file unless
redirect_stderr is true. Accepts the same value types as
"stdout_logfile". Default: AUTO.

'stderr_logfile_maxbytes' -- The maximum number of bytes before
logfile rotation for stderr_logfile. Accepts the same value types
as "stdout_logfile_maxbytes". Default: 50MB.

'stderr_logfile_backups' -- The number of backups to keep around
resulting from process stderr log file rotation. Default: 10.

'stderr_capturefile' -- The file written to when the process is in
"stderr capture mode" (see "Capture Mode and Process Communication
Events" later in this document). May be a file path, NONE, or AUTO.
Default: AUTO.

'environment' -- A list of key/value pairs in the form
"KEY=val,KEY2=val2" that will be placed in the child process'
environment. Default: none. **Note** that the subprocess will
inherit the environment variables of the shell used to start
"supervisord" except for the ones overridden here. See "Subprocess
Environment" below.

Configuration File '[group:x]' Section Settings

XXX TODO

Configuration File '[eventlistener:x]' Section Settings

XXX TODO

Configuration File '[rpcinterface:x]' Section Settings (ADVANCED)

Changing "rpcinterface:x" settings in the configuration file is only
useful for people who wish to extend supervisor with additional
behavior.

In the sample config file, there is a section which is named
"rpcinterface:supervisor". By default it looks like this::

    [rpcinterface:supervisor]
    supervisor.rpcinterface_factory = supervisor.xmlrpc:make_main_rpcinterface

This section must remain in the configuration for the standard setup
of supervisor to work properly. If you don't want supervisor to do
anything it doesn't already do out of the box, this is all you need
to know about this type of section.

However, if you wish to add rpc interface namespaces to a custom
version of supervisor, you may add additional [rpcinterface:foo]
sections, where "foo" represents the namespace of the interface
(from the web root), and the value named by
"supervisor.rpcinterface_factory" is a factory callable which should
have a function signature that accepts a single positional argument
"supervisord" and as many keyword arguments as required to perform
configuration. Any key/value pairs defined within the
rpcinterface:foo section will be passed as keyword arguments to the
factory. Here's an example of a factory function, created in the
package "my.package"::

    def make_another_rpcinterface(supervisord, **config):
        retries = int(config.get('retries', 0))
        another_rpc_interface = AnotherRPCInterface(supervisord, retries)
        return another_rpc_interface

And a section in the config file meant to configure it::

    [rpcinterface:another]
    supervisor.rpcinterface_factory = my.package:make_another_rpcinterface
    retries = 1
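
To see the factory convention end to end, the sketch below fills in a
stand-in "AnotherRPCInterface" class and calls the factory the way
the text describes: the remaining key/value pairs of the section are
passed as keyword arguments. The class body, the "getRetries" method
and the plain "object()" standing in for the supervisord instance are
all invented for the example. That the factory key itself is removed
before the call is an assumption.

```python
class AnotherRPCInterface:
    """Hypothetical interface; its public methods would become RPC methods."""

    def __init__(self, supervisord, retries):
        self.supervisord = supervisord
        self.retries = retries

    def getRetries(self):
        return self.retries


def make_another_rpcinterface(supervisord, **config):
    # Values read from a config file arrive as strings, hence int().
    retries = int(config.get("retries", 0))
    return AnotherRPCInterface(supervisord, retries)


# The [rpcinterface:another] section as a dict of raw string values.
section = {
    "supervisor.rpcinterface_factory": "my.package:make_another_rpcinterface",
    "retries": "1",
}
# Assumption: the factory key is dropped and the rest become keyword args.
options = {key: value for key, value in section.items()
           if key != "supervisor.rpcinterface_factory"}
interface = make_another_rpcinterface(object(), **options)
print(interface.getRetries())
```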

Nondaemonizing of Subprocesses

Programs run under supervisor *should not* daemonize themselves.
Instead, they should run in the foreground and not detach from the
"terminal" that starts them. The easiest way to tell if a command
will run in the foreground is to run the command from a shell
prompt. If it gives you control of the terminal back, it's
daemonizing itself and that will be the wrong way to run it under
supervisor. You want to run a command that essentially requires you
to press Ctrl-C to get control of the terminal back. If it gives
you a shell prompt back after running it without needing to press
Ctrl-C, it's not useful under supervisor. Most programs have an
option to run in the foreground, but there's no standard way to do
it; you'll need to read the documentation for each program you want
to do this with.

Subprocess Environment

Subprocesses will inherit the environment of the shell used to start
the supervisord program. Several environment variables will be set
by supervisor itself in the child's environment also, including
"SUPERVISOR_ENABLED" (a flag indicating the process is under
supervisor control), "SUPERVISOR_PROCESS_NAME" (the
config-file-specified process name for this process) and
"SUPERVISOR_GROUP_NAME" (the config-file-specified process group name
for the child process).

These environment variables may be overridden within the
"environment" global config option (applies to all subprocesses) or
within the per-program "environment" config option (applies only to
the subprocess specified within the "program" section). These
"environment" settings are additive. In other words, each
subprocess' environment will consist of::

    The environment variables set within the shell used to start
    supervisord...

    ... added-to/overridden-by ...

    ... the environment variables set within the "environment" global
    config option ...

    ... added-to/overridden-by ...

    ... supervisor-specific environment variables
    ("SUPERVISOR_ENABLED", "SUPERVISOR_PROCESS_NAME",
    "SUPERVISOR_GROUP_NAME") ...

    ... added-to/overridden-by ...

    ... the environment variables set within the per-process
    "environment" config option.
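
The layering above amounts to successive dictionary updates, with
later layers winning. A minimal sketch follows; the variable values
are made up for illustration (including the value "1" for
"SUPERVISOR_ENABLED") and "build_child_environment" is a hypothetical
helper.

```python
def build_child_environment(shell_env, global_env, supervisor_env,
                            program_env):
    """Merge the four layers; each later layer adds to or overrides earlier."""
    env = dict(shell_env)
    for layer in (global_env, supervisor_env, program_env):
        env.update(layer)
    return env


child_env = build_child_environment(
    shell_env={"PATH": "/usr/bin", "HOME": "/root"},
    global_env={"A": "1", "HOME": "/var/empty"},
    supervisor_env={"SUPERVISOR_ENABLED": "1",
                    "SUPERVISOR_PROCESS_NAME": "apache",
                    "SUPERVISOR_GROUP_NAME": "apache"},
    program_env={"HOME": "/home/chrism", "USER": "chrism"},
)
print(child_env["HOME"])
```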

No shell is executed by supervisord when it runs a subprocess, so
settings such as USER, PATH, HOME, SHELL, LOGNAME, etc. are not
changed from their defaults or otherwise reassigned. This is
particularly important to note when you are running a program from a
supervisord run as root with a "user=" stanza in the configuration.
Unlike cron, supervisord does not attempt to divine and override
"fundamental" environment variables like USER, PATH, HOME, and
LOGNAME when it performs a setuid to the user defined within the
"user=" program config option. If you need to set environment
variables for a particular program that might otherwise be set by a
shell invocation for a particular user, you must do it explicitly
within the "environment=" program config option. For example::

    [program:apache]
    command=/home/chrism/bin/httpd -DNO_DETACH
    user=chrism
    environment=HOME=/home/chrism,USER=chrism

Examples of Program Configurations

Apache 2.0.54::

  [program:apache]
  command=/usr/sbin/httpd -DNO_DETACH

Postgres 8.14::

  [program:postgres]
  command=/path/to/postmaster
  ; we use the "fast" shutdown signal SIGINT
  stopsignal=INT
  redirect_stderr=true

Zope 2.8 instances and ZEO::

  [program:zeo]
  command=/path/to/runzeo
  priority=1

  [program:zope1]
  command=/path/to/instance/home/bin/runzope
  priority=2
  redirect_stderr=true

  [program:zope2]
  command=/path/to/another/instance/home/bin/runzope
  priority=2
  redirect_stderr=true

OpenLDAP slapd::

  [program:slapd]
  command=/path/to/slapd -f /path/to/slapd.conf -h ldap://0.0.0.0:8888

Process States

A process controlled by supervisord will be in one of the below
states at any given time. You may see these state names in various
user interface elements.

STOPPED (0) -- The process has been stopped due to a stop request or
has never been started.

STARTING (10) -- The process is starting due to a start request.

RUNNING (20) -- The process is running.

BACKOFF (30) -- The process entered the STARTING state but
subsequently exited too quickly to move to the RUNNING state.

STOPPING (40) -- The process is stopping due to a stop request.

EXITED (100) -- The process exited from the RUNNING state (expectedly
or unexpectedly).

FATAL (200) -- The process could not be started successfully.

UNKNOWN (1000) -- The process is in an unknown state (programming error).

Processes progress through these states as per the following directed
graph::

                --> STOPPED
               /       |
               |       |
               |       |
            STOPPING   |
             ^  ^      V
             |  \--- STARTING <-----> BACKOFF
             |        /   ^              |
             |       V    |              |
             \-- RUNNING  / \            |
                    |    /   \           V
                    V   /     \----- FATAL
                  EXITED

A process is in the STOPPED state if it has been stopped
administratively or if it has never been started.

When an autorestarting process is in the BACKOFF state, it will be
automatically restarted by supervisord. It will switch between
STARTING and BACKOFF states until it becomes evident that it cannot
be started because the number of startretries has exceeded the
maximum, at which point it will transition to the FATAL state. Each
start retry will take progressively more time.

An autorestarted process will never be automatically restarted if
it ends up in the FATAL state (it must be manually restarted from
this state).

A process transitions into the STOPPING state via an administrative
stop request, and will then end up in the STOPPED state.

A process that cannot be stopped successfully will stay in the
STOPPING state forever. This situation should never be reached
during normal operations as it implies that the process did not
respond to a final SIGKILL, which is "impossible" under UNIX.

State transitions which always require user action to invoke are
these:

  FATAL -> STARTING

  RUNNING -> STOPPING

State transitions which typically, but not always, require user
action to invoke are these, with exceptions noted:

  STOPPED -> STARTING (except at supervisord startup if process is
  configured to autostart)

  EXITED -> STARTING (except if process is configured to autorestart)

All other state transitions are managed by supervisord
automatically.
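
For readers who prefer a table to a diagram, the state codes and
transitions described in this section can be written down as plain
Python data. This is a sketch for illustration only, not code from
supervisor: the names STATE_CODES, TRANSITIONS and can_transition are
invented here, and the UNKNOWN state (which signals a programming
error) is left out of the transition table.

```python
# Numeric codes as listed in the state descriptions above.
STATE_CODES = {
    "STOPPED": 0,
    "STARTING": 10,
    "RUNNING": 20,
    "BACKOFF": 30,
    "STOPPING": 40,
    "EXITED": 100,
    "FATAL": 200,
    "UNKNOWN": 1000,
}

# Allowed transitions, read off the directed graph and the prose above.
TRANSITIONS = {
    "STOPPED": {"STARTING"},
    "STARTING": {"RUNNING", "BACKOFF", "STOPPING"},
    "RUNNING": {"STOPPING", "EXITED"},
    "BACKOFF": {"STARTING", "FATAL"},
    "STOPPING": {"STOPPED"},
    "EXITED": {"STARTING"},
    "FATAL": {"STARTING"},
}


def can_transition(old_state, new_state):
    """Return True if the graph has an edge from old_state to new_state."""
    return new_state in TRANSITIONS.get(old_state, set())
```

For example, can_transition("BACKOFF", "FATAL") is true, while
can_transition("FATAL", "RUNNING") is false because a FATAL process
must pass through STARTING first.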

Supervisor Events

At certain predefined points during supervisord's operation, "event
notifications" are emitted. An event notification implies that
something potentially interesting happened. Event listeners (see
the "Event Listeners" section below) can be configured to subscribe
to event notifications selectively, and may perform arbitrary
actions based on an event notification (send email, make an HTTP
request, etc).

Event types that may be subscribed to by event listeners are
predefined by supervisor and fall into several major categories,
including "process state change", "process communication",
"supervisor state change", and "event system meta" events. These
are described in detail below.

EVENT -- The base event type. This event type is abstract; it will
never be sent directly. Subscribing to this event type will cause a
subscriber to receive all event notifications emitted by supervisor.

Subtypes of EVENT:

  PROCESS_STATE_CHANGE -- The value of this event type will be the
  process name. This event type is abstract; it will never be sent
  directly. Subscribing to this event type will cause a subscriber
  to receive event notifications of all the types listed below in
  "Subtypes of PROCESS_STATE_CHANGE".

  The serialized body of a PROCESS_STATE_CHANGE event (and all
  subtypes) is in the form::

    process_name: <name>
    group_name: <name>

  Subtypes of PROCESS_STATE_CHANGE:

    PROCESS_STATE_CHANGE_STARTING -- indicates a process has moved
    from a state to the STARTING state. Subscribing to this event
    type will cause a subscriber to receive event notifications of
    all the types listed below in "Subtypes of
    PROCESS_STATE_CHANGE_STARTING".

    Subtypes of PROCESS_STATE_CHANGE_STARTING:

      PROCESS_STATE_CHANGE_STARTING_FROM_STOPPED -- subtype of
      PROCESS_STATE_CHANGE_STARTING, indicates a process has moved
      from the STOPPED state to the STARTING state.

      PROCESS_STATE_CHANGE_STARTING_FROM_BACKOFF -- subtype of
      PROCESS_STATE_CHANGE_STARTING, indicates a process has moved
      from the BACKOFF state to the STARTING state.

      PROCESS_STATE_CHANGE_STARTING_FROM_EXITED -- subtype of
      PROCESS_STATE_CHANGE_STARTING, indicates a process has moved
      from the EXITED state to the STARTING state.

      PROCESS_STATE_CHANGE_STARTING_FROM_FATAL -- subtype of
      PROCESS_STATE_CHANGE_STARTING, indicates a process has moved
      from the FATAL state to the STARTING state.

    PROCESS_STATE_CHANGE_RUNNING_FROM_STARTING -- indicates a
    process has moved from the STARTING state to the RUNNING state.

    PROCESS_STATE_CHANGE_BACKOFF_FROM_STARTING -- indicates a
    process has moved from the STARTING state to the BACKOFF state.

    PROCESS_STATE_CHANGE_STOPPING_FROM_RUNNING -- indicates a
    process has moved from the RUNNING state to the STOPPING state.

    PROCESS_STATE_CHANGE_STOPPING_FROM_STARTING -- indicates a
    process has moved from the STARTING state to the STOPPING state.

    PROCESS_STATE_CHANGE_EXITED_OR_STOPPED -- indicates a process
    has undergone a state change which caused it to move to the
    EXITED or STOPPED state.

    Subtypes of PROCESS_STATE_CHANGE_EXITED_OR_STOPPED:

      PROCESS_STATE_CHANGE_EXITED_FROM_RUNNING -- indicates a
      process has moved from the RUNNING state to the EXITED state.

      PROCESS_STATE_CHANGE_STOPPED_FROM_STOPPING -- indicates a
      process has moved from the STOPPING state to the STOPPED
      state.

    PROCESS_STATE_CHANGE_FATAL_FROM_BACKOFF -- indicates a process
    has moved from the BACKOFF state to the FATAL state.

    PROCESS_STATE_CHANGE_TO_UNKNOWN -- indicates a process has moved
    from a state to the UNKNOWN state (indicates an error in
    supervisord).

  PROCESS_COMMUNICATION -- an event type raised when any process
  attempts to send information between <!--XSUPERVISOR:BEGIN--> and
  <!--XSUPERVISOR:END--> tags in its output. This event type is
  abstract; it will never be sent directly. Subscribing to this
  event type will cause a subscriber to receive event notifications
  of all the types listed below in "Subtypes of
  PROCESS_COMMUNICATION".

  The serialized body of a PROCESS_COMMUNICATION event (and all
  subtypes) is::

    process_name: <name>
    group_name: <name>
    <data>

  Subtypes of PROCESS_COMMUNICATION:

    PROCESS_COMMUNICATION_STDOUT -- indicates a process has sent a
    message to supervisor on its stdout file descriptor.

    PROCESS_COMMUNICATION_STDERR -- indicates a process has sent a
    message to supervisor on its stderr file descriptor.

  SUPERVISOR_STATE_CHANGE -- an event type raised when supervisor's
  state changes. There is no value. Subscribing to this event type
  will cause a subscriber to receive event notifications of all the
  types listed below in "Subtypes of SUPERVISOR_STATE_CHANGE".

  The serialization of a SUPERVISOR_STATE_CHANGE event is the empty
  string.

  Subtypes of SUPERVISOR_STATE_CHANGE:

    SUPERVISOR_STATE_CHANGE_RUNNING -- indicates that supervisor has
    started.

    SUPERVISOR_STATE_CHANGE_STOPPING -- indicates that supervisor is
    stopping or restarting.

  EVENT_BUFFER_OVERFLOW -- an event type raised when a listener
  pool's event buffer is overflowed (as can happen when an event
  listener pool cannot keep up with all of the events sent to it).
  When the pool's event buffer is overflowed, the oldest event in
  the buffer is thrown out.

  The serialization of an EVENT_BUFFER_OVERFLOW body is::

    group_name: <name>
    event_type: <type of discarded event>
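
Because subscribing to a type also delivers all of its subtypes, a
subscription check amounts to walking up the type hierarchy. The
sketch below shows one way to model that. It is not supervisor's
implementation: the PARENT table and is_subscribed function are
hypothetical, the table is deliberately abbreviated, and treating
EVENT_BUFFER_OVERFLOW as a direct subtype of EVENT is an assumption
based on the listing above.

```python
# Abbreviated child -> parent table for the event types described above.
PARENT = {
    "PROCESS_STATE_CHANGE": "EVENT",
    "PROCESS_STATE_CHANGE_STARTING": "PROCESS_STATE_CHANGE",
    "PROCESS_STATE_CHANGE_STARTING_FROM_STOPPED": "PROCESS_STATE_CHANGE_STARTING",
    "PROCESS_STATE_CHANGE_STARTING_FROM_BACKOFF": "PROCESS_STATE_CHANGE_STARTING",
    "PROCESS_STATE_CHANGE_RUNNING_FROM_STARTING": "PROCESS_STATE_CHANGE",
    "PROCESS_STATE_CHANGE_EXITED_OR_STOPPED": "PROCESS_STATE_CHANGE",
    "PROCESS_STATE_CHANGE_STOPPED_FROM_STOPPING": "PROCESS_STATE_CHANGE_EXITED_OR_STOPPED",
    "PROCESS_COMMUNICATION": "EVENT",
    "PROCESS_COMMUNICATION_STDOUT": "PROCESS_COMMUNICATION",
    "PROCESS_COMMUNICATION_STDERR": "PROCESS_COMMUNICATION",
    "SUPERVISOR_STATE_CHANGE": "EVENT",
    "SUPERVISOR_STATE_CHANGE_RUNNING": "SUPERVISOR_STATE_CHANGE",
    "SUPERVISOR_STATE_CHANGE_STOPPING": "SUPERVISOR_STATE_CHANGE",
    "EVENT_BUFFER_OVERFLOW": "EVENT",  # assumed to sit directly under EVENT
}


def is_subscribed(subscriptions, event_type):
    """True if event_type or any of its ancestors is in subscriptions."""
    current = event_type
    while current is not None:
        if current in subscriptions:
            return True
        current = PARENT.get(current)  # None once we walk past EVENT
    return False
```

With this table, a listener subscribed only to PROCESS_STATE_CHANGE
would receive PROCESS_STATE_CHANGE_STARTING_FROM_BACKOFF, and a
listener subscribed to EVENT would receive everything.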

Event Listeners

Supervisor event listeners are subprocesses which are treated almost
exactly like supervisor "programs" with the following differences:

- They are defined using an [eventlistener:x] section in the config
  file instead of a [program:x] section.

- Supervisor sends specially-formatted input to an event listener's
  stdin and expects specially-formatted output from an event
  listener's stdout in a request-response cycle. A protocol agreed
  upon between supervisor and the listener's implementer allows
  listeners to process event notifications.

- Supervisor does not respect "capture mode" output from event
  listener processes (see "Capture Mode and Process Communication
  Events" elsewhere in this document).

When an [eventlistener:x] section is defined, it actually defines a
"pool", where the number of event listeners in the pool is
determined by the "numprocs" value within the section. Every
process in the event listener pool is treated equally by supervisor,
and supervisor will choose one process from the pool to receive
event notifications (filtered by the "events=" key in the
eventlistener section).

An event listener can send arbitrary output to its stderr, which
will be logged or ignored by supervisord depending on the
stderr-related configuration options in its [eventlistener:x] section.

When an event notification is sent by the supervisor, all event
listener pools which are subscribed to receive events for the
event's type will be found. One of the listeners in each listener
pool will receive the event notification (any "available" listener).
If the event cannot be sent because all listeners in a pool are
"busy", the event will be buffered and notification will be retried
later. "Later" is defined as "the next time that supervisord's
select loop executes".

A listener pool has an event buffer queue. The queue is sized via
the listener pool's "buffer_size" config file option. If the queue
is full and supervisor attempts to buffer an event, supervisor will
throw away the oldest event in the buffer, log an error, and send an
EVENT_BUFFER_OVERFLOW event. EVENT_BUFFER_OVERFLOW events are never
themselves buffered.
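
The drop-oldest behavior of the buffer queue can be illustrated with
a small bounded queue. This is a sketch of the idea only, not
supervisor's code: the EventBuffer class and its push method are
hypothetical, and it assumes "buffer_size" is at least 1.

```python
from collections import deque


class EventBuffer:
    """Bounded FIFO that discards the oldest event when it is full."""

    def __init__(self, buffer_size):
        if buffer_size < 1:
            raise ValueError("this sketch assumes buffer_size >= 1")
        self.buffer_size = buffer_size
        self.events = deque()

    def push(self, event):
        """Buffer an event; return the discarded event, or None."""
        discarded = None
        if len(self.events) >= self.buffer_size:
            # Queue is full: throw away the oldest event. A real
            # implementation would also log an error and emit
            # EVENT_BUFFER_OVERFLOW at this point.
            discarded = self.events.popleft()
        self.events.append(event)
        return discarded


buf = EventBuffer(buffer_size=2)
buf.push("event-a")
buf.push("event-b")
dropped = buf.push("event-c")  # the queue was full, so "event-a" is dropped
```

The caller can use the return value of push to decide whether an
overflow notification needs to be sent.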

Event listeners can be implemented in any language. Event listeners
can be long-running or may exit after a single request (depending on
the implementation and the "autorestart" parameter in the
eventlistener's configuration).

An event listener implementation should operate in "unbuffered" mode
or should flush its stdout every time it needs to communicate back
to the supervisord process.

Event Listener States

An event listener process has three possible states that are
maintained by supervisord:

ACKNOWLEDGED -- The event listener has acknowledged (accepted or
rejected) an event send.

READY -- Event notifications may be sent to this event listener.

BUSY -- Event notifications may not be sent to this event
listener.

When an event listener process first starts, supervisor
automatically places it into the ACKNOWLEDGED state to allow for
startup activities or guard against startup failures (hangs). Until
the listener sends a READY token to its stdout, it will stay in this
state.

When supervisord sends an event notification to a listener in the
READY state, the listener will be placed into the BUSY state until
it receives an OK or FAILED response from the listener, at which
time, the listener will be transitioned back into the ACKNOWLEDGED
state.
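
The listener life cycle described above is a small state machine. The
following toy model may help when reasoning about it. It is not taken
from supervisor: the ListenerModel class and its method names are
invented for illustration, and the move to UNKNOWN on an unexpected
token follows the "Event Listener Error Conditions" section of this
document.

```python
class ListenerModel:
    """Toy model of the listener states that supervisord tracks."""

    def __init__(self):
        # A freshly started listener begins in ACKNOWLEDGED.
        self.state = "ACKNOWLEDGED"

    def event_sent(self):
        """Supervisord sends an event; only legal for READY listeners."""
        if self.state != "READY":
            raise RuntimeError("events are only sent to READY listeners")
        self.state = "BUSY"

    def token_received(self, token):
        """Apply a token that the listener wrote to its stdout."""
        if self.state == "ACKNOWLEDGED" and token == "READY":
            self.state = "READY"
        elif self.state == "BUSY" and token in ("OK", "FAILED"):
            self.state = "ACKNOWLEDGED"
        else:
            # Output that does not fit the current state.
            self.state = "UNKNOWN"
        return self.state


listener = ListenerModel()
listener.token_received("READY")   # ACKNOWLEDGED -> READY
listener.event_sent()              # READY -> BUSY
listener.token_received("OK")      # BUSY -> ACKNOWLEDGED
```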

Event Listener Notification Protocol

Supervisord will notify an event listener in the READY state of an
event by sending data to the stdin of the process. Supervisord will
never send anything to the stdin of an event listener process while
that process is in the BUSY or ACKNOWLEDGED state.

When supervisord sends a notification to an event listener process,
the listener will first be sent a single "header" line on its
stdin. The composition of the line is a set of four tokens separated
by single spaces. The line is terminated with a '\n' (linefeed)
character. The tokens on the line are:

  <PROTOCOL_VERSION> <EVENT_TYPE_NAME> <EVENT_SERIAL_NUM> <PAYLOAD_LENGTH>

The PROTOCOL_VERSION always consists of "SUPERVISOR" followed
immediately by numeric characters indicating the protocol version,
with no whitespace in between. An example: "SUPERVISOR3.0".

The EVENT_TYPE_NAME is the specific event type name (see "Supervisor
Events" elsewhere in this document). An example:
"PROCESS_COMMUNICATION_STDOUT".

The EVENT_SERIAL_NUM is an integer assigned to each event. It is
useful for functional testing. An example: "30".

The PAYLOAD_LENGTH is an integer indicating the number of bytes in
the event payload. An example: "22".

An example of a complete header line:

  SUPERVISOR3.0 PROCESS_COMMUNICATION_STDOUT 30 22\n
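
Splitting the header line into its four tokens is straightforward.
The helper below is a minimal sketch and not part of supervisor: the
Header tuple and parse_header function are hypothetical names, and
the code assumes the header has exactly the four space-separated
tokens described above.

```python
from collections import namedtuple

Header = namedtuple(
    "Header", "protocol_version event_type serial payload_length"
)


def parse_header(line):
    """Split a header line into its four tokens, converting the integers."""
    tokens = line.rstrip("\n").split(" ")
    if len(tokens) != 4:
        raise ValueError("expected 4 tokens, got %d" % len(tokens))
    version, event_type, serial, length = tokens
    return Header(version, event_type, int(serial), int(length))


header = parse_header("SUPERVISOR3.0 PROCESS_COMMUNICATION_STDOUT 30 22\n")
```

After this call, header.payload_length holds the number of bytes the
listener should read next from its stdin.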

Directly following the linefeed character in the header is the event
payload. It consists of PAYLOAD_LENGTH bytes representing a
serialization of the event data. See "Supervisor Events" for the
specific event data serialization definitions. An example payload
for a PROCESS_COMMUNICATION_STDOUT event notification is::

  process_name: foo
  group_name: bar
  This is the data that was sent between the tags
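
A listener will usually want to pull the process name, group name and
data out of such a payload. The function below sketches one way to do
so. It is hypothetical, and it assumes that the two "key: value"
lines are separated by single linefeeds and that everything after the
second line is the data, as the example payload suggests.

```python
def parse_communication_payload(payload):
    """Return (process_name, group_name, data) from a payload string."""
    # Split off the two "key: value" lines; the remainder is the data,
    # which may itself contain linefeeds.
    first, second, data = payload.split("\n", 2)
    fields = {}
    for line in (first, second):
        key, value = line.split(": ", 1)
        fields[key] = value
    return fields["process_name"], fields["group_name"], data


payload = (
    "process_name: foo\n"
    "group_name: bar\n"
    "This is the data that was sent between the tags"
)
process_name, group_name, data = parse_communication_payload(payload)
```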

Once it has processed the header, the event listener implementation
should read PAYLOAD_LENGTH bytes from its stdin and perform an
arbitrary action based on the values in the header and the data
parsed out of the serialization. It is free to block for an
arbitrary amount of time while doing this. Supervisor will continue
processing normally as it waits for a response, and it will send
other events of the same type to other listener processes in the
same pool as necessary.

After the event listener has processed the event serialization, it
should notify supervisord of the result by sending either an "OK"
token or a "FAILED" token, immediately followed by a linefeed ('\n')
character, to its stdout. If supervisord receives an "OK" token, it
will assume that the listener processed the event notification
successfully. If it receives a "FAILED" token, it will assume that
the listener has failed to process the event, and the event will be
rebuffered and sent again at a later time. The event listener may
reject the event for any reason by returning a "FAILED" token. This
does not indicate a problem with the event data or the event
listener. Once an "OK" or "FAILED" token is received by supervisord,
the event listener is placed into the ACKNOWLEDGED state.

Once the listener is in the ACKNOWLEDGED state, it may either exit
(and subsequently be restarted by supervisor if its "autorestart"
config parameter is true), or it may continue running. If it
continues to run, in order to be placed back into the READY state by
supervisord, it must send a "READY" token followed immediately by a
linefeed ('\n') character to its stdout.

Example Event Listener Implementation

The following Python implementation of a "long-running" event
listener accepts an event notification, prints the header and a list
of the event serial numbers it has received so far to its stderr,
responds with an OK, and then sends a READY::

  import sys

  L = []  # serial numbers of every event received so far

  def stdout_write(s):
      # stdout carries the protocol tokens, so flush after every write
      sys.stdout.write(s)
      sys.stdout.flush()

  def stderr_write(s):
      sys.stderr.write(s)
      sys.stderr.flush()

  while True:
      stdout_write('READY\n')             # ready to accept an event
      line = sys.stdin.readline()         # the header line
      if not line:
          break                           # stdin was closed
      stderr_write(line)
      ver, event, serial, length = line.split(' ', 3)
      L.append(serial)
      data = sys.stdin.read(int(length))  # the event payload
      stderr_write(str(L))
      stdout_write('OK\n')                # acknowledge the event

Event Listener Error Conditions

If the event listener process dies while the event is being
transmitted to its stdin, or if it dies before sending an OK/FAILED
response back to supervisord, the event is assumed to not be
processed and will be rebuffered by supervisord and sent again
later.

If an event listener sends data to its stdout which supervisor does
not recognize as an appropriate response based on the state that the
event listener is in, the event listener will be placed into the
UNKNOWN state, and no further event notifications will be sent to
it. If an event was being processed by the listener during this
time, it will be rebuffered and sent again later.

Capture Mode and Process Communication Events

XXX TODO

Signals

Killing supervisord with SIGHUP will stop all processes, reload the
configuration from the config file, and restart all processes.

Killing supervisord with SIGUSR2 will close and reopen the
supervisord activity log and child log files.

Access Control

The UNIX permissions on the socket effectively control who may send
commands to the server. HTTP basic authentication provides access
control for internet and UNIX domain sockets as necessary.

Security Notes

I have done my best to ensure that use of a supervisord process
running as root cannot lead to unintended privilege escalation, but
caveat emptor. Particularly, it is not as paranoid as something
like DJ Bernstein's "daemontools", inasmuch as "supervisord" allows
for arbitrary path specifications in its configuration file to which
data may be written. Allowing arbitrary path selections can create
vulnerabilities from symlink attacks. Be careful when specifying
paths in your configuration. Ensure that supervisord's
configuration file cannot be read from or written to by unprivileged
users and that all files installed by the supervisor package have
"sane" file permission protection settings. Additionally, ensure
that your PYTHONPATH is sane and that all Python standard library
files have adequate file permission protections. Then, pray to the
deity of your choice.

Other Notes

Some examples of shell scripts to start services under supervisor
can be found "here":http://www.thedjbway.org/services.html. These
examples are actually for daemontools but the premise is the same
for supervisor. Another collection of recipes for starting various
programs in the foreground is
"here":http://smarden.org/runit/runscripts.html .

Some processes (like mysqld) ignore signals sent to the actual
process/thread which is created by supervisord. Instead, a
"special" thread/process is created by these kinds of programs which
is responsible for handling signals. This is problematic, because
supervisord can only kill a pid which it creates itself, not any
child thread or process of the program it creates. Fortunately,
these programs typically write a pidfile which is meant to be read
in order to kill the process. As a workaround for this case, a
special "pidproxy" program can handle startup of these kinds of
processes. The pidproxy program is a small shim that starts a
process, and upon the receipt of a signal, sends the signal to the
pid provided in a pidfile. A sample supervisord configuration
program entry for a pidproxy-enabled program is provided here::

  [program:mysql]
  command=/path/to/pidproxy /path/to/pidfile /path/to/mysqld_safe

The pidproxy program is named 'pidproxy.py' and is in the
supervisor distribution.

FAQ

My program never starts and supervisor doesn't indicate any error:
Make sure the "x" bit is set on the executable file you're using in
the command= line.

How can I tell if my program is running under supervisor? Supervisor
and its subprocesses share an environment variable
"SUPERVISOR_ENABLED". When a process is run under supervisor, your
program can check for the presence of this variable to determine
whether it is running under supervisor (new in 2.0).
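
In a Python program, such a check might look like the sketch below.
The function name is hypothetical; the only thing taken from
supervisor is the "SUPERVISOR_ENABLED" variable name, and only its
presence is tested, not its value.

```python
import os


def running_under_supervisor(environ=None):
    """Return True if SUPERVISOR_ENABLED is present in the environment."""
    if environ is None:
        environ = os.environ
    return "SUPERVISOR_ENABLED" in environ
```

Passing a dictionary as the optional argument makes the check easy to
exercise in tests without touching the real environment.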

My command line works fine when I invoke it by hand from a shell
prompt, but when I use the same command line in a supervisor
"command=" section, the program fails mysteriously. Why? This may
be due to your process' dependence on environment variable settings.
See "Subprocess Environment" in this document.

Reporting Bugs

Please report bugs at http://www.plope.com/software/collector .

Author Information

Chris McDonough (chrism@plope.com)
http://www.plope.com