================================================================================
sclapp
================================================================================

--------------------------------------------------------------------------------
A framework for Python command-line applications.
--------------------------------------------------------------------------------

:Author: Forest Bond
:Copyright: © 2005-2007 Forest Bond
:License: GPL version 2 (see the COPYING file)

Overview
========

sclapp is a Python module that makes it easier to write well-behaved command-line applications. Good command-line applications respond in a consistent manner to common situations that occur during normal use in a shell. Programs that do not behave consistently with other command-line applications are not intuitive for command-line users. This makes them unpleasant to use.

sclapp helps command-line programs deal with the following issues:

* Signal handling
* Terminal character encodings
* Standard output failures (broken pipes)
* Common command-line options (like --help and --version)

Using sclapp to implement functionality most command-line programs should implement reduces boiler-plate code and increases consistency across applications.

In addition to these standard features, sclapp also provides other functionality that developers of command-line programs may find helpful. Many of these features were added as I needed them for my programs, but are general enough that others may find use for them as well.

A Note On The ``locale`` Module
===============================

The standard Python ``locale`` module always returns "mac-roman" on Darwin systems, ignoring the LC_* and LANG environment variables. sclapp takes special precautions against this by wrapping the import with some code that works around this quirk.
To take advantage of this, callers should not import the ``locale`` module directly, but rather import it from sclapp::

    from sclapp import locale

Wrapping The Main Function
==========================

Wrapping the main function with some additional functionality is, to a limited extent, the raison d'être of the sclapp module. The goals of these features are the following:

* Improved signal handling on POSIX systems
* More graceful handling of stdout and stderr failure conditions
* Reduced boiler-plate code and increased consistency for:

  - Standard help (--help/-h) and version (--version/-v) options
  - User-friendly handling of uncaught exceptions
  - Reporting critical failures to the user
  - Daemonization
  - Stdio character decoding and encoding
  - Passing of sys.argv into the main function

At its simplest, utilizing sclapp's main-wrapping features is as easy as adding a decorator to our main function::

    >>> import sys, sclapp
    >>> @sclapp.main_function
    ... def main(argv):
    ...     do_something()
    >>> if __name__ == '__main__':
    ...     sys.exit(main())

While this example utilizes the main_function decorator, the mainWrapper function could have been used just as easily::

    >>> import sys, sclapp
    >>> def main(argv):
    ...     do_something()
    >>> main = sclapp.mainWrapper(main)
    >>> if __name__ == '__main__':
    ...     sys.exit(main())

The differences between the main_function decorator and the mainWrapper function are covered in detail below. While duplicate examples will not be provided throughout this document, be aware that both methods can be used to achieve the same end. The distinction between the two is largely syntactic.

The following sub-sections discuss the benefits provided by sclapp's main function-wrapping features.

Improved Signal Handling
------------------------

Rationale
~~~~~~~~~

Python's default signal handling behavior is less than satisfactory for most command-line applications.
The following simple example illustrates this::

    #!/usr/bin/env python
    '''yes.py: a yes(1) pseudo-clone in Python'''

    import sys

    def main(argv):
        if len(argv) > 1:
            string = ' '.join(argv[1:])
        else:
            string = 'y'

        while True:
            print string

    if __name__ == '__main__':
        main(sys.argv)

Let's try our yes clone out. If we run it, we'll see an endless stream of y's fly by on the screen. To stop the program, our instinct would be to press Control-C to send it an interrupt::

    $ python yes.py
    y
    y
    y
    [...]
    y
    y
    y
    Traceback (most recent call last):
      File "yes.py", line 16, in <module>
        main(sys.argv)
      File "yes.py", line 13, in main
        print string
    KeyboardInterrupt

Python converts SIGINT to a KeyboardInterrupt (a perfectly reasonable thing to do, given Python's cross-platform nature), which causes a traceback to be printed when we press Control-C. The program does quit like we wanted it to, but this error message is surely not going to go over well with users. In order to write a good yes clone, we'll need to handle KeyboardInterrupt exceptions.

How about another experiment::

    $ python yes.py | head -n5
    y
    y
    y
    y
    y
    Traceback (most recent call last):
      File "yes.py", line 16, in <module>
        main(sys.argv)
      File "yes.py", line 13, in main
        print string
    IOError: [Errno 32] Broken pipe

We get another traceback, but with a different exception. Here, Python is converting SIGPIPE into an IOError. Again, this is probably reasonable behavior for many cross-platform programs, but programs that are designed for use at the command line will need to handle this better. Pipes are a key feature of UNIX shells.

Other situations can be tested, too. Python responds to SIGHUP by immediately exiting, with no exception raised, and printing "Hangup" to standard error. With our yes clone that's not too far from what we want, but what if our program required some cleanup action to be performed prior to exiting? With no exception raised, our cleanup code would never have the opportunity to run.
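
One way to give cleanup code a chance to run is to install a handler that turns the signal into an ordinary exception. The following is a minimal sketch that uses only the standard ``signal`` module. It is not sclapp's implementation, the names ``HangupError``, ``raise_hangup``, ``run_with_cleanup`` and ``send_hangup_to_self`` are hypothetical, and it assumes a POSIX system where SIGHUP exists.

```python
import os
import signal


class HangupError(Exception):
    """Hypothetical exception used to surface SIGHUP to ordinary code."""


def raise_hangup(signum, frame):
    # Python-level handlers run in the main thread, so raising here
    # interrupts whatever the main thread was doing.
    raise HangupError(signum)


def run_with_cleanup(work, cleanup):
    """Run work(); call cleanup() even if SIGHUP arrives in the meantime."""
    previous = signal.signal(signal.SIGHUP, raise_hangup)
    try:
        try:
            work()
            return 0
        except HangupError:
            return 1
        finally:
            cleanup()
    finally:
        # Restore the handler that was installed before, if it was one
        # that Python knows how to reinstall.
        if previous is not None:
            signal.signal(signal.SIGHUP, previous)


def send_hangup_to_self():
    # Stand-in for real work that gets interrupted by a hangup.
    os.kill(os.getpid(), signal.SIGHUP)
    # Keep the interpreter busy briefly so the handler gets a chance to run.
    for _ in range(1000):
        pass
```

Calling ``run_with_cleanup(send_hangup_to_self, cleanup)`` runs ``cleanup`` and returns a non-zero status, which the caller can pass to ``sys.exit``. Writing this kind of code by hand for every signal and every program is the repetition that the rest of this section is concerned with.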
The problem here is that we really do want (or need) to handle signals explicitly in order to avoid these issues. There are a few reasons many Python command-line programs don't actually do this, though:

* It's easy to write bad signal-handling code that is either not responsive enough, incorrect, or unreliable.
* Writing signal handlers for every Python program is a lot of work, and would result in a lot of duplicate code.
* It takes extra attention to handle signals explicitly for systems that support them, without breaking cross-platform compatibility.

sclapp makes it possible to handle signals correctly with minimal code changes, and eliminates the duplication of code that would be the result of writing signal handlers for every program.

Handling Signals The sclapp Way
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A more behaviorally correct yes clone could be written as follows::

    #!/usr/bin/env python
    '''yes.py: a yes(1) pseudo-clone in Python'''

    import sys, sclapp

    @sclapp.main_function
    def main(argv):
        if len(argv) > 1:
            string = ' '.join(argv[1:])
        else:
            string = 'y'

        while True:
            print string

    if __name__ == '__main__':
        sys.exit(main())

Using sclapp's default options, SIGINT, SIGPIPE, and SIGHUP trigger immediate exit (with no exception raised). This behavior is perfect for our yes clone.

For applications that require more sophisticated signal handling, sclapp's signal-handling strategy is to convert signals to exceptions. While signals can be difficult to deal with appropriately, exceptions are trivially handled in Python, which has language constructs to make them easy to work with.

If we needed to perform some cleanup prior to exit, the following Pythonic idiom would be more appropriate::

    >>> import sys, sclapp
    >>> @sclapp.main_function(
    ...     exit_signals = ('SIGPIPE', 'SIGINT', 'SIGHUP', 'SIGTERM')
    ... )
    ... def main(argv):
    ...     try:
    ...         do_something()
    ...     finally:
    ...         cleanup()

Since signals are mapped to exceptions, we can actually use a try...finally block to execute cleanup actions. This, of course, is the most appropriate way to perform cleanup actions in Python.

Note that signals can be specified by name as strings or by number as integers. If specified by name (as in the example above), any of the specified signals that do not exist (due to lack of support by the local system) will be silently ignored. However, unsupported signals specified by number will likely generate exceptions from calls to signal.signal. See the documentation for the signal module in the standard library for more information.

More Advanced Signal Handling Configurations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Of course, we may need to handle some signals differently than others, depending on the kind of program that we are writing. In general, sclapp considers each signal to be in one of the following four sets:

* ``notify_signals``
* ``exit_signals``
* ``default_signals``
* ``ignore_signals``

sclapp expects that callers will specify which signals are of each type at initialization time::

    >>> import sys, signal, sclapp
    >>> @sclapp.main_function(
    ...     notify_signals = (signal.SIGHUP, signal.SIGUSR1),
    ...     exit_signals = (signal.SIGINT, signal.SIGTERM, signal.SIGPIPE),
    ...     default_signals = ( ),
    ...     ignore_signals = (signal.SIGUSR2, signal.SIGALRM),
    ... )
    ... def main(argv):
    ...     try:
    ...         do_something()
    ...     except SignalError, e:
    ...         if e.signum == signal.SIGHUP:
    ...             handle_sighup()
    ...         elif e.signum == signal.SIGUSR1:
    ...             handle_sigusr1()
    ...         else:
    ...             raise

These signals are handled as follows:

``notify_signals``
    SignalError is raised asynchronously. The program should catch this exception and do something useful with it.

``exit_signals``
    ExitSignalError is raised asynchronously, and the program should exit. Preferably, this exception is handled only through the use of a try...finally block (or equivalent).
``default_signals``
    The signal is mapped to signal.SIG_DFL. Program behavior is system-specific. Many of the more common signals on many of the more common systems cause immediate program termination, with no exception raised.

``ignore_signals``
    The signal is mapped to signal.SIG_IGN. The signal is completely ignored.

Note: I need to be able to ignore signals through a critical section, without missing them altogether.

sclapp defines two exceptions that may be raised when signals are caught:

* SignalError
* ExitSignalError

Note that ExitSignalError does not inherit from SignalError, so a try...except block that handles SignalError exceptions will not intercept exit signals.

If a signal number is specified in more than one of the four categories, an AssertionError will be raised.

Protected Output
----------------

sclapp's protected output functionality was created to prevent improper program termination due to massive failure of standard I/O. For instance, suppose your program was run like this::

    $ python myprog.py 2>&1 | head -n2

The problem that occurs in this sort of scenario is simple, but can be destructive. After the first two lines of output, SIGPIPE is sent to the program. Presumably, this triggers the program to exit, but in the process of doing so, many programs will print some messages to stderr to tell the user which actions are being taken and the status of those actions. If enough output is generated, the program may be terminated without an exception being raised. This would be particularly bad if the program depends on an exception initiating some cleanup action.

The scheme used by sclapp to deal with this is to disable output to stderr or stdout if it is triggering EPIPE IOErrors. If SIGPIPE is mapped as an exit signal, sclapp waits until SIGPIPE has been caught before disabling the file.

Help and Version Options
------------------------

sclapp can automatically handle help and version command-line options for callers.
The option literals are not configurable; -h/--help and -v/--version are intercepted and handled appropriately. To take advantage of this functionality, the ``doc`` and ``version`` keyword arguments to main_function/mainWrapper must be specified.

Programs needing more sophisticated command-line option handling should probably use the optparse module from the standard library.

User-Friendly Handling of Uncaught Exceptions
---------------------------------------------

Uncaught exceptions are bugs, and sclapp handles them by printing a message to stderr indicating that. The exact text of that message can be customized by passing a template string as the ``bug_message`` argument to main_function or mainWrapper. If this is left unspecified, sclapp uses a default message (let an exception fly to get a peek at that). See Customizing Messages, below, for more information on template strings.

Error Reporting (CriticalError, UsageError)
-------------------------------------------

Command-line programs have a few different mechanisms by which they notify the user of failure conditions. In many cases, a message indicating the nature of the error that occurred should be printed to sys.stderr, and the program should exit with a non-zero return code.

Since this functionality is a common requirement of command-line programs, sclapp provides a few exceptions to assist:

``CriticalError``
    Indicates that an irrecoverable error that is not a usage error has occurred, and the program must terminate.

``UsageError``
    Indicates that the user's specification of runtime parameters was erroneous.

These exceptions are caught and handled by the main-wrapping code, so callers should raise them to indicate errors. Note that UsageError inherits from CriticalError.

CriticalError exceptions accept two optional positional arguments: an exit code, and an error message. If the message is omitted, none will be printed at exit. If the exit code is omitted, the program will exit with zero status.
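
The behavior just described for ``CriticalError`` can be pictured with a small stand-alone sketch. The ``CriticalError`` class and the ``run_main`` function below are hypothetical stand-ins written for illustration, not sclapp's own code. They only mirror the documented arguments (an optional exit code followed by an optional message), and ``main`` is an invented example.

```python
import sys


class CriticalError(Exception):
    """Illustrative stand-in: optional exit code, then optional message."""

    def __init__(self, code=0, message=None):
        # Defaults mirror the text above: no message, zero exit status.
        Exception.__init__(self, code, message)
        self.code = code
        self.message = message


def run_main(main, argv, stderr=None):
    """Call main(argv) and translate a CriticalError into an exit status."""
    if stderr is None:
        stderr = sys.stderr
    try:
        return main(argv) or 0
    except CriticalError as error:
        # Print the message only if one was supplied.
        if error.message is not None:
            stderr.write('%s\n' % error.message)
        return error.code


def main(argv):
    # Invented failure condition, for demonstration only.
    if 'broken' in argv[1:]:
        raise CriticalError(2, 'resource is unavailable')
    return 0
```

With these stand-ins, ``run_main(main, ['prog', 'broken'])`` writes the message to stderr and returns 2, while ``run_main(main, ['prog'])`` returns 0.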
UsageError exceptions accept a single optional positional argument: an error message. If this error message is not specified, the program will exit with zero status and simply print the usage information for the program, if possible. Otherwise, the message will be printed, the usage information for the program will be printed, and the program will exit with the specified exit status.

Note that sclapp's main-wrapping code does not actually call the sys.exit function. Instead, the main function's return value will be the appropriate exit status, and the caller should call sys.exit itself.

Daemonization
-------------

Daemonization (on UNIX-like systems) is a technique that causes programs to be run in the background, completely detached from a terminal. It is generally accepted that the following four steps must be taken to properly daemonize:

1. Set the current working directory to the "/" directory.
2. Set the current file creation mode mask to 0.
3. Close all open files (1024).
4. Redirect standard I/O streams to "/dev/null".

For further discussion of this, see http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/278731.

To this end, the keyword argument ``daemonize`` may be given to the main_function decorator (or the mainWrapper function) and sclapp will daemonize the program before passing control to the caller's main function.

Transparent Encoding and Decoding of Standard IO
------------------------------------------------

By default, sclapp will provide conversion to and from the preferred character encoding (as determined by locale.getpreferredencoding) for sys.stdin, sys.stdout, and sys.stderr. That means that unicode objects may be printed directly to sys.stdout and sys.stderr, and the output will be encoded appropriately. Characters read from sys.stdin will likewise be decoded to unicode strings.

There may be some applications for which this behavior would be undesirable.
To disable it, pass False for one or more of the following keyword arguments to sclapp's main_function or mainWrapper:

* decode_stdin
* encode_stdout
* encode_stderr

In particular, automatic decoding of sys.stdin causes problems for the built-in functions input and raw_input. Thus, when this decoding feature is engaged, the alternative functions provided by sclapp (see sclapp.stdio_encoding, below) should be used.

Command-Line Argument Decoding
------------------------------

sclapp will automatically decode the command-line arguments to Unicode strings. If the ``decode_argv`` argument to ``main_function`` or ``mainWrapper`` is True, the members of the ``argv`` argument passed to your main function will be Unicode strings instead of byte strings. This feature is on by default.

Customizing Messages
--------------------

sclapp uses string templates to format messages. Several arguments to the ``mainWrapper`` function represent customizable messages that the user may see on a variety of occasions:

``version_message``
    Used for handling the --version command-line option.

``doc``
    Printed to the screen when handling the --help command-line option.

``bug_message``
    Printed to the screen when an unhandled exception is caught.

Each of these messages can be customized with the following template substitutions:

``name``
    The name of the program.

``author``
    The author of the program. Suggested format includes the author's e-mail address, like "Forest Bond ".

``version``
    A string representing the version of the program, like "0.6.3".

``traceback``
    Traceback for the uncaught exception. Only useful for ``bug_message``.

So, to customize these messages, use template-style string substitution. For instance::

    >>> @sclapp.main_function(
    ...     version = '0.2.5',
    ...     author = 'Forest Bond ',
    ...     bug_message = '''\
    ... Ouch!  This looks like a bug!
    ...
    ... ${traceback}
    ...
    ... Program version: ${version}
    ...
    ... Please send the full text of this message in an e-mail to
    ... ${author}.
    ... '''
    ... )
    ... def main(argv):
    ...     return 0

See the documentation for template strings (string.Template) in the standard library for more information on syntax.

Modules
=======

sclapp.daemonize
----------------

This module contains a function of the same name that will cause the caller's process to become a daemon. See the section Daemonization, above.

sclapp.pipes
------------

I implemented some strange functions that make it possible to run Python functions in sub-processes and connect their standard I/O streams together with pipes. At the very least, they are useful for testing behavior with pipes.

The functions::

    pipeFns(fns, argses = None, kwargses = None, pipe_out = True, pipe_err = False, stdin = None, stdout = None, stderr = None)

The functions in the list ``fns`` are piped into each other in order from first to last. The positional and keyword arguments for each function are specified as arguments ``argses`` and ``kwargses``, respectively. ``pipe_out`` and ``pipe_err`` are booleans that determine the source of the input stream for the following function. Arguments ``stdin``, ``stdout``, and ``stderr`` may be specified as for ``redirection.redirectFds`` in order to change the default standard I/O streams for the sub-processes in the pipe.

sclapp.processes
----------------

This module contains a few classes for managing background processes. It is often useful to run a program or function in the background.

The classes:

BackgroundFunction(self, function, args, kwargs, stdin = None, stdout = None, stderr = None)
    This class can be used to manage a Python function running in a sub-process. ``function`` is the Python function to call; ``args`` and ``kwargs`` are the arguments and keyword arguments (respectively) to pass to the function.

BackgroundCommand(self, command, args, stdin = None, stdout = None, stderr = None)
    This class can be used to manage an external program running in a sub-process.
    The arguments ``command`` and ``args`` should specify the command to run and the arguments to pass to it, as for ``os.execvp``.

For both classes, the arguments ``stdin``, ``stdout``, and ``stderr`` are as for ``redirection.redirectFds``, and the sub-process's standard I/O streams will be redirected as indicated prior to launching the caller's function or command.

sclapp.redirection
------------------

Implements a single function:

``redirectFds(stdin = None, stdout = None, stderr = None)``
    The standard I/O streams will be immediately redirected as specified. Each argument may be one of the following:

    * None, indicating no redirection for that stream.
    * A file-like object supporting method ``fileno``.
    * An integer file descriptor.
    * A filename.

sclapp.services
---------------

``startService(pid_file_name, fn, args = None, kwargs = None)``

``stopService(pid_file_name)``

sclapp.shell
------------

This module implements some classes for interacting with ongoing shell sessions.

Warning: my testing with the version of bash distributed with Mac OS X 10.4 indicates that the readline support causes problems due to interference with disabling the tty echo function. If you are dealing with such a version of bash, you should pass the ``--noediting`` command-line option to bash in order to disable readline, and the ``--posix`` command-line option for more standards-compliant behavior.

Starting A Session
~~~~~~~~~~~~~~~~~~

A shell session is begun by instantiating the appropriate class. Generally, you will use the ``Shell`` class. The class initializer has no required arguments; however, the following optional keyword arguments may be specified:

shell (default: '/bin/sh')
    Path to the shell executable; must be Bourne compatible.

prompt (default: randomly generated prompt)
    The desired shell prompt. Be aware that the prompt is the only indication that a command has finished executing, so it is normally set to a reasonably long random string of alpha-numeric characters.
    See Searching For The Prompt, below.

failure_exceptions (default: True)
    Boolean specifying whether or not a CommandFailed exception should be raised when a command exits with a non-zero return code.

signal_exceptions (default: True)
    Boolean specifying whether or not a CommandSignalled exception should be raised when a command exits due to a signal.

trace (default: False)
    If True, commands are printed to stdout prior to being executed. This is for debugging.

timeout (default: 30)
    The initial timeout used by the underlying Pexpect object. Note that other timeout values may be used by various ``Shell`` methods as necessary, but the timeout will be restored after those temporary changes.

delaybeforesend (default: 100; defined by pexpect module)
    The number of milliseconds to delay before sending input to the shell process. See the pexpect documentation for more information.

Any additional keyword arguments are passed on to the underlying Pexpect object's initializer.

The ``SudoShell`` class provides the same functionality as the ``Shell`` class, but the shell session is started using the sudo command. Thus, the shell has root privileges. The ``SudoShell`` initializer takes an additional optional keyword argument, ``password``. If this argument is not specified and sudo requests a password, or if an incorrect password is specified, a ValueError is raised.

Executing Commands
~~~~~~~~~~~~~~~~~~

There are a few different ways that commands can be executed, depending upon what kind of behavior is desired. The following methods can be used:

``close()``
    Close the tty that the shell is attached to. This should always be called to clean up after the Shell instance.

``execute(cmd, *values)``
    Execute ``cmd``, using ``sclapp.shinterp.interpolate`` to interpolate ``values`` into ``cmd``. The return value is the tuple ``(status, output)``, indicating the exit status and output of the command.
``follow(cmd, *values, **kwargs)``
    ``cmd`` is executed and ``values`` interpolated as with ``execute``, but ``follow`` is actually a generator. Callers should iterate over the return value to handle each output character individually. For instance::

        output = ''
        for ch in follow('echo foo'):
            output = output + ch
        print output

    The above code would print the string 'foo\n' to the screen.

    ``follow`` and the following wrapper methods accept a keyword argument ``follow_input``. If True, the executed command is included in the resulting character stream. This argument defaults to False.

``followCallback(cmd, *values, **kwargs)``
    Like ``follow``, but rather than yielding each output character to the caller, ``followCallback`` requires a keyword argument, ``callback``, which should be used to pass a callback function that will be called once for each output character. The callback function should accept a single positional argument, the character being handled. Returns the exit status of the command.

``followWrite(cmd, *values, **kwargs)``
    Calls ``followCallback`` with a simple callback function that writes each output character to a file-like object. This object can be specified using optional keyword argument ``outfile``, which defaults to ``sys.stdout``. Returns the exit status of the command.

``followWriteReturn(cmd, *values, **kwargs)``
    Like ``followWrite``, but in addition to writing each output character to ``outfile``, the command output is also captured and returned as part of a two-tuple like that returned by ``execute``.

``interact(fitted = False)``
    Cause the shell to be connected to stdin/stdout/stderr so that the user can use the shell directly. If keyword argument ``fitted`` is set to True, the window size for the underlying tty object is kept in sync with the window size of stdout.

It has been observed that commands including lines longer than 4095 characters cause problems with some shells on some systems.
The methods above will refuse to execute such a command unless the optional keyword argument ``force`` is passed with a boolean True value. Otherwise, if such a command is passed for execution, a ValueError will be raised.

sclapp.shinterp
---------------

This module provides some simple functions for performing string interpolation with shell quoting of parameters. For instance:

>>> from sclapp import shinterp
>>> x, y = 'foo', 'bar'
>>> print shinterp.interpolate('cat ? ?', x, y)
cat 'foo' 'bar'

However, this function will also correctly handle double and single quotes:

>>> from sclapp import shinterp
>>> x, y = 'fo"o', "ba'r"
>>> print shinterp.interpolate('cat ? ?', x, y)
cat 'fo"o' 'ba'\''r'

To insert a literal question mark, use two of them:

>>> from sclapp import shinterp
>>> x = 'foo'
>>> print shinterp.interpolate('cat ? ??', x)
cat 'foo' ?

sclapp.stdio_encoding
---------------------

This module implements support for transparent character encoding for standard I/O. In addition, it contains a few utility functions related to this implementation.

Transparent decoding of standard input has been found to conflict with the built-in functions ``raw_input`` and ``input``. As a result, alternatives have been implemented that do not cause problems. These functions can be imported from this module, and have the same names as the functions they replace.

sclapp.termcontrol
------------------

Based on an ASPN Python Cookbook recipe written by Edward Loper, this module wraps some basic functionality provided by the curses module to provide easy access to some simple terminal features.

To use the module, the initialization function ``initializeTermControl`` must first be called.

Terminal capabilities are utilized by writing control strings to standard output. These control strings are accessible to callers via the dict ``sclapp.termcontrol.caps`` that is populated during initialization.
If a given terminal capability is not supported, the corresponding value in the ``caps`` dict will be the empty string. Thus, most applications will degrade gracefully when dealing with a terminal lacking capabilities.

The following keys are used to access terminal capabilities in the ``caps`` dict:

BOL
    Move the cursor to the beginning of the line.

UP
    Move the cursor up one line.

DOWN
    Move the cursor down one line.

LEFT
    Move the cursor left one column.

RIGHT
    Move the cursor right one column.

CLEAR_SCREEN
    Clear the screen.

CLEAR_EOL
    Clear to the end of the current line.

CLEAR_BOL
    Clear to the beginning of the current line.

CLEAR_EOS
    Clear to the end of the screen.

BOLD
    Use bold text.

BLINK
    Use blinking text.

DIM
    Use dim text.

REVERSE
    Use reverse-color text.

UNDERLINE
    Use underlined text.

NORMAL
    Use normal text (cancels previously set text color and style).

HIDE_CURSOR
    Hide the cursor.

SHOW_CURSOR
    Show the cursor.

The following dict keys correspond with various text colors. For each color, the same key can be used with the prefix "BG\_" to affect the background color instead of the foreground color.

* BLACK
* BLUE
* GREEN
* CYAN
* RED
* MAGENTA
* YELLOW
* WHITE

sclapp.util
-----------

A few miscellaneous functions:

safe_encode(s)
    Encodes the Unicode string s using the encoding returned by locale.getpreferredencoding() without ever raising a UnicodeEncodeError.

safe_decode(s)
    Decodes the byte string s to unicode using the encoding returned by locale.getpreferredencoding() without ever raising a UnicodeDecodeError.
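
Functions with this kind of never-raise behavior can be approximated with the ``errors`` argument accepted by the standard ``encode`` and ``decode`` methods. The sketch below shows one possible approach under that assumption; it is not sclapp's implementation. The names ``encode_lossy`` and ``decode_lossy`` are hypothetical, and characters that cannot be represented are replaced rather than preserved.

```python
import locale


def encode_lossy(text, encoding=None):
    """Encode text, substituting '?' for characters the encoding lacks."""
    if encoding is None:
        encoding = locale.getpreferredencoding()
    # 'replace' swaps unencodable characters for '?' instead of raising
    # UnicodeEncodeError.
    return text.encode(encoding, 'replace')


def decode_lossy(data, encoding=None):
    """Decode bytes, substituting U+FFFD for undecodable byte sequences."""
    if encoding is None:
        encoding = locale.getpreferredencoding()
    # 'replace' swaps undecodable bytes for U+FFFD instead of raising
    # UnicodeDecodeError.
    return data.decode(encoding, 'replace')
```

The optional ``encoding`` parameter is an addition made here so that the behavior can be tried with a fixed encoding such as 'ascii'; the functions described above take only the string.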