=========================
Library and Extension FAQ
=========================

General Library Questions
=========================

How do I find a module or application to perform task X?
--------------------------------------------------------

Check :ref:`the Library Reference <library-index>` to see if there's a relevant
standard library module.  (Eventually you'll learn what's in the standard
library and will be able to skip this step.)

For third-party packages, search the `Python Package Index
<http://pypi.python.org/pypi>`_ or try `Google <http://www.google.com>`_ or
another Web search engine.  Searching for "Python" plus a keyword or two for
your topic of interest will usually find something helpful.

Where is the math.py (socket.py, regex.py, etc.) source file?
-------------------------------------------------------------

If you can't find a source file for a module, it may be a built-in or
dynamically loaded module implemented in C, C++ or another compiled language.
In this case you may not have the source file, or it may be something like
mathmodule.c, somewhere in a C source directory (not on the Python path).

There are (at least) three kinds of modules in Python:

1) modules written in Python (.py);
2) modules written in C and dynamically loaded (.dll, .pyd, .so, .sl, etc);
3) modules written in C and linked with the interpreter; to get a list of these,
   type::

      import sys
      print(sys.builtin_module_names)

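
The three kinds of modules can usually be told apart at run time.  The
following is a minimal sketch, not a definitive classifier; the helper name
``describe_module`` is hypothetical, and it assumes that built-in modules appear
in ``sys.builtin_module_names`` while other modules expose a ``__file__``
attribute.

```python
import json
import sys


def describe_module(module):
    """Return a rough description of how a module is implemented."""
    name = module.__name__
    if name in sys.builtin_module_names:
        # Kind 3: compiled C code linked into the interpreter itself.
        return "built-in (linked into the interpreter)"
    path = getattr(module, "__file__", None)
    if path is None:
        return "no __file__ attribute; origin unknown"
    if path.endswith((".py", ".pyc", ".pyo")):
        # Kind 1: written in Python.
        return "Python source: " + path
    # Kind 2: most likely a dynamically loaded extension (.so, .pyd, ...).
    return "extension module: " + path


print(describe_module(sys))
print(describe_module(json))
```

Whether a given module such as ``math`` is built in or dynamically loaded
depends on how the interpreter was built, so the answer may differ between
platforms.
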
How do I make a Python script executable on Unix?
-------------------------------------------------

You need to do two things: the script file's mode must be executable and the
first line must begin with ``#!`` followed by the path of the Python
interpreter.

The first is done by executing ``chmod +x scriptfile`` or perhaps ``chmod 755
scriptfile``.

The second can be done in a number of ways.  The most straightforward way is to
write ::

   #!/usr/local/bin/python

as the very first line of your file, using the pathname for where the Python
interpreter is installed on your platform.

If you would like the script to be independent of where the Python interpreter
lives, you can use the "env" program.  Almost all Unix variants support the
following, assuming the Python interpreter is in a directory on the user's
$PATH::

   #!/usr/bin/env python

*Don't* do this for CGI scripts.  The $PATH variable for CGI scripts is often
very minimal, so you need to use the actual absolute pathname of the
interpreter.

Occasionally, a user's environment is so full that the /usr/bin/env program
fails; or there's no env program at all.  In that case, you can try the
following hack (due to Alex Rezinsky)::

   #! /bin/sh
   """:"
   exec python $0 ${1+"$@"}
   """

The minor disadvantage is that this defines the script's __doc__ string.
However, you can fix that by adding ::

   __doc__ = """...Whatever..."""

Is there a curses/termcap package for Python?
---------------------------------------------

.. XXX curses *is* built by default, isn't it?

For Unix variants: The standard Python source distribution comes with a curses
module in the ``Modules/`` subdirectory, though it's not compiled by default
(note that this is not available in the Windows distribution -- there is no
curses module for Windows).

The curses module supports basic curses features as well as many additional
functions from ncurses and SYSV curses such as colour, alternative character set
support, pads, and mouse support.  This means the module isn't compatible with
operating systems that only have BSD curses, but there don't seem to be any
currently maintained OSes that fall into this category.

For Windows: use `the consolelib module
<http://effbot.org/zone/console-index.htm>`_.

Is there an equivalent to C's onexit() in Python?
-------------------------------------------------

The :mod:`atexit` module provides a register function that is similar to C's
``onexit()``.

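
As a minimal sketch of how registration works (the ``cleanup`` function and the
``cleanup_log`` list are hypothetical names used only for illustration), a
callable and its arguments are passed to ``atexit.register()`` and are called
when the interpreter exits normally:

```python
import atexit

# Hypothetical record of what has been cleaned up, for illustration only.
cleanup_log = []


def cleanup(resource_name):
    """Pretend to release a resource and remember that we did."""
    cleanup_log.append(resource_name)


# cleanup("scratch directory") will be called at normal interpreter exit.
atexit.register(cleanup, "scratch directory")
```

Registered functions are not called if the process is killed by a signal that
Python does not handle or if ``os._exit()`` is used.
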
Why don't my signal handlers work?
----------------------------------

The most common problem is that the signal handler is declared with the wrong
argument list.  It is called as ::

   handler(signum, frame)

so it should be declared with two arguments::

   def handler(signum, frame):
       pass  # react to the signal here

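
A short sketch of a correctly declared handler follows.  The names ``received``
and ``install`` are hypothetical, and the choice of ``SIGINT`` is only an
assumption for the example; note that :func:`signal.signal` must be called from
the main thread.

```python
import signal

# Hypothetical list used to record which signals were delivered.
received = []


def handler(signum, frame):
    """Two-argument handler: the signal number and the current stack frame."""
    received.append(signum)


def install(signum=signal.SIGINT):
    """Install the handler and return whatever handler was set before."""
    return signal.signal(signum, handler)
```

After ``install()`` has been called, pressing Ctrl-C appends ``signal.SIGINT``
to ``received`` instead of raising :exc:`KeyboardInterrupt`.
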
How do I test a Python program or component?
--------------------------------------------

Python comes with two testing frameworks.  The :mod:`doctest` module finds
examples in the docstrings for a module and runs them, comparing the output with
the expected output given in the docstring.

The :mod:`unittest` module is a fancier testing framework modelled on Java and
Smalltalk testing frameworks.

To make testing easier, you should use good modular design in your program.
Your program should have almost all functionality
encapsulated in either functions or class methods -- and this sometimes has the
surprising and delightful effect of making the program run faster (because local
variable accesses are faster than global accesses).  Furthermore the program
should avoid depending on mutating global variables, since this makes testing
much more difficult to do.

The "global main logic" of your program may be as simple as ::

   if __name__ == "__main__":
       main_logic()

at the bottom of the main module of your program.

Once your program is organized as a tractable collection of functions and class
behaviours, you should write test functions that exercise the behaviours.  A test
suite that automates a sequence of tests can be associated with each module.
This sounds like a lot of work, but since Python is so terse and flexible it's
surprisingly easy.  You can make coding much more pleasant and fun by writing
your test functions in parallel with the "production code", since this makes it
easy to find bugs and even design flaws earlier.

"Support modules" that are not intended to be the main module of a program may
include a self-test of the module. ::

   if __name__ == "__main__":
       self_test()

Even programs that interact with complex external interfaces may be tested when
the external interfaces are unavailable by using "fake" interfaces implemented
in Python.

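
To make the two frameworks concrete, here is a minimal sketch that tests one
small function both ways.  The function ``average`` and the class
``AverageTest`` are hypothetical examples, not part of any library.

```python
import unittest


def average(values):
    """Return the arithmetic mean of a non-empty sequence.

    The example below is checked by doctest:

    >>> average([1, 2, 3])
    2.0
    """
    return sum(values) / float(len(values))


class AverageTest(unittest.TestCase):
    """The same function exercised through unittest."""

    def test_single_value(self):
        self.assertEqual(average([5]), 5.0)

    def test_empty_sequence(self):
        # Dividing by the length of an empty sequence must fail loudly.
        self.assertRaises(ZeroDivisionError, average, [])
```

If this were saved as a module, ``python -m doctest module.py`` would run the
docstring example and ``python -m unittest module`` would run the test case.
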
How do I create documentation from doc strings?
-----------------------------------------------

The :mod:`pydoc` module can create HTML from the doc strings in your Python
source code.  An alternative for creating API documentation purely from
docstrings is `epydoc <http://epydoc.sf.net/>`_.  `Sphinx
<http://sphinx.pocoo.org>`_ can also include docstring content.

How do I get a single keypress at a time?
-----------------------------------------

For Unix variants: There are several solutions.  It's straightforward to do this
using curses, but curses is a fairly large module to learn.  Here's a solution
without curses::

   import termios, fcntl, sys, os
   fd = sys.stdin.fileno()

   oldterm = termios.tcgetattr(fd)
   newattr = termios.tcgetattr(fd)
   newattr[3] = newattr[3] & ~termios.ICANON & ~termios.ECHO
   termios.tcsetattr(fd, termios.TCSANOW, newattr)

   oldflags = fcntl.fcntl(fd, fcntl.F_GETFL)
   fcntl.fcntl(fd, fcntl.F_SETFL, oldflags | os.O_NONBLOCK)

   try:
       while 1:
           try:
               c = sys.stdin.read(1)
               print("Got character %r" % c)
           except IOError:
               pass
   finally:
       # Always restore the terminal settings and file flags
       termios.tcsetattr(fd, termios.TCSAFLUSH, oldterm)
       fcntl.fcntl(fd, fcntl.F_SETFL, oldflags)

You need the :mod:`termios` and the :mod:`fcntl` module for any of this to work,
and I've only tried it on Linux, though it should work elsewhere.  In this code,
characters are read and printed one at a time.

:func:`termios.tcsetattr` turns off stdin's echoing and disables canonical mode.
:func:`fcntl.fcntl` is used to obtain stdin's file descriptor flags and modify
them for non-blocking mode.  Since reading stdin when it is empty results in an
:exc:`IOError`, this error is caught and ignored.

How do I program using threads?
-------------------------------

.. XXX it's _thread in py3k

Be sure to use the :mod:`threading` module and not the :mod:`thread` module.
The :mod:`threading` module builds convenient abstractions on top of the
low-level primitives provided by the :mod:`thread` module.

Aahz has a set of slides from his threading tutorial that are helpful; see
http://www.pythoncraft.com/OSCON2001/.

None of my threads seem to run: why?
------------------------------------

As soon as the main thread exits, all threads are killed.  Your main thread is
running too quickly, giving the threads no time to do any work.

A simple fix is to add a sleep to the end of the program that's long enough for
all the threads to finish::

   import threading, time

   def thread_task(name, n):
       for i in range(n): print("%s %s" % (name, i))

   for i in range(10):
       T = threading.Thread(target=thread_task, args=(str(i), i))
       T.start()

   time.sleep(10) # <----------------------------!

But now (on many platforms) the threads don't run in parallel, but appear to run
sequentially, one at a time!  The reason is that the OS thread scheduler doesn't
start a new thread until the previous thread is blocked.

A simple fix is to add a tiny sleep to the start of the run function::

   def thread_task(name, n):
       time.sleep(0.001) # <---------------------!
       for i in range(n): print("%s %s" % (name, i))

   for i in range(10):
       T = threading.Thread(target=thread_task, args=(str(i), i))
       T.start()

   time.sleep(10)

Instead of trying to guess a good delay value for :func:`time.sleep`,
it's better to use some kind of semaphore mechanism.  One idea is to use the
:mod:`Queue` module to create a queue object, let each thread append a token to
the queue when it finishes, and let the main thread read as many tokens from the
queue as there are threads.

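
The token idea can be sketched as follows.  This is only an illustration: the
names ``thread_task``, ``run_all`` and ``done`` are hypothetical, and the import
fallback assumes the module is called ``queue`` on Python 3 and ``Queue`` on
Python 2.

```python
import threading

try:
    import queue            # Python 3 name
except ImportError:
    import Queue as queue   # Python 2 name


def thread_task(name, n, done):
    """Do some work, then put one token on the queue to signal completion."""
    total = sum(range(n))
    done.put((name, total))


def run_all(count):
    """Start `count` threads and wait for exactly `count` tokens."""
    done = queue.Queue()
    threads = [threading.Thread(target=thread_task, args=(str(i), i, done))
               for i in range(count)]
    for t in threads:
        t.start()
    # get() blocks until a token is available, so no sleep() guess is needed.
    results = [done.get() for _ in threads]
    return sorted(results)


print(run_all(4))
```

Because the main thread blocks in ``get()`` once per worker, it continues only
after every thread has reported in, however long each one takes.
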
How do I parcel out work among a bunch of worker threads?
---------------------------------------------------------

Use the :mod:`Queue` module to create a queue containing a list of jobs.  The
:class:`~Queue.Queue` class maintains a list of objects with ``.put(obj)`` to
add an item to the queue and ``.get()`` to return an item.  The class will take
care of the locking necessary to ensure that each job is handed out exactly
once.

Here's a trivial example::

   import threading, Queue, time

   # The worker thread gets jobs off the queue.  When the queue is empty, it
   # assumes there will be no more work and exits.
   # (Realistically workers will run until terminated.)
   def worker():
       print('Running worker')
       time.sleep(0.1)
       while True:
           try:
               arg = q.get(block=False)
           except Queue.Empty:
               print('Worker %s queue empty' % threading.currentThread())
               break
           else:
               print('Worker %s running with argument %s'
                     % (threading.currentThread(), arg))
               time.sleep(0.5)

   q = Queue.Queue()

   # Start a pool of 5 workers
   for i in range(5):
       t = threading.Thread(target=worker, name='worker %i' % (i+1))
       t.start()

   # Begin adding work to the queue
   for i in range(50):
       q.put(i)

   # Give threads time to run
   print('Main thread sleeping')
   time.sleep(5)

When run, this will produce output similar to the following::

   Running worker
   Running worker
   Running worker
   Running worker
   Running worker
   Main thread sleeping
   Worker <Thread(worker 1, started)> running with argument 0
   Worker <Thread(worker 2, started)> running with argument 1
   Worker <Thread(worker 3, started)> running with argument 2
   Worker <Thread(worker 4, started)> running with argument 3
   Worker <Thread(worker 5, started)> running with argument 4
   Worker <Thread(worker 1, started)> running with argument 5
   ...

Consult the module's documentation for more details; the ``Queue`` class
provides a featureful interface.

What kinds of global value mutation are thread-safe?
----------------------------------------------------

A global interpreter lock (GIL) is used internally to ensure that only one
thread runs in the Python VM at a time.  In general, Python offers to switch
among threads only between bytecode instructions; how frequently it switches can
be set via :func:`sys.setcheckinterval`.  Each bytecode instruction, and
all the C implementation code reached from each instruction, is
therefore atomic from the point of view of a Python program.

In theory, this means an exact accounting requires an exact understanding of the
PVM bytecode implementation.  In practice, it means that operations on shared
variables of built-in data types (ints, lists, dicts, etc) that "look atomic"
really are.

For example, the following operations are all atomic (L, L1, L2 are lists, D,
D1, D2 are dicts, x, y are objects, i, j are ints)::

   L.append(x)
   L1.extend(L2)
   x = L[i]
   x = L.pop()
   L1[i:j] = L2
   L.sort()
   x = y
   x.field = y
   D[x] = y
   D1.update(D2)
   D.keys()

These aren't::

   i = i+1
   L.append(L[-1])
   L[i] = L[j]
   D[x] = D[x] + 1

Operations that replace other objects may invoke those other objects'
:meth:`__del__` method when their reference count reaches zero, and that can
affect things.  This is especially true for the mass updates to dictionaries and
lists.  When in doubt, use a mutex!

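
As an illustration of the "use a mutex" advice, the sketch below protects a
read-modify-write update (which is not atomic) with a
``threading.Lock``.  The ``Counter`` class and the ``run`` helper are
hypothetical names chosen for this example.

```python
import threading


class Counter(object):
    """A counter whose increment is safe to call from several threads."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # "self.value += 1" is a load, an add and a store; the lock makes
        # the three steps appear as one to other threads.
        with self._lock:
            self.value += 1


def run(workers, times):
    """Increment one shared counter from several threads and return it."""
    counter = Counter()

    def task():
        for _ in range(times):
            counter.increment()

    threads = [threading.Thread(target=task) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


print(run(4, 1000))
```

Without the lock, the final value could occasionally be lower than
``workers * times`` because two threads may read the same old value.
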
Can't we get rid of the Global Interpreter Lock?
------------------------------------------------

.. XXX mention multiprocessing
.. XXX link to dbeazley's talk about GIL?

The Global Interpreter Lock (GIL) is often seen as a hindrance to Python's
deployment on high-end multiprocessor server machines, because a multi-threaded
Python program effectively only uses one CPU, due to the insistence that
(almost) all Python code can only run while the GIL is held.

Back in the days of Python 1.5, Greg Stein actually implemented a comprehensive
patch set (the "free threading" patches) that removed the GIL and replaced it
with fine-grained locking.  Unfortunately, even on Windows (where locks are very
efficient) this ran ordinary Python code about twice as slow as the interpreter
using the GIL.  On Linux the performance loss was even worse because pthread
locks aren't as efficient.

Since then, the idea of getting rid of the GIL has occasionally come up but
nobody has found a way to deal with the expected slowdown, and users who don't
use threads would not be happy if their code ran at half the speed.  Greg's
free threading patch set has not been kept up-to-date for later Python versions.

This doesn't mean that you can't make good use of Python on multi-CPU machines!
You just have to be creative with dividing the work up between multiple
*processes* rather than multiple *threads*.  Judicious use of C extensions will
also help; if you use a C extension to perform a time-consuming task, the
extension can release the GIL while the thread of execution is in the C code and
allow other threads to get some work done.

It has been suggested that the GIL should be a per-interpreter-state lock rather
than truly global; interpreters then wouldn't be able to share objects.
Unfortunately, this isn't likely to happen either.  It would be a tremendous
amount of work, because many object implementations currently have global state.
For example, small integers and short strings are cached; these caches would
have to be moved to the interpreter state.  Other object types have their own
free list; these free lists would have to be moved to the interpreter state.
And so on.

And I doubt that it can even be done in finite time, because the same problem
exists for 3rd party extensions.  It is likely that 3rd party extensions are
being written at a faster rate than you can convert them to store all their
global state in the interpreter state.

And finally, once you have multiple interpreters not sharing any state, what
have you gained over running each interpreter in a separate process?

How do I delete a file? (And other file questions...)
-----------------------------------------------------

Use ``os.remove(filename)`` or ``os.unlink(filename)``; for documentation, see
the :mod:`os` module.  The two functions are identical; :func:`unlink` is simply
the name of the Unix system call for this function.

To remove a directory, use :func:`os.rmdir`; use :func:`os.mkdir` to create one.
``os.makedirs(path)`` will create any intermediate directories in ``path`` that
don't exist.  ``os.removedirs(path)`` will remove intermediate directories as
long as they're empty; if you want to delete an entire directory tree and its
contents, use :func:`shutil.rmtree`.

To rename a file, use ``os.rename(old_path, new_path)``.

To truncate a file, open it using ``f = open(filename, "r+")``, and use
``f.truncate(offset)``; offset defaults to the current seek position.  There's
also ``os.ftruncate(fd, offset)`` for files opened with :func:`os.open`, where
``fd`` is the file descriptor (a small integer).

The :mod:`shutil` module also contains a number of functions to work on files
including :func:`~shutil.copyfile`, :func:`~shutil.copytree`, and
:func:`~shutil.rmtree`.

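
The calls above can be tried safely inside a temporary directory.  The
following is a small sketch; the function name ``file_operations_demo`` and the
file and directory names are made up for the example.

```python
import os
import shutil
import tempfile


def file_operations_demo():
    """Create, rename, truncate and delete a file in a scratch directory."""
    root = tempfile.mkdtemp()
    nested = os.path.join(root, "a", "b")
    os.makedirs(nested)                    # creates "a" and "a/b"

    old_path = os.path.join(nested, "old.txt")
    new_path = os.path.join(nested, "new.txt")
    with open(old_path, "w") as f:
        f.write("0123456789")

    os.rename(old_path, new_path)
    with open(new_path, "r+") as f:
        f.truncate(4)                      # keep only the first 4 bytes
    size = os.path.getsize(new_path)

    os.remove(new_path)                    # delete the file
    os.rmdir(nested)                       # remove the now-empty "b"
    shutil.rmtree(root)                    # remove root and what is left
    return size, os.path.exists(root)


print(file_operations_demo())
```

``os.rmdir`` is used for the single empty directory, while ``shutil.rmtree``
removes the remaining tree regardless of its contents.
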
How do I copy a file?
---------------------

The :mod:`shutil` module contains a :func:`~shutil.copyfile` function.  Note
that on MacOS 9 it doesn't copy the resource fork and Finder info.

How do I read (or write) binary data?
-------------------------------------

To read or write complex binary data formats, it's best to use the :mod:`struct`
module.  It allows you to take a string containing binary data (usually numbers)
and convert it to Python objects; and vice versa.

For example, the following code reads two 2-byte integers and one 4-byte integer
in big-endian format from a file::

   import struct

   f = open(filename, "rb")  # Open in binary mode for portability
   s = f.read(8)
   x, y, z = struct.unpack(">hhl", s)

The '>' in the format string forces big-endian data; the letter 'h' reads one
"short integer" (2 bytes), and 'l' reads one "long integer" (4 bytes) from the
string.

For data that is more regular (e.g. a homogeneous list of ints or floats),
you can also use the :mod:`array` module.

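
The same format string works in both directions.  As a minimal sketch with
arbitrarily chosen sample values, the following packs three integers into eight
bytes and unpacks them again:

```python
import struct

# ">hhl": big-endian, two 2-byte shorts followed by one 4-byte long.
record = struct.pack(">hhl", 1, -2, 70000)

# With ">" the standard sizes are used, so the record is 2 + 2 + 4 bytes.
size = struct.calcsize(">hhl")

x, y, z = struct.unpack(">hhl", record)
print("%d bytes -> %d %d %d" % (size, x, y, z))
```

Using an explicit byte-order character such as ``>`` also fixes the field sizes,
which makes the layout independent of the platform's native C types.
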
I can't seem to use os.read() on a pipe created with os.popen(); why?
---------------------------------------------------------------------

:func:`os.read` is a low-level function which takes a file descriptor, a small
integer representing the opened file.  :func:`os.popen` creates a high-level
file object, the same type returned by the built-in :func:`open` function.
Thus, to read n bytes from a pipe p created with :func:`os.popen`, you need to
use ``p.read(n)``.

How do I run a subprocess with pipes connected to both input and output?
------------------------------------------------------------------------

.. XXX update to use subprocess

Use the :mod:`popen2` module.  For example::

   import popen2
   fromchild, tochild = popen2.popen2("command")
   tochild.write("input\n")
   tochild.flush()
   output = fromchild.readline()

Warning: in general it is unwise to do this because you can easily cause a
deadlock where your process is blocked waiting for output from the child while
the child is blocked waiting for input from you.  This can be caused because the
parent expects the child to output more text than it does, or it can be caused
by data being stuck in stdio buffers due to lack of flushing.  The Python parent
can of course explicitly flush the data it sends to the child before it reads
any output, but if the child is a naive C program it may have been written to
never explicitly flush its output, even if it is interactive, since flushing is
normally automatic.

Note that a deadlock is also possible if you use :func:`popen3` to read stdout
and stderr.  If one of the two is too large for the internal buffer (increasing
the buffer size does not help) and you ``read()`` the other one first, there is
a deadlock too.

Note on a bug in popen2: unless your program calls ``wait()`` or ``waitpid()``,
finished child processes are never removed, and eventually calls to popen2 will
fail because of a limit on the number of child processes.  Calling
:func:`os.waitpid` with the :data:`os.WNOHANG` option can prevent this; a good
place to insert such a call would be before calling ``popen2`` again.

In many cases, all you really need is to run some data through a command and get
the result back.  Unless the amount of data is very large, the easiest way to do
this is to write it to a temporary file and run the command with that temporary
file as input.  The standard module :mod:`tempfile` exports a ``mktemp()``
function to generate unique temporary file names. ::

   import tempfile
   import os

   class Popen3:
       """
       This is a deadlock-safe version of popen that returns
       an object with errorlevel, out (a string) and err (a string).
       (capturestderr may not work under windows.)
       Example: print(Popen3('grep spam','\n\nhere spam\n\n').out)
       """
       def __init__(self,command,input=None,capturestderr=None):
           outfile=tempfile.mktemp()
           command="( %s ) > %s" % (command,outfile)
           if input:
               infile=tempfile.mktemp()
               open(infile,"w").write(input)
               command=command+" <"+infile
           if capturestderr:
               errfile=tempfile.mktemp()
               command=command+" 2>"+errfile
           self.errorlevel=os.system(command) >> 8
           self.out=open(outfile,"r").read()
           os.remove(outfile)
           if input:
               os.remove(infile)
           if capturestderr:
               self.err=open(errfile,"r").read()
               os.remove(errfile)

Note that many interactive programs (e.g. vi) don't work well with pipes
substituted for standard input and output.  You will have to use pseudo ttys
("ptys") instead of pipes.  Or you can use a Python interface to Don Libes'
"expect" library.  A Python extension that interfaces to expect is called "expy"
and available from http://expectpy.sourceforge.net.  A pure Python solution that
works like expect is `pexpect <http://pypi.python.org/pypi/pexpect/>`_.

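
Where the :mod:`subprocess` module is available, its ``communicate()`` method
offers another way to pass data through a command without the deadlock described
above, because it feeds input and collects both output streams at the same time.
The sketch below is only an illustration: the helper ``run_filter`` is a
hypothetical name, and the child command is simply another Python interpreter
that upper-cases its input.

```python
import subprocess
import sys


def run_filter(argv, data):
    """Send `data` to the command's stdin; return (exit code, out, err)."""
    proc = subprocess.Popen(argv,
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    # communicate() writes the input, reads both pipes and waits for exit.
    out, err = proc.communicate(data)
    return proc.returncode, out, err


child = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]
code, out, err = run_filter(child, b"here spam\n")
print(code)
print(out)
```

Since ``communicate()`` holds all output in memory, this approach suits
moderate amounts of data, like the temporary-file technique.
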
How do I access the serial (RS232) port?
----------------------------------------

For Win32, POSIX (Linux, BSD, etc.), Jython:

   http://pyserial.sourceforge.net

For Unix, see a Usenet post by Mitch Chapman:

   http://groups.google.com/groups?selm=34A04430.CF9@ohioee.com

Why doesn't closing sys.stdout (stdin, stderr) really close it?
---------------------------------------------------------------

Python file objects are a high-level layer of abstraction on top of C streams,
which in turn are a medium-level layer of abstraction on top of (among other
things) low-level C file descriptors.

For most file objects you create in Python via the built-in ``file``
constructor, ``f.close()`` marks the Python file object as being closed from
Python's point of view, and also arranges to close the underlying C stream.
This also happens automatically in ``f``'s destructor, when ``f`` becomes
garbage.

But stdin, stdout and stderr are treated specially by Python, because of the
special status also given to them by C.  Running ``sys.stdout.close()`` marks
the Python-level file object as being closed, but does *not* close the
associated C stream.

To close the underlying C stream for one of these three, you should first be
sure that's what you really want to do (e.g., you may confuse extension modules
trying to do I/O).  If it is, use ``os.close``::

   os.close(0)   # close C's stdin stream
   os.close(1)   # close C's stdout stream
   os.close(2)   # close C's stderr stream

Network/Internet Programming
============================

What WWW tools are there for Python?
------------------------------------

See the chapters titled :ref:`internet` and :ref:`netdata` in the Library
Reference Manual.  Python has many modules that will help you build server-side
and client-side web systems.

.. XXX check if wiki page is still up to date

A summary of available frameworks is maintained by Paul Boddie at
http://wiki.python.org/moin/WebProgramming .

Cameron Laird maintains a useful set of pages about Python web technologies at
http://phaseit.net/claird/comp.lang.python/web_python.

How can I mimic CGI form submission (METHOD=POST)?
--------------------------------------------------

I would like to retrieve web pages that are the result of POSTing a form.  Is
there existing code that would let me do this easily?

Yes.  Here's a simple example that uses httplib::

   #!/usr/local/bin/python

   import httplib, sys

   ### build the query string
   qs = "First=Josephine&MI=Q&Last=Public"

   ### connect and send the server a path
   httpobj = httplib.HTTP('www.some-server.out-there', 80)
   httpobj.putrequest('POST', '/cgi-bin/some-cgi-script')
   ### now generate the rest of the HTTP headers...
   httpobj.putheader('Accept', '*/*')
   httpobj.putheader('Connection', 'Keep-Alive')
   httpobj.putheader('Content-type', 'application/x-www-form-urlencoded')
   httpobj.putheader('Content-length', '%d' % len(qs))
   httpobj.endheaders()
   httpobj.send(qs)
   ### find out what the server said in response...
   reply, msg, hdrs = httpobj.getreply()
   if reply != 200:
       sys.stdout.write(httpobj.getfile().read())

Note that in general for URL-encoded POST operations, query strings must be
quoted by using :func:`urllib.quote`.  For example to send name="Guy Steele,
Jr."::

   >>> from urllib import quote
   >>> x = quote("Guy Steele, Jr.")
   >>> x
   'Guy%20Steele%2C%20Jr.'
   >>> query_string = "name="+x
   >>> query_string
   'name=Guy%20Steele%2C%20Jr.'

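
For a form with several fields, quoting each value by hand is error-prone.  As a
hedged sketch, the ``urlencode`` function can build the whole query string from
name/value pairs; the import fallback below assumes it lives in ``urllib.parse``
on Python 3 and in ``urllib`` on Python 2, and the field names are just sample
data.

```python
try:
    from urllib.parse import urlencode   # Python 3 location
except ImportError:
    from urllib import urlencode         # Python 2 location

# A list of pairs keeps the fields in a predictable order.
fields = [("First", "Josephine"), ("MI", "Q"), ("name", "Guy Steele, Jr.")]

# urlencode() quotes each name and value and joins them with "&".
qs = urlencode(fields)
print(qs)
```

Note that ``urlencode`` encodes spaces as ``+`` rather than ``%20``; both forms
are accepted for ``application/x-www-form-urlencoded`` data.
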
What module should I use to help with generating HTML?
------------------------------------------------------

.. XXX add modern template languages

There are many different modules available:

* HTMLgen is a class library of objects corresponding to all the HTML 3.2 markup
  tags.  It's used when you are writing in Python and wish to synthesize HTML
  pages for generating a web or for CGI forms, etc.

* DocumentTemplate and Zope Page Templates are two different systems that are
  part of Zope.

* Quixote's PTL uses Python syntax to assemble strings of text.

Consult the `Web Programming wiki pages
<http://wiki.python.org/moin/WebProgramming>`_ for more links.

How do I send mail from a Python script?
----------------------------------------

Use the standard library module :mod:`smtplib`.

Here's a very simple interactive mail sender that uses it.  This method will
work on any host that supports an SMTP listener. ::

   import sys, smtplib

   fromaddr = raw_input("From: ")
   toaddrs  = raw_input("To: ").split(',')
   print("Enter message, end with ^D:")
   msg = ''
   while True:
       line = sys.stdin.readline()
       if not line:
           break
       msg += line

   # The actual mail send
   server = smtplib.SMTP('localhost')
   server.sendmail(fromaddr, toaddrs, msg)
   server.quit()

A Unix-only alternative uses sendmail.  The location of the sendmail program
varies between systems; sometimes it is ``/usr/lib/sendmail``, sometimes
``/usr/sbin/sendmail``.  The sendmail manual page will help you out.  Here's
some sample code::

   import os

   SENDMAIL = "/usr/sbin/sendmail"  # sendmail location
   p = os.popen("%s -t -i" % SENDMAIL, "w")
   p.write("To: receiver@example.com\n")
   p.write("Subject: test\n")
   p.write("\n")  # blank line separating headers from body
   p.write("Some text\n")
   p.write("some more text\n")
   sts = p.close()
   if sts != 0:
       print("Sendmail exit status %s" % sts)

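
Both approaches need a message that starts with headers, followed by a blank
line and the body.  As a sketch under the assumption that the :mod:`email`
package is available, the message can be assembled with ``MIMEText`` instead of
by hand; ``build_message`` and the addresses are hypothetical.

```python
from email.mime.text import MIMEText


def build_message(fromaddr, toaddrs, subject, body):
    """Return a text message with From, To and Subject headers set."""
    msg = MIMEText(body)
    msg["From"] = fromaddr
    msg["To"] = ", ".join(toaddrs)
    msg["Subject"] = subject
    return msg


message = build_message("sender@example.com",
                        ["a@example.com", "b@example.com"],
                        "test",
                        "Some text\nsome more text\n")
# as_string() yields the headers, a blank line, then the body.
print(message.as_string())
```

The resulting string could then be passed as the third argument of
``server.sendmail(fromaddr, toaddrs, message.as_string())``.
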
How do I avoid blocking in the connect() method of a socket?
------------------------------------------------------------

The :mod:`select` module is commonly used to help with asynchronous I/O on
sockets.

To prevent the TCP connect from blocking, you can set the socket to non-blocking
mode.  Then when you do the ``connect()``, you will either connect immediately
(unlikely) or get an exception that contains the error number as ``.errno``.
``errno.EINPROGRESS`` indicates that the connection is in progress, but hasn't
finished yet.  Different OSes will return different values, so you're going to
have to check what's returned on your system.

You can use the ``connect_ex()`` method to avoid creating an exception.  It will
just return the errno value.  To poll, you can call ``connect_ex()`` again later
-- 0 or ``errno.EISCONN`` indicate that you're connected -- or you can pass this
socket to :func:`select.select` to check if it's writable.

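
Putting these steps together, a non-blocking connect might look like the sketch
below.  The function name ``connect_nonblocking`` and the five-second default
timeout are assumptions made for the example, and the set of "in progress" error
codes is a guess that may need adjusting for a particular platform.

```python
import errno
import os
import select
import socket


def connect_nonblocking(address, timeout=5.0):
    """Connect to (host, port) without blocking in connect()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    err = sock.connect_ex(address)       # returns an errno value, no exception
    in_progress = (0, errno.EINPROGRESS, errno.EWOULDBLOCK, errno.EISCONN)
    if err not in in_progress:
        sock.close()
        raise socket.error(err, os.strerror(err))
    # The socket becomes writable once the connection attempt has finished.
    _, writable, _ = select.select([], [sock], [], timeout)
    if not writable:
        sock.close()
        raise socket.timeout("connect timed out")
    # SO_ERROR tells us whether the attempt succeeded (0) or failed.
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0:
        sock.close()
        raise socket.error(err, os.strerror(err))
    return sock
```

Checking ``SO_ERROR`` after ``select`` matters because a socket also becomes
writable when the connection attempt fails.
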
Are there any interfaces to database packages in Python?
--------------------------------------------------------

.. XXX remove bsddb in py3k, fix other module names

Python 2.3 includes the :mod:`bsddb` package which provides an interface to the
BerkeleyDB library.  Interfaces to disk-based hashes such as :mod:`DBM <dbm>`
and :mod:`GDBM <gdbm>` are also included with standard Python.

Support for most relational databases is available.  See the
`DatabaseProgramming wiki page
<http://wiki.python.org/moin/DatabaseProgramming>`_ for details.

How do you implement persistent objects in Python?
--------------------------------------------------

The :mod:`pickle` library module solves this in a very general way (though you
still can't store things like open files, sockets or windows), and the
:mod:`shelve` library module uses pickle and (g)dbm to create persistent
mappings containing arbitrary Python objects.  For better performance, you can
use the :mod:`cPickle` module.

A more awkward way of doing things is to use pickle's little sister, marshal.
The :mod:`marshal` module provides very fast ways to store noncircular basic
Python types to files and strings, and back again.  Although marshal does not do
fancy things like store instances or handle shared references properly, it does
run extremely fast.  For example loading a half megabyte of data may take less
than a third of a second.  This often beats doing something more complex and
general such as using gdbm with pickle/shelve.

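
A minimal sketch of both modules follows.  The helper names
``pickle_roundtrip`` and ``shelve_roundtrip`` and the sample record are
hypothetical; the shelf is created in a temporary directory so that nothing is
left behind.

```python
import os
import pickle
import shelve
import shutil
import tempfile


def pickle_roundtrip(obj):
    """Serialize an object to a string of bytes and load it back."""
    return pickle.loads(pickle.dumps(obj, 2))   # 2 selects a binary protocol


def shelve_roundtrip(obj):
    """Store an object in a shelf on disk, reopen the shelf, and read it."""
    directory = tempfile.mkdtemp()
    try:
        path = os.path.join(directory, "objects")
        db = shelve.open(path)
        try:
            db["record"] = obj
        finally:
            db.close()
        db = shelve.open(path)
        try:
            return db["record"]
        finally:
            db.close()
    finally:
        shutil.rmtree(directory)


record = {"owner": "guido", "balances": [1, 2.5], "tags": ("a", "b")}
print(pickle_roundtrip(record) == record)
print(shelve_roundtrip(record) == record)
```

A shelf behaves like a dictionary whose values are pickled on assignment, which
is why closing it properly matters.
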
Why is cPickle so slow?
-----------------------

.. XXX update this, default protocol is 2/3

The default format used by the pickle module is a slow one that results in
readable pickles.  A faster binary format is available, but making it the
default would break backward compatibility, so you have to request it
explicitly::

   import cPickle

   largeString = 'z' * (100 * 1024)
   myPickle = cPickle.dumps(largeString, protocol=1)

If my program crashes with a bsddb (or anydbm) database open, it gets corrupted. How come?
------------------------------------------------------------------------------------------

Databases opened for write access with the bsddb module (and often by the anydbm
module, since it will preferentially use bsddb) must explicitly be closed using
the ``.close()`` method of the database.  The underlying library caches database
contents which need to be converted to on-disk form and written.

If you have initialized a new bsddb database but not written anything to it
before the program crashes, you will often wind up with a zero-length file and
encounter an exception the next time the file is opened.

I tried to open Berkeley DB file, but bsddb produces bsddb.error: (22, 'Invalid argument'). Help! How can I restore my data?
----------------------------------------------------------------------------------------------------------------------------

Don't panic! Your data is probably intact.  The most frequent cause for the error
is that you tried to open an earlier Berkeley DB file with a later version of
the Berkeley DB library.

Many Linux systems now have all three versions of Berkeley DB available.  If you
are migrating from version 1 to a newer version use db_dump185 to dump a plain
text version of the database.  If you are migrating from version 2 to version 3
use db2_dump to create a plain text version of the database.  In either case,
use db_load to create a new native database for the latest version installed on
your computer.  If you have version 3 of Berkeley DB installed, you should be
able to use db2_load to create a native version 2 database.

You should move away from Berkeley DB version 1 files because the hash file code
contains known bugs that can corrupt your data.

Mathematics and Numerics
========================

How do I generate random numbers in Python?
-------------------------------------------

The standard module :mod:`random` implements a random number generator.  Usage
is simple::

   import random
   random.random()

This returns a random floating point number in the range [0, 1).

There are also many other specialized generators in this module, such as:

* ``randrange(a, b)`` chooses an integer in the range [a, b).
* ``uniform(a, b)`` chooses a floating point number in the range [a, b).
* ``normalvariate(mean, sdev)`` samples the normal (Gaussian) distribution.

Some higher-level functions operate on sequences directly, such as:

* ``choice(S)`` chooses a random element from a given sequence
* ``shuffle(L)`` shuffles a list in-place, i.e. permutes it randomly

There's also a ``Random`` class you can instantiate to create multiple
independent random number generators.

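
The sketch below exercises these functions through two separate ``Random``
instances.  The seed value 12345 and the variable names are arbitrary choices
for the example.

```python
import random

# Two independent generators started from the same (arbitrary) seed
# produce the same sequence of values.
gen_a = random.Random(12345)
gen_b = random.Random(12345)
first = [gen_a.random() for _ in range(3)]
second = [gen_b.random() for _ in range(3)]

die = gen_a.randrange(1, 7)          # integer in [1, 7), i.e. 1 to 6
ratio = gen_a.uniform(2.0, 5.0)      # float between 2.0 and 5.0

deck = list(range(10))
gen_a.shuffle(deck)                  # permutes the list in place
letter = gen_a.choice("abc")         # one element of the sequence

print(first == second)
print(sorted(deck) == list(range(10)))
```

Giving each component of a program its own ``Random`` instance keeps their
sequences from interfering with one another.
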