Title: Docutils Design Specification
Version: $Revision: 4163 $
Last-Modified: $Date: 2005-12-09 05:21:34 +0100 (Fri, 09 Dec 2005) $
Author: David Goodger <goodger@users.sourceforge.net>
Discussions-To: <doc-sig@python.org>
Content-Type: text/x-rst
Post-History: 13-Jun-2001

This PEP documents design issues and implementation details for
Docutils, a Python Docstring Processing System (DPS). The rationale
and high-level concepts of a DPS are documented in PEP 256, "Docstring
Processing System Framework" [#PEP-256]_. Also see PEP 256 for a
"Road Map to the Docstring PEPs".

Docutils is being designed modularly so that any of its components can
be replaced easily. In addition, Docutils is not limited to the
processing of Python docstrings; it processes standalone documents as
well, in several contexts.

No changes to the core Python language are required by this PEP. Its
deliverables consist of a package for the standard library and its

Docutils Project Model
======================

Project components and data flow::

                 +---------------------------+
                 | docutils.core.Publisher,  |
                 | docutils.core.publish_*() |
                 +---------------------------+

    +--------+       +-------------+       +--------+
    | READER | ----> | TRANSFORMER | ====> | WRITER |
    +--------+       +-------------+       +--------+

    +-------+   +--------+                 +--------+
    | INPUT |   | PARSER |                 | OUTPUT |
    +-------+   +--------+                 +--------+

The numbers above each component indicate the path a document's data
takes. Double-width lines between Reader & Parser and between
Transformer & Writer indicate that data sent along these paths should
be standard (pure & unextended) Docutils doc trees. Single-width
lines signify that internal tree extensions or completely unrelated
representations are possible, but they must be supported at both ends.

Publisher
---------

The ``docutils.core`` module contains a "Publisher" facade class and
several convenience functions: "publish_cmdline()" (for command-line
front ends), "publish_file()" (for programmatic use with file-like
I/O), and "publish_string()" (for programmatic use with string I/O).
The Publisher class encapsulates the high-level logic of a Docutils
system. The Publisher class has overall responsibility for
processing, controlled by the ``Publisher.publish()`` method:

1. Set up internal settings (may include config files & command-line
   options) and I/O objects.

2. Call the Reader object to read data from the source Input object
   and parse the data with the Parser object. A document object is
   returned.

3. Set up and apply transforms via the Transformer object attached to
   the document.

4. Call the Writer object which translates the document to the final
   output format and writes the formatted data to the destination
   Output object. Depending on the Output object, the output may be
   returned from the Writer, and then from the ``publish()`` method.

Calling the "publish" function (or instantiating a "Publisher" object)
with component names will result in default behavior. For custom
behavior (customizing component settings), create custom component
objects first, and pass *them* to the Publisher or ``publish_*``
convenience functions.

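
The steps of ``Publisher.publish()`` can be pictured with a toy
pipeline. This is only an illustrative sketch, not the Docutils
implementation: the ``Toy*`` classes, the ``upper_case`` transform, and
the dict-based "document" are hypothetical stand-ins invented for this
example::

    class ToyParser:
        def parse(self, text, document):
            # Populate the toy document: one child per non-blank line.
            document["children"] = [
                line for line in text.splitlines() if line.strip()]


    class ToyReader:
        def __init__(self, parser):
            self.parser = parser

        def read(self, source_text):
            document = {"children": []}   # fresh, empty "document tree"
            self.parser.parse(source_text, document)
            return document


    class ToyWriter:
        def write(self, document):
            return "\n".join(
                "<p>%s</p>" % child for child in document["children"])


    class ToyPublisher:
        def __init__(self, reader, writer, transforms=()):
            self.reader = reader
            self.writer = writer
            self.transforms = list(transforms)

        def publish(self, source_text):
            document = self.reader.read(source_text)   # read and parse
            for transform in self.transforms:          # apply transforms
                transform(document)
            return self.writer.write(document)         # translate, write


    def upper_case(document):
        # A trivial "transform": modifies the toy document in place.
        document["children"] = [c.upper() for c in document["children"]]


    publisher = ToyPublisher(ToyReader(ToyParser()), ToyWriter(),
                             [upper_case])
    output = publisher.publish("hello\n\nworld\n")

Passing ready-made component objects to the constructor mirrors the
"create custom component objects first" advice above; a real front end
would normally go through one of the ``publish_*`` convenience
functions instead.
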
Readers
-------

Readers understand the input context (where the data is coming from),
send the whole input or discrete "chunks" to the parser, and provide
the context to bind the chunks together back into a cohesive whole.

Each reader is a module or package exporting a "Reader" class with a
"read" method. The base "Reader" class can be found in the
``docutils/readers/__init__.py`` module.

Most Readers will have to be told what parser to use. So far (see the
list of examples below), only the Python Source Reader ("PySource";
still incomplete) will be able to determine the parser on its own.

Responsibilities:

* Get input text from the source I/O.

* Pass the input text to the parser, along with a fresh `document
  tree`_ root.

Examples:

* Standalone (Raw/Plain): Just read a text file and process it.
  The reader needs to be told which parser to use.

  The "Standalone Reader" has been implemented in module
  ``docutils.readers.standalone``.

* Python Source: See `Python Source Reader`_ below. This Reader is
  currently in development in the Docutils sandbox.

* Email: RFC-822 headers, quoted excerpts, signatures, MIME parts.

* PEP: RFC-822 headers, "PEP xxxx" and "RFC xxxx" conversion to URIs.
  The "PEP Reader" has been implemented in module
  ``docutils.readers.pep``; see PEP 287 and PEP 12.

* Wiki: Global reference lookups of "wiki links" incorporated into
  transforms. (CamelCase only or unrestricted?) Lazy

* Web Page: As standalone, but recognize meta fields as meta tags.
  Support for templates of some sort? (After ``<body>``, before

* FAQ: Structured "question & answer(s)" constructs.

* Compound document: Merge chapters into a book. Master manifest

Parsers
-------

Parsers analyze their input and produce a Docutils `document tree`_.
They don't know or care anything about the source or destination of

Each input parser is a module or package exporting a "Parser" class
with a "parse" method. The base "Parser" class can be found in the
``docutils/parsers/__init__.py`` module.

Responsibilities: Given raw input text and a doctree root node,
populate the doctree by parsing the input text.

Example: The only parser implemented so far is for the
reStructuredText markup. It is implemented in the
``docutils/parsers/rst/`` package.

The development and integration of other parsers is possible and

.. _transforms:

Transformer
-----------

The Transformer class, in ``docutils/transforms/__init__.py``, stores
transforms and applies them to documents. A transformer object is
attached to every new document tree. The Publisher_ calls
``Transformer.apply_transforms()`` to apply all stored transforms to
the document tree. Transforms change the document tree from one form
to another, add to the tree, or prune it. Transforms resolve
references and footnote numbers, process interpreted text, and do
other context-sensitive processing.

Some transforms are specific to components (Readers, Parser, Writers,
Input, Output). Standard component-specific transforms are specified
in the ``default_transforms`` attribute of component classes. After
the Reader has finished processing, the Publisher_ calls
``Transformer.populate_from_components()`` with a list of components
and all default transforms are stored.

Each transform is a class in a module in the ``docutils/transforms/``
package, a subclass of ``docutils.transforms.Transform``. Transform
classes each have a ``default_priority`` attribute which is used by
the Transformer to apply transforms in order (low to high). The
default priority can be overridden when adding transforms to the
Transformer.

Transformer responsibilities:

* Apply transforms to the document tree, in priority order.

* Store a mapping of component type name ('reader', 'writer', etc.) to
  component objects. These are used by certain transforms (such as
  "components.Filter") to determine suitability.

Transform responsibilities:

* Modify a doctree in-place, either purely transforming one structure
  into another, or adding new structures based on the doctree and/or

Examples of transforms (in the ``docutils/transforms/`` package):

* frontmatter.DocInfo: Conversion of document metadata (bibliographic

* references.AnonymousHyperlinks: Resolution of anonymous references
  to corresponding targets.

* parts.Contents: Generates a table of contents for a document.

* document.Merger: Combining multiple populated doctrees into one.
  (Not yet implemented or fully understood.)

* document.Splitter: Splits a document into a tree-structure of
  subdocuments, perhaps by section. It will have to transform
  references appropriately. (Neither implemented nor remotely
  understood.)

* components.Filter: Includes or excludes elements which depend on a
  specific Docutils component.

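
The priority ordering described in this section (transforms applied
from low to high ``default_priority``, with an optional override when
a transform is added) can be sketched as follows. ``ToyTransformer``,
``AppendA`` and ``AppendB`` are hypothetical classes written for this
illustration, and the list-based "document" is a stand-in; this is not
the code in ``docutils/transforms/__init__.py``::

    class ToyTransformer:
        """Store transform classes and apply them in priority order."""

        def __init__(self):
            self.transforms = []
            self.counter = 0

        def add_transform(self, transform_class, priority=None):
            if priority is None:
                priority = transform_class.default_priority
            # The counter keeps insertion order among equal priorities.
            self.transforms.append(
                (priority, self.counter, transform_class))
            self.counter += 1

        def apply_transforms(self, document):
            ordered = sorted(self.transforms,
                             key=lambda entry: entry[:2])
            for priority, _, transform_class in ordered:
                transform_class().apply(document)


    class AppendA:
        default_priority = 500

        def apply(self, document):
            document.append("A")


    class AppendB:
        default_priority = 100

        def apply(self, document):
            document.append("B")


    document = []
    transformer = ToyTransformer()
    transformer.add_transform(AppendA)      # default priority 500
    transformer.add_transform(AppendB)      # default priority 100
    transformer.apply_transforms(document)  # B runs before A

Calling ``add_transform(AppendA, priority=50)`` instead would make
``AppendA`` run first, which is the effect of overriding the default
priority.
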
Writers
-------

Writers produce the final output (HTML, XML, TeX, etc.). Writers
translate the internal `document tree`_ structure into the final data
format, possibly running Writer-specific transforms_ first.

By the time the document gets to the Writer, it should be in final
form. The Writer's job is simply (and only) to translate from the
Docutils doctree structure to the target format. Some small
transforms may be required, but they should be local and

Each writer is a module or package exporting a "Writer" class with a
"write" method. The base "Writer" class can be found in the
``docutils/writers/__init__.py`` module.

Responsibilities:

* Translate doctree(s) into specific output formats.

  - Transform references into format-native forms.

* Write the translated output to the destination I/O.

Examples:

* XML: Various forms, such as:

  - Docutils XML (an expression of the internal document tree,
    implemented as ``docutils.writers.docutils_xml``).

  - DocBook (being implemented in the Docutils sandbox).

* HTML (XHTML implemented as ``docutils.writers.html4css1``).

* PDF (a ReportLab interface is being developed in the Docutils
  sandbox).

* TeX (a LaTeX Writer is being implemented in the sandbox).

* Docutils-native pseudo-XML (implemented as
  ``docutils.writers.pseudoxml``, used for testing).

Input/Output
------------

I/O classes provide a uniform API for low-level input and output.
Subclasses will exist for a variety of input/output mechanisms.
However, they can be considered an implementation detail. Most
applications should be satisfied using one of the convenience
functions associated with the Publisher_.

I/O classes are currently in the preliminary stages; there's a lot of
work yet to be done. Issues:

* How to represent multi-file input (files & directories) in the API?

* How to represent multi-file output? Perhaps "Writer" variants, one
  for each output distribution type? Or Output objects with
  associated transforms?

Responsibilities:

* Read data from the input source (Input objects) or write data to the
  output destination (Output objects).

Examples of input sources:

* A single file on disk or a stream (implemented as
  ``docutils.io.FileInput``).

* Multiple files on disk (``MultiFileInput``?).

* Python source files: modules and packages.

* Python strings, as received from a client application
  (implemented as ``docutils.io.StringInput``).

Examples of output destinations:

* A single file on disk or a stream (implemented as
  ``docutils.io.FileOutput``).

* A tree of directories and files on disk.

* A Python string, returned to a client application (implemented as
  ``docutils.io.StringOutput``).

* No output; useful for programmatic applications where only a portion
  of the normal output is to be used (implemented as
  ``docutils.io.NullOutput``).

* A single tree-shaped data structure in memory.

* Some other set of data structures in memory.

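
The idea of a uniform API over different sources and destinations can
be illustrated with a few toy classes. The names ``ToyStringInput``,
``ToyStringOutput``, ``ToyNullOutput`` and ``copy_data`` are
hypothetical and are not the ``docutils.io`` classes listed above; the
sketch only assumes that inputs expose ``read()`` and outputs expose
``write()``::

    class ToyStringInput:
        """Input object wrapping a string supplied by the client."""

        def __init__(self, source):
            self.source = source

        def read(self):
            return self.source


    class ToyStringOutput:
        """Output object that hands the data back to the client."""

        def write(self, data):
            self.destination = data
            return data


    class ToyNullOutput:
        """Output object that discards whatever it is given."""

        def write(self, data):
            return None


    def copy_data(input_object, output_object):
        # The caller needs no knowledge of where data comes from or
        # where it goes; it relies only on read() and write().
        return output_object.write(input_object.read())


    returned = copy_data(ToyStringInput("text"), ToyStringOutput())
    discarded = copy_data(ToyStringInput("text"), ToyNullOutput())
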
Docutils Package Structure
==========================

* Package "docutils".

  - Module "__init__.py" contains: class "Component", a base class for
    Docutils components; class "SettingsSpec", a base class for
    specifying runtime settings (used by docutils.frontend); and class
    "TransformSpec", a base class for specifying transforms.

  - Module "docutils.core" contains facade class "Publisher" and
    convenience functions. See `Publisher`_ above.

  - Module "docutils.frontend" provides runtime settings support, for
    programmatic use and front-end tools (including configuration file
    support, and command-line argument and option processing).

  - Module "docutils.io" provides a uniform API for low-level input
    and output. See `Input/Output`_ above.

  - Module "docutils.nodes" contains the Docutils document tree
    element class library plus tree-traversal Visitor pattern base
    classes. See `Document Tree`_ below.

  - Module "docutils.statemachine" contains a finite state machine
    specialized for regular-expression-based text filters and parsers.
    The reStructuredText parser implementation is based on this
    module.

  - Module "docutils.urischemes" contains a mapping of known URI
    schemes ("http", "ftp", "mail", etc.).

  - Module "docutils.utils" contains utility functions and classes,
    including a logger class ("Reporter"; see `Error Handling`_
    below).

  - Package "docutils.parsers": markup parsers_.

    - Function "get_parser_class(parser_name)" returns a parser module
      by name. Class "Parser" is the base class of specific parsers.
      (``docutils/parsers/__init__.py``)

    - Package "docutils.parsers.rst": the reStructuredText parser.

    - Alternate markup parsers may be added.

    See `Parsers`_ above.

  - Package "docutils.readers": context-aware input readers.

    - Function "get_reader_class(reader_name)" returns a reader module
      by name or alias. Class "Reader" is the base class of specific
      readers. (``docutils/readers/__init__.py``)

    - Module "docutils.readers.standalone" reads independent document

    - Module "docutils.readers.pep" reads PEPs (Python Enhancement
      Proposals).

    - Module "docutils.readers.doctree" is used to re-read a
      previously stored document tree for reprocessing.

    - Readers to be added for: Python source code (structure &
      docstrings), email, FAQ, and perhaps Wiki and others.

    See `Readers`_ above.

  - Package "docutils.writers": output format writers.

    - Function "get_writer_class(writer_name)" returns a writer module
      by name. Class "Writer" is the base class of specific writers.
      (``docutils/writers/__init__.py``)

    - Package "docutils.writers.html4css1" is a simple HyperText
      Markup Language document tree writer for HTML 4.01 and CSS1.

    - Package "docutils.writers.pep_html" generates HTML from
      reStructuredText PEPs.

    - Package "docutils.writers.s5_html" generates S5/HTML slide

    - Package "docutils.writers.latex2e" writes LaTeX.

    - Package "docutils.writers.newlatex2e" also writes LaTeX; it is a

    - Module "docutils.writers.docutils_xml" writes the internal
      document tree in XML form.

    - Module "docutils.writers.pseudoxml" is a simple internal
      document tree writer; it writes indented pseudo-XML.

    - Module "docutils.writers.null" is a do-nothing writer; it is
      used for specialized purposes such as storing the internal

    - Writers to be added: HTML 3.2 or 4.01-loose, XML (various forms,
      such as DocBook), PDF, plaintext, reStructuredText, and perhaps

    Subpackages of "docutils.writers" contain modules and data files
    (such as stylesheets) that support the individual writers.

    See `Writers`_ above.

  - Package "docutils.transforms": tree transform classes.

    - Class "Transformer" stores transforms and applies them to
      document trees. (``docutils/transforms/__init__.py``)

    - Class "Transform" is the base class of specific transforms.
      (``docutils/transforms/__init__.py``)

    - Each module contains related transform classes.

    See `Transforms`_ above.

  - Package "docutils.languages": Language modules contain
    language-dependent strings and mappings. They are named for their
    language identifier (as defined in `Choice of Docstring Format`_
    below), converting dashes to underscores.

    - Function "get_language(language_code)", returns matching
      language module. (``docutils/languages/__init__.py``)

    - Modules: en.py (English), de.py (German), fr.py (French), it.py
      (Italian), sk.py (Slovak), sv.py (Swedish).

    - Other languages to be added.

* Third-party modules: "extras" directory. These modules are
  installed only if they're not already present in the Python

  - ``extras/optparse.py`` and ``extras/textwrap.py`` provide
    option parsing and command-line help; from Greg Ward's
    http://optik.sf.net/ project, included for convenience.

  - ``extras/roman.py`` contains Roman numeral conversion routines.

The ``tools/`` directory contains several front ends for common
Docutils processing. See `Docutils Front-End Tools`_ for details.

.. _Docutils Front-End Tools:
   http://docutils.sourceforge.net/docs/user/tools.html

Document Tree
=============

A single intermediate data structure is used internally by Docutils,
in the interfaces between components; it is defined in the
``docutils.nodes`` module. It is not required that this data
structure be used *internally* by any of the components, just
*between* components as outlined in the diagram in the `Docutils
Project Model`_ above.

Custom node types are allowed, provided that either (a) a transform
converts them to standard Docutils nodes before they reach the Writer
proper, or (b) the custom node is explicitly supported by certain
Writers, and is wrapped in a filtered "pending" node. An example of
condition (a) is the `Python Source Reader`_ (see below), where a
"stylist" transform converts custom nodes. The HTML ``<meta>`` tag is
an example of condition (b); it is supported by the HTML Writer but
not by others. The reStructuredText "meta" directive creates a
"pending" node, which contains knowledge that the embedded "meta" node
can only be handled by HTML-compatible writers. The "pending" node is
resolved by the ``docutils.transforms.components.Filter`` transform,
which checks that the calling writer supports HTML; if it doesn't, the
"pending" node (and enclosed "meta" node) is removed from the
document.

The document tree data structure is similar to a DOM tree, but with
specific node names (classes) instead of DOM's generic nodes. The
schema is documented in an XML DTD (eXtensible Markup Language
Document Type Definition), which comes in two parts:

* the Docutils Generic DTD, docutils.dtd_, and

* the OASIS Exchange Table Model, soextbl.dtd_.

The DTD defines a rich set of elements, suitable for many input and
output formats. The DTD retains all information necessary to
reconstruct the original input text, or a reasonable facsimile

See `The Docutils Document Tree`_ for details (incomplete).

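
The handling of a filtered "pending" node, as in the "meta" example
above, can be sketched with plain dictionaries. This is a toy model
under stated assumptions: the node layout, the ``"format"`` key, and
the ``filter_pending`` function are invented for illustration and do
not reproduce ``docutils.transforms.components.Filter``::

    def filter_pending(children, supported_formats):
        """Resolve "pending" wrappers for one particular writer."""
        result = []
        for node in children:
            if node["type"] != "pending":
                result.append(node)
            elif node["format"] in supported_formats:
                # The writer can handle the embedded node: unwrap it.
                result.append(node["node"])
            # Otherwise the pending node and the node it encloses
            # are both dropped from the tree.
        return result


    tree = [
        {"type": "paragraph", "text": "Hello"},
        {"type": "pending", "format": "html",
         "node": {"type": "meta", "name": "keywords",
                  "content": "docutils"}},
    ]
    html_tree = filter_pending(tree, {"html"})
    latex_tree = filter_pending(tree, {"latex"})

For an HTML-compatible writer the "meta" node survives; for any other
writer only the paragraph remains.
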
Error Handling
==============

When the parser encounters an error in markup, it inserts a system
message (DTD element "system_message"). There are five levels of
system messages:

* Level-0, "DEBUG": an internal reporting issue. There is no effect
  on the processing. Level-0 system messages are handled separately

* Level-1, "INFO": a minor issue that can be ignored. There is little
  or no effect on the processing. Typically level-1 system messages

* Level-2, "WARNING": an issue that should be addressed. If ignored,
  there may be minor problems with the output. Typically level-2
  system messages are reported but do not halt processing.

* Level-3, "ERROR": a major issue that should be addressed. If
  ignored, the output will contain unpredictable errors. Typically
  level-3 system messages are reported but do not halt processing.

* Level-4, "SEVERE": a critical error that must be addressed.
  Typically level-4 system messages are turned into exceptions which
  do halt processing. If ignored, the output will contain severe

Although the initial message levels were devised independently, they
have a strong correspondence to `VMS error condition severity
levels`_; the names in quotes for levels 1 through 4 were borrowed
from VMS. Error handling has since been influenced by the `log4j
project`_.

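
The "typical" behavior of the five levels can be sketched with two
thresholds. ``ToyReporter`` and ``ToySystemMessage`` are hypothetical
names, and the default thresholds (report from level 2, halt at
level 4) are assumptions chosen to match the list above; this is not
the "Reporter" class in ``docutils.utils``::

    LEVELS = ["DEBUG", "INFO", "WARNING", "ERROR", "SEVERE"]


    class ToySystemMessage(Exception):
        """Raised when a message reaches the halt threshold."""


    class ToyReporter:
        def __init__(self, report_level=2, halt_level=4):
            self.report_level = report_level
            self.halt_level = halt_level
            self.reported = []

        def system_message(self, level, text):
            message = "%s/%d: %s" % (LEVELS[level], level, text)
            if level >= self.halt_level:
                # Turned into an exception, which halts processing.
                raise ToySystemMessage(message)
            if level >= self.report_level:
                # Reported, but processing continues.
                self.reported.append(message)
            return message


    reporter = ToyReporter()
    reporter.system_message(1, "minor issue")        # not reported
    reporter.system_message(2, "should be fixed")    # reported
    try:
        reporter.system_message(4, "critical error")
        halted = False
    except ToySystemMessage:
        halted = True
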
Python Source Reader
====================

The Python Source Reader ("PySource") is the Docutils component that
reads Python source files, extracts docstrings in context, then
parses, links, and assembles the docstrings into a cohesive whole. It
is a major and non-trivial component, currently under experimental
development in the Docutils sandbox. High-level design issues are

This model will evolve over time, incorporating experience and

1. The PySource Reader uses an Input class to read in Python packages
   and modules, into a tree of strings.

2. The Python modules are parsed, converting the tree of strings into
   a tree of abstract syntax trees with docstring nodes.

3. The abstract syntax trees are converted into an internal
   representation of the packages/modules. Docstrings are extracted,
   as well as code structure details. See `AST Mining`_ below.
   Namespaces are constructed for lookup in step 6.

4. One at a time, the docstrings are parsed, producing standard

5. PySource assembles all the individual docstrings' doctrees into a
   Python-specific custom Docutils tree paralleling the
   package/module/class structure; this is a custom Reader-specific
   internal representation (see the `Docutils Python Source DTD`_).
   Namespaces must be merged: Python identifiers, hyperlink targets.

6. Cross-references from docstrings (interpreted text) to Python
   identifiers are resolved according to the Python namespace lookup
   rules. See `Identifier Cross-References`_ below.

7. A "Stylist" transform is applied to the custom doctree (by the
   Transformer_), custom nodes are rendered using standard nodes as
   primitives, and a standard document tree is emitted. See `Stylist
   Transforms`_ below.

8. Other transforms are applied to the standard doctree by the
   Transformer_.

9. The standard doctree is sent to a Writer, which translates the
   document into a concrete format (HTML, PDF, etc.).

10. The Writer uses an Output class to write the resulting data to its
    destination (disk file, directories and files, etc.).

AST Mining
----------

Abstract Syntax Tree mining code will be written (or adapted) that
scans a parsed Python module, and returns an ordered tree containing
the names, docstrings (including attribute and additional docstrings;
see below), and additional info (in parentheses below) of all of the
following objects:

* module attributes (+ initial values)
* classes (+ inheritance)
* class attributes (+ initial values)
* instance attributes (+ initial values)
* methods (+ parameters & defaults)
* functions (+ parameters & defaults)

(Extract comments too? For example, comments at the start of a module
would be a good place for bibliographic field lists.)

In order to evaluate interpreted text cross-references, namespaces for
each of the above will also be required.

See the python-dev/docstring-develop thread "AST mining", started on

Docstring Extraction Rules
--------------------------

a) If the "``__all__``" variable is present in the module being
   documented, only identifiers listed in "``__all__``" are
   examined for docstrings.

b) In the absence of "``__all__``", all identifiers are examined,
   except those whose names are private (names begin with "_" but
   don't begin and end with "__").

c) 1a and 1b can be overridden by runtime settings.

Docstrings are string literal expressions, and are recognized in
the following places within Python modules:

a) At the beginning of a module, function definition, class
   definition, or method definition, after any comments. This is
   the standard for Python ``__doc__`` attributes.

b) Immediately following a simple assignment at the top level of a
   module, class definition, or ``__init__`` method definition,
   after any comments. See `Attribute Docstrings`_ below.

c) Additional string literals found immediately after the
   docstrings in (a) and (b) will be recognized, extracted, and
   concatenated. See `Additional Docstrings`_ below.

d) @@@ 2.2-style "properties" with attribute docstrings? Wait for

Whenever possible, Python modules should be parsed by Docutils, not
imported. There are several reasons:

- Importing untrusted code is inherently insecure.

- Information from the source is lost when using introspection to
  examine an imported module, such as comments and the order of

- Docstrings are to be recognized in places where the byte-code
  compiler ignores string literal expressions (2b and 2c above),
  meaning importing the module will lose these docstrings.

Of course, standard Python parsing tools such as the "parser"
library module should be used.

When the Python source code for a module is not available
(i.e. only the ``.pyc`` file exists) or for C extension modules, to
access docstrings the module can only be imported, and any
limitations must be lived with.

Since attribute docstrings and additional docstrings are ignored by
the Python byte-code compiler, no namespace pollution or runtime bloat
will result from their use. They are not assigned to ``__doc__`` or
to any other attribute. The initial parsing of a module may take a
slight performance hit.

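
The rules for choosing which identifiers are examined (``__all__`` if
it is present, otherwise every name that is not private) can be
sketched in a few lines. The function names ``is_private`` and
``names_to_examine`` are hypothetical, and the sketch leaves out the
runtime-settings override::

    def is_private(name):
        """Private: begins with "_" but is not a "__dunder__" name."""
        is_dunder = name.startswith("__") and name.endswith("__")
        return name.startswith("_") and not is_dunder


    def names_to_examine(names, all_list=None):
        """Select the identifiers to examine for docstrings."""
        if all_list is not None:
            # "__all__" is present: only listed identifiers count.
            return [name for name in names if name in all_list]
        return [name for name in names if not is_private(name)]


    names = ["run", "_helper", "__version__", "__secret"]
    without_all = names_to_examine(names)
    with_all = names_to_examine(names, all_list=["_helper"])
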
Attribute Docstrings
''''''''''''''''''''

(This is a simplified version of PEP 224 [#PEP-224]_.)

A string literal immediately following an assignment statement is
interpreted by the docstring extraction machinery as the docstring of
the target of the assignment statement, under the following
conditions:

1. The assignment must be in one of the following contexts:

   a) At the top level of a module (i.e., not nested inside a compound
      statement such as a loop or conditional): a module attribute.

   b) At the top level of a class definition: a class attribute.

   c) At the top level of the "``__init__``" method definition of a
      class: an instance attribute. Instance attributes assigned in
      other methods are assumed to be implementation details. (@@@
      ``__new__`` methods?)

   d) A function attribute assignment at the top level of a module or

   Since each of the above contexts is at the top level (i.e., in the
   outermost suite of a definition), it may be necessary to place
   dummy assignments for attributes assigned conditionally or in a

2. The assignment must be to a single target, not to a list or a tuple

3. The form of the target:

   a) For contexts 1a and 1b above, the target must be a simple
      identifier (not a dotted identifier, a subscripted expression,
      or a sliced expression).

   b) For context 1c above, the target must be of the form
      "``self.attrib``", where "``self``" matches the "``__init__``"
      method's first parameter (the instance parameter) and "attrib"
      is a simple identifier as in 3a.

   c) For context 1d above, the target must be of the form
      "``name.attrib``", where "``name``" matches an already-defined
      function or method name and "attrib" is a simple identifier as
      in 3a.

Blank lines may be used after attribute docstrings to emphasize the
connection between the assignment and the docstring.

Examples::

    g = 'module attribute (module-global variable)'
    """This is g's docstring."""

    class AClass:

        c = 'class attribute'
        """This is AClass.c's docstring."""

        def __init__(self):
            """Method __init__'s docstring."""

            self.i = 'instance attribute'
            """This is self.i's docstring."""

    def f(x):
        """Function f's docstring."""
        return x**2

    f.a = 1
    """Function attribute f.a's docstring."""

Additional Docstrings
'''''''''''''''''''''

(This idea was adapted from PEP 216 [#PEP-216]_.)

Many programmers would like to make extensive use of docstrings for
API documentation. However, docstrings do take up space in the
running program, so some programmers are reluctant to "bloat up" their
code. Also, not all API documentation is applicable to interactive
environments, where ``__doc__`` would be displayed.

Docutils' docstring extraction tools will concatenate all string
literal expressions which appear at the beginning of a definition or
after a simple assignment. Only the first strings in definitions will
be available as ``__doc__``, and can be used for brief usage text
suitable for interactive sessions; subsequent string literals and all
attribute docstrings are ignored by the Python byte-code compiler and
may contain more extensive API information.

Example::

    def function(arg):
        """This is __doc__, function's docstring."""
        """
        This is an additional docstring, ignored by the byte-code
        compiler, but extracted by Docutils.
        """
        pass

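
A minimal sketch of how attribute docstrings and additional docstrings
could be recovered by parsing rather than importing is given here. It
uses the standard ``ast`` module instead of the "parser" module named
earlier, handles only simple-identifier targets, and joins
concatenated strings with a newline; those choices, and the helper
names ``is_string``, ``strings_from`` and ``attribute_docstrings``,
are assumptions of this example rather than the PySource design::

    import ast


    def is_string(statement):
        """True for an expression statement that is a string literal."""
        return (isinstance(statement, ast.Expr)
                and isinstance(statement.value, ast.Constant)
                and isinstance(statement.value.value, str))


    def strings_from(statements):
        """Collect the run of string literals that starts a suite."""
        strings = []
        for statement in statements:
            if not is_string(statement):
                break
            strings.append(statement.value.value)
        return strings


    def attribute_docstrings(body):
        """Map simple assignment targets to the strings after them."""
        found = {}
        for index, statement in enumerate(body):
            if (isinstance(statement, ast.Assign)
                    and len(statement.targets) == 1
                    and isinstance(statement.targets[0], ast.Name)):
                strings = strings_from(body[index + 1:])
                if strings:
                    name = statement.targets[0].id
                    found[name] = "\n".join(strings)
        return found


    source = "\n".join([
        "def function(arg):",
        '    """This is __doc__."""',
        '    """This is an additional docstring."""',
        "    pass",
        "g = 1",
        '"""This is g\'s docstring."""',
        "x, y = 1, 2",
        '"""Ignored: the target is a tuple."""',
    ])
    module = ast.parse(source)
    function_strings = strings_from(module.body[0].body)
    attribute_docs = attribute_docstrings(module.body)

Only the first entry of ``function_strings`` is what Python itself
would store in ``__doc__``; the remaining strings exist only in the
parsed source.
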
.. topic:: Issue: ``from __future__ import``

   This would break "``from __future__ import``" statements introduced
   in Python 2.1 for multiple module docstrings (main docstring plus
   additional docstring(s)). The Python Reference Manual specifies:

       A future statement must appear near the top of the module. The
       only lines that can appear before a future statement are:

       * the module docstring (if any),
       * comments,
       * blank lines, and
       * other future statements.

   1. Should we search for docstrings after a ``__future__``
      statement? Very ugly.

   2. Redefine ``__future__`` statements to allow multiple preceding

   3. Or should we not even worry about this? There probably
      shouldn't be ``__future__`` statements in production code, after
      all. Perhaps modules with ``__future__`` statements will simply
      have to put up with the single-docstring limitation.

Choice of Docstring Format
--------------------------

Rather than force everyone to use a single docstring format, multiple
input formats are allowed by the processing system. A special
variable, ``__docformat__``, may appear at the top level of a module
before any function or class definitions. Over time or through
decree, a standard format or set of formats should emerge.

A module's ``__docformat__`` variable only applies to the objects
defined in the module's file. In particular, the ``__docformat__``
variable in a package's ``__init__.py`` file does not apply to objects
defined in subpackages and submodules.

The ``__docformat__`` variable is a string containing the name of the
format being used, a case-insensitive string matching the input
parser's module or package name (i.e., the same name as required to
"import" the module or package), or a registered alias. If no
``__docformat__`` is specified, the default format is "plaintext" for
now; this may be changed to the standard format if one is ever

The ``__docformat__`` string may contain an optional second field,
separated from the format name (first field) by a single space: a
case-insensitive language identifier as defined in RFC 1766. A
typical language identifier consists of a 2-letter language code from
`ISO 639`_ (3-letter codes used only if no 2-letter code exists; RFC
1766 is currently being revised to allow 3-letter codes). If no
language identifier is specified, the default is "en" for English.
The language identifier is passed to the parser and can be used for
language-dependent markup features.

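
The two-field form of ``__docformat__`` and its defaults can be
sketched as follows. ``parse_docformat`` and ``language_module_name``
are hypothetical helpers written for this illustration; lowercasing is
used here as one simple way to get case-insensitive matching, and the
second helper applies the dashes-to-underscores naming of language
modules described in the package structure::

    def parse_docformat(value=None):
        """Split a __docformat__ string into (format, language)."""
        fields = value.split() if value else []
        if not fields:
            # No __docformat__ given: current defaults.
            return ("plaintext", "en")
        format_name = fields[0].lower()
        language = fields[1].lower() if len(fields) > 1 else "en"
        return (format_name, language)


    def language_module_name(language):
        """Module name for a language identifier, e.g. "pt-BR"."""
        return language.lower().replace("-", "_")


    explicit = parse_docformat("reStructuredText de")
    implied = parse_docformat("restructuredtext")
    default = parse_docformat()
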
Identifier Cross-References
---------------------------

In Python docstrings, interpreted text is used to classify and mark up
program identifiers, such as the names of variables, functions,
classes, and modules. If the identifier alone is given, its role is
inferred implicitly according to the Python namespace lookup rules.
For functions and methods (even when dynamically assigned),
parentheses ('()') may be included::

    This function uses `another()` to do its work.

For class, instance and module attributes, dotted identifiers are used
when necessary. For example (using reStructuredText markup)::

    class Keeper(Storer):

        """
        Extend `Storer`. Class attribute `instances` keeps track
        of the number of `Keeper` objects instantiated.
        """

        instances = 0
        """How many `Keeper` objects are there?"""

        def __init__(self):
            """
            Extend `Storer.__init__()` to keep track of instances.

            Keep count in `Keeper.instances`, data in `self.data`.
            """
            Storer.__init__(self)
            Keeper.instances += 1

            self.data = []
            """Store data in a list, most recent last."""

        def store_data(self, data):
            """
            Extend `Storer.store_data()`; append new `data` to a
            list (in `self.data`).
            """
            self.data.append(data)

Each of the identifiers quoted with backquotes ("`") will become a
reference to the definition of the identifier itself.

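
As a rough illustration of the first half of that process, the
following sketch pulls the interpreted-text identifiers out of a
docstring and looks each one up in a list of namespaces, innermost
first. The regular expression, the ``find_identifiers`` and
``resolve`` helpers, and the dict-based namespaces are all assumptions
of this example; the real Python lookup rules (and dotted names such
as ``self.data``) need more care than is shown::

    import re

    # Text between single backquotes, as in `Keeper.instances`.
    INTERPRETED_TEXT = re.compile(r"`([^`]+)`")


    def find_identifiers(docstring):
        """Return quoted identifiers, without any trailing "()"."""
        return [name[:-2] if name.endswith("()") else name
                for name in INTERPRETED_TEXT.findall(docstring)]


    def resolve(name, namespaces):
        """Return the first definition found, innermost first."""
        for namespace in namespaces:
            if name in namespace:
                return namespace[name]
        return None


    docstring = ("Extend `Storer.store_data()`; append new `data` "
                 "to a list (in `self.data`).")
    identifiers = find_identifiers(docstring)

    local_names = {"data": "parameter of store_data"}
    module_names = {"data": "module attribute", "Storer": "class"}
    target = resolve("data", [local_names, module_names])
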
Stylist Transforms
------------------

Stylist transforms are specialized transforms specific to the PySource
Reader. The PySource Reader doesn't have to make any decisions as to
style; it just produces a logically constructed document tree, parsed
and linked, including custom node types. Stylist transforms
understand the custom nodes created by the Reader and convert them
into standard Docutils nodes.

Multiple Stylist transforms may be implemented and one can be chosen
at runtime (through a "--style" or "--stylist" command-line option).
Each Stylist transform implements a different layout or style; thus
the name. They decouple the context-understanding part of the Reader
from the layout-generating part of processing, resulting in a more
flexible and robust system. This also serves to "separate style from
content", the SGML/XML ideal.

By keeping the piece of code that does the styling small and modular,
it becomes much easier for people to roll their own styles. The
"barrier to entry" is too high with existing tools; extracting the
stylist code will lower the barrier considerably.

==========================
 References and Footnotes
==========================

.. [#PEP-256] PEP 256, Docstring Processing System Framework, Goodger
   (http://www.python.org/peps/pep-0256.html)

.. [#PEP-224] PEP 224, Attribute Docstrings, Lemburg
   (http://www.python.org/peps/pep-0224.html)

.. [#PEP-216] PEP 216, Docstring Format, Zadka
   (http://www.python.org/peps/pep-0216.html)

.. _docutils.dtd:
   http://docutils.sourceforge.net/docs/ref/docutils.dtd

.. _soextbl.dtd:
   http://docutils.sourceforge.net/docs/ref/soextblx.dtd

.. _The Docutils Document Tree:
   http://docutils.sourceforge.net/docs/ref/doctree.html

.. _VMS error condition severity levels:
   http://www.openvms.compaq.com:8000/73final/5841/841pro_027.html

.. _log4j project: http://logging.apache.org/log4j/docs/index.html

.. _Docutils Python Source DTD:
   http://docutils.sourceforge.net/docs/dev/pysource.dtd

.. _ISO 639: http://www.loc.gov/standards/iso639-2/englangn.html

.. _Python Doc-SIG: http://www.python.org/sigs/doc-sig/

A SourceForge project has been set up for this work at
http://docutils.sourceforge.net/.

This document has been placed in the public domain.

This document borrows ideas from the archives of the `Python
Doc-SIG`_. Thanks to all members past & present.
