This document describes actual and proposed changes to the DjVu
format since the release of the DjVu3 specification by Lizardtech in


2- ESCAPE SEQUENCES IN ANNOTATION CHUNK STRINGS

The treatment of escape sequences in annotation chunk strings has
historically been slightly different in Lizardtech DjVu and
DjVuLibre. We expect that the DjVuLibre solution will eventually
become the standard.

Lizardtech DjVu uses the "old" rule described in section 8.3.4.2.
The sequence of characters BACKSLASH DOUBLEQUOTE represents a
DOUBLEQUOTE character without terminating the string. There are no
other escape sequences. All other UTF-8 characters are written
directly. The main drawback of this approach is the inability to
write a string containing the sequence of characters BACKSLASH
DOUBLEQUOTE since there is no way to escape the first BACKSLASH.
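
The old rule can be illustrated with a small reader. This is only a
sketch in Python, not the Lizardtech implementation; the function
name read_old_rule_string is hypothetical, and the input is assumed
to be already decoded text whose opening quote sits at index start.

```python
def read_old_rule_string(text, start=0):
    """Read a quoted string whose opening quote is at text[start].

    Returns the decoded value and the index just past the closing quote.
    """
    if text[start:start + 1] != '"':
        raise ValueError("expected an opening double quote")
    out = []
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch == "\\" and text[i + 1:i + 2] == '"':
            # The only escape: BACKSLASH DOUBLEQUOTE yields a DOUBLEQUOTE.
            out.append('"')
            i += 2
        elif ch == '"':
            # An unescaped DOUBLEQUOTE terminates the string.
            return "".join(out), i + 1
        else:
            # Everything else, lone backslashes included, is literal.
            out.append(ch)
            i += 1
    raise ValueError("unterminated string")
```

In this sketch a backslash placed directly before a doublequote is
always consumed as the escape, which is the limitation described
above.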

DjVuLibre introduced a more flexible scheme a few years ago.
Annotation strings are similar to strings in the C language.
Character sequences starting with a backslash have special meaning.
A BACKSLASH followed by "a", "b", "t", "n", "v", "f", "r", "\", or
a DOUBLEQUOTE stands for the ASCII character BEL, BS, HT, LF, VT,
FF, CR, BACKSLASH, or DOUBLEQUOTE, respectively. A BACKSLASH
followed by one to three digits stands for the byte whose octal code
is expressed by the digits. All other backslash sequences are
illegal. Non-printable ASCII characters must be escaped. Multibyte
characters should either be entered directly, or represented using
octal sequences.
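
The DjVuLibre rule above can be sketched as a byte-level decoder.
This is an illustration in Python, not the DjVuLibre code. The names
SIMPLE_ESCAPES and decode_c_like are hypothetical. Two points are
assumptions: "digits" is taken to mean octal digits 0 to 7, and an
octal value above 255 is rejected.

```python
# Single-character escapes: byte after the backslash -> decoded byte.
SIMPLE_ESCAPES = {
    ord("a"): 0x07,   # BEL
    ord("b"): 0x08,   # BS
    ord("t"): 0x09,   # HT
    ord("n"): 0x0A,   # LF
    ord("v"): 0x0B,   # VT
    ord("f"): 0x0C,   # FF
    ord("r"): 0x0D,   # CR
    ord("\\"): 0x5C,  # BACKSLASH
    ord('"'): 0x22,   # DOUBLEQUOTE
}


def decode_c_like(body):
    """Decode the bytes found between the enclosing double quotes."""
    out = bytearray()
    i = 0
    while i < len(body):
        if body[i] != 0x5C:          # not a backslash: copy as is
            out.append(body[i])
            i += 1
            continue
        i += 1                       # skip the backslash
        if i >= len(body):
            raise ValueError("dangling backslash")
        nxt = body[i]
        if nxt in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[nxt])
            i += 1
        elif 0x30 <= nxt <= 0x37:    # one to three octal digits
            j = i
            while j < len(body) and j < i + 3 and 0x30 <= body[j] <= 0x37:
                j += 1
            value = int(body[i:j], 8)
            if value > 0xFF:         # assumption: must fit in one byte
                raise ValueError("octal escape out of range")
            out.append(value)
            i = j
        else:
            raise ValueError("illegal escape sequence")
    return bytes(out)
```

For instance, decode_c_like(rb'caf\303\251') yields the two-byte
UTF-8 encoding of the last letter of "café", which shows how a
multibyte character can be written with octal sequences.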

DjVuLibre minimizes the compatibility problems by searching for
illegal escape sequences in the annotation chunk. If any illegal
sequence is found, the Lizardtech rule is used instead of the
DjVuLibre rule. It is expected that Lizardtech will at some point
adopt the improved DjVuLibre rules. We will then be able to state
that all DjVu files with version greater than some constant use the
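
The fallback test described above can be sketched as follows. This
is a Python illustration, not the DjVuLibre code. The names
has_illegal_escape and choose_rule are hypothetical, and the sketch
assumes that the bodies of all strings in the chunk have already
been extracted as bytes.

```python
import re

# A legal DjVuLibre escape: a backslash followed by one of
# a b t n v f r \ " or by one to three octal digits.
_LEGAL_ESCAPE = re.compile(rb'\\(?:[abtnvfr\\"]|[0-7]{1,3})')


def has_illegal_escape(body):
    """Return True if body contains a backslash sequence that is not legal."""
    i = 0
    while True:
        i = body.find(b"\\", i)
        if i < 0:
            return False
        match = _LEGAL_ESCAPE.match(body, i)
        if match is None:
            return True
        i = match.end()          # continue after the legal escape


def choose_rule(string_bodies):
    """Pick the decoding rule for a whole annotation chunk."""
    if any(has_illegal_escape(body) for body in string_bodies):
        return "lizardtech"
    return "djvulibre"
```

A single string such as rb'C:\dir' is enough to switch the whole
chunk to the Lizardtech rule in this sketch, because "\d" is not a
legal DjVuLibre escape.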

Each page in a DjVu document is identified by three strings named
the ID, the NAME, and the TITLE. The semantic distinction between
these three strings is no longer very clear. The current software
does not work consistently when these strings are different. See
the comment in the table, section 8.3.2.2 of the specification.

Recent versions of DjVuLibre still require that the ID and NAME
strings be equal. The TITLE string, however, can be different and
should be used to display friendly page names. The djvused program
now features a command 'set-page-title' to install a TITLE different
from the ID string. The djview program then displays and recognizes
these page titles in lieu of the sequential page numbers.
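
As an illustration, the following Python sketch builds a djvused
command line that sets a page title. The helper name
set_page_title_argv is hypothetical. The sketch assumes that djvused
accepts a double-quoted argument for 'set-page-title', and it relies
on the 'select' command and the '-e' and '-s' options of djvused.
Titles containing quotes, backslashes, or non-ASCII characters are
rejected so that no escaping question arises.

```python
def set_page_title_argv(djvu_path, page_number, title):
    """Build the argument list for a djvused call that sets a page title."""
    if page_number < 1:
        raise ValueError("page numbers start at 1")
    plain = title.isascii() and title.isprintable()
    if not title or not plain or '"' in title or "\\" in title:
        raise ValueError("title needs escaping; not handled by this sketch")
    # Select the page, then install the title; -s saves the file.
    script = 'select {}; set-page-title "{}"'.format(page_number, title)
    return ["djvused", "-e", script, "-s", djvu_path]
```

The returned list can be passed to subprocess.run. The file name
used when calling the helper is up to the caller.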


4- METADATA ANNOTATIONS

DjVuLibre introduced metadata annotations a few years ago.
Metadata entries for each page are represented by key/value pairs
located in a metadata directive in the annotation chunk.

The metadata directive has the form

    (metadata ... (key "value") ... )

Each entry is identified by a symbol <key> representing the nature
of the metadata entry. Typical keys include 'year', 'booktitle',
'editor', 'author', etc. We suggest using the same key names as
the BibTeX bibliography system. The string <"value"> represents
the value associated with the corresponding key.
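
A metadata directive of this form could be generated as sketched
below. This is a Python illustration only. The names quote_value
and metadata_directive are hypothetical, the key pattern is an
assumption about what a symbol may contain, and values are quoted
with the DjVuLibre escaping rules of section 2.

```python
import re

# Assumption: a key is a letter followed by letters, digits, '_' or '-'.
_KEY = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def quote_value(value):
    """Quote a value using the DjVuLibre (C-like) escaping rules."""
    out = []
    for ch in value:
        if ch in '\\"':
            out.append("\\" + ch)            # escape BACKSLASH, DOUBLEQUOTE
        elif ord(ch) < 0x20 or ord(ch) == 0x7F:
            out.append("\\%03o" % ord(ch))   # non-printable ASCII as octal
        else:
            out.append(ch)                   # other characters directly
    return '"' + "".join(out) + '"'


def metadata_directive(entries):
    """Format a dict of key/value strings as a metadata directive."""
    parts = ["metadata"]
    for key, value in entries.items():
        if not _KEY.fullmatch(key):
            raise ValueError("unsuitable key: %r" % key)
        parts.append("(%s %s)" % (key, quote_value(value)))
    return "(" + " ".join(parts) + ")"
```

With the entries author = Jane "JD" Doe and year = 2005, the sketch
produces

    (metadata (author "Jane \"JD\" Doe") (year "2005"))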


5- ANNOTATIONS FOR THE DOCUMENT

This scheme provides a simple way to specify metadata for each page.
But it is often useful to provide metadata that applies to the whole
document. Document-wide metadata is represented using one or
several metadata directives in the shared annotations chunk.

This scheme has a potential drawback. Since the shared annotations
chunk is included by all pages, the document-wide metadata also
appears as page metadata for all pages. This might not be adequate
for some uses. As a workaround, the djview4 viewer (in preparation)
only displays the page metadata that differs from the document
metadata.
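
The workaround amounts to a simple filter. The Python sketch below
is not the djview4 code. The name page_only_metadata is
hypothetical, and the sketch assumes that both sets of metadata have
been collected into dictionaries with one value per key.

```python
def page_only_metadata(page_meta, document_meta):
    """Keep the page entries that are absent from, or differ from,
    the document-wide metadata."""
    return {
        key: value
        for key, value in page_meta.items()
        if key not in document_meta or document_meta[key] != value
    }
```

A page that merely inherits the shared entries yields an empty
result, so nothing is displayed for it.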

A more definitive answer would be the definition of a document
annotation chunk located after the DIRM chunk and before any
component file. This space is already used by the NAVM chunk.
This is being considered.