.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.38.4.
.TH DH_SPLITPACKAGE "1" "June 2011" "dh_splitpackage 0.2.3" "User Commands"
.SH NAME
dh_splitpackage \- split monolithic installation directory into sub\-packages
.SH SYNOPSIS
\fBdh_splitpackage\fR [\-h] [\-v] [\-c config] [\-\-sourcedir dir] [\-n] [\-q]
.SH DESCRIPTION
.PP
.B
dh_splitpackage
works by interpreting the file 
.B
debian/splitpackage
, which must be a JSON file, describing the split operation to perform. The
actual operation is performed in stages:
.PP
First the debian/splitpackage file is read to check what packages should be
considered and what patterns should be associated with each package. The file
is also checked for syntax errors and structure. 
.PP
Then each file found under debian/tmp/ is classified to determine which
package it belongs to. If a file matches the patterns of more than one
package, an error describing the conflict is reported. If a file does not
match any pattern, it is automatically associated with the primary package,
if one is designated.
.PP
Finally the split operation is performed. dh_splitpackage walks over the list
of files and directories and copies each one (with the full prefix) to the
appropriate package. Directories are not copied recursively. To copy the
contents of a directory, specify a pattern that matches all of its
descendants.
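.PP
The classification stage described above can be sketched in Python. This is
an illustration only, not the actual implementation: the helper names are
hypothetical, and it assumes that a file belongs to a package when it matches
at least one inclusion pattern and no exclusion pattern of that package.

```python
import re


def to_regex(pattern):
    # Translation described under PATHNAME PATTERNS, done in a single pass.
    table = {"**/": "(.+/|)", "*": "[^/]*", ".": re.escape(".")}
    body = re.sub("[*][*]/|[*]|[.]", lambda m: table[m.group(0)], pattern)
    return re.compile("^" + body + "$")


def as_list(value):
    # Patterns may be given as a single string or as a list of strings.
    return [value] if isinstance(value, str) else list(value)


def matches_any(patterns, pathname):
    return any(to_regex(p).match(pathname) for p in as_list(patterns))


def classify(pathnames, config):
    """Map each pathname to a package name (hypothetical helper)."""
    primary = config.get("primary_package")
    result = {}
    for pathname in pathnames:
        owners = sorted(
            name
            for name, rules in config["packages"].items()
            # Assumption: exclusion patterns veto inclusion patterns.
            if matches_any(rules.get("inclusion_patterns", []), pathname)
            and not matches_any(rules.get("exclusion_patterns", []), pathname)
        )
        if len(owners) > 1:
            raise ValueError(
                "%s matches more than one package: %s"
                % (pathname, ", ".join(owners))
            )
        if owners:
            result[pathname] = owners[0]
        elif primary is not None:
            # Unclassified files fall back to the primary package.
            result[pathname] = primary
        # Without a primary package the file is left behind.
    return result
```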
.SH FILES
.SS "debian/splitpackage"
.PP
The primary configuration file for dh_splitpackage.
.PP
The configuration defines any number of packages, with inclusion and
exclusion rules for each package. In addition, one package can be designated
as primary so that all unclassified files belong to it automatically. The
inclusion and exclusion patterns may be given as a list of strings or as a
single string. Each string uses a special glob-like matching pattern
documented below. It's best to see some examples to understand how it works.
.PP
Note that JSON is very strict about missing or superfluous commas. If
processing your configuration file throws errors, a misplaced comma is the
most likely cause.
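.PP
As a rough illustration, the following Python sketch reads a configuration,
reports the position of a JSON syntax error (such as a stray comma) and
normalises every pattern entry to a list. The function name and the error
message are hypothetical and are not taken from dh_splitpackage itself.

```python
import json

PATTERN_KEYS = ("inclusion_patterns", "exclusion_patterns")


def load_config(text):
    """Parse configuration text and normalise pattern entries to lists."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        # Point at the offending position; stray commas usually show up here.
        raise ValueError(
            "invalid JSON at line %d, column %d: %s"
            % (exc.lineno, exc.colno, exc.msg)
        ) from exc
    for rules in config.get("packages", {}).values():
        for key in PATTERN_KEYS:
            # A single string and a one-element list mean the same thing.
            value = rules.get(key, [])
            rules[key] = [value] if isinstance(value, str) else list(value)
    return config
```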
.PP
See
.B
CONFIG FILE SCHEMA
and
.B
EXAMPLES
below for more information.
.PP
\fBNOTE:\fR there is a new additional format, similar to the well-known
INI-style files, that can express the same data as the current JSON format.
This format will be released in the future along with proper documentation.
.SH OPTIONS
.TP
\fB\-h\fR, \fB\-\-help\fR
Show help message and exit
.TP
\fB\-v\fR, \fB\-\-version\fR
Show program's version number and exit
.TP
\fB\-c\fR config, \fB\-\-conf\fR config
Use alternate configuration file.
.TP
\fB\-\-sourcedir\fR dir
Use alternate source directory.
.TP
\fB\-n\fR, \fB\-\-dry\-run\fR
Don't actually copy any files or directories.
.TP
\fB\-q\fR, \fB\-\-quiet\fR
Don't print the classification table.
.SH "CONFIG FILE SCHEMA"
.PP
The configuration file format is described by the following JSON Schema. The
schema follows the 2nd draft of the JSON Schema specification. You can
find the specification here:
http://tools.ietf.org/html/draft-zyp-json-schema-02
.PP
The schema of \fBdebian/splitpackage\fR is defined below:
.nf
{
    "type": "object",
    "properties": {
        "format": {
            "type": "string",
            "enum": ["dh_splitpackage 0.1"]
        },
        "packages": {
            "type": "object",
            "additionalProperties": {
                "type": "object",
                "properties": {
                    "inclusion_patterns": {
                        "type": [
                            "string", {
                                "type": "array",
                                "items": {
                                    "type": "string"
                                }
                            }
                        ],
                        "optional": true
                    },
                    "exclusion_patterns": {
                        "type": [
                            "string", {
                                "type": "array",
                                "items": {
                                    "type": "string"
                                }
                            }
                        ],
                        "optional": true
                    }
                },
                "additionalProperties": false
            }
        },
        "primary_package": {
            "type": "string",
            "optional": true
        }
    },
    "additionalProperties": false
}
.fi
.SH "PATHNAME PATTERNS"
.PP
Patterns used by dh_splitpackage are similar to globs. When discussing them
it's important to remember that each pathname that denotes a directory is
always terminated with a forward slash. 
.PP
The patterns behave mostly as normal globs with the following differences:
.RS
.PP
Dot ('.') is a normal character, not a wildcard.
.PP
A single star ('*') may match a filename and a filename only. It will never
match a directory name. Note, if you use '*/' it \fBwill\fR match directories but
a sole '*' will not.
.PP
A special extension for matching directories is provided in the form of star,
star, forward slash ('**/'). This pattern matches directories (and directories
only) of any depth (including not matching any directory at all).
.RE
.SS "Pattern to regular expression algorithm"
.PP
Technically patterns are implemented with python regular expressions. The
algorithm used to translate from patterns to regular expressions is defined
below.
.PP
.RS
.PP
The dot pattern loses the match-single-character semantics normally found in
regular expressions. Each occurrence of '.' is replaced with the regular
expression '\\.'.
.PP
The single star pattern ('*') is rewritten to ensure it only matches filenames,
never directories. This is achieved by replacing each occurrence of '*' with the
regular expression '[^/]*', which matches everything except for the forward
slash that is guaranteed to terminate each pathname pointing to a directory.
.PP
The double-star-forward-slash ('**/') pattern is rewritten to ensure it matches
any sequence of directories but never files. This is achieved by replacing each
occurrence of '**/' with '(.+/|)'. This regular expression matches a non-empty
string followed by a forward slash \fBor\fR an empty string.
.PP
Finally the pattern must match the whole pathname. To do that the pattern is
extended with leading '^' and trailing '$'.
.RE
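.PP
The translation rules above can be sketched in Python as follows. This is a
minimal illustration rather than the code used by dh_splitpackage; the
function names are hypothetical. The special tokens are rewritten in a single
pass, because the replacement text itself contains dots and stars that must
not be rewritten again.

```python
import re

# Hypothetical helpers, not part of dh_splitpackage's documented interface.
# The double-star-slash alternative is listed first so that it wins over
# the single star.
_TOKENS = re.compile("[*][*]/|[*]|[.]")
_REWRITES = {
    "**/": "(.+/|)",      # any sequence of directories, possibly none
    "*": "[^/]*",         # anything except the directory separator
    ".": re.escape("."),  # a literal dot
}


def pattern_to_regex(pattern):
    """Translate a pattern into an anchored regular expression string."""
    body = _TOKENS.sub(lambda match: _REWRITES[match.group(0)], pattern)
    return "^" + body + "$"


def matches(pattern, pathname):
    """Return True if pathname (directories end with '/') matches pattern."""
    return re.match(pattern_to_regex(pattern), pathname) is not None
```

Characters other than the three special tokens are passed through unchanged,
which is why regular expression syntax such as '[0-9]' keeps working inside a
pattern.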
.SH EXAMPLES
.SS "Example pathname patterns"
.nf
foo                    - match a file called 'foo' in the root directory
fo.                    - match a file called 'fo.' in the root directory
*                      - match all files in the root directory
**/                    - match all directories
**/*                   - match all files and directories
*.txt                  - match all files with the extension '.txt' in the
                         root directory
foo/**/*               - match all files and directories underneath foo/
foo/bar/froz.txt       - match this path explicitly
**/man*/*.[0-9]        - match all manual pages; this shows how regular
                         expressions can still be used alongside the new
                         pattern extensions
.fi
.SS "Hypothetical library package"
.PP
A library separated into library, development files and documentation. Since
no primary package is designated, any files not matched by the patterns
defined below are simply left behind.
.PP
.nf
{
    "format": "dh_splitpackage 0.1",
    "packages": {
        "libfoo-dev": {
            "inclusion_patterns": [
                "**/*.a",
                "**/*.h",
                "**/*.la",
                "**/*.m4",
                "**/*.pc",
                "man/man**/*.[1-9]"
            ]
        },
        "libfoo": {
            "inclusion_patterns": "**/*.so"
        },
        "libfoo-doc": {
            "inclusion_patterns": "/usr/share/doc/**/*"
        }
    }
}
.fi
.SS "Hypothetical server package"
.PP
A hypothetical "server" package, with two packages for the foo and bar
modules, a special "server-module-others" package that grabs the remaining
modules, and a documentation package.
.PP
The "server-module-others" package uses exclusion patterns to avoid
clashing with "server-module-foo" and "server-module-bar".
.PP
.nf
{
    "format": "dh_splitpackage 0.1",
    "packages": {
        "server": {
            "inclusion_patterns": [
                "/usr/bin/*",
                "/etc/server.d/*.conf"
            ]
        },
        "server-module-foo": {
            "inclusion_patterns": "/usr/lib/server-module-foo.so"
        },
        "server-module-bar": {
            "inclusion_patterns": "/usr/lib/server-module-bar.so"
        },
        "server-module-others": {
            "inclusion_patterns": "/usr/lib/server-module-*.so",
            "exclusion_patterns": [
                "/usr/lib/server-module-foo.so",
                "/usr/lib/server-module-bar.so"
            ]
        },
        "server-doc": {
            "inclusion_patterns": "/usr/share/doc/**/*"
        }
    }
}
.fi
.SS "Python library with unit tests"
.PP
A Python library with separated tests. The tests are in a sub-directory of
the actual library. The package relies on the "primary_package" setting to
associate leftover files with the "python-foo" package.
.PP
.nf
{
    "format": "dh_splitpackage 0.1",
    "primary_package": "python-foo",
    "packages": {
        "python-foo.tests": {
            "inclusion_patterns": "**/tests/**/*.py"
        }
    }
}
.fi
.SH "AUTHOR"
.PP
This manual page as well as dh_splitpackage itself was written by Zygmunt
Krynicki. You can contact me using the email address given below.
.SH "BUGS"
.PP
Please report bugs to Zygmunt Krynicki <zygmunt.krynicki@canonical.com>