1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
|
== Simple Sync Format ==
Simple Sync format consists of 2 different file formats:
* A "products list" (format=products:1.0)
* A "index" (format=index:1.0)
Files contain JSON formatted data.
Data can come in one of 2 formats:
* JSON file: <file>.json
A .json file can be accompanied by a .json.gpg file which contains
signature data for .json file.
Due to race conditions caused by the fact that .json and .json.gpg
may not be able to be obtained from a storage location at the same
time, the preferred delivery of signed data is via '.sjson' format.
* Signed JSON File: <file>.sjson
This is a GPG cleartext signed message:
https://tools.ietf.org/html/rfc4880#section-7
The payload the same content that would be included in the JSON file.
Special dictionary entries:
* 'path': If a 'product' dictionary in an index file or a item dictionary
in a products file contains a 'path' element, then that indicates there
is content to be downloaded associated with that element.
A 'path' must value must be relative to the base of the mirror.
* 'md5', 'sha256', 'sha512':
If an item contains a 'path' and one of these fields, then the content
referenced must have the given checksum(s).
* 'size':
For an item with a 'path', this indicates the expected download size.
It should be present for a item with a path in a products file.
Having access to expected size allows the client to provide progress
and also reduces the potential for hash collision attacks.
* 'updated'
This field can exist at the top level of a products or index, and
contains a RFC 2822 timestamp indicating when the file was last updated
This allows a client to quickly note that it is up to date.
== Simple Sync Mirrors ==
The default/expected location of an index file is 'streams/v1/index.sjson'
or 'streams/v1/index.json' underneath the top level of a mirror.
'path' entries as described above are relative to the top level of a
mirror, not relative to the location of the index.
For example:
http://example.com/my-mirror/
would be the top level of a mirror, and the expected path of an index is
http://example.com/my-mirror/streams/v1/index.sjson
To describe a file that lives at:
http://example.com/my-mirror/streams/v1/products.sjson
The 'path' element must be: 'streams/v1/products.sjson'
== Products List ==
products list: (format=products:1.0)
For Ubuntu, an example product is 'server:precise:amd64'
A Products list has a 'content_id' and multiple products.
a product has multiple versions
a version has multiple items
An item can be globally uniquely identified by the path to it.
i.e., the 'content_id' for a products list and the key in each
element of the tree form a unique tuple for that item. Given:
content_id = tree['content_id']
prod_name = tree['products'].keys()[0]
ver_name = tree['products'][prod_name]['versions'].keys(0)
item_name = tree['products'][prod_name]['versions'][ver_name].keys(0)
that unique tuple is:
(content_id, prod_name, ver_name, item_name)
The following is a description of each of these fields:
* content_id is formed similarly to an ISCSI qualified name (IQN)
An example is:
com.ubuntu.cloud:released:aws
It should have a reverse domain portion followed by a portion
that represents a name underneath that domain.
* product_name: product name is unique within a products list. The same
product name may appear in multiple products_lists. For example,
in Ubuntu, 'server:precise:amd64' will appear in both
'com.ubuntu.cloud:released:aws' and
'com.ubuntu.cloud:released:download'.
That name collision should imply that the two separate
<content_id><product_name> pairs are equivalent in some manner.
* version_name:
A 'version' of a product represents a release, build or collection of
that product. A key in the 'versions' dictionary should be sortable
by rules of a 'LANG=C sort()'. That allows the client to trivially
order versions to find the most recent. Ubuntu uses "serial" numbers
for these keys, in the format YYYYMMDD[.0-9].
* item_name:
Inside of a version, there may be multiple items. An example would be
a binary build and a source tarball.
For Ubuntu download images, these are things like '.tar.gz',
'-disk1.img' and '-root.tar.gz'.
The item name does not need to be user-friendly. It must be
consistent. Because this id is unique within the given
'version_name', a client needs only to store that key, rather than
trying to determine which keys inside the item dictionary identify it.
An 'item' dictionary may contain a 'path' element.
'path' entries for a given item must be immutable. That is, for a
given 'path' under a mirror, the content must never change.
== Index ==
This is a index of products files that are available.
It has a top level 'index' dictionary. Each entry in that dictionary is a
content_id of a products file. The entry should have a 'path' item that
indicates where to download the product.
All other data inside the product entry is not required, but helps a client
to find what they're looking for.
item groups of the same "type".
this is 'stream:1.0' format.
* stream collection: a list of content streams
A stream collection is simply a way to provide an index of known content
streams, and information about them.
This is 'stream-collection:1.0'
Useful definitions
* item group
an item group is a list of like items. e.g. all produced by the same build.
requirements:
* serial: a 'serial' entry that can be sorted by YYYYMMDD[.X]
* items: a list of items
Example item groups are:
* output of the amd64 cloud image build done on 2012-04-04
* amd64 images from the cirros release version 0.3.1
* item
There are 1 or more items in a item group.
requirements:
* name: must be unique within the item group.
special fields:
* path: If an item has a 'path', then the target must be obtainable and
should be downloaded when mirroring.
* md5sum: stores checksum
Example:
* "disk1.img" produced from the amd64 cloud image build done on 2012-04-04
* -root.tar.gz produced from the same build.
Notes:
* index files are not required to be signed, as they only
contain references to other content that is signed, and that is hosted
on the same mirror.
|