== Simple Sync Format ==
Simple Sync format consists of 2 different file formats.
* A "products list" (format=products:1.0)
* An "index" (format=index:1.0)

Files contain JSON-formatted data.
Data can come in one of 2 formats:
* Json File: <file>.js
  A .js file can be accompanied by a .js.gpg file which contains
  signature data for the .js file.

  Because the .js and .js.gpg files may not be obtainable from a storage
  location at the same time, fetching them separately is subject to race
  conditions. The preferred delivery of signed data is therefore the
  '.sjs' format.
* Signed Json File: <file>.sjs
  This is a gpg cleartext signed message.
    http://rfc-ref.org/RFC-TEXTS/2440/chapter7.html
  The payload is the same content that would be included in the json file.
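
To illustrate the layout of a '.sjs' file, the following is a minimal sketch that pulls the JSON payload out of a cleartext signed message. It does not verify the signature, so it is not a substitute for checking the file with gpg. The function name `extract_sjs_payload` is hypothetical, and the sketch assumes the usual cleartext framing: a begin marker, armor headers, an empty line, the dash-escaped payload, then the signature block.

```python
import json

BEGIN_MSG = "-----BEGIN PGP SIGNED MESSAGE-----"
BEGIN_SIG = "-----BEGIN PGP SIGNATURE-----"


def extract_sjs_payload(text):
    """Return the JSON payload of a cleartext signed message, UNVERIFIED."""
    lines = text.splitlines()
    start = lines.index(BEGIN_MSG)
    end = lines.index(BEGIN_SIG, start)
    # Armor headers (such as "Hash: SHA512") end at the first empty line.
    body_start = lines.index("", start) + 1
    body = []
    for line in lines[body_start:end]:
        # Undo dash-escaping: a leading "- " is not part of the payload.
        body.append(line[2:] if line.startswith("- ") else line)
    return json.loads("\n".join(body))
```

A real client would verify the signature first and only then trust the payload.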

Special dictionary entries:
* 'path': If a 'product' dictionary in an index file or an item dictionary
  in a products file contains a 'path' element, then that indicates there
  is content to be downloaded associated with that element.

  A 'path' value must be relative to the base of the mirror.
* 'md5', 'sha256', 'sha512':
  If an item contains a 'path' and one of these fields, then the content
  referenced must have the given checksum(s).
* 'size':
  For an item with a 'path', this indicates the expected download size.
  It should be present for an item with a path in a products file.
  Having access to the expected size allows the client to report progress
  and also reduces the potential for hash collision attacks.
* 'updated':
  This field can exist at the top level of a products or index file, and
  contains an RFC 2822 timestamp indicating when the file was last updated.
  This allows a client to quickly determine whether it is up to date.
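
The checks implied by the entries above could be sketched as follows. This is an illustration under stated assumptions, not the reference client: the helper names are hypothetical, the expected-size field is assumed to be named 'size', and checksums are assumed to be stored as lowercase hex digests.

```python
import hashlib
from email.utils import parsedate_to_datetime

CHECKSUM_FIELDS = ("md5", "sha256", "sha512")


def verify_content(item, content):
    """Check downloaded bytes against an item's size and checksum fields."""
    # 'size' is the assumed name of the expected-download-size field.
    if "size" in item and int(item["size"]) != len(content):
        return False
    for field in CHECKSUM_FIELDS:
        # Assumes each checksum is stored as a lowercase hex digest.
        if field in item:
            if hashlib.new(field, content).hexdigest() != item[field]:
                return False
    return True


def is_up_to_date(local_updated, remote_updated):
    """Return True if the local timestamp is at least as new as the remote."""
    # Both arguments are RFC 2822 date strings with an explicit UTC offset.
    local = parsedate_to_datetime(local_updated)
    remote = parsedate_to_datetime(remote_updated)
    return local >= remote
```

Checking the size before hashing lets a client reject a truncated or oversized download cheaply.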

== Simple Sync Mirrors ==
The default/expected location of an index file is 'streams/v1/index.sjs'
or 'streams/v1/index.js' underneath the top level of a mirror.

'path' entries as described above are relative to the top level of a
mirror, not relative to the location of the index.

For example,
  http://example.com/my-mirror/
would be the top level of a mirror, and the expected path of an index is
  http://example.com/my-mirror/streams/v1/index.sjs

To describe a file that lives at:
  http://example.com/my-mirror/streams/v1/products.sjs

The 'path' element must be: 'streams/v1/products.sjs'
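
A minimal sketch of how a client might turn a 'path' entry into a download URL, using the example mirror above. The function name `resolve_path` is hypothetical. Plain string joining is used so that the result is always relative to the top level of the mirror, never to the location of the index.

```python
def resolve_path(mirror_url, path):
    """Resolve a 'path' entry against the top level of a mirror."""
    # A 'path' must be relative, so reject absolute paths and full URLs.
    if path.startswith("/") or "://" in path:
        raise ValueError("'path' must be relative to the mirror: %r" % path)
    return mirror_url.rstrip("/") + "/" + path


print(resolve_path("http://example.com/my-mirror/", "streams/v1/products.sjs"))
```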

products list: (format=products:1.0)
For Ubuntu, an example product is 'server:precise:amd64'.
A products list has a 'content_id' and multiple products.
A product has multiple versions.
A version has multiple items.

An item can be globally uniquely identified by the path to it.
That is, the 'content_id' of a products list and the key at each
level of the tree form a unique tuple for that item. Given:
  content_id = tree['content_id']
  products = tree['products']
  prod_name = list(products)[0]
  versions = products[prod_name]['versions']
  ver_name = list(versions)[0]
  # Assumption: the items of a version live under an 'items' key.
  item_name = list(versions[ver_name]['items'])[0]

the unique tuple is:
  (content_id, prod_name, ver_name, item_name)
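
The same idea extends to every item in a products list. The sketch below walks the tree and yields one identifying tuple per item. The function name `iter_item_ids` is hypothetical, and the sketch assumes that the items of a version are stored under an 'items' key.

```python
def iter_item_ids(tree):
    """Yield (content_id, prod_name, ver_name, item_name) for every item."""
    content_id = tree["content_id"]
    for prod_name, product in tree["products"].items():
        for ver_name, version in product.get("versions", {}).items():
            # Assumption: the items of a version live under an 'items' key.
            for item_name in version.get("items", {}):
                yield (content_id, prod_name, ver_name, item_name)
```

A client can store these tuples to record what it has already mirrored.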

The following is a description of each of these fields:
* content_id is formed similarly to an iSCSI qualified name (IQN),
  for example:
    com.ubuntu.cloud:released:aws
  It should have a reverse domain portion followed by a portion
  that represents a name underneath that domain.

* product_name: the product name is unique within a products list. The same
  product name may appear in multiple products lists. For example,
  in Ubuntu, 'server:precise:amd64' will appear in both
  'com.ubuntu.cloud:released:aws' and
  'com.ubuntu.cloud:released:download'.

  That name collision should imply that the two separate
  <content_id><product_name> pairs are equivalent in some manner.

* version_name:
  A 'version' of a product represents a release, build or collection of
  that product. A key in the 'versions' dictionary should be sortable
  by the rules of a 'LANG=C sort()'. That allows the client to trivially
  order versions to find the most recent. Ubuntu uses "serial" numbers
  for these keys, in the format YYYYMMDD[.0-9].

* item_name:
  Inside of a version, there may be multiple items. An example would be
  a binary build and a source tarball.

  For Ubuntu download images, these are things like '.tar.gz',
  '-disk1.img' and '-root.tar.gz'.

  The item name does not need to be user-friendly. It must be
  consistent. Because this id is unique within the given
  'version_name', a client needs only to store that key, rather than
  trying to determine which keys inside the item dictionary identify it.

  An 'item' dictionary may contain a 'path' element.

  'path' entries for a given item must be immutable. That is, for a
  given 'path' under a mirror, the content must never change.
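
The 'LANG=C sort()' rule for version keys amounts to a byte-wise comparison, so finding the most recent version can be sketched as below. The function name `latest_version` is hypothetical, and the keys in the usage example are made-up serials in the YYYYMMDD[.N] style.

```python
def latest_version(versions):
    """Return the key of the most recent version in a 'versions' dict."""
    # Comparing the encoded bytes mirrors the ordering of a C-locale sort.
    return max(versions, key=lambda name: name.encode("utf-8"))


versions = {"20121231": {}, "20130101": {}, "20130101.1": {}}
print(latest_version(versions))
```

Because a shorter key sorts before any longer key that it prefixes, '20130101' orders before '20130101.1'.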

The index (format=index:1.0) is an index of the products files that are
available. It has a top level 'index' dictionary. Each entry in that
dictionary is keyed by the content_id of a products file. The entry should
have a 'path' item that indicates where to download the products file.

All other data inside the entry is not required, but helps a client
to find what they're looking for.
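
As a sketch of how a client might use an index, the following maps each content_id to the 'path' of its products file. The function name `products_paths` is hypothetical, and the paths in the usage example are invented for illustration.

```python
def products_paths(index_tree):
    """Map each content_id in an index to the 'path' of its products file."""
    return {
        content_id: entry["path"]
        for content_id, entry in index_tree.get("index", {}).items()
        # Entries without a 'path' have nothing to download.
        if "path" in entry
    }


index_tree = {
    "format": "index:1.0",
    "index": {
        "com.example:released": {"path": "streams/v1/com.example:released.sjs"},
        "com.example:daily": {"path": "streams/v1/com.example:daily.sjs"},
    },
}
for content_id, path in sorted(products_paths(index_tree).items()):
    print(content_id, path)
```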

item groups of the same "type".
This is the 'stream:1.0' format.
* stream collection: a list of content streams
  A stream collection is simply a way to provide an index of known content
  streams, and information about them.
  This is the 'stream-collection:1.0' format.

An item group is a list of like items, i.e., all produced by the same build.
* serial: a 'serial' entry that can be sorted by YYYYMMDD[.X]
* items: a list of items

Example item groups are:
* output of the amd64 cloud image build done on 2012-04-04
* amd64 images from the cirros release version 0.3.1

There are 1 or more items in an item group.
* name: must be unique within the item group.
* path: If an item has a 'path', then the target must be obtainable and
  should be downloaded when mirroring.
* md5sum: stores the checksum

Example items are:
* "disk1.img" produced from the amd64 cloud image build done on 2012-04-04
* "-root.tar.gz" produced from the same build.

* index files are not required to be signed, as they only
  contain references to other content that is signed, and that is hosted