<!-- DTD for TV listings

This is a DTD to represent a TV listing. It doesn't explicitly group
programmes by day or by channel; instead, broadcast time and channel
are attributes of the 'programme' element. Optionally, data about the
TV channels used can be stored in 'channel' elements.

Data about a TV programme are stored in the subelements of element
'programme', but metadata such as when it will be broadcast are stored
in attributes.

Many of the details have a 'lang' attribute so that you can
store them in multiple languages or have mixed languages in a single
listing. This 'lang' should be a two-letter language code such as
'en', optionally followed by a country code as in 'fr_FR'. Or you can
just leave it out and let your reader take a guess.

Unless otherwise specified, an element containing character data
(#PCDATA) must have some text if it is written.

An example XML file for this DTD might look like this:

<tv generator-info-name="my listings generator">
  <channel id="3sat.de">
    <display-name lang="de">3SAT</display-name>
  </channel>
  <channel id="das-erste.de">
    <display-name lang="de">ARD</display-name>
    <display-name lang="de">Das Erste</display-name>
  </channel>

  <programme start="200006031633" channel="3sat.de">
    <title lang="de">blah</title>
    <title lang="en">blah</title>
    <credits>
      <director>blah</director>
    </credits>
    <episode-num system="xmltv_ns">2 . 9 . 0/1</episode-num>
    <rating system="MPAA">
      <value>PG</value>
      <icon src="pg_symbol.png" />
    </rating>
  </programme>
  <programme> ... </programme>
</tv>

This describes two channels and then a programme broadcast on one of
the channels, then some more programmes. Almost everything in the DTD
is optional, so you can write files which are much simpler than this
example.

All dates and times in this DTD follow the same format, loosely based
on ISO 8601. They can be 'YYYYMMDDhhmmss' or some initial
substring, for example if you only know the year and month you can
have 'YYYYMM'. You can also append a timezone to the end; if no
explicit timezone is given, UTC is assumed. Examples:
'200007281733 BST', '200209', '19880523083000 +0300'. (BST == +0100.)
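
Reading those examples as a worked illustration (not normative):
'200007281733 BST' is 17:33 on 28 July 2000 in a zone one hour ahead
of UTC, that is 16:33 UTC; '200209' says only "some time in September
2002"; and '19880523083000 +0300' is 08:30:00 on 23 May 1988 at
UTC+3, that is 05:30:00 UTC.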

Unless specified otherwise, textual element content may not contain
newlines - this is to make it easy to convert into line-oriented
formats, and to avoid the question of what exactly a newline would
mean in the middle of someone's name or whatever. Leading and
trailing whitespace in element content is not significant.

At present versions of this DTD correspond to releases of the 'xmltv'
package, which is a set of programs to generate and manipulate files
conforming to this DTD. Written by Ed Avis (ed@membled.com) and
Gottfried Szing, thanks to others for suggestions.

$Id: xmltv.dtd,v 1.34 2006/02/03 19:45:57 mattiasholmlund Exp $

-->

<!-- The root element, tv.

Date should be the date when the listings were originally produced in
whatever format; if you're converting data from another source, then
use the date given by that source. The date when the conversion
itself was done is not important.

To indicate the source of the listings, there are three attributes you
can define:

'source-info-url' is a URL describing the data source in
some human-readable form. So if you are getting your listings from
SAT.1, you might set this to the URL of a page explaining how to
subscribe to their feed. If you are getting them from a website, the
URL might be the index of the site or at least of the TV listings
section.

'source-info-name' is the link text for that URL; it should
generally be the human-readable name of your listings supplier.
Sometimes the link text might be printed without the link itself, in
hardcopy listings for example.

'source-data-url' is where the actual data is grabbed from. This
should link directly to the machine-readable data files if possible,
but it's not rigorously defined what 'actual data' means. If you are
parsing the data from human-readable pages, then it's more appropriate
to link to them with the source-info attributes and omit this one.

To publicize your wonderful program which generated this file, you can
use 'generator-info-name' (preferably in the form 'progname/version')
and 'generator-info-url' (a link to more info about the program).
-->
<!ELEMENT tv (channel*, programme*)>
<!ATTLIST tv date                CDATA #IMPLIED
             source-info-url     CDATA #IMPLIED
             source-info-name    CDATA #IMPLIED
             source-data-url     CDATA #IMPLIED
             generator-info-name CDATA #IMPLIED
             generator-info-url  CDATA #IMPLIED >
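
<!-- Example (an illustrative sketch only; every URL and name below is
made up).  A root element for listings scraped from human-readable
pages might carry the source-info attributes, leave out
'source-data-url', and name the generator as 'progname/version':

<tv date="20060203"
    source-info-url="http://listings.example.org/tv/"
    source-info-name="Example Listings"
    generator-info-name="my-grabber/0.1"
    generator-info-url="http://grabber.example.org/">
  ...
</tv>
-->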

<!-- channel - details of a channel

Each 'programme' element (see below) should have an attribute
'channel' giving the channel on which it is broadcast. If you want to
provide more detail about channels, you can give some 'channel'
elements before listing the programmes. The 'id' attribute of the
channel should match what is given in the 'channel' attribute of the
programme.

Typically, all the channels used in a particular TV listing will be
included and then the programmes using those channels. But it's
entirely optional to include channel details - you can just leave out
channel elements or provide only some of them. It is also okay to
give just channels and no programmes, if you just want to describe
what TV channels are available in a certain area.

Each channel has one id attribute, which must be unique and should
preferably be in the form suggested by RFC2838 (the 'broadcast'
element of the grammar in that RFC, in other words, a DNS-like name
but without any URI scheme). It also has one or more display names,
which are shown to the user. You might want a different display name
for different languages, but also you can have more than one name for
the same language. Names listed earlier are considered 'more canonical'.

Since the display name is just there as a way for humans to refer to
the channel, it's acceptable to just put the channel number if it's
fairly universal among viewers of the channel. But remember that this
isn't an official statement of what channel number has been
allocated, and the same number might be used for a different channel
somewhere else.

The ordering of channel elements makes no difference to the meaning of
the file, since they are looked up by id and not by their position.
However, it makes things like diffing easier if you write the channel
elements sorted by ASCII order of their ids.
-->
<!ELEMENT channel (display-name+, icon*, url*) >
<!ATTLIST channel id CDATA #REQUIRED >
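
<!-- Example (illustrative only; the id, names, icon and URL are made
up).  A channel with two display names in the same language, the more
canonical one first, then a bare channel number used as a name:

<channel id="one.example.org">
  <display-name lang="en">Channel One</display-name>
  <display-name lang="en">C1</display-name>
  <display-name>1</display-name>
  <icon src="http://www.example.org/icons/one.png" />
  <url>http://www.example.org/one/</url>
</channel>
-->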

<!-- A user-friendly name for the channel - maybe even a channel
number. List the most canonical / common ones first and the most
obscure names last. The lang attribute follows RFC 1766.
-->
<!ELEMENT display-name (#PCDATA)>
<!ATTLIST display-name lang CDATA #IMPLIED>

<!-- A URL where you can find out more about the element that contains
it (programme or channel). This might be the official site, or a fan
page, whatever you like really.

If multiple url elements are given, the most authoritative or official
(which might conflict...) sites should be listed first.
-->
<!ELEMENT url (#PCDATA)>

<!-- programme - details of a single programme transmission

A show will be exactly the same whether it is broadcast at 18:00 or
19:00, and on whichever channel. Technical details like broadcast
time don't affect the content of the programme itself, so they are
included as attributes of this element. Start time and channel are
the two that you must include.

Sometimes VCR programming systems like PDC or VPS have their own
notion of 'start time' which is different from the actual start time,
so there are attributes for that. In practice, stop time will usually
be the start time of the next programme, but if you can get it more
accurate, good for you. Similarly, you can specify a code for
Gemstar's Showview or VideoPlus programming systems.

TV listings sometimes have the problem of listing two or more
programmes in the same timeslot, such as 'News; Weather'. We call
this a 'clump' of programmes, and the 'clumpidx' attribute
differentiates between two programmes sharing the same timeslot and
channel. In this case News would have clumpidx="0/2" and Weather
would have clumpidx="1/2". If you don't have this problem, you can
omit the attribute; it defaults to "0/1".

It's intended that start time and stop time, when both are present,
make a half-closed interval: a programme is considered to be
broadcasting _at_ its start time but to stop just before its stop
time. In this way a programme from 11:00 to 12:00 does not overlap
with another programme from 12:00 to 13:00, not even for a moment.
Nor is there any gap between the two.

To do: Some means of indicating breaks between programmes on the same
channel. The 'channel' attribute references the 'id' of a channel
element, but the DTD doesn't give a way to specify this constraint.
Perhaps there is some better XML syntax we could use for that.
-->
<!ELEMENT programme (title+, sub-title*, desc*, credits?, date?,
                     category*, language?, orig-language?, length?,
                     icon*, url*, country*, episode-num*, video?, audio?,
                     previously-shown?, premiere?, last-chance?, new?,
                     subtitles*, rating*, star-rating? )>
<!ATTLIST programme start     CDATA #REQUIRED
                    stop      CDATA #IMPLIED
                    pdc-start CDATA #IMPLIED
                    vps-start CDATA #IMPLIED
                    showview  CDATA #IMPLIED
                    videoplus CDATA #IMPLIED
                    channel   CDATA #REQUIRED
                    clumpidx  CDATA "0/1" >
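
<!-- Example (illustrative only; the channel id and titles are made
up, and it assumes the optional 'stop' attribute that the text above
describes).  'News' and 'Weather' share one timeslot as a clump, and
the next programme starts exactly when the clump stops, so under the
half-closed interval there is neither overlap nor gap:

<programme start="200006031800" stop="200006031830"
           channel="one.example.org" clumpidx="0/2">
  <title lang="en">News</title>
</programme>
<programme start="200006031800" stop="200006031830"
           channel="one.example.org" clumpidx="1/2">
  <title lang="en">Weather</title>
</programme>
<programme start="200006031830" stop="200006031900"
           channel="one.example.org">
  <title lang="en">Quiz</title>
</programme>
-->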

<!-- Programme title, eg 'The Simpsons'. -->
<!ELEMENT title (#PCDATA)>
<!ATTLIST title lang CDATA #IMPLIED>

<!-- Sub-title or episode title, eg 'Datalore'. Should probably be
called 'secondary title' to avoid confusion with captioning!
-->
<!ELEMENT sub-title (#PCDATA)>
<!ATTLIST sub-title lang CDATA #IMPLIED>

<!-- Description of the programme or episode.

Unlike other elements, long bits of whitespace here are treated as
equivalent to a single space and newlines are permitted, so you can
break lines and write a pretty-looking paragraph if you wish.
-->
<!ELEMENT desc (#PCDATA)>
<!ATTLIST desc lang CDATA #IMPLIED>

<!-- Credits for the programme.

People are listed in decreasing order of importance; so for example
the starring actors appear first followed by the smaller parts. As
with other parts of this file format, not mentioning a particular
actor (for example) does not imply that he _didn't_ star in the film -
so normally you'd list only the few most important people.

Adapter can be either somebody who adapted a work for television, or
somebody who did the translation from another language. Maybe these
should be separate, but if so it is unclear how 'translator' would
fit in.
-->
<!ELEMENT credits (director*, actor*, writer*, adapter*, producer*,
                   presenter*, commentator*, guest* )>
<!ELEMENT director    (#PCDATA)>
<!ELEMENT actor       (#PCDATA)>
<!ELEMENT writer      (#PCDATA)>
<!ELEMENT adapter     (#PCDATA)>
<!ELEMENT producer    (#PCDATA)>
<!ELEMENT presenter   (#PCDATA)>
<!ELEMENT commentator (#PCDATA)>
<!ELEMENT guest       (#PCDATA)>
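
<!-- Example (illustrative only; the names are placeholders).  The
content model above is a sequence, so the roles must appear in the
declared order (directors, then actors, then writers and so on), while
within one role people are listed most important first:

<credits>
  <director>A. Director</director>
  <actor>Leading Actor</actor>
  <actor>Supporting Actor</actor>
  <writer>A. Writer</writer>
  <guest>A. Guest</guest>
</credits>
-->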

<!-- The date the programme or film was finished. This will probably
be the same as the copyright date.
-->
<!ELEMENT date (#PCDATA)>

<!-- Type of programme, eg 'soap', 'comedy' or whatever the
equivalents are in your language. There's no predefined set of
categories and it's okay for a programme to belong to several.
-->
<!ELEMENT category (#PCDATA)>
<!ATTLIST category lang CDATA #IMPLIED>

<!-- The language the programme will be broadcast in. This does not
include the language of any subtitles, but it is affected by dubbing
into a different language. For example, if a French film is dubbed
into English, language=en and orig-language=fr.

There are two ways to specify the language. You can use the
two-letter codes such as en or fr, or you can give a name such as
'English' or 'Deutsch'. In the latter case you might want to use the
'lang' attribute, for example

<language lang="fr">Allemand</language>
-->
<!ELEMENT language (#PCDATA)>
<!ATTLIST language lang CDATA #IMPLIED>

<!-- The original language, before dubbing. The same remarks as for
'language' apply.
-->
<!ELEMENT orig-language (#PCDATA)>
<!ATTLIST orig-language lang CDATA #IMPLIED>

<!-- The true length of the programme, not counting advertisements or
trailers. But this does take account of any bits which were cut out
of the broadcast version - eg if a two hour film is cut to 110 minutes
and then padded with 20 minutes of advertising, length will be 110
minutes even though stop time minus start time is 130 minutes.
-->
<!ELEMENT length (#PCDATA)>
<!ATTLIST length units (seconds | minutes | hours) #REQUIRED>
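
<!-- Example (illustrative only; the channel id and title are made up,
and it assumes the optional 'stop' attribute described for
'programme').  The slot runs from 20:00 to 22:10, which is 130
minutes, but the film itself was cut to 110 minutes:

<programme start="200006032000" stop="200006032210"
           channel="one.example.org">
  <title lang="en">Some Film</title>
  <length units="minutes">110</length>
</programme>
-->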

<!-- An icon associated with the element that contains it.
src: URI of image
width, height: (optional) dimensions of image

These dimensions are pixel dimensions for the time being; eventually
this will change to be more like HTML's 'img'.
-->
<!ELEMENT icon EMPTY>
<!ATTLIST icon src    CDATA #REQUIRED
               width  CDATA #IMPLIED
               height CDATA #IMPLIED>
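
<!-- Example (illustrative only; the URL and sizes are made up).  An
icon with both optional pixel dimensions given, and one with just the
required 'src':

<icon src="http://www.example.org/icons/one.png" width="64" height="48" />
<icon src="pg_symbol.png" />
-->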

<!-- The value of the element that contains it. This is for elements
that can have both a textual 'value' and an icon. At present there is
no 'lang' attribute here because things like 'PG' are not translatable
(although a document explaining what 'PG' actually means would be).
It happens that 'value' is used only for this sort of thing.
-->
<!ELEMENT value (#PCDATA)>

<!-- A country where the programme was made or one of the countries in
a joint production. You can give the name of a country, in which case
you might want to specify the language in which this name is written,
or you can give a two-letter uppercase country code, in which case the
lang attribute should not be given. For example,

<country lang="en">Italy</country>
<country>GB</country>
-->
<!ELEMENT country (#PCDATA)>
<!ATTLIST country lang CDATA #IMPLIED>

<!-- Episode number

Not the title of the episode, its number or ID. There are several
ways of numbering episodes, so the 'system' attribute lets you specify
which you mean.

There are two predefined numbering systems, 'xmltv_ns' and
'onscreen'.

xmltv_ns: This is intended to be a general way to number episodes and
parts of multi-part episodes. It is three numbers separated by dots:
the first is the series or season, the second the episode number
within that series, and the third the part number, if the programme is
part of a multi-part episode. All these numbers are indexed from zero,
and they can be given in the form 'X/Y' to show series X out of Y series
made, or episode X out of Y episodes in this series, or part X of a
Y-part episode. If any of these aren't known they can be omitted.
You can put spaces wherever you like to make things easier to read.

(NB 'part number' is not used when a whole programme is split in two
for purely scheduling reasons; it's intended for cases where there
really is a 'Part One' and 'Part Two'. The format doesn't currently
have a way to represent a whole programme that happens to be split
across two or more timeslots.)

Some examples will make things clearer. The first episode of the
second series is '1.0.0/1'. If it were a two-part episode, then the
first half would be '1.0.0/2' and the second half '1.0.1/2'. If you
know that an episode is from the first season, but you don't know
which episode it is or whether it is part of a multiparter, you could
give the episode-num as '0..'. Here the second and third numbers have
been omitted. If you know that this is the first part of a three-part
episode, which is the last episode of the first series of thirteen,
its number would be '0 . 12/13 . 0/3'. The series number is just '0'
because you don't know how many series there are in total - perhaps
the show is still being made!

The other predefined system, onscreen, is to simply copy what the
programme makers write in the credits - 'Episode #FFEE' would
translate to '#FFEE'.

You are encouraged to use one of these two if possible; if xmltv_ns is
not general enough for your needs, let me know. But if you want, you
can use your own system and give the 'system' attribute as a URL
describing the system you use.
-->
<!ELEMENT episode-num (#PCDATA)>
<!ATTLIST episode-num system CDATA "onscreen">
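
<!-- Example (illustrative only).  Since 'programme' allows several
episode-num elements, the same episode can be numbered in both
predefined systems.  The first element below is the first half of a
two-part opening episode of the second series; the second relies on
the default system, 'onscreen':

<episode-num system="xmltv_ns">1 . 0 . 0/2</episode-num>
<episode-num>#FFEE</episode-num>
-->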

<!-- Video details: the subelements describe the picture quality as
follows:

present: whether this programme has a picture (no, in the
case of radio stations broadcast on TV or 'Blue'); legal values are
'yes' or 'no'. Obviously if the value is 'no', the other elements are
meaningless.

colour: 'yes' for colour, 'no' for black-and-white.

aspect: The horizontal:vertical aspect ratio, eg '4:3' or '16:9'.

quality: information on the quality, eg 'HDTV', '800x600'.

-->
<!ELEMENT video (present?, colour?, aspect?, quality?)>
<!ELEMENT present (#PCDATA)>
<!ELEMENT colour (#PCDATA)>
<!ELEMENT aspect (#PCDATA)>
<!ELEMENT quality (#PCDATA)>

<!-- Audio details, similar to video details above.

present: whether this programme has any sound at all, 'yes' or 'no'.

stereo: Description of the stereo-ness of the sound. Legal values
are currently 'mono', 'stereo', 'dolby', 'dolby digital' and
'surround'; others like 'quad' might be added later.
-->
<!ELEMENT audio (present?, stereo?)>
<!ELEMENT stereo (#PCDATA)>
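
<!-- Example (illustrative only).  Video and audio details for an old
black-and-white film shown in 4:3 with mono sound; every subelement is
optional, so the two 'present' elements could equally be left out:

<video>
  <present>yes</present>
  <colour>no</colour>
  <aspect>4:3</aspect>
</video>
<audio>
  <present>yes</present>
  <stereo>mono</stereo>
</audio>
-->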

<!-- When and where the programme was last shown, if known. Normally
in TV listings 'repeat' means 'previously shown on this channel', but
if you don't know what channel the old screening was on (but do know
that it happened) then you can omit the 'channel' attribute.
Similarly you can omit the 'start' attribute if you don't know when
the previous transmission was (though you can of course give just the
year, or just the year and month).

The absence of this element does not say for certain that the
programme is brand new and has never been screened anywhere before.
-->
<!ELEMENT previously-shown EMPTY>
<!ATTLIST previously-shown start   CDATA #IMPLIED
                           channel CDATA #IMPLIED >
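
<!-- Example (illustrative only; the channel id is made up).  A repeat
known to have gone out on a particular channel some time in May 2000,
and a repeat about which nothing further is known:

<previously-shown start="200005" channel="one.example.org" />
<previously-shown />
-->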

<!-- 'Premiere'. Different channels have different meanings for this
word - sometimes it means a film has never before been seen on TV in
that country, but other channels use it to mean 'the first showing of
this film on our channel in the current run'. It might have been
shown before, but now they have paid for another set of showings,
which makes the first in that set count as a premiere!

So this element doesn't have a clear meaning; just use it to represent
where 'premiere' would appear in a printed TV listing. You can use
the content of the element to explain exactly what is meant, for
example:

<premiere lang="en">
  First showing on national terrestrial TV
</premiere>

The textual content is a 'paragraph' as for <desc>. If you don't want
to give an explanation, just write empty content:

<premiere />
-->
<!ELEMENT premiere (#PCDATA)>
<!ATTLIST premiere lang CDATA #IMPLIED>

<!-- Last-chance. In a way this is the opposite of premiere. Some
channels buy the rights to show a movie a certain number of times, and
the first may be flagged 'premiere', the last as 'last showing'.

For symmetry with premiere, you may use the element content to give a
'paragraph' describing exactly what is meant - it's unlikely to be the
last showing ever! Otherwise, explicitly put empty content:

<last-chance />
-->
<!ELEMENT last-chance (#PCDATA)>
<!ATTLIST last-chance lang CDATA #IMPLIED>

<!-- New. This is the first screened programme from a new show that
has never been shown on television before - if not worldwide then at
least never before in this country. After the first episode or
programme has been shown, subsequent ones are no longer 'new'.
Similarly the second series of an established programme is not 'new'.

Note that this does not mean 'new season' or 'new episode' of an
existing show. You can express part of that using the episode-num
element.
-->
<!ELEMENT new EMPTY>

<!-- Subtitles. These can be either 'teletext' (sent digitally, and
displayed at the viewer's request) or 'onscreen' (superimposed on the
picture and impossible to get rid of). You can have multiple subtitle
streams to handle different languages. Language for subtitles is
specified in the same way as for programmes.
-->
<!ELEMENT subtitles (language?)>
<!ATTLIST subtitles type (teletext | onscreen) #IMPLIED>
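
<!-- Example (illustrative only).  Two subtitle streams for one
programme: English teletext subtitles given as a two-letter code, and
burnt-in French subtitles given as a language name written in English:

<subtitles type="teletext">
  <language>en</language>
</subtitles>
<subtitles type="onscreen">
  <language lang="en">French</language>
</subtitles>
-->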

<!-- Rating. Various bodies decide on classifications for films -
usually a minimum age you must be to see it. In principle the same
could be done for ordinary TV programmes. Because there are many
systems for doing this, you can also specify the rating system used
(which in practice is the same as the body which made the rating).
-->
<!ELEMENT rating (value, icon*)>
<!ATTLIST rating system CDATA #IMPLIED>

<!-- 'Star rating' - many listings guides award a programme a score as
a quick guide to how good it is. The value of this element should be
'N / M', for example one star out of a possible five stars would be
'1 / 5'. Zero stars is also a possible score (and not the same as
'unrated'). You should try to map whatever wacky system your listings
source uses to a number of stars: so for example if they have thumbs
up, thumbs sideways and thumbs down, you could map that to two, one or
zero stars out of two. Whitespace between the numbers and slash is
ignored.
-->
<!ELEMENT star-rating (value, icon*)>
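
<!-- Example (illustrative only).  The thumbs mapping described above,
written out as star ratings out of two:

<star-rating><value>2 / 2</value></star-rating>   (thumbs up)
<star-rating><value>1 / 2</value></star-rating>   (thumbs sideways)
<star-rating><value>0 / 2</value></star-rating>   (thumbs down)

A programme with no rating at all simply has no star-rating element,
which is different from scoring '0 / 2'.
-->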

<!-- (Why are things like 'stereo', which must be one of a small
number of values, stored as the contents of elements rather than as
attributes? Because they are data rather than metadata. Attributes
are used for things like the language or encoding of element contents,
or for programme transmission details.) -->
1
<!-- DTD for TV listings
3
This is a DTD to represent a TV listing. It doesn't explicitly group
4
programmes by day or by channel, instead broadcast time and channel
5
are attributes of the 'programme' element. Optionally, data about the
6
TV channels used can be stored in 'channel' elements.
8
Data about a TV programme are stored in the subelements of element
9
'programme', but metadata such as when it will be broadcast are stored
12
Many of the details have a 'lang' attribute so that you can
13
store them in multiple languages or have mixed languages in a single
14
listing. This 'lang' should be the two-letter code such as 'en' or
15
'fr_FR'. Or you can just leave it out and let your reader take a
18
Unless otherwise specified, an element containing CDATA must have some
19
text if it is written.
21
An example XML file for this DTD might look like this:
23
<tv generator-info-name="my listings generator">
24
<channel id="3sat.de">
25
<display-name lang="de">3SAT</display-name>
27
<channel id="das-erste.de">
28
<display-name lang="de">ARD</display-name>
29
<display-name lang="de">Das Erste</display-name>
32
<programme start="200006031633" channel="3sat.de">
33
<title lang="de">blah</title>
34
<title lang="en">blah</title>
39
<director>blah</director>
45
<episode-num system="xmltv_ns">2 . 9 . 0/1</episode-num>
49
<rating system="MPAA">
51
<icon src="pg_symbol.png" />
57
<programme> ... </programme>
61
This describes two channels and then a programme broadcast on one of
62
the channels, then some more programmes. Almost everything in the DTD
63
is optional, so you can write files which are much simpler than this
66
All dates and times in this DTD follow the same format, loosely based
67
on ISO 8601. They can be 'YYYYMMDDhhmmss' or some initial
68
substring, for example if you only know the year and month you can
69
have 'YYYYMM'. You can also append a timezone to the end; if no
70
explicit timezone is given, UTC is assumed. Examples:
71
'200007281733 BST', '200209', '19880523083000 +0300'. (BST == +0100.)
73
Unless specified otherwise, textual element content may not contain
74
newlines - this is to make it easy to convert into line-oriented
75
formats, and to avoid the question of what exactly a newline would
76
mean in the middle of someone's name or whatever. Leading and
77
trailing whitespace in element content is not significant.
79
At present versions of this DTD correspond to releases of the 'xmltv'
80
package, which is a set of programs to generate and manipulate files
81
conforming to this DTD. Written by Ed Avis (ed@membled.com) and
82
Gottfried Szing, thanks to others for suggestions.
84
$Id: xmltv.dtd,v 1.34 2006/02/03 19:45:57 mattiasholmlund Exp $
88
<!-- The root element, tv.
90
Date should be the date when the listings were originally produced in
91
whatever format; if you're converting data from another source, then
92
use the date given by that source. The date when the conversion
93
itself was done is not important.
95
To indicate the source of the listings, there are three attributes you
98
'source-info-url' is a URL describing the data source in
99
some human-readable form. So if you are getting your listings from
100
SAT.1, you might set this to the URL of a page explaining how to
101
subscribe to their feed. If you are getting them from a website, the
102
URL might be the index of the site or at least of the TV listings
105
'source-info-name' is the link text for that URL; it should
106
generally be the human-readable name of your listings supplier.
107
Sometimes the link text might be printed without the link itself, in
108
hardcopy listings for example.
110
'source-data-url' is where the actual data is grabbed from. This
111
should link directly to the machine-readable data files if possible,
112
but it's not rigorously defined what 'actual data' means. If you are
113
parsing the data from human-readable pages, then it's more appropriate
114
to link to them with the source-info stuff and omit this attribute.
116
To publicize your wonderful program which generated this file, you can
117
use 'generator-info-name' (preferably in the form 'progname/version')
118
and 'generator-info-url' (a link to more info about the program).
120
<!ELEMENT tv (channel*, programme*)>
121
<!ATTLIST tv date CDATA #IMPLIED
122
source-info-url CDATA #IMPLIED
123
source-info-name CDATA #IMPLIED
124
source-data-url CDATA #IMPLIED
125
generator-info-name CDATA #IMPLIED
126
generator-info-url CDATA #IMPLIED >
128
<!-- channel - details of a channel
130
Each 'programme' element (see below) should have an attribute
131
'channel' giving the channel on which it is broadcast. If you want to
132
provide more detail about channels, you can give some 'channel'
133
elements before listing the programmes. The 'id' attribute of the
134
channel should match what is given in the 'channel' attribute of the
137
Typically, all the channels used in a particular TV listing will be
138
included and then the programmes using those channels. But it's
139
entirely optional to include channel details - you can just leave out
140
channel elements or provide only some of them. It is also okay to
141
give just channels and no programmes, if you just want to describe
142
what TV channels are available in a certain area.
144
Each channel has one id attribute, which must be unique and should
145
preferably be in the form suggested by RFC2838 (the 'broadcast'
146
element of the grammar in that RFC, in other words, a DNS-like name
147
but without any URI scheme). Then one or more display names which are
148
shown to the user. You might want a different display name for
149
different languages, but also you can have more than one name for the
150
same language. Names listed earlier are considered 'more canonical'.
152
Since the display name is just there as a way for humans to refer to
153
the channel, it's acceptable to just put the channel number if it's
154
fairly universal among viewers of the channel. But remember that this
155
isn't an official statement of what channel number has been
156
allocated, and the same number might be used for a different channel
159
The ordering of channel elements makes no difference to the meaning of
160
the file, since they are looked up by id and not by their position.
161
However it makes things like diffing easier if you write the channel
162
elements sorted by ASCII order of their ids.
164
<!ELEMENT channel (display-name+, icon*, url*) >
165
<!ATTLIST channel id CDATA #REQUIRED >
167
<!-- A user-friendly name for the channel - maybe even a channel
168
number. List the most canonical / common ones first and the most
169
obscure names last. The lang attribute follows RFC 1766.
171
<!ELEMENT display-name (#PCDATA)>
172
<!ATTLIST display-name lang CDATA #IMPLIED>
174
<!-- A URL where you can find out more about the element that contains
175
it (programme or channel). This might be the official site, or a fan
176
page, whatever you like really.
178
If multiple url elements are given, the most authoritative or official
179
(which might conflict...) sites should be listed first.
181
<!ELEMENT url (#PCDATA)>
183
<!-- programme - details of a single programme transmission
185
A show will be exactly the same whether it is broadcast at 18:00 or
186
19:00, and on whichever channel. Technical details like broadcast
187
time don't affect the content of the programme itself, so they are
188
included as attributes of this element. Start time and channel are
189
the two that you must include.
191
Sometimes VCR programming systems like PDC or VPS have their own
192
notion of 'start time' which is different from the actual start time,
193
so there are attributes for that. In practice, stop time will usually
194
be the start time of the next programme, but if you can get it more
195
accurate, good for you. Similarly, you can specify a code for
196
Gemstar's Showview or VideoPlus programming systems.
198
TV listings sometimes have the problem of listing two or more
199
programmes in the same timeslot, such as 'News; Weather'. We call
200
this a 'clump' of programmes, and the 'clumpidx' attribute
201
differentiates between two programmes sharing the same timeslot and
202
channel. In this case News would have clumpidx="0/2" and Weather
203
would have clumpidx="1/2". If you don't have this problem, be
206
It's intended that start time and stop time, when both are present,
207
make a half-closed interval: a programme is considered to be
208
broadcasting _at_ its start time but to stop just before its stop
209
time. In this way a programme from 11:00 to 12:00 does not overlap
210
with another programme from 12:00 to 13:00, not even for a moment.
211
Nor is there any gap between the two.
213
To do: Some means of indicating breaks between programmes on the same
214
channel. The 'channel' attribute references the 'id' of a channel
215
element, but the DTD doesn't give a way to specify this constraint.
216
Perhaps there is some better XML syntax we could use for that.
218
<!ELEMENT programme (title+, sub-title*, desc*, credits?, date?,
219
category*, language?, orig-language?, length?,
220
icon*, url*, country*, episode-num*, video?, audio?,
221
previously-shown?, premiere?, last-chance?, new?,
222
subtitles*, rating*, star-rating? )>
223
<!ATTLIST programme start CDATA #REQUIRED
225
pdc-start CDATA #IMPLIED
226
vps-start CDATA #IMPLIED
227
showview CDATA #IMPLIED
228
videoplus CDATA #IMPLIED
229
channel CDATA #REQUIRED
230
clumpidx CDATA "0/1" >
232
<!-- Programme title, eg 'The Simpsons'. -->
233
<!ELEMENT title (#PCDATA)>
234
<!ATTLIST title lang CDATA #IMPLIED>
236
<!-- Sub-title or episode title, eg 'Datalore'. Should probably be
237
called 'secondary title' to avoid confusion with captioning!
239
<!ELEMENT sub-title (#PCDATA)>
240
<!ATTLIST sub-title lang CDATA #IMPLIED>

<!-- Description of the programme or episode.

Unlike other elements, long bits of whitespace here are treated as
equivalent to a single space and newlines are permitted, so you can
break lines and write a pretty-looking paragraph if you wish.
-->
<!ELEMENT desc (#PCDATA)>
<!ATTLIST desc lang CDATA #IMPLIED>
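
<!-- Illustration only, not part of the DTD. A minimal Python sketch of
the whitespace rule for 'paragraph' content such as desc: every run of
spaces, tabs and newlines is read as a single space. The name
normalise_paragraph is hypothetical, and trimming the leading and
trailing whitespace is an assumption.

```python
def normalise_paragraph(text):
    """Collapse each run of whitespace, newlines included, to one space."""
    return " ".join(text.split())


raw = """A documentary about
    lighthouses   and the people
    who keep them."""

print(normalise_paragraph(raw))
# A documentary about lighthouses and the people who keep them.
```
-->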

<!-- Credits for the programme.

People are listed in decreasing order of importance; so for example
the starring actors appear first followed by the smaller parts. As
with other parts of this file format, not mentioning a particular
actor (for example) does not imply that he _didn't_ star in the film -
so normally you'd list only the few most important people.

Adapter can be either somebody who adapted a work for television, or
somebody who did the translation from another language. Maybe these
should be separate, but if so how would 'translator' fit in with the
-->
<!ELEMENT credits (director*, actor*, writer*, adapter*, producer*,
                   presenter*, commentator*, guest* )>
<!ELEMENT director    (#PCDATA)>
<!ELEMENT actor       (#PCDATA)>
<!ELEMENT writer      (#PCDATA)>
<!ELEMENT adapter     (#PCDATA)>
<!ELEMENT producer    (#PCDATA)>
<!ELEMENT presenter   (#PCDATA)>
<!ELEMENT commentator (#PCDATA)>
<!ELEMENT guest       (#PCDATA)>

<!-- The date the programme or film was finished. This will probably
be the same as the copyright date.
-->
<!ELEMENT date (#PCDATA)>

<!-- Type of programme, eg 'soap', 'comedy' or whatever the
equivalents are in your language. There's no predefined set of
categories and it's okay for a programme to belong to several.
-->
<!ELEMENT category (#PCDATA)>
<!ATTLIST category lang CDATA #IMPLIED>

<!-- The language the programme will be broadcast in. This does not
include the language of any subtitles, but it is affected by dubbing
into a different language. For example, if a French film is dubbed
into English, language=en and orig-language=fr.

There are two ways to specify the language. You can use the
two-letter codes such as en or fr, or you can give a name such as
'English' or 'Deutsch'. In the latter case you might want to use the
'lang' attribute, for example

  <language lang="fr">Allemand</language>
-->
<!ELEMENT language (#PCDATA)>
<!ATTLIST language lang CDATA #IMPLIED>

<!-- The original language, before dubbing. The same remarks as for
'language' apply.
-->
<!ELEMENT orig-language (#PCDATA)>
<!ATTLIST orig-language lang CDATA #IMPLIED>

<!-- The true length of the programme, not counting advertisements or
trailers. But this does take account of any bits which were cut out
of the broadcast version - eg if a two hour film is cut to 110 minutes
and then padded with 20 minutes of advertising, length will be 110
minutes even though stop time minus start time is 130 minutes.
-->
<!ELEMENT length (#PCDATA)>
<!ATTLIST length units (seconds | minutes | hours) #REQUIRED>
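
<!-- Illustration only, not part of the DTD. A minimal Python sketch of
how a reader might bring a length to a common unit using the 'units'
attribute. The name length_in_seconds is hypothetical, and treating
the element content as a decimal number is an assumption.

```python
SECONDS_PER_UNIT = {"seconds": 1, "minutes": 60, "hours": 3600}


def length_in_seconds(value, units):
    """Convert the text of a length element to a number of seconds."""
    return float(value) * SECONDS_PER_UNIT[units]


# The film from the comment above: cut to 110 minutes.
print(length_in_seconds("110", "minutes"))  # 6600.0
```

An unknown unit raises KeyError, which mirrors the fact that the DTD
allows only the three listed values.
-->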

<!-- An icon associated with the element that contains it.

src: URI of the image
width, height: (optional) dimensions of image

These dimensions are pixel dimensions for the time being; eventually
this will change to be more like HTML's 'img'.
-->
<!ELEMENT icon EMPTY>
<!ATTLIST icon src    CDATA #REQUIRED
               width  CDATA #IMPLIED
               height CDATA #IMPLIED>

<!-- The value of the element that contains it. This is for elements
that can have both a textual 'value' and an icon. At present there is
no 'lang' attribute here because things like 'PG' are not translatable
(although a document explaining what 'PG' actually means would be).
It happens that 'value' is used only for this sort of thing.
-->
<!ELEMENT value (#PCDATA)>

<!-- A country where the programme was made or one of the countries in
a joint production. You can give the name of a country, in which case
you might want to specify the language in which this name is written,
or you can give a two-letter uppercase country code, in which case the
lang attribute should not be given. For example,

  <country lang="en">Italy</country>
  <country>GB</country>
-->
<!ELEMENT country (#PCDATA)>
<!ATTLIST country lang CDATA #IMPLIED>

<!-- Episode number.

Not the title of the episode, its number or ID. There are several
ways of numbering episodes, so the 'system' attribute lets you specify
which you mean.

There are two predefined numbering systems, 'xmltv_ns' and
'onscreen'.

xmltv_ns: This is intended to be a general way to number episodes and
parts of multi-part episodes. It is three numbers separated by dots:
the first is the series or season, the second the episode number
within that series, and the third the part number, if the programme is
part of a multi-part episode. All these numbers are indexed from zero,
and they can be given in the form 'X/Y' to show series X out of Y series
made, or episode X out of Y episodes in this series, or part X of a
Y-part episode. If any of these aren't known they can be omitted.
You can put spaces wherever you like to make things easier to read.

(NB 'part number' is not used when a whole programme is split in two
for purely scheduling reasons; it's intended for cases where there
really is a 'Part One' and 'Part Two'. The format doesn't currently
have a way to represent a whole programme that happens to be split
across two or more timeslots.)

Some examples will make things clearer. The first episode of the
second series is '1.0.0/1'. If it were a two-part episode, then the
first half would be '1.0.0/2' and the second half '1.0.1/2'. If you
know that an episode is from the first season, but you don't know
which episode it is or whether it is part of a multiparter, you could
give the episode-num as '0..'. Here the second and third numbers have
been omitted. If you know that this is the first part of a three-part
episode, which is the last episode of the first series of thirteen,
its number would be '0 . 12/13 . 0/3'. The series number is just '0'
because you don't know how many series there are in total - perhaps
the show is still being made!

The other predefined system, onscreen, is to simply copy what the
programme makers write in the credits - 'Episode #FFEE' would
translate to '#FFEE'.

You are encouraged to use one of these two if possible; if xmltv_ns is
not general enough for your needs, let me know. But if you want, you
can use your own system and give the 'system' attribute as a URL
describing the system you use.
-->
<!ELEMENT episode-num (#PCDATA)>
<!ATTLIST episode-num system CDATA "onscreen">
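
<!-- Illustration only, not part of the DTD. A minimal Python sketch of
one way to read an xmltv_ns episode number, checked against the
examples in the comment above. The name parse_xmltv_ns is hypothetical.
It assumes that both dots are always written, as in '0..', and that a
total never appears without its index.

```python
def parse_xmltv_ns(text):
    """Parse an xmltv_ns value into three (index, total) pairs.

    The pairs describe series, episode and part, in that order. An
    omitted field is returned as None, and so is an omitted total.
    """
    fields = text.split(".")
    if len(fields) != 3:
        raise ValueError(f"expected three dot-separated fields: {text!r}")

    def parse_field(field):
        field = "".join(field.split())  # spaces may appear anywhere
        if not field:
            return None
        index, _, total = field.partition("/")
        return int(index), (int(total) if total else None)

    return tuple(parse_field(field) for field in fields)


print(parse_xmltv_ns("1.0.0/1"))          # ((1, None), (0, None), (0, 1))
print(parse_xmltv_ns("0.."))              # ((0, None), None, None)
print(parse_xmltv_ns("0 . 12/13 . 0/3"))  # ((0, None), (12, 13), (0, 3))
```

All indexes are zero-based, so a display layer would add one before
showing 'Series 2, Episode 1' for the first example.
-->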

<!-- Video details: the subelements describe the picture quality as
follows:

present: whether this programme has a picture ('no' in the case of
radio stations broadcast on TV, or 'Blue'). Legal values are 'yes' or
'no'. Obviously if the value is 'no', the other elements are
meaningless.

colour: 'yes' for colour, 'no' for black-and-white.

aspect: The horizontal:vertical aspect ratio, eg '4:3' or '16:9'.

quality: information on the quality, eg 'HDTV', '800x600'.

-->
<!ELEMENT video (present?, colour?, aspect?, quality?)>
<!ELEMENT present (#PCDATA)>
<!ELEMENT colour (#PCDATA)>
<!ELEMENT aspect (#PCDATA)>
<!ELEMENT quality (#PCDATA)>

<!-- Audio details, similar to video details above.

present: whether this programme has any sound at all, 'yes' or 'no'.

stereo: Description of the stereo-ness of the sound. Legal values
are currently 'mono', 'stereo', 'dolby', 'dolby digital' and
'surround'; others like 'quad' might be added later.
-->
<!ELEMENT audio (present?, stereo?)>
<!ELEMENT stereo (#PCDATA)>

<!-- When and where the programme was last shown, if known. Normally
in TV listings 'repeat' means 'previously shown on this channel', but
if you don't know what channel the old screening was on (but do know
that it happened) then you can omit the 'channel' attribute.
Similarly you can omit the 'start' attribute if you don't know when
the previous transmission was (though you can of course give just the
year, for example).

The absence of this element does not say for certain that the
programme is brand new and has never been screened anywhere before.
-->
<!ELEMENT previously-shown EMPTY>
<!ATTLIST previously-shown start   CDATA #IMPLIED
                           channel CDATA #IMPLIED >

<!-- 'Premiere'. Different channels have different meanings for this
word - sometimes it means a film has never before been seen on TV in
that country, but other channels use it to mean 'the first showing of
this film on our channel in the current run'. It might have been
shown before, but now they have paid for another set of showings,
which makes the first in that set count as a premiere!

So this element doesn't have a clear meaning; just use it to represent
where 'premiere' would appear in a printed TV listing. You can use
the content of the element to explain exactly what is meant, for
example:

  <premiere lang="en">
    First showing on national terrestrial TV
  </premiere>

The textual content is a 'paragraph' as for <desc>. If you don't want
to give an explanation, just write empty content:

  <premiere />
-->
<!ELEMENT premiere (#PCDATA)>
<!ATTLIST premiere lang CDATA #IMPLIED>

<!-- Last-chance. In a way this is the opposite of premiere. Some
channels buy the rights to show a movie a certain number of times, and
the first may be flagged 'premiere', the last as 'last showing'.

For symmetry with premiere, you may use the element content to give a
'paragraph' describing exactly what is meant - it's unlikely to be the
last showing ever! Otherwise, explicitly put empty content:

  <last-chance />
-->
<!ELEMENT last-chance (#PCDATA)>
<!ATTLIST last-chance lang CDATA #IMPLIED>

<!-- New. This is the first screened programme from a new show that
has never been shown on television before - if not worldwide then at
least never before in this country. After the first episode or
programme has been shown, subsequent ones are no longer 'new'.
Similarly the second series of an established programme is not 'new'.

Note that this does not mean 'new season' or 'new episode' of an
existing show. You can express part of that using the episode-num
element.
-->
<!ELEMENT new EMPTY>

<!-- Subtitles. These can be either 'teletext' (sent digitally, and
displayed at the viewer's request) or 'onscreen' (superimposed on the
picture and impossible to get rid of). You can have multiple subtitle
streams to handle different languages. Language for subtitles is
specified in the same way as for programmes.
-->
<!ELEMENT subtitles (language?)>
<!ATTLIST subtitles type (teletext | onscreen) #IMPLIED>

<!-- Rating. Various bodies decide on classifications for films -
usually a minimum age you must be to see it. In principle the same
could be done for ordinary TV programmes. Because there are many
systems for doing this, you can also specify the rating system used
(which in practice is the same as the body which made the rating).
-->
<!ELEMENT rating (value, icon*)>
<!ATTLIST rating system CDATA #IMPLIED>

<!-- 'Star rating' - many listings guides award a programme a score as
a quick guide to how good it is. The value of this element should be
'N / M', for example one star out of a possible five stars would be
'1 / 5'. Zero stars is also a possible score (and not the same as
'unrated'). You should try to map whatever wacky system your listings
source uses to a number of stars: so for example if they have thumbs
up, thumbs sideways and thumbs down, you could map that to two, one or
zero stars out of two. Whitespace between the numbers and slash is
ignored.
-->
<!ELEMENT star-rating (value, icon*)>
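
<!-- Illustration only, not part of the DTD. A minimal Python sketch of
reading the 'N / M' text held in the value of a star-rating, plus the
thumbs mapping suggested in the comment above. The names
parse_star_rating and THUMBS_TO_STARS are hypothetical; whole-number
scores and the range check are assumptions.

```python
# Mapping suggested in the comment: thumbs up, sideways and down.
THUMBS_TO_STARS = {"up": "2 / 2", "sideways": "1 / 2", "down": "0 / 2"}


def parse_star_rating(value):
    """Parse 'N / M' into (stars, out_of), tolerating spaces."""
    stars, out_of = (int(part) for part in value.split("/"))
    # Assumption: a score can be zero but cannot exceed the maximum.
    if not 0 <= stars <= out_of:
        raise ValueError(f"star rating out of range: {value!r}")
    return stars, out_of


print(parse_star_rating("1 / 5"))                  # (1, 5)
print(parse_star_rating(THUMBS_TO_STARS["down"]))  # (0, 2)
```

A result of (0, 2) is a real score of zero stars; an unrated programme
simply has no star-rating element.
-->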

<!-- (Why are things like 'stereo', which must be one of a small
number of values, stored as the contents of elements rather than as
attributes? Because they are data rather than metadata. Attributes
are used for things like the language or encoding of element contents,
or for programme transmission details.) -->