<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE s1 SYSTEM 'dtd/document.dtd'>
<s1 title='XNI Samples'>
The Xerces Native Interface (XNI) is an internal API that is
independent of other XML APIs and is used to implement the
Xerces family of parsers. XNI allows a wide variety of parsers
to be written in an easy and modular fashion. The XNI samples
included with Xerces are simple examples of how to program
using the XNI API. However, for information on how to take full
advantage of this powerful framework, refer to the
<link idref='xni'>XNI Manual</link>.
<p>Basic XNI samples:</p>
<li><link anchor='Counter'>xni.Counter</link></li>
<li><link anchor='DocumentTracer'>xni.DocumentTracer</link></li>
<li><link anchor='Writer'>xni.Writer</link></li>
<li><link anchor='PSVIWriter'>xni.PSVIWriter</link></li>
<li><link anchor='XMLGrammarBuilder'>xni.XMLGrammarBuilder</link></li>
<li><link anchor='PassThroughFilter'>xni.PassThroughFilter</link></li>
<li><link anchor='UpperCaseFilter'>xni.UpperCaseFilter</link></li>
<p>Parser configuration samples:</p>
<li><link anchor='NonValidatingParserConfiguration'>xni.parser.NonValidatingParserConfiguration</link></li>
<li><link anchor='AbstractConfiguration'>xni.parser.AbstractConfiguration</link></li>
<!-- REVISIT: Add in this sample once the proper interfaces have been
     designed and implemented in the parser, and after the
     sample code has been written. -->
<li><link anchor='DynamicParserConfiguration'>xni.parser.DynamicParserConfiguration</link></li>
<li><link anchor='CSVConfiguration'>xni.parser.CSVConfiguration</link></li>
<li><link anchor='CSVParser'>xni.parser.CSVParser</link></li>
<li><link anchor='PSVIConfiguration'>xni.parser.PSVIConfiguration</link></li>
<li><link anchor='PSVIParser'>xni.parser.PSVIParser</link></li>
<p>Sample xerces.properties</p>
<li><link anchor='xercesProperties'>xni/xerces.properties</link></li>
Most of the XNI samples have a command line option that allows the
user to specify a different XNI parser configuration to use. In
order to supply another parser configuration besides the default
Xerces <code>StandardParserConfiguration</code>, the configuration
must implement the
<code>org.apache.xerces.xni.parser.XMLParserConfiguration</code>
interface.
<anchor name='Counter'/>
<s2 title='Sample xni.Counter'>
A sample XNI counter. The output of this program shows the parse
time and the number of elements, attributes, ignorable whitespace
characters, and characters appearing in the document.
This class is useful as a "poor man's" performance tester to
compare the speed and accuracy of various parser configurations.
However, it is important to note that the first parse time of a
parser will include both VM class load time and parser
initialization that would not be present in subsequent parses
The results produced by this program should never be accepted as
true performance measurements.
<source>java xni.Counter (options) uri ...</source>
<tr><th>Option</th><th>Description</th></tr>
<tr><td>-p name</td><td>Select parser configuration by name.</td></tr>
<tr><td>-x number</td><td>Select number of repetitions.</td></tr>
<tr><td>-n | -N</td><td>Turn on/off namespace processing.</td></tr>
Turn on/off namespace prefixes.<br/>
<strong>NOTE:</strong> Requires use of -n.
<tr><td>-v | -V</td><td>Turn on/off validation.</td></tr>
Turn on/off Schema validation support.<br/>
<strong>NOTE:</strong> Not supported by all parser configurations.
Turn on/off Schema full checking.<br/>
<strong>NOTE:</strong> Requires use of -s and not supported by all parsers.
<tr><td>-m | -M</td><td>Turn on/off memory usage report.</td></tr>
<tr><td>-t | -T</td><td>Turn on/off "tagginess" report.</td></tr>
<td>Output user-defined comment before next parse.</td>
<tr><td>-h</td><td>Display help screen.</td></tr>
The speed and memory results from this program should NOT be used
as the basis of parser performance comparison! Real analytical
methods should be used. For better results, perform multiple
document parses within the same virtual machine to remove class
loading from parse time and memory usage.
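<p>The advice about repeated parses in one virtual machine can be
sketched as follows. This is only an illustration that uses the
standard JAXP <code>SAXParser</code> rather than an XNI parser
configuration; the class name <code>WarmTiming</code> and its helper
methods are hypothetical and are not part of the Xerces samples.
One untimed warm-up parse absorbs class loading and parser
initialization, and only the later parses are timed.</p>
<source><![CDATA[
```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical timing harness; a JAXP stand-in for an XNI configuration. */
final class WarmTiming {
    /** Counts start-element events, much as a counter sample would. */
    static final class ElementCounter extends DefaultHandler {
        long elements;
        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            elements++;
        }
    }

    static SAXParser newParser() {
        try {
            return SAXParserFactory.newInstance().newSAXParser();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("cannot create parser", e);
        }
    }

    /** Parses the document once and returns the number of elements seen. */
    static long countElements(SAXParser parser, byte[] doc) {
        ElementCounter counter = new ElementCounter();
        try {
            parser.parse(new ByteArrayInputStream(doc), counter);
        } catch (SAXException | IOException e) {
            throw new IllegalStateException("parse failed", e);
        }
        return counter.elements;
    }

    /** Average nanoseconds per parse, excluding one untimed warm-up parse. */
    static long averageNanos(byte[] doc, int repetitions) {
        if (repetitions <= 0) {
            throw new IllegalArgumentException("repetitions must be positive");
        }
        SAXParser parser = newParser();
        countElements(parser, doc); // warm-up: class loading and initialization
        long start = System.nanoTime();
        for (int i = 0; i < repetitions; i++) {
            countElements(parser, doc);
        }
        return (System.nanoTime() - start) / repetitions;
    }

    public static void main(String[] args) {
        byte[] doc = "<a><b/><b/></a>".getBytes(StandardCharsets.UTF_8);
        System.out.println("average ns per parse: " + averageNanos(doc, 100));
    }
}
```
]]></source>
<p>Even with a warm-up parse, numbers obtained this way remain rough
indications and are subject to the same caveats as the output of
<code>xni.Counter</code>.</p>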
The "tagginess" measurement gives a rough estimate of the percentage
of markup versus content in the XML document. The percent tagginess
of a document is equal to the minimum number of tag characters
required for elements, attributes, and processing instructions
divided by the total number of characters (characters, ignorable
whitespace, and tag characters) in the document.
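<p>The tagginess ratio described above amounts to a simple
calculation. The sketch below assumes the three character counts have
already been collected; the class <code>Tagginess</code> and its
parameter names are hypothetical and this is not the code used by
<code>xni.Counter</code>.</p>
<source><![CDATA[
```java
/** Illustrative tagginess calculation under the definition given in the text. */
final class Tagginess {
    /**
     * @param tagChars            characters needed for element, attribute and PI markup
     * @param characters          characters of content
     * @param ignorableWhitespace ignorable whitespace characters
     * @return tag characters as a percentage of all characters, or 0 for an empty document
     */
    static double percent(long tagChars, long characters, long ignorableWhitespace) {
        long total = tagChars + characters + ignorableWhitespace;
        if (total == 0) {
            return 0.0; // avoid dividing by zero
        }
        return 100.0 * tagChars / total;
    }

    public static void main(String[] args) {
        // 30 tag characters out of 100 characters in total.
        System.out.println(percent(30, 60, 10) + "%"); // prints 30.0%
    }
}
```
]]></source>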
Not every parser configuration supports all of these features.
<anchor name='DocumentTracer'/>
<s2 title='Sample xni.DocumentTracer'>
Provides a complete trace of XNI document and DTD events for
<source>java xni.DocumentTracer (options) uri ...</source>
<tr><th>Option</th><th>Description</th></tr>
<tr><td>-p name</td><td>Specify parser configuration by name.</td></tr>
<tr><td>-n | -N</td><td>Turn on/off namespace processing.</td></tr>
<tr><td>-v | -V</td><td>Turn on/off validation.</td></tr>
Turn on/off Schema validation support.<br/>
<strong>NOTE:</strong> Not supported by all parser configurations.
<tr><td>-c | -C</td><td>Turn on/off character notifications.</td></tr>
<tr><td>-h</td><td>Display help screen.</td></tr>
<anchor name='Writer'/>
<s2 title='Sample xni.Writer'>
A sample XNI writer. This sample program illustrates how to
use the XMLDocumentHandler callbacks it receives to print
the document being parsed.
<source>java xni.Writer (options) uri ...</source>
<tr><th>Option</th><th>Description</th></tr>
<tr><td>-p name</td><td>Select parser configuration by name.</td></tr>
<tr><td>-n | -N</td><td>Turn on/off namespace processing.</td></tr>
<tr><td>-v | -V</td><td>Turn on/off validation.</td></tr>
Turn on/off Schema validation support.<br/>
<strong>NOTE:</strong> Not supported by all parser configurations.
Turn on/off Canonical XML output.<br/>
<strong>NOTE:</strong> This is not W3C canonical output.
<tr><td>-h</td><td>Display help screen.</td></tr>
<anchor name='PSVIWriter'/>
<s2 title='Sample xni.PSVIWriter'>
This is an example of a component that converts XNI events for a document into
XNI events for that document's PSVI information.
This class can <strong>NOT</strong> be run as a standalone
program. It is only an example of how to write a component. See
<link anchor='PSVIConfiguration'>xni.parser.PSVIConfiguration</link> and
<link anchor='PSVIParser'>xni.parser.PSVIParser</link>.
<anchor name='XMLGrammarBuilder'/>
<s2 title='Sample xni.XMLGrammarBuilder'>
This sample illustrates how to use Xerces's grammar
preparsing functionality to build a compiled representation of a grammar
and use it to parse instance documents. It is also meant
to replace the DOM ASBuilder sample (which
implements the DOM AS interfaces that the W3C has discontinued). It
handles both XML Schema grammars and DTD external subsets.
<source>java xni.XMLGrammarBuilder [-p config_file] -d uri ... | [-f|-F] -a uri ... [-i uri ...]</source>
<tr><th>Option</th><th>Description</th></tr>
<tr><td>-p name</td><td>Select parser configuration by name.</td></tr>
<tr><td>-d</td><td>URI of file(s) to be compiled as DTD external
<tr><td>-a</td><td>URI of file(s) to be compiled as XML Schema grammars</td></tr>
Turn on/off Schema full checking when validating instances against schemas.<br/>
<strong>NOTE:</strong> Requires use of -a and not supported by all parsers.
<tr><td>-i</td><td>List of instance documents to validate. The preparsed grammars will be
used first, but if a reference is made to a non-preparsed grammar,
it will be resolved.</td></tr>
No two schema grammars preparsed by this class should share the
same targetNamespace (or have no targetNamespace). If this condition is
not met, results are undefined, but very likely one of the schemas
will simply be ignored.
Not every parser configuration supports all of these features.
In particular, if a parser configuration is specified, it would be wise to
ensure it supports the kind of grammars to be preparsed.
<anchor name='PassThroughFilter'/>
<s2 title='Sample xni.PassThroughFilter'>
This sample demonstrates how to implement a simple pass-through
filter for the document "streaming" information set using XNI.
This filter could be used in a pipeline of XNI parser components
that communicate document events.
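<p>For readers more familiar with SAX, the same pipeline idea can be
sketched with the standard <code>org.xml.sax.helpers.XMLFilterImpl</code>
class, which forwards every event unchanged unless a method is
overridden. This is a SAX analogue only, not the XNI sample itself;
the class <code>FilterSketch</code> and its members are hypothetical.
The overrides below upper-case element names and leave all other
events untouched.</p>
<source><![CDATA[
```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** SAX analogue of a document filter; not part of the Xerces XNI samples. */
final class FilterSketch {
    /** Passes every event through; only element names are changed. */
    static final class UpperCase extends XMLFilterImpl {
        UpperCase(XMLReader parent) {
            super(parent);
        }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            super.startElement(uri, local.toUpperCase(Locale.ROOT),
                    qName.toUpperCase(Locale.ROOT), atts);
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            super.endElement(uri, local.toUpperCase(Locale.ROOT),
                    qName.toUpperCase(Locale.ROOT));
        }
    }

    /** Parses the text through the filter and returns the element names seen downstream. */
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader filter = new UpperCase(factory.newSAXParser().getXMLReader());
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String local, String qName, Attributes atts) {
                    names.add(qName);
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("filtering failed", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<doc><item/></doc>")); // prints [DOC, ITEM]
    }
}
```
]]></source>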
This class can <strong>NOT</strong> be run as a standalone
program. It is only an example of how to write a document
<anchor name='UpperCaseFilter'/>
<s2 title='Sample xni.UpperCaseFilter'>
This sample demonstrates how to create a filter for the document
"streaming" information set that turns element names into upper
This class can <strong>NOT</strong> be run as a standalone
program. It is only an example of how to write a document
<anchor name='NonValidatingParserConfiguration'/>
<s2 title='Sample xni.parser.NonValidatingParserConfiguration'>
<p>Non-validating parser configuration.</p>
This class can <strong>NOT</strong> be run as a standalone
program. It is only an example of how to write a parser
configuration using XNI. You can use this parser configuration
by specifying the fully qualified class name to all of the XNI
samples that accept a parser configuration using the
<code>-p</code> option. For example:
<source>java xni.Counter -p xni.parser.NonValidatingParserConfiguration document.xml</source>
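<p>A sample that accepts a fully qualified class name through
<code>-p</code> has to turn that name into an object at run time. The
sketch below shows one plausible way to do so with plain reflection;
it is an assumption about the general technique, not a copy of the
samples' own loading code, and the class <code>ByName</code> is
hypothetical. A real sample would additionally cast the result to
<code>org.apache.xerces.xni.parser.XMLParserConfiguration</code>.</p>
<source><![CDATA[
```java
/** Hypothetical helper that instantiates a class from its fully qualified name. */
final class ByName {
    /** Creates an instance using the public no-argument constructor of the named class. */
    static Object newInstance(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            // Covers a missing class, a missing constructor, or a failed construction.
            throw new IllegalArgumentException("cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        // java.util.ArrayList stands in for a parser configuration class here.
        String name = args.length > 0 ? args[0] : "java.util.ArrayList";
        System.out.println(newInstance(name).getClass().getName());
    }
}
```
]]></source>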
<anchor name='AbstractConfiguration'/>
<s2 title='Sample xni.parser.AbstractConfiguration'>
This abstract parser configuration simply helps manage components,
features and properties, and other tasks common to all parser
configurations. In order to subclass this configuration and use
it effectively, the subclass is required to do the following:
Add all configurable components using the <code>addComponent</code>
<li>Implement the <code>parse</code> method, and</li>
<li>Call the <code>resetComponents</code> method before parsing.</li>
This class can <strong>NOT</strong> be run as a standalone
program. It is only an example of how to write a parser
configuration using XNI.
<!-- REVISIT: Add in this sample once the proper interfaces have been
     designed and implemented in the parser, and after the
     sample code has been written. -->
<anchor name='DynamicParserConfiguration'/>
<s2 title='Sample xni.parser.DynamicParserConfiguration'>
<anchor name='CSVConfiguration'/>
<s2 title='Sample xni.parser.CSVConfiguration'>
This example is a very simple parser configuration that can
parse files with comma-separated values (CSV) to generate XML
events. For example, the following CSV document:
<source>Andy Clark,16 Jan 1973,Cincinnati</source>
produces the following XML "document" as represented by the
XNI streaming document information:
<source><![CDATA[<?xml version='1.0' encoding='UTF-8' standalone='true'?>
<!ELEMENT csv (row)*>
<!ELEMENT row (col)*>
<!ELEMENT col (#PCDATA)>
<col>Andy Clark</col>
<col>16 Jan 1973</col>
<col>Cincinnati</col>
This class can <strong>NOT</strong> be run as a standalone
program. It is only an example of how to write a parser
configuration using XNI. You can use this parser configuration
by specifying the fully qualified class name to all of the XNI
samples that accept a parser configuration using the
<code>-p</code> option. For example:
<source>java xni.Counter -p xni.parser.CSVConfiguration document.xml</source>
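<p>The CSV-to-XML mapping described in this section can be pictured
with a small standalone sketch. It builds the
<code>csv</code>/<code>row</code>/<code>col</code> markup as a string
instead of emitting XNI events, and it assumes plain comma-separated
cells with no quoting rules; the class <code>CsvToXml</code> is
hypothetical and is not the sample's implementation.</p>
<source><![CDATA[
```java
/** Illustrative CSV to XML conversion; the real sample emits XNI events instead. */
final class CsvToXml {
    /** Escapes the characters that are not allowed verbatim in element content. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Maps each non-empty line to a row element and each comma-separated cell to a col element. */
    static String toXml(String csv) {
        StringBuilder out = new StringBuilder("<csv>\n");
        for (String line : csv.split("\r?\n")) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            out.append(" <row>\n");
            // A limit of -1 keeps trailing empty cells.
            for (String cell : line.split(",", -1)) {
                out.append("  <col>").append(escape(cell)).append("</col>\n");
            }
            out.append(" </row>\n");
        }
        return out.append("</csv>\n").toString();
    }

    public static void main(String[] args) {
        System.out.print(toXml("Andy Clark,16 Jan 1973,Cincinnati"));
    }
}
```
]]></source>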
<anchor name='CSVParser'/>
<s2 title='Samples xni.parser.CSVParser'>
This parser class implements a SAX parser that can parse simple
comma-separated value (CSV) files.
This class can <strong>NOT</strong> be run as a standalone
program. It is only an example of how to write a parser
using XNI. You can use this parser
by specifying the fully qualified class name to all of the SAX
samples that accept a parser using the
<code>-p</code> option. For example:
<source>java sax.Counter -p xni.parser.CSVParser document.xml</source>
<anchor name='PSVIConfiguration'/>
<s2 title='Sample xni.parser.PSVIConfiguration'>
This example is a parser configuration that includes a post-schema-validation
infoset (PSVI) converter. The configuration includes a
DTD validator, a namespace binder, XML Schema validators, and the PSVIWriter component.
This class can <strong>NOT</strong> be run as a standalone
program. It is only an example of how to write a parser
configuration using XNI. You can use this parser configuration
by specifying the fully qualified class name to all of the XNI
samples that accept a parser configuration using the
<code>-p</code> option:
<source>java xni.Writer -v -s -p xni.parser.PSVIConfiguration personal-schema.xml</source>
<note><link idref='features' anchor="validation">Validation</link>
and <link idref='features' anchor="validation.schema">schema validation</link>
features must be set to true to receive the correct PSVI output.</note>
<anchor name='PSVIParser'/>
<s2 title='Samples xni.parser.PSVIParser'>
This parser class implements a SAX parser that outputs events for
the post schema validation infoset of a document.
This class can <strong>NOT</strong> be run as a standalone
program. It is only an example of how to write a parser
using XNI. You can use this parser
by specifying the fully qualified class name to all of the SAX
samples that accept a parser using the
<code>-p</code> option. For example:
<source>java sax.Writer -v -s -p xni.parser.PSVIParser personal-schema.xml</source>
<note><link idref='features' anchor="validation">Validation</link>
and <link idref='features' anchor="validation.schema">schema validation</link>
features must be set to true to receive the correct PSVI output.</note>
<anchor name='xercesProperties'/>
<s2 title='Sample xni/xerces.properties'>
<p> When you create a Xerces parser, either directly using a native
class like org.apache.xerces.parsers.DOMParser, or via a
standard API like JAXP, Xerces provides a means of
dynamically selecting a "configuration" for that parser.
Configurations are the basic mechanism Xerces uses to decide
exactly how it will treat an XML document (e.g., whether it
needs to know about Schema validation, whether it needs to be
cognizant of potential denial-of-service attacks launched via
malicious XML documents, etc.). Xerces selects the configuration in four steps:
<li>first, Xerces will examine the system property
org.apache.xerces.xni.parser.XMLParserConfiguration;
<li>next, it will try to find a file called xerces.properties in
the lib subdirectory of your JRE installation;
<li>next, it will examine all the jars on your classpath to try
to find one with the appropriate entry in its
META-INF/services directory.
<li>if all else fails, it will use a hardcoded default.
<p> The third step can be quite time-consuming, especially if you
have a lot of jars on your classpath and run applications which
require the creation of lots of parsers. If you know you're
only using applications which require "standard" APIs (that
is, don't need some special Xerces property), or you want to
try to force applications to use only certain Xerces
configurations, then you may wish to copy this file into your
JRE's lib directory. We try to ensure that this file contains
the currently recommended default configuration; if you know
which configuration you want, you may substitute that class
name for what we've provided here.</p>
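<p>The four-step lookup described in this section can be sketched in
plain Java as shown below. This is a simplified illustration, not
Xerces's actual lookup code: the class <code>ConfigLookup</code>, the
<code>fallback</code> argument and the class name
<code>example.DefaultConfiguration</code> are hypothetical, and the
sketch assumes that the properties file and the
<code>META-INF/services</code> entry are both keyed by the same name as
the system property.</p>
<source><![CDATA[
```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

/** Simplified sketch of the lookup order; not the Xerces implementation. */
final class ConfigLookup {
    static final String KEY = "org.apache.xerces.xni.parser.XMLParserConfiguration";

    static String resolve(Path propertiesFile, ClassLoader loader, String fallback) {
        // Step 1: the system property.
        String name = System.getProperty(KEY);
        if (name != null && !name.isEmpty()) {
            return name;
        }
        // Step 2: a properties file such as <JRE>/lib/xerces.properties.
        if (propertiesFile != null && Files.isReadable(propertiesFile)) {
            Properties props = new Properties();
            try (InputStream in = Files.newInputStream(propertiesFile)) {
                props.load(in);
            } catch (IOException e) {
                // Unreadable file: fall through to the next step.
            }
            name = props.getProperty(KEY);
            if (name != null && !name.isEmpty()) {
                return name;
            }
        }
        // Step 3: a META-INF/services entry somewhere on the classpath.
        try (InputStream in = loader.getResourceAsStream("META-INF/services/" + KEY)) {
            if (in != null) {
                BufferedReader reader = new BufferedReader(
                        new InputStreamReader(in, StandardCharsets.UTF_8));
                for (String line = reader.readLine(); line != null; line = reader.readLine()) {
                    line = line.trim();
                    if (!line.isEmpty() && !line.startsWith("#")) {
                        return line;
                    }
                }
            }
        } catch (IOException e) {
            // Unreadable entry: fall through to the default.
        }
        // Step 4: the hardcoded default.
        return fallback;
    }

    public static void main(String[] args) {
        Path file = Paths.get(System.getProperty("java.home"), "lib", "xerces.properties");
        System.out.println(resolve(file, ClassLoader.getSystemClassLoader(),
                "example.DefaultConfiguration"));
    }
}
```
]]></source>
<p>Because the first two steps return before the classpath is
searched, setting the system property or installing the properties
file avoids the cost of the third step.</p>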