<?xml version="1.0" standalone="no"?>
<!DOCTYPE s1 SYSTEM "../../style/dtd/document.dtd">
<!--
 * Copyright 1999-2004 The Apache Software Foundation.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
-->
<!-- $Id: xsltc_trax.xml,v 1.2 2009/12/10 03:18:38 matthewoliver Exp $ -->
<s1 title="The Translet API and TrAX">

<p>Note: This document describes the design of XSLTC's TrAX implementation.
The XSLTC <link idref="xsltc_trax_api">TrAX API user documentation</link>
is kept in a separate document.</p>

<p>The structure of this document is, and should be kept, as follows:</p>

<ul>
<li>A brief introduction to TrAX/JAXP</li>
<li>Overall design of the XSLTC TrAX implementation</li>
<li>Detailed design of various TrAX components</li>
</ul>

<ul>
<li><link anchor="abstract">Abstract</link></li>
<li><link anchor="trax">TrAX basics</link></li>
<li><link anchor="config">TrAX configuration</link></li>
<li><link anchor="design">XSLTC TrAX architecture</link></li>
<li><link anchor="detailed_design">XSLTC TrAX detailed design</link></li>
<ul>
<li><link anchor="factory_design">TransformerFactory design</link></li>
<li><link anchor="template_design">Templates design</link></li>
<li><link anchor="transformer_design">Transformer design</link></li>
<li><link anchor="config_design">TrAX configuration design</link></li>
</ul>
</ul>

<!--====================== ABSTRACT SECTION ===========================-->

<anchor name="abstract"/>
<s2 title="Abstract">

<p>JAXP is the Java API for XML Processing. TrAX is an API for XML
transformations and is included in the later versions of JAXP. JAXP includes
two packages, one for XML parsing and one for XML transformations (TrAX):</p>

<source>
    javax.xml.parsers
    javax.xml.transform</source>

<p>XSLTC is an XSLT processing engine and serves as the XML
transformation engine behind the TrAX portion of the JAXP API. XSLTC is a
provider for the TrAX API and a client of the JAXP parser API.</p>

<p>This document describes the design used for integrating XSLTC translets
with the JAXP TrAX API. The heart of the design is a wrapper class around the
XSLTC compiler that extends the JAXP <code>SAXTransformerFactory</code>
class. This factory delivers translet class definitions (Java bytecodes)
wrapped inside TrAX <code>Templates</code> objects. These
<code>Templates</code> objects can be used to instantiate
<code>Transformer</code> objects that transform XML documents into markup or
plain text. Alternatively, a <code>Transformer</code> object can be created
directly by the <code>TransformerFactory</code>, but this approach is not
recommended with XSLTC. The reason for this will be explained later in this
document.</p>

</s2>

<!--====================== TRAX BASICS SECTION =========================-->

<anchor name="trax"/>
<s2 title="TrAX basics">

<p>The Java API for XML Processing (JAXP) includes an XSLT framework based
on the Transformation API for XML (TrAX). A JAXP transformation application
can use the TrAX framework in two ways. The simplest way is:</p>

<ul>
<li>create an instance of the TransformerFactory class</li>
<li>from the factory instance and a given XSLT stylesheet, create a new
Transformer object</li>
<li>call the Transformer object's transform() method, specifying the XML
input and a Result object.</li>
</ul>
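
<source>
    import javax.xml.transform.*;

    public class Compile {

        public void run(Source xsl) throws TransformerConfigurationException {
            TransformerFactory factory = TransformerFactory.newInstance();
            // Compiles the stylesheet and creates the Transformer in one call
            Transformer transformer = factory.newTransformer(xsl);
        }
    }</source>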

<p>This suits most conventional XSLT processors that transform XML documents
in one go. XSLTC needs one extra step to compile the XSL stylesheet into a
Java class (a "translet"). Fortunately, TrAX has another approach
that suits XSLTC's two-step transformation model:</p>

<ul>
<li>create an instance of the TransformerFactory class</li>
<li>from the factory instance and a given XSLT stylesheet, create a new
Templates object (this step will compile the stylesheet and put the
bytecodes for the translet class(es) into the Templates object)</li>
<li>from the Templates object create a Transformer object (this will
instantiate a new translet object).</li>
<li>call the Transformer object's transform() method, specifying the XML
input and a Result object.</li>
</ul>
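
<source>
    import javax.xml.transform.*;

    public class Compile {

        public void run(Source xsl) throws TransformerConfigurationException {
            TransformerFactory factory = TransformerFactory.newInstance();
            // Step 1: compile the stylesheet into a translet held by Templates
            Templates templates = factory.newTemplates(xsl);
            // Step 2: instantiate the translet, wrapped in a Transformer
            Transformer transformer = templates.newTransformer();
        }
    }</source>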

<p>Note that the first two steps need to be performed only once for each
stylesheet. Once the stylesheet is compiled into a translet and wrapped in a
<code>Templates</code> object, the <code>Templates</code> object can be used
over and over again to create Transformer objects (instances of the translet).
The <code>Templates</code> instances can even be serialized and stored on
stable storage (i.e. in a memory or disk cache) for later use.</p>

<p>The code below illustrates a simple JAXP transformation application that
creates the <code>Transformer</code> directly. Remember that this is not the
ideal approach with XSLTC, as the stylesheet is compiled for each
transformation.</p><source>
    import javax.xml.transform.stream.StreamSource;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;

    public class Compile {

        public void run(String xmlfile, String xslfile) {
            Transformer transformer;
            TransformerFactory factory = TransformerFactory.newInstance();

            try {
                StreamSource stylesheet = new StreamSource(xslfile);
                transformer = factory.newTransformer(stylesheet);
                transformer.transform(new StreamSource(xmlfile),
                                      new StreamResult(System.out));
            }
            catch (Exception e) {
                // Report the failure rather than silently ignoring it
                e.printStackTrace();
            }
        }
    }</source>

<p>This approach is simple and is probably used in many applications. However,
<code>Templates</code> objects are useful when multiple instances of
the same <code>Transformer</code> are needed. <code>Transformer</code>
objects are not thread safe, so a server that handles requests from several
clients is best off creating one global <code>Templates</code>
object and then creating from it a <code>Transformer</code> object for each
thread handling the requests. This approach is also by far the best for
XSLTC, as the <code>Templates</code> object will hold the class definitions
that make up the translet and its auxiliary classes. (Note that the bytecodes,
and not the actual class definitions, are stored when serializing a
<code>Templates</code> object to disk. This is because of class loader
security restrictions.) To accommodate this second approach to TrAX
transformations, the above class would be modified as follows:</p><source>
    try {
        StreamSource stylesheet = new StreamSource(xslfile);
        // Needs an additional import of javax.xml.transform.Templates
        Templates templates = factory.newTemplates(stylesheet);
        transformer = templates.newTransformer();
        transformer.transform(new StreamSource(xmlfile),
                              new StreamResult(System.out));
    }
    catch (Exception e) {
        e.printStackTrace();
    }</source>
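
<p>The server scenario described above can be sketched as follows. This is a
minimal illustration, not XSLTC code: the inline stylesheet, the class name
<code>SharedTemplates</code> and the pool size are assumptions made for the
example. One <code>Templates</code> object is compiled once and shared, while
every task creates its own <code>Transformer</code>.</p>

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class SharedTemplates {

    // Hypothetical stylesheet, inlined so that the sketch is self-contained
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>Hello, <xsl:value-of select='/name'/>!</xsl:template>"
        + "</xsl:stylesheet>";

    static List<String> greetAll(List<String> names) throws Exception {
        // Compile the stylesheet once; the Templates object is shared by all tasks
        final Templates templates = TransformerFactory.newInstance()
            .newTemplates(new StreamSource(new StringReader(XSL)));

        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (final String name : names) {
                pending.add(pool.submit(() -> {
                    // Each task creates its own Transformer (not thread safe)
                    Transformer transformer = templates.newTransformer();
                    StringWriter out = new StringWriter();
                    transformer.transform(
                        new StreamSource(new StringReader("<name>" + name + "</name>")),
                        new StreamResult(out));
                    return out.toString();
                }));
            }
            // Collect the results in submission order
            List<String> results = new ArrayList<>();
            for (Future<String> f : pending) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(greetAll(Arrays.asList("Ann", "Bob")));
    }
}
```

<p>With XSLTC behind the factory, the stylesheet is compiled only in the call
to <code>newTemplates()</code>; each <code>newTransformer()</code> call merely
instantiates the translet.</p>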

</s2>

<!--====================== TRAX CONFIG SECTION =========================-->

<anchor name="config"/>
<s2 title="TrAX configuration">

<p>JAXP's <code>TransformerFactory</code> is configurable in much the same way
as the other Java extensions. The API supports configuring the factory by:</p>

<ul>
<li>passing vendor-specific attributes from the application, through the
TrAX interface, to the underlying XSL processor</li>
<li>registering an ErrorListener that will be used to pass error and
warning messages from the XSL processor to the application</li>
<li>registering a URIResolver that the application can use to load XSL
and XML documents on behalf of the XSL processor (the XSL processor will
use this to support the xsl:include and xsl:import elements and the
document() function)</li>
</ul>
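
<p>As an illustration of the second point, the sketch below registers an
<code>ErrorListener</code> that collects messages instead of letting the
processor print them. The class names <code>CollectingErrorListener</code> and
<code>FactoryConfigDemo</code> and the broken stylesheet are assumptions made
for the example; which messages a processor reports, and at which severity,
is implementation specific.</p>

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.ErrorListener;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

// Hypothetical listener that records every message it is handed
class CollectingErrorListener implements ErrorListener {
    final List<String> messages = new ArrayList<>();

    public void warning(TransformerException e) {
        messages.add("warning: " + e.getMessage());
    }

    public void error(TransformerException e) {
        messages.add("error: " + e.getMessage());
    }

    public void fatalError(TransformerException e) {
        messages.add("fatal: " + e.getMessage());
    }
}

class FactoryConfigDemo {

    /** Returns true if the stylesheet could not be compiled. */
    static boolean compileFails(String xsl, ErrorListener listener) {
        TransformerFactory factory = TransformerFactory.newInstance();
        factory.setErrorListener(listener);
        try {
            factory.newTemplates(new StreamSource(new StringReader(xsl)));
            return false;
        } catch (TransformerConfigurationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        CollectingErrorListener listener = new CollectingErrorListener();
        // Deliberately malformed stylesheet: the root element is never closed
        boolean failed = compileFails("<xsl:stylesheet>", listener);
        System.out.println("failed=" + failed + ", messages=" + listener.messages);
    }
}
```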

<p>The JAXP TransformerFactory can be queried at runtime to discover what
features it supports. For example, an application might want to know if a
particular factory implementation supports the use of SAX events as a source,
or whether it can write out transformation results as a DOM. Features are
queried with the factory's getFeature() method. In the above code, we could
add the following code before the try-catch block:</p><source>
    if (!factory.getFeature(StreamSource.FEATURE) || !factory.getFeature(StreamResult.FEATURE)) {
        System.err.println("Stream Source/Result not supported by TransformerFactory\nExiting....");
        System.exit(1);
    }</source>

<p>Other elements in the TrAX API are configurable. A Transformer object can
be passed settings that override the default output settings and the settings
defined in the stylesheet for indentation, output document type, etc.</p>
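
<p>A minimal sketch of such an override is shown below. The stylesheet and the
class name <code>OutputOverrideDemo</code> are assumptions made for the
example; only standard JAXP calls are used. The application asks the
<code>Transformer</code> to leave out the XML declaration that would otherwise
be written by default.</p>

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class OutputOverrideDemo {

    // Hypothetical stylesheet that has no xsl:output element of its own
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='/'><out>hi</out></xsl:template>"
        + "</xsl:stylesheet>";

    static String run(boolean omitDeclaration) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSL)));
        if (omitDeclaration) {
            // Overrides both the default and anything set in the stylesheet
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        }
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader("<in/>")),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```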

</s2>

<!--====================== ARCHITECTURE SECTION ========================-->

<anchor name="design"/>
<s2 title="XSLTC TrAX architecture">

<p>XSLTC's architecture fits neatly behind the TrAX interface. XSLTC's
compiler is put behind the <code>TransformerFactory</code> interface, the
translet class definition (either as a set of in-memory
<code>Class</code> objects or as a two-dimensional array of bytecodes on
disk) is encapsulated in the <code>Templates</code> implementation, and the
instantiated translet object is wrapped inside the <code>Transformer</code>
implementation. Figure 1 (below) shows this two-layered TrAX architecture:</p>

<p><img src="trax_translet_wrapping.gif" alt="TransletWrapping"/></p>
<p><ref>Figure 1: Translet class definitions are wrapped inside Templates objects</ref></p>

<p>The <code>TransformerFactory</code> implementation extends the
<code>SAXTransformerFactory</code> class and implements the
<code>ErrorListener</code> interface from the TrAX API.</p>

<p>The TrAX implementation has intentionally been kept completely separate
from the XSLTC native code. This prevents users of XSLTC's native API from
having to include the TrAX code in an application. All the code that makes
up our TrAX implementation resides in this package:</p><source>
    org.apache.xalan.xsltc.trax</source>

<p>Message to all XSLTC developers: Keep it this way! Do not mix TrAX
code with the native XSLTC code.</p>

</s2>

<!--======================= TRAX DESIGN SECTION ========================-->

<anchor name="detailed_design"/>
<s2 title="TrAX implementation details">

<p>The main components of our TrAX implementation are:</p>

<ul>
<li><link anchor="factory_design">the TransformerFactory class</link></li>
<li><link anchor="template_design">the Templates class</link></li>
<li><link anchor="transformer_design">the Transformer class</link></li>
<li><link anchor="transformer_design">output properties handling</link></li>
</ul>

<anchor name="factory_design"/>
<s3 title="TransformerFactory implementation">

<p>The methods that make up the basic <code>TransformerFactory</code>
interface are:</p><source>
    public Templates newTemplates(Source source);
    public Transformer newTransformer();
    public ErrorListener getErrorListener();
    public void setErrorListener(ErrorListener listener);
    public Object getAttribute(String name);
    public void setAttribute(String name, Object value);
    public boolean getFeature(String name);
    public URIResolver getURIResolver();
    public void setURIResolver(URIResolver resolver);
    public Source getAssociatedStylesheet(Source src, String media, String title, String charset);</source>

<p>And for the <code>SAXTransformerFactory</code> interface:</p><source>
    public TemplatesHandler newTemplatesHandler();
    public TransformerHandler newTransformerHandler();
    public TransformerHandler newTransformerHandler(Source src);
    public TransformerHandler newTransformerHandler(Templates templates);
    public XMLFilter newXMLFilter(Source src);
    public XMLFilter newXMLFilter(Templates templates);</source>

<p>And for the <code>ErrorListener</code> interface:</p><source>
    public void error(TransformerException exception);
    public void fatalError(TransformerException exception);
    public void warning(TransformerException exception);</source>

<s4 title="TransformerFactory basics">
<p>The very core of XSLTC's TrAX support is the implementation of
the basic <code>TransformerFactory</code> interface. This factory class is
more or less a wrapper around the XSLTC compiler and creates
<code>Templates</code> objects in which compiled translet classes can
reside. These <code>Templates</code> objects can then be used to create
<code>Transformer</code> objects. In cases where the
<code>Transformer</code> is created directly by the factory we will use
the <code>Templates</code> class internally. In that way the transformation
will appear to be done in one step from the user's point of view, while we
in reality use two steps. As described earlier, this is not the best approach
when using XSLTC, as it causes the stylesheet to be compiled for each and
every transformation.</p>
</s4>

<s4 title="TransformerFactory attribute settings">
<p>The <code>getAttribute()</code> and <code>setAttribute()</code> methods
only recognise two attributes: <code>translet-name</code> and
<code>debug</code>. The latter forces XSLTC to output debug
information (it dumps the stack in the very unlikely case of a failure). The
<code>translet-name</code> attribute can be used to set the default class
name for any nameless translet classes that the factory creates. A nameless
translet will, for instance, be created when the factory compiles a translet
for the identity transformation. There is a default name,
<code>GregorSamsa</code>, for nameless translets, so there is no absolute
need to set this attribute. (Gregor Samsa is the main character from Kafka's
"Metamorphosis" - transformations, metamorphosis - I am sure you
see the connection.)</p>
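
<p>Setting the attribute from application code might look like the sketch
below. The class name <code>TransletNameDemo</code> and the translet name used
are assumptions made for the example. Because attributes are vendor specific,
a factory other than XSLTC may reject the name, so the sketch treats
<code>IllegalArgumentException</code> as "not supported".</p>

```java
import javax.xml.transform.TransformerFactory;

class TransletNameDemo {

    /**
     * Sets the translet-name attribute and reads it back.
     * Returns null if the factory in use does not know the attribute.
     */
    static String nameTranslet(String name) {
        TransformerFactory factory = TransformerFactory.newInstance();
        try {
            factory.setAttribute("translet-name", name);
            return String.valueOf(factory.getAttribute("translet-name"));
        } catch (IllegalArgumentException e) {
            // Attribute not recognised by this TransformerFactory
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(nameTranslet("Invoice"));
    }
}
```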
</s4>

<s4 title="TransformerFactory stylesheet handling">
<p>The compiler can be passed a stylesheet through various methods in
the <code>TransformerFactory</code> interface. A stylesheet is passed in as
a <code>Source</code> object that contains either a DOM, a SAX parser or
a stream. The <code>getInputSource()</code> method handles all inputs and
converts them, if necessary, to SAX. The TrAX implementation contains an
adapter that will generate SAX events from a DOM, and this adapter is used
for DOM input. If the <code>Source</code> object contains a SAX parser, this
parser is just passed directly to the compiler. A SAX parser is instantiated
(using JAXP) if the <code>Source</code> object contains a stream.</p>
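
<p>From the application's side, the three kinds of input look as sketched
below. The stylesheet and the class name <code>StylesheetSources</code> are
assumptions made for the example; the code only uses standard JAXP calls and
says nothing about how the factory converts the input internally. Note that
the DOM has to be built by a namespace-aware parser, since XSLT elements are
recognised by their namespace.</p>

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class StylesheetSources {

    // Hypothetical stylesheet that writes the text "ok" for any input
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>ok</xsl:template>"
        + "</xsl:stylesheet>";

    static Source asStream() {
        return new StreamSource(new StringReader(XSL));
    }

    static Source asSax() {
        // No XMLReader is supplied; the factory has to provide one itself
        return new SAXSource(new InputSource(new StringReader(XSL)));
    }

    static Source asDom() throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
            .parse(new InputSource(new StringReader(XSL)));
        return new DOMSource(doc);
    }

    // Compiles the stylesheet from the given Source and runs it once
    static String apply(Source stylesheet) throws Exception {
        Templates templates = TransformerFactory.newInstance().newTemplates(stylesheet);
        StringWriter out = new StringWriter();
        templates.newTransformer().transform(
            new StreamSource(new StringReader("<in/>")), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(apply(asStream()) + " " + apply(asSax()) + " " + apply(asDom()));
    }
}
```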
</s4>

<s4 title="TransformerFactory URI resolver">
<p>A TransformerFactory needs a <code>URIResolver</code> to locate documents
that are referenced in <code>&lt;xsl:import&gt;</code> and
<code>&lt;xsl:include&gt;</code> elements. XSLTC has an internal interface
that shares the same purpose. This internal interface is implemented by the
<code>TransformerFactory</code>:</p><source>
    public InputSource loadSource(String href, String context, XSLTC xsltc);</source>
<p>This method will simply use any defined <code>URIResolver</code> and
proxy the call on to the URI resolver's <code>resolve()</code> method. This
method returns a <code>Source</code> object, which is converted to SAX
events and passed back to the compiler.</p>
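
<p>The application-side half of this mechanism can be sketched as follows.
The two in-memory stylesheets, the name <code>common.xsl</code> and the class
name <code>IncludeResolverDemo</code> are assumptions made for the example.
The resolver hands the included stylesheet to the processor from memory and
returns <code>null</code> for anything else, which leaves resolution to the
processor's default behaviour.</p>

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Source;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class IncludeResolverDemo {

    static final String XSLT_NS = "http://www.w3.org/1999/XSL/Transform";

    // Hypothetical included stylesheet, served from memory by the resolver
    static final String COMMON =
        "<xsl:stylesheet version='1.0' xmlns:xsl='" + XSLT_NS + "'>"
        + "<xsl:template name='greet'>hello from include</xsl:template>"
        + "</xsl:stylesheet>";

    // Hypothetical main stylesheet that includes common.xsl
    static final String MAIN =
        "<xsl:stylesheet version='1.0' xmlns:xsl='" + XSLT_NS + "'>"
        + "<xsl:include href='common.xsl'/>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'><xsl:call-template name='greet'/></xsl:template>"
        + "</xsl:stylesheet>";

    static String run() throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        factory.setURIResolver(new URIResolver() {
            public Source resolve(String href, String base) {
                // Serve the one document we know; defer everything else
                return "common.xsl".equals(href)
                    ? new StreamSource(new StringReader(COMMON))
                    : null;
            }
        });
        Templates templates =
            factory.newTemplates(new StreamSource(new StringReader(MAIN)));
        StringWriter out = new StringWriter();
        templates.newTransformer().transform(
            new StreamSource(new StringReader("<in/>")), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run());
    }
}
```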
</s4>
</s3>

<anchor name="template_design"/>
<s3 title="Templates design">

<s4 title="Templates creation">
<p>The <code>TransformerFactory</code> implementation invokes the XSLTC
compiler to generate the translet class and auxiliary classes. These classes
are stored inside our <code>Templates</code> implementation in a manner
which allows the <code>Templates</code> object to be serialized. By making
it possible to store <code>Templates</code> on stable storage we allow the
TrAX user to store/cache translet class(es), thus making room for XSLTC's
one-compilation-multiple-transformations approach. This was done by giving
the <code>Templates</code> implementation an array of byte-arrays that
contain the bytecodes for the translet class and its auxiliary classes. When
the user requests a <code>Transformer</code> instance from the
<code>Templates</code> object for the first time, we create one or more
<code>Class</code> objects from these byte arrays. Note that this is done
only once as long as the <code>Templates</code> object resides in memory. The
<code>Templates</code> object then invokes the JVM's class loader with the
class definition(s) to instantiate the translet class(es). The translet
objects are then wrapped inside a <code>Transformer</code> object, which is
returned to the client code:</p><source>
    // Contains the name of the main translet class
    private String _transletName = null;

    // Contains the actual class definition for the translet class and
    // any auxiliary classes (representing node sort records, predicates, etc.)
    private byte[][] _bytecodes = null;

    /**
     * Defines the translet class and auxiliary classes.
     * Returns a reference to the Class object that defines the main class
     */
    private Class defineTransletClasses() {
        TransletClassLoader loader = getTransletClassLoader();

        try {
            Class transletClass = null;
            final int classCount = _bytecodes.length;
            for (int i = 0; i &lt; classCount; i++) {
                Class clazz = loader.defineClass(_bytecodes[i]);
                if (clazz.getName().equals(_transletName))
                    transletClass = clazz;
            }
            return transletClass; // Could still be 'null'
        }
        catch (ClassFormatError e) {
            // Malformed bytecodes: no translet class could be defined
            return null;
        }
    }</source>
</s4>

<s4 title="Translet class loader">

<p>The <code>Templates</code> object will create the actual translet
<code>Class</code> object(s) the first time the
<code>newTransformer()</code> method is called. (The "first time" means the
first time either after the object was instantiated or the first time after
it has been read from storage using serialization.) These class(es) cannot
be created using the standard class loader since the method:</p><source>
    Class defineClass(String name, byte[] b, int off, int len);</source>

<p>of the ClassLoader is protected. XSLTC uses its own class loader that
extends the standard class loader:</p><source>
    // Our own private class loader - builds Class definitions from bytecodes
    private class TransletClassLoader extends ClassLoader {
        public Class defineClass(byte[] b) {
            return super.defineClass(null, b, 0, b.length);
        }
    }</source>

<p>This class loader is instantiated inside a privileged code section:</p><source>
    TransletClassLoader loader =
        (TransletClassLoader) AccessController.doPrivileged(
            new PrivilegedAction() {
                public Object run() {
                    return new TransletClassLoader();
                }
            }
        );</source>

<p>Then, when the newTransformer() method returns, it passes back an
instance of XSLTC's <code>Transformer</code> implementation that contains
an instance of the main translet class. (One transformation may need several
Java classes - for sort-records, predicates, etc. - but there is always one
main translet class.)</p>
</s4>

<s4 title="Class loader security issues">

<p>When XSLTC is placed inside a JAR file in the
<code>$JAVA_HOME/jre/lib/ext</code> directory, it is loaded by the extensions
class loader and not the default (bootstrap) class loader. The extensions class
loader does not look for class files/definitions in the user's
<code>CLASSPATH</code>. This can cause two problems: A) XSLTC does not find
classes for external Java functions, and B) XSLTC does not find translet or
auxiliary classes when used through the native API.</p>

<p>Both of these problems are caused by XSLTC internally calling the
<code>Class.forName()</code> method. This method will use the current class
loader to locate the desired class (be it an external Java class or a
translet/aux class). The problems are avoided by forcing XSLTC to use the
bootstrap class loader, as illustrated below:</p>

<p><img src="class_loader.gif" alt="ClassLoader"/></p>
<p><ref>Figure 2: Avoiding the extensions class loader</ref></p>

<p>These are the steps that XSLTC will go through to load a class:</p>

<ol>
<li>the application requests an instance of the transformer factory</li>
<li>the Java extensions mechanism locates XSLTC as the transformer
factory implementation using the extensions class loader</li>
<li>the extensions class loader loads XSLTC</li>
<li>XSLTC's compiler attempts to get a reference to an external Java
class, but the call to Class.forName() fails, as the extensions class
loader does not use the user's class path</li>
<li>XSLTC attempts to get a reference to the bootstrap class loader, and
requests it to load the external class</li>
<li>the bootstrap class loader loads the requested class</li>
</ol>

<p>Step 5 is only allowed if XSLTC has special permissions. But remember
that this problem only occurs when XSLTC is put in the
<code>$JAVA_HOME/jre/lib/ext</code> directory, where it is given all
permissions (by the default security file).</p>
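
<p>Steps 4 to 6 follow a "try the current loader, then fall back" pattern,
sketched below. This is not XSLTC's actual code: the class name
<code>ClassLookup</code> is hypothetical, and the fallback uses
<code>ClassLoader.getSystemClassLoader()</code>, the loader that searches the
user's class path, on the assumption that this is the loader the text above
refers to.</p>

```java
class ClassLookup {

    /**
     * Looks a class up through the loader of the calling code first and,
     * if that fails, through the loader that searches the class path.
     */
    static Class<?> find(String name) throws ClassNotFoundException {
        try {
            // Uses the class loader that loaded ClassLookup itself
            return Class.forName(name);
        } catch (ClassNotFoundException e) {
            // Fallback; may need extra permissions under a security policy
            return ClassLoader.getSystemClassLoader().loadClass(name);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(find("java.util.ArrayList").getName());
    }
}
```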
</s4>
</s3>

<anchor name="transformer_design"/>
<s3 title="Transformer detailed design">

<p>The <code>Transformer</code> class is a simple proxy that passes
transformation settings on to its translet instance before it invokes the
translet's <code>doTransform()</code> method. The <code>Transformer</code>'s
<code>transform()</code> method maps directly to the translet's
<code>doTransform()</code> method.</p>

<s4 title="Transformer input and output handling">
<p>The <code>Transformer</code> handles its input in a manner similar to
that of the <code>TransformerFactory</code>. It has two methods for
creating standard SAX input and output handlers for its input and output:</p><source>
    private DOMImpl getDOM(Source source, int mask);
    private ContentHandler getOutputHandler(Result result);</source>

<p>One aspect of the <code>getDOM</code> method is that it handles four
different types of <code>Source</code> objects. In addition to the standard
DOM, SAX and stream types, it also handles an extended
<code>XSLTCSource</code> input type. This input type is a lightweight
wrapper around XSLTC's internal DOM-like input tree. This allows the user
to create a cache or pool of XSLTC's native input data structures
containing the input XML document. The <code>XSLTCSource</code> class
is located in:</p><source>
    org.apache.xalan.xsltc.trax.XSLTCSource</source>
</s4>

<s4 title="Transformer parameter settings">
<p>XSLTC's native interface has get/set methods for stylesheet parameters,
identical to those of the TrAX API. The parameter handling methods of
the <code>Transformer</code> implementation are pure proxies.</p>
</s4>

<s4 title="Transformer output settings">
<p>The Transformer interface of TrAX has four methods for retrieving and
defining the transformation output document settings:</p><source>
    public Properties getOutputProperties();
    public String getOutputProperty(String name);
    public void setOutputProperties(Properties properties);
    public void setOutputProperty(String name, String value);</source>

<p>There are three levels of output settings. First there are the default
settings defined in the <link anchor="">XSLT 1.0 spec</link>, then there
are the settings defined in the attributes of the &lt;xsl:output&gt;
element, and finally there are the settings passed in through the TrAX
get/setOutputProperty() methods.</p>
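
<p>The precedence between the three levels can be modelled with chained
<code>java.util.Properties</code> objects, as in the sketch below. This only
illustrates the lookup order and is not how XSLTC stores its settings; the
class name <code>OutputSettingsLayers</code>, the chosen subset of properties
and the stylesheet and application values are assumptions made for the
example.</p>

```java
import java.util.Properties;

class OutputSettingsLayers {

    static Properties layered() {
        // Level 1: defaults from the XSLT 1.0 specification (a small subset)
        Properties specDefaults = new Properties();
        specDefaults.setProperty("method", "xml");
        specDefaults.setProperty("indent", "no");
        specDefaults.setProperty("omit-xml-declaration", "no");

        // Level 2: values from a hypothetical <xsl:output indent="yes"/>
        Properties stylesheet = new Properties(specDefaults);
        stylesheet.setProperty("indent", "yes");

        // Level 3: values set by the application through setOutputProperty()
        Properties application = new Properties(stylesheet);
        application.setProperty("omit-xml-declaration", "yes");

        // getProperty() falls through to the lower levels when a key is unset
        return application;
    }

    public static void main(String[] args) {
        Properties p = layered();
        System.out.println(p.getProperty("method") + " "
            + p.getProperty("indent") + " "
            + p.getProperty("omit-xml-declaration"));
    }
}
```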

<p><img src="trax_output_settings.gif" alt="Output settings"/></p>
<p><ref>Figure 3: Passing output settings from TrAX to the translet</ref></p>

<p>The AbstractTranslet class has a series of fields that contain the
default values for the output settings. The compiler/Output class will
compile code into the translet's constructor that updates these values
depending on the attributes in the &lt;xsl:output&gt; element. The
Transformer implementation keeps an instance of the java.util.Properties
class where it keeps all properties that are set by the
<code>setOutputProperty()</code> and the
<code>setOutputProperties()</code> methods. These settings are written to
the translet's output settings fields prior to initiating the
transformation.</p>
</s4>

<s4 title="Transformer URI resolver">
<p>The <code>setURIResolver()</code> method of the Transformer interface is
used to set a locator for documents referenced by the document() function
in XSL. The native XSLTC API has a defined interface for a DocumentCache.
The functionality provided by XSLTC's internal <code>DocumentCache</code>
interface is somewhat complementary to the <code>URIResolver</code>, and
the two can be used side-by-side. To accomplish this we needed to find out in
which ways the translet can load an external document:</p>

<p><img src="uri_resolver.gif" alt="URIResolver"/></p>
<p><ref>Figure 4: Using URIResolver and DocumentCache objects</ref></p>

<p>From the diagram we see that these four ways are:</p>

<ul>
<li>LoadDocument -&gt; .xml</li>
<li>LoadDocument -&gt; DocumentCache -&gt; .xml</li>
<li>LoadDocument -&gt; URIResolver -&gt; .xml</li>
<li>LoadDocument -&gt; DocumentCache -&gt; URIResolver -&gt; .xml</li>
</ul>
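
<p>The four paths listed above can be expressed as a single lookup routine in
which both the cache and the resolver are optional. The sketch below is a
simplified model and not XSLTC's implementation: the class name
<code>DocumentLoading</code> is hypothetical, and a plain <code>Map</code>
stands in for the <code>DocumentCache</code> interface. A real cache would
hold parsed documents rather than <code>Source</code> objects.</p>

```java
import java.util.HashMap;
import java.util.Map;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamSource;

class DocumentLoading {

    /**
     * Locates a document. Either 'cache' or 'resolver' (or both) may be null,
     * which yields the four combinations of cache and resolver usage.
     */
    static Source load(String uri, Map<String, Source> cache, URIResolver resolver)
            throws TransformerException {
        // Cache hit: neither the resolver nor the file is consulted again
        if (cache != null && cache.containsKey(uri)) {
            return cache.get(uri);
        }
        Source source = null;
        if (resolver != null) {
            // The resolver may return null to decline the request
            source = resolver.resolve(uri, null);
        }
        if (source == null) {
            // Direct access: treat the URI as the location of the .xml file
            source = new StreamSource(uri);
        }
        if (cache != null) {
            cache.put(uri, source);
        }
        return source;
    }

    public static void main(String[] args) throws Exception {
        Map<String, Source> cache = new HashMap<>();
        System.out.println(load("a.xml", cache, null).getSystemId());
    }
}
```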