This document specifies the format of a .japi file, version 0.9.7. The actual
implementations of japize, japifix and japicompat may not honor this spec
exactly. If they do not, it is generally a bug or missing feature in the tool;
this spec notes such bugs when they are known.

Purpose of Version 0.9.7
------------------------

Version 0.9.7 of the japi file format is a work in progress on the road to
figuring out how to deal with the new language constructs introduced in Java
1.5. Most of the new constructs are now representable, but some are ignored
because a suitable representation has not yet been determined. Currently, this
document and the tools using it are evolving in parallel, so 0.9.7 does not
refer to a single fixed format. Whether 0.9.7 is eventually frozen, or a
0.9.8 version is defined instead once the issues have been worked out, has
not yet been decided.

The only remaining feature that is not adequately representable in this version
is annotations. Specifically, 0.9.7 japi files cannot yet represent:
- Annotations applied to classes and members. The plan is to include all
  annotations that are @Documented.
- Default values of annotation methods that are of Array and Annotation types.

Types in a japi file get represented in different ways, depending on where they
appear. This spec identifies which representation should be used for each
item. The possible representations are:
- Java Language representation: Can only be used to represent classes and
  interfaces. The format is simply the format of a fully-qualified class name
  in the Java language, eg "java.lang.Exception". This format is used when
  only classes and interfaces can appear, eg in superclasses and exceptions.
  Inner classes are represented using the name that the compiler gives them:
  the name of the outer class followed by a "$" followed by the inner
  class name, eg "java.util.Map$Entry".
  Generic classes are represented by appending a comma-separated list of type
  arguments, enclosed in angle brackets, in Type Signature representation (see
  below). For example "java.util.List<Ljava/lang/String;>".
  For inner classes of generic classes, the type arguments of the outer class
  are prepended to the list, eg the Entry inner class of Map<String,int[]>
  would be "java.util.Map$Entry<Ljava/lang/String;,[I>". (This may
  change in a future version to more accurately reflect the semantics, but it is
  a simplistic approach that is easier to implement for now.)
  Type parameters are represented as "@0", "@1", "@2" etc in the order they
  appear on the containing class. Type parameters of an inner class are numbered
  starting where the parameters of the outer class leave off, unless the inner
  class is static. Likewise, type parameters of a method are numbered starting
  where the declaring class's parameters leave off, unless the method is static.
  There is NO representation of primitive types OR ARRAYS in Java Language
  representation (emphasis because it's easy to forget that JL representation
  can't express all reference types).
- Type Signature representation: Can represent any Java type, including
  primitive types and arrays. Up to Java 1.4 this was the format used internally
  within the JVM, but japitools has diverged from the JVM in its representation
  of the new 1.5 features. The format looks like this:
  - Primitive types are represented as single letter codes, specifically:
    boolean=Z, byte=B, char=C, short=S, int=I, long=J, float=F, double=D, void=V
  - Classes and interfaces are represented as the letter L, followed by the
    fully-qualified classname with periods replaced by slashes, followed by a
    semicolon. The class generally known as java.io.Writer would be represented
    as "Ljava/io/Writer;". Inner classes are represented by the name the
    compiler gives them, just as above.
  - Array types are represented by the character "[" followed by the type
    of elements in the array. For example, an array of java.util.Dates would
    be represented as "[Ljava/util/Date;" and a two-dimensional array of
    ints (an array of arrays of ints) would be "[[I".
  - In the special case of a method which takes a variable number of arguments
    (which Java implements internally as an array parameter), the leading "["
    should be replaced by a ".". For example, a method taking a variable number
    of int arguments would be represented as ".I" and a variable number of
    arrays of strings would be ".[Ljava/lang/String;". Note that a special
    rule applies for ordering methods with this kind of parameter - see
    "Ordering", below.
  - Generic types are represented by inserting a comma-separated list of type
    arguments in Type Signature representation, enclosed in angle brackets,
    immediately before the trailing semicolon. For example
    "Ljava/util/Map<Ljava/lang/String;,[I>;". Note that a special rule applies
    for ordering methods with generic-typed parameters - see "Ordering", below.
  - Type parameters are represented as "@0", "@1", "@2", exactly as in Java
    Language representation. Note that a special rule applies for ordering
    methods with parameters of these types - see "Ordering", below.
  - Wildcard types are represented by the upper or lower bound, as follows:
    - "? extends pkg.Foo" becomes "{Lpkg/Foo;"
    - "? super pkg.Foo" becomes "}Lpkg/Foo;"
    - "?" is equivalent to "? extends java.lang.Object" and hence becomes
      "{Ljava/lang/Object;". An open question is whether the actual bound of
      the parameter the wildcard is being used in should appear here instead;
      this may make a difference to erasure.
- Japi sortable representation: Designed to make it easy to sort a japi file
  into a convenient order. The format is:
    <pkg>,<class>[$<innerclass>].
  In this format, <pkg> is the package that the class appears in (dot separated,
  eg "java.util"), <class> is the full name of the outer class (eg "Map") and
  <innerclass> is the name of the inner class (eg "Entry"), if applicable. The
  $ is omitted for top-level classes. Note that it is theoretically possible for
  a class to have a $ in its name without being an inner class: the $ in that
  case should be \u escaped (see "Escaping", below; note that this is not yet
  implemented by any of the japitools programs). Multiple levels of
  inner-classness are represented the obvious way. So the class java.util.Map
  is represented as "java.util,Map", and its Entry inner class is represented
  as "java.util,Map$Entry". No special consideration applies to generic types
  in this representation: they never appear here.
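
The relationship between the three representations can be sketched in a few lines of Python. This is only an illustration, not code taken from japitools: the function names are invented for this example, escaping is ignored, and the last helper deliberately refuses generic signatures rather than trying to render them.

```python
PRIMITIVES = {
    "Z": "boolean", "B": "byte", "C": "char", "S": "short", "I": "int",
    "J": "long", "F": "float", "D": "double", "V": "void",
}


def sortable_to_java_language(sortable: str) -> str:
    """Convert 'java.util,Map$Entry' to 'java.util.Map$Entry'."""
    if "," not in sortable:
        raise ValueError(f"not in Japi sortable representation: {sortable!r}")
    pkg, _, cls = sortable.partition(",")
    # A class in the default package has an empty <pkg> part.
    return f"{pkg}.{cls}" if pkg else cls


def java_language_to_type_signature(name: str) -> str:
    """Convert 'java.util.Map$Entry' to 'Ljava/util/Map$Entry;'.

    Type arguments are already in Type Signature representation, so only
    the class name in front of the first "<" has its periods replaced.
    """
    head, bracket, args = name.partition("<")
    return "L" + head.replace(".", "/") + bracket + args + ";"


def describe_simple_signature(sig: str) -> str:
    """Render a non-generic type signature in Java-like notation."""
    if any(ch in sig for ch in "<@{}"):
        raise ValueError("generic signatures are outside this sketch")
    if sig.startswith("["):
        return describe_simple_signature(sig[1:]) + "[]"
    if sig.startswith("."):
        # Varargs: the leading "[" was replaced by ".".
        return describe_simple_signature(sig[1:]) + "..."
    if sig in PRIMITIVES:
        return PRIMITIVES[sig]
    if sig.startswith("L") and sig.endswith(";"):
        return sig[1:-1].replace("/", ".")
    raise ValueError(f"not a type signature: {sig!r}")
```

For instance, describe_simple_signature("[[I") yields "int[][]", matching the two-dimensional array example above.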

Escaping
--------

Note: Much of this section is not yet implemented by any existing japitools
programs.

The Java language is a fully Unicode-enabled language with support for multibyte
characters at every level of the language. However, the japi file format is
strictly 7-bit ASCII, for ease of processing in (for example) older versions of
Perl, which are not Unicode-capable. To support this, characters outside the
normal ASCII ranges are escaped. Characters are escaped as follows: The newline
character is escaped to "\n", the backslash character is escaped to "\\", and
all other characters are escaped as "\uXXXX", where XXXX is the lowercase
hexadecimal rendition of the integer value of the "char" type in Java.

In class names, all characters should be escaped except for A-Z, a-z, 0-9, _ and
any metacharacters that have meaning in the particular representation being
used. In Java Language representation, the metacharacters are ".$"; in Type
Signature representation, they are "/$;"; and in Japi Sortable representation
they are ".,$".

In field and method names, all characters should be escaped except for A-Z, a-z,
0-9 and _.

In constant strings, all characters outside the range from " " to "~" in ASCII
value should be escaped, along with the backslash ("\") character.

Existing implementations only escape backslash and newline, and only do so in
constant strings. This covers the vast majority of what will actually arise.
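
As a rough illustration of the constant-string rule, here is a minimal Python sketch. It is not the code used by japize (which, as noted, only escapes backslash and newline); the function name is made up for this example. Because the "\uXXXX" form is defined in terms of Java's 16-bit "char", characters outside the Basic Multilingual Plane are written as two escaped UTF-16 code units.

```python
def escape_constant_string(value: str) -> str:
    """Escape a constant string so that the result is printable 7-bit ASCII."""
    out = []
    for ch in value:
        if ch == "\n":
            out.append("\\n")
        elif ch == "\\":
            out.append("\\\\")
        elif " " <= ch <= "~":
            # Printable ASCII other than backslash is left alone.
            out.append(ch)
        else:
            # One \uXXXX escape per UTF-16 code unit, in lowercase hex.
            units = ch.encode("utf-16-be", "surrogatepass")
            for i in range(0, len(units), 2):
                out.append("\\u%04x" % int.from_bytes(units[i:i + 2], "big"))
    return "".join(out)
```

The backslash test has to come before the printable-range test, since the backslash itself lies between " " and "~".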

Inclusion
---------

Japi files may be created for any set of classes and interfaces, although
typically they will be made for particular complete packages that make up a
public API. However, it is not permitted to create a japi file for only part of
a class, and it is not permitted to create a japi file for an inner class
without its containing class, or vice versa. Only public and protected classes
and interfaces can be included in a japi file.

For each class and/or interface that is included, all its public and protected
fields, methods and constructors must be included. This includes fields and
methods (but not constructors!) inherited from public or protected superclasses.

Due to the way generics are implemented in Java there are situations where a
method may appear to be present to a generic-aware compiler but not to a
pre-generics compiler, or vice versa; or where the same method may appear to
take differently typed arguments depending on whether the compiler is
generic-aware or not. For example:

    class Super<T> {
      public void meth(T t) {}
    }
    class Sub extends Super<String> {
    }

Sub appears to have a meth(String) method on a generic-aware compiler and a
meth(Object) method on a pre-generics compiler. In order to support
comparisons of either kind of API, the method will be included both ways in
the japi file. See the discussion of how methods are represented for more
details.

Ordering
--------

In order to allow efficient comparison of japi files, the items in the file are
required to be in a strict order. The items are sorted first by package, then
by class (or interface), then by member.

Packages are sorted alphabetically by name, except that java.lang and all its
subpackages (eg java.lang.reflect and java.lang.ref) are placed first. Classes
and interfaces are sorted by name, with inner classes coming after the
corresponding outer class, except for java.lang.Object which sorts first of
all.

Within a class or interface, the class (or interface) itself comes first,
followed by its fields (in alphabetical order by name), followed by its
constructors (in alphabetical order by parameter types), followed by its
methods (in alphabetical order by name and parameter types).

"By parameter types" here means by the result of concatenating the types of all
the parameters in type signature format, EXCEPT that all 1.5-specific features
are ignored. Specifically:
- Generic type parameters (@0, @1 etc) are replaced by the constraining type.
- Types that *have* generic type parameters (anything in <>s) have them removed.
- Varargs array types (starting with ".") are treated as regular array types
  (replacing the "." with "[").

This resulting string does not appear anywhere in the japi file; it is merely
constructed in memory for the purposes of sorting. The purpose of this
algorithm is to ensure that the ordering of methods using 1.5-specific
constructs is identical to the ordering they would have under an older JDK
version. This is necessary to support meaningful comparisons between, for
example, non-generic and generic versions of the same API, such as java.util
between JDK1.4 and JDK1.5.

If any package, class or member names include any escaped characters, it is the
escaped string that is compared, rather than the original.
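
The "by parameter types" sort key described above could be computed along the following lines. This is a sketch under some assumptions rather than the algorithm used by japize: the function names are invented, the bounds of the type parameters in scope are assumed to be supplied by the caller as a list indexed by n (for "@n"), and where a parameter has several bounds joined by "&" the first one is taken as the constraining type.

```python
import re

_GENERIC_ARGS = re.compile(r"<[^<>]*>")
_TYPE_PARAM = re.compile(r"@(\d+)")


def erase_for_sorting(sig, bounds=()):
    """Return a type signature with all 1.5-specific features removed."""
    while True:
        before = sig
        # Drop type arguments, innermost first, until none remain.
        while _GENERIC_ARGS.search(sig):
            sig = _GENERIC_ARGS.sub("", sig)
        # Replace each remaining "@n" with the first bound of parameter n.
        sig = _TYPE_PARAM.sub(
            lambda m: bounds[int(m.group(1))].split("&")[0], sig)
        if sig == before:
            break
    # A varargs parameter sorts as the array it really is.
    if sig.startswith("."):
        sig = "[" + sig[1:]
    return sig


def parameter_sort_key(argtypes, bounds=()):
    """Concatenate the erased parameter types; used for sorting only."""
    return "".join(erase_for_sorting(sig, bounds) for sig in argtypes)
```

With bounds of ["Ljava/lang/Object;"], a method taking (I,@0) gets the key "ILjava/lang/Object;" and so sorts before an overload taking (@0) alone, just as the corresponding (int, Object) and (Object) overloads would in a pre-1.5 file.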

File Format: Compression
------------------------

Japi files may optionally be compressed by gzip. A compressed japi file should
be named *.japi.gz; an uncompressed one should be named *.japi. Tools for
reading japi files may rely on this naming convention; tools for creating japi
files should enforce it where possible.
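
A reader that relies on the naming convention might look like the following Python sketch. The helper name open_japi is hypothetical; the only library call involved is the standard gzip.open in text mode.

```python
import gzip


def open_japi(path):
    """Open a japi file for reading as text, choosing by file name."""
    if path.endswith(".japi.gz"):
        return gzip.open(path, "rt", encoding="ascii")
    if path.endswith(".japi"):
        return open(path, "r", encoding="ascii")
    raise ValueError(f"not named *.japi or *.japi.gz: {path!r}")
```

Reading as ASCII is a deliberate choice here: a japi file is strictly 7-bit, so any decoding error points at a malformed file.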

File Format: First Line
-----------------------

The first line of a japi file indicates the file format version. This line is
guaranteed to follow the same format for all future releases. Thus, this section
(only!) of the spec covers all japi file format versions, past and future. With
this information you can identify whether a file is a japi file, and which
version it is. However, this spec does not provide any information about the
content of the rest of the file for any version other than 0.9.7. Both past and
future versions can be assumed to be entirely incompatible with this spec from
the second line onwards.

The first line of any japi file since version 0.8.1 is as follows:
    %%japi <version>[ <info>]
<version> indicates the version number of the japi file format. While this
version number vaguely correlates with the version number of japitools releases,
it is not the same number. Often japitools releases do not require a file format
change, and sometimes I mess with the file version number for other reasons,
with mixed results. The contents of <info> may vary from version to version,
or it and its leading space may be omitted altogether; implementations should
parse and ignore it (if present) except when the file format version indicates
they can understand it.

For completeness, I should mention file format versions prior to 0.8.1. Version
0.7 can be recognized by the following regular expression:
    /^[^ ]+#[^ ]* (public|protected) (abstract|concrete) (static|instance) (final|nonfinal) /
Version 0.8 can be recognized by the following regular expression:
    /^[^ ]+#[^ ]* [Pp][ac][si][fn] /
These version numbers were invented retroactively, of course :)
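
Putting the three recognition rules together, version detection could be sketched as follows in Python. The function name is invented for this example, and the two legacy patterns are the regular expressions given above, transcribed unchanged.

```python
import re

HEADER_RE = re.compile(r"^%%japi ([^ ]+)(?: (.*))?$")
V07_RE = re.compile(
    r"^[^ ]+#[^ ]* (public|protected) (abstract|concrete) "
    r"(static|instance) (final|nonfinal) ")
V08_RE = re.compile(r"^[^ ]+#[^ ]* [Pp][ac][si][fn] ")


def detect_version(first_line):
    """Return the file format version as a string, or None if unrecognized."""
    line = first_line.rstrip("\r\n")
    header = HEADER_RE.match(line)
    if header:
        return header.group(1)
    if V07_RE.match(line):
        return "0.7"
    if V08_RE.match(line):
        return "0.8"
    return None
```

A caller would typically refuse to read any further unless the result is a version it knows how to handle, since nothing after the first line is guaranteed across versions.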

File Format: File information
-----------------------------

The <info> field contains a list of space-separated name=value pairs. Unknown
names should be ignored. Neither the name nor value may include spaces. The
following names and values are permitted:
    date=yyyy/mm/dd_hh:mm:ss_TZ
    origver=<original japi file version that this was updated from>
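
Parsing the <info> field is straightforward; a possible Python sketch follows. Both function names are hypothetical, and keeping the time zone as an uninterpreted string is a choice made for this example, since this spec does not say which zone names may appear.

```python
from datetime import datetime


def parse_info(info):
    """Split an <info> field into a dict of name=value pairs."""
    pairs = {}
    for token in info.split(" "):
        name, sep, value = token.partition("=")
        if sep:  # tokens without "=" are silently skipped
            pairs[name] = value
    return pairs


def parse_date(value):
    """Split a date value into a naive datetime and the time zone string."""
    day, clock, zone = value.split("_", 2)
    return datetime.strptime(f"{day} {clock}", "%Y/%m/%d %H:%M:%S"), zone
```

The dict keeps every pair it finds; ignoring unknown names then simply means the caller only looks up the names it understands.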

File Format: API Items
----------------------

Every line after the first in a japi file represents an individual item in the
API. These lines must appear in a specific order; see "Ordering" above for the
specific requirements.
The format of these lines is as follows:
    <plus><class>!<member> <modifiers> <typeinfo>

<plus> is "++" for all members of java.lang.Object, "+" for all members of all
classes in java.lang and its subpackages, or nothing ("") for all other API
items. This allows java.lang.Object and java.lang to appear in the right
places in a purely alphabetical sort.
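
The effect of <plus> can be seen with a few lines of Python. The function name is invented for this illustration; the class names are given in Japi Sortable representation.

```python
def plus_prefix(sortable):
    """Return the <plus> string for a class in Japi Sortable representation."""
    pkg, _, cls = sortable.partition(",")
    if pkg == "java.lang" and cls == "Object":
        return "++"
    if pkg == "java.lang" or pkg.startswith("java.lang."):
        return "+"
    return ""


classes = [
    "java.util,Map",
    "java.lang,String",
    "java.lang,Object",
    "java.io,File",
    "java.lang.ref,Reference",
]
# A plain ASCII sort of the prefixed names gives the required order.
ordered = sorted(plus_prefix(c) + c for c in classes)
```

This works because "+" sorts before "," and "," sorts before "." and before any letter in ASCII, so java.lang.Object comes first, then java.lang itself, then its subpackages, then everything else.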

<class> is the name of the class, in Japi Sortable representation.

<member> is one of the following depending on what type of API item is being
described:
- The empty string, to refer to the class or interface itself.
- #<fieldname>, to refer to a field.
- (<argtypes>), to refer to a constructor.
- <methodname>(<argtypes>)<gnote>, to refer to a method.
<fieldname> and <methodname> are simply the name of the field or method, eg
"toString". <argtypes> is a comma-separated list of argument types, with each
one listed in Type Signature representation, eg "[B,I,I,Ljava/lang/String;".
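
To make the layout of an item line concrete, here is a small Python sketch that splits one into its parts. It is not taken from japicompat; the function names and the keys of the returned dict are made up for this example, and no unescaping is attempted. Argument types are split only at commas outside angle brackets, because generic type arguments contain commas of their own.

```python
def split_top_level(text, sep=","):
    """Split text at each sep that is not nested inside <...>."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    if text:
        parts.append("".join(current))
    return parts


def parse_item(line):
    """Split '<plus><class>!<member> <modifiers> <typeinfo>' into a dict."""
    # <typeinfo> may contain spaces (in constant strings), so split twice only.
    name, modifiers, typeinfo = line.rstrip("\r\n").split(" ", 2)
    bare = name.lstrip("+")
    item = {"plus": name[:len(name) - len(bare)],
            "modifiers": modifiers, "typeinfo": typeinfo}
    item["class"], _, member = bare.partition("!")
    if member == "":
        item["kind"] = "class"
    elif member.startswith("#"):
        item["kind"], item["name"] = "field", member[1:]
    else:
        head, _, rest = member.partition("(")
        args, _, gnote = rest.rpartition(")")
        item["kind"] = "method" if head else "constructor"
        item["name"], item["gnote"] = head, gnote
        item["argtypes"] = split_top_level(args)
    return item
```

The "class" kind here covers interfaces, enums and annotations too; telling them apart requires looking at the start of <typeinfo>.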

<gnote> is usually the empty string, but for methods that are only present for
a generics-aware compiler, a "+" appears here, and for methods that are only
present to a pre-generics compiler, a "-" appears here. Thus the class Sub
used as an example under "Inclusion" above would get:
    ,Sub!meth(Ljava/lang/Object;)-
    ,Sub!meth(Ljava/lang/String;)+

The algorithm for determining exactly which methods should be annotated this
way is complex and attempting to specify it here would almost certainly lead
to simply enshrining bugs in the algorithm as spec requirements. The code of
Japize gives one possible implementation, but it will be fixed if it turns out
not to reflect what's really visible to the two different kinds of compiler.

Consumers of Japi files should consider excluding one or the other kind of
method in order to get a coherent view of the API.
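
One way for a consumer to pick a coherent view is to drop the methods marked for the other kind of compiler. A minimal Python sketch, with an invented function name, operating on the "<class>!<member>" part of each item line:

```python
def visible_members(members, generics_aware):
    """Keep only the members visible to the chosen kind of compiler."""
    # "-" marks pre-generics-only methods, "+" marks generics-only ones.
    hidden = "-" if generics_aware else "+"
    return [m for m in members if not m.endswith(")" + hidden)]


members = [
    ",Sub!",
    ",Sub!meth(Ljava/lang/Object;)-",
    ",Sub!meth(Ljava/lang/String;)+",
    ",Sub!toString()",
]
generic_view = visible_members(members, generics_aware=True)
legacy_view = visible_members(members, generics_aware=False)
```

Members with an empty <gnote> appear in both views.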

<modifiers> is a five-character string indicating the modifiers of the item:

- The first character is either "P" for public or "p" for protected.
  Package-private (default access) and private items do not appear in japi
  files.

- The second character is either "a" for abstract or "c" for concrete
  (non-abstract). Interfaces, and methods on interfaces, are always considered
  abstract regardless of whether they are explicitly set as such.

- The third character is either "s" for static or "i" for instance (non-static).
  Top-level classes are always considered static.

- The fourth character is either "f" for final, "n" for nonfinal, or "e" for
  the special fields that are the values of an "enum" type. Methods on final
  classes are always considered final regardless of whether they are explicitly
  set as such.

- The fifth character is either "d" for deprecated, "u" for undeprecated, or
  "?" if the deprecation status is unknown.

<typeinfo> is a catch-all for general information about the item. What exactly
appears here depends on what type of item this is.

- For classes, the <typeinfo> field consists of the following parts:

  - The word "class".
  - If the class has any generic parameters, the character "<", followed by a
    comma-separated list of the bounds of each parameter in Type Signature
    representation, followed by the character ">". The bounds of a type
    parameter with multiple bounds are separated by "&" just as in the Java
    language.
    The type parameters of the containing type are not included; it is up to
    the consumer of the japi file to calculate the total list of parameters
    if it needs them (eg to determine which parameter "@n" refers to for
    some n). Note that the types in this list may include references to
    other types in the same list - the famous example being Enum<T extends
    Enum<T>> which renders as "class<Ljava/lang/Enum<@0>;>". Another example
    would be Foo<T, B extends T> which would render as
    "class<Ljava/lang/Object;,@0>".
  - If the class is serializable, the character "#" followed by the
    SerialVersionUID of the class, represented in decimal. If the class is
    not serializable, this section does not appear.
  - For each superclass, in order, the character ":" followed by the
    name of the superclass in Java Language representation. Superclasses that
    are neither public nor protected are omitted. In the case of
    java.lang.Object, which has no superclass, this section does not appear.
  - For each interface implemented by the class, in *any* order, the character
    "*" followed by the name of the interface in Java Language representation.
    Interfaces that are neither public nor protected are omitted. Note that
    interfaces implemented by superclasses, or by other interfaces, must also be
    included, even if the superclass that implements the interface isn't
    public.

For example, for the class java.util.ArrayList, the typeinfo would be something
like:
    class<Ljava/lang/Object;>#99999999999999:java.util.AbstractList<@0>:java.util.AbstractCollection<@0>:java.lang.Object*java.io.Serializable*java.lang.Cloneable*java.util.List<@0>*java.util.Collection<@0>

- For interfaces, the format is the same except that "interface" appears instead
  of "class", and the SerialVersionUID and superclasses are not applicable.

- For enums, the format is identical to that of classes except that "enum"
  appears instead of "class".

- For annotations, the format is identical to that of interfaces except that
  "annotation" appears instead of "interface".

- For fields, the <typeinfo> field consists of the following parts:
  - The type of the field, in Type Signature format.
  - If the field is inherited from a superclass, the character "=" followed by
    the name of the class that declares it, in Java Language representation.
    Only the name of the class appears, not any of its generic parameters.
  - If the field is a constant value other than null, the character ":" followed
    by the value of the field. In the case of a char value, this is the integer
    value of the character; in the case of a string, it is the escaped string.
    For all other values it is the result of Java's default conversion of that
    type to a string. For float and double constants, the stringified value may
    be followed by "/" and the result of toHexString() called on the result of
    floatToRawIntBits() or doubleToRawLongBits(), respectively. This is
    unambiguous since there is no way for the stringified value of a float or
    double to contain "/". This hex string is optional but it is recommended to
    include it if possible.

- For constructors, the <typeinfo> field consists of the following parts:
  - The word "constructor".
  - For every exception type thrown by this constructor, the character "*"
    followed by the name of the exception type, in Java Language representation.
    These may appear in any order. Subclasses of java.lang.RuntimeException and
    java.lang.Error must be omitted, as must any exception that is a subclass of
    another exception that can be thrown (eg if both java.io.IOException and
    java.io.FileNotFoundException can be thrown, only java.io.IOException should
    be listed).

- For methods, the <typeinfo> field consists of the following parts:
  - If the method is a generic method, the character "<", followed by a
    comma-separated list of type parameter bounds, followed by the character
    ">". The format of the type parameters is identical to that used for the
    type parameters of classes, described above.
  - The return type of the method, in Type Signature format.
  - For every exception type thrown by this method, the character "*" followed
    by the name of the exception type, in Java Language representation. These
    may appear in any order. Subclasses of java.lang.RuntimeException and
    java.lang.Error must be omitted, as must any exception that is a subclass of
    another exception that can be thrown (eg if both java.io.IOException and
    java.io.FileNotFoundException can be thrown, only java.io.IOException should
    be listed). When a type parameter (@0 etc) is thrown, it should currently be
    included regardless of whether its bounds guarantee that it is a subclass of
    RuntimeException, Error or another thrown exception. This is likely to
    change in a future version of this specification.
  - If the method is part of an annotation, has a default value, and is of type
    String, Class, or a primitive type, the character ":" followed by the
    default value. The format of the default value is identical to that used by
    constant fields for Strings and primitive types, or the name of the class in
    Type Signature representation for Class types. Default values on array- and
    annotation-typed methods are not yet supported.
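
As an illustration of how the <typeinfo> of a class-like item (class, interface, enum or annotation) breaks down, here is a Python sketch. It is an assumption-laden example rather than code from the japitools programs: the function names and dict keys are invented, and it relies on "#", ":" and "*" never appearing unescaped inside a class name or type signature. The serial number in the test string is a placeholder, not the real SerialVersionUID of ArrayList.

```python
import re


def split_top_level(text, sep=","):
    """Split text at each sep that is not nested inside <...>."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    if text:
        parts.append("".join(current))
    return parts


def parse_class_typeinfo(typeinfo):
    """Break the <typeinfo> of a class-like item into its parts."""
    word = re.match(r"[a-z]+", typeinfo)
    if word is None:
        raise ValueError(f"no leading keyword in {typeinfo!r}")
    rest = typeinfo[word.end():]
    bounds = []
    if rest.startswith("<"):
        # Find the ">" that closes the bounds list, allowing nested <...>.
        depth = 0
        for end, ch in enumerate(rest):
            if ch == "<":
                depth += 1
            elif ch == ">":
                depth -= 1
                if depth == 0:
                    break
        else:
            raise ValueError(f"unbalanced angle brackets in {typeinfo!r}")
        bounds = split_top_level(rest[1:end])
        rest = rest[end + 1:]
    info = {"kind": word.group(0), "bounds": bounds, "serial": None,
            "superclasses": [], "interfaces": []}
    for marker, value in re.findall(r"([#:*])([^#:*]*)", rest):
        if marker == "#":
            info["serial"] = int(value)
        elif marker == ":":
            info["superclasses"].append(value)
        else:
            info["interfaces"].append(value)
    return info
```

Superclasses keep the order in which they appear, since that order is significant; interfaces may appear in any order, so a consumer comparing two files should treat that list as a set.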

This specification should provide enough information to both read and write
japi files. Any questions or comments, especially if anything in this spec is
unclear, should be directed to stuart.a.ballard@gmail.com.