~ubuntu-branches/ubuntu/quantal/tiff/quantal

Viewing changes to html/TIFFTechNote2.html

Committer: Bazaar Package Importer
Author(s): Jay Berkenbilt
Date: 2009-08-28 15:44:23 UTC
mfrom: (1.1.3 upstream)
Revision ID: james.westby@ubuntu.com-20090828154423-7oisj77n302jrroa

Tags: 3.9.1-1

New upstream release

files added:
COPYRIGHT

ChangeLog

HOWTO-RELEASE

Makefile.am

Makefile.in

Makefile.vc

README

README.vms

RELEASE-DATE

SConstruct

TODO

VERSION

aclocal.m4

autogen.sh

build

build/Makefile.am

build/Makefile.in

build/README

config

config/compile

config/config.guess

config/config.sub

config/depcomp

config/install-sh

config/ltmain.sh

config/missing

config/mkinstalldirs

configure

configure.ac

configure.com

contrib

contrib/Makefile.am

contrib/Makefile.in

contrib/README

contrib/acorn

contrib/acorn/Makefile.acorn

contrib/acorn/Makefile.am

contrib/acorn/Makefile.in

contrib/acorn/ReadMe

contrib/acorn/SetVars

contrib/acorn/cleanlib

contrib/acorn/convert

contrib/acorn/install

contrib/addtiffo

contrib/addtiffo/Makefile.am

contrib/addtiffo/Makefile.in

contrib/addtiffo/Makefile.vc

contrib/addtiffo/README

contrib/addtiffo/addtiffo.c

contrib/addtiffo/tif_overview.c

contrib/addtiffo/tif_ovrcache.c

contrib/addtiffo/tif_ovrcache.h

contrib/dbs

contrib/dbs/Makefile.am

contrib/dbs/Makefile.in

contrib/dbs/README

contrib/dbs/tiff-bi.c

contrib/dbs/tiff-grayscale.c

contrib/dbs/tiff-palette.c

contrib/dbs/tiff-rgb.c

contrib/dbs/xtiff

contrib/dbs/xtiff/Makefile.am

contrib/dbs/xtiff/Makefile.in

contrib/dbs/xtiff/README

contrib/dbs/xtiff/patchlevel.h

contrib/dbs/xtiff/xtiff.c

contrib/dbs/xtiff/xtifficon.h

contrib/iptcutil

contrib/iptcutil/Makefile.am

contrib/iptcutil/Makefile.in

contrib/iptcutil/README

contrib/iptcutil/iptcutil.c

contrib/iptcutil/test.iptc

contrib/iptcutil/test.txt

contrib/mac-cw

contrib/mac-cw/Makefile.am

contrib/mac-cw/Makefile.in

contrib/mac-cw/Makefile.script

contrib/mac-cw/README

contrib/mac-cw/mac_main.c

contrib/mac-cw/mac_main.h

contrib/mac-cw/metrowerks.note

contrib/mac-cw/mkg3_main.c

contrib/mac-cw/version.h

contrib/mac-mpw

contrib/mac-mpw/BUILD.mpw

contrib/mac-mpw/Makefile.am

contrib/mac-mpw/Makefile.in

contrib/mac-mpw/README

contrib/mac-mpw/libtiff.make

contrib/mac-mpw/mactrans.c

contrib/mac-mpw/port.make

contrib/mac-mpw/tools.make

contrib/mac-mpw/top.make

contrib/mfs

contrib/mfs/Makefile.am

contrib/mfs/Makefile.in

contrib/mfs/README

contrib/mfs/mfs_file.c

contrib/pds

contrib/pds/Makefile.am

contrib/pds/Makefile.in

contrib/pds/README

contrib/pds/tif_imageiter.c

contrib/pds/tif_imageiter.h

contrib/pds/tif_pdsdirread.c

contrib/pds/tif_pdsdirwrite.c

contrib/ras

contrib/ras/Makefile.am

contrib/ras/Makefile.in

contrib/ras/README

contrib/ras/ras2tif.c

contrib/ras/tif2ras.c

contrib/stream

contrib/stream/Makefile.am

contrib/stream/Makefile.in

contrib/stream/README

contrib/stream/tiffstream.cpp

contrib/stream/tiffstream.h

contrib/tags

contrib/tags/Makefile.am

contrib/tags/Makefile.in

contrib/tags/README

contrib/tags/listtif.c

contrib/tags/maketif.c

contrib/tags/xtif_dir.c

contrib/tags/xtiffio.h

contrib/tags/xtiffiop.h

contrib/win_dib

contrib/win_dib/Makefile.am

contrib/win_dib/Makefile.in

contrib/win_dib/Makefile.w95

contrib/win_dib/README.Tiffile

contrib/win_dib/README.tiff2dib

contrib/win_dib/Tiffile.cpp

contrib/win_dib/tiff2dib.c

html

html/Makefile.am

html/Makefile.in

html/TIFFTechNote2.html

html/addingtags.html

html/bugs.html

html/build.html

html/contrib.html

html/document.html

html/images

html/images.html

html/images/Makefile.am

html/images/Makefile.in

html/images/back.gif

html/images/bali.jpg

html/images/cat.gif

html/images/cover.jpg

html/images/cramps.gif

html/images/dave.gif

html/images/info.gif

html/images/jello.jpg

html/images/jim.gif

html/images/note.gif

html/images/oxford.gif

html/images/quad.jpg

html/images/ring.gif

html/images/smallliz.jpg

html/images/strike.gif

html/images/warning.gif

html/index.html

html/internals.html

html/intro.html

html/libtiff.html

html/man

html/man/Makefile.am

html/man/Makefile.in

html/man/TIFFClose.3tiff.html

html/man/TIFFDataWidth.3tiff.html

html/man/TIFFError.3tiff.html

html/man/TIFFFlush.3tiff.html

html/man/TIFFGetField.3tiff.html

html/man/TIFFOpen.3tiff.html

html/man/TIFFPrintDirectory.3tiff.html

html/man/TIFFRGBAImage.3tiff.html

html/man/TIFFReadDirectory.3tiff.html

html/man/TIFFReadEncodedStrip.3tiff.html

html/man/TIFFReadEncodedTile.3tiff.html

html/man/TIFFReadRGBAImage.3tiff.html

html/man/TIFFReadRGBAStrip.3tiff.html

html/man/TIFFReadRGBATile.3tiff.html

html/man/TIFFReadRawStrip.3tiff.html

html/man/TIFFReadRawTile.3tiff.html

html/man/TIFFReadScanline.3tiff.html

html/man/TIFFReadTile.3tiff.html

html/man/TIFFSetDirectory.3tiff.html

html/man/TIFFSetField.3tiff.html

html/man/TIFFWarning.3tiff.html

html/man/TIFFWriteDirectory.3tiff.html

html/man/TIFFWriteEncodedStrip.3tiff.html

html/man/TIFFWriteEncodedTile.3tiff.html

html/man/TIFFWriteRawStrip.3tiff.html

html/man/TIFFWriteRawTile.3tiff.html

html/man/TIFFWriteScanline.3tiff.html

html/man/TIFFWriteTile.3tiff.html

html/man/TIFFbuffer.3tiff.html

html/man/TIFFcodec.3tiff.html

html/man/TIFFcolor.3tiff.html

html/man/TIFFmemory.3tiff.html

html/man/TIFFquery.3tiff.html

html/man/TIFFsize.3tiff.html

html/man/TIFFstrip.3tiff.html

html/man/TIFFswab.3tiff.html

html/man/TIFFtile.3tiff.html

html/man/fax2ps.1.html

html/man/fax2tiff.1.html

html/man/gif2tiff.1.html

html/man/index.html

html/man/libtiff.3tiff.html

html/man/pal2rgb.1.html

html/man/ppm2tiff.1.html

html/man/ras2tiff.1.html

html/man/raw2tiff.1.html

html/man/rgb2ycbcr.1.html

html/man/sgi2tiff.1.html

html/man/thumbnail.1.html

html/man/tiff2bw.1.html

html/man/tiff2pdf.1.html

html/man/tiff2ps.1.html

html/man/tiff2rgba.1.html

html/man/tiffcmp.1.html

html/man/tiffcp.1.html

html/man/tiffcrop.1.html

html/man/tiffdither.1.html

html/man/tiffdump.1.html

html/man/tiffgt.1.html

html/man/tiffinfo.1.html

html/man/tiffmedian.1.html

html/man/tiffset.1.html

html/man/tiffsplit.1.html

html/man/tiffsv.1.html

html/misc.html

html/support.html

html/tools.html

html/v3.4beta007.html

html/v3.4beta016.html

html/v3.4beta018.html

html/v3.4beta024.html

html/v3.4beta028.html

html/v3.4beta029.html

html/v3.4beta031.html

html/v3.4beta032.html

html/v3.4beta033.html

html/v3.4beta034.html

html/v3.4beta035.html

html/v3.4beta036.html

html/v3.5.1.html

html/v3.5.2.html

html/v3.5.3.html

html/v3.5.4.html

html/v3.5.5.html

html/v3.5.6-beta.html

html/v3.5.7.html

html/v3.6.0.html

html/v3.6.1.html

html/v3.7.0.html

html/v3.7.0alpha.html

html/v3.7.0beta.html

html/v3.7.0beta2.html

html/v3.7.1.html

html/v3.7.2.html

html/v3.7.3.html

html/v3.7.4.html

html/v3.8.0.html

html/v3.8.1.html

html/v3.8.2.html

html/v3.9.0beta.html

libtiff

libtiff/Makefile.am

libtiff/Makefile.in

libtiff/Makefile.vc

libtiff/SConstruct

libtiff/libtiff.def

libtiff/mkg3states.c

libtiff/t4.h

libtiff/tif_acorn.c

libtiff/tif_apple.c

libtiff/tif_atari.c

libtiff/tif_aux.c

libtiff/tif_close.c

libtiff/tif_codec.c

libtiff/tif_color.c

libtiff/tif_compress.c

libtiff/tif_config.h-vms

libtiff/tif_config.h.in

libtiff/tif_config.vc.h

libtiff/tif_config.wince.h

libtiff/tif_dir.c

libtiff/tif_dir.h

libtiff/tif_dirinfo.c

libtiff/tif_dirread.c

libtiff/tif_dirwrite.c

libtiff/tif_dumpmode.c

libtiff/tif_error.c

libtiff/tif_extension.c

libtiff/tif_fax3.c

libtiff/tif_fax3.h

libtiff/tif_fax3sm.c

libtiff/tif_flush.c

libtiff/tif_getimage.c

libtiff/tif_jbig.c

libtiff/tif_jpeg.c

libtiff/tif_luv.c

libtiff/tif_lzw.c

libtiff/tif_msdos.c

libtiff/tif_next.c

libtiff/tif_ojpeg.c

libtiff/tif_open.c

libtiff/tif_packbits.c

libtiff/tif_pixarlog.c

libtiff/tif_predict.c

libtiff/tif_predict.h

libtiff/tif_print.c

libtiff/tif_read.c

libtiff/tif_stream.cxx

libtiff/tif_strip.c

libtiff/tif_swab.c

libtiff/tif_thunder.c

libtiff/tif_tile.c

libtiff/tif_unix.c

libtiff/tif_version.c

libtiff/tif_warning.c

libtiff/tif_win3.c

libtiff/tif_win32.c

libtiff/tif_write.c

libtiff/tif_zip.c

libtiff/tiff.h

libtiff/tiffconf.h.in

libtiff/tiffconf.vc.h

libtiff/tiffconf.wince.h

libtiff/tiffio.h

libtiff/tiffio.hxx

libtiff/tiffiop.h

libtiff/tiffvers.h

libtiff/uvcode.h

m4/acinclude.m4

m4/libtool.m4

m4/ltoptions.m4

m4/ltsugar.m4

m4/ltversion.m4

m4/lt~obsolete.m4

man/Makefile.am

man/Makefile.in

man/TIFFClose.3tiff

man/TIFFDataWidth.3tiff

man/TIFFError.3tiff

man/TIFFFlush.3tiff

man/TIFFGetField.3tiff

man/TIFFOpen.3tiff

man/TIFFPrintDirectory.3tiff

man/TIFFRGBAImage.3tiff

man/TIFFReadDirectory.3tiff

man/TIFFReadEncodedStrip.3tiff

man/TIFFReadEncodedTile.3tiff

man/TIFFReadRGBAImage.3tiff

man/TIFFReadRGBAStrip.3tiff

man/TIFFReadRGBATile.3tiff

man/TIFFReadRawStrip.3tiff

man/TIFFReadRawTile.3tiff

man/TIFFReadScanline.3tiff

man/TIFFReadTile.3tiff

man/TIFFSetDirectory.3tiff

man/TIFFSetField.3tiff

man/TIFFWarning.3tiff

man/TIFFWriteDirectory.3tiff

man/TIFFWriteEncodedStrip.3tiff

man/TIFFWriteEncodedTile.3tiff

man/TIFFWriteRawStrip.3tiff

man/TIFFWriteRawTile.3tiff

man/TIFFWriteScanline.3tiff

man/TIFFWriteTile.3tiff

man/TIFFbuffer.3tiff

man/TIFFcodec.3tiff

man/TIFFcolor.3tiff

man/TIFFmemory.3tiff

man/TIFFquery.3tiff

man/TIFFsize.3tiff

man/TIFFstrip.3tiff

man/TIFFswab.3tiff

man/TIFFtile.3tiff

man/bmp2tiff.1

man/fax2ps.1

man/fax2tiff.1

man/gif2tiff.1

man/libtiff.3tiff

man/pal2rgb.1

man/ppm2tiff.1

man/ras2tiff.1

man/raw2tiff.1

man/rgb2ycbcr.1

man/sgi2tiff.1

man/thumbnail.1

man/tiff2bw.1

man/tiff2pdf.1

man/tiff2ps.1

man/tiff2rgba.1

man/tiffcmp.1

man/tiffcp.1

man/tiffcrop.1

man/tiffdither.1

man/tiffdump.1

man/tiffgt.1

man/tiffinfo.1

man/tiffmedian.1

man/tiffset.1

man/tiffsplit.1

man/tiffsv.1

nmake.opt

port

port/Makefile.am

port/Makefile.in

port/Makefile.vc

port/dummy.c

port/getopt.c

port/lfind.c

port/strcasecmp.c

port/strtoul.c

test

test/Makefile.am

test/Makefile.in

test/ascii_tag.c

test/check_tag.c

test/long_tag.c

test/short_tag.c

test/strip.c

test/strip_rw.c

test/test_arrays.c

test/test_arrays.h

tools

tools/Makefile.am

tools/Makefile.in

tools/Makefile.vc

tools/bmp2tiff.c

tools/fax2ps.c

tools/fax2tiff.c

tools/gif2tiff.c

tools/pal2rgb.c

tools/ppm2tiff.c

tools/ras2tiff.c

tools/rasterfile.h

tools/raw2tiff.c

tools/rgb2ycbcr.c

tools/sgi2tiff.c

tools/sgisv.c

tools/thumbnail.c

tools/tiff2bw.c

tools/tiff2pdf.c

tools/tiff2ps.c

tools/tiff2rgba.c

tools/tiffcmp.c

tools/tiffcp.c

tools/tiffcrop.c

tools/tiffdither.c

tools/tiffdump.c

tools/tiffgt.c

tools/tiffinfo.c

tools/tiffmedian.c

tools/tiffset.c

tools/tiffsplit.c

tools/ycbcr.c

files removed:
debian/patches/CVE-2006-3459-3465.patch

debian/patches/CVE-2008-2327.patch

debian/patches/CVE-2009-2285-lzw.patch

debian/patches/CVE-2009-2347.patch

debian/patches/tif_print.patch

debian/patches/tiff2pdf-octal-printf.patch

debian/patches/tiff2pdf.patch

debian/patches/tiffsplit-fname-overflow.patch

tiff-3.8.2.tar.gz

files modified:
debian/README.Debian

debian/README.source

debian/changelog

debian/control

debian/patches/man-errors.patch

debian/patches/series

debian/patches/soname.patch

debian/rules

html/TIFFTechNote2.html

<pre>

DRAFT TIFF Technical Note #2 17-Mar-95
============================

This Technical Note describes serious problems that have been found in
TIFF 6.0's design for embedding JPEG-compressed data in TIFF (Section 22
of the TIFF 6.0 spec of 3 June 1992). A replacement TIFF/JPEG
specification is given. Some corrections to Section 21 are also given.

To permit TIFF implementations to continue to read existing files, the 6.0
JPEG fields and tag values will remain reserved indefinitely. However,
TIFF writers are strongly discouraged from using the 6.0 JPEG design. It
is expected that the next full release of the TIFF specification will not
describe the old design at all, except to note that certain tag numbers
are reserved. The existing Section 22 will be replaced by the
specification text given in the second part of this Tech Note.

Problems in TIFF 6.0 JPEG
=========================

Abandoning a published spec is not a step to be taken lightly. This
section summarizes the reasons that have forced this decision.
TIFF 6.0's JPEG design suffers from design errors and limitations,
ambiguities, and unnecessary complexity.

Design errors and limitations
-----------------------------

The fundamental design error in the existing Section 22 is that JPEG's
various tables and parameters are broken out as separate fields which the
TIFF control logic must manage. This is bad software engineering: that
information should be treated as private to the JPEG codec
(compressor/decompressor). Worse, the fields themselves are specified
without sufficient thought for future extension and without regard to
well-established TIFF conventions. Here are some of the significant
problems:

* The JPEGxxTables fields do not store the table data directly in the
IFD/field structure; rather, the fields hold pointers to information
elsewhere in the file. This requires special-purpose code to be added to
*every* TIFF-manipulating application, whether it needs to decode JPEG
image data or not. Even a trivial TIFF editor, for example a program to
add an ImageDescription field to a TIFF file, must be explicitly aware of
the internal structure of the JPEG-related tables, or else it will probably
break the file. Every other auxiliary field in the TIFF spec contains
data, not pointers, and can be copied or relocated by standard code that
doesn't know anything about the particular field. This is a crucial
property of the TIFF format that must not be given up.

* To manipulate these fields, the TIFF control logic is required to know a
great deal about JPEG details, for example such arcana as how to compute
the length of a Huffman code table --- the length is not supplied in the
field structure and can only be found by inspecting the table contents.
This is again a violation of good software practice. Moreover, it will
prevent easy adoption of future JPEG extensions that might change these
low-level details.

* The design neglects the fact that baseline JPEG codecs support only two
sets of Huffman tables: it specifies a separate table for each color
component. This implies that encoders must waste space (by storing
duplicate Huffman tables) or else violate the well-founded TIFF convention
that prohibits duplicate pointers. Furthermore, baseline decoders must
test to find out which tables are identical, a waste of time and code
space.

* The JPEGInterchangeFormat field also violates TIFF's proscription against
duplicate pointers: the normal strip/tile pointers are expected to point
into the larger data area pointed to by JPEGInterchangeFormat. All TIFF
editing applications must be specifically aware of this relationship, since
they must maintain it or else delete the JPEGInterchangeFormat field. The
JPEGxxTables fields are also likely to point into the JPEGInterchangeFormat
area, creating additional pointer relationships that must be maintained.

* The JPEGQTables field is fixed at a byte per table entry; there is no
way to support 16-bit quantization values. This is a serious impediment
to extending TIFF to use 12-bit JPEG.

* The 6.0 design cannot support using different quantization tables in
different strips/tiles of an image (so as to encode some areas at higher
quality than others). Furthermore, since quantization tables are tied
one-for-one to color components, the design cannot support table switching
options that are likely to be added in future JPEG revisions.
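
To illustrate the Huffman-table point in the list above, here is a minimal
sketch of the kind of JPEG knowledge the old design pushes into generic TIFF
code. It assumes the table is stored as 16 code-length counts followed by the
code values, which is how JPEG lays out a Huffman table; the function name is
hypothetical and is not part of any TIFF library.

```python
def huffman_table_length(data: bytes, offset: int) -> int:
    """Return the byte length of a Huffman table starting at `offset`.

    Assumed layout: 16 bytes giving the number of codes of each length
    (1..16 bits), followed by one value byte per code. The length is not
    stored anywhere, so it has to be derived from the counts.
    """
    counts = data[offset:offset + 16]
    if len(counts) != 16:
        raise ValueError("truncated Huffman table")
    return 16 + sum(counts)
```

A TIFF editor that only wants to relocate such a table still has to run this
computation before it knows how many bytes to copy.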

Ambiguities
-----------

Several incompatible interpretations are possible for 6.0's treatment of
JPEG restart markers:

  * It is unclear whether restart markers must be omitted at TIFF segment
    (strip/tile) boundaries, or whether they are optional.

  * It is unclear whether the segment size is required to be chosen as
    a multiple of the specified restart interval (if any); perhaps the
    JPEG codec is supposed to be reset at each segment boundary as if
    there were a restart marker there, even if the boundary does not fall
    at a multiple of the nominal restart interval.

  * The spec fails to address the question of restart marker numbering:
    do the numbers begin again within each segment, or not?

That last point is particularly nasty. If we make numbering begin again
within each segment, we give up the ability to impose a TIFF strip/tile
structure on an existing JPEG datastream with restarts (which was clearly a
goal of Section 22's authors). But the other choice interferes with random
access to the image segments: a reader must compute the first restart
number to be expected within a segment, and must have a way to reset its
JPEG decoder to expect a nonzero restart number first. This may not even
be possible with some JPEG chips.
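
As a rough sketch of the computation a reader would face under the
"continuous numbering" interpretation: JPEG restart markers cycle through
RST0..RST7, so the first number expected in a segment depends on how many
restart intervals precede it. The function name and its parameters are
hypothetical, and the sketch assumes that every segment holds a whole number
of restart intervals and that a marker falling on a segment boundary still
consumes a number. The 6.0 spec settles neither point.

```python
def first_restart_number(segment_index: int, intervals_per_segment: int) -> int:
    """Index n of the first RSTn marker expected inside a segment.

    Assumes restart numbering runs continuously through the whole image
    rather than starting over at each strip or tile.
    """
    if segment_index < 0 or intervals_per_segment < 1:
        raise ValueError("invalid segment index or interval count")
    # One marker is consumed per preceding interval; numbers wrap modulo 8.
    return (segment_index * intervals_per_segment) % 8
```

Under the other interpretation the answer is always 0, which is exactly why
the two readings are incompatible.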

The tile height restriction found on page 104 contradicts Section 15's
general description of tiles. For an image that is not vertically
downsampled, page 104 specifies a tile height of one MCU or 8 pixels; but
Section 15 requires tiles to be a multiple of 16 pixels high.

This Tech Note does not attempt to resolve these ambiguities, so
implementations that follow the 6.0 design should be aware that
inter-application compatibility problems are likely to arise.

Unnecessary complexity
----------------------

The 6.0 design creates problems for implementations that need to keep the
JPEG codec separate from the TIFF control logic --- for example, consider
using a JPEG chip that was not designed specifically for TIFF. JPEG codecs
generally want to produce or consume a standard ISO JPEG datastream, not
just raw compressed data. (If they were to handle raw data, a separate
out-of-band mechanism would be needed to load tables into the codec.)
With such a codec, the TIFF control logic must parse JPEG markers emitted
by the codec to create the TIFF table fields (when writing) or synthesize
JPEG markers from the TIFF fields to feed the codec (when reading). This
means that the control logic must know a great deal more about JPEG details
than we would like. The parsing and reconstruction of the markers also
represents a fair amount of unnecessary work.

Quite a few implementors have proposed writing "TIFF/JPEG" files in which
a standard JPEG datastream is simply dumped into the file and pointed to
by JPEGInterchangeFormat. To avoid parsing the JPEG datastream, they
suggest not writing the JPEG auxiliary fields (JPEGxxTables etc) nor even
the basic TIFF strip/tile data pointers. This approach is incompatible
with implementations that handle the full TIFF 6.0 JPEG design, since they
will expect to find strip/tile pointers and auxiliary fields. Indeed this
is arguably not TIFF at all, since *all* TIFF-reading applications expect
to find strip or tile pointers. A subset implementation that is not
upward-compatible with the full spec is clearly unacceptable. However,
the frequency with which this idea has come up makes it clear that
implementors find the existing Section 22 too complex.

Overview of the solution
========================

To solve these problems, we adopt a new design for embedding
JPEG-compressed data in TIFF files. The new design uses only complete,
uninterpreted ISO JPEG datastreams, so it should be much more forgiving of
extensions to the ISO standard. It should also be far easier to implement
using unmodified JPEG codecs.

To reduce overhead in multi-segment TIFF files, we allow JPEG overhead
tables to be stored just once in a JPEGTables auxiliary field. This
feature does not violate the integrity of the JPEG datastreams, because it
uses the notions of "tables-only datastreams" and "abbreviated image
datastreams" as defined by the ISO standard.

To prevent confusion with the old design, the new design is given a new
Compression tag value, Compression=7. Readers that need to handle
existing 6.0 JPEG files may read both old and new files, using whatever
interpretation of the 6.0 spec they did before. Compression tag value 6
and the field tag numbers defined by 6.0 section 22 will remain reserved
indefinitely, even though detailed descriptions of them will be dropped
from future editions of the TIFF specification.
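
A reader that supports both designs might branch on the Compression value
roughly as follows. This is only a sketch: the constant and function names
are hypothetical, and the only facts taken from the text are the two tag
values, 6 for the old design and 7 for the new one.

```python
COMPRESSION_JPEG_OLD = 6  # TIFF 6.0 Section 22 design, reserved
COMPRESSION_JPEG_NEW = 7  # design described in this Tech Note


def classify_jpeg_compression(compression: int) -> str:
    """Tell the two JPEG-in-TIFF designs apart by Compression tag value."""
    if compression == COMPRESSION_JPEG_NEW:
        return "new-style JPEG"
    if compression == COMPRESSION_JPEG_OLD:
        return "old-style JPEG (6.0 Section 22)"
    return "not JPEG"
```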

Replacement TIFF/JPEG specification
===================================

[This section of the Tech Note is expected to replace Section 22 in the
next release of the TIFF specification.]

This section describes TIFF compression scheme 7, a high-performance
compression method for continuous-tone images.

Introduction
------------

This TIFF compression method uses the international standard for image
compression ISO/IEC 10918-1, usually known as "JPEG" (after the original
name of the standards committee, Joint Photographic Experts Group). JPEG
is a joint ISO/CCITT standard for compression of continuous-tone images.

The JPEG committee decided that because of the broad scope of the standard,
no one algorithmic procedure was able to satisfy the requirements of all
applications. Instead, the JPEG standard became a "toolkit" of multiple
algorithms and optional capabilities. Individual applications may select
a subset of the JPEG standard that meets their requirements.

The most important distinction among the JPEG processes is between lossy
and lossless compression. Lossy compression methods provide high
compression but allow only approximate reconstruction of the original
image. JPEG's lossy processes allow the encoder to trade off compressed
file size against reconstruction fidelity over a wide range. Typically,
10:1 or more compression of full-color data can be obtained while keeping
the reconstructed image visually indistinguishable from the original. Much
higher compression ratios are possible if a low-quality reconstructed image
is acceptable. Lossless compression provides exact reconstruction of the
source data, but the achievable compression ratio is much lower than for
the lossy processes; JPEG's rather simple lossless process typically
achieves around 2:1 compression of full-color data.

The most widely implemented JPEG subset is the "baseline" JPEG process.
This provides lossy compression of 8-bit-per-channel data. Optional
extensions include 12-bit-per-channel data, arithmetic entropy coding for
better compression, and progressive/hierarchical representations. The
lossless process is an independent algorithm that has little in
common with the lossy processes.

It should be noted that the optional arithmetic-coding extension is subject
to several US and Japanese patents. To avoid patent problems, use of
arithmetic coding processes in TIFF files intended for inter-application
interchange is discouraged.

All of the JPEG processes are useful only for "continuous tone" data,
in which the difference between adjacent pixel values is usually small.
Low-bit-depth source data is not appropriate for JPEG compression, nor
are palette-color images good candidates. The JPEG processes work well
on grayscale and full-color data.

Describing the JPEG compression algorithms in sufficient detail to permit
implementation would require more space than we have here. Instead, we
refer the reader to the References section.

What data is being compressed?
------------------------------

In lossy JPEG compression, it is customary to convert color source data
to YCbCr and then downsample it before JPEG compression. This gives
2:1 data compression with hardly any visible image degradation, and it
permits additional space savings within the JPEG compression step proper.
However, these steps are not considered part of the ISO JPEG standard.
The ISO standard is "color blind": it accepts data in any color space.
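
The 2:1 figure can be checked with a little arithmetic. The sketch below
assumes three-component data in which the two chroma channels are reduced by
integer factors (h, v) while luminance stays at full resolution; with h = v = 2
each pixel shrinks from 3 samples to 1 + 2/4 = 1.5 samples. The function names
are hypothetical.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer division rounding up."""
    return -(-a // b)


def downsampling_ratio(width: int, height: int, h: int, v: int) -> float:
    """Sample-count reduction from chroma downsampling alone.

    Counts samples before any JPEG coding: full-size Y plus two chroma
    planes reduced by h horizontally and v vertically.
    """
    full = 3 * width * height
    chroma = ceil_div(width, h) * ceil_div(height, v)
    return full / (width * height + 2 * chroma)
```

For a 16x16 image, (h, v) = (2, 2) gives 768 / 384 = 2.0, while (2, 1) gives
768 / 512 = 1.5.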

For TIFF purposes, the JPEG compression tag is considered to represent the
ISO JPEG compression standard only. The ISO standard is applied to the
same data that would be stored in the TIFF file if no compression were
used. Therefore, if color conversion or downsampling are used, they must
be reflected in the regular TIFF fields; these steps are not considered to
be implicit in the JPEG compression tag value. PhotometricInterpretation
and related fields shall describe the color space actually stored in the
file. With the TIFF 6.0 field definitions, downsampling is permissible
only for YCbCr data, and it must correspond to the YCbCrSubSampling field.
(Note that the default value for this field is not 1,1; so the default for
YCbCr is to apply downsampling!) It is likely that future versions of TIFF
will provide additional PhotometricInterpretation values and a more general
way of defining subsampling, so as to allow more flexibility in
JPEG-compressed files. But that issue is not addressed in this Tech Note.

Implementors should note that many popular JPEG codecs
(compressor/decompressors) provide automatic color conversion and
downsampling, so that the application may supply full-size RGB data which
is nonetheless converted to downsampled YCbCr. This is an implementation
convenience which does not excuse the TIFF control layer from its
responsibility to know what is really going on. The
PhotometricInterpretation and subsampling fields written to the file must
describe what is actually in the file.

A JPEG-compressed TIFF file will typically have PhotometricInterpretation =
YCbCr and YCbCrSubSampling = [2,1] or [2,2], unless the source data was
grayscale or CMYK.

Basic representation of JPEG-compressed images
----------------------------------------------

JPEG compression works in either strip-based or tile-based TIFF files.
Rather than repeating "strip or tile" constantly, we will use the term
"segment" to mean either a strip or a tile.

When the Compression field has the value 7, each image segment contains
a complete JPEG datastream which is valid according to the ISO JPEG
standard (ISO/IEC 10918-1). Any sequential JPEG process can be used,
including lossless JPEG, but progressive and hierarchical processes are not
supported. Since JPEG is useful only for continuous-tone images, the
PhotometricInterpretation of the image shall not be 3 (palette color) nor
4 (transparency mask). The bit depth of the data is also restricted as
specified below.

Each image segment in a JPEG-compressed TIFF file shall contain a valid
JPEG datastream according to the ISO JPEG standard's rules for
interchange-format or abbreviated-image-format data. The datastream shall
contain a single JPEG frame storing that segment of the image. The
required JPEG markers within a segment are:
        SOI     (must appear at very beginning of segment)
        SOFn
        SOS     (one for each scan, if there is more than one scan)
        EOI     (must appear at very end of segment)
The actual compressed data follows SOS; it may contain RSTn markers if DRI
is used.

Additional JPEG "tables and miscellaneous" markers may appear between SOI
and SOFn, between SOFn and SOS, and before each subsequent SOS if there is
more than one scan. These markers include:
        DQT
        DHT
        DAC     (not to appear unless arithmetic coding is used)
        DRI
        APPn    (shall be ignored by TIFF readers)
        COM     (shall be ignored by TIFF readers)
DNL markers shall not be used in TIFF files. Readers should abort if any
other marker type is found, especially the JPEG reserved markers;
occurrence of such a marker is likely to indicate a JPEG extension.

The tables/miscellaneous markers may appear in any order. Readers are
cautioned that although the SOFn marker refers to DQT tables, JPEG does not
require those tables to precede the SOFn, only the SOS. Missing-table
checks should be made when SOS is reached.
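
The marker rules above can be sketched as a small header walker. This is an
illustration under stated assumptions, not a conforming reader: the function
and table names are hypothetical, fill bytes before markers are not handled,
and only the part of the segment up to the first SOS is examined. The marker
codes are those assigned by ISO/IEC 10918-1.

```python
import struct

# Second byte of each marker (the first byte is always 0xFF).
TABLES_MISC = {0xDB: "DQT", 0xC4: "DHT", 0xCC: "DAC", 0xDD: "DRI", 0xFE: "COM"}
FRAME_HEADERS = {0xC0: "SOF0", 0xC1: "SOF1", 0xC3: "SOF3",
                 0xC9: "SOF9", 0xCB: "SOF11"}
SOS = 0xDA


def header_markers(segment: bytes) -> list:
    """List marker names from SOI up to and including the first SOS.

    Raises ValueError for a missing SOI or for any marker this Tech Note
    does not allow (DNL, reserved markers, progressive SOF types, ...).
    """
    if segment[:2] != b"\xff\xd8":
        raise ValueError("segment does not start with SOI")
    names = ["SOI"]
    pos = 2
    while True:
        if pos + 4 > len(segment) or segment[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        code = segment[pos + 1]
        if code == SOS:
            # A missing-table check would go here, once all tables are seen.
            names.append("SOS")
            return names
        if code in TABLES_MISC:
            names.append(TABLES_MISC[code])
        elif code in FRAME_HEADERS:
            names.append(FRAME_HEADERS[code])
        elif 0xE0 <= code <= 0xEF:
            names.append("APP%d" % (code - 0xE0))
        else:
            raise ValueError("unsupported marker 0xFF%02X" % code)
        # The 2-byte length is MSB-first and counts itself but not the marker.
        (length,) = struct.unpack(">H", segment[pos + 2:pos + 4])
        pos += 2 + length
```

Because the walker accepts tables in any position before SOS, a stream whose
DQT follows its SOFn is reported as valid, as the text requires.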

If no JPEGTables field is used, then each image segment shall be a complete
JPEG interchange datastream. Each segment must define all the tables it
references. To allow readers to decode segments in any order, no segment
may rely on tables being carried over from a previous segment.

When a JPEGTables field is used, image segments may omit tables that have
been specified in the JPEGTables field. Further details appear below.

The SOFn marker shall be of type SOF0 for strict baseline JPEG data, of
type SOF1 for non-baseline lossy JPEG data, or of type SOF3 for lossless
JPEG data. (SOF9 or SOF11 would be used for arithmetic coding.) All
segments of a JPEG-compressed TIFF image shall use the same JPEG
compression process, in particular the same SOFn type.

The data precision field of the SOFn marker shall agree with the TIFF
BitsPerSample field. (Note that when PlanarConfiguration=1, this implies
that all components must have the same BitsPerSample value; when
PlanarConfiguration=2, different components could have different bit
depths.) For SOF0 only precision 8 is permitted; for SOF1, precision 8 or
12 is permitted; for SOF3, precisions 2 to 16 are permitted.
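
The precision rule can be written down directly. The sketch below covers only
the three SOFn types for which the text states a rule; the function name is
hypothetical, and SOF9/SOF11 are deliberately rejected rather than guessed at.

```python
def precision_allowed(sof_type: int, precision: int) -> bool:
    """Check a SOFn data precision against the rule stated above.

    sof_type is the n in SOFn: 0 (baseline), 1 (non-baseline lossy)
    or 3 (lossless).
    """
    if sof_type == 0:
        return precision == 8
    if sof_type == 1:
        return precision in (8, 12)
    if sof_type == 3:
        return 2 <= precision <= 16
    raise ValueError("no precision rule quoted for SOF%d" % sof_type)
```

A reader would also compare the precision with BitsPerSample, which this
sketch leaves to the caller.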

The image dimensions given in the SOFn marker shall agree with the logical
dimensions of that particular strip or tile. For strip images, the SOFn
image width shall equal ImageWidth and the height shall equal RowsPerStrip,
except in the last strip; its SOFn height shall equal the number of rows
remaining in the ImageLength. (In other words, no padding data is counted
in the SOFn dimensions.) For tile images, each SOFn shall have width
TileWidth and height TileHeight; adding and removing any padding needed in
the edge tiles is the concern of some higher level of the TIFF software.
(The dimensional rules are slightly different when PlanarConfiguration=2,
as described below.)
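
For strip images the expected SOFn dimensions follow mechanically from the
TIFF fields. The sketch below assumes PlanarConfiguration=1 and uses a
hypothetical function name; RowsPerStrip is clamped to ImageLength so that a
very large value behaves as a single strip.

```python
def strip_sof_dimensions(image_width: int, image_length: int,
                         rows_per_strip: int, strip_index: int) -> tuple:
    """Return the (width, height) a strip's SOFn marker should carry."""
    rows_per_strip = min(rows_per_strip, image_length)
    strip_count = -(-image_length // rows_per_strip)  # round up
    if not 0 <= strip_index < strip_count:
        raise IndexError("strip index out of range")
    remaining = image_length - strip_index * rows_per_strip
    # The last strip carries only the rows that remain; no padding is counted.
    return image_width, min(rows_per_strip, remaining)
```

With ImageLength = 100 and RowsPerStrip = 16 there are seven strips, and the
last one has a SOFn height of 4.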

The ISO JPEG standard only permits images up to 65535 pixels in width or
height, due to 2-byte fields in the SOFn markers. In TIFF, this limits
the size of an individual JPEG-compressed strip or tile, but the total
image size can be greater.

The number of components in the JPEG datastream shall equal SamplesPerPixel
for PlanarConfiguration=1, and shall be 1 for PlanarConfiguration=2. The
components shall be stored in the same order as they are described at the
TIFF field level. (This applies both to their order in the SOFn marker,
and to the order in which they are scanned if multiple JPEG scans are
used.) The component ID bytes are arbitrary so long as each component
within an image segment is given a distinct ID. To avoid any possible
confusion, we require that all segments of a TIFF image use the same ID
code for a given component.
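
The component-count rule reduces to a two-way branch on PlanarConfiguration.
A minimal sketch, with a hypothetical function name:

```python
def jpeg_component_count(samples_per_pixel: int,
                         planar_configuration: int) -> int:
    """Number of components each segment's JPEG datastream should declare."""
    if planar_configuration == 1:   # samples interleaved within a segment
        return samples_per_pixel
    if planar_configuration == 2:   # one plane per segment
        return 1
    raise ValueError("PlanarConfiguration must be 1 or 2")
```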

In PlanarConfiguration 1, the sampling factors given in SOFn markers shall
agree with the sampling factors defined by the related TIFF fields (or with
the default values that are specified in the absence of those fields).

When DCT-based JPEG is used in a strip TIFF file, RowsPerStrip is required
to be a multiple of 8 times the largest vertical sampling factor, i.e., a
multiple of the height of an interleaved MCU. (For simplicity of
specification, we require this even if the data is not actually
interleaved.) For example, if YCbCrSubSampling = [2,2] then RowsPerStrip
must be a multiple of 16. An exception to this rule is made for
single-strip images (RowsPerStrip >= ImageLength): the exact value of
RowsPerStrip is unimportant in that case. This rule ensures that no data
padding is needed at the bottom of a strip, except perhaps the last strip.
Any padding required at the right edge of the image, or at the bottom of
the last strip, is expected to occur internally to the JPEG codec.

When DCT-based JPEG is used in a tiled TIFF file, TileLength is required
to be a multiple of 8 times the largest vertical sampling factor, i.e.,
a multiple of the height of an interleaved MCU; and TileWidth is required
to be a multiple of 8 times the largest horizontal sampling factor, i.e.,
a multiple of the width of an interleaved MCU. (For simplicity of
specification, we require this even if the data is not actually
interleaved.) All edge padding required will therefore occur in the course
of normal TIFF tile padding; it is not special to JPEG.
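
The strip and tile size rules for DCT-based JPEG can be checked with a few
lines of arithmetic. The sketch below is illustrative only; the function
names are hypothetical, and the sampling factors are assumed to be the
largest JPEG sampling factors used by any component.

```python
DCT_BLOCK_SIZE = 8  # DCT-based JPEG works on 8x8 sample blocks


def rows_per_strip_ok(rows_per_strip, image_length, max_v_sampling):
    """True if RowsPerStrip is acceptable for DCT-based JPEG strips."""
    if rows_per_strip >= image_length:
        # Single-strip image: the exact value is unimportant.
        return True
    return rows_per_strip % (DCT_BLOCK_SIZE * max_v_sampling) == 0


def tile_size_ok(tile_width, tile_length, max_h_sampling, max_v_sampling):
    """True if the tile is a whole number of interleaved MCUs each way."""
    return (tile_width % (DCT_BLOCK_SIZE * max_h_sampling) == 0
            and tile_length % (DCT_BLOCK_SIZE * max_v_sampling) == 0)
```

With YCbCrSubSampling = [2,2] both sampling factors are 2, so strips and
tiles must be multiples of 16 samples in each constrained dimension.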

Lossless JPEG does not impose these constraints on strip and tile sizes,
since it is not DCT-based.

Note that within JPEG datastreams, multibyte values appear in the MSB-first
order specified by the JPEG standard, regardless of the byte ordering of
the surrounding TIFF file.
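
A small sketch of what this means for a reader: the 2-byte length that
follows a JPEG marker is always decoded MSB-first, even inside a
little-endian ("II") TIFF file. The helper name is hypothetical.

```python
import struct


def read_segment_length(data, offset):
    """Decode the 2-byte MSB-first length field found at `offset`.

    The ">H" format means big-endian unsigned 16-bit, which is what the
    JPEG standard specifies regardless of the TIFF file's byte order.
    """
    (length,) = struct.unpack_from(">H", data, offset)
    return length
```

For the bytes FF DB 00 43 (a DQT marker followed by its length), calling
read_segment_length(data, 2) yields 67, not 17152.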

JPEGTables field
----------------

The only auxiliary TIFF field added for Compression=7 is the optional
JPEGTables field. The purpose of JPEGTables is to predefine JPEG
quantization and/or Huffman tables for subsequent use by JPEG image
segments. When this is done, these rather bulky tables need not be
duplicated in each segment, thus saving space and processing time.
JPEGTables may be used even in a single-segment file, although there is no
space savings in that case.

JPEGTables:
    Tag  = 347 (15B.H)
    Type = UNDEFINED
    N    = number of bytes in tables datastream, typically a few hundred
JPEGTables provides default JPEG quantization and/or Huffman tables which
are used whenever a segment datastream does not contain its own tables, as
specified below.

Notice that the JPEGTables field is required to have type code UNDEFINED,
not type code BYTE. This is to cue readers that expanding individual bytes
to short or long integers is not appropriate. A TIFF reader will generally
need to store the field value as an uninterpreted byte sequence until it is
fed to the JPEG decoder.

Multibyte quantities within the tables follow the ISO JPEG convention of
MSB-first storage, regardless of the byte ordering of the surrounding TIFF
file.

When the JPEGTables field is present, it shall contain a valid JPEG
"abbreviated table specification" datastream. This datastream shall begin
with SOI and end with EOI. It may contain zero or more JPEG "tables and
miscellaneous" markers, namely:
    DQT
    DHT
    DAC (not to appear unless arithmetic coding is used)
    DRI
    APPn (shall be ignored by TIFF readers)
    COM (shall be ignored by TIFF readers)
Since JPEG defines the SOI marker to reset the DAC and DRI state, these two
markers' values cannot be carried over into any image datastream, and thus
they are effectively no-ops in the JPEGTables field. To avoid confusion,
it is recommended that writers not place DAC or DRI markers in JPEGTables.
However, readers must properly skip over them if they appear.
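
The structural rules above can be sketched as a walk over the JPEGTables
byte string. This is a minimal illustration, not a decoder: it checks only
the SOI/EOI framing and the marker codes, assumes no 0xFF fill bytes
between segments, and does not validate the table payloads. The function
name is hypothetical; the marker code values are those assigned by the
JPEG standard.

```python
import struct

# Marker codes (the byte following 0xFF) permitted in JPEGTables.
ALLOWED_MARKERS = {0xDB: "DQT", 0xC4: "DHT", 0xCC: "DAC",
                   0xDD: "DRI", 0xFE: "COM"}


def list_jpegtables_markers(blob):
    """Return the names of the marker segments found between SOI and EOI."""
    if len(blob) < 4 or blob[:2] != b"\xff\xd8":
        raise ValueError("JPEGTables must begin with SOI")
    if blob[-2:] != b"\xff\xd9":
        raise ValueError("JPEGTables must end with EOI")
    names = []
    pos, end = 2, len(blob) - 2
    while pos < end:
        # Each segment is 0xFF, a marker code, then a 2-byte length.
        if pos + 4 > end or blob[pos] != 0xFF:
            raise ValueError("expected a marker segment at offset %d" % pos)
        code = blob[pos + 1]
        if code in ALLOWED_MARKERS:
            name = ALLOWED_MARKERS[code]
        elif 0xE0 <= code <= 0xEF:
            name = "APP%d" % (code - 0xE0)
        else:
            raise ValueError("marker 0x%02X not permitted here" % code)
        # The MSB-first length counts its own two bytes.
        (length,) = struct.unpack_from(">H", blob, pos + 2)
        if length < 2 or pos + 2 + length > end:
            raise ValueError("bad segment length at offset %d" % pos)
        names.append(name)
        pos += 2 + length
    return names
```

A reader would typically act on the DQT and DHT entries only, skipping the
DAC, DRI, APPn, and COM segments as described above.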

When JPEGTables is present, readers shall load the table specifications
contained in JPEGTables before processing image segment datastreams.
Image segments may simply refer to these preloaded tables without defining
them. An image segment can still define and use its own tables, subject to
the restrictions below.

An image segment may not redefine any table defined in JPEGTables. (This
restriction is imposed to allow readers to process image segments in random
order without having to reload JPEGTables between segments.) Therefore, use
of JPEGTables divides the available table slots into two groups: "global"
slots are defined in JPEGTables and may be used but not redefined by
segments; "local" slots are available for local definition and use in each
segment. To permit random access, a segment may not reference any local
tables that it does not itself define.
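
The global/local slot rule can be expressed with set operations. In this
sketch a table slot is modelled as a tuple such as ("DQT", 0) or
("DHT-AC", 1); that representation and the function name are assumptions
made for illustration only.

```python
def check_segment_tables(global_slots, defined_in_segment,
                         referenced_by_segment):
    """Return a list of rule violations for one image segment.

    global_slots:          slots defined in the JPEGTables field
    defined_in_segment:    slots the segment defines itself (local slots)
    referenced_by_segment: slots the segment's scans actually use
    """
    global_slots = set(global_slots)
    defined = set(defined_in_segment)
    referenced = set(referenced_by_segment)
    problems = []
    # A segment may use a global slot but must not redefine it.
    for slot in sorted(defined & global_slots):
        problems.append("redefines global slot %r" % (slot,))
    # Every referenced slot must be global or defined in this segment.
    for slot in sorted(referenced - global_slots - defined):
        problems.append("references undefined slot %r" % (slot,))
    return problems
```

An empty result means the segment can be decoded in any order relative to
the other segments, since it depends only on JPEGTables and on itself.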

Special considerations for PlanarConfiguration 2
------------------------------------------------

In PlanarConfiguration 2, each image segment contains data for only one
color component. To avoid confusing the JPEG codec, we wish the segments
to look like valid single-channel (i.e., grayscale) JPEG datastreams. This
means that different rules must be used for the SOFn parameters.

In PlanarConfiguration 2, the dimensions given in the SOFn of a subsampled
component shall be scaled down by the sampling factors compared to the SOFn
dimensions that would be used in PlanarConfiguration 1. This is necessary
to match the actual number of samples stored in that segment, so that the
JPEG codec doesn't complain about too much or too little data. In strip
TIFF files the computed dimensions may need to be rounded up to the next
integer; in tiled files, the restrictions on tile size make this case
impossible.
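
The scaling described above is a ceiling division. The sketch below takes
the SOFn dimensions that PlanarConfiguration 1 would use for the same
strip or tile and returns those for one component in PlanarConfiguration
2. The function name is hypothetical; for the luminance component both
subsampling values are 1.

```python
def planar2_sof_dimensions(width, height, h_subsampling, v_subsampling):
    """Scale SOFn dimensions down by the subsampling, rounding up."""
    # -(-a // b) is integer ceiling division for positive a and b.
    return (-(-width // h_subsampling), -(-height // v_subsampling))
```

For example, a 101x16 strip with 2:1 subsampling in both directions gives
a 51x8 chroma segment.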

Furthermore, all SOFn sampling factors shall be given as 1. (This is
merely to avoid confusion, since the sampling factors in a single-channel
JPEG datastream have no real effect.)

Any downsampling will need to happen externally to the JPEG codec, since
JPEG sampling factors are defined with reference to the full-precision
component. In PlanarConfiguration 2, the JPEG codec will be working on
only one component at a time and thus will have no reference component to
downsample against.

Minimum requirements for TIFF/JPEG
----------------------------------

ISO JPEG is a large and complex standard; most implementations support only
a subset of it. Here we define a "core" subset of TIFF/JPEG which readers
must support to claim TIFF/JPEG compatibility. For maximum
cross-application compatibility, we recommend that writers confine
themselves to this subset unless there is very good reason to do otherwise.

Use the ISO baseline JPEG process: 8-bit data precision, Huffman coding,
with no more than 2 DC and 2 AC Huffman tables. Note that this implies
BitsPerSample = 8 for each component. We recommend deviating from baseline
JPEG only if 12-bit data precision or lossless coding is required.

Use no subsampling (all JPEG sampling factors = 1) for color spaces other
than YCbCr. (This is, in fact, required with the TIFF 6.0 field
definitions, but may not be so in future revisions.) For YCbCr, use one of
the following choices:
    YCbCrSubSampling field        JPEG sampling factors
        1,1                           1h1v, 1h1v, 1h1v
        2,1                           2h1v, 1h1v, 1h1v
        2,2  (default value)          2h2v, 1h1v, 1h1v
We recommend that RGB source data be converted to YCbCr for best compression
results. Other source data colorspaces should probably be left alone.
Minimal readers need not support JPEG images with colorspaces other than
YCbCr and grayscale (PhotometricInterpretation = 6 or 1).
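
The table of recommended YCbCr choices above can be written as a lookup
from the YCbCrSubSampling field value to the (horizontal, vertical) JPEG
sampling factors of the Y, Cb, and Cr components. The names below are
hypothetical; the sketch covers only the three recommended combinations
and rejects anything else.

```python
# YCbCrSubSampling value -> JPEG (h, v) sampling factors for Y, Cb, Cr.
RECOMMENDED_SAMPLING = {
    (1, 1): ((1, 1), (1, 1), (1, 1)),
    (2, 1): ((2, 1), (1, 1), (1, 1)),
    (2, 2): ((2, 2), (1, 1), (1, 1)),
}


def jpeg_sampling_factors(ycbcr_subsampling=(2, 2)):
    """Return the Y, Cb, Cr sampling factors for a recommended setting.

    The default argument mirrors the TIFF default of [2,2].
    """
    key = tuple(ycbcr_subsampling)
    if key not in RECOMMENDED_SAMPLING:
        raise ValueError("not in the recommended subset: %r" % (key,))
    return RECOMMENDED_SAMPLING[key]
```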

A minimal reader also need not support JPEG YCbCr images with nondefault
values of YCbCrCoefficients or YCbCrPositioning, nor with values of
ReferenceBlackWhite other than [0,255,128,255,128,255]. (These values
correspond to the RGB<=>YCbCr conversion specified by JFIF, which is widely
implemented in JPEG codecs.)

Writers are reminded that a ReferenceBlackWhite field *must* be included
when PhotometricInterpretation is YCbCr, because the default
ReferenceBlackWhite values are inappropriate for YCbCr.

If any subsampling is used, PlanarConfiguration=1 is preferred to avoid the
possibly-confusing requirements of PlanarConfiguration=2. In any case,
readers are not required to support PlanarConfiguration=2.

If possible, use a single interleaved scan in each image segment. This is
not legal JPEG if there are more than 4 SamplesPerPixel or if the sampling
factors are such that more than 10 blocks would be needed per MCU; in that
case, use a separate scan for each component. (The recommended color
spaces and sampling factors will not run into that restriction, so a
minimal reader need not support more than one scan per segment.)
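
The two limits on a single interleaved scan can be checked directly: each
component contributes h*v blocks to an MCU. This is a sketch with a
hypothetical function name; the limits of 4 components and 10 blocks are
the ones quoted above.

```python
def single_interleaved_scan_allowed(sampling_factors):
    """True if all components fit in one interleaved JPEG scan.

    sampling_factors is a sequence of (h, v) pairs, one per component.
    """
    if len(sampling_factors) > 4:
        return False
    blocks_per_mcu = sum(h * v for h, v in sampling_factors)
    return blocks_per_mcu <= 10
```

The default YCbCr layout, 2h2v for Y and 1h1v for Cb and Cr, needs
4 + 1 + 1 = 6 blocks per MCU and so stays within the limit.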

To claim TIFF/JPEG compatibility, readers shall support multiple-strip TIFF
files and the optional JPEGTables field; it is not acceptable to read only
single-datastream files. Support for tiled TIFF files is strongly
recommended but not required.

Other recommendations for implementors
--------------------------------------

The TIFF tag Compression=7 guarantees only that the compressed data is
represented as ISO JPEG datastreams. Since JPEG is a large and evolving
standard, readers should apply careful error checking to the JPEG markers
to ensure that the compression process is within their capabilities. In
particular, to avoid being confused by future extensions to the JPEG
standard, it is important to abort if unknown marker codes are seen.

The point of requiring that all image segments use the same JPEG process is
to ensure that a reader need check only one segment to determine whether it
can handle the image. For example, consider a TIFF reader that has access
to fast but restricted JPEG hardware, as well as a slower, more general
software implementation. It is desirable to check only one image segment
to find out whether the fast hardware can be used. Thus, writers should
try to ensure that all segments of an image look as much "alike" as
possible: there should be no variation in scan layout, use of options such
as DRI, etc. Ideally, segments will be processed identically except
perhaps for using different local quantization or entropy-coding tables.

Writers should avoid including "noise" JPEG markers (COM and APPn markers).
Standard TIFF fields provide a better way to transport any non-image data.
Some JPEG codecs may change behavior if they see an APPn marker they
think they understand; since the TIFF spec requires these markers to be
ignored, this behavior is undesirable.

It is possible to convert an interchange-JPEG file (e.g., a JFIF file) to
TIFF simply by dropping the interchange datastream into a single strip.
(However, designers are reminded that the TIFF spec discourages huge
strips; splitting the image is somewhat more work but may give better
results.) Conversion from TIFF to interchange JPEG is more complex. A
strip-based TIFF/JPEG file can be converted fairly easily if all strips use
identical JPEG tables and no RSTn markers: just delete the overhead markers
and insert RSTn markers between strips. Converting tiled images is harder,
since the data will usually not be in the right order (unless the tiles are
only one MCU high). This can still be done losslessly, but it will require
undoing and redoing the entropy coding so that the DC coefficient
differences can be updated.

There is no default value for JPEGTables: standard TIFF files must define all
tables that they reference. For some closed systems in which many files will
have identical tables, it might make sense to define a default JPEGTables
value to avoid actually storing the tables. Or even better, invent a
private field selecting one of N default JPEGTables settings, so as to allow
for future expansion. Either of these must be regarded as a private
extension that will render the files unreadable by other applications.

References
----------

[1] Wallace, Gregory K. "The JPEG Still Picture Compression Standard",
    Communications of the ACM, April 1991 (vol. 34 no. 4), pp. 30-44.

This is the best short technical introduction to the JPEG algorithms.
It is a good overview but does not provide sufficiently detailed
information to write an implementation.

[2] Pennebaker, William B. and Mitchell, Joan L. "JPEG Still Image Data
    Compression Standard", Van Nostrand Reinhold, 1993, ISBN 0-442-01272-1.
    638pp.

This textbook is by far the most complete exposition of JPEG in existence.
It includes the full text of the ISO JPEG standards (DIS 10918-1 and draft
DIS 10918-2). No would-be JPEG implementor should be without it.

[3] ISO/IEC IS 10918-1, "Digital Compression and Coding of Continuous-tone
    Still Images, Part 1: Requirements and guidelines", February 1994.
    ISO/IEC DIS 10918-2, "Digital Compression and Coding of Continuous-tone
    Still Images, Part 2: Compliance testing", final approval expected 1994.

These are the official standards documents. Note that the Pennebaker and
Mitchell textbook is likely to be cheaper and more useful than the official
standards.

Changes to Section 21: YCbCr Images
===================================

[This section of the Tech Note clarifies section 21 to make clear the
interpretation of image dimensions in a subsampled image. Furthermore,
the section is changed to allow the original image dimensions not to be
multiples of the sampling factors. This change is necessary to support use
of JPEG compression on odd-size images.]

Add the following paragraphs to the Section 21 introduction (p. 89),
just after the paragraph beginning "When a Class Y image is subsampled":

    In a subsampled image, it is understood that all TIFF image
    dimensions are measured in terms of the highest-resolution
    (luminance) component. In particular, ImageWidth, ImageLength,
    RowsPerStrip, TileWidth, TileLength, XResolution, and YResolution
    are measured in luminance samples.

    RowsPerStrip, TileWidth, and TileLength are constrained so that
    there are an integral number of samples of each component in a
    complete strip or tile. However, ImageWidth/ImageLength are not
    constrained. If an odd-size image is to be converted to subsampled
    format, the writer should pad the source data to a multiple of the
    sampling factors by replication of the last column and/or row, then
    downsample. The number of luminance samples actually stored in the
    file will be a multiple of the sampling factors. Conversely,
    readers must ignore any extra data (outside the specified image
    dimensions) after upsampling.

    When PlanarConfiguration=2, each strip or tile covers the same
    image area despite subsampling; that is, the total number of strips
    or tiles in the image is the same for each component. Therefore
    strips or tiles of the subsampled components contain fewer samples
    than strips or tiles of the luminance component.

    If there are extra samples per pixel (see field ExtraSamples),
    these data channels have the same number of samples as the
    luminance component.
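
The padding step that a writer performs before downsampling an odd-size
image might look like the following sketch. It treats one component as a
list of rows and replicates the last column and row until both dimensions
are multiples of the sampling factors. The function name and the
list-of-lists representation are assumptions for illustration.

```python
def pad_by_replication(rows, h_subsampling, v_subsampling):
    """Pad a 2-D sample array to multiples of the sampling factors.

    rows is a non-empty list of equal-length, non-empty lists of samples.
    """
    width = len(rows[0])
    # Number of columns needed to reach the next multiple (0 if exact).
    pad_columns = -width % h_subsampling
    padded = [list(row) + [row[-1]] * pad_columns for row in rows]
    # Likewise for rows, replicating the (already widened) last row.
    pad_rows = -len(padded) % v_subsampling
    padded.extend(list(padded[-1]) for _ in range(pad_rows))
    return padded
```

A reader does the reverse after upsampling: it keeps only the first
ImageWidth columns and ImageLength rows and discards the rest.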

Rewrite the YCbCrSubSampling field description (pp. 91-92) as follows
(largely to eliminate possibly-misleading references to
ImageWidth/ImageLength of the subsampled components):

    (first paragraph unchanged)

    The two elements of this field are defined as follows:

    Short 0: ChromaSubsampleHoriz:

        1 = there are equal numbers of luma and chroma samples horizontally.

        2 = there are twice as many luma samples as chroma samples
        horizontally.

        4 = there are four times as many luma samples as chroma samples
        horizontally.

    Short 1: ChromaSubsampleVert:

        1 = there are equal numbers of luma and chroma samples vertically.

        2 = there are twice as many luma samples as chroma samples
        vertically.

        4 = there are four times as many luma samples as chroma samples
        vertically.

    ChromaSubsampleVert shall always be less than or equal to
    ChromaSubsampleHoriz. Note that Cb and Cr have the same sampling
    ratios.

    In a strip TIFF file, RowsPerStrip is required to be an integer
    multiple of ChromaSubsampleVert (unless RowsPerStrip >=
    ImageLength, in which case its exact value is unimportant).
    If ImageWidth and ImageLength are not multiples of
    ChromaSubsampleHoriz and ChromaSubsampleVert respectively, then the
    source data shall be padded to the next integer multiple of these
    values before downsampling.

    In a tiled TIFF file, TileWidth must be an integer multiple of
    ChromaSubsampleHoriz and TileLength must be an integer multiple of
    ChromaSubsampleVert. Padding will occur to tile boundaries.

    The default values of this field are [ 2,2 ]. Thus, YCbCr data is
    downsampled by default!
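
The constraints in the rewritten field description can be collected into
one check. This is a sketch only: the function name and keyword arguments
are hypothetical, and a caller passes whichever of the strip or tile
values apply to the file at hand.

```python
def check_ycbcr_subsampling(horiz, vert, rows_per_strip=None,
                            image_length=None, tile_width=None,
                            tile_length=None):
    """Return a list of violations of the YCbCrSubSampling rules."""
    if horiz not in (1, 2, 4) or vert not in (1, 2, 4):
        return ["sampling values must be 1, 2, or 4"]
    problems = []
    if vert > horiz:
        problems.append("ChromaSubsampleVert exceeds ChromaSubsampleHoriz")
    if rows_per_strip is not None and image_length is not None:
        # The exact value is unimportant for single-strip images.
        if rows_per_strip < image_length and rows_per_strip % vert != 0:
            problems.append("RowsPerStrip is not a multiple of "
                            "ChromaSubsampleVert")
    if tile_width is not None and tile_width % horiz != 0:
        problems.append("TileWidth is not a multiple of "
                        "ChromaSubsampleHoriz")
    if tile_length is not None and tile_length % vert != 0:
        problems.append("TileLength is not a multiple of "
                        "ChromaSubsampleVert")
    return problems
```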


</pre>
