Viewing changes to doc/jpeg2000.tex

Committer: José Juan Sánchez Hernández
Date: 2013-04-02 18:14:26 UTC
Revision ID: josejuan.sanchez@gmail.com-20130402181426-07xn3djblburck53

Version for Debian


Part 1 of the \href{http://www.jpeg.org/jpeg2000/}{JPEG2000} standard describes
a core compression system based on the dyadic
\href{http://en.wikipedia.org/wiki/Discrete_wavelet_transform}{DWT (Discrete Wavelet Transform)}
and on EBCOT (Embedded Block Coding with Optimal Truncation). Features of this
compression system include high compression ratios, error resilience, lossless
and lossy compression, random access to the compressed stream, resolution and
quality scalability, and support for multiple components.
These characteristics make it well suited to coding and retrieving
large remote images.

\subsection{Data partitions}

The JPEG2000 standard defines a wide variety of partitions
of the image data in order to exploit the offered scalability
as much as possible. All of these partitions allow the image,
or a part of it, to be manipulated efficiently.
Fig.~\ref{fig:partitions} shows a graphical example of the main partitions.

\begin{figure}[!b]
\begin{center}
\resizebox{0.95\textwidth}{!}{\input{../partition}}
\end{center}
\caption{Data partition defined by the JPEG2000 standard.}
\label{fig:partitions}
\end{figure}

To understand each partition defined in the JPEG2000 standard,
the concept of canvas must first be clarified. The canvas is a
two-dimensional drawing area onto which all the partitions are mapped
to form the image. Hereinafter, all coordinates are relative to
a canvas whose size, width ($I_{2}$) and height ($I_{1}$), corresponds to
the total size of the associated image. Each partition is
located and mapped over the canvas in a specific way.

An image is composed of one or more components. In most cases
images have only three components: red, green
and blue (RGB), each with a size equal to that of the canvas.

The JPEG2000 standard allows an image to be divided into smaller
rectangular regions called tiles. Each tile is compressed
independently of the rest, hence the compression
parameters can differ from tile to tile. There is always
at least one tile, which by default covers the whole image.
One possible application of tile partitioning is
images that contain different, visually separated elements,
such as text, graphics or photographic material. When this is not
the case, and the image is continuous and homogeneous, tiling
is not recommended because it produces artifacts at the borders of
the tiles, causing a mosaic effect. Moreover, a compressed image
is larger when tiles are used.

The DWT and all the quantization/coding stages
are applied independently to each tile-component. A tile-component
of a tile $t$ and a component $c$ is the two-dimensional
region bounded by $t$, restricted to the area occupied by
$c$. This means that if an
image has only one tile and three color components, there are
three tile-components, which are compressed independently.

For each tile-component, identified by the tile $t$ and the component
$c$, there are a total of $D_{t,c} + 1$
resolutions, where $D_{t,c}$ is the number
of DWT stages applied. The $r$-th resolution level of a compressed
tile-component is obtained by applying the inverse DWT $r$ times,
with $0 \leq r \leq D_{t,c}$.

After the DWT has been applied, each tile-component is
divided into code-blocks, which are coded independently.
In each resolution $r$ of each tile-component $(t,c)$, the
code-blocks are grouped into precincts. This partition is
defined by the height, $P_{1}^{t,c,r}$,
and the width, $P_{2}^{t,c,r}$,
of each precinct. The number of precincts in the vertical direction,
$N_{1}^{P,t,c,r}$, and in the horizontal direction, $N_{2}^{P,t,c,r}$,
is given by the following expression:

\begin{equation*}
N_{i}^{P,t,c,r} = \left\lceil \frac{I_{i}}{2^{D_{t,c}-r}P_{i}^{t,c,r}} \right\rceil
\end{equation*}
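
As an illustration, the expression above can be evaluated with a few lines of
Python. This is only a sketch: the function name \texttt{precinct\_grid} and the
sample image and precinct sizes are assumptions chosen for the example, and
integer arithmetic is used for the ceiling to avoid rounding problems.

```python
def precinct_grid(image_size, dwt_stages, r, precinct_size):
    """Return (N1, N2): precinct counts (vertical, horizontal) at resolution r.

    image_size    -- (I1, I2), canvas height and width
    dwt_stages    -- D, number of DWT stages of the tile-component
    precinct_size -- (P1, P2), precinct height and width at resolution r
    """
    if not 0 <= r <= dwt_stages:
        raise ValueError("r must satisfy 0 <= r <= D")
    scale = 2 ** (dwt_stages - r)
    # -(-a // b) is the integer form of ceil(a / b)
    return tuple(-(-i // (scale * p)) for i, p in zip(image_size, precinct_size))


# Hypothetical 4000x3000 image, 5 DWT stages, 128x128 precincts.
grid_r3 = precinct_grid((3000, 4000), dwt_stages=5, r=3, precinct_size=(128, 128))
grid_r5 = precinct_grid((3000, 4000), dwt_stages=5, r=5, precinct_size=(128, 128))
```

For a fixed precinct size, the grid becomes finer as $r$ grows, since each
precinct then covers a smaller area of the canvas.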

Code-blocks refer to the wavelet coefficients generated by the DWT,
and are therefore rectangular regions within the wavelet
domain. Precincts, in contrast, refer to rectangular regions within the
image domain. The spatial scalability offered by the standard is
achieved by means of the precincts.

The packet is the fundamental unit for the organization
of the compressed bit-stream of an image. The
compressed data of each code-block is divided into segments
called quality layers. All the code-blocks of all the precincts
of the same tile are divided into the same number of quality layers,
although the length of a quality layer can differ between code-blocks
(it can even be zero). For a certain layer $l$,
the set of all the layer-$l$ segments of all the code-blocks related to a
precinct forms a packet. Each precinct therefore contributes
as many packets to the bit-stream as there are quality layers.
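
The relation between code-blocks, quality layers and packets can be sketched
as follows. The example is deliberately simplified: a real packet also carries
a packet header, which is ignored here, and the name \texttt{build\_packets} and
the sample byte strings are made up for the illustration.

```python
def build_packets(code_blocks):
    """Group the layer segments of the code-blocks of one precinct into packets.

    code_blocks -- list with one entry per code-block; each entry is a list
                   holding one bytes segment per quality layer (may be empty).
    Returns one bytes object per quality layer (packet bodies only).
    """
    n_layers = len(code_blocks[0])
    if any(len(cb) != n_layers for cb in code_blocks):
        raise ValueError("all code-blocks must have the same number of layers")
    return [b"".join(cb[layer] for cb in code_blocks) for layer in range(n_layers)]


# Three code-blocks, two quality layers; the second code-block
# contributes nothing to layer 1 (zero-length segment).
packets = build_packets([[b"A0", b"A1"], [b"B0", b""], [b"C0", b"C1"]])
```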

In order to decode a certain region of an image it is necessary
to decode all the packets related to that region. In the server code,
the class \hyperlink{classjpip_1_1WOIComposer}{jpip::WOIComposer} determines,
for a given region of interest, hereinafter called WOI (Window
Of Interest), all the packets required to decode it.
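
A rough idea of how a WOI maps to precincts can be given with the precinct
footprint on the canvas, $2^{D_{t,c}-r}P_{i}^{t,c,r}$, taken from the
expression for the number of precincts. The sketch below is an assumption made
for illustration only: it considers a single tile, ignores the extra samples
needed by the wavelet filters, and is not a description of how
\texttt{jpip::WOIComposer} is implemented.

```python
def precincts_for_woi(woi, dwt_stages, r, precinct_size):
    """Return (rows, cols) ranges of precinct coordinates touched by a WOI.

    woi           -- (x, y, width, height) in canvas coordinates
    dwt_stages    -- D, number of DWT stages of the tile-component
    precinct_size -- (P1, P2), precinct height and width at resolution r
    """
    x, y, w, h = woi
    scale = 2 ** (dwt_stages - r)
    step_v = scale * precinct_size[0]   # precinct height seen on the canvas
    step_h = scale * precinct_size[1]   # precinct width seen on the canvas
    rows = range(y // step_v, (y + h - 1) // step_v + 1)
    cols = range(x // step_h, (x + w - 1) // step_h + 1)
    return rows, cols


# Hypothetical 512x512 window at (1000, 600), D = 5, r = 3, 128x128 precincts.
rows, cols = precincts_for_woi((1000, 600, 512, 512), 5, 3, (128, 128))
```

Every packet of every quality layer of the selected precincts, for each
component and each resolution of interest, would then be a candidate for
transmission.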

A packet $\zeta_{t,c,r,p,l}$
is identified by the tile $t$, the
component $c$, the resolution $r$, the precinct $p$ (in precinct
coordinates)
and the quality layer $l$. In the server code, the class
\hyperlink{classjpeg2000_1_1Packet}{jpeg2000::Packet} is used to
identify a packet.

\subsection{Code-stream organization}

Part 1 of the JPEG2000 standard defines a basic structure for
organizing the compressed image data into code-streams. A code-stream
includes all the packets generated by the compression of an image
plus a set of markers, which signal certain parts of the code-stream
and carry information necessary for decompression.

The code-stream is itself a simple file format for JPEG2000 images.
Any standard decompressor must be able to understand a code-stream
stored in a file. This basic format is also called raw, and
its most common extension is ``.J2C''.

Each marker has a unique identifier, which consists of an unsigned
$16$-bit integer. A marker can appear alone, that is,
as the identifier only, or be accompanied by additional information,
in which case it is called a marker segment.

In a marker segment, the identifier is followed by another unsigned
$16$-bit integer with the length of the included data. This length
counts the two bytes of the length field itself, but not the
two bytes of the identifier.
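
The layout of markers and marker segments can be illustrated with a small
reader. This is a sketch under some assumptions: integers are taken as
big-endian, as used throughout JPEG2000; the set of markers without a
length field (SOC, EPH, SOD and EOC, with the codes shown) comes from the
marker table of Part 1 and should be checked against the standard; and the
function name \texttt{read\_marker} is hypothetical.

```python
import struct

# Markers that consist of the identifier only (no length field).
STANDALONE = {0xFF4F, 0xFF92, 0xFF93, 0xFFD9}  # SOC, EPH, SOD, EOC


def read_marker(buf, pos):
    """Return (marker_id, payload, next_pos) for the marker starting at pos."""
    (marker,) = struct.unpack_from(">H", buf, pos)
    if marker in STANDALONE:
        return marker, b"", pos + 2
    (length,) = struct.unpack_from(">H", buf, pos + 2)
    if length < 2:
        raise ValueError("segment length must count its own two bytes")
    # The length covers the length field itself but not the identifier.
    payload = buf[pos + 4 : pos + 2 + length]
    return marker, payload, pos + 2 + length


# SOC followed by one marker segment with a 3-byte payload (length = 5).
sample = bytes.fromhex("ff4f" "ff64" "0005" "616263")
first = read_marker(sample, 0)
second = read_marker(sample, first[2])
```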

The code-stream always begins with the SOC (Start Of Code-stream)
marker, which does not include any additional information.
After this marker a set of markers called the ``main header'' begins.
The SOC marker is always followed by a SIZ marker, with global
information necessary for decompressing the data, e.g. the image
size, the tile size, the anchor point of the tiles, the number
of components, the sub-sampling factors, etc.

Two other markers are mandatory in the main header:
COD, with information related to the coding of the image, such as the
number of layers, the number of DWT stages, the size of the code-blocks,
the progression, etc.; and QCD, which contains the quantization
parameters. These two markers can be stored in any position within the
main header.
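
The rules for the main header (SOC first, SIZ immediately after it, COD and
QCD somewhere before the first tile-part) can be checked with a simple scan.
The following is a minimal sketch, not the parser used by the server: the
marker codes are those of Part 1, the names \texttt{scan\_main\_header} and
\texttt{seg} are hypothetical, and the test buffer uses dummy payloads that
are not valid SIZ, COD or QCD contents.

```python
import struct

SOC, SIZ, COD, QCD, SOT = 0xFF4F, 0xFF51, 0xFF52, 0xFF5C, 0xFF90


def scan_main_header(buf):
    """Return [(marker_id, offset, length)] for the main header segments."""
    if struct.unpack_from(">H", buf, 0)[0] != SOC:
        raise ValueError("a code-stream must start with SOC")
    pos, found = 2, []
    while True:
        marker, length = struct.unpack_from(">HH", buf, pos)
        if marker == SOT:  # the first SOT ends the main header
            break
        found.append((marker, pos, length))
        pos += 2 + length  # identifier + segment (length counts itself)
    ids = [m for m, _, _ in found]
    if not ids or ids[0] != SIZ:
        raise ValueError("SIZ must follow SOC")
    if not {COD, QCD} <= set(ids):
        raise ValueError("COD and QCD are mandatory in the main header")
    return found


def seg(marker, payload):
    """Build a marker segment with a dummy payload (test helper)."""
    return struct.pack(">HH", marker, len(payload) + 2) + payload


stream = (struct.pack(">H", SOC) + seg(SIZ, bytes(4)) + seg(QCD, bytes(1))
          + seg(COD, bytes(3)) + seg(SOT, bytes(8)))
header = scan_main_header(stream)
```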

The rest of the code-stream, up to the EOC (End Of Code-stream) marker
located at its very end, is organized as shown
in Fig.~\ref{fig:code-stream}. For each image tile there is
a set of data, which is divided into one or more tile-parts.
Each tile-part is composed of a header and a set of packets.
The header of the first tile-part is the main header of the tile.
The header of each tile-part begins with the SOT (Start Of Tile-part)
marker and ends with the SOD (Start Of Data) marker, after which
the related sequence of packets begins, ordered according to the last COD or POC
marker. The main header ends when the first SOT is found.

\begin{figure}[!t]
\begin{center}
\resizebox{0.65\textwidth}{!}{\input{../codestream}}
\end{center}
\caption{Code-stream organization.}
\label{fig:code-stream}
\end{figure}

Random access to the data of a code-stream is not feasible by default.
To permit it, JPEG2000 offers the possibility
of including the TLM, PLM and/or PLT markers. The TLM and PLM markers
are included in the main header, whilst the PLT marker goes
in the header of a tile or tile-part. The goal of the TLM marker
is to store the length of each tile-part that appears in the
code-stream. This length includes the header as well
as the set of packets, so the header must be parsed first in order to
find where the data begins. The PLM marker
stores the length of each packet of each tile-part of the code-stream.
Each packet of the code-stream has a certain length, which cannot
be known a priori, so including this marker facilitates
random access to the packets. The PLT marker has the same function
as the PLM marker, but at the tile-part level: it stores
the lengths of all the packets of the tile-part it belongs to. This
marker is more commonly used than PLM.
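
Once the tile-part lengths are known, locating each tile-part reduces to a
running sum. The sketch below assumes that the lengths are listed in the order
in which the tile-parts appear in the code-stream, that the tile-parts are
stored contiguously, and that the offset of the first SOT marker is already
known (for example, from a scan of the main header). The function name
\texttt{tile\_part\_offsets} and the numbers are made up for the example.

```python
def tile_part_offsets(first_sot_offset, tile_part_lengths):
    """Return the byte offset of the SOT marker of each tile-part.

    first_sot_offset  -- offset of the first SOT marker in the code-stream
    tile_part_lengths -- tile-part lengths (header + packets), in stream order
    """
    offsets, pos = [], first_sot_offset
    for length in tile_part_lengths:
        offsets.append(pos)
        pos += length  # each length covers the header and the packets
    return offsets


offsets = tile_part_offsets(120, [1000, 250, 4000])
```

Each offset points to a tile-part header, not to its first packet, so the
header still has to be parsed up to the SOD marker to reach the packet data.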

The PLM and PLT markers increase the code-stream
length, although the way the packet lengths are coded helps to
avoid an excessive overhead: a length $L$ of a certain packet,
which can be represented with $B_{L}$ bits, is stored
in $\left \lceil \frac{B_{L}}{7} \right \rceil$ bytes. For a
length $L$, a sequence of bytes is generated in which only the
$7$ least significant bits of each byte carry data. The most significant
bit of each byte indicates whether the byte is ($0$) or
is not ($1$) the last one of the sequence. This numeric
encoding is widely used in Part 9 of the standard, especially in
the JPIP protocol. In this protocol, each variable-length sequence
of bytes that represents a number encoded in this way is called a
VBAS (Variable Byte-Aligned Segment). The class
\hyperlink{classjpip_1_1DataBinWriter}{jpip::DataBinWriter}, within the server code,
contains methods to generate VBAS-coded values.
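
The encoding can be sketched in a few lines. The example assumes the
convention of Parts 1 and 9 of the standard, in which the $7$-bit groups are
written most significant group first and the top bit is set on every byte
except the last one. The function names are hypothetical and do not correspond
to the methods of \texttt{jpip::DataBinWriter}.

```python
def vbas_encode(value):
    """Encode a non-negative integer as a sequence of 7-bit groups."""
    if value < 0:
        raise ValueError("only non-negative integers can be encoded")
    groups = [value & 0x7F]          # last byte: top bit cleared
    value >>= 7
    while value:
        groups.append((value & 0x7F) | 0x80)  # earlier bytes: top bit set
        value >>= 7
    return bytes(reversed(groups))   # most significant group first


def vbas_decode(data, pos=0):
    """Decode one value starting at pos; return (value, next_pos)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:          # top bit cleared: this was the last byte
            return value, pos


encoded = vbas_encode(300)           # 300 needs 9 bits, hence 2 bytes
decoded = vbas_decode(encoded)
```

A value below $128$ therefore occupies a single byte, which keeps the overhead
small for short packets.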

\subsection{Progressions}
\label{sec:progresiones}

The packets generated by the JPEG2000 compression process
are neither independent nor self-contained. Given a certain packet,
it is not possible to figure out which part of the image
it belongs to without additional information. The length of a packet
cannot be determined before it is decoded, and many packets cannot
be decoded without first decoding other packets. This is why
markers like TLM, PLT or PLM, discussed above, must be
included in order to allow random access without decoding.

The packets of each tile-part appear according to the progression
specified by the last COD or POC marker read before the
SOD marker. Part 1 of the JPEG2000 standard defines 5 possible
progressions for ordering the packets within a tile or tile-part.
Each progression is identified by a combination
of four letters: ``L'' for quality layer, ``R'' for resolution
level, ``C'' for component and ``P'' for precinct. Each letter identifies
one partition of the progression, from the outermost loop to the innermost.
Hence for the LRCP progression,
for example, the packets would be included as follows:\\
\\for each layer $l$\\
\hspace*{1cm} for each resolution $r$\\
\hspace*{2cm} for each component $c$\\
\hspace*{3cm} for each precinct $p$\\
\hspace*{4cm} include the packet $\zeta_{t,c,r,p,l}$\\
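
The nested loops above generalize to any of the progressions by reading the
letters from left to right as loops from the outermost to the innermost. The
sketch below is a simplification made for illustration: it assumes the same
number of precincts in every resolution and component, which does not hold in
general, and it ignores the additional position rules that the standard applies
to the progressions in which ``P'' precedes other letters. The generator name
\texttt{progression} is hypothetical.

```python
from itertools import product


def progression(order, n_layers, n_resolutions, n_components, n_precincts):
    """Yield packet indices (l, r, c, p) in the order given by `order`.

    order -- a permutation of the letters "LRCP", outermost loop first
    """
    if sorted(order) != sorted("LRCP"):
        raise ValueError("order must be a permutation of 'LRCP'")
    sizes = {"L": n_layers, "R": n_resolutions,
             "C": n_components, "P": n_precincts}
    # product() varies its last argument fastest, i.e. the innermost loop.
    for combo in product(*(range(sizes[k]) for k in order)):
        idx = dict(zip(order, combo))
        yield idx["L"], idx["R"], idx["C"], idx["P"]


lrcp = list(progression("LRCP", 2, 1, 1, 2))
rpcl = list(progression("RPCL", 2, 2, 1, 1))
```

With LRCP all the packets of layer $0$ come first, whereas with RPCL all the
layers of a resolution are stored together.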

The progressions allowed by the standard are LRCP, RLCP,
RPCL, PCRL and CPRL. The choice of progression depends on
the application being
developed and on how the packets must be decoded. For example,
if the packets are going to be accessed randomly, but
with as few disk accesses as possible, RPCL would
be the ideal progression.

In the case of image transmission, the packets must also follow a
specific order or progression when they are transmitted.
When an image is transmitted from a server to a client, the usual
goal is to let the client show reconstructions
of the image whose quality increases as the data is received.
The quality of the reconstruction must always be the maximum possible
for the received data. Under this criterion, the LRCP progression
is the most suitable one, and it is the progression used by
the class \hyperlink{classjpip_1_1WOIComposer}{jpip::WOIComposer}.

\subsection{File formats}

Although the code-stream is completely functional as a basic
file format, it does not allow the inclusion of additional information
that could be necessary in certain applications, e.g. meta-data.
Auxiliary information can be included within a
code-stream by means of markers, but it is neither classified nor organized in
a standard way.

Part 1 of the standard also defines a file format based on ``boxes'' that
allows, for example, several code-streams and diverse, correctly identified
information to be included in the same file. These files usually have
the extension ``.JP2'', which is also used as the name of this kind
of file.

JP2 files are easily extensible. A basic box structure is defined,
which can contain any kind of information. Each box type is unequivocally
identified by a $4$-byte identifier. A file can contain
several boxes with the same identifier. The standard proposes an
initial set of boxes, which may be extended according to specific
requirements. In fact, the JP2 format is the basis of the other
formats and extensions defined in the remaining parts of the standard.

Each box has a header of $8$ bytes. The first $4$ bytes, $L$, form
an unsigned integer with the length in bytes of the box, and the
next $4$ bytes, $T$, contain the identifier of the box type. This
identifier is commonly treated as a string of $4$ ASCII characters.
The value of $L$ includes the header, hence the actual length of the
content of the box is $L - 8$. $L$ can take any value greater than or
equal to $8$, but also $1$ or $0$. If $L = 1$ the length of the
box is coded as an unsigned integer of $8$ bytes, $X$,
located after $T$. In this case the header occupies $16$ bytes and
the length of the content is $X - 16$. If $L = 0$ the
length of the box is not stored and the box extends to the end of the
file, which is only allowed for the last box of the image file.
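
The three cases of $L$ can be handled as in the following sketch, which walks
the top-level boxes of a buffer held in memory. It is an illustration under
some assumptions, not the code of \texttt{jpeg2000::FileManager}: integers are
read as big-endian, the function names are hypothetical, and the box types
used in the example (\texttt{aaaa}, \texttt{bbbb}, \texttt{cccc}) are made up.

```python
import struct


def read_box_header(buf, pos):
    """Return (box_type, content_start, box_end) for the box at pos."""
    length, box_type = struct.unpack_from(">I4s", buf, pos)
    header = 8
    if length == 1:                      # extended length X follows T
        (length,) = struct.unpack_from(">Q", buf, pos + 8)
        header = 16
    elif length == 0:                    # last box: runs to the end of the data
        length = len(buf) - pos
    if length < header or pos + length > len(buf):
        raise ValueError("invalid box length")
    return box_type.decode("latin-1"), pos + header, pos + length


def iter_boxes(buf):
    """Yield (box_type, content) for each top-level box in buf."""
    pos = 0
    while pos < len(buf):
        box_type, content_start, box_end = read_box_header(buf, pos)
        yield box_type, buf[content_start:box_end]
        pos = box_end


data = (struct.pack(">I4s", 12, b"aaaa") + b"DATA"        # L = 8 + 4
        + struct.pack(">I4sQ", 1, b"bbbb", 19) + b"xyz"    # L = 1, X = 16 + 3
        + struct.pack(">I4s", 0, b"cccc") + b"tail")       # L = 0, last box
boxes = list(iter_boxes(data))
```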

Boxes can contain other boxes. Whether a box contains
sub-boxes or not is determined by the value of
$T$. A box that contains sub-boxes can contain only sub-boxes;
it cannot combine sub-boxes with other data.

Within the server code, the class
\hyperlink{classjpeg2000_1_1FileManager}{jpeg2000::FileManager} contains all the code necessary to read and parse
JPEG2000 image files, from simple raw J2C files to complex JPX ones with
hyperlinks. When this class parses an image file, it extracts the associated
index information and stores it in an object of the class
\hyperlink{classjpeg2000_1_1ImageInfo}{jpeg2000::ImageInfo}.
