<META NAME="Author" CONTENT="Jerry Peek">
<TITLE>MH & nmh: Overview of MIME Messages</TITLE>
<!-- $Id: ovofmime.htm,v 6.0 1999/10/10 05:14:05 jpeek Exp $ -->
<BODY BGCOLOR="#FFFFFF">
<H1>Overview of MIME Messages</H1>
[<A HREF="ch-itm.htm">previous</A>]
[<A HREF="mulmes.htm">next</A>]
[<A HREF="tocs/jump.htm">table of contents</A>] [<A HREF="indexes/map.htm">index</A>]
<P>
This section starts by explaining the overall purposes of MIME.
Next it shows a simple MIME message and introduces
the basic parts of a MIME message.
<H2><A NAME="PuofMI">Purposes of MIME</A></H2>
MIME was designed to:
<UL>
<LI>
Make "tree-structured" message bodies that have many levels of parts.
This is like the idea of a UNIX filesystem or an office file cabinet.
This lets you send many different things (messages, graphics, sounds,
and so on) in the same message.
<P>
Like non-MIME messages, though, many MIME messages have just one part.
<LI>
Handle multiple character sets, even within the same mail message.
<LI>
Transport many different types of data in the same message.
<LI>
Pass data reliably through standard mail transports.
The sender's MIME agent converts text and data into 7-bit <I>us-ascii</I>
with line length limits.
The recipient's MIME-capable mail program converts the message contents
back into their original formats.
</UL>
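<P>
The last purpose in the list above (encode on the sending side, restore on the receiving side) can be sketched with Python's standard <TT>email</TT> package. This is only an illustration, not how MH or nmh does the job; the addresses and the sample text are placeholders.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

text = "Pero, ¡qué día difícil!"

# Sender's side: build a message and serialize it for transport.
msg = EmailMessage()
msg["From"] = "sender@example.com"      # placeholder address
msg["To"] = "recipient@example.com"     # placeholder address
msg["Subject"] = "Round trip"
msg.set_content(text, charset="iso-8859-1", cte="quoted-printable")
wire = msg.as_bytes()

# Every byte that travels through the mail system is 7-bit ASCII.
print(wire.isascii())
print(wire.decode("ascii"))

# Recipient's side: parse the bytes and recover the original text.
received = message_from_bytes(wire, policy=policy.default)
restored = received.get_content().rstrip("\n")
print(restored == text)
```

The serialized message contains only ASCII, yet the recipient gets the accented characters back unchanged.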
<P>
Let's look at a MIME message as it would arrive in your mailbox.
The message in <A HREF="#EncMIMes">the next Example</A>
was encoded by the sender; this
is how it is actually stored on disk by the recipient's MTA (mail
transfer agent), waiting for someone to read it.
This is also how the message will look to a person who does
<I>not</I> have a MIME-capable MUA (mail user agent):
<P>
<A NAME="EncMIMes"><EM>Example: Encoded MIME message</EM></A><BR>
<PRE>
From: Jerry Peek &lt;jpeek@jpeek.com&gt;
To: carlos@entelfam.cl
Subject: Un =?iso-8859-1?Q?d=EDa_dif=EDcil?=
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

Carlos, estoy en la casa de mi amigo. Pero, =A1qu=E9 d=EDa dif=EDcil!
Tom=E9 un taxi entre al aeropuerto y el hotel. "=A1Tenga cuidado!",
me dijo el chofer. "=A1Esta parte de la ciudad es peligrosa!" Fue
evidente para m=ED. Yo o=EDa que la ciudad tiene partes malas, y esa
parte pareci=F3 as=ED. Los edificios ten=EDan barrotes sobre sus
ventanas, hubo gente sospechosa en la calle, y todas las puertas estaban
cerrada con llave. "=A1Vaya con di=F3s!" =C9l se fue.
</PRE>
<P>
Before we dig into that message, here are some things you should be aware of.
RFC 1521, the MIME specification that MH supports, only tells how to
transfer non-ASCII text and graphics in the message <I>body</I>.
Although RFC 1521 adds fields like <TT>MIME-Version:</TT> to a message
header, it doesn't tell how to put non-ASCII text in a message header.
A different specification, RFC 1522,
is a standard for non-ASCII text in the header.
RFC 1521 has been replaced by RFC 2045, and RFC 1522 has been replaced
by RFC 2047.
There are links from the <A HREF="ap-rl.htm">Reference List</A>.
<P>
MH doesn't support RFC 1522 character encoding in message headers.
nmh does support it (actually, RFC 2047) in messages you read, but it
doesn't yet have automated support for composing messages using those
characters; you have to type them in by hand as you compose the message.
<P>
To make my message <TT>Subject:</TT> more readable, I could have
written an ASCII approximation of the non-ASCII characters.
That is, instead of <TT>Un día difícil</TT>, I'd write
<TT>Un d'ia dif'icil</TT>.
Most modern MUAs support RFC 2047, though, so
let's hope that nmh can have complete support soon.
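<P>
To see what an RFC 2047-aware reader does with the <TT>Subject:</TT> field from the Example, here is a small sketch that uses Python's standard <TT>email.header</TT> module. It only illustrates the decoding step; it isn't part of MH or nmh.

```python
from email.header import decode_header, make_header

raw_subject = "Un =?iso-8859-1?Q?d=EDa_dif=EDcil?="

# decode_header() splits the field into (bytes, charset) pieces;
# make_header() joins the pieces back into readable text.
pieces = decode_header(raw_subject)
subject = str(make_header(pieces))
print(subject)  # Un día difícil
```

In the encoded word, <TT>iso-8859-1</TT> names the character set, <TT>Q</TT> selects an encoding much like quoted-printable, and the underscore stands for a space.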
<A NAME="index12"></A>
<H2><A NAME="MIHeFi">MIME Header Fields</A></H2>
The header field <TT>MIME-Version: 1.0</TT> in
<A HREF="#EncMIMes">the previous Example</A>
says that the message is in MIME format.
It's a signal to mail programs that the
message meets the requirements of RFC 1521.
(Unfortunately, a few mail programs add that header field to messages
that aren't in MIME format.)
<P>
Here's an overview of the other header fields that a MIME message
can have.
I'll toss in some MIME philosophy along the way.
We can't cover everything in the MIME spec; it's close to 100 pages long!
Other sections of this chapter, and sections of later chapters,
have more MIME information.
The Section <A HREF="moabmi.htm">More About MIME</A>
explains how to find all the gory details.
<P>
<TT>Content-type:</TT> tells you what kind of data is in the message.
The content can be <I>text</I>, <I>image</I> (still pictures), <I>video</I>,
or <I>audio</I>.
Another content type is <I>message</I>, which means that the content is
structured in the standard RFC 822 format; this can be used for
enclosing one mail message inside another (when forwarding, for instance).
The <I>application</I> content type is designed to be sent to an external
program -- for example, text for a PostScript printer or viewer.
<P>
<A NAME="index4"></A>
Finally, a message can be <I>multipart</I>, with several separate body parts.
It's possible for the parts to be of different types (the Section on
<A HREF="mulmes.htm">Multipart Messages</A>
explains the MIME syntax for multipart messages).
For example, a message could start with a <I>text</I> part to describe what's
in the message, then an <I>audio</I> part with a message from a photographer,
followed by five of the photographer's pictures (in five <I>image</I> parts).
It's also possible for each of the <I>multipart</I> body parts to be another
complete multipart message -- with its own parts.
That is, MIME messages can be recursive.
<P>
A <TT>Content-type:</TT> must have a subtype.
The type and subtype names have a slash (<TT>/</TT>) between them.
For instance, <I>image/gif</I> is a picture in the GIF format; the type is
<I>image</I> and the subtype is <I>gif</I>.
The <A HREF="mimerg.htm">MIME Reference Guide</A> lists many of the
common content types and subtypes.
<P>
<A NAME="index13"></A>
Finally, a <TT>Content-type:</TT> can have optional parameters at the end,
starting with a semicolon (<TT>;</TT>).
For example, the <I>charset=</I> parameter in
<PRE>
Content-type: text/plain; charset=iso-8859-1
</PRE>
says that the message body uses the ISO-8859-1 character set.
(The default <I>charset</I> is <I>us-ascii</I>.)
<P>
The default <TT>Content-type:</TT> is <I>text/plain</I>.
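<P>
Here's a short sketch of how a program might pull apart the type, subtype, and parameters described above. It uses Python's standard <TT>email</TT> package as a stand-in for a MIME-capable mail program; the two tiny messages are made up for the illustration.

```python
from email import message_from_string

labeled = message_from_string(
    "Content-Type: text/plain; charset=ISO-8859-1\n"
    "\n"
    "body text\n"
)
maintype = labeled.get_content_maintype()   # the part before the slash
subtype = labeled.get_content_subtype()     # the part after the slash
charset = labeled.get_content_charset()     # the charset= parameter, lowercased
print(maintype, subtype, charset)

# A message with no Content-type: field is treated as text/plain.
# The fallback passed to get_content_charset() models the us-ascii default.
bare = message_from_string("Subject: no MIME fields\n\nhello\n")
default_type = bare.get_content_type()
default_charset = bare.get_content_charset("us-ascii")
print(default_type, default_charset)
```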
<P>
The <TT>Content-transfer-encoding:</TT> field tells how the message
data was encoded for transfer.
Encoding gets the message safely from the person who sent it, across mail
transfer agents and gateways, into your mailbox.
The Section <A HREF="#MIMEnc">MIME Encoding</A>
lists the values that can go in the
<TT>Content-transfer-encoding:</TT> field.
<P>
The <TT>Content-ID:</TT> and <TT>Content-Description:</TT> header fields
help to describe what's in the message body.
These two fields are most useful for the parts of a multipart message.
<P>
The <TT>Content-ID:</TT> is a unique string, similar to the RFC 822 field
<TT>Message-ID:</TT>, that no other message in the world is likely to have.
A typical <TT>Content-ID:</TT> value is <TT>&lt;1283.780402430.1@ora.com&gt;</TT>.
Each part of a multipart message has its own <TT>Content-ID:</TT>.
Its main use is identifying
<A HREF="../mh/remime.htm#ExtParts">external parts</A>
and <A HREF="../mh/remime.htm#CachCont">cached body parts</A>.
<P>
The <TT>Content-Description:</TT> field describes the content in words, like
<TT>Report on Zeta Meeting</TT>.
The RFC 822 <TT>Subject:</TT> field describes the whole message.
You can add a <TT>Content-Description:</TT> in the message header,
but it's more useful for describing a part of a multipart message.
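<P>
The sketch below ties these fields to the recursive multipart structure described earlier: it builds a multipart message that holds another multipart message, gives each text part its own <TT>Content-ID:</TT> and <TT>Content-Description:</TT>, and prints an outline of the tree. It uses Python's standard <TT>email</TT> package; the helper names <TT>labeled_text</TT> and <TT>outline</TT>, the domain, and the part contents are all invented for the illustration.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.utils import make_msgid


def labeled_text(body, description):
    """Build a text part with its own Content-ID and Content-Description."""
    part = MIMEText(body, "plain", "us-ascii")
    part["Content-ID"] = make_msgid(domain="example.com")  # placeholder domain
    part["Content-Description"] = description
    return part


def outline(part, depth=0):
    """Return one indented line per part, descending into multipart bodies."""
    label = part.get("Content-Description", "")
    lines = [("  " * depth + part.get_content_type() + " " + label).rstrip()]
    if part.is_multipart():
        for child in part.get_payload():
            lines.extend(outline(child, depth + 1))
    return lines


inner = MIMEMultipart("mixed")
inner.attach(labeled_text("Item one\n", "Agenda"))
inner.attach(labeled_text("Item two\n", "Action items"))

outer = MIMEMultipart("mixed")
outer["Subject"] = "Zeta meeting"
outer.attach(labeled_text("Summary\n", "Report on Zeta Meeting"))
outer.attach(inner)

tree = outline(outer)
print("\n".join(tree))
```

The <TT>Subject:</TT> describes the whole message, while each <TT>Content-Description:</TT> labels a single part.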
<P>
The list of content subtypes changes a lot.
Quite a few subtypes are for non-UNIX computers.
This book doesn't cover every content subtype; examples use some
of the most common subtypes from RFC 1521.
Your MH setup probably doesn't need to support all subtypes:
<UL>
<LI>
When you send a message, unless you know that the recipient needs
a particular kind of content, it's a good idea to use the simplest
(most widely available) content type and subtype that you can.
For instance, to send a non-text data file, you should probably use
<I>application/octet-stream</I> instead of <I>application/mac-binhex40</I>
because the BinHex format is generally used for Macintosh files.
<LI>
If you get a message with content that your MH hasn't been set up to
handle, you may be able to make sense of the message by ignoring the
encoding.
You can
<A HREF="../mh/remime.htm#Alttomhn">prevent MIME decoding</A>
by setting the MH <I>NOMHNPROC</I> environment variable
or using the nmh <I>-nocheckmime</I> switch.
Look at the message on your screen.
If that isn't enough, and you don't think you'll get many messages of
that kind, you can:
<UL>
<LI>
<A HREF="../mh/stormess.htm#DeStMIMe">Store the decoded message content in a file</A>,
<LI>
Take the encoded message apart with a
text editor (as in the Section
<A HREF="../mh/edmeshm.htm">Edit Messages with show: mhedit</A>)
and find a decoder program, or
<LI>
Ask the sender to give you the data in a different format.
</UL>
</UL>
<P>
In my opinion, it doesn't make sense to constantly reconfigure MH for every
new content subtype unless you have a lot of spare time. <TT>:-)</TT>
<A NAME="index5"></A>
<H2><A NAME="MIMEnc">MIME Encoding</A></H2>
MIME messages are designed to be readable by all existing
RFC 822-compatible mail programs.
(Although, of course, MUAs that don't understand MIME won't be
able to interpret the MIME-specific parts.)
The messages may be sent through all kinds of networks and gateways.
So, MIME encodes messages that have non-ASCII parts.
The <TT>Content-transfer-encoding:</TT>
field tells the recipient the way a message
was encoded -- and how to decode it.
(Encoding usually isn't required for plain ASCII text.)
<P>
The message in <A HREF="#EncMIMes">the previous Example</A>,
from me to a friend in Chile, is in Spanish (more or less <TT>:-)</TT>).
Spanish uses characters like
¡ and ñ that aren't part of the ASCII character set.
My MIME-capable MUA could encode the
message's non-ASCII characters to pass safely through an
ASCII-only mail transfer system.
For instance, the character ¡
was encoded as the three-character sequence <TT>=A1</TT>.
When the message gets to Chile, my friend's MIME-capable
mail reader will translate the message's encoded characters to the correct
characters.
<P>
One important feature of MIME's encodings is that they are designed to
leave as much of the message in plain ASCII text as possible.
In general, MIME only translates the characters that some email transfer
systems can't handle.
So, if the recipient doesn't have a MIME-capable MUA,
the encoded text in the message will probably still make some sense.
(It's possible to encode text messages so that people can't read
them without decoding.
But, unless you want to hide words or have another reason, one of the
less-severe encodings will probably do the job.
In general, MH chooses human-readable encoding for text messages.)
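<P>
To make the translation concrete, here is a sketch of what a MIME-capable reader does with part of the first line of the Example's body. It uses Python's standard <TT>quopri</TT> module; it illustrates the decoding step and isn't taken from MH.

```python
import quopri

encoded = b"Pero, =A1qu=E9 d=EDa dif=EDcil!"

# First undo the transfer encoding (bytes in, bytes out) ...
raw = quopri.decodestring(encoded)
# ... then read the bytes in the character set named in Content-Type:.
text = raw.decode("iso-8859-1")
print(text)  # Pero, ¡qué día difícil!
```

Only the four non-ASCII characters were encoded; the rest of the line is readable even without decoding.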
<P>
Of course, when MIME encodes a binary file (like a digitized picture)
that people can't read in the first place, the encoded data won't be
any easier for a person to read.
MIME encoding is designed to get the data safely through almost every
known mail transfer system and gateway.
One of the major wins in MIME is that it was designed to work <I>everywhere</I>,
including "broken" and "brain-damaged" systems.
Instead of trying to impose a new standard on mail transfer systems, MIME
works with existing systems -- and adapts to their eccentricities.
<P>
Although you don't need to understand how encoding works to use MIME,
you should have a general idea of the types of encoding.
So even if you'd like to skip the technical details in the following
section, please at least skim it to learn the types of encoding.
There are five encodings:
<UL>
<LI>
<A NAME="index6"></A>
<B>7bit</B> is the default.
It means that the message contents are plain ASCII text.
Lines must be "short" (1000 characters or fewer) and end with
CR-LF (carriage return plus linefeed).
<LI>
<A NAME="index7"></A>
<B>quoted-printable</B>
is used for text that is mostly 7-bit but which has a
small percentage of 8-bit characters.
(There's quoted-printable text in
<A HREF="#EncMIMes">the previous Example</A>.)
For instance, characters with the eighth bit on in the ISO-8859-<I>n</I>
sets should be encoded as quoted-printable.
Each 8-bit character is encoded into three 7-bit characters: <TT>=</TT>
(an equal sign) and the two-digit hexadecimal value of the character.
So the ISO-8859-1 character ñ,
which has the hex value F1 (that's 11110001 binary),
would be encoded as <TT>=F1</TT>.
<P>
To keep the message readable on non-MIME readers, characters that don't have
the eighth bit set generally shouldn't be encoded.
The <TT>=</TT> character itself must be encoded, though; it's encoded as
<TT>=3D</TT>.
Also, space and tab characters at the ends of lines must be encoded
(as <TT>=20</TT> and <TT>=09</TT>, respectively); this keeps broken gateways
from stripping them.
<P>
If a line ends with <TT>=</TT> followed by CR-LF, those characters are
ignored; this lets you continue ("wrap") a long line.
Lines must be no more than 76 characters long, not counting the final
CR-LF.
Longer lines will be broken when the message is encoded and joined
again when it is decoded.
<P>
Quoted-printable text was designed to be (mostly) readable by people
with non-MIME mail programs.
<LI>
<A NAME="index8"></A>
<B>base64</B>
is used for data and other text that was never meant to be
read by humans -- or must be preserved verbatim.
Every 3 octets (24 bits) are encoded into a 4-character sequence.
The 64-character set was chosen carefully.
It comes from ASCII characters that aren't munged by known gateways or
mail transfer agents.
<LI>
<A NAME="index9"></A>
<B>8bit</B>
is data made of 8-bit characters with "short" lines that end
with CR-LF.
This isn't too useful yet because <I>8bit</I> data can't be shipped reliably
over standard SMTP mail transport.
The new ESMTP standard (RFC 1651) and <I>8bitMIMEtransport</I>
extension (RFC 1652) handle 8-bit MIME messages.
<LI>
<A NAME="index10"></A>
<B>binary</B>
is like 8bit, but without CR-LF line boundaries.
</UL>
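<P>
A few of the rules in the list above can be tried out with Python's standard <TT>quopri</TT> and <TT>base64</TT> modules. This is a sketch for experimenting, with made-up sample data; a real mail program applies the same rules while it builds or reads a whole message.

```python
import base64
import quopri

# quoted-printable: "=" is written as =3D and the 8-bit character
# ñ (hex F1 in ISO-8859-1) as =F1; other ASCII text passes through.
qp = quopri.encodestring("a=b ñ".encode("iso-8859-1"))
print(qp)

# A line ending in "=" plus CR-LF is a soft break, dropped when decoding.
joined = quopri.decodestring(b"a long =\r\nline")
print(joined)

# Encoding wraps long lines; decoding joins them again.
long_line = b"x" * 200
wrapped = quopri.encodestring(long_line)
widest = max(len(line) for line in wrapped.splitlines())
rejoined = quopri.decodestring(wrapped)
print(widest, rejoined == long_line)

# base64: every 3 octets become 4 characters from a 64-character set.
three = base64.b64encode(b"\x00\xf1\xff")
six = base64.b64encode(b"\x00\xf1\xff\x00\xf1\xff")
print(three, len(three), len(six))
```

Notice that the quoted-printable output is still mostly readable, while the base64 output is not; that's the trade-off between the two encodings.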
<P>
One of MIME's main goals is to make different email
programs work with each other.
To make interoperability more likely, the MIME designers tried to avoid
having lots of different content-types.
They tried even harder to avoid lots of different encodings.
MIME content types and subtypes, as well as encodings, are registered with
the IANA (Internet Assigned Numbers Authority).
"Experimental" unofficial values start with <I>X-</I>, like <I>X-pbm</I>.
A few experimental content-types and subtypes are in wide use.
But in general, so that as many people as possible can read your message,
try to avoid inventing new content types and subtypes.
If you get the urge to create, the
<A HREF="moabmi.htm">
<I>comp.mail.mime</I> newsgroup and the <I>info-mime</I> mailing list</A>
are great places to work it out.
<P>
<STRONG>Revised by Jerry Peek.</STRONG>
<EM>Last change $Date: 1999/10/10 05:14:05 $</EM>
<P>
This file is from the third edition of the book <I>MH & xmh: Email
for Users & Programmers</I>, ISBN 1-56592-093-7, by Jerry Peek.
Copyright © 1991, 1992, 1995 by O'Reilly & Associates, Inc.
This file is freely available; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation. For more information, see
<A HREF="../copying.htm">the file <I>copying.htm</I></A>.
<P>
Suggestions are welcome:
<A HREF="http://www.jpeek.com/">Jerry Peek</A>
<A HREF="mailto:jpeek@jpeek.com">&lt;jpeek@jpeek.com&gt;</A>