How to make PDF files using CJK/LaTeX with embedded TrueType fonts

Hin-Tak Leung <htl10@users.sourceforge.net>

Existing CJK/LaTeX instructions for high-quality printouts tend to suggest
permanently converting TrueType fonts (which are more widely available) to
PostScript subfonts. This document covers how to use TrueType fonts
directly and how to prepare PDF documents with them. Today, the PDF output
format is slightly more popular than PostScript. Moreover, even on US
English systems, CJK font packs are available for font substitution in
Adobe Acrobat Reader (and similar mechanisms exist for xpdf and
ghostscript), which allows the generation of PDF files containing only the
textual content but no embedded fonts. Such files are small enough to be
e-mailed while preserving formatting, provided the recipient has the font
packs installed. This document also covers the issues with
no-embedded-font PDF files at the very end.

The following steps are discussed below in greater detail:

2. Getting and building some software: ttf2tfm, dvipdfmx.
   Some other nice optional software: oto, the other freetype/freetype2
   demo tools, ttfm, ttx.
3. Using ttf2tfm, generating *.tfm and *.enc files for each font.
4. Putting the fonts, the *.tfm files, and the *.enc files into the right
   places.
5. Configuring dvipdfmx to use the new fonts.
6. (optional) Configuring pdflatex to use the new fonts also.
7. Configuring CJK/LaTeX to use those fonts.

I can read both traditional and simplified Chinese, and a substantial amount
of Japanese, but there isn't any Korean-specific info here. Hopefully this
is useful enough as a starting point at least for Korean-related

The two most important references during this venture were the FreeBSD
(Taiwan) Chinese HOWTO (it is substantially better and more up-to-date than
the GNU/Linux one) and Edward G. J. Lee's various treatises on the net,
particularly his `mycjk' notes. Unfortunately, both are available in
Chinese only.

Arphic donated four high-quality Chinese fonts to the open-source
community: two for traditional and two for simplified Chinese. They are
shipped with Redhat 9 (which I used for most of this work), Debian 3,
and possibly other GNU/Linux distributions; they can also be downloaded
from Arphic's home site and, probably more conveniently, from

  ftp://ftp.gnu.org/gnu/non-gnu/chinese-fonts-truetype/

and its mirrors. Tip: Use `unzip -L' to convert file names to lowercase.

Redhat 9 also ships zysong, a simplified Chinese font. This font seems to
be licensed to Redhat only since it isn't found in other GNU/Linux
distributions. It is part of the package "ttfonts-zh_CN-2.12-1.noarch.rpm",
together with the two Arphic simplified Chinese fonts, on the 3rd CD
of the Redhat 9 CD set.

The Ministry of Education in Taiwan released a few fonts for
standardization. Currently two are available from the ministry's home page
(http://www.edu.tw/mandr/index.htm), but there are old versions with
different typefaces floating around on the net.

CwTeX (a Chinese-enabled LaTeX implementation from Taiwan) ships 5 fonts
(http://ccms.ntu.edu.tw/~ntut019/cwtex/cwtex.html).

Still available is the set of 8 TrueType fonts from NTU which were widely
used previously for CJK/LaTeX documents (http://input.cpatch.org/font/ntu/).

There is also a set of 10 quite fancy and unusual fonts for traditional
Chinese, developed by Dr Hann-Tzong Wang
(http://140.135.64.77/teacher/htwang/htwang.htm). It is distributed as
one of the standard font sets for FreeBSD Taiwan
(http://www.freebsd.org/cgi/pds.cgi?ports/chinese/wangttf).

Redhat 9 and SuSE both ship the Kochi Gothic and Mincho fonts; Debian
ships Watanabe Mincho and Wadalab Gothic as part of the XTT TrueType font
server. The packages are: "ttfonts-ja-1.2-21.noarch.rpm" on the 3rd disc
of the Redhat 9 CD set, "ttf-kochi-mincho-0.2.20020727-81.noarch.rpm" and
"ttf-kochi-gothic-0.2.20030118-17.noarch.rpm" on SuSE 8.2, and
"xtt-fonts" for Debian systems.

Other sources of fonts (e.g. Win2k/WinXP/Win2k3 ship a few as standard,
as do localized versions of MS Office, etc.) are mostly proprietary.

These instructions are known to work with those also, but I don't want to
go into specific details...

2. ttf2tfm and dvipdfmx
=======================

The specific details about compiler switches and include paths are for
the Redhat 9 distribution. You may have to adapt them.

ttf2tfm is part of the ttf2pk package, which is itself part of
freetype-contrib, a suite of programs depending on the FreeType 1 library.
Most GNU/Linux systems ship both FreeType 2 and FreeType 1 (that's the
case for RH9, in fact), which are *not* compatible. So I decided to build
a static version of the latest FreeType 1 and made freetype-contrib depend
on that, to avoid using the outdated library shipped with my system. The
packages mentioned can be downloaded from ftp.freetype.org.

Unpack freetype-current (adapt the `/home/hleung' part to suit yourself),

  cd /home/hleung/freetype-current
  ./configure --enable-static --disable-shared --prefix=/home/hleung

Now unpack freetype-contrib-current inside the freetype-current tree, then

  cd freetype-contrib-current/ttf2pk
  CFLAGS=-I../../lib/ LDFLAGS=-L../../lib/.libs ./configure \
    --with-kpathsea-lib=/usr/lib --with-kpathsea-include=/usr/include

Important: At the end, you need to manually copy the data/*.sfd files into
${TEXMF}/ttf2tfm and also ${TEXMF}/ttf2pk (a soft link from
${TEXMF}/ttf2pk to ${TEXMF}/ttf2tfm will do also).

[The recent TeX directory structure (TDS), version 1.1, comes with a new
subdirectory fonts/sfd, to be accessed with the kpathsea variable
$SFDFONTS. ttf2tfm and other programs available in the TeXLive
distribution have already been updated to use it.]

The man pages of ttf2tfm and ttf2pk give a detailed explanation of all
command line arguments.

Tip: I find a utility called "checkinstall" quite useful. Instead of `make
install' one calls `checkinstall', which does the same as `make install'
but also integrates the data nicely into the package management system for
Redhat/Debian/Slackware; this gives cleaner upgrades and uninstalls.

  http://project.ktug.or.kr/dvipdfmx/

  CFLAGS='-I/usr/kerberos/include -O2 -march=i386 -mcpu=i686' ./configure

The include path is needed because of a dependency on the kerberos library
for PDF encryption. Important: The 10 Wang fonts have some peculiarities; I
submitted a preliminary patch which the author has much refined and
incorporated into a new release. You need a version newer than 2003-08-11
if you want to use this set of fonts. From the ChangeLog of dvipdfmx:

  2003-08-11 Jin-Hwan Cho <chofchof@ktug.or.kr>
    * A faked font name was used for TrueType fonts without any PS
      font name as suggested by Hin-Tak Leung.

[The recent TeX directory structure (TDS), version 1.1, comes with a new
subdirectory fonts/sfd, to be accessed with the kpathsea variable
$SFDFONTS. dvipdfmx and other programs available in the TeXLive
distribution have already been updated to use it.]

3. Generating tfm and enc files
===============================

OpenType Organizer (oto): http://sourceforge.net/projects/oto/
True Type Font Manager (ttfm):
  - part of Chinese GNU/Linux Extension http://cle.linux.org.tw/

You need to know what cmap (character map) the TrueType font (*.ttf or
*.ttc) contains. The utility programs oto, ftdump (two versions! --
FreeType 1 and FreeType 2 both have this demo program, showing quite
different information), and ttfinfo (part of ttfm) can show this and
some other information about your font as well. Only ftdump works on
TrueType collections (*.ttc), but the other two have their strengths also
(ttfinfo gives the most straightforward info, while oto gives some details
that ftdump doesn't show).

For detailed information on the cmaps in a font you can use ttx, a tool to
assemble and disassemble OpenType fonts. It is available from
http://fonttools.sf.net.

If there is a Unicode cmap you can use ttf2tfm's `U*.sfd' files (see the
`@...@' argument for ttf2tfm); the command line for ttf2tfm is simpler
also. Otherwise you need to specify the platform (-P) and encoding (-E)

Here is what works for me for the fonts I mentioned. Important: The font
stem name needs to be unique. Additionally, dvipdfmx doesn't like digits
in the font stem name. I use a 4-letter combination. By LaTeX convention
it shouldn't be longer than 5 letters.

  ttf2tfm bkai00mp.ttf -q -w bkai@UBig5@
  ttf2tfm bsmi00lp.ttf -q -w bsmi@UBig5@
  ttf2tfm gbsn00lp.ttf -q -w gbsn@UGB@
  ttf2tfm gkai00mp.ttf -q -w gkai@UGB@

  ttf2tfm zysong.ttf -q -w zysg@UGB@

  ttf2tfm kai-linux.ttf -P 3 -E 4 -q -w mekl@Big5@
  ttf2tfm edustd-15.ttf -P 3 -E 4 -q -w mest@Big5@
  ttf2tfm moe_kai.ttf -P 3 -E 4 -q -w meko@Big5@
  ttf2tfm moe_sung.ttf -P 3 -E 4 -q -w meso@Big5@

  ttf2tfm ntu_li_m.ttf -P 3 -E 4 -q -w ntli@Big5@
  ttf2tfm ntu_br.ttf -P 3 -E 4 -q -w ntbr@Big5@
  ttf2tfm ntu_fs_m.ttf -P 3 -E 4 -q -w ntfs@Big5@
  ttf2tfm ntu_kai.ttf -P 3 -E 4 -q -w ntka@Big5@
  ttf2tfm ntu_mb.ttf -P 3 -E 4 -q -w ntmb@Big5@
  ttf2tfm ntu_mm.ttf -P 3 -E 4 -q -w ntmm@Big5@
  ttf2tfm ntu_mr.ttf -P 3 -E 4 -q -w ntmr@Big5@
  ttf2tfm ntu_tw.ttf -P 3 -E 4 -q -w nttw@Big5@

  ttf2tfm mttf.ttf -q -w cwtm@UBig5@
  ttf2tfm kttf.ttf -q -w cwtk@UBig5@
  ttf2tfm fttf.ttf -q -w cwtf@UBig5@
  ttf2tfm bbttf.ttf -q -w cwtb@UBig5@
  ttf2tfm rttf.ttf -q -w cwtr@UBig5@

  ttf2tfm kochi-gothic.ttf -w kcgt@UJIS@
  ttf2tfm kochi-mincho.ttf -w kcmc@UJIS@

  ttf2tfm wadalab-gothic.ttf -P 3 -E 2 -w wdgt@SJIS@
  ttf2tfm watanabe-mincho.ttf -P 3 -E 2 -w wnmc@SJIS@

Wang's font set has some unusual properties and needs either
a new version of FreeType 1 (after 2003-10 from CVS) or a slightly
modified "Big5.sfd", called "wcl.sfd" here:

  ttf2tfm wcl-01.ttf -P 3 -E 4 -q -w wclj@wcl@
  ttf2tfm wcl-02.ttf -P 3 -E 4 -q -w wclk@wcl@
  ttf2tfm wcl-03.ttf -P 3 -E 4 -q -w wcll@wcl@
  ttf2tfm wcl-04.ttf -P 3 -E 4 -q -w wclm@wcl@
  ttf2tfm wcl-05.ttf -P 3 -E 4 -q -w wcln@wcl@
  ttf2tfm wcl-06.ttf -P 3 -E 4 -q -w wclp@wcl@
  ttf2tfm wcl-07.ttf -P 3 -E 4 -q -w wclq@wcl@
  ttf2tfm wcl-08.ttf -P 3 -E 4 -q -w wclr@wcl@
  ttf2tfm wcl-09.ttf -P 3 -E 4 -q -w wcls@wcl@
  ttf2tfm wcl-10.ttf -P 3 -E 4 -q -w wclt@wcl@

As an example, here is what I do for a well-known proprietary simplified
Chinese font which has only a cmap for simplified Chinese:

  ttf2tfm gkai00m.ttf -P 3 -E 3 -q -w gkaim@EUC@

Here is an example for a TrueType collection:

  ttf2tfm dcai5.ttc -q -w dcaiq@UJIS@
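
The command lines above all follow one pattern, so they can be generated
from a small table. The following is only a sketch and not part of the
original setup: the function names valid_stem and ttf2tfm_cmd are made up
for illustration, the table holds just two of the fonts listed above, and
the commands are printed rather than executed so that they can be reviewed
first. The stem check encodes the rules stated earlier (letters only, at
most 5 characters, unique).

```shell
#!/bin/sh
# Sketch only: print ttf2tfm command lines from a table instead of typing them.

# valid_stem STEM: succeed if STEM consists of 1 to 5 letters (no digits).
valid_stem() {
    case "$1" in
        ''|*[!A-Za-z]*) return 1 ;;
    esac
    [ "${#1}" -le 5 ]
}

# ttf2tfm_cmd FILE STEM SFD [PLATFORM ENCODING]: print one command line.
# PLATFORM and ENCODING are only needed for fonts without a Unicode cmap.
ttf2tfm_cmd() {
    valid_stem "$2" || { echo "bad stem: $2" >&2; return 1; }
    if [ -n "${4:-}" ] && [ -n "${5:-}" ]; then
        echo "ttf2tfm $1 -P $4 -E $5 -q -w $2@$3@"
    else
        echo "ttf2tfm $1 -q -w $2@$3@"
    fi
}

# Table columns: font file, stem, SFD name, optional platform and encoding.
seen=" "
while read -r file stem sfd pid eid; do
    case "$seen" in
        *" $stem "*) echo "duplicate stem: $stem" >&2; continue ;;
    esac
    seen="$seen$stem "
    ttf2tfm_cmd "$file" "$stem" "$sfd" "$pid" "$eid"
done <<'EOF'
bkai00mp.ttf bkai UBig5
ntu_kai.ttf ntka Big5 3 4
EOF
```

Once the printed commands look right, the output can be piped into `sh'
from the directory that holds the font files.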

4. Putting the files where they should be
=========================================

This is somewhat related to how kpathsea works and how latex (the program)
finds its files. It is possible to set individual environment variables for
each of these items, but it is easier to set just one, $TEXMF, to a list of
locations, each with a tree parallel to the system tree. Then do the
following:

. Put the *.tfm files into a subdirectory of ${TEXMF}/fonts/tfm.
. Put the *.enc files into a subdirectory of ${TEXMF}/dvips.
. Put the *.ttf (or *.ttc) files into a subdirectory of
  ${TEXMF}/fonts/truetype.
. Put the *.sfd files into ${TEXMF}/ttf2tfm or a subdirectory of it.
  Don't forget to either copy them into ${TEXMF}/ttf2pk also or to set up
  a link from ${TEXMF}/ttf2pk to ${TEXMF}/ttf2tfm.

  Reason: dvipdfmx searches for SFD files (which it needs for reassembling)
  under ${TEXMF}/ttf2pk although we don't use ttf2pk anywhere. ttf2tfm
  looks for them under its own name, of course.

[The recent TeX directory structure (TDS), version 1.1, comes with a new
subdirectory fonts/sfd, to be accessed with the kpathsea variable
$SFDFONTS. dvipdfmx and other programs available in the TeXLive
distribution have already been updated to use it.]

Important: Run texhash (mktexlsr) to rebuild the kpathsea database,
otherwise files won't be found. You have been warned!
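
The placement rules above can be scripted. The sketch below is an
illustration under several assumptions: TREE is a single writable texmf
directory (not a whole $TEXMF list), SUBDIR is a name of your own choosing,
and the function names copy_existing and install_cjk_font are invented
here. It follows the pre-TDS-1.1 layout described in this section, and it
only calls mktexlsr if that program is installed.

```shell
#!/bin/sh
# Sketch only: copy freshly generated font files into one private texmf tree.

# copy_existing DEST FILE...: copy those FILEs that exist into DEST.
copy_existing() {
    dest=$1; shift
    for f in "$@"; do
        if [ -e "$f" ]; then
            cp "$f" "$dest/" || return 1
        fi
    done
    return 0
}

# install_cjk_font TREE SUBDIR SRCDIR
install_cjk_font() {
    tree=$1 sub=$2 src=$3
    mkdir -p "$tree/fonts/tfm/$sub" "$tree/dvips/$sub" \
             "$tree/fonts/truetype/$sub" "$tree/ttf2tfm" || return 1
    copy_existing "$tree/fonts/tfm/$sub"      "$src"/*.tfm || return 1
    copy_existing "$tree/dvips/$sub"          "$src"/*.enc || return 1
    copy_existing "$tree/fonts/truetype/$sub" "$src"/*.ttf "$src"/*.ttc \
        || return 1
    copy_existing "$tree/ttf2tfm"             "$src"/*.sfd || return 1
    # dvipdfmx looks under ttf2pk, so make that a link to ttf2tfm.
    if [ ! -e "$tree/ttf2pk" ]; then
        ln -s ttf2tfm "$tree/ttf2pk" || return 1
    fi
    # Rebuild the kpathsea database if mktexlsr is installed.
    if command -v mktexlsr >/dev/null 2>&1; then
        mktexlsr "$tree" || echo "mktexlsr failed for $tree" >&2
    fi
    return 0
}

# Example call (all three arguments are placeholders):
# install_cjk_font "$HOME/texmf" arphic ./generated
```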

5., 6. Configuring dvipdfmx and (optionally) pdflatex
=====================================================

cid-x.map, dvipdfmx.cfg, *.map

See, for example, my own "cid-x.map", the main font config file of
dvipdfmx -- all my own customization is at the very end, after the line
"Hin-Tak Leung's custom setup below:". For each font xxxx, one needs
to add a line "f xxxx.map" to "dvipdfmx.cfg" and put a fontmap
file "xxxx.map" into the dvipdfmx config directory --
${TEXMF}/dvipdfm/config/ on my system (the missing
"x" is not a typo, as dvipdfmx was originally derived from dvipdfm).
I have included cwbt.map, for one of the CwTeX fonts, as an example,
and my dvipdfmx.cfg as well.

Because I have a fair number of fonts I would like to add, I wrote a little
perl script "gen-map.pl" which generates, from an internal table at the
very top of the script, all the *.map files plus a file called "map.list"
which I can simply append to dvipdfmx.cfg.
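
As a rough illustration of the "f xxxx.map" bookkeeping only, here is a
shell sketch. It is not the gen-map.pl script mentioned above and does not
write the *.map files themselves; the function name append_map_entries is
invented here. It appends one "f xxxx.map" line per font stem to a config
file and skips lines that are already present, so it can be run repeatedly.

```shell
#!/bin/sh
# Sketch only: add "f STEM.map" lines to a dvipdfmx.cfg-style file.

# append_map_entries CFG STEM...
append_map_entries() {
    cfg=$1; shift
    for stem in "$@"; do
        line="f $stem.map"
        # -x: match the whole line, -F: fixed string, -q: no output
        if ! grep -qxF "$line" "$cfg" 2>/dev/null; then
            printf '%s\n' "$line" >> "$cfg" || return 1
        fi
    done
    return 0
}

# Example call (the path is a placeholder for your own config file):
# append_map_entries ./dvipdfmx.cfg bkai bsmi cwtb
```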

pdflatex needs the same fontmap files for each new font -- copy them into
${TEXMF}/dvips/config/. Modify the updmap script, which is used for
updating both pdflatex.cfg and dvips.cfg, and run it.
On teTeX 1.0.x, one needs to add the *.map files for each font to the
"extra_modules=" entry. My modified updmap is included as an example,
"updmap.my"; on a RH 9 system the script is found as
"/usr/share/texmf/dvips/config/updmap". On teTeX 2.0.x, updmap has a
separate config file, updmap.cfg, located in ${TEXMF}/web2c/.

7. Configuring CJK/LaTeX
========================

Copy the whole `texinput' directory of the CJK package into a directory
which is in your $TEXINPUTS path. Also create some new *.fd files there.
My "c00cwtb.fd" is included as an example; again, since I have quite
a few font files, I have created some template fd files as c*tmpl.fd.
For each font I duplicate a template and change every "tmpl" string
inside to the font stem, e.g. "cwtb":

  cp c00tmpl.fd c00cwtb.fd
  perl -pi -e "s/tmpl/cwtb/g;" c00cwtb.fd
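
With many fonts, the copy-and-substitute step above can be looped over all
templates and stems. This is a sketch under stated assumptions: the
templates are named c*tmpl.fd as described, stems contain only letters (so
they are safe inside the sed expression), and the function name
make_fd_files is invented for illustration. It uses sed instead of perl
only to keep the example to plain shell tools.

```shell
#!/bin/sh
# Sketch only: create one .fd file per (template, stem) pair.

# make_fd_files DIR STEM...
make_fd_files() {
    dir=$1; shift
    for stem in "$@"; do
        for tmpl in "$dir"/c*tmpl.fd; do
            [ -e "$tmpl" ] || continue
            # c00tmpl.fd -> c00STEM.fd
            out=${tmpl%tmpl.fd}$stem.fd
            # Replace every "tmpl" inside the file, not just the first.
            sed "s/tmpl/$stem/g" "$tmpl" > "$out" || return 1
        done
    done
    return 0
}

# Example call (directory and stems are placeholders):
# make_fd_files ./texinput cwtb cwtk
```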

If you use Big5 or Shift-JIS encoding, compile the bg5conv and
sjisconv utilities; under Unix-like systems you can use the bg5pdflatex
and sjispdflatex scripts to access them conveniently.

Just pick the relevant files in the CJK/examples directory and change the
font name to match. Either call pdflatex or call latex followed by
dvipdfmx. In general, I found that dvipdfmx generates much smaller files

a. files can't be found

This is the most frequent problem. Setting the environment variable
KPATHSEA_DEBUG to -1 activates full debugging; you can then check
how latex/dvipdfmx/pdflatex try to find those files. See the
kpathsea info pages for more details on the debugging output.
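
Before reading a full debug trace it can help to ask kpathsea directly
whether it finds each file. The sketch below uses kpsewhich, the lookup
tool that comes with kpathsea; the function name check_tex_files and the
file names in the example call are placeholders, not names taken from a
real installation.

```shell
#!/bin/sh
# Sketch only: report which files kpathsea can locate.

# check_tex_files FILE...: print one line per file; fail if any is missing.
check_tex_files() {
    if ! command -v kpsewhich >/dev/null 2>&1; then
        echo "kpsewhich not found in PATH" >&2
        return 2
    fi
    missing=0
    for name in "$@"; do
        if path=$(kpsewhich "$name") && [ -n "$path" ]; then
            echo "ok       $name -> $path"
        else
            echo "MISSING  $name"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (file names are placeholders):
# check_tex_files c00cwtb.fd cwtb01.tfm bbttf.ttf
#
# Full search trace for a single run (the trace is written to stderr):
# KPATHSEA_DEBUG=-1 latex mydoc.tex 2> kpathsea.log
```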

For latex (the program) you only need the new custom-made *.fd files,
the files from CJK/texinput, and the tfm files. The *.fd files could be
broken -- check their contents. latex (the program) needs neither the
*.enc files nor the font files themselves.

If latex (the program) works but dvipdfmx doesn't, then your dvipdfmx
configuration probably needs some tuning. Alternatively, the map files
or the font files are not found, etc. Note that dvipdfmx needs neither
the tfm files nor the CJK/LaTeX input files, but it does need the

pdflatex does everything in one step, so everything needs to be in the

b. Acrobat on GNU/Linux doesn't print PDF files generated with dvipdfmx

The problem is probably caused by ghostscript version 7.x, which chokes
on the intermediate PostScript file under some command options.
Upgrading to ghostscript 8.x should fix this print filter problem. It is
*strongly* recommended to use ghostscript 8.11 or newer due to severe
problems with earlier versions.

c. no-font-embedded PDF files

This is quite simple to do with dvipdfmx: Just put an extra `!'
(exclamation mark) in the dvipdfmx configuration file in front of the
font which shouldn't be embedded.

A problem can arise if the font specified in the document isn't available
and the PDF reader is not able to find a proper substitution font.
I did some investigation and had a long discussion with the author of
dvipdfmx about this. Basically, it seems that win32 Acrobat Reader 6.x
will substitute any missing fonts with fonts from the Adobe CJK font
packs or from the system. Acrobat Reader 5.x for GNU/Linux will only do
so -- and only with fonts from the CJK packs, not from the X server --
if the font name is one of the well-known ones for that region:
SimHei, SimSun (found on most MS Windows boxes), and some fonts of Arphic
and Dynafont which are very popular in the Far East. Otherwise,
it aborts with an error message.

Besides the proprietary fonts mentioned in the last paragraph, only
Wang's fonts can currently be configured to be not embedded so that
acroread on GNU/Linux accepts them. I have spent much time looking
into this issue, and acroread on GNU/Linux seems to do
font substitutions by looking at the capital letters in the font name.
Due to the missing PS name of Wang's fonts (and our dvipdfmx
work-around of 2003-08-11, which uses the file name -- which happens to
be all lowercase -- as the missing font name), they work by luck.

Both xpdf and ghostscript will substitute any missing fonts with a
specific font per language, if suitably configured. On Redhat 9, the
heavily adapted ghostscript will substitute automatically if some named
fonts from the CD are installed (without any extra effort); for xpdf it
takes a few extra lines of configuration in ${HOME}/.xpdfrc to tell it
what font to use from the X server for substituting a missing font for a
particular language. So ghostscript works out of the box for a full RH
installation, whereas xpdf doesn't, but xpdf is more configurable and
the setting of what fall-back font to use can differ per user.