Table of Contents
*****************

Liblouis User's and Programmer's Manual
1 Introduction
2 Test Programs
  2.1 lou_debug
  2.2 lou_checktable
  2.3 lou_allround
  2.4 lou_translate
  2.5 lou_checkhyphens
3 How to Write Translation Tables
  3.1 Hyphenation Tables
  3.2 Character-Definition Opcodes
  3.3 Braille Indicator Opcodes
  3.4 Emphasis Opcodes
  3.5 Special Symbol Opcodes
  3.6 Special Processing Opcodes
  3.7 Translation Opcodes
  3.8 Character-Class Opcodes
  3.9 Swap Opcodes
  3.10 The Context and Multipass Opcodes
  3.11 The correct Opcode
  3.12 Miscellaneous Opcodes
4 Notes on Back-Translation
5 Programming with liblouis
  5.1 License
  5.2 Overview
  5.3 Data structure of liblouis tables
  5.4 lou_version
  5.5 lou_translateString
  5.6 lou_translate
  5.7 lou_backTranslateString
  5.8 lou_backTranslate
  5.9 lou_hyphenate
  5.10 lou_logFile
  5.11 lou_logPrint
  5.12 lou_getTable
  5.13 lou_readCharFromFile
  5.14 lou_free
  5.15 Python bindings
Opcode Index
Function Index
Program Index

Liblouis User's and Programmer's Manual
***************************************

This manual is for liblouis (version 1.8.0, 18 November 2009), a
Braille Translation and Back-Translation Library derived from the Linux
screenreader BRLTTY.

     This file is free software; you can redistribute it and/or modify
     it under the terms of the GNU Lesser (or library) General Public
     License (LGPL) as published by the Free Software Foundation;
     either version 3, or (at your option) any later version.

     This file is distributed in the hope that it will be useful, but
     WITHOUT ANY WARRANTY; without even the implied warranty of
     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
     Lesser (or Library) General Public License (LGPL) for more details.

     You should have received a copy of the GNU Lesser (or Library)
     General Public License (LGPL) along with this program; see the
     file COPYING. If not, write to the Free Software Foundation, 51
     Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.

1 Introduction
**************

Liblouis is an open-source braille translator and back-translator
derived from the translation routines in the BRLTTY screenreader for
Linux. It has, however, gone far beyond these routines. It is named in
honor of Louis Braille. In Linux and Mac OS X it is a shared library,
and in Windows it is a DLL. For installation instructions see the
README file. Please report bugs and oddities to the maintainer,
<john.boyer@abilitiessoft.com>.

This documentation is derived from Chapter 7 of the BRLTTY manual,
but it has been extensively rewritten to cover new features.

Please read the following copyright and warranty information. Note
that this information also applies to all source code, tables and other
files in this distribution of liblouis. It applies similarly to the
sister library liblouisxml.

This file is maintained by John J. Boyer
<john.boyer@abilitiessoft.com>.

Persons who wish to program with liblouis but will not be writing
translation tables may want to skip ahead to *note Programming with
liblouis::.

2 Test Programs
***************

Five test programs are provided as part of the liblouis package. They
are intended for testing liblouis and for debugging tables. None of
them is suitable for braille transcription. An application that can be
used for transcription is `xml2brl', which is part of the liblouisxml
package (*note Introduction: (liblouisxml)Top.). The source code of the
test programs can be studied to learn how to use the liblouis library,
and the programs themselves perform the functions described below.

All of these programs recognize the `--help' and `--version' options.

`--help'
`-h'
     Print a usage message listing all available options, then exit
     successfully.

`--version'
`-v'
     Print the version number, then exit successfully.

2.1 lou_debug
=============

The lou_debug tool is intended for debugging liblouis translation
tables. The command line for lou_debug is:

     lou_debug [OPTIONS] TABLE[,TABLE,...]

The command line options that are accepted by lou_debug are described
in *note common options::.

The table (or comma-separated list of tables) is compiled. If no
errors are found a brief command summary is printed, followed by the
prompt `Command:'. You can then input one of the command letters and
get output, as described below.

Most of the commands print information in the various arrays of
`TranslationTableHeader'. Since these arrays are pointers to chains of
hashed items, the commands first print the hash number, then the first
item, then the next item chained to it, and so on. After each item
there is a prompt indicated by `=>'. You can then press enter (`<RET>')
to see the next item in the chain or the first item in the next chain.
Or you can press `h' (for next-(h)ash) to skip to the next hash chain.
You can also press `e' to exit the command and go back to the
`command:' prompt.

`h'
     Bring up a screen of somewhat more extensive help.

`f'
     Display the first forward-translation rule in the first non-empty
     hash bucket. The number of the bucket is displayed at the
     beginning of the chain. Each rule is identified by the word
     `Rule:'. The fields are displayed by phrases consisting of the
     name of the field, an equal sign, and its value. The before and
     after fields are displayed only if they are nonzero. Special
     opcodes such as the `correct' opcode (*note correct: correct
     opcode.) and the multipass opcodes are shown with the code that
     instructs the virtual machine that interprets them. If you want to
     see only the rules for a particular character string you can type
     `p' at the `command:' prompt. This will take you to the
     `particular:' prompt, where you can press `f' and then type in the
     string. The whole hash chain containing the string will be
     displayed.

`b'
     Display back-translation rules. This display is very similar to
     that of forward translation rules except that the dot pattern is
     displayed before the character string.

`c'
     Display character definitions, again within their hash chains.

`d'
     Display single-cell dot definitions. If a character-definition
     opcode gives a multi-cell dot pattern, it is displayed among the
     back-translation rules.

`C'
     Display the character-to-dots map. This is set up by the
     character-definition opcodes and can also be influenced by the
     `display' opcode (*note display: display opcode.).

`D'
     Display the dot-to-character map, which shows which single-cell
     dot patterns map to which characters.

`z'
     Show the multi-cell dot patterns which have been assigned to the
     characters from 0 to 255 to comply with computer braille codes
     such as a 6-dot code. Note that the character-definition opcodes
     should use 8-dot computer braille.

`p'
     Bring up a secondary (`particular:') prompt from which you can
     examine particular character strings, dot patterns, etc. The
     commands (given in its own command summary) are very similar to
     those of the main `command:' prompt, but you can type a character
     string or dot pattern. They include `h', `f', `b', `c', `d', `C',
     `D', `z' and `x' (to exit this prompt), but not `p', `i' and `m'.

`i'
     Show braille indicators. This shows the dot patterns for various
     opcodes such as the `capsign' opcode (*note capsign: capsign
     opcode.) and the `numsign' opcode (*note numsign: numsign opcode.).
     It also shows emphasis dot patterns, such as those for the
     `italword' opcode, the `firstletterbold' opcode (*note
     firstletterbold: firstletterbold opcode.), etc. If a given opcode
     has not been used nothing is printed for it.

`m'
     Display various miscellaneous information about the table, such as
     the number of passes, whether certain opcodes have been used, and
     whether there is a hyphenation table.

`q'
     Exit the program.

2.2 lou_checktable
==================

To use this program type the following:

     lou_checktable [OPTIONS] TABLE

The command line options that are accepted by lou_checktable are
described in *note common options::.

If the table contains errors, appropriate messages will be displayed.
If there are no errors the message `no errors found.' will be shown.

2.3 lou_allround
================

This program tests every capability of the liblouis library. It is
completely interactive. Invoke it as follows:

     lou_allround [OPTIONS]

The command line options that are accepted by lou_allround are
described in *note common options::.

You will see a few lines telling you how to use the program. Pressing
one of the letters in parentheses and then enter will take you to a
message asking for more information or for the answer to a yes/no
question. Typing the letter `r' and then <RET> will take you to a
screen where you can enter a line to be processed by the library and
then view the results.

2.4 lou_translate
=================

This program translates whatever is on standard input and prints the
result on standard output. It is intended for large-scale testing of
the accuracy of translation and back-translation. The command line for
lou_translate is:

     lou_translate [OPTION] TABLE

Aside from the standard options (*note common options::) this program
also accepts the following options:

`--forward'
`-f'
     Do a forward translation.

`--backward'
`-b'
     Do a backward translation.

To use it to translate or back-translate a file use a line like:

     lou_translate --forward en-us-g2.ctb <liblouis.txt >testtrans

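The same redirection can be driven from a script when many files have
to be checked. The following Python sketch is only an illustration: the
helper names `build_lou_translate_cmd' and `translate_file' are
hypothetical, and it assumes that the lou_translate program is
installed and on the search path. Only the `--forward' and `--backward'
options described above are used.

```python
import subprocess


def build_lou_translate_cmd(table, backward=False):
    """Assemble the lou_translate argument list (hypothetical helper)."""
    direction = "--backward" if backward else "--forward"
    return ["lou_translate", direction, table]


def translate_file(table, src_path, dst_path, backward=False):
    """Feed src_path to lou_translate on stdin and write stdout to dst_path.

    Assumes lou_translate is installed and on PATH.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        subprocess.run(
            build_lou_translate_cmd(table, backward),
            stdin=src,
            stdout=dst,
            check=True,  # raise if lou_translate exits with an error
        )
```

Calling `translate_file("en-us-g2.ctb", "liblouis.txt", "testtrans")'
would then correspond to the command line shown above.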

2.5 lou_checkhyphens
====================

This program checks the accuracy of hyphenation in Braille translation
for both translated and untranslated words. It is completely
interactive. Invoke it as follows:

     lou_checkhyphens [OPTIONS]

The command line options that are accepted by lou_checkhyphens are
described in *note common options::.

You will see a few lines telling you how to use the program.

3 How to Write Translation Tables
*********************************

Many translation (contraction) tables have already been made up. They
are included in this distribution in the tables directory and should be
studied as part of the documentation. The most helpful (and normative)
are listed in the following table:

`chardefs.cti'
     Character definitions for U.S. tables

`compress.ctb'
     Remove excessive white-space

`en-us-g1.ctb'
     Uncontracted American English

`en-us-g2.ctb'
     Contracted or Grade 2 American English

`en-us-brf.dis'
     Make liblouis output conform to BRF standard

`en-us-comp8.ctb'
     8-dot computer braille for use in coding examples

`en-us-comp6.ctb'
     6-dot computer braille

`nemeth.ctb'
     Nemeth Code translation for use with liblouisxml

`nemeth_edit.ctb'
     Fixes errors at the boundaries of math and text

The names used for files containing translation tables are completely
arbitrary. They are not interpreted in any way by the translator.
Contraction tables may be 8-bit ASCII files, 16-bit big-endian Unicode
files or 16-bit little-endian Unicode files. Blank lines are ignored.
Any leading and trailing white-space (any number of blanks and/or tabs)
is ignored. Lines which begin with a number sign or hatch mark (`#')
are ignored, i.e. they are comments. If the number sign is not the
first non-blank character in the line, it is treated as an ordinary
character. If the first non-blank character is less-than (`<') the line
is also treated as a comment. This makes it possible to mark up tables
as XHTML documents. Lines which are not blank or comments define table
entries. The general format of a table entry is:

     opcode operands comments

Table entries may not be split between lines. The opcode is a
mnemonic that specifies what the entry does. The operands may be
character sequences, braille dot patterns or occasionally something
else. They are described for each opcode. With some exceptions, opcodes
expect a certain number of operands. Any text on the line after the last
operand is ignored, and may be a comment. A few opcodes accept a
variable number of operands. In this case a number sign begins a
comment unless it is preceded by a backslash (`\'). *Note Opcode
Index::, for a list of opcodes, with a link to each one.

Here are some examples of table entries.

     # This is a comment.
     always world 456-2456 A word and the dot pattern of its contraction

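The line-level rules above (blank lines, `#' and `<' comments, and
entries that start with an opcode) can be sketched in a few lines of
Python. This is an illustration of the rules as stated in this manual,
not the parser used by liblouis; the function names are hypothetical.

```python
def classify_table_line(line):
    """Return 'blank', 'comment' or 'entry' for one line of a table file."""
    stripped = line.strip()  # leading and trailing white-space is ignored
    if not stripped:
        return "blank"
    if stripped[0] in "#<":  # '#' or '<' as first non-blank character
        return "comment"
    return "entry"


def split_entry(line):
    """Split a table entry into its opcode and the rest of the line.

    The rest still holds the operands and any trailing comment; how many
    fields are operands depends on the opcode, so it is left unsplit.
    """
    parts = line.strip().split(None, 1)
    opcode = parts[0]
    rest = parts[1] if len(parts) > 1 else ""
    return opcode, rest
```

For the second example entry, `split_entry' yields the opcode `always'
and leaves `world 456-2456 A word and the dot pattern of its
contraction' to be interpreted according to that opcode.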

Most opcodes have both a "characters" operand and a "dots" operand,
though some have only one and a few have other types.

The characters operand consists of any combination of characters and
escape sequences preceded and followed by whitespace. Escape sequences
are used to represent difficult characters. They begin with a backslash
(`\'). They are:

`\\'
     backslash

`\f'
     form feed

`\n'
     new line

`\r'
     carriage return

`\s'
     blank (space)

`\t'
     horizontal tab

`\v'
     vertical tab

`\e'
     "escape" character (hex 1b, dec 27)

`\xhhhh'
     4-digit hexadecimal value of a character

If liblouis has been compiled for 32-bit Unicode the following are
also recognized.

`\yhhhhh'
     5-digit (20 bit) character

`\zhhhhhhhh'
     Full 32-bit value.

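As an illustration of the escape sequences listed above, the following
Python sketch decodes a characters operand into the string it stands
for. It is not the liblouis implementation. The function name
`decode_characters' is hypothetical, a literal backslash is assumed to
be written as a doubled backslash, and the `\y' and `\z' forms are
always accepted here although liblouis recognizes them only when
compiled for 32-bit Unicode.

```python
import string

SIMPLE_ESCAPES = {
    "\\": "\\",   # backslash
    "f": "\f",    # form feed
    "n": "\n",    # new line
    "r": "\r",    # carriage return
    "s": " ",     # blank (space)
    "t": "\t",    # horizontal tab
    "v": "\v",    # vertical tab
    "e": "\x1b",  # "escape" character
}
HEX_WIDTHS = {"x": 4, "y": 5, "z": 8}  # number of hex digits per form


def decode_characters(operand):
    """Expand the escape sequences in a characters operand."""
    out = []
    i = 0
    while i < len(operand):
        ch = operand[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(operand):
            raise ValueError("backslash at end of operand")
        code = operand[i + 1]
        if code in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[code])
            i += 2
        elif code in HEX_WIDTHS:
            width = HEX_WIDTHS[code]
            digits = operand[i + 2 : i + 2 + width]
            if len(digits) != width or any(d not in string.hexdigits for d in digits):
                raise ValueError("expected %d hex digits after \\%s" % (width, code))
            out.append(chr(int(digits, 16)))
            i += 2 + width
        else:
            raise ValueError("unknown escape sequence \\" + code)
    return "".join(out)
```

For example, the operand `a\sb' decodes to the three characters `a',
blank, `b', and `\x0041' decodes to the letter `A'.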

The dots operand is a braille dot pattern. The real braille dots, 1
through 8, must be specified with their standard numbers. Liblouis
recognizes "virtual dots," which are used for special purposes, such as
distinguishing accent marks. There are seven virtual dots. They are
specified by the number 9 and the letters `a' through `f'. For a
multi-cell dot pattern, the cell specifications must be separated from
one another by a dash (`-'). For example, the contraction for the
English word `lord' (the letter `l' preceded by dot 5) would be
specified as 5-123. A space may be specified with the special dot
number 0.

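The structure of a dots operand can be made concrete with a short
Python sketch. It follows only the rules stated above (dots 1 through
8, virtual dots 9 and `a' through `f', cells separated by a dash, and 0
for a space); it is not the liblouis parser, and the name `parse_dots'
is hypothetical.

```python
REAL_DOTS = "12345678"
VIRTUAL_DOTS = "9abcdef"


def parse_dots(operand):
    """Split a dots operand into cells, each a frozenset of dot names."""
    cells = []
    for spec in operand.split("-"):
        if spec == "0":
            cells.append(frozenset())  # 0 stands for a space: no dots
            continue
        if not spec:
            raise ValueError("empty cell in dot pattern %r" % operand)
        for ch in spec:
            if ch not in REAL_DOTS and ch not in VIRTUAL_DOTS:
                raise ValueError("invalid dot %r in %r" % (ch, operand))
        cells.append(frozenset(spec))
    return cells
```

With this sketch the pattern 5-123 for `lord' becomes two cells, the
first holding dot 5 and the second holding dots 1, 2 and 3.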

An opcode which is helpful in writing translation tables is
`include'. Its format is:

     include filename

It reads the file indicated by `filename' and incorporates or
includes its entries into the table. Included files can include other
files, which can include other files, etc. For an example, see what
files are included by the entry `include en-us-g1.ctb' in the table
`en-us-g2.ctb'. If the included file is not in the same directory as
the main table, use a full pathname for filename.

The order of the various types of opcodes or table entries is
important. Character-definition opcodes should come first. However, if
the optional `display' opcode (*note display: display opcode.) is used
it should precede character-definition opcodes. Braille-indicator
opcodes should come next. Translation opcodes should follow. The
`context' opcode (*note context: context opcode.) is a translation
opcode, even though it is considered along with the multipass opcodes.
These latter should follow the translation opcodes. The `correct'
opcode (*note correct: correct opcode.) can be used anywhere after the
character-definition opcodes, but it is probably a good idea to group
all `correct' opcodes together. The `include' opcode (*note include:
include opcode.) can be used anywhere, but the order of entries in the
combined table must conform to the order given above. Within each type
of opcode, the order of entries is generally unimportant. Thus the
translation entries can be grouped alphabetically or in any other order
that is convenient.

3.1 Hyphenation Tables
======================

Hyphenation tables are necessary to make opcodes such as the `nocross'
opcode (*note nocross: nocross opcode.) function properly. There are no
opcodes for hyphenation table entries because these tables have a
special format. Therefore, they cannot be specified as part of an
ordinary table. Rather, they must be included using the `include'
opcode (*note include: include opcode.). Hyphenation tables must
follow character definitions. For an example of a hyphenation table,
see `hyph_en_US.dic'.

3.2 Character-Definition Opcodes
================================

These opcodes are needed to define attributes such as digit,
punctuation, letter, etc. for all characters and their dot patterns.
Liblouis has no built-in character definitions, but such definitions
are essential to the operation of the `context' opcode (*note context:
context opcode.), the `correct' opcode (*note correct: correct
opcode.), the multipass opcodes and the back-translator.

If the dot pattern is a single cell, it is used to define the mapping
between dot patterns and characters, unless a `display' opcode (*note
display: display opcode.) for that character-dot-pattern pair has been
used previously. If only a single-cell dot pattern has been given for a
character, that dot pattern is defined with the character's own
attributes. If more than one cell is given and some of them have not
previously been defined as single cells, the undefined cells are
entered into the dots table with the space attribute. This is done for
backward compatibility with old tables, but it may cause problems with
the above opcodes or back-translation. For this reason, every
single-cell dot pattern should be defined before it is used in a
multi-cell character representation. The best way to do this is to use
the 8-dot computer braille representation for the particular braille
code.

If a character or dot pattern used in any rule, except those with
the `display' opcode, the `repeated' opcode (*note repeated: repeated
opcode.) or the `replace' opcode (*note replace: replace opcode.), is
not defined by one of the character-definition opcodes, liblouis will
give an error message and refuse to continue until the problem is
fixed. If the translator or back-translator encounters an undefined
character in its input it produces a succinct error indication in its
output, and the character is treated as a space.

`space character dots'
     Defines a character as a space and also defines the dot pattern as
     such. For example:

          space \s 0 \s is the escape sequence for blank; 0 means no dots.

`punctuation character dots'
     Associates a punctuation mark in the particular language with a
     braille representation and defines the character and dot pattern as
     punctuation. For example:

          punctuation . 46 dot pattern for period in NAB computer braille

`digit character dots'
     Associates a digit with a dot pattern and defines the character as
     a digit. For example:

          digit 0 356 NAB computer braille

`uplow characters dots [,dots]'
     The characters operand must be a pair of letters, of which the
     first is uppercase and the second lowercase. The first dots
     suboperand indicates the dot pattern for the upper-case letter. It
     may have more than one cell. The second dots suboperand must be
     separated from the first by a comma and is optional, as indicated
     by the square brackets. If present, it indicates the dot pattern
     for the lower-case letter. It may also have more than one cell. If
     the second dots suboperand is not present the first is used for
     the lower-case letter as well as the upper-case letter. This
     opcode is needed because not all languages follow a consistent
     pattern in assigning Unicode codes to upper and lower case
     letters. It should be used even for languages that do. The
     distinction is important in the forward translator. For example:

          uplow Aa 17,1

`grouping name characters dots,dots'
     This opcode is used to indicate pairs of grouping symbols used in
     processing mathematical expressions. These symbols are usually
     generated by the MathML interpreter in liblouisxml. They are used
     in multipass opcodes. The name operand must contain only letters,
     but they may be upper- or lower-case. The characters operand must
     contain exactly two Unicode characters. The dots operand must
     contain exactly two braille cells, separated by a comma. Note that
     grouping dot patterns also need to be declared with the exactdots
     opcode. The characters may need to be declared with the math
     opcode. For example:

          grouping mrow \x0001\x0002 1e,2e
          grouping mfrac \x0003\x0004 3e,4e

`letter character dots'
     Associates a letter in the language with a braille representation
     and defines the character as a letter. This is intended for
     letters which are neither uppercase nor lowercase.

`lowercase character dots'
     Associates a character with a dot pattern and defines the
     character as a lowercase letter. Both the character and the dot
     pattern have the attributes lowercase and letter.

`uppercase character dots'
     Associates a character with a dot pattern and defines the
     character as an uppercase letter. Both the character and the dot
     pattern have the attributes uppercase and letter. `lowercase' and
     `uppercase' should be used when a letter has only one case.
     Otherwise use the `uplow' opcode (*note uplow: uplow opcode.).

`litdigit digit dots'
     Associates a digit with the dot pattern which should be used to
     represent it in literary texts. For example:

          litdigit 0 245
          litdigit 1 1

`sign character dots'
     Associates a character with a dot pattern and defines both as a
     sign. This opcode should be used for things like at sign (`@'),
     percent (`%'), dollar sign (`$'), etc. Do not use it to define
     ordinary punctuation such as period and comma. For example:

          sign % 4-25-1234 literary percent sign

`math character dots'
     Associates a character and a dot pattern and defines them as a
     mathematical symbol. It should be used for less than (`<'),
     greater than (`>'), equals (`='), plus (`+'), etc. For example:

          math + 346 plus

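Of the character-definition opcodes, `uplow' has the least obvious
operand layout, so a small Python sketch may help. It only restates the
rule given above: two letters, then one dot pattern or two separated by
a comma, with the first pattern reused when the second is absent. The
function name `parse_uplow' is hypothetical, and the sketch assumes the
two letters are written literally rather than as escape sequences.

```python
def parse_uplow(entry):
    """Map the upper- and lower-case letter of an uplow entry to its dots."""
    fields = entry.split()
    if len(fields) < 3 or fields[0] != "uplow":
        raise ValueError("not an uplow entry: %r" % entry)
    characters, dots = fields[1], fields[2]  # later fields are a comment
    if len(characters) != 2:
        raise ValueError("uplow needs exactly two letters: %r" % characters)
    upper, lower = characters
    upper_dots, comma, lower_dots = dots.partition(",")
    if not comma:
        lower_dots = upper_dots  # one pattern serves both cases
    return {upper: upper_dots, lower: lower_dots}
```

Applied to the entry `uplow Aa 17,1' this gives dots 17 for `A' and dot
1 for `a'.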

3.3 Braille Indicator Opcodes
=============================

Braille indicators are dot patterns which are inserted into the braille
text to indicate such things as capitalization, italic type, computer
braille, etc. The opcodes which define them are followed only by a dot
pattern, which may be one or more cells.

`capsign dots'
     The dot pattern which indicates capitalization of a single letter.
     In English, this is dot 6. For example:

          capsign 6

`begcaps dots'
     The dot pattern which begins a block of capital letters. For
     example:

          begcaps 6-6

`endcaps dots'
     The dot pattern which ends a block of capital letters within a
     word. For example:

          endcaps 6-3

`letsign dots'
     This indicator is needed in Grade 2 to show that a single letter is
     not a contraction. It is also used when an abbreviation happens to
     be a sequence of letters that is the same as a contraction. For
     example:

          letsign 56

`noletsign letters'
     The letters in the operand will not be preceded by a letter sign.
     More than one `noletsign' opcode can be used. This is equivalent
     to a single entry containing all the letters. In addition, if a
     single letter, such as `a' in English, is defined as a `word'
     (*note word: word opcode.) or `largesign' (*note largesign:
     largesign opcode.), it will be treated as though it had also been
     specified in a `noletsign' entry.

`noletsignbefore characters'
     If any of the characters precedes a single letter without a space a
     letter sign is not used. By default the characters apostrophe
     (`'') and period (`.') have this property. Use of a
     `noletsignbefore' entry cancels the defaults. If more than one
     `noletsignbefore' entry is used, the characters in all entries are
     combined.

`noletsignafter characters'
     If any of the characters follows a single letter without a space a
     letter sign is not used. By default the characters apostrophe
     (`'') and period (`.') have this property. Use of a
     `noletsignafter' entry cancels the defaults. If more than one
     `noletsignafter' entry is used, the characters in all entries are
     combined.

`numsign dots'
     The translator inserts this indicator before numbers made up of
     digits defined with the `litdigit' opcode (*note litdigit:
     litdigit opcode.) to show that they are a number and not letters
     or some other symbols. For example:

          numsign 3456

3.4 Emphasis Opcodes
====================

These also define braille indicators, but they require more
explanation. There are four sets, for italic, bold, underline and
computer braille. In each of the first three sets there are seven
opcodes, for use before the first word of a phrase, for use before the
last word, for use after the last word, for use before the first letter
(or character) if emphasis starts in the middle of a word, for use
after the last letter (or character) if emphasis ends in the middle of
a word, before a single letter (or character), and to specify the
length of a phrase to which the first-word and last-word-before
indicators apply. This rather elaborate set of emphasis opcodes was
devised to try to meet all contingencies. It is unlikely that a
translation table will contain all of them. The translator checks for
their presence. If they are present, it first looks to see if the
single-letter indicator should be used. Then it looks at the word (or
phrase) indicators and finally at the multi-letter indicators.

The translator will apply up to two emphasis indicators to each
phrase or string of characters, depending on what the `typeform'
parameter in its calling sequence indicates (*note Programming with
liblouis::).

For computer braille there are only two braille indicators, for the
beginning and end of a sequence of characters to be rendered in
computer braille. Such a sequence may also have other emphasis. The
computer braille indicators are applied not only when computer braille
is indicated in the `typeform' parameter, but also when a sequence of
characters is determined to be computer braille because it contains a
subsequence defined by the `compbrl' opcode (*note compbrl: compbrl
opcode.) or the `literal' opcode (*note literal: literal opcode.).

Here are the various emphasis opcodes.

`firstwordital dots'
     This is the braille indicator to be placed before the first word
     of an italicized phrase that is longer than the value given in the
     `lenitalphrase' opcode (*note lenitalphrase: lenitalphrase
     opcode.). For example:

          firstwordital 46-46 English indicator

`lastworditalbefore dots'
`italsign dots'
     These two opcodes are synonyms. This is the braille indicator to be
     placed before the last word of an italicized phrase. In addition,
     if `firstwordital' is not used, this braille indicator is doubled
     and placed before the first word. Do not use `lastworditalbefore'
     and `lastworditalafter' in the same table. For example:

          lastworditalbefore 4-6

`lastworditalafter dots'
     This is the braille indicator to be placed after the last word of
     an italicized phrase. Do not use `lastworditalbefore' and
     `lastworditalafter' in the same table. See also the
     `lenitalphrase' opcode (*note lenitalphrase: lenitalphrase
     opcode.) for more information.

`firstletterital dots'
`begital dots'
     These two opcodes are synonyms. This is the braille indicator to be
     placed before the first letter (or character) if italicization
     begins in the middle of a word.

`lastletterital dots'
`endital dots'
     These two opcodes are synonyms. This is the braille indicator to be
     placed after the last letter (or character) when italicization
     ends in the middle of a word.

`singleletterital dots'
     This braille indicator is used if only a single letter (or
     character) is italicized.

`lenitalphrase number'
     If `lastworditalbefore' is used, an italicized phrase is checked
     to see how many words it contains. If this number is less than or
     equal to the number given in the `lenitalphrase' opcode, the
     `lastworditalbefore' sign is placed in front of each word. If it
     is greater, the `firstwordital' indicator is placed before the
     first word and the `lastworditalbefore' indicator is placed before
     the last word. Note that if the `firstwordital' opcode is not used
     its indicator is made up by doubling the dot pattern given in the
     `lastworditalbefore' entry. For example:

          lenitalphrase 4

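The interaction of `firstwordital', `lastworditalbefore' and
`lenitalphrase' can be sketched as follows. This Python fragment is an
illustration of the rule described in this section, not the translator's
code. The function names are hypothetical. It assumes that the
`lastworditalbefore' indicator goes before the last word of a long
phrase, as the definition of that opcode states, and that "doubling" a
pattern means repeating it as an additional cell sequence. Single-letter
and mid-word indicators are not covered.

```python
def double_pattern(dots):
    """Double a dot pattern, e.g. '46' becomes '46-46' (assumed meaning)."""
    return dots + "-" + dots


def italic_phrase_marks(words, lenitalphrase, lastworditalbefore,
                        firstwordital=None):
    """Pair each word of an italicized phrase with the indicator before it.

    Returns a list of (indicator, word) tuples; indicator is None when
    nothing is placed before that word.
    """
    if lenitalphrase < 1:
        raise ValueError("lenitalphrase is assumed to be at least 1")
    if not words:
        return []
    if len(words) <= lenitalphrase:
        # Short phrase: the same sign goes in front of every word.
        return [(lastworditalbefore, word) for word in words]
    # Long phrase: mark only the first and the last word.
    if firstwordital is None:
        firstwordital = double_pattern(lastworditalbefore)
    marks = [(None, word) for word in words]
    marks[0] = (firstwordital, words[0])
    marks[-1] = (lastworditalbefore, words[-1])
    return marks
```

With `lenitalphrase 4' and an indicator of 46, a two-word phrase gets 46
before both words, while a five-word phrase gets 46-46 before the first
word and 46 before the last.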

`firstwordbold dots'
     This is the braille indicator to be placed before the first word
     of a bold phrase. For example:

          firstwordbold 456-456

`lastwordboldbefore dots'
`boldsign dots'
     These two opcodes are synonyms. This is the braille indicator to be
     placed before the last word of a bold phrase. In addition, if
     `firstwordbold' is not used, this braille indicator is doubled and
     placed before the first word. Do not use `lastwordboldbefore' and
     `lastwordboldafter' in the same table. For example:

          lastwordboldbefore 456

`lastwordboldafter dots'
     This is the braille indicator to be placed after the last word of a
     bold phrase. Do not use `lastwordboldbefore' and
     `lastwordboldafter' in the same table.

`firstletterbold dots'
`begbold dots'
     These two opcodes are synonyms. This is the braille indicator to be
     placed before the first letter (or character) if bold emphasis
     begins in the middle of a word.

`lastletterbold dots'
`endbold dots'
     These two opcodes are synonyms. This is the braille indicator to be
     placed after the last letter (or character) when bold emphasis
     ends in the middle of a word.

`singleletterbold dots'
     This braille indicator is used if only a single letter (or
     character) is in boldface.

772

773

`lenboldphrase number'

774

If `lastwordboldbefore' is used, a bold phrase is checked to see

775

how many words it contains. If this number is less than or equal to

776

the number given in the `lenboldphrase' opcode, the

777

`lastwordboldbefore' sign is placed in front of each word. If it

778

is greater, the `firstwordbold' indicator is placed before the

779

first word and the `lastwordboldbefore' indicator is placed after

780

the last word. Note that if the `firstwordbold' opcode is not used

781

its indicator is made up by doubling the dot pattern given in the

782

`lastwordboldbefore' entry.

783

784

`firstwordunder dots'

785

This is the braille indicator to be placed before the first word

786

of an underlined phrase.

787

788

`lastwordunderbefore dots'

789

`undersign dots'

790

These two opcodes are synonyms. This is the braille indicator to be

791

placed before the last word of an underlined phrase. In addition,

792

if `firstwordunder' is not used, this braille indicator is doubled

793

and placed before the first word.

794

795

`lastwordunderafter dots'

796

This is the braille indicator to be placed after the last word of

797

an underlined phrase.

798

799

`firstletterunder dots'

800

`begunder dots'

801

These two opcodes are synonyms. This is the braille indicator to be

802

placed before the first letter (or character) if underline emphasis

803

begins in the middle of a word.

804

805

`lastletterunder dots'

806

`endunder dots'

807

These two opcodes are synonyms. This is the braille indicator to be

808

placed after the last letter (or character) when underline emphasis

809

ends in the middle of a word.

810

811

`singleletterunder dots'

812

This braille indicator is used if only a single letter (or

813

character) is underlined.

814

815

`lenunderphrase number'

816

If `lastwordunderbefore' is used, an underlined phrase is checked

817

to see how many words it contains. If this number is less than or

818

equal to the number given in the `lenunderphrase' opcode, the

819

`lastwordunderbefore' sign is placed in front of each word. If it

820

is greater, the `firstwordunder' indicator is placed before the

821

first word and the `lastwordunderbefore' indicator is placed after

822

the last word. Note that if the `firstwordunder' opcode is not

823

used its indicator is made up by doubling the dot pattern given in

824

the `lastwordunderbefore' entry.

825

826

`begcomp dots'

827

This braille indicator is placed before a sequence of characters

828

translated in computer braille, whether this sequence is indicated

829

in the `typeform' parameter (*note Programming with liblouis::) or

830

inferred because it contains a subsequence specified by the

831

`compbrl' opcode (*note compbrl: compbrl opcode.).

832

833

`endcomp dots'

834

This braille indicator is placed after a sequence of characters

835

translated in computer braille, whether this sequence is indicated

836

in the `typeform' parameter (*note Programming with liblouis::) or

837

inferred because it contains a subsequence specified by the

838

`compbrl' opcode (*note compbrl: compbrl opcode.).

839

840

841

3.5 Special Symbol Opcodes

842

==========================

843

844

These opcodes define certain symbols, such as the decimal point, which

845

require special treatment.

846

847

`decpoint character dots'

848

This opcode defines the decimal point. The character operand must

849

have only one character. For example, in `en-us-g1.ctb' we have:

850

851

decpoint . 46

852

853

`hyphen character dots'

854

This opcode defines the hyphen, that is, the character used in

855

compound words such as have-nots. The back-translator uses it to

856

determine the end of individual words.

857

858

859

3.6 Special Processing Opcodes

860

==============================

861

862

These opcodes cause special processing to be carried out.

863

864

`capsnocont'

865

This opcode has no operands. If it is specified, words or parts of

866

words in all caps are not contracted. This is needed for languages

867

such as Norwegian.

868

869

870

3.7 Translation Opcodes

871

=======================

872

873

These opcodes define the braille representations for character

874

sequences. Each of them defines an entry within the contraction table.

875

These entries may be defined in any order except, as noted below, when

876

they define alternate representations for the same character sequence.

877

878

Each of these opcodes specifies a condition under which the

879

translation is legal, and each also has a characters operand and a dots

880

operand. The text being translated is processed strictly from left to

881

right, character by character, with the most eligible entry for each

882

position being used. If there is more than one eligible entry for a

883

given position in the text, then the one with the longest character

884

string is used. If there is more than one eligible entry for the same

885

character string, then the one defined first is tested for legality

886

first. (This is the only case in which the order of the entries makes a

887

difference.)
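The selection rule in this paragraph (longest character string first, then table order) can be sketched as follows. This is a simplified Python model, not the liblouis implementation; the name `pick_entry' and the representation of an entry as a tuple with an `is_legal' callback are assumptions made for the example.

```python
def pick_entry(text, pos, entries):
    """Return (characters, dots) for the entry used at text[pos:], or None.

    entries is a list of (characters, dots, is_legal) in table order, where
    is_legal(text, pos, length) models the condition imposed by the opcode.
    """
    candidates = [e for e in entries if text.startswith(e[0], pos)]
    # Longest character string first. list.sort() is stable, so entries for
    # the same string keep their table order and the first one defined is
    # tested for legality first.
    candidates.sort(key=lambda e: -len(e[0]))
    for chars, dots, is_legal in candidates:
        if is_legal(text, pos, len(chars)):
            return chars, dots
    return None
```

A shorter entry is chosen only when no longer entry at the same position is legal, which is why the order of entries matters only for identical character strings.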

888

889

The characters operand is a sequence or string of characters preceded

890

and followed by whitespace. Each character can be entered in the normal

891

way, or it can be defined as a four-digit hexadecimal number preceded

892

by `\x'.
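As an illustration of the `\x' notation, the following Python sketch expands four-digit hexadecimal escapes in a characters operand. It is a model for explanation only; the function name `decode_characters' is hypothetical, and other escape sequences used in tables (such as `\s') are deliberately not handled here.

```python
import re


def decode_characters(operand):
    r"""Expand \xHHHH escapes (exactly four hex digits) in an operand."""
    return re.sub(r"\\x([0-9A-Fa-f]{4})",
                  lambda m: chr(int(m.group(1), 16)),
                  operand)
```

For instance, an operand written as `caf\x00e9' would stand for the four characters of "café".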

893

894

The dots operand defines the braille representation for the

895

characters operand. It may also be specified as an equals sign (`=').

896

This means that the default representation for each character

897

(*note Character-Definition Opcodes::) within the sequence is to be

898

used.
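To make the dots notation concrete, the sketch below converts a dots operand such as `456-2456' into Unicode braille characters, one per hyphen-separated cell. This is not part of liblouis; the function name `dots_to_cells' is an assumption, and the sketch accepts only dots 1 through 8. The equals sign, virtual dots and any other notation are outside its scope.

```python
def dots_to_cells(dots):
    """Convert a dots operand such as '456-2456' to Unicode braille cells."""
    cells = []
    for cell in dots.split("-"):
        bits = 0
        for d in cell:
            if d not in "12345678":
                raise ValueError("unsupported dot: " + repr(d))
            # In the Unicode braille block, dot n sets bit n-1 above U+2800.
            bits |= 1 << (int(d) - 1)
        cells.append(chr(0x2800 + bits))
    return "".join(cells)
```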

899

900

In what follows the word `characters' means a sequence of one or

901

more consecutive letters between spaces and/or punctuation marks.

902

903

`noback opcode ...'

904

This is an opcode prefix, that is to say, it modifies the

905

operation of the opcode that follows it on the same line. noback

906

specifies that no back-translation is to be done using this line.

907

908

noback always ;\s; 0

909

910

`nofor opcode ...'

911

This is an opcode prefix which modifies the operation of the opcode

912

following it on the same line. nofor specifies that forward

913

translation is not to use the information on this line.

914

915

`compbrl characters'

916

`literal characters'

917

These two opcodes are synonyms. If the characters are found within

918

a block of text surrounded by whitespace the entire block is

919

translated according to the default braille representations

920

defined by the *note Character-Definition Opcodes::, if 8-dot

921

computer braille is enabled or according to the dot patterns given

922

in the `comp6' opcode (*note comp6: comp6 opcode.), if 6-dot

923

computer braille is enabled. For example:

924

925

compbrl www translate URLs in computer braille

926

927

`comp6 character dots'

928

This opcode specifies the translation of characters in 6-dot

929

computer braille. It is necessary because the translation of a

930

single character may require more than one cell. The first operand

931

must be a character with a decimal representation from 0 to 255

932

inclusive. The second operand may specify as many cells as

933

necessary. The opcode is somewhat of a misnomer, since any dots,

934

not just dots 1 through 6, can be specified. This even includes

935

virtual dots.

936

937

`nocont characters'

938

Like `compbrl', except that the string is uncontracted. `prepunc'

939

opcode (*note prepunc: prepunc opcode.) and `postpunc' opcode

940

(*note postpunc: postpunc opcode.) rules are applied, however.

941

This is useful for specifying that foreign words should not be

942

contracted in an entire document.

943

944

`replace characters {characters}'

945

Replace the first set of characters, no matter where they appear,

946

with the second. Note that the second operand is _NOT_ a dot

947

pattern. It is also optional. If it is omitted the character(s)

948

in the first operand will be discarded. This is useful for

949

ignoring characters. It is possible that the "ignored" characters

950

may still affect the translation indirectly. Therefore, it is

951

preferable to use `correct' opcode (*note correct: correct

952

opcode.).

953

954

`always characters dots'

955

Replace the characters with the dot pattern no matter where they

956

appear. Do _NOT_ use an entry such as `always a 1'. Use the

957

`uplow', `letter', etc. character definition opcodes instead. For

958

example:

959

960

always world 456-2456 unconditional translation

961

962

`repeated characters dots'

963

Replace the characters with the dot pattern no matter where they

964

appear. Ignore any consecutive repetitions of the same character

965

sequence. This is useful for shortening long strings of spaces or

966

hyphens or periods. For example:

967

968

repeated --- 36-36-36 shorten separator lines made with hyphens
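The effect of `repeated' on a run of identical sequences can be sketched in Python as follows. This is a simplified model rather than the liblouis code; the function name `apply_repeated' is hypothetical and the `replacement' string merely stands in for the braille output of the entry.

```python
def apply_repeated(text, chars, replacement):
    """Replace each run of consecutive `chars` with a single replacement."""
    if not chars:
        raise ValueError("chars must not be empty")
    out = []
    i = 0
    while i < len(text):
        if text.startswith(chars, i):
            out.append(replacement)
            # Skip this occurrence and any immediate repetitions of it.
            while text.startswith(chars, i):
                i += len(chars)
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

With the entry shown above, a separator line of nine hyphens would therefore produce the three-cell pattern once rather than three times.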

969

970

`repword characters dots'

971

When characters are encountered check to see if the word before

972

this string matches the word after it. If so, replace characters

973

with dots and eliminate the second word and any word following

974

another occurrence of characters that is the same. This opcode is

975

used in Malaysian braille. In this case the rule is:

976

977

repword - 123456

978

979

`largesign characters dots'

980

Replace the characters with the dot pattern no matter where they

981

appear. In addition, if two words defined as large signs follow

982

each other, remove the space between them. For example, in

983

`en-us-g2.ctb' the words `and' and `the' are both defined as large

984

signs. Thus, in the phrase `the cat and the dog' the space would

985

be deleted between `and' and `the', with the result `the cat

986

andthe dog'. Of course, `and' and `the' would be properly

987

contracted. The term `largesign' is a bit of braille jargon that

988

pleases braille experts.
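The space-removal part of `largesign' can be illustrated with a short Python sketch. It models only the spacing behaviour on print text and leaves the contraction itself aside; the function name `join_large_signs' and the use of a plain set of words are assumptions made for the example.

```python
def join_large_signs(text, large_signs):
    """Drop the space between two adjacent words that are both large signs."""
    words = text.split(" ")
    out = words[0]
    for prev, word in zip(words, words[1:]):
        if prev in large_signs and word in large_signs:
            out += word            # both are large signs: no space between
        else:
            out += " " + word
    return out
```

Applied to the phrase from the paragraph above with `and' and `the' as large signs, the sketch yields `the cat andthe dog'.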

989

990

`word characters dots'

991

Replace the characters with the dot pattern if they are a word,

992

that is, are surrounded by whitespace and/or punctuation.

993

994

`syllable characters dots'

995

As its name indicates, this opcode defines a "syllable" which must

996

be represented by exactly the dot patterns given. Contractions may

997

not cross the boundaries of this "syllable" either from left or

998

right. The character string defined by this opcode need not be a

999

lexical syllable, though it usually will be. The equal sign in the

1000

following example means that the default representation for

1001

each character within the sequence is to be used (*note

1002

Translation Opcodes::):

1003

1004

syllable horse = sawhorse, horseradish

1005

1006

`nocross characters dots'

1007

Replace the characters with the dot pattern if the characters are

1008

all in one syllable (do not cross a syllable boundary). For this

1009

opcode to work, a hyphenation table must be included. If this is

1010

not done, `nocross' behaves like the `always' opcode (*note

1011

always: always opcode.). For example, if the English Grade 2 table

1012

is being used and the appropriate hyphenation table has been

1013

included, `nocross sh 146' will cause the `sh' in `monkshood' not

1014

to be contracted.

1015

1016

`joinword characters dots'

1017

Replace the characters with the dot pattern if they are a word

1018

which is followed by whitespace and a letter. In addition remove

1019

the whitespace. For example, `en-us-g2.ctb' has `joinword to 235'.

1020

This means that if the word `to' is followed by another word the

1021

contraction is to be used and the space is to be omitted. If these

1022

conditions are not met, the word is translated according to any

1023

other opcodes that may apply to it.

1024

1025

`lowword characters dots'

1026

Replace the characters with the dot pattern if they are a word

1027

preceded and followed by whitespace. No punctuation either before

1028

or after the word is allowed. The term `lowword' derives from the

1029

fact that in English these contractions are written in the lower

1030

part of the cell. For example:

1031

1032

lowword were 2356

1033

1034

`contraction characters'

1035

If you look at `en-us-g2.ctb' you will see that some words are

1036

actually contracted into some of their own letters. A famous

1037

example among braille transcribers is `also', which is contracted

1038

as `al'. But this is also the name of a person. To take another

1039

example, `altogether' is contracted as `alt', but this is the

1040

abbreviation for the alternate key on a computer keyboard.

1041

Similarly `could' is contracted into `cd', but this is the

1042

abbreviation for compact disk. To prevent confusion in such cases,

1043

the letter sign (see `letsign' opcode (*note letsign: letsign

1044

opcode.)) is placed before such letter combinations when they

1045

actually are abbreviations, not contractions. The `contraction'

1046

opcode tells the translator to do this.

1047

1048

`sufword characters dots'

1049

Replace the characters with the dot pattern if they are either a

1050

word or at the beginning of a word.

1051

1052

`prfword characters dots'

1053

Replace the characters with the dot pattern if they are either a

1054

word or at the end of a word.

1055

1056

`begword characters dots'

1057

Replace the characters with the dot pattern if they are at the

1058

beginning of a word.

1059

1060

`begmidword characters dots'

1061

Replace the characters with the dot pattern if they are either at

1062

the beginning or in the middle of a word.

1063

1064

`midword characters dots'

1065

Replace the characters with the dot pattern if they are in the

1066

middle of a word.

1067

1068

`midendword characters dots'

1069

Replace the characters with the dot pattern if they are either in

1070

the middle or at the end of a word.

1071

1072

`endword characters dots'

1073

Replace the characters with the dot pattern if they are at the end

1074

of a word.

1075

1076

`partword characters dots'

1077

Replace the characters with the dot pattern if the characters are

1078

anywhere in a word, that is, if they are preceded or followed by a

1079

letter.
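The position conditions of `word', `sufword', `prfword', `begword', `begmidword', `midword', `midendword', `endword' and `partword' can be compared side by side with a small Python model. It is a sketch under a strong simplifying assumption: a letter is whatever Python's `str.isalpha' accepts, and anything else (or the edge of the text) counts as a word boundary. liblouis itself uses the character definitions of the table. The function name `position_opcodes' is hypothetical.

```python
def position_opcodes(text, start, length):
    """Return the position-based opcodes whose condition holds for a match."""
    before = start > 0 and text[start - 1].isalpha()
    end = start + length
    after = end < len(text) and text[end].isalpha()
    ok = set()
    if not before and not after:       # the match is a whole word
        ok |= {"word", "sufword", "prfword"}
    if not before and after:           # beginning of a word
        ok |= {"begword", "sufword", "begmidword"}
    if before and after:               # middle of a word
        ok |= {"midword", "begmidword", "midendword"}
    if before and not after:           # end of a word
        ok |= {"endword", "prfword", "midendword"}
    if before or after:                # preceded or followed by a letter
        ok.add("partword")
    return ok
```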

1080

1081

`exactdots @dots'

1082

Note that the operand must begin with an at sign (`@'). The dot

1083

pattern following it is evaluated for validity. If it is valid,

1084

whenever an at sign followed by this dot pattern appears in the

1085

source document it is replaced by the characters corresponding to

1086

the dot pattern in the output. This opcode is intended for use in

1087

liblouisxml semantic-action files to specify exact dot patterns,

1088

as in mathematical codes. For example:

1089

1090

exactdots @4-46-12356

1091

will produce the characters with these dot patterns in the output.

1092

1093

`prepunc characters dots'

1094

Replace the characters with the dot pattern if they are part of

1095

punctuation at the beginning of a word.

1096

1097

`postpunc characters dots'

1098

Replace the characters with the dot pattern if they are part of

1099

punctuation at the end of a word.

1100

1101

`begnum characters dots'

1102

Replace the characters with the dot pattern if they are at the

1103

beginning of a number, that is, before all its digits. For

1104

example, in `en-us-g1.ctb' we have `begnum # 4'.

1105

1106

`midnum characters dots'

1107

Replace the characters with the dot pattern if they are in the

1108

middle of a number. For example, `en-us-g1.ctb' has `midnum . 46'.

1109

This is because the decimal point has a different dot pattern than

1110

the period.

1111

1112

`endnum characters dots'

1113

Replace the characters with the dot pattern if they are at the end

1114

of a number. For example `en-us-g1.ctb' has `endnum th 1456'.

1115

This handles things like `4th'. A letter sign is _NOT_ inserted.

1116

1117

`joinnum characters dots'

1118

Replace the characters with the dot pattern. In addition, if

1119

whitespace and a number follow, omit the whitespace.

1120

1121

1122

3.8 Character-Class Opcodes

1123

===========================

1124

1125

These opcodes define and use character classes. A character class

1126

associates a set of characters with a name. The name then refers to any

1127

character within the class. A character may belong to more than one

1128

class.

1129

1130

The basic character classes correspond to the character definition

1131

opcodes, with the exception of the `uplow' opcode (*note uplow: uplow

1132

opcode.), which defines characters belonging to the two classes

1133

`uppercase' and `lowercase'. These classes are:

1134

1135

`space'

1136

White-space characters such as blank and tab

1137

1138

`digit'

1139

Numeric characters

1140

1141

`letter'

1142

Both uppercase and lowercase alphabetic characters

1143

1144

`lowercase'

1145

Lowercase alphabetic characters

1146

1147

`uppercase'

1148

Uppercase alphabetic characters

1149

1150

`punctuation'

1151

Punctuation marks

1152

1153

`sign'

1154

Signs such as percent (`%')

1155

1156

`math'

1157

Mathematical symbols

1158

1159

`litdigit'

1160

Literary digit

1161

1162

`undefined'

1163

Not properly defined

1164

1165

1166

The opcodes which define and use character classes are shown below.

1167

For examples see `fr-abrege.ctb'.

1168

1169

`class name characters'

1170

Define a new character class. The characters operand must be

1171

specified as a string. A character class may not be used until it

1172

has been defined.

1173

1174

`after class opcode ...'

1175

The specified opcode is further constrained in that the matched

1176

character sequence must be immediately preceded by a character

1177

belonging to the specified class. If this opcode is used more than

1178

once on the same line then the union of the characters in all the

1179

classes is used.

1180

1181

`before class opcode ...'

1182

The specified opcode is further constrained in that the matched

1183

character sequence must be immediately followed by a character

1184

belonging to the specified class. If this opcode is used more than

1185

once on the same line then the union of the characters in all the

1186

classes is used.

1187

1188

1189

3.9 Swap Opcodes

1190

================

1191

1192

The swap opcodes are needed to tell the `context' opcode (*note

1193

context: context opcode.), the `correct' opcode (*note correct: correct

1194

opcode.) and multipass opcodes which dot patterns to swap for which

1195

characters. There are three, `swapcd', `swapdd' and `swapcc'. The first

1196

swaps dot patterns for characters. The second swaps dot patterns for

1197

dot patterns and the third swaps characters for characters. The first

1198

is used in the `context' opcode and the second is used in the multipass

1199

opcodes. Dot patterns are separated by commas and may contain more than

1200

one cell.

1201

1202

`swapcd name characters dots, dots, dots, ...'

1203

See above paragraph for explanation. For example:

1204

1205

swapcd dropped 0123456789 356,2,23,...

1206

1207

`swapdd name dots, dots, dots ... dotpattern1, dotpattern2, dotpattern3, ...'

1208

The `swapdd' opcode defines substitutions for the multipass

1209

opcodes. In the second operand the dot patterns must be single

1210

cells, but in the third operand multi-cell dot patterns are

1211

allowed. This is because multi-cell patterns in the second operand

1212

would lead to ambiguities.

1213

1214

`swapcc name characters characters'

1215

The `swapcc' opcode swaps characters in its second operand for

1216

characters in the corresponding places in its third operand. It is

1217

intended for use with `correct' opcodes and can solve problems

1218

such as formatting phone numbers.
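The positional substitution performed by `swapcc' resembles a character translation table. The Python sketch below is an illustration only: the function name is chosen for the example, the requirement that both operands have the same length is an assumption of the sketch, and it translates a whole string, whereas in a table the swap applies only to the characters matched by a `correct' entry.

```python
def swapcc(text, second_operand, third_operand):
    """Swap each character of the second operand for its positional partner."""
    if len(second_operand) != len(third_operand):
        raise ValueError("operands must have the same length")
    return text.translate(str.maketrans(second_operand, third_operand))
```

For example, swapping `.' for `-' would turn the hypothetical phone number `555.1234' into `555-1234'.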

1219

1220

1221

3.10 The Context and Multipass Opcodes

1222

======================================

1223

1224

`context test action'

1225

`pass2 test action'

1226

`pass3 test action'

1227

`pass4 test action'

1228

The `context' and multipass opcodes (`pass2', `pass3' and `pass4')

1229

provide translation capabilities beyond those of the basic

1230

translation opcodes (*note Translation Opcodes::) discussed

1231

previously. The multipass opcodes cause additional passes to be

1232

made over the string to be translated. The number after the word

1233

`pass' indicates in which pass the entry is to be applied. If no

1234

multipass opcodes are given, only the first translation pass is

1235

made. The `context' opcode is basically a multipass opcode for the

1236

first pass. It differs slightly from the multipass opcodes per se.

1237

The format of all these opcodes is:

1238

1239

opcode test action

1240

1241

The `test' and `action' operands have suboperands. Each suboperand

1242

begins with a non-alphanumeric character and ends when another

1243

non-alphanumeric character is encountered. The suboperands and

1244

their initial characters are as follows.

1245

1246

`" (double quote)'

1247

a string of characters. This string must be terminated by

1248

another double quote. It may contain any characters. If a

1249

double quote is needed within the string, it must be preceded

1250

by a backslash (`\'). If a space is needed, it must be

1251

represented by the escape sequence \s. This suboperand is

1252

valid only in the test part of the `context' opcode.

1253

1254

`@ (at sign)'

1255

a sequence of dot patterns. Cells are separated by hyphens as

1256

usual. This suboperand is not valid in the test part of the

1257

context and correct opcodes.

1258

1259

`$ (dollar sign)'

1260

a string of attributes, such as `d' for digit, `l' for

1261

letter, etc. More than one attribute can be given. If you

1262

wish to check characters with any attribute, use the letter

1263

`a'. Input characters are checked to see if they have at

1264

least one of the attributes. The attribute string can be

1265

followed by numbers specifying how many characters are to be

1266

checked. If no numbers are given, 1 is assumed. If two

1267

numbers separated by a hyphen are given, the input is checked

1268

to make sure that at least the first number of characters with

1269

the attributes are present, but no more than the second

1270

number. If only one number is present, then exactly that many

1271

characters must have the attributes. A period instead of the

1272

numbers indicates an indefinite number of characters. This

1273

suboperand is valid in all test parts but not in action

1274

parts. For the characters which can be used in attribute

1275

strings, see the following table.

1276

1277

`! (exclamation point)'

1278

reverses the logical meaning of the suboperand which follows.

1279

For example, !$d is true only if the character is _NOT_ a

1280

digit. This suboperand is valid in test parts only.

1281

1282

`% (percent sign)'

1283

the name of a class defined by the `class' opcode (*note

1284

class: class opcode.) or the name of a swap set defined by

1285

the swap opcodes (*note Swap Opcodes::). Names may contain

1286

only letters. The letters may be upper or lower-case. The

1287

case matters. Class names may be used in test parts only.

1288

Swap names are valid everywhere.

1289

1290

`{ (left brace)'

1291

Name: the name of a grouping pair. The left brace indicates

1292

that the first (or left) member of the pair is to be used in

1293

matching. If this is between replacement brackets it must be

1294

the only item. This is also valid in the action part.

1295

1296

`} (right brace)'

1297

Name: the name of a grouping pair. The right brace indicates

1298

that the second (or right) member is to be used in matching.

1299

See the remarks on the left brace immediately above.

1300

1301

`/ (slash)'

1302

Search the input for the expression following the slash and

1303

return true if found. This can be used to set a variable.

1304

1305

`_ (underscore)'

1306

Move backward. If a number follows, move backward that number

1307

of characters. The program never moves backward beyond the

1308

beginning of the input string. This suboperand is valid only

1309

in test parts.

1310

1311

`[ (left bracket)'

1312

start replacement here. This suboperand must always be paired

1313

with a right bracket and is valid only in test parts.

1314

1315

`] (right bracket)'

1316

end replacement here. This suboperand must always be paired

1317

with a left bracket and is valid only in test parts.

1318

1319

`# (number sign or crosshatch)'

1320

test or set a variable. Variables are referred to by numbers

1321

1 to 50, for example, `#1', `#2', `#25'. Variables may be set

1322

by one `context' or multipass opcode and tested by another.

1323

Thus, an operation that occurs at one place in a translation

1324

can tell an operation that occurs later about itself. This

1325

feature will be used in math translation, and it may also

1326

help to alleviate the need for new opcodes. This suboperand

1327

is valid everywhere.

1328

1329

Variables are set in the action part. To set a variable use an

1330

expression like `#1=1', `#2=5', etc. Variables are also

1331

incremented and decremented in the action part with

1332

expressions like `#1+', `#3-', etc. These operators increment

1333

or decrement the variable by 1.

1334

1335

Variables are tested in the test part with expressions like

1336

`#1=2', `#3<4', `#5>6', etc.

1337

1338

`* (asterisk)'

1339

Copy the characters or dot patterns in the input within the

1340

replacement brackets into the output and discard anything

1341

else that may match. This feature is used, for example, for

1342

handling numeric subscripts in Nemeth. This suboperand is

1343

valid only in action parts.

1344

1345

`? (question mark)'

1346

Valid only in the action part. The characters to be replaced

1347

are simply ignored. That is, they are replaced with nothing.

1348

If either member of a grouping pair is in the replace

1349

brackets the other member at the same level is also removed.

1350

1351

1352

The characters which can be used in attribute strings are as

1353

follows:

1354

1355

`a'  any attribute
`d'  digit
`D'  literary digit
`l'  letter
`m'  math
`p'  punctuation
`S'  sign
`s'  space
`U'  uppercase
`u'  lowercase
`w'  first user-defined class
`x'  second user-defined class
`y'  third user-defined class
`z'  fourth user-defined class
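The counting rules of the `$' suboperand can be sketched in Python. This is a model for explanation, not the liblouis matcher. Several points are assumptions of the sketch: the name `match_attributes', the use of Python's own character tests in place of the table's definitions, the coverage of only six attribute letters, a greedy match up to the upper limit, and a minimum of one character for the period form.

```python
import re

# Predicates for a few attribute letters; the rest of the table is omitted.
ATTRS = {
    "a": lambda c: True,
    "d": str.isdigit,
    "l": str.isalpha,
    "s": str.isspace,
    "U": str.isupper,
    "u": str.islower,
}


def match_attributes(suboperand, text, pos=0):
    """Match e.g. '$d2-4' at text[pos:]; return characters consumed or None."""
    m = re.fullmatch(r"\$([A-Za-z]+)(?:(\d+)(?:-(\d+))?|(\.))?", suboperand)
    if m is None:
        raise ValueError("not an attribute suboperand: " + suboperand)
    letters, low, high, period = m.groups()
    if period:
        lo, hi = 1, len(text)          # indefinite number (assumed minimum 1)
    elif low is None:
        lo = hi = 1                    # no numbers given: 1 is assumed
    else:
        lo = int(low)
        hi = int(high) if high else lo
    n = 0
    # A character qualifies if it has at least one of the listed attributes.
    while (pos + n < len(text) and n < hi
           and any(ATTRS[a](text[pos + n]) for a in letters)):
        n += 1
    return n if n >= lo else None
```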

1382

1383

Note that if any multipass opcode or the correct opcode is used

1384

and the `pass1Only' mode bit (*note lou_translateString::) is not

1385

set, input and output positions may be incorrect.
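The variable suboperand (`#') described earlier in this section can be modelled with a dictionary. The following Python sketch is illustrative only; the function names `run_action' and `run_test' are hypothetical, and treating a variable that has never been set as 0 is an assumption of the sketch.

```python
import re


def run_action(variables, expr):
    """Apply an action such as '#1=1', '#1+' or '#3-' to a dict of variables."""
    m = re.fullmatch(r"#(\d+)(?:=(\d+)|([+-]))", expr)
    if m is None:
        raise ValueError("bad variable action: " + expr)
    num = int(m.group(1))
    if m.group(2) is not None:
        variables[num] = int(m.group(2))           # assignment
    else:
        step = 1 if m.group(3) == "+" else -1      # increment or decrement by 1
        variables[num] = variables.get(num, 0) + step


def run_test(variables, expr):
    """Evaluate a test such as '#1=2', '#3<4' or '#5>6'."""
    m = re.fullmatch(r"#(\d+)([=<>])(\d+)", expr)
    if m is None:
        raise ValueError("bad variable test: " + expr)
    value = variables.get(int(m.group(1)), 0)
    target = int(m.group(3))
    return {"=": value == target,
            "<": value < target,
            ">": value > target}[m.group(2)]
```

Because the dictionary persists between calls, an entry applied early in a translation can set a variable that a later entry tests.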

1386

1387

1388

3.11 The correct Opcode

1389

=======================

1390

1391

`correct test action'

1392

Because some input (such as that from an OCR program) may contain

1393

systematic errors, it is sometimes advantageous to use a

1394

pre-translation pass to remove them. The errors and their

1395

corrections are specified by the `correct' opcode. If there are no

1396

`correct' opcodes in a table, the pre-translation pass is not

1397

used. The format of the `correct' opcode is very similar to that

1398

of the `context' opcode (*note context: context opcode.). The only

1399

difference is that in the action part strings may be used and dot

1400

patterns may not be used. Some examples of `correct' opcode

1401

entries are:

1402

1403

correct "\\" ?             Eliminate backslashes
correct "cornf" "comf"     fix a common "scano"
correct "cornm" "comm"
correct "cornp" "comp"
correct "*" ?              Get rid of stray asterisks
correct "|" ?              ditto for vertical bars
correct "\s?" "?"          drop space before question mark

1410

1411

Note that if the `correct' opcode is used and the `pass1Only' mode

1412

bit (*note lou_translateString::) is not set, input and output

1413

positions may be incorrect.
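The idea of a pre-translation correction pass can be sketched in Python with plain string replacement. This is a rough model, not the liblouis algorithm: it applies each correction to the whole text in turn, whereas liblouis applies entries while scanning the input. The names `CORRECTIONS' and `pre_translate' are hypothetical, the pairs are modelled on the example entries of this section, and an empty string stands for the `?' action.

```python
# (wrong, right) pairs; "" models the `?' action, which deletes the match.
CORRECTIONS = [
    ("\\", ""),
    ("cornf", "comf"),
    ("cornm", "comm"),
    ("cornp", "comp"),
    ("*", ""),
    ("|", ""),
    (" ?", "?"),
]


def pre_translate(text, corrections=CORRECTIONS):
    """Apply each literal correction to the whole text, in table order."""
    for wrong, right in corrections:
        text = text.replace(wrong, right)
    return text
```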

1414

1415

1416

3.12 Miscellaneous Opcodes

1417

==========================

1418

1419

`include filename'

1420

Read the file indicated by `filename' and incorporate or include

1421

its entries into the table. Included files can include other files,

1422

which can include other files, etc. For an example, see what files

1423

are included by the entry `include en-us-g1.ctb' in the table

1424

`en-us-g2.ctb'. If the included file is not in the same directory

1425

as the main table, use a full pathname for filename.
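Nested inclusion can be pictured as a recursive expansion. The Python sketch below works on an in-memory dictionary of file contents rather than on real files; the function name `expand_includes', the file names used with it and the check for circular includes are all assumptions of the example and say nothing about how liblouis itself handles such cases.

```python
def expand_includes(name, files, _seen=None):
    """Return the entries of table `name` with include lines expanded in place."""
    seen = set() if _seen is None else _seen
    if name in seen:
        raise ValueError("circular include: " + name)
    seen.add(name)
    entries = []
    for line in files[name].splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "include":
            # Splice in the entries of the included file at this point.
            entries.extend(expand_includes(parts[1], files, seen))
        else:
            entries.append(line)
    seen.discard(name)
    return entries
```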

1426

1427

`locale characters'

1428

Not implemented, but recognized and ignored for backward

1429

compatibility.

1430

1431

`display character dots'

1432

Associates dot patterns with the characters which will be sent to a

1433

braille embosser, display or screen font. The character must be in

1434

the range 0-255 and the dots must specify a single cell. Here are

1435

some examples:

1436

1437

display a 1      When the character a is sent to the embosser or display,
# it will produce a dot 1.

display L 123    When the character L is sent to the display or embosser,
# it produces dots 1-2-3.

1442

1443

The display opcode is optional. It is used when the embosser or

1444

display has a different mapping of characters to dot patterns than

1445

that given in *note Character-Definition Opcodes::. If used,

1446

display entries must precede character-definition entries.

1447

1448

`multind dots opcode opcode ...'

1449

The multind opcode tells the back-translator that a sequence of

1450

braille cells represents more than one braille indicator. For

1451

example, in `en-us-g1.ctb' we have `multind 56-6 letsign capsign'.

1452

The back-translator can generally handle single braille indicators,

1453

but it cannot apply them when they immediately follow each other.

1454

It recognizes the letter sign if it is followed by a letter and

1455

takes appropriate action. It also recognizes the capital sign if

1456

it is followed by a letter. But when there is a letter sign

1457

followed by a capital sign it fails to recognize the letter sign

1458

unless the sequence has been defined with `multind'. A `multind'

1459

entry may not contain a comment because liblouis would attempt to

1460

interpret it as an opcode.

1461

1462

1463

4 Notes on Back-Translation

1464

***************************

1465

1466

Back-translation is carried out by the function

1467

`lou_backTranslateString'. Its calling sequence is described in *note

1468

Programming with liblouis::. Tables containing no `context' opcode

1469

(*note context: context opcode.), `correct' opcode (*note correct:

1470

correct opcode.) or multipass opcodes can be used for both forward and

1471

backward translation. If these opcodes are needed, different tables will

1472

be required. `lou_backTranslateString' first performs `pass4', if

1473

present, then `pass3', then `pass2', then the backtranslation, then

1474

corrections. Note that this is exactly the inverse of forward

1475

translation.

1476

1477

5 Programming with liblouis

1478

***************************

1479

1480

5.1 License

1481

===========

1482

1483

Liblouis may contain code borrowed from the Linux screenreader BRLTTY,

1484

1485

1486

1487

1488

1489

1490

Liblouis is free software: you can redistribute it and/or modify it

1491

under the terms of the GNU Lesser General Public License as published

1492

by the Free Software Foundation, either version 3 of the License, or

1493

(at your option) any later version.

1494

1495

Liblouis is distributed in the hope that it will be useful, but

1496

WITHOUT ANY WARRANTY; without even the implied warranty of

1497

MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser

1498

General Public License for more details.

1499

1500

You should have received a copy of the GNU Lesser General Public

1501

License along with Liblouis. If not, see `http://www.gnu.org/licenses/'.

1502

1503

5.2 Overview

1504

============

1505

1506

You use the liblouis library by calling eleven functions,

1507

`lou_translateString', `lou_backTranslateString', `lou_logFile',

1508

`lou_logPrint', `lou_getTable', `lou_translate', `lou_backTranslate',

1509

`lou_hyphenate', `lou_readCharFromFile', `lou_version' and `lou_free'.

1510

These are described below. The header file, `liblouis.h', also contains

1511

brief descriptions. Liblouis is written in straight C. It has just

1512

three code modules, `compileTranslationTable.c',

1513

`lou_translateString.c' and `lou_backTranslateString.c'. In addition,

1514

there are two header files, `liblouis.h', which defines the API, and

1515

`louis.h', used only internally. The latter includes `liblouis.h'.

1516

1517

Persons who wish to use liblouis from Python may want to skip ahead

1518

to *note Python bindings::.

1519

1520

`compileTranslationTable.c' keeps track of all translation tables

1521

which an application has used. It is called by the translation,

1522

hyphenation and checking functions when they start. If a table has not

1523

yet been compiled, `compileTranslationTable.c' checks it for correctness

1524

and compiles it into an efficient internal representation. The main

1525

entry point is `lou_getTable'. Since it is the module that keeps track

1526

of memory usage, it also contains the `lou_free' function. In addition,

1527

it contains the `lou_logFile' and `lou_logPrint' functions, plus some

1528

utility functions which are used by the other modules.

1529

1530

By default, liblouis handles all characters internally as 16-bit

1531

unsigned integers. It can be compiled for 32-bit characters as

1532

explained below. The meanings of these integers are not hard-coded.

1533

Rather they are defined by the character-definition opcodes. However,

1534

the standard printable characters, from decimal 32 to 126, are

1535

recognized for the purpose of processing the opcodes. Hence, the

1536

following definition is included in `liblouis.h'. It is correct for

1537

computers with at least 32-bit processors.

1538

1539

#define widechar unsigned short int

1540

1541

To make liblouis handle 32-bit Unicode simply remove the word

1542

`short' in the above `define'. This will cause the translate and

1543

back-translate functions to expect input in 32-bit form and to deliver

1544

their output in this form. The input to the compiler (tables) is

1545

unaffected except that two new escape sequences for 20-bit and 32-bit

1546

characters are recognized.
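
As a minimal sketch of what the `widechar' definition means for calling
code, the fragment below widens a plain ASCII string into a `widechar'
buffer. The helper name `ascii_to_widechar' is hypothetical and is not
part of liblouis; the typedef merely mirrors the `#define' quoted above.

```c
/* Mirrors the definition quoted from liblouis.h. For a 32-bit build the
   word `short' would be dropped. A typedef is used here only for
   illustration. */
typedef unsigned short int widechar;

/* Hypothetical helper, not part of liblouis: widen a NUL-terminated
   ASCII string into a widechar buffer that can hold `max' characters.
   Returns the number of characters written; no terminator is added. */
static int
ascii_to_widechar (const char *src, widechar *dst, int max)
{
  int n = 0;
  while (n < max && src[n] != '\0')
    {
      /* Go through unsigned char so bytes above 127 do not sign-extend. */
      dst[n] = (widechar) (unsigned char) src[n];
      n++;
    }
  return n;
}
```

Because the helper is written against the `widechar' name rather than a
fixed integer type, it would compile unchanged if the definition were
switched to 32 bits.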

1547

1548

Here are the definitions of the eleven liblouis functions and their

1549

parameters. They are given in terms of 16-bit Unicode. If liblouis has

1550

been compiled for 32-bit Unicode simply read 32 instead of 16.

1551

1552

5.3 Data structure of liblouis tables

1553

=====================================

1554

1555

The data structure `TranslationTableHeader' is defined by a `typedef'

1556

statement in `louis.h'. To find the beginning, search for the word

1557

`header'. As its name implies, this is actually the table header. Data

1558

are placed in the `ruleArea' array, which is the last item defined in

1559

this structure. This array is declared with a length of 1 and is

1560

expanded as needed. The table header consists mostly of arrays of

1561

pointers of size `HASHNUM'. These pointers are actually offsets into

1562

`ruleArea' and point to chains of items which have been placed in the

1563

same hash bucket by a simple hashing algorithm. `HASHNUM' should be a

1564

prime and is currently 1123. The structure of the table was chosen to

1565

optimize speed rather than memory usage.
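
The idea of hash buckets that hold offsets into one growable data area
can be sketched as follows. This is a simplified illustration, not the
actual liblouis structure: the type and function names are hypothetical,
the hash (`key % HASHNUM') is an assumption, and the data area is a
separately allocated array rather than the last member of the header.

```c
#include <stdlib.h>

#define HASHNUM 1123            /* prime bucket count, as stated above */

typedef unsigned int Offset;    /* index into the area; 0 means "none" */

/* Hypothetical, much-simplified item stored in the data area. */
typedef struct
{
  Offset next;                  /* next item in the same hash bucket */
  unsigned short key;           /* e.g. the character being defined */
  unsigned short dots;          /* illustrative payload */
} Item;

typedef struct
{
  Offset buckets[HASHNUM];      /* offsets, not raw pointers */
  size_t allocated;             /* items allocated in area */
  size_t used;                  /* items used; slot 0 is reserved */
  Item *area;                   /* stands in for ruleArea */
} Table;

static Table *
table_new (void)
{
  Table *t = calloc (1, sizeof (Table));
  if (t == NULL)
    return NULL;
  t->allocated = 4;
  t->used = 1;                  /* keep offset 0 free as the end marker */
  t->area = calloc (t->allocated, sizeof (Item));
  if (t->area == NULL)
    {
      free (t);
      return NULL;
    }
  return t;
}

/* Append an item and link it at the head of its bucket chain.
   Returns 1 on success and 0 if memory could not be obtained. */
static int
table_add (Table *t, unsigned short key, unsigned short dots)
{
  Offset *bucket = &t->buckets[key % HASHNUM];
  if (t->used == t->allocated)
    {
      Item *bigger = realloc (t->area, 2 * t->allocated * sizeof (Item));
      if (bigger == NULL)
        return 0;
      t->area = bigger;         /* stored offsets stay valid after a move */
      t->allocated *= 2;
    }
  t->area[t->used].key = key;
  t->area[t->used].dots = dots;
  t->area[t->used].next = *bucket;
  *bucket = (Offset) t->used++;
  return 1;
}

/* Follow the chain for the key's bucket; NULL if the key is absent. */
static const Item *
table_find (const Table *t, unsigned short key)
{
  Offset off = t->buckets[key % HASHNUM];
  while (off != 0)
    {
      if (t->area[off].key == key)
        return &t->area[off];
      off = t->area[off].next;
    }
  return NULL;
}

static void
table_free (Table *t)
{
  if (t != NULL)
    {
      free (t->area);
      free (t);
    }
}
```

Storing offsets instead of pointers is what allows the area to be
expanded as needed: when `realloc' moves the block, every chain remains
valid without any fix-up.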

1566

1567

The first part of the table contains miscellaneous information, such

1568

as the number of passes and whether various opcodes have been used. It

1569

also contains the amount of memory allocated to the table and the

1570

amount actually used.

1571

1572

The next section contains pointers to various braille indicators and

1573

begins with `capitalSign'. The rules pointed to contain the dot pattern

1574

for the indicator and an opcode which is used by the back-translator

1575

but does not appear in the list of opcodes. The braille indicators also

1576

include various kinds of emphasis, such as italic and bold and

1577

information about the length of emphasized phrases. The latter is

1578

contained directly in the table item instead of in a rule.

1579

1580

After the braille indicators comes information about when a letter

1581

sign should be used.

1582

1583

Next is an array of size `HASHNUM' which points to character

1584

definitions. These are created by the character-definition opcodes.

1585

1586

Following this is a similar array pointing to definitions of

1587

single-cell dot patterns. This is also created from the

1588

character-definition opcodes. If a character definition contains a

1589

multi-cell dot pattern this is compiled into ordinary forward and

1590

backward rules. If such a multi-cell dot pattern contains a single cell

1591

which has not previously been defined that cell is placed in this

1592

array, but is given the attribute `undefined'.

1593

1594

Next come arrays that map characters to single-cell dot patterns and

1595

dots to characters. These are created from both character-definition

1596

opcodes and display opcodes.

1597

1598

Next is an array of size 256 which maps characters in this range to

1599

dot patterns which may consist of multiple cells. It is used, for

1600

example, to map `{' to dots 456-246. These mappings are created by the

1601

`compdots' or the `comp6' opcode (*note comp6: comp6 opcode.).

1602

1603

Next are two small arrays that hold pointers to chains of rules

1604

produced by the `swapcd' opcode (*note swapcd: swapcd opcode.) and the

1605

`swapdd' opcode (*note swapdd: swapdd opcode.) and by some multipass,

1606

context and correct opcodes.

1607

1608

Now we get to an array of size `HASHNUM' which points to chains of

1609

rules for forward translation.

1610

1611

Following this is a similar array for back-translation.

1612

1613

Finally comes the `ruleArea', an array of variable size to which

1614

various structures are mapped and to which almost everything else

1615

points.

1616

1617

5.4 lou_version

1618

===============

1619

1620

char *lou_version ();

1621

1622

This function returns a pointer to a character string containing the

1623

version of liblouis, plus other information, such as the release date

1624

and perhaps notable changes.

1625

1626

5.5 lou_translateString

1627

=======================

1628

1629

int lou_translateString (

1630

const char *const trantab,

1631

const widechar *const inbuf,

1632

int *inlen,

1633

widechar *outbuf,

1634

int *outlen,

1635

char *typeform,

1636

char *spacing,

1637

int mode);

1638

1639

This function takes a string of 16-bit Unicode characters in `inbuf'

1640

and translates it into a string of 16-bit characters in `outbuf'. Each

1641

16-bit character produces a particular dot pattern in one braille cell

1642

when sent to an embosser or braille display or to a screen typefont.

1643

Which 16-bit character represents which dot pattern is indicated by the

1644

character-definition and display opcodes in the translation table.

1645

1646

The `trantab' parameter points to a list of translation tables

1647

separated by commas. If only one table is given, no comma should be

1648

used after it. It is these tables which control just how the

1649

translation is made, whether in Grade 2, Grade 1, or something else.

1650

The first table in the list must be a full pathname, unless the tables

1651

are in the current directory. The pathname is extracted up to the

1652

filename. The first table is then compiled. The pathname is then added

1653

to the name of the second table, which is compiled, and so on. The

1654

tables in a list are all compiled into the same internal table. The

1655

list is then regarded as the name of this table. As explained in *note

1656

How to Write Translation Tables::, each table is a file which may be

1657

plain text, big-endian Unicode or little-endian Unicode. A table (or

1658

list of tables) is compiled into an internal representation the first

1659

time it is used. Liblouis keeps track of which tables have been

1660

compiled. For this reason, it is essential to call the `lou_free'

1661

function at the end of your application to avoid memory leaks. Do _NOT_

1662

call `lou_free' after each translation. This will force liblouis to

1663

compile the translation tables each time they are used, leading to

1664

great inefficiency.
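
The way the path of the first table is reused for the later ones can be
sketched as below. This only illustrates the rule described above and is
not the liblouis implementation; the function name `resolve_table' is
hypothetical, and `/' is assumed to be the directory separator.

```c
#include <string.h>

/* Hypothetical helper: write the full name of the index-th table of a
   comma-separated list into `out'. The directory of the first entry is
   prepended to every later entry. Returns 1 on success and 0 if the
   entry does not exist or `out' is too small. */
static int
resolve_table (const char *list, int index, char *out, size_t outsize)
{
  size_t first_len = strcspn (list, ",");
  size_t dir_len = 0;
  size_t name_len;
  size_t i;
  const char *name = list;

  /* Directory part of the first entry: everything up to and including
     its last '/'. It is empty for tables in the current directory. */
  for (i = 0; i < first_len; i++)
    if (list[i] == '/')
      dir_len = i + 1;

  /* Step over commas until the requested entry is reached. */
  while (index > 0)
    {
      name = strchr (name, ',');
      if (name == NULL)
        return 0;
      name++;
      index--;
    }
  name_len = strcspn (name, ",");
  if (name_len == 0)
    return 0;

  if (name == list)
    dir_len = 0;                /* the first entry already has its path */
  if (dir_len + name_len + 1 > outsize)
    return 0;
  memcpy (out, list, dir_len);
  memcpy (out + dir_len, name, name_len);
  out[dir_len + name_len] = '\0';
  return 1;
}
```

With the made-up list `/opt/tables/first.ctb,second.ctb', entry 1
resolves to `/opt/tables/second.ctb', so only the first name needs to
carry the directory.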

1665

1666

Note that both the `inlen' and `outlen' parameters are pointers to

1667

integers. When the function is called, these integers contain the

1668

maximum input and output lengths, respectively. When it returns, they

1669

are set to the actual lengths used.
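
The in/out length convention can be tried without liblouis using a
stand-in. The function `mock_translate' below is hypothetical: it only
copies characters, but it treats its length arguments the way the text
above describes, and it returns 0 when the output buffer was too small
for a complete copy.

```c
typedef unsigned short int widechar;

/* Stand-in that follows the length convention described above. It is
   NOT liblouis and performs no translation; it copies what fits. */
static int
mock_translate (const widechar *inbuf, int *inlen,
                widechar *outbuf, int *outlen)
{
  /* On entry the integers hold the maximum lengths. */
  int n = (*inlen < *outlen) ? *inlen : *outlen;
  int complete = (n == *inlen);
  int i;

  for (i = 0; i < n; i++)
    outbuf[i] = inbuf[i];

  /* On return they hold the lengths actually used. */
  *inlen = n;
  *outlen = n;
  return complete;
}
```

Since both integers are overwritten, a caller that reuses the same
variables for several calls has to reset them to the buffer sizes before
each call.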

1670

1671

The `typeform' parameter is used to indicate italic type, boldface

1672

type, computer braille, etc. It is a string of characters with the same

1673

length as the input buffer pointed to by `inbuf'. However, it is used

1674

to pass back character-by-character results, so enough space must be

1675

provided to match the `*outlen' parameter. Each character indicates

1676

the typeform of the corresponding character in the input buffer. The

1677

values are as follows: 0 plain-text; 1 italic; 2 bold; 4 underline; 8

1678

computer braille. These values can be added for multiple emphasis. If

1679

this parameter is `NULL', no checking for typeforms is done. In

1680

addition, if this parameter is not `NULL', it is set on return to have

1681

an 8 at every position corresponding to a character in `outbuf' which

1682

was defined to have a dot representation containing dot 7, dot 8 or

1683

both, and to 0 otherwise.
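
A typeform buffer could be prepared along the following lines. This is a
sketch under stated assumptions: the numeric values are the ones listed
above, but the identifiers (`TF_ITALIC' and so on) and both helper
functions are hypothetical, so the names in `liblouis.h' should be
checked before relying on them.

```c
#include <stdlib.h>

/* Values as listed in the text; the identifiers are hypothetical. */
enum
{
  TF_PLAIN = 0,
  TF_ITALIC = 1,
  TF_BOLD = 2,
  TF_UNDERLINE = 4,
  TF_COMPUTER_BRAILLE = 8
};

/* Allocate a zero-filled (plain-text) typeform buffer large enough for
   both uses described above: one entry per input character on the way
   in and one entry per output character on return. */
static char *
typeform_new (int inlen, int max_outlen)
{
  size_t n = (size_t) (inlen > max_outlen ? inlen : max_outlen);
  return calloc (n ? n : 1, 1);
}

/* Add the emphasis `bits' to the input characters in [start, end).
   OR-ing distinct bits gives the same result as adding the values. */
static void
typeform_mark (char *typeform, int start, int end, int bits)
{
  int i;
  for (i = start; i < end; i++)
    typeform[i] = (char) (typeform[i] | bits);
}
```

For example, marking characters 0 to 2 italic and characters 2 to 4 bold
leaves character 2 with the value 3, which is italic and bold combined.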

1684

1685

The `spacing' parameter is used to indicate differences in spacing

1686

between the input string and the translated output string. It is also

1687

of the same length as the string pointed to by `inbuf'. If this

1688

parameter is `NULL', no spacing information is computed.

1689

1690

The `mode' parameter specifies how the translation should be done.

1691

The valid values of mode are listed in `liblouis.h'. They are all

1692

powers of 2, so that a combined mode can be specified by adding up

1693

different values.

1694

1695

The function returns 1 if no errors were encountered and 0 if a

1696

complete translation could not be done.

1697

1698

5.6 lou_translate

1699

=================

1700

1701

int lou_translate (

1702

const char *const trantab,

1703

const widechar * const inbuf,

1704

int *inlen,

1705

widechar * outbuf,

1706

int *outlen,

1707

char *typeform,

1708

char *spacing,

1709

int *outputPos,

1710

int *inputPos,

1711

int *cursorPos,

1712

int mode);

1713

1714

This function adds the parameters `outputPos', `inputPos' and

1715

`cursorPos', to facilitate use in screenreader programs. The

1716

`outputPos' parameter must point to an array of integers with at least

1717

`*outlen' elements. On return, this array will contain the position in

1718

`inbuf' corresponding to each output position. Similarly, `inputPos'

1719

must point to an array of integers of at least `*inlen' elements. On

1720

return, this array will contain the position in `outbuf' corresponding

1721

to each position in `inbuf'. `cursorPos' must point to an integer

1722

containing the position of the cursor in the input. On return, it will

1723

contain the cursor position in the output. Any parameter after `outlen'

1724

may be `NULL'. In this case, the actions corresponding to it will not

1725

be carried out. The `mode' parameter, however, must be present and must

1726

be an integer, not a pointer to an integer. If the `compbrlAtCursor'

1727

bit is set in the `mode' parameter the space-bounded characters

1728

containing the cursor will be translated in computer braille.
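
One way a screen reader might consume the two position arrays is
sketched below. Both helpers are hypothetical and not part of liblouis;
they only perform bounds-checked lookups in arrays laid out as described
above, for instance to find which input character lies under a given
braille cell.

```c
#include <stddef.h>

/* Hypothetical helper: which input position produced braille cell
   `cell'? `outputPos' holds one entry per output position.
   Returns -1 if the array is missing or the cell is out of range. */
static int
cell_to_input (const int *outputPos, int outlen, int cell)
{
  if (outputPos == NULL || cell < 0 || cell >= outlen)
    return -1;
  return outputPos[cell];
}

/* Hypothetical helper: at which braille cell does input position
   `index' appear? `inputPos' holds one entry per input position.
   Returns -1 if the array is missing or the index is out of range. */
static int
input_to_cell (const int *inputPos, int inlen, int index)
{
  if (inputPos == NULL || index < 0 || index >= inlen)
    return -1;
  return inputPos[index];
}
```

As a hand-made illustration (not output produced by liblouis), suppose
seven input characters were translated into five cells because the first
three characters were contracted into one cell. The arrays would then be
`outputPos = {0, 3, 4, 5, 6}' and `inputPos = {0, 0, 0, 1, 2, 3, 4}', so
cell 1 maps back to input position 3 and input position 2 maps to cell 0.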

1729

1730

5.7 lou_backTranslateString

1731

===========================

1732

1733

int lou_backTranslateString (

1734

const char *const trantab,

1735

const widechar *const inbuf,

1736

int *inlen,

1737

widechar *outbuf,

1738

int *outlen,

1739

char *typeform,

1740

char *spacing,

1741

int mode);

1742

1743

This is exactly the opposite of `lou_translateString'. `inbuf' is a

1744

string of 16-bit Unicode characters representing braille. `outbuf' will

1745

contain a string of 16-bit Unicode characters. `typeform' will indicate

1746

any emphasis found in the input string, while `spacing' will indicate

1747

any differences in spacing between the input and output strings. The

1748

`typeform' and `spacing' parameters may be `NULL' if this information is

1749

not needed. `mode' again specifies how the back-translation should be

1750

done.

1751

1752

5.8 lou_backTranslate

1753

=====================

1754

1755

int lou_backTranslate (

1756

const char *const trantab,

1757

const widechar *const inbuf,

1758

int *inlen,

1759

widechar * outbuf,

1760

int *outlen,

1761

char *typeform,

1762

char *spacing,

1763

int *outputPos,

1764

int *inputPos,

1765

int *cursorPos,

1766

int mode);

1767

1768

This function is exactly the inverse of `lou_translate'.

1769

1770

5.9 lou_hyphenate

1771

=================

1772

1773

int lou_hyphenate (

1774

const char *const trantab,

1775

const widechar * const inbuf,

1776

int inlen,

1777

char *hyphens,

1778

int mode);

1779

1780

This function looks at the characters in `inbuf' and if it finds a

1781

sequence of letters attempts to hyphenate it as a word. Note that

1782

`lou_hyphenate' operates on single words only, and spaces or punctuation

1783

marks between letters are not allowed. Leading and trailing punctuation

1784

marks are ignored. The table named by the `trantab' parameter must

1785

contain a hyphenation table. If it does not, the function does nothing.

1786

`inlen' is the length of the character string in `inbuf'. `hyphens' is

1787

an array of characters and must be of size `inlen'. If hyphenation is

1788

successful it will have a 1 at the beginning of each syllable and a 0

1789

elsewhere. If the `mode' parameter is 0, `inbuf' is assumed to contain

1790

untranslated characters. Any nonzero value means that `inbuf' contains

1791

a translation. In this case, it is back-translated, hyphenation is

1792

performed, and it is retranslated so that the hyphens can be placed

1793

correctly. The `lou_translate' and `lou_backTranslate' functions are

1794

used in this process. `lou_hyphenate' returns 1 if hyphenation was

1795

successful and 0 otherwise. In the latter case, the contents of the

1796

`hyphens' parameter are undefined. This function was provided for use in

1797

liblouisxml.
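
Once a `hyphens' array has been filled in, it could be applied to the
word roughly as follows. The helper `insert_hyphens' is hypothetical and
not part of liblouis. Because the text above does not say whether the
marks are the numbers 0 and 1 or the characters `0' and `1', the sketch
accepts either form; that is an assumption.

```c
typedef unsigned short int widechar;

/* Hypothetical helper: copy `word' into `out', inserting '-' before
   every position marked in `hyphens'. A mark at position 0 is ignored
   because a word cannot start with a hyphenation point. Returns the
   number of characters written, or -1 if `out' is too small. */
static int
insert_hyphens (const widechar *word, int inlen, const char *hyphens,
                widechar *out, int outmax)
{
  int i;
  int n = 0;

  for (i = 0; i < inlen; i++)
    {
      /* Assumption: accept both the number 1 and the character '1'. */
      int marked = (hyphens[i] == 1 || hyphens[i] == '1');
      if (marked && i > 0)
        {
          if (n >= outmax)
            return -1;
          out[n++] = '-';
        }
      if (n >= outmax)
        return -1;
      out[n++] = word[i];
    }
  return n;
}
```

With the hand-made marks `{0, 0, 0, 1, 0, 0}' for the six letters of
"butter", the helper writes the seven characters "but-ter".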

1798

1799

5.10 lou_logFile

1800

================

1801

1802

void lou_logFile (char *fileName);

1803

1804

This function is used when it is not convenient either to let

1805

messages be printed on stderr or to use redirection, as when liblouis

1806

is used in a GUI application or in liblouisxml. Any error messages

1807

generated will be printed to the file given in this call. The entire

1808

pathname of the file must be given.

1809

1810

5.11 lou_logPrint

1811

=================

1812

1813

void lou_logPrint (char *format, ...);

1814

1815

This function is called like `printf'. It can be used by other

1816

libraries to print messages to the file specified by the call to

1817

`lou_logFile'. In particular, it is used by the companion library

1818

liblouisxml.

1819

1820

5.12 lou_getTable

1821

=================

1822

1823

void *lou_getTable (char *tablelist);

1824

1825

`tablelist' is a list of names of table files separated by commas,

1826

as explained previously (*note `trantab' parameter in

1827

`lou_translateString': translation-tables.). If no errors are found,

1828

this function returns a pointer to the compiled table. If errors are

1829

found, messages are printed to the log file, which is stderr unless a

1830

different filename has been given using the `lou_logFile' function.

1831

Errors result in a `NULL' pointer being returned.

1832

1833

5.13 lou_readCharFromFile

1834

=========================

1835

1836

int lou_readCharFromFile (const char *fileName, int *mode);

1837

1838

This function is provided for situations where it is necessary to

1839

read a file which may contain little-endian or big-endian 16-bit Unicode

1840

characters or ASCII8 characters. The return value is a little-endian

1841

character, encoded as an integer. The `fileName' parameter is the name

1842

of the file to be read. The `mode' parameter is a pointer to an integer

1843

which must be set to 1 on the first call. After that, the function

1844

takes care of it. On end-of-file the function returns `EOF'.

1845

1846

5.14 lou_free

1847

=============

1848

1849

void lou_free ();

1850

1851

This function should be called at the end of the application to free

1852

all memory allocated by liblouis. Failure to do so will result in

1853

memory leaks. Do _NOT_ call `lou_free' after each translation. This

1854

will force liblouis to compile the translation tables every time they

1855

are used, resulting in great inefficiency.

1856

1857

5.15 Python bindings

1858

====================

1859

1860

There are Python bindings for `lou_translateString', `lou_translate'

1861

and `lou_version'. For installation instructions see the `README'

1862

file in the `python' directory. Usage information is included in the

1863

Python module itself.

1864

1865

Opcode Index

1866

************

1867

1868

after: See 3.8. (line 1174)

1869

always: See 3.7. (line 954)

1870

before: See 3.8. (line 1181)

1871

begbold: See 3.4. (line 757)

1872

begcaps: See 3.3. (line 593)

1873

begcomp: See 3.4. (line 826)

1874

begital: See 3.4. (line 707)

1875

begmidword: See 3.7. (line 1060)

1876

begnum: See 3.7. (line 1101)

1877

begunder: See 3.4. (line 799)

1878

begword: See 3.7. (line 1056)

1879

boldsign: See 3.4. (line 742)

1880

capsign: See 3.3. (line 587)

1881

capsnocont: See 3.6. (line 864)

1882

class: See 3.8. (line 1169)

1883

comp6: See 3.7. (line 927)

1884

compbrl: See 3.7. (line 915)

1885

context: See 3.10. (line 1224)

1886

contraction: See 3.7. (line 1034)

1887

correct: See 3.11. (line 1391)

1888

decpoint: See 3.5. (line 847)

1889

digit: See 3.2. (line 501)

1890

display: See 3.12. (line 1431)

1891

endbold: See 3.4. (line 763)

1892

endcaps: See 3.3. (line 599)

1893

endcomp: See 3.4. (line 833)

1894

endital: See 3.4. (line 713)

1895

endnum: See 3.7. (line 1112)

1896

endunder: See 3.4. (line 805)

1897

endword: See 3.7. (line 1072)

1898

exactdots: See 3.7. (line 1081)

1899

firstletterbold: See 3.4. (line 757)

1900

firstletterital: See 3.4. (line 707)

1901

firstletterunder: See 3.4. (line 799)

1902

firstwordbold: See 3.4. (line 736)

1903

firstwordital: See 3.4. (line 682)

1904

firstwordunder: See 3.4. (line 784)

1905

grouping: See 3.2. (line 524)

1906

hyphen: See 3.5. (line 853)

1907

include: See 3.12. (line 1419)

1908

italsign: See 3.4. (line 690)

1909

joinnum: See 3.7. (line 1117)

1910

joinword: See 3.7. (line 1016)

1911

largesign: See 3.7. (line 979)

1912

lastletterbold: See 3.4. (line 763)

1913

lastletterital: See 3.4. (line 713)

1914

lastletterunder: See 3.4. (line 805)

1915

lastwordboldafter: See 3.4. (line 752)

1916

lastwordboldbefore: See 3.4. (line 742)

1917

lastworditalafter: See 3.4. (line 700)

1918

lastworditalbefore: See 3.4. (line 690)

1919

lastwordunderafter: See 3.4. (line 795)

1920

lastwordunderbefore: See 3.4. (line 788)

1921

lenboldphrase: See 3.4. (line 773)

1922

lenitalphrase: See 3.4. (line 723)

1923

lenunderphrase: See 3.4. (line 815)

1924

letsign: See 3.3. (line 605)

1925

letter: See 3.2. (line 539)

1926

litdigit: See 3.2. (line 556)

1927

literal: See 3.7. (line 915)

1928

locale: See 3.12. (line 1427)

1929

lowercase: See 3.2. (line 544)

1930

lowword: See 3.7. (line 1025)

1931

math: See 3.2. (line 571)

1932

midendword: See 3.7. (line 1068)

1933

midnum: See 3.7. (line 1106)

1934

midword: See 3.7. (line 1064)

1935

multind: See 3.12. (line 1448)

1936

noback: See 3.7. (line 903)

1937

nocont: See 3.7. (line 937)

1938

nocross: See 3.7. (line 1006)

1939

nofor: See 3.7. (line 910)

1940

noletsign: See 3.3. (line 613)

1941

noletsignafter: See 3.3. (line 630)

1942

noletsignbefore: See 3.3. (line 622)

1943

numsign: See 3.3. (line 638)

1944

partword: See 3.7. (line 1076)

1945

pass2: See 3.10. (line 1224)

1946

pass3: See 3.10. (line 1224)

1947

pass4: See 3.10. (line 1224)

1948

postpunc: See 3.7. (line 1097)

1949

prepunc: See 3.7. (line 1093)

1950

prfword: See 3.7. (line 1052)

1951

punctuation: See 3.2. (line 494)

1952

repeated: See 3.7. (line 962)

1953

replace: See 3.7. (line 944)

1954

repword: See 3.7. (line 970)

1955

sign: See 3.2. (line 563)

1956

singleletterbold: See 3.4. (line 769)

1957

singleletterital: See 3.4. (line 719)

1958

singleletterunder: See 3.4. (line 811)

1959

space: See 3.2. (line 488)

1960

sufword: See 3.7. (line 1048)

1961

swapcc: See 3.9. (line 1214)

1962

swapcd: See 3.9. (line 1202)

1963

swapdd: See 3.9. (line 1207)

1964

syllable: See 3.7. (line 994)

1965

undersign: See 3.4. (line 788)

1966

uplow: See 3.2. (line 507)

1967

uppercase: See 3.2. (line 549)

1968

word: See 3.7. (line 990)

1969

Function Index

1970

**************

1971

1972

lou_backTranslate: See 5.8. (line 1755)

1973

lou_backTranslateString: See 5.7. (line 1733)

1974

lou_free: See 5.14. (line 1849)

1975

lou_getTable: See 5.12. (line 1823)

1976

lou_hyphenate: See 5.9. (line 1773)

1977

lou_logFile: See 5.10. (line 1802)

1978

lou_logPrint: See 5.11. (line 1813)

1979

lou_readCharFromFile: See 5.13. (line 1836)

1980

lou_translate: See 5.6. (line 1701)

1981

lou_translateString: See 5.5. (line 1629)

1982

lou_version: See 5.4. (line 1620)

1983

Program Index

1984

*************

1985

1986

lou_allround: See 2.3. (line 238)

1987

lou_checkhyphens: See 2.5. (line 282)

1988

lou_checktable: See 2.2. (line 225)

1989

lou_debug: See 2.1. (line 127)

1990

lou_translate: See 2.4. (line 256)