~ubuntu-branches/ubuntu/trusty/agrep/trusty

Viewing changes to .pc/15-manpage-url.patch/agrep.1

Committer: Package Import Robot
Author(s): Jari Aalto
Date: 2012-02-23 09:26:12 UTC
Revision ID: package-import@ubuntu.com-20120223092612-vj6qjuczp5inztqg

Tags: 4.17-8

* debian/compat
  - Update to 9
* debian/control
  - (Build-Depends): update to debhelper 9, dpkg-dev 1.16.1. Remove
    dpatch.
* debian/install
  - New file.
* debian/copyright:
  - Update to DEP5.
* debian/patches
  - Convert all patches to quilt.
  - Renumber patches: number 10-19 are used for manual pages.
* debian/rules
  - Update to dh(1).
  - Use hardened CFLAGS.
    http://wiki.debian.org/ReleaseGoals/SecurityHardeningBuildFlags
* debian/source/format
  - New. Update to 3.0.
* debian/debian-vars.mk
  - Delete. No longer needed.

files added:
.pc

.pc/.version

.pc/01-makefile.patch

.pc/01-makefile.patch/Makefile

.pc/10-manpage-th.patch

.pc/10-manpage-th.patch/agrep.1

.pc/12-manpage-hyphen.patch

.pc/12-manpage-hyphen.patch/agrep.1

.pc/15-manpage-url.patch

.pc/15-manpage-url.patch/agrep.1

.pc/applied-patches

debian/install

debian/patches/01-makefile.patch

debian/patches/10-manpage-th.patch

debian/patches/12-manpage-hyphen.patch

debian/patches/15-manpage-url.patch

debian/patches/series

debian/source

debian/source/format

files removed:
debian/debian-vars.mk

debian/patches/00list

debian/patches/01-makefile.dpatch

debian/patches/02-manpage-hyphen.dpatch

debian/patches/02-manpage-th.dpatch

debian/patches/03-manpage-fhs-dir-dict.dpatch

debian/patches/03-manpage-fhs-dir-dict.dpatch.b

files modified:
Makefile

agrep.1

debian/changelog

debian/compat

debian/control

debian/copyright

debian/rules


.pc/15-manpage-url.patch/agrep.1

.TH AGREP 1 "Jan 17, 1992"

.SH NAME

agrep \- search a file for a string or regular expression, with approximate matching capabilities

.SH SYNOPSIS

.B agrep

[

.B \-#cdehiklnpstvwxBDGIS

]

.I pattern

[ \-f

.I patternfile

]

[

.IR filename ".\|.\|. ]"

.SH DESCRIPTION

.B agrep

searches the input

.IR filenames

(standard input is the default, but see a warning under LIMITATIONS)

for records containing strings which either

\fIexactly\fP or \fIapproximately\fP match a pattern.

A record is by default a line, but it can be defined differently using

the \-d option (see below).

Normally, each record found is copied to the standard output.

Approximate matching allows finding records that contain the pattern

with several errors including substitutions, insertions, and

deletions.

For example, Massechusets matches Massachusetts with two errors

(one substitution and one insertion). Running

.B agrep

\-2 Massechusets foo outputs all lines in foo containing any string with

at most 2 errors from Massechusets.
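As a rough illustration of what counting errors means, the fewest errors needed to find a pattern somewhere inside a line can be computed with the textbook dynamic-programming method for approximate substring matching. This is only a minimal Python sketch, not the algorithm agrep itself uses (see the Wu and Manber reports cited below for that), and the function name min_errors is made up for this example.

```python
def min_errors(pattern, text):
    """Smallest number of substitutions, insertions and deletions needed
    to match `pattern` against some substring of `text`."""
    # prev[i]: errors for pattern[:i] against a match ending before any text
    prev = list(range(len(pattern) + 1))
    best = prev[-1]
    for ch in text:
        cur = [0]  # a match may start at any position in the text for free
        for i, p in enumerate(pattern, start=1):
            cur.append(min(
                prev[i] + 1,              # extra character in the text
                cur[i - 1] + 1,           # pattern character missing from the text
                prev[i - 1] + (p != ch),  # same character, or a substitution
            ))
        prev = cur
        best = min(best, prev[-1])
    return best


print(min_errors("Massechusets", "Boston, Massachusetts"))
```

A line would then be reported for a given error limit k when min_errors(pattern, line) <= k.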

.LP

.B agrep

supports many kinds of queries including

arbitrary wild cards, sets of patterns, and in general,

regular expressions.

See PATTERNS below.

It supports most of the options supported by the

.B grep

family plus several more (but it is not 100% compatible with grep).

For more information on the algorithms used by agrep see

Wu and Manber,

"Fast Text Searching With Errors,"

Technical report #91-11, Department of Computer Science, University

of Arizona, June 1991 (available by anonymous ftp from cs.arizona.edu

in agrep/agrep.ps.1), and

Wu and Manber,

"Agrep -- A Fast Approximate Pattern Searching Tool",

To appear in USENIX Conference 1992 January (available by anonymous ftp

from cs.arizona.edu in agrep/agrep.ps.2).

.LP

As with the rest of the \fBgrep\fP family, the characters

.RB ` $ ',

.RB `^ ',

.RB ` \(** ',

.RB ` [ ' ,

.RB ` ] ' ,

.RB ` \s+2^\s0 ',

.RB ` | ',

.RB ` ( ',

.RB ` ) ',

.RB ` ! ',

and

.RB ` \e '

can cause unexpected results when included in the

.IR pattern ,

as these characters are also meaningful

to the shell. To avoid these problems, one should always enclose the entire

pattern argument in single quotes, i.e., 'pattern'.

Do not use double quotes (").

.LP

When

.B agrep

is applied to more than one input

file, the name of the file is displayed

preceding each line which matches

the pattern. The filename is not displayed

when processing a single

file, so if you actually want the filename

to appear, use

.B /dev/null

as a second file in the list.

.SH OPTIONS

.TP

.B \-\fI#\fP

\fI#\fP is a non-negative integer (at most 8)

specifying the maximum number of errors

permitted in finding the approximate matches (defaults to zero).

Generally, each insertion, deletion, or substitution counts as one error.

It is possible to adjust the relative cost of insertions,

deletions and substitutions (see \-I \-D and \-S options).

.TP

.B \-c

Display only the count of matching records.

.TP

.B \-d "'\fIdelim\fP'"

Define \fIdelim\fP to be the separator between two records.

The default value is '$', namely a record is by default

a line.

\fIdelim\fP can be a string of size at most 8

(with possible use of ^ and $), but not

a regular expression.

Text between two \fIdelim\fP's, before the first \fIdelim\fP,

and after the last \fIdelim\fP is considered as one record.

For example, \-d '$$' defines paragraphs as records and \-d '^From\ '

defines mail messages as records.

.B agrep

matches each record separately.

This option does not currently work with regular expressions.

.TP

.BI \-e " pattern"

Same as a simple

.I pattern

argument, but useful when the

.I pattern

begins with a

.RB ` \- '.

.TP

.BI \-f " patternfile"

.I patternfile

contains a set of (simple) patterns.

The output is all lines that match at least one of the patterns in

.IR patternfile .

Currently, the \-f option works only for exact match and for simple

patterns (any meta symbol is interpreted as a regular character);

it is compatible only with \-c, \-h, \-i, \-l, \-s, \-v, \-w, and \-x options.

See LIMITATIONS for size bounds.

.TP

.B \-h

Do not display filenames.

.TP

.B \-i

Case-insensitive search \(em e.g., "A" and "a" are considered equivalent.

.TP

.B \-k

No symbol in the pattern is treated as a meta character.

For example, agrep \-k 'a(b|c)*d' foo will find

the occurrences of a(b|c)*d in foo whereas agrep 'a(b|c)*d' foo

will find substrings in foo that match the regular expression 'a(b|c)*d'.

.TP

.B \-l

List only the files that contain a match.

This option is useful for looking for files containing a certain pattern.

For example, " agrep \-l 'wonderful' * " will list the names of those

files in the current directory that contain the word 'wonderful'.

.TP

.B \-n

Each line that is printed is prefixed by its record number in the file.

.TP

.B \-p

Find records in the text that contain a supersequence of the pattern.

For example,

\fBagrep \-p DCS foo\fP

will match "Department of Computer Science."

.TP

.B \-s

Work silently, that is, display nothing except error messages.

This is useful for checking the error status.

.TP

.B \-t

Output the record starting from the end of

.I delim

to (and including) the next

.IR delim .

This is useful for cases where

.I delim

should come at the end of the record.

.TP

.B \-v

Inverse mode \(em display only those records that

.I do not

contain the pattern.

.TP

.B \-w

Search for the pattern as a word \(em i.e., surrounded by non-alphanumeric

characters. The non-alphanumeric

.B must

surround the match; they cannot be counted as errors.

For example,

.B agrep

\-w \-1 car will match cars, but not characters.

.TP

.B \-x

The pattern must match the whole line.

.TP

.B \-y

Used with \-B option. When \-y is on, agrep will always

output the best matches without giving a prompt.

.TP

.B \-B

Best match mode.

When \-B is specified and no exact matches are found, agrep

will continue to search until the closest matches (i.e., the ones

with minimum number of errors)

are found, at which point the following message will be shown:

"the best match contains x errors, there are y matches, output them? (y/n)"

The best match mode is not supported for standard input, e.g.,

pipeline input.

When the \-#, \-c, or \-l options are specified, the \-B option is ignored.

In general, \-B may be slower than \-#, but not by very much.

.TP

.B \-D\fIk\fP

Set the cost of a deletion to \fIk\fP (\fIk\fP is a positive integer).

This option does not currently work with regular expressions.

.TP

.B \-G

Output the files that contain a match.

.TP

.B \-I\fIk\fP

Set the cost of an insertion to \fIk\fP (\fIk\fP is a positive integer).

This option does not currently work with regular expressions.

.TP

.B \-S\fIk\fP

Set the cost of a substitution to \fIk\fP (\fIk\fP is a positive integer).

This option does not currently work with regular expressions.
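The effect of the cost options -I, -D and -S can be pictured with a weighted edit distance. The Python sketch below is illustrative only: the function name and parameter names are hypothetical, it compares two whole strings rather than searching inside a record, and it assumes that an "insertion" means an extra character in the text that the pattern lacks. Whether agrep orients insertions and deletions the same way is an assumption here.

```python
def weighted_errors(pattern, text, ins=1, dele=1, sub=1):
    """Cheapest total cost of turning `pattern` into `text` when insertions,
    deletions and substitutions have separate costs."""
    # prev[j]: cost of turning the pattern prefix seen so far into text[:j]
    prev = [j * ins for j in range(len(text) + 1)]
    for i, p in enumerate(pattern, start=1):
        cur = [i * dele]  # only deletions can turn pattern[:i] into ""
        for j, t in enumerate(text, start=1):
            cur.append(min(
                prev[j] + dele,                        # drop p from the pattern
                cur[j - 1] + ins,                      # extra character t in the text
                prev[j - 1] + (0 if p == t else sub),  # keep or substitute
            ))
        prev = cur
    return prev[-1]


# With deletions and substitutions priced at 2, an error budget of 1
# leaves room for a single insertion only.
print(weighted_errors("abc", "abxc", ins=1, dele=2, sub=2))
print(weighted_errors("abc", "axc", ins=1, dele=2, sub=2))
```

Under these assumptions the first call stays within a budget of 1 and the second does not, which is the idea behind combining \-1 with \-D2 and \-S2.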

.ne 4

.SH PATTERNS

.LP

\fIagrep\fP

supports a large variety of patterns, including simple

strings, strings with classes of characters, sets of strings,

wild cards, and regular expressions.

.TP

\fBStrings\fP

any sequence of characters, including the special symbols

`^' for beginning of line and `$' for end of line.

The special characters listed above (

.RB ` $ ',

.RB `^ ',

.RB ` \(** ',

.RB ` [ ' ,

.RB ` \s+2^\s0 ',

.RB ` | ',

.RB ` ( ',

.RB ` ) ',

.RB ` ! ',

and

.RB ` \e '

) should be preceded by `\\' if they are to be matched as regular

characters. For example, \\^abc\\\\ corresponds to the string ^abc\\,

whereas ^abc corresponds to the string abc at the beginning of a

line.

.TP

\fBClasses of characters\fP

a list of characters inside [] (in order) corresponds to any character

from the list. For example, [a-ho-z] is any character between a and h

or between o and z. The symbol `^' inside [] complements the list.

For example, [^i-n] denotes any character in the character set except

the characters 'i' to 'n'.

The symbol `^' thus has two meanings, but this is consistent with

egrep.

The symbol `.' (don't care) stands for any symbol (except for the

newline symbol).

.TP

\fBBoolean operations\fP

.B agrep

supports an `and' operation `;'

and an `or' operation `,',

but not a combination of both. For example, 'fast;network' searches

for all records containing both words.

.TP

\fBWild cards\fP

The symbol '#' is used to denote a wild card. # matches zero or any

number of arbitrary characters. For example,

ex#e matches example. The symbol # is equivalent to .* in egrep.

In fact, .* will work too, because it is a valid regular expression

(see below), but unless this is part of an actual regular expression,

# will work faster.

.TP

\fBCombination of exact and approximate matching\fP

any pattern inside angle brackets <> must match the text exactly even

if the match is with errors. For example, <mathemat>ics matches

mathematical with one error (replacing the last s with an a), but

mathe<matics> does not match mathematical no matter how many errors we

allow.

.TP

\fBRegular expressions\fP

The syntax of regular expressions in \fBagrep\fP is in general the same as

that for \fBegrep\fP. The union operation `|', Kleene closure `*',

and parentheses () are all supported.

Currently '+' is not supported.

Regular expressions are currently limited to approximately 30

characters (generally excluding meta characters). Some options

(\-d, \-w, \-f, \-t, \-x, \-D, \-I, \-S) do not

currently work with regular expressions.

The maximal number of errors for regular expressions that use '*'

or '|' is 4.
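The `and' and `or' operations described under Boolean operations can be pictured with a few lines of Python. This is a simplified sketch for exact matching of plain words only: the function name matches is hypothetical, and errors, wild cards and escaping are all ignored.

```python
def matches(record, pattern):
    """Exact-match reading of the ';' (and) and ',' (or) operators."""
    if ";" in pattern and "," in pattern:
        # mirrors the rule that the two operations cannot be combined
        raise ValueError("';' and ',' cannot be combined in one pattern")
    if ";" in pattern:
        return all(word in record for word in pattern.split(";"))
    return any(word in record for word in pattern.split(","))


print(matches("a fast local network", "fast;network"))
print(matches("a fast car", "fast;network"))
print(matches("a fast car", "fast,network"))
```

A pattern without either operator falls through to the last line and is treated as a single word.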

.SH EXAMPLES

.LP

.TP

agrep \-2 \-c ABCDEFG foo

gives the number of lines in file foo that contain ABCDEFG

within two errors.

.TP

agrep \-1 \-D2 \-S2 'ABCD#YZ' foo

outputs the lines containing ABCD followed, within arbitrary

distance, by YZ, with up to one additional insertion

(\-D2 and \-S2 make deletions and substitutions too "expensive").

.TP

agrep \-5 \-p abcdefghij /usr/dict/words

outputs the list of all words containing at least 5 of the first 10

letters of the alphabet \fIin order\fR. (Try it: any list starting

with academia and ending with sacrilegious must mean something!)

.TP

agrep \-1 'abc[0-9](de|fg)*[x-z]' foo

outputs the lines containing, within up to one error, the string

that starts with abc followed by one digit, followed by zero or more

repetitions of either de or fg, followed by either x, y, or z.

.TP

agrep \-d '^From\ ' 'breakdown;internet' mbox

outputs all mail messages (the pattern '^From\ ' separates mail messages

in a mail file) that contain keywords 'breakdown' and 'internet'.

.TP

agrep \-d '$$' \-1 '<word1> <word2>' foo

finds all paragraphs that contain word1 followed by word2 with one

error in place of the blank.

In particular, if word1 is the last word in a line and word2

is the first word in the next line, then the space will be

substituted by a newline symbol and it will match.

Thus, this is a way to overcome separation by a newline.

Note that \-d '$$' (or another delim which spans more than one line)

is necessary, because otherwise agrep searches

only one line at a time.

.TP

agrep '^agrep' <this manual>

outputs all the examples of the use of agrep in this man page.
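One way to read the -5 -p example above is that a word qualifies when enough letters of the pattern survive in it in the same order, which can be checked with a longest-common-subsequence computation. The Python sketch below follows that reading. It is an interpretation rather than a description of agrep's internals, and both function names are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of strings a and b."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def keeps_in_order(pattern, word, errors):
    """True if at most `errors` pattern letters are missing from `word`,
    with the remaining letters appearing in pattern order."""
    return lcs_length(pattern, word) >= len(pattern) - errors


for word in ("academia", "sacrilegious", "hello"):
    print(word, keeps_in_order("abcdefghij", word, 5))
```

With errors set to 0 the same check reduces to the plain -p case, for instance the pattern DCS against "Department of Computer Science."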

.PD

.SH "SEE ALSO"

.BR ed (1),

.BR ex (1),

.BR grep (1V),

.BR sh (1),

.BR csh (1).

.SH BUGS/LIMITATIONS

Any bug reports or comments will be appreciated!

Please mail them to sw@cs.arizona.edu or udi@cs.arizona.edu.

.LP

Regular expressions do not support the '+' operator (match 1 or more

instances of the preceding token). These can be searched for by using

this syntax in the pattern:

.sp

.in 1.0i

\&'\fIpattern\fB(\fIpattern\fB)*\fR'

.in

.sp

(search for strings containing one instance of the pattern, followed by 0 or

more instances of the pattern).
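The rewrite can be checked with any regular-expression engine that does support '+'. The Python sketch below uses Python's re module as a stand-in, not agrep itself, to show that the expanded form accepts the same strings. The helper name one_or_more is hypothetical, and it wraps the first copy of the pattern in parentheses as a precaution for patterns that contain '|'.

```python
import re


def one_or_more(pattern):
    """Rewrite 'pattern+' without '+', as pattern followed by (pattern)*."""
    return f"({pattern})({pattern})*"


expanded = one_or_more("ab")
for candidate in ("", "ab", "abab", "aba"):
    with_plus = re.fullmatch("(ab)+", candidate) is not None
    without_plus = re.fullmatch(expanded, candidate) is not None
    print(repr(candidate), with_plus, without_plus)
```

Both columns agree for every candidate, since one required copy followed by zero or more further copies is the definition of one or more.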

.LP

The following can cause an infinite loop:

.B agrep

pattern * > output_file.

If the number of matches is high, they may be deposited in

output_file before it is completely read leading to more matches of

the pattern within output_file (the matches are against the whole

directory). It's not clear whether this is a "bug" (grep will do the

same), but be warned.

.LP

The maximum size of the

.I patternfile

is limited to 250Kb, and the maximum number of patterns

is limited to 30,000.

.LP

Standard input is the default if no input file is given.

However, if standard input is keyed in directly (as opposed to through

a pipe, for example) agrep may not work for some non-simple patterns.

.LP

There is no size limit for simple patterns.

More complicated patterns are currently limited to approximately 30 characters.

Lines are limited to 1024 characters.

Records are limited to 48K, and may be truncated if they are larger

than that.

The limit of record length can be

changed by modifying the parameter Max_record in agrep.h.

.SH DIAGNOSTICS

Exit status is 0 if any matches are found,

1 if none, 2 for syntax errors or inaccessible files.

.SH AUTHORS

Sun Wu and Udi Manber, Department of Computer Science, University of

Arizona, Tucson, AZ 85721. {sw|udi}@cs.arizona.edu.