~slub.team/goobi-indexserver/3.x


Viewing changes to lucene/contrib/analyzers/common/src/resources/org/apache/lucene/analysis/snowball/english_stop.txt

  • Committer: Sebastian Meyer
  • Date: 2012-08-03 09:12:40 UTC
  • Revision ID: sebastian.meyer@slub-dresden.de-20120803091240-x6861b0vabq1xror
Remove Lucene and Solr source code and add patches instead
Fix Bug #985487: Auto-suggestion for the search interface

 | From svn.tartarus.org/snowball/trunk/website/algorithms/english/stop.txt
 | This file is distributed under the BSD License.
 | See http://snowball.tartarus.org/license.php
 | Also see http://www.opensource.org/licenses/bsd-license.html
 |  - Encoding was converted to UTF-8.
 |  - This notice was added.

 | An English stop word list. Comments begin with vertical bar. Each stop
 | word is at the start of a line.

 | Many of the forms below are quite rare (e.g. "yourselves") but included for
 |  completeness.
 
           | PRONOUNS FORMS
             | 1st person sing

i              | subject, always in upper case of course

me             | object
my             | possessive adjective
               | the possessive pronoun `mine' is best suppressed, because of the
               | sense of coal-mine etc.
myself         | reflexive
             | 1st person plural
we             | subject

| us           | object
               | care is required here because US = United States. It is usually
               | safe to remove it if it is in lower case.
our            | possessive adjective
ours           | possessive pronoun
ourselves      | reflexive
             | second person (archaic `thou' forms not included)
you            | subject and object
your           | possessive adjective
yours          | possessive pronoun
yourself       | reflexive (singular)
yourselves     | reflexive (plural)
             | third person singular
he             | subject
him            | object
his            | possessive adjective and pronoun
himself        | reflexive

she            | subject
her            | object and possessive adjective
hers           | possessive pronoun
herself        | reflexive

it             | subject and object
its            | possessive adjective
itself         | reflexive
             | third person plural
they           | subject
them           | object
their          | possessive adjective
theirs         | possessive pronoun
themselves     | reflexive
             | other forms (demonstratives, interrogatives)
what
which
who
whom
this
that
these
those
 
           | VERB FORMS (using F.R. Palmer's nomenclature)
             | BE
am             | 1st person, present
is             | -s form (3rd person, present)
are            | present
was            | 1st person, past
were           | past
be             | infinitive
been           | past participle
being          | -ing form
             | HAVE
have           | simple
has            | -s form
had            | past
having         | -ing form
             | DO
do             | simple
does           | -s form
did            | past
doing          | -ing form
 
 | The forms below are, I believe, best omitted, because of the significant
 | homonym forms:

 |  He made a WILL
 |  old tin CAN
 |  merry month of MAY
 |  a smell of MUST
 |  fight the good fight with all thy MIGHT

 | would, could, should, ought might however be included

 |          | AUXILIARIES
 |            | WILL
 |will

would

 |            | SHALL
 |shall

should

 |            | CAN
 |can

could

 |            | MAY
 |may
 |might
 |            | MUST
 |must
 |            | OUGHT

ought
 
           | COMPOUND FORMS, increasingly encountered nowadays in 'formal' writing
              | pronoun + verb

i'm
you're
he's
she's
it's
we're
they're
i've
you've
we've
they've
i'd
you'd
he'd
she'd
we'd
they'd
i'll
you'll
he'll
she'll
we'll
they'll

              | verb + negation

isn't
aren't
wasn't
weren't
hasn't
haven't
hadn't
doesn't
don't
didn't

              | auxiliary + negation

won't
wouldn't
shan't
shouldn't
can't
cannot
couldn't
mustn't

             | miscellaneous forms

let's
that's
who's
what's
here's
there's
when's
where's
why's
how's

              | rarer forms

 | daren't needn't

              | doubtful forms

 | oughtn't mightn't
 
           | ARTICLES
a
an
the

           | THE REST (Overlap among prepositions, conjunctions, adverbs etc is so
           | high, that classification is pointless.)
and
but
if
or
because
as
until
while

of
at
by
for
with
about
against
between
into
through
during
before
after
above
below
to
from
up
down
in
out
on
off
over
under

again
further
then
once

here
there
when
where
why
how

all
any
both
each
few
more
most
other
some
such

no
nor
not
only
own
same
so
than
too
very
 
 | Just for the record, the following words are among the commonest in English

    | one
    | every
    | least
    | less
    | many
    | now
    | ever
    | never
    | say
    | says
    | said
    | also
    | get
    | go
    | goes
    | just
    | made
    | make
    | put
    | see
    | seen
    | whether
    | like
    | well
    | back
    | even
    | still
    | way
    | take
    | since
    | another
    | however
    | two
    | three
    | four
    | five
    | first
    | second
    | new
    | old
    | high
    | long
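
The file's header states two format rules: comments begin with a vertical bar, and each stop word is at the start of a line. The following is a minimal Python sketch of a reader for that format, not the loader Lucene or Snowball actually uses. The function name `parse_snowball_stopwords` and the `SAMPLE` text are hypothetical, and taking only the first token on each line is an assumption drawn from the header's wording.

```python
def parse_snowball_stopwords(text):
    """Return the set of active stop words in a Snowball-style list.

    Assumptions (taken from the file's own header comment):
      * everything from the first '|' to the end of the line is a comment;
      * the stop word, if any, is the first token on the line.
    """
    words = set()
    for line in text.splitlines():
        # Drop the comment part, then surrounding whitespace.
        entry = line.split("|", 1)[0].strip()
        if entry:
            # Keep only the first whitespace-delimited token.
            words.add(entry.split()[0])
    return words


# A few lines in the same style as the list above (illustrative sample).
SAMPLE = """\
 | An English stop word list.
i              | subject, always in upper case of course
me             | object
| us           | object
 |will
would
    | one
"""

if __name__ == "__main__":
    print(sorted(parse_snowball_stopwords(SAMPLE)))  # ['i', 'me', 'would']
```

Entries that are commented out, such as `| us`, ` |will`, and the "commonest words" at the end of the file, start with a vertical bar and therefore never reach the result set under these rules.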