Links to resources

Quick Introduction
An account of Snowball
How You Can Help

Snowball
the manual
how to run it

Tar gzipped files of Snowball sources

Stemmers
English (porter)
English (porter2)
A note on early English
Romance stemmers
French
Spanish
Portuguese
Italian
Romanian
Germanic stemmers
German
(German variant)
Dutch
Scandinavian stemmers
Swedish
Norwegian
Danish
Russian
Finnish
Character codes

Contributed stemmers in other programming languages

Wrappers

External Contributions
Object Pascal code generator for Snowball
Two stemmers for Romanian
Hungarian
Turkish
Armenian
Basque (Euskera)
Catalan

Other work
The Schinke Latin stemmer
The Lovins English stemmer
The Kraaij/Pohlmann Dutch stemmer


Snowball is a small string processing language designed for creating stemming algorithms for use in Information Retrieval. This site describes Snowball, and presents several useful stemmers which have been implemented using it.



(Since it effectively provides a ‘suffix STRIPPER GRAMmar’, I had toyed with the idea of calling it ‘strippergram’, but good sense has prevailed, and so it is ‘Snowball’, named as a tribute to SNOBOL, the excellent string handling language of Messrs Farber, Griswold, Poage and Polonsky from the 1960s.

- Martin Porter)
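
To give a rough feel for what suffix stripping means in an Information Retrieval setting, here is a minimal sketch in Python. It is not written in Snowball and it is not any of the stemmers listed on this site: the suffix list, the minimum stem length, and the names toy_stem and index_terms are all assumptions made up for illustration. Real stemmers apply far more careful, language-specific rules.

```python
# Toy suffix stripper: illustrative only, NOT the Porter algorithm and NOT Snowball output.
# The suffix list and the minimum stem length are assumptions chosen for this sketch.
SUFFIXES = ("ations", "ation", "ingly", "ings", "ing", "edly", "ed", "es", "s")
MIN_STEM_LENGTH = 3


def toy_stem(word):
    """Strip the longest matching suffix, provided the remaining stem is long enough."""
    word = word.lower()
    # Try longer suffixes first so that "ings" is preferred over "s".
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        stem = word[: len(word) - len(suffix)]
        if word.endswith(suffix) and len(stem) >= MIN_STEM_LENGTH:
            return stem
    return word


def index_terms(text):
    """Map each whitespace-separated token to its toy stem, as a search index might."""
    return {toy_stem(token) for token in text.split()}


if __name__ == "__main__":
    # Different surface forms of one word collapse to a single index term.
    print(index_terms("connect connected connecting connects"))
```

The point of the sketch is only that several surface forms of a word end up under one index term, which is what a retrieval system wants from a stemmer. Short words such as "sing" are left alone here because stripping "ing" would leave too little behind.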


Please address all Snowball-related mail to snowball-discuss@lists.tartarus.org. Any such mail sent directly to Martin Porter or Richard Boulton may be answered less speedily, and in any case they reserve the right to post their answers on snowball-discuss.