Carrot2 v3.5.2 API Documentation
java.lang.Object
  org.carrot2.text.preprocessing.pipeline.BasicPreprocessingPipeline
public class BasicPreprocessingPipeline
implements IPreprocessingPipeline
Performs basic preprocessing steps on the provided documents. The preprocessing consists of the following steps:

1. Tokenizer.tokenize(PreprocessingContext)
2. CaseNormalizer.normalize(PreprocessingContext)
3. LanguageModelStemmer.stem(PreprocessingContext)
4. StopListMarker.mark(PreprocessingContext)
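The sequence of steps above can be pictured with a small, self-contained sketch. It is not Carrot2 code: the `PipelineSketch` class, its `Context` holder, the toy stop list, and the trailing-"s" stemmer are all hypothetical stand-ins, chosen only to show how each step reads what the previous step wrote into a shared context.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Illustrative stand-in for a four-step preprocessing pipeline; not the Carrot2 implementation. */
public class PipelineSketch {
    /** Hypothetical shared state that every step reads from and appends to. */
    static class Context {
        final String text;
        final List<String> tokens = new ArrayList<>();
        final List<String> normalized = new ArrayList<>();
        final List<String> stems = new ArrayList<>();
        final List<Boolean> stopWord = new ArrayList<>();

        Context(String text) {
            this.text = text;
        }
    }

    /** Toy stop list (an assumption for this sketch only). */
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("the", "a", "of", "and"));

    /** Step 1: split the raw text on anything that is not a letter or digit. */
    static void tokenize(Context c) {
        for (String t : c.text.split("[^\\p{L}\\p{Nd}]+")) {
            if (!t.isEmpty()) {
                c.tokens.add(t);
            }
        }
    }

    /** Step 2: lower-case every token. */
    static void normalizeCase(Context c) {
        for (String t : c.tokens) {
            c.normalized.add(t.toLowerCase(Locale.ROOT));
        }
    }

    /** Step 3: toy stemmer that strips a trailing "s" from words longer than three characters. */
    static void stem(Context c) {
        for (String t : c.normalized) {
            boolean strip = t.length() > 3 && t.endsWith("s");
            c.stems.add(strip ? t.substring(0, t.length() - 1) : t);
        }
    }

    /** Step 4: flag each normalized token that appears in the stop list. */
    static void markStopWords(Context c) {
        for (String t : c.normalized) {
            c.stopWord.add(STOP_WORDS.contains(t));
        }
    }

    /** Runs the four steps in the order listed in the class description. */
    static Context preprocess(String text) {
        Context c = new Context(text);
        tokenize(c);
        normalizeCase(c);
        stem(c);
        markStopWords(c);
        return c;
    }

    public static void main(String[] args) {
        Context c = preprocess("The Clusters of Documents");
        System.out.println(c.tokens);   // [The, Clusters, of, Documents]
        System.out.println(c.stems);    // [the, cluster, of, document]
        System.out.println(c.stopWord); // [true, false, true, false]
    }
}
```

The order matters in this sketch: stemming and stop-word marking both operate on the case-normalized tokens, so they cannot run before the first two steps have filled the context.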
| Field Summary | |
|---|---|
| CaseNormalizer | caseNormalizer: Case normalizer used by the algorithm, contains bindable attributes. |
| LanguageModelStemmer | languageModelStemmer: Stemmer used by the algorithm, contains bindable attributes. |
| ILexicalDataFactory | lexicalDataFactory: Lexical data factory. |
| IStemmerFactory | stemmerFactory: Stemmer factory. |
| StopListMarker | stopListMarker: Stop list marker used by the algorithm, contains bindable attributes. |
| Tokenizer | tokenizer: Tokenizer used by the algorithm, contains bindable attributes. |
| ITokenizerFactory | tokenizerFactory: Tokenizer factory. |
| Constructor Summary | |
|---|---|
| BasicPreprocessingPipeline() | |
| Method Summary | |
|---|---|
| PreprocessingContext | preprocess(List<Document> documents, String query, LanguageCode language): Performs preprocessing on the provided list of documents. |
| void | preprocess(PreprocessingContext context): Performs preprocessing on the provided PreprocessingContext. |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public final Tokenizer tokenizer
public final CaseNormalizer caseNormalizer
public final LanguageModelStemmer languageModelStemmer
public final StopListMarker stopListMarker
public ITokenizerFactory tokenizerFactory
public IStemmerFactory stemmerFactory
public ILexicalDataFactory lexicalDataFactory
| Constructor Detail |
|---|
public BasicPreprocessingPipeline()
| Method Detail |
|---|
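public PreprocessingContext preprocess(List<Document> documents,
                                       String query,
                                       LanguageCode language)
Performs preprocessing on the provided list of documents. The results are available in the returned PreprocessingContext.
Specified by: preprocess in interface IPreprocessingPipeline

public void preprocess(PreprocessingContext context)
Performs preprocessing on the provided PreprocessingContext.
Specified by: preprocess in interface IPreprocessingPipeline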
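The two `preprocess` overloads differ in who owns the context: one returns a new context, the other fills a context supplied by the caller. A minimal sketch of that contract follows. It is an assumption that the convenience overload simply builds a context and delegates; the documentation above does not confirm this. `OverloadSketch`, its `Context` class, and the whitespace tokenizer are hypothetical stand-ins, and plain strings replace the `Document` and `LanguageCode` types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative stand-in for the two overload shapes; not the Carrot2 implementation. */
public class OverloadSketch {
    /** Hypothetical context: the inputs plus whatever the preprocessing steps produce. */
    static class Context {
        final List<String> documents;
        final String query;
        final String language;
        final List<String> tokens = new ArrayList<>();

        Context(List<String> documents, String query, String language) {
            this.documents = documents;
            this.query = query;
            this.language = language;
        }
    }

    /** Void overload: mutates the caller-supplied context in place. */
    static void preprocess(Context context) {
        for (String doc : context.documents) {
            // Whitespace splitting stands in for the real preprocessing steps.
            for (String t : doc.split("\\s+")) {
                if (!t.isEmpty()) {
                    context.tokens.add(t);
                }
            }
        }
    }

    /** Convenience overload: creates the context, preprocesses it, and returns it (assumed delegation). */
    static Context preprocess(List<String> documents, String query, String language) {
        Context context = new Context(documents, query, language);
        preprocess(context);
        return context;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("data mining", "web search");

        // Let the pipeline create the context.
        Context created = preprocess(docs, "mining", "ENGLISH");
        System.out.println(created.tokens); // [data, mining, web, search]

        // Or supply a context and have it filled in place.
        Context supplied = new Context(docs, "mining", "ENGLISH");
        preprocess(supplied);
        System.out.println(supplied.tokens); // [data, mining, web, search]
    }
}
```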
Please refer to project documentation at http://project.carrot2.org