org.carrot2.text.preprocessing
Class Tokenizer

java.lang.Object
  extended by org.carrot2.text.preprocessing.Tokenizer

public final class Tokenizer
extends Object

Performs tokenization of documents.

This class saves its tokenization results to the PreprocessingContext.

Field Summary
 Collection<String> documentFields
          Textual fields of documents that should be tokenized and parsed for clustering.
 
Constructor Summary
Tokenizer()
           
 
Method Summary
 void tokenize(PreprocessingContext context)
          Performs tokenization and saves the results to the context.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

documentFields

public Collection<String> documentFields
Textual fields of documents that should be tokenized and parsed for clustering.

Attribute label:
Document fields
Attribute level:
Advanced
Attribute group:
Preprocessing
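
The following is a minimal sketch of how a collection of field names, such as documentFields, can restrict which parts of a document are processed. It is not Carrot2 code: the class DocumentFieldsSketch, the method selectFieldValues, the map-based document and the field names "title", "snippet" and "url" are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration; not part of Carrot2. */
final class DocumentFieldsSketch {

    /**
     * Returns the values of the selected fields, in the order the field
     * names are listed. Fields missing from the document are skipped.
     */
    static List<String> selectFieldValues(Map<String, String> document,
                                          Collection<String> documentFields) {
        List<String> values = new ArrayList<>();
        for (String field : documentFields) {
            String value = document.get(field);
            if (value != null) {
                values.add(value);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // A document modeled as a plain map (assumption made for this sketch).
        Map<String, String> document = new LinkedHashMap<>();
        document.put("title", "Data mining");
        document.put("snippet", "Clustering of search results");
        document.put("url", "http://example.com/");

        // Only the listed textual fields are passed on; "url" is ignored.
        System.out.println(selectFieldValues(document, Arrays.asList("title", "snippet")));
    }
}
```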
Constructor Detail

Tokenizer

public Tokenizer()
Method Detail

tokenize

public void tokenize(PreprocessingContext context)
Performs tokenization and saves the results to the context.
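
As a rough illustration of the "tokenize, then save to the context" pattern, the sketch below splits the selected fields of each document into tokens and records them in a shared context object. It does not use the Carrot2 API: SketchContext and its lists are hypothetical stand-ins for PreprocessingContext, and splitting on whitespace is an assumption, since this page does not describe the tokenization rules or the exact results the real class stores.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration; not part of Carrot2. */
final class TokenizeSketch {

    /** Hypothetical stand-in for a preprocessing context. */
    static final class SketchContext {
        final List<Map<String, String>> documents;
        // Parallel lists: entry i of each list describes token i.
        final List<String> tokenImages = new ArrayList<>();
        final List<Integer> tokenDocumentIndex = new ArrayList<>();
        final List<String> tokenFieldName = new ArrayList<>();

        SketchContext(List<Map<String, String>> documents) {
            this.documents = documents;
        }
    }

    /** Splits the selected fields on whitespace (an assumption) and saves the tokens to the context. */
    static void tokenize(SketchContext context, Collection<String> documentFields) {
        for (int d = 0; d < context.documents.size(); d++) {
            Map<String, String> document = context.documents.get(d);
            for (String field : documentFields) {
                String text = document.get(field);
                if (text == null) {
                    continue; // the document has no such field
                }
                for (String token : text.trim().split("\\s+")) {
                    if (token.isEmpty()) {
                        continue; // produced when the field text is blank
                    }
                    context.tokenImages.add(token);
                    context.tokenDocumentIndex.add(d);
                    context.tokenFieldName.add(field);
                }
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> document = new LinkedHashMap<>();
        document.put("title", "Data mining");
        document.put("snippet", "Clustering of search results");

        SketchContext context = new SketchContext(Arrays.asList(document));
        tokenize(context, Arrays.asList("title", "snippet"));

        System.out.println(context.tokenImages);
        System.out.println(context.tokenDocumentIndex);
        System.out.println(context.tokenFieldName);
    }
}
```

Keeping the results in parallel lists lets later steps look up a token's source document and field by position, without re-reading the original text.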



Copyright (c) Dawid Weiss, Stanislaw Osinski