Carrot2 v3.6.0-SNAPSHOT API Documentation
java.lang.Object
  org.carrot2.text.preprocessing.PreprocessingContext
public final class PreprocessingContext
Document preprocessing context provides low-level (usually integer-coded) data structures useful for further processing.
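To illustrate what "integer-coded" means here, the following is a minimal, self-contained sketch and not Carrot2 code. It assumes a simple scheme in which every unique word gets an integer index and the token stream is stored as an array of those indexes. The names `IntegerCodingSketch`, `Coded`, `encode` and `tokenWordIndex` are hypothetical and chosen only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IntegerCodingSketch {
    /** Hypothetical holder: unique words plus the token stream coded as indexes into them. */
    static final class Coded {
        final List<String> words = new ArrayList<>();
        int[] tokenWordIndex;
    }

    /** Assigns each distinct token an index in order of first appearance. */
    static Coded encode(String[] tokens) {
        Coded coded = new Coded();
        Map<String, Integer> indexOf = new HashMap<>();
        coded.tokenWordIndex = new int[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            Integer index = indexOf.get(tokens[i]);
            if (index == null) {
                // First occurrence: register the word under the next free index.
                index = coded.words.size();
                indexOf.put(tokens[i], index);
                coded.words.add(tokens[i]);
            }
            coded.tokenWordIndex[i] = index;
        }
        return coded;
    }

    public static void main(String[] args) {
        Coded coded = encode(new String[] {"data", "mining", "data", "clustering"});
        System.out.println(coded.words);                           // [data, mining, clustering]
        System.out.println(Arrays.toString(coded.tokenWordIndex)); // [0, 1, 0, 2]
    }
}
```

Later processing steps can then compare and count integers instead of strings, which is the usual motivation for this kind of coding.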

| Nested Class Summary | |
|---|---|
| static class | PreprocessingContext.AllFields: Information about all fields processed for the input documents. |
| class | PreprocessingContext.AllLabels: Information about words and phrases that might be good cluster label candidates. |
| class | PreprocessingContext.AllPhrases: Information about all frequently appearing sequences of words found in the input documents. |
| class | PreprocessingContext.AllStems: Information about all unique stems found in the input documents. |
| class | PreprocessingContext.AllTokens: Information about all tokens of the input documents. |
| class | PreprocessingContext.AllWords: Information about all unique words found in the input documents. |

| Field Summary | |
|---|---|
| PreprocessingContext.AllFields | allFields: Information about all fields processed for the input documents. |
| PreprocessingContext.AllLabels | allLabels: Information about words and phrases that might be good cluster label candidates. |
| PreprocessingContext.AllPhrases | allPhrases: Information about all frequently appearing sequences of words found in the input documents. |
| PreprocessingContext.AllStems | allStems: Information about all unique stems found in the input documents. |
| PreprocessingContext.AllTokens | allTokens: Information about all tokens of the input documents. |
| PreprocessingContext.AllWords | allWords: Information about all unique words found in the input documents. |
| List<Document> | documents: A list of documents to process. |
| LanguageModel | language: Language model to be used. |
| String | query: Query used to perform processing; may be null. |

| Constructor Summary | |
|---|---|
| PreprocessingContext(LanguageModel languageModel, List<Document> documents, String query) | Creates a preprocessing context for the provided documents, using the provided languageModel. |

| Method Summary | |
|---|---|
| boolean | hasLabels(): Returns true if this context contains any label candidates. |
| boolean | hasWords(): Returns true if this context contains any words. |
| char[] | intern(MutableCharArray chs): Returns a unique char buffer representing a given character sequence. |
| void | preprocessingFinished(): This method should be invoked after all preprocessing contributors have been executed to release temporary data structures. |
| static int[] | toFieldIndexes(byte b): Converts the selected bits in a byte to an array of indexes. |
| String | toString() |

| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |

| Field Detail |
|---|
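public final String query
Query used to perform processing; may be null.
public final List<Document> documents
A list of documents to process.
public final LanguageModel language
Language model to be used.
public final PreprocessingContext.AllTokens allTokens
Information about all tokens of the input documents.
public final PreprocessingContext.AllFields allFields
Information about all fields processed for the input documents.
public final PreprocessingContext.AllWords allWords
Information about all unique words found in the input documents.
public final PreprocessingContext.AllStems allStems
Information about all unique stems found in the input documents.
public PreprocessingContext.AllPhrases allPhrases
Information about all frequently appearing sequences of words found in the input documents.
public final PreprocessingContext.AllLabels allLabels
Information about words and phrases that might be good cluster label candidates.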
| Constructor Detail |
|---|
public PreprocessingContext(LanguageModel languageModel,
                            List<Document> documents,
                            String query)
Creates a preprocessing context for the provided documents, using the provided languageModel.
| Method Detail |
|---|
public boolean hasWords()
Returns true if this context contains any words.
public boolean hasLabels()
Returns true if this context contains any label candidates.
public String toString()
Overrides: toString in class Object
public static int[] toFieldIndexes(byte b)
Converts the selected bits in a byte to an array of indexes.
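The conversion of set bits to indexes described for toFieldIndexes(byte b) can be sketched as follows. This is a standalone illustration, not the Carrot2 implementation: it assumes that index 0 corresponds to the least significant bit and that indexes are returned in ascending order, neither of which is stated on this page. The class name `FieldIndexesSketch` is hypothetical.

```java
import java.util.Arrays;

public class FieldIndexesSketch {
    /**
     * Returns the positions of the bits set in b, in ascending order.
     * Assumption: position 0 is the least significant bit.
     */
    static int[] toFieldIndexes(byte b) {
        int bits = b & 0xFF; // treat the byte as unsigned so bit 7 is handled like the others
        int[] result = new int[Integer.bitCount(bits)];
        int next = 0;
        for (int i = 0; i < 8; i++) {
            if ((bits & (1 << i)) != 0) {
                result[next++] = i;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Binary 00000101 has bits 0 and 2 set.
        System.out.println(Arrays.toString(toFieldIndexes((byte) 0b0000_0101))); // [0, 2]
    }
}
```

A byte used this way can flag up to eight fields, one per bit.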
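public void preprocessingFinished()
This method should be invoked after all preprocessing contributors have been executed to release temporary data structures.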
public char[] intern(MutableCharArray chs)
Returns a unique char buffer representing a given character sequence.
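The idea behind intern, returning one shared char buffer per distinct character sequence, can be sketched with a plain map. This is an illustration under assumptions and not the Carrot2 implementation: it accepts a `CharSequence` in place of Carrot2's MutableCharArray so that it compiles without the library, and the class name `CharInternSketch` and its `pool` field are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class CharInternSketch {
    // Maps the textual content to the single buffer handed out for it.
    private final Map<String, char[]> pool = new HashMap<>();

    /** Returns the same char[] instance for every sequence with equal content. */
    char[] intern(CharSequence chs) {
        return pool.computeIfAbsent(chs.toString(), s -> s.toCharArray());
    }

    public static void main(String[] args) {
        CharInternSketch ctx = new CharInternSketch();
        char[] a = ctx.intern("cluster");
        char[] b = ctx.intern(new StringBuilder("cluster"));
        System.out.println(a == b); // true: equal content yields one shared buffer
    }
}
```

Sharing buffers this way lets callers compare interned sequences by reference and avoids keeping duplicate copies of the same text.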
Please refer to project documentation at http://project.carrot2.org