org.carrot2.text.preprocessing
Class PreprocessingContext.AllPhrases

java.lang.Object
  extended by org.carrot2.text.preprocessing.PreprocessingContext.AllPhrases
Enclosing class:
PreprocessingContext

public class PreprocessingContext.AllPhrases
extends Object

Information about all frequently appearing sequences of words found in the input PreprocessingContext.documents. Each entry in each array corresponds to one sequence.

All arrays in this class have the same length, and the values stored at a given index in different arrays all describe the same phrase.
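
The parallel-array layout described above can be illustrated with a minimal sketch. It does not use the Carrot2 classes: the `tf` and `wordIndices` arrays below are hypothetical stand-ins holding made-up data, and `ParallelArraysSketch` and `mostFrequent` are names introduced only for this example.

```java
final class ParallelArraysSketch {
    /** Returns the index of the largest value in tf, or -1 if tf is empty. */
    static int mostFrequent(int[] tf) {
        int best = -1;
        for (int i = 0; i < tf.length; i++) {
            if (best < 0 || tf[i] > tf[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical data for three phrases (assumption: stands in for the
        // tf and wordIndices fields of AllPhrases).
        int[] tf = {4, 9, 2};
        int[][] wordIndices = {{0, 1}, {2, 3, 4}, {1, 5}};

        // The same index selects the matching entry in every array.
        int i = mostFrequent(tf);
        System.out.println("phrase " + i + ": tf=" + tf[i]
                + ", words=" + wordIndices[i].length);
    }
}
```

Because one index addresses the same phrase in every array, a phrase can be passed around as a single `int` rather than as an object.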


Field Summary
 int[] tf
          Term frequency of the phrase.
 int[][] tfByDocument
          Term frequency of the phrase for each document.
 int[][] wordIndices
          Pointers to PreprocessingContext.AllWords for each word in the phrase sequence.
 
Constructor Summary
PreprocessingContext.AllPhrases()
           
 
Method Summary
 CharSequence getPhrase(int index)
          Returns the space-separated words that constitute the phrase at the given index.
 int size()
          Returns the length of each array in this PreprocessingContext.AllPhrases.
 String toString()
          For debugging purposes.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

wordIndices

public int[][] wordIndices
Pointers to PreprocessingContext.AllWords for each word in the phrase sequence.

This array is produced by PhraseExtractor.
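
As a rough illustration of how such pointers can be resolved into text, the sketch below joins the words of one phrase with spaces. It is not the library's implementation: `wordImages` is a hypothetical `String[]` standing in for the word data held by PreprocessingContext.AllWords, and `PhraseTextSketch` and `phraseText` are names made up for this example.

```java
final class PhraseTextSketch {
    /**
     * Joins the words of one phrase with single spaces.
     *
     * @param phraseWordIndices one row of a wordIndices-style array
     * @param wordImages hypothetical lookup table: word index to word text
     */
    static String phraseText(int[] phraseWordIndices, String[] wordImages) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < phraseWordIndices.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(wordImages[phraseWordIndices[i]]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up word table and phrase, for illustration only.
        String[] wordImages = {"mining", "text", "data"};
        int[] phrase = {2, 0};
        System.out.println(phraseText(phrase, wordImages));
    }
}
```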


tf

public int[] tf
Term frequency of the phrase.

This array is produced by PhraseExtractor.


tfByDocument

public int[][] tfByDocument
Term frequency of the phrase for each document. The array is encoded like PreprocessingContext.AllWords.tfByDocument: each row is a flat sequence of consecutive (document index, frequency) pairs.

This array is produced by PhraseExtractor.
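
The pair encoding can be unpacked as in the following sketch, which expands one row into a dense per-document array. The class and method names are hypothetical, the input row is made-up data, and the `documentCount` parameter is an assumption of this example (the caller is expected to know how many documents there are).

```java
final class TfByDocumentSketch {
    /**
     * Expands a flat [docIndex, freq, docIndex, freq, ...] row into an array
     * indexed by document. Documents absent from the row keep frequency 0.
     */
    static int[] toDense(int[] pairs, int documentCount) {
        int[] dense = new int[documentCount];
        for (int i = 0; i + 1 < pairs.length; i += 2) {
            int documentIndex = pairs[i];
            int frequency = pairs[i + 1];
            dense[documentIndex] += frequency;
        }
        return dense;
    }

    public static void main(String[] args) {
        // Made-up row: 2 occurrences in document 0, 1 occurrence in document 3.
        int[] row = {0, 2, 3, 1};
        System.out.println(java.util.Arrays.toString(toDense(row, 4)));
    }
}
```

The flat encoding stores entries only for documents in which the phrase occurs, which keeps rows short when a phrase appears in few documents.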

Constructor Detail

PreprocessingContext.AllPhrases

public PreprocessingContext.AllPhrases()
Method Detail

toString

public String toString()
For debugging purposes.

Overrides:
toString in class Object

getPhrase

public CharSequence getPhrase(int index)
Returns the space-separated words that constitute the phrase at the given index.


size

public int size()
Returns the length of each array in this PreprocessingContext.AllPhrases.



Copyright (c) Dawid Weiss, Stanislaw Osinski