org.carrot2.clustering.lingo
Class LingoClusteringAlgorithm

java.lang.Object
  extended by org.carrot2.core.ProcessingComponentBase
      extended by org.carrot2.clustering.lingo.LingoClusteringAlgorithm
All Implemented Interfaces:
IClusteringAlgorithm, IProcessingComponent

public class LingoClusteringAlgorithm
extends ProcessingComponentBase
implements IClusteringAlgorithm

Lingo clustering algorithm.


Field Summary
 ClusterBuilder clusterBuilder
          Cluster label builder, contains bindable attributes.
 List<Cluster> clusters
           
 List<Document> documents
          Documents to cluster.
 LabelFormatter labelFormatter
          Cluster label formatter, contains bindable attributes.
 TermDocumentMatrixBuilder matrixBuilder
          Term-document matrix builder for the algorithm, contains bindable attributes.
 TermDocumentMatrixReducer matrixReducer
          Term-document matrix reducer for the algorithm, contains bindable attributes.
 MultilingualClustering multilingualClustering
          A helper for performing multilingual clustering.
 boolean nativeMatrixUsed
          Indicates whether Lingo used fast native matrix computation routines.
 CompletePreprocessingPipeline preprocessingPipeline
          Common preprocessing tasks handler.
 String query
          Query that produced the documents.
 double scoreWeight
          Balance between cluster score and size during cluster sorting.
 
Constructor Summary
LingoClusteringAlgorithm()
           
 
Method Summary
 void init(IControllerContext context)
          Invoked after component's attributes marked with Init and Input annotations have been bound, but before calls to any other methods of this component.
 void process()
          Performs Lingo clustering of documents.
 
Methods inherited from class org.carrot2.core.ProcessingComponentBase
afterProcessing, beforeProcessing, dispose, getContext, getSharedExecutor
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.carrot2.core.IProcessingComponent
afterProcessing, beforeProcessing, dispose
 

Field Detail

query

public String query
Query that produced the documents. Providing the query is optional but desirable, because it helps the algorithm create better clusters.


documents

public List<Document> documents
Documents to cluster.


clusters

public List<Cluster> clusters

nativeMatrixUsed

public boolean nativeMatrixUsed
Indicates whether Lingo used fast native matrix computation routines. The value of this attribute equals the result of NNIInterface.isNativeBlasAvailable() at the time the algorithm runs.

Attribute label:
Native matrix operations used
Attribute group:
Matrix model

scoreWeight

public double scoreWeight
Balance between cluster score and size during cluster sorting. A value of 0.0 causes Lingo to sort clusters by cluster size only. A value of 1.0 causes Lingo to sort clusters by cluster score only.

Attribute label:
Size-Score sorting ratio
Attribute level:
Medium
Attribute group:
Clusters
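
As a rough illustration of how a single weight can trade cluster size against cluster score, the sketch below ranks clusters by size^(1 - w) * score^w. This formula is an assumption, chosen only because it reproduces the two documented extremes (0.0 sorts by size only, 1.0 sorts by score only); the library's actual blending may differ. SimpleCluster, sortKey, sortedLabels and sample are hypothetical names, not part of Carrot2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class ScoreWeightSortSketch {
    /** Hypothetical stand-in for a cluster: just a label, a size and a score. */
    static final class SimpleCluster {
        final String label;
        final int size;
        final double score;

        SimpleCluster(String label, int size, double score) {
            this.label = label;
            this.size = size;
            this.score = score;
        }
    }

    /** Assumed blend: size^(1 - w) * score^w (not confirmed as the library's formula). */
    static double sortKey(SimpleCluster cluster, double scoreWeight) {
        return Math.pow(cluster.size, 1.0 - scoreWeight) * Math.pow(cluster.score, scoreWeight);
    }

    /** Returns cluster labels ordered from the highest blended key to the lowest. */
    static List<String> sortedLabels(List<SimpleCluster> clusters, double scoreWeight) {
        List<SimpleCluster> copy = new ArrayList<>(clusters);
        copy.sort(Comparator.comparingDouble(
            (SimpleCluster c) -> sortKey(c, scoreWeight)).reversed());
        List<String> labels = new ArrayList<>();
        for (SimpleCluster c : copy) {
            labels.add(c.label);
        }
        return labels;
    }

    /** Made-up sample data: A is large but low-scoring, B is small but high-scoring. */
    static List<SimpleCluster> sample() {
        return Arrays.asList(
            new SimpleCluster("A", 10, 1.0),
            new SimpleCluster("B", 3, 5.0),
            new SimpleCluster("C", 6, 2.0));
    }

    public static void main(String[] args) {
        System.out.println(sortedLabels(sample(), 0.0)); // size only
        System.out.println(sortedLabels(sample(), 1.0)); // score only
    }
}
```

With the sample data, a weight of 0.0 orders the clusters by size (A, C, B) and a weight of 1.0 orders them by score (B, C, A).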

preprocessingPipeline

public CompletePreprocessingPipeline preprocessingPipeline
Common preprocessing tasks handler.


matrixBuilder

public TermDocumentMatrixBuilder matrixBuilder
Term-document matrix builder for the algorithm, contains bindable attributes.


matrixReducer

public TermDocumentMatrixReducer matrixReducer
Term-document matrix reducer for the algorithm, contains bindable attributes.


clusterBuilder

public ClusterBuilder clusterBuilder
Cluster label builder, contains bindable attributes.


labelFormatter

public LabelFormatter labelFormatter
Cluster label formatter, contains bindable attributes.


multilingualClustering

public MultilingualClustering multilingualClustering
A helper for performing multilingual clustering.

Constructor Detail

LingoClusteringAlgorithm

public LingoClusteringAlgorithm()
Method Detail

init

public void init(IControllerContext context)
Description copied from interface: IProcessingComponent
Invoked after the component's attributes marked with Init and Input annotations have been bound, but before calls to any other methods of this component. After a call to this method completes without an exception, attributes marked with Init and Output annotations will be collected. In this method, components should perform initializations based on the initialization-time attributes. This method is called once in the lifetime of a processing component instance.

Specified by:
init in interface IProcessingComponent
Overrides:
init in class ProcessingComponentBase
Parameters:
context - An instance of IControllerContext of the controller to which this component instance will be bound.
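
The ordering guarantee described above (attributes are bound first, then init is called) can be pictured with a small reflection-based sketch. Everything in it is hypothetical: the InputAttr annotation, the attribute keys, and the bind and createBound helpers are illustrative stand-ins, not Carrot2's own annotations or binding code.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Map;

final class AttributeBindingSketch {
    /** Hypothetical marker for fields that receive values before init(). */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface InputAttr {
        String key();
    }

    /** Hypothetical component with two bindable public fields. */
    static final class SketchComponent {
        @InputAttr(key = "query")
        public String query;

        @InputAttr(key = "scoreWeight")
        public double scoreWeight;

        /** Records what init() observed, to show that binding happened first. */
        public String seenAtInit;

        void init() {
            seenAtInit = query + "/" + scoreWeight;
        }
    }

    /** Copies values from the map into annotated public fields with matching keys. */
    static void bind(Object target, Map<String, Object> values) {
        for (Field field : target.getClass().getFields()) {
            InputAttr attr = field.getAnnotation(InputAttr.class);
            if (attr == null || !values.containsKey(attr.key())) {
                continue;
            }
            try {
                field.set(target, values.get(attr.key()));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot bind " + attr.key(), e);
            }
        }
    }

    /** Binds attributes first, then calls init(), mirroring the documented order. */
    static SketchComponent createBound(Map<String, Object> values) {
        SketchComponent component = new SketchComponent();
        bind(component, values);
        component.init();
        return component;
    }
}
```

Because bind runs before init, the component's init method can rely on the bound field values being in place.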

process

public void process()
             throws ProcessingException
Performs Lingo clustering of documents.

Specified by:
process in interface IProcessingComponent
Overrides:
process in class ProcessingComponentBase
Throws:
ProcessingException - when processing failed. If thrown, the IProcessingComponent.afterProcessing() method will be called and the component will be ready to accept further requests or to be disposed of. Finally, the exception will be rethrown from the controller method that caused the component to perform processing.
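
The failure behavior described above can be sketched as a try/finally around the processing call. This is a minimal model under stated assumptions, not the controller's real code: RecordingComponent, SketchProcessingException and handleRequest are hypothetical names, and the position of beforeProcessing ahead of process is inferred from the method names only.

```java
import java.util.ArrayList;
import java.util.List;

final class LifecycleSketch {
    /** Hypothetical stand-in for the library's processing exception. */
    static final class SketchProcessingException extends RuntimeException {
        SketchProcessingException(String message) {
            super(message);
        }
    }

    /** Hypothetical component that records lifecycle calls and fails a set number of times. */
    static final class RecordingComponent {
        final List<String> calls = new ArrayList<>();
        private int failuresLeft;

        RecordingComponent(int failures) {
            this.failuresLeft = failures;
        }

        void init() { calls.add("init"); }

        void beforeProcessing() { calls.add("beforeProcessing"); }

        void process() {
            calls.add("process");
            if (failuresLeft > 0) {
                failuresLeft--;
                throw new SketchProcessingException("simulated failure");
            }
        }

        void afterProcessing() { calls.add("afterProcessing"); }

        void dispose() { calls.add("dispose"); }
    }

    /** Hypothetical controller method that handles a single request. */
    static void handleRequest(RecordingComponent component) {
        component.beforeProcessing();
        try {
            component.process();
        } finally {
            // Runs on success and on failure; an exception from process()
            // then propagates to the caller of this method.
            component.afterProcessing();
        }
    }

    /** One failing request, one successful request, then disposal. */
    static List<String> demo() {
        RecordingComponent component = new RecordingComponent(1);
        component.init();
        try {
            handleRequest(component);
        } catch (SketchProcessingException e) {
            component.calls.add("caught");
        }
        handleRequest(component); // the same instance accepts a further request
        component.dispose();
        return component.calls;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this model the first request fails, afterProcessing still runs, the caller sees the exception, and the same component instance then serves a second request before being disposed of.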


Copyright (c) Dawid Weiss, Stanislaw Osinski