org.carrot2.clustering.synthetic
Class ByFieldClusteringAlgorithm

java.lang.Object
  extended by org.carrot2.core.ProcessingComponentBase
      extended by org.carrot2.clustering.synthetic.ByFieldClusteringAlgorithm
All Implemented Interfaces:
IClusteringAlgorithm, IProcessingComponent

public class ByFieldClusteringAlgorithm
extends ProcessingComponentBase
implements IClusteringAlgorithm

Clusters documents into a flat structure based on the values of a chosen document field. By default, the Document.SOURCES field is used.

Attribute label:
By Attribute Clustering

Field Summary
 List<Cluster> clusters
          Clusters created by the algorithm.
 List<Document> documents
          Documents to cluster.
 String fieldName
          Name of the field to cluster by.
 
Constructor Summary
ByFieldClusteringAlgorithm()
           
 
Method Summary
protected  String buildClusterLabel(Object fieldValue)
          Builds cluster label based on the field value.
 void process()
          Performs by-field clustering.
 
Methods inherited from class org.carrot2.core.ProcessingComponentBase
afterProcessing, beforeProcessing, dispose, getContext, getSharedExecutor, init
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.carrot2.core.IProcessingComponent
afterProcessing, beforeProcessing, dispose, init
 

Field Detail

documents

public List<Document> documents
Documents to cluster.


clusters

public List<Cluster> clusters
Clusters created by the algorithm.


fieldName

public String fieldName
Name of the field to cluster by. Each non-null scalar field value with a distinct hash code gives rise to a single cluster, labeled with the value returned by buildClusterLabel(Object). If the field value is a collection, the document is assigned to every cluster corresponding to a value in that collection. Note that arrays are not 'unfolded' in this way.

Attribute label:
Field name
Attribute level:
Basic
Attribute group:
Field
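
The grouping rule described for fieldName can be illustrated with a small, self-contained sketch. It does not use the Carrot2 classes: documents are modelled as plain maps, and the class name ByFieldGroupingSketch and the method groupByField are hypothetical. It also assumes that documents with a null field value are simply skipped, and it keys clusters with a regular map (equals plus hash code), which may differ in detail from the actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ByFieldGroupingSketch {

    /**
     * Groups documents (modelled here as plain maps, not Carrot2 Documents)
     * by the value stored under fieldName.
     */
    public static Map<Object, List<Map<String, Object>>> groupByField(
            List<Map<String, Object>> documents, String fieldName) {
        Map<Object, List<Map<String, Object>>> clusters = new LinkedHashMap<>();
        for (Map<String, Object> doc : documents) {
            Object value = doc.get(fieldName);
            if (value == null) {
                continue; // assumption: a null field value creates no cluster
            }
            if (value instanceof Collection<?>) {
                // Collections are unfolded: one cluster per contained value.
                for (Object item : (Collection<?>) value) {
                    if (item != null) {
                        add(clusters, item, doc);
                    }
                }
            } else {
                // Scalars (and arrays, which are not unfolded) map to one cluster.
                add(clusters, value, doc);
            }
        }
        return clusters;
    }

    private static void add(Map<Object, List<Map<String, Object>>> clusters,
                            Object key, Map<String, Object> doc) {
        clusters.computeIfAbsent(key, k -> new ArrayList<>()).add(doc);
    }

    public static void main(String[] args) {
        Map<String, Object> a = new HashMap<>();
        a.put("sources", Arrays.asList("web", "news")); // collection value
        Map<String, Object> b = new HashMap<>();
        b.put("sources", "web");                        // scalar value
        System.out.println(groupByField(Arrays.asList(a, b), "sources").keySet());
    }
}
```

In this sketch the first document lands in two clusters because its field value is a collection, while the second lands in one.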
Constructor Detail

ByFieldClusteringAlgorithm

public ByFieldClusteringAlgorithm()
Method Detail

process

public void process()
             throws ProcessingException
Performs by-field clustering.

Specified by:
process in interface IProcessingComponent
Overrides:
process in class ProcessingComponentBase
Throws:
ProcessingException - when processing failed. If thrown, the IProcessingComponent.afterProcessing() method will be called and the component will be ready to accept further requests or to be disposed of. Finally, the exception will be rethrown from the controller method that caused the component to perform processing.

buildClusterLabel

protected String buildClusterLabel(Object fieldValue)
Builds the cluster label from the field value. This implementation returns fieldValue.toString().
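
Because buildClusterLabel(Object) is protected, a subclass can presumably override it to change how labels are rendered. The sketch below shows that pattern without depending on Carrot2: DefaultLabeler is a stand-in that mirrors only the documented default (fieldValue.toString()), and PrefixedLabeler, the "Source: " prefix, and the label helper are hypothetical.

```java
public class LabelOverrideSketch {

    /** Stand-in for the documented default behaviour: the label is fieldValue.toString(). */
    static class DefaultLabeler {
        protected String buildClusterLabel(Object fieldValue) {
            return fieldValue.toString();
        }
    }

    /** Hypothetical customization: prefixes the default label. */
    static class PrefixedLabeler extends DefaultLabeler {
        @Override
        protected String buildClusterLabel(Object fieldValue) {
            return "Source: " + super.buildClusterLabel(fieldValue);
        }
    }

    /** Convenience entry point so the override can be exercised directly. */
    public static String label(Object fieldValue) {
        return new PrefixedLabeler().buildClusterLabel(fieldValue);
    }

    public static void main(String[] args) {
        System.out.println(label("web"));
    }
}
```

With the real class, the same override would go in a subclass of ByFieldClusteringAlgorithm; that subclass is not shown here because it requires the Carrot2 library on the classpath.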



Copyright (c) Dawid Weiss, Stanislaw Osinski