org.carrot2.output.metrics
Class ContaminationMetric

java.lang.Object
  extended by org.carrot2.output.metrics.ContaminationMetric
All Implemented Interfaces:
IClusteringMetric

public class ContaminationMetric
extends Object

Computes cluster contamination. If all documents in a cluster come from the same partition (see Document.PARTITIONS), the cluster's contamination is 0. If a cluster contains an equally distributed mix of documents from all partitions, its contamination is 1.0. For a full definition, see section 4.4.1 of this work.

Contamination is calculated for top-level clusters only, taking into account documents from the cluster and all of its subclusters. Contamination is calculated only if all input documents have a non-blank Document.PARTITIONS value.
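The two boundary cases described above can be illustrated with a small, self-contained sketch. It assumes one pairwise formulation: count the document pairs in a cluster that come from different partitions, then divide by the same count for a cluster of equal size spread as evenly as possible over all partitions. This formulation is an assumption chosen because it yields 0 for a pure cluster and 1.0 for an even mix; the authoritative definition is the one in the referenced section 4.4.1. The class and method names below are hypothetical and are not part of this API.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the ContaminationMetric implementation.
final class ContaminationSketch {

    /** Number of document pairs whose members come from different partitions. */
    static long crossPartitionPairs(int[] sizes) {
        long total = 0;
        long sumOfSquares = 0;
        for (int s : sizes) {
            total += s;
            sumOfSquares += (long) s * s;
        }
        // sum over i<j of sizes[i]*sizes[j] == (total^2 - sum of squares) / 2
        return (total * total - sumOfSquares) / 2;
    }

    /** Same count for clusterSize documents spread as evenly as possible. */
    static long worstCasePairs(int clusterSize, int partitionCount) {
        int[] even = new int[partitionCount];
        Arrays.fill(even, clusterSize / partitionCount);
        for (int i = 0; i < clusterSize % partitionCount; i++) {
            even[i]++;
        }
        return crossPartitionPairs(even);
    }

    /**
     * Assumed contamination of one cluster. sizes[p] is the number of the
     * cluster's documents in partition p; partitions with no documents in
     * the cluster must be present as zeros.
     */
    static double contamination(int[] sizes) {
        if (sizes.length == 0) {
            return 0.0;
        }
        int clusterSize = 0;
        for (int s : sizes) {
            clusterSize += s;
        }
        long worst = worstCasePairs(clusterSize, sizes.length);
        return worst == 0 ? 0.0 : crossPartitionPairs(sizes) / (double) worst;
    }

    public static void main(String[] args) {
        System.out.println(contamination(new int[] {5, 0, 0})); // pure cluster
        System.out.println(contamination(new int[] {2, 2, 2})); // even mix
        System.out.println(contamination(new int[] {3, 1, 0})); // partial mix
    }
}
```

Under this assumed formulation, a cluster drawn from a single partition scores 0, an even mix over all partitions scores 1.0, and other clusters fall in between.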


Field Summary
 List<Cluster> clusters
           
static String CONTAMINATION
          Key for the contamination value of a cluster.
 List<Document> documents
           
 boolean enabled
          Whether to calculate the contamination metric.
 String partitionIdFieldName
          Partition id field name.
 double weightedAverageContamination
          Average contamination of the whole cluster set, weighted by cluster size.
 
Constructor Summary
ContaminationMetric()
           
 
Method Summary
 void calculate()
          Triggers calculation of the metric.
 boolean isEnabled()
          Returns true if this metric should be calculated.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

CONTAMINATION

public static final String CONTAMINATION
Key for the contamination value of a cluster.

See Also:
Constant Field Values

weightedAverageContamination

public double weightedAverageContamination
Average contamination of the whole cluster set, weighted by cluster size.
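A size-weighted average of this kind can be sketched as follows. The sketch assumes that each cluster contributes its contamination multiplied by its document count, and that the sum is divided by the total document count over all clusters; how this class counts cluster sizes internally is not stated here. The class and method names are hypothetical.

```java
// Hypothetical illustration only; not the ContaminationMetric implementation.
final class WeightedAverageSketch {

    /** Average of contaminations[i], weighted by clusterSizes[i]. */
    static double weightedAverage(int[] clusterSizes, double[] contaminations) {
        if (clusterSizes.length != contaminations.length) {
            throw new IllegalArgumentException("one contamination per cluster expected");
        }
        double weightedSum = 0.0;
        long totalSize = 0;
        for (int i = 0; i < clusterSizes.length; i++) {
            weightedSum += clusterSizes[i] * contaminations[i];
            totalSize += clusterSizes[i];
        }
        // Assumption: an empty cluster set is reported as 0.0.
        return totalSize == 0 ? 0.0 : weightedSum / totalSize;
    }

    public static void main(String[] args) {
        // A small fully contaminated cluster and a larger pure one.
        System.out.println(weightedAverage(new int[] {10, 30}, new double[] {1.0, 0.0})); // 0.25
    }
}
```

Weighting by size keeps a few small, heavily mixed clusters from dominating the overall score.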


enabled

public boolean enabled
Whether to calculate the contamination metric.


documents

public List<Document> documents

clusters

public List<Cluster> clusters

partitionIdFieldName

public String partitionIdFieldName
Partition id field name.

Constructor Detail

ContaminationMetric

public ContaminationMetric()
Method Detail

calculate

public void calculate()
Description copied from interface: IClusteringMetric
Triggers calculation of the metric. All Processing Input attributes will have been bound before a call to this method.


isEnabled

public boolean isEnabled()
Description copied from interface: IClusteringMetric
Returns true if this metric should be calculated.
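The relationship between isEnabled() and calculate() can be sketched from the caller's side. The Metric interface below is a hypothetical stand-in that mirrors only the two methods documented on this page; it is not IClusteringMetric itself, and the way the framework actually invokes metrics is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of this API.
final class MetricRunnerSketch {

    /** Stand-in for the two methods listed for IClusteringMetric. */
    interface Metric {
        boolean isEnabled();

        void calculate();
    }

    /** Calls calculate() on enabled metrics only; returns how many ran. */
    static int runEnabled(List<Metric> metrics) {
        int ran = 0;
        for (Metric metric : metrics) {
            if (metric.isEnabled()) {
                metric.calculate();
                ran++;
            }
        }
        return ran;
    }

    /** Trivial metric that appends its name to log when calculated. */
    static Metric recording(boolean enabled, List<String> log, String name) {
        return new Metric() {
            @Override
            public boolean isEnabled() {
                return enabled;
            }

            @Override
            public void calculate() {
                log.add(name);
            }
        };
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<Metric> metrics = new ArrayList<>();
        metrics.add(recording(true, log, "contamination"));
        metrics.add(recording(false, log, "disabled-metric"));
        System.out.println(runEnabled(metrics) + " " + log);
    }
}
```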



Copyright (c) Dawid Weiss, Stanislaw Osinski