org.carrot2.text.preprocessing.filter
Class CompleteLabelFilterDescriptor.AttributeBuilder

java.lang.Object
  extended by org.carrot2.text.preprocessing.filter.CompleteLabelFilterDescriptor.AttributeBuilder
Enclosing class:
CompleteLabelFilterDescriptor

public static class CompleteLabelFilterDescriptor.AttributeBuilder
extends Object

Attribute map builder for the CompleteLabelFilter component. You can use this builder as a type-safe alternative to populating the attribute map using attribute keys.
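The contrast between the two styles can be sketched as follows. Because the real constructor is protected and this page does not show how an instance is obtained, the sketch uses a hypothetical stand-in class, AttributeBuilderSketch, that only mirrors the method signatures listed here. The key strings are assumptions modeled on the "See Also" references, not confirmed attribute keys.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for CompleteLabelFilterDescriptor.AttributeBuilder.
// It mirrors the signatures on this page but is NOT the Carrot2 class.
class AttributeBuilderSketch {
    // Assumed key strings, modeled on the "See Also" references.
    static final String ENABLED_KEY = "CompleteLabelFilter.enabled";
    static final String THRESHOLD_KEY = "CompleteLabelFilter.labelOverrideThreshold";

    // The attribute map populated by this builder.
    public final Map<String, Object> map;

    AttributeBuilderSketch(Map<String, Object> map) {
        this.map = map;
    }

    AttributeBuilderSketch enabled(boolean value) {
        map.put(ENABLED_KEY, value);
        return this; // returning the builder allows call chaining
    }

    AttributeBuilderSketch labelOverrideThreshold(double value) {
        map.put(THRESHOLD_KEY, value);
        return this;
    }
}

class AttributeBuilderDemo {
    public static void main(String[] args) {
        // Key-based style: a misspelled key or a wrongly typed value still compiles.
        Map<String, Object> byKeys = new HashMap<>();
        byKeys.put("CompleteLabelFilter.enabled", true);
        byKeys.put("CompleteLabelFilter.labelOverrideThreshold", 0.5);

        // Builder style: method names and parameter types are checked by the compiler.
        Map<String, Object> byBuilder = new HashMap<>();
        new AttributeBuilderSketch(byBuilder)
            .enabled(true)
            .labelOverrideThreshold(0.5);

        System.out.println(byKeys.equals(byBuilder)); // prints "true"
    }
}
```

Both styles end up with the same map contents; the builder only moves mistakes in names and value types from run time to compile time.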


Field Summary
 Map<String,Object> map
          The attribute map populated by this builder.
 
Constructor Summary
protected CompleteLabelFilterDescriptor.AttributeBuilder(Map<String,Object> map)
          Creates a builder backed by the provided map.
 
Method Summary
 CompleteLabelFilterDescriptor.AttributeBuilder enabled(boolean value)
          Remove truncated phrases.
 CompleteLabelFilterDescriptor.AttributeBuilder labelOverrideThreshold(double value)
          Truncated label threshold.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

map

public final Map<String,Object> map
The attribute map populated by this builder.

Constructor Detail

CompleteLabelFilterDescriptor.AttributeBuilder

protected CompleteLabelFilterDescriptor.AttributeBuilder(Map<String,Object> map)
Creates a builder backed by the provided map.

Method Detail

enabled

public CompleteLabelFilterDescriptor.AttributeBuilder enabled(boolean value)
Remove truncated phrases. Tries to remove "incomplete" cluster labels. For example, in a collection of documents related to Data Mining, the phrase Conference on Data is incomplete in the sense that it most likely should be Conference on Data Mining or even Conference on Data Mining in Large Databases. When truncated phrase removal is enabled, the algorithm tries to remove "incomplete" phrases such as the first one and keep only the more informative variants.
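To make the idea of an "incomplete" label concrete, here is a deliberately naive sketch. It is not the algorithm used by CompleteLabelFilter. It assumes a toy rule in which a label counts as incomplete when another candidate label extends it by whole words. The class and method names are hypothetical, and the input list is illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IncompletePhraseSketch {
    // Toy rule (an assumption): a label is "incomplete" if some other
    // candidate starts with it followed by a space, i.e. extends it
    // by one or more whole words.
    static List<String> dropIncomplete(List<String> labels) {
        List<String> kept = new ArrayList<>();
        for (String label : labels) {
            boolean extended = false;
            for (String other : labels) {
                if (other.startsWith(label + " ")) {
                    extended = true;
                    break;
                }
            }
            if (!extended) {
                kept.add(label);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList(
            "Conference on Data",
            "Conference on Data Mining",
            "Data Mining");
        // prints "[Conference on Data Mining, Data Mining]"
        System.out.println(dropIncomplete(labels));
    }
}
```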

See Also:
CompleteLabelFilter.enabled

labelOverrideThreshold

public CompleteLabelFilterDescriptor.AttributeBuilder labelOverrideThreshold(double value)
Truncated label threshold. Determines the strength of the truncated label filter. The lowest value gives the strongest elimination of truncated labels, which may lead to overlong cluster labels and many unclustered documents. The highest value effectively disables the filter, which may result in short or truncated labels.
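The direction of the threshold can be illustrated with a toy rule. This is an assumption made for the sketch, not the formula Carrot2 uses: the shorter label is dropped when the fraction of its occurrences that continue into a longer label is strictly greater than the threshold. The class name, the occurrence counts and the 0.0 to 1.0 range are all hypothetical.

```java
class OverrideThresholdSketch {
    // Toy rule (an assumption): drop the shorter label when
    // longerCount / shorterCount is strictly greater than the threshold.
    static boolean dropsShorter(int shorterCount, int longerCount, double threshold) {
        if (shorterCount <= 0) {
            return false; // nothing to compare against
        }
        double ratio = (double) longerCount / shorterCount;
        return ratio > threshold;
    }

    public static void main(String[] args) {
        // Hypothetical counts: the shorter phrase occurs 10 times,
        // and 7 of those occurrences continue into the longer phrase.
        int shorterCount = 10;
        int longerCount = 7;
        for (double threshold : new double[] {0.0, 0.5, 1.0}) {
            System.out.println(threshold + " -> "
                + dropsShorter(shorterCount, longerCount, threshold));
        }
        // prints:
        // 0.0 -> true
        // 0.5 -> true
        // 1.0 -> false
    }
}
```

Under this toy rule, a threshold of 0.0 drops the shorter label whenever any longer variant exists, while 1.0 never drops it, which matches the behavior described above for the lowest and highest values.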

See Also:
CompleteLabelFilter.labelOverrideThreshold


Copyright (c) Dawid Weiss, Stanislaw Osinski