org.carrot2.matrix.factorization
Class LocalNonnegativeMatrixFactorization

java.lang.Object
  extended by org.carrot2.matrix.factorization.LocalNonnegativeMatrixFactorization
All Implemented Interfaces:
IIterativeMatrixFactorization, IMatrixFactorization

public class LocalNonnegativeMatrixFactorization
extends Object

Performs matrix factorization using the Local Non-negative Matrix Factorization algorithm with minimization of the Kullback-Leibler divergence between A and UV' and multiplicative updating.
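
For reference, the generalized Kullback-Leibler divergence that is commonly minimized in non-negative matrix factorization can be written as below, using this page's A, U and V and lower-case a_ij for the entries of A. This is the textbook form of the divergence only; this page does not state the exact objective, so any additional locality terms used by the Local NMF variant are not shown here.

```latex
D(A \,\|\, UV') = \sum_{i,j} \left( a_{ij} \log \frac{a_{ij}}{(UV')_{ij}} - a_{ij} + (UV')_{ij} \right)
```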


Field Summary
protected  org.apache.mahout.math.matrix.DoubleMatrix2D A
          Input matrix
protected  double[] aggregates
          Sorting aggregates
protected  double approximationError
          Current approximation error
protected  double[] approximationErrors
          Approximation errors during subsequent iterations
protected static int DEFAULT_K
           
protected static int DEFAULT_MAX_ITERATIONS
           
protected static boolean DEFAULT_ORDERED
           
protected static ISeedingStrategy DEFAULT_SEEDING_STRATEGY
           
protected static double DEFAULT_STOP_THRESHOLD
           
protected  int iterationsCompleted
          Iteration counter
protected  int k
          The desired number of base vectors
protected  int maxIterations
          The maximum number of iterations the algorithm is allowed to run
protected  boolean ordered
          Order base vectors according to their 'activity'?
protected  ISeedingStrategy seedingStrategy
          Seeding strategy
protected  double stopThreshold
          If the percentage decrease in approximation error becomes smaller than stopThreshold, the algorithm will stop.
protected  org.apache.mahout.math.matrix.DoubleMatrix2D U
          Base vector result matrix
protected  org.apache.mahout.math.matrix.DoubleMatrix2D V
          Coefficient result matrix
 
Constructor Summary
LocalNonnegativeMatrixFactorization(org.apache.mahout.math.matrix.DoubleMatrix2D A)
          Creates the LocalNonnegativeMatrixFactorization object for matrix A.
 
Method Summary
 void compute()
          Computes the factorization.
 double[] getAggregates()
          Returns column aggregates for a sorted factorization, and null for an unsorted factorization.
 double getApproximationError()
          Returns the final approximation error, or -1 if the approximation error calculation has been turned off (see setStopThreshold(double)).
 double[] getApproximationErrors()
          Returns an array of approximation errors recorded after subsequent iterations of the algorithm.
 int getIterationsCompleted()
          Returns the number of iterations the algorithm has completed.
 int getK()
          Returns the number of base vectors k.
 int getMaxIterations()
          Returns the maximum number of iterations the algorithm is allowed to run.
 ISeedingStrategy getSeedingStrategy()
          Returns current ISeedingStrategy.
 double getStopThreshold()
          Returns the algorithm's stopThreshold.
 org.apache.mahout.math.matrix.DoubleMatrix2D getU()
          Returns the U matrix (base vectors matrix).
 org.apache.mahout.math.matrix.DoubleMatrix2D getV()
          Returns the V matrix (coefficient matrix).
 boolean isOrdered()
          Returns true when the factorization is set to generate an ordered basis.
protected  void order()
          Orders U and V matrices according to the 'activity' of base vectors.
 void setK(int k)
          Sets the number of base vectors k.
 void setMaxIterations(int maxIterations)
          Sets the maximum number of iterations the algorithm is allowed to run.
 void setOrdered(boolean ordered)
          Set to true to generate an ordered basis.
 void setSeedingStrategy(ISeedingStrategy seedingStrategy)
          Sets new ISeedingStrategy.
 void setStopThreshold(double stopThreshold)
          Sets the algorithm's stopThreshold.
 String toString()
           
protected  boolean updateApproximationError()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface org.carrot2.matrix.factorization.IMatrixFactorization
getU, getV
 

Field Detail

k

protected int k
The desired number of base vectors


DEFAULT_K

protected static int DEFAULT_K

maxIterations

protected int maxIterations
The maximum number of iterations the algorithm is allowed to run


DEFAULT_MAX_ITERATIONS

protected static final int DEFAULT_MAX_ITERATIONS
See Also:
Constant Field Values

stopThreshold

protected double stopThreshold
If the percentage decrease in approximation error becomes smaller than stopThreshold, the algorithm will stop. Note: calculation of approximation error is quite costly. Setting the threshold to -1 turns off approximation error calculation and hence makes the algorithm do the maximum number of iterations.


DEFAULT_STOP_THRESHOLD

protected static double DEFAULT_STOP_THRESHOLD

seedingStrategy

protected ISeedingStrategy seedingStrategy
Seeding strategy


DEFAULT_SEEDING_STRATEGY

protected static final ISeedingStrategy DEFAULT_SEEDING_STRATEGY

ordered

protected boolean ordered
Order base vectors according to their 'activity'?


DEFAULT_ORDERED

protected static final boolean DEFAULT_ORDERED
See Also:
Constant Field Values

approximationError

protected double approximationError
Current approximation error


approximationErrors

protected double[] approximationErrors
Approximation errors during subsequent iterations


iterationsCompleted

protected int iterationsCompleted
Iteration counter


aggregates

protected double[] aggregates
Sorting aggregates


A

protected org.apache.mahout.math.matrix.DoubleMatrix2D A
Input matrix


U

protected org.apache.mahout.math.matrix.DoubleMatrix2D U
Base vector result matrix


V

protected org.apache.mahout.math.matrix.DoubleMatrix2D V
Coefficient result matrix

Constructor Detail

LocalNonnegativeMatrixFactorization

public LocalNonnegativeMatrixFactorization(org.apache.mahout.math.matrix.DoubleMatrix2D A)
Creates the LocalNonnegativeMatrixFactorization object for matrix A. Before accessing results, perform computations by calling the compute() method.

Parameters:
A - matrix to be factorized
Method Detail

compute

public void compute()
Computes the factorization.
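
To illustrate what "multiplicative updating" under a Kullback-Leibler objective looks like, here is a minimal sketch on plain double[][] arrays. It implements the standard Lee-Seung KL updates for ordinary NMF, not the Local NMF variant, and it is not taken from this class; the class name KlNmfSketch and its methods are hypothetical. U is m x k, V is n x k, and A is approximated by UV'.

```java
/** Plain KL-divergence NMF on arrays; an illustration, not the Carrot2 code. */
final class KlNmfSketch {
    /** Returns the product U * V', where U is m x k and V is n x k. */
    static double[][] reconstruct(double[][] u, double[][] v) {
        int m = u.length, n = v.length, k = u[0].length;
        double[][] r = new double[m][n];
        for (int i = 0; i < m; i++)
            for (int j = 0; j < n; j++)
                for (int c = 0; c < k; c++)
                    r[i][j] += u[i][c] * v[j][c];
        return r;
    }

    /** Generalized Kullback-Leibler divergence between A and a reconstruction R. */
    static double divergence(double[][] a, double[][] r) {
        double d = 0;
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[0].length; j++) {
                // The a*log(a/r) term is taken as 0 when a is 0.
                if (a[i][j] > 0) d += a[i][j] * Math.log(a[i][j] / r[i][j]);
                d += r[i][j] - a[i][j];
            }
        return d;
    }

    /** One in-place multiplicative update of V, followed by one of U. */
    static void step(double[][] a, double[][] u, double[][] v) {
        int m = a.length, n = a[0].length, k = u[0].length;
        double[][] r = reconstruct(u, v);
        // V update: scale v[j][c] by sum_i u[i][c] * a[i][j] / r[i][j], divided by sum_i u[i][c].
        for (int c = 0; c < k; c++) {
            double colSum = 0;
            for (int i = 0; i < m; i++) colSum += u[i][c];
            for (int j = 0; j < n; j++) {
                double num = 0;
                for (int i = 0; i < m; i++) num += u[i][c] * a[i][j] / r[i][j];
                v[j][c] *= num / colSum;
            }
        }
        // Recompute the reconstruction so the U update sees the new V.
        r = reconstruct(u, v);
        // U update: scale u[i][c] by sum_j v[j][c] * a[i][j] / r[i][j], divided by sum_j v[j][c].
        for (int c = 0; c < k; c++) {
            double colSum = 0;
            for (int j = 0; j < n; j++) colSum += v[j][c];
            for (int i = 0; i < m; i++) {
                double num = 0;
                for (int j = 0; j < n; j++) num += v[j][c] * a[i][j] / r[i][j];
                u[i][c] *= num / colSum;
            }
        }
    }
}
```

Because every update multiplies an entry by a non-negative factor, U and V stay non-negative as long as they are seeded with positive values, which is why no explicit projection step appears in the sketch.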


toString

public String toString()
Overrides:
toString in class Object

setK

public void setK(int k)
Sets the number of base vectors k.

Parameters:
k - the number of base vectors

getK

public int getK()
Returns the number of base vectors k.


updateApproximationError

protected boolean updateApproximationError()
Returns:
true if the decrease in the approximation error is smaller than the stopThreshold

order

protected void order()
Orders U and V matrices according to the 'activity' of base vectors.
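
This page does not define how 'activity' is measured. The sketch below assumes, purely for illustration, that the activity of a base vector is the sum of squared coefficients in the corresponding column of V, and that columns of U and V are then permuted in descending order of that aggregate. OrderSketch and its methods are hypothetical names, and the matrices are plain double[][] arrays rather than DoubleMatrix2D.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Illustrative ordering of base vectors; the activity measure is an assumption. */
final class OrderSketch {
    /** Sum of squares of each column of V, where V is n x k. */
    static double[] aggregates(double[][] v) {
        int k = v[0].length;
        double[] agg = new double[k];
        for (double[] row : v)
            for (int c = 0; c < k; c++) agg[c] += row[c] * row[c];
        return agg;
    }

    /** Column indices sorted so that the largest aggregate comes first. */
    static int[] descendingOrder(double[] agg) {
        Integer[] idx = new Integer[agg.length];
        for (int i = 0; i < idx.length; i++) idx[i] = i;
        Arrays.sort(idx, Comparator.comparingDouble((Integer i) -> -agg[i]));
        int[] out = new int[idx.length];
        for (int i = 0; i < out.length; i++) out[i] = idx[i];
        return out;
    }

    /** Returns a copy of the matrix with its columns rearranged as given by order. */
    static double[][] permuteColumns(double[][] m, int[] order) {
        double[][] r = new double[m.length][order.length];
        for (int i = 0; i < m.length; i++)
            for (int c = 0; c < order.length; c++) r[i][c] = m[i][order[c]];
        return r;
    }
}
```

Applying the same permutation to the columns of both U and V leaves the product UV' unchanged, so ordering affects only the presentation of the factors, not the approximation.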


getSeedingStrategy

public ISeedingStrategy getSeedingStrategy()
Returns current ISeedingStrategy.


setSeedingStrategy

public void setSeedingStrategy(ISeedingStrategy seedingStrategy)
Sets new ISeedingStrategy.


getMaxIterations

public int getMaxIterations()
Returns the maximum number of iterations the algorithm is allowed to run.


setMaxIterations

public void setMaxIterations(int maxIterations)
Sets the maximum number of iterations the algorithm is allowed to run.


getStopThreshold

public double getStopThreshold()
Returns the algorithm's stopThreshold. If the percentage decrease in approximation error becomes smaller than stopThreshold, the algorithm will stop.


setStopThreshold

public void setStopThreshold(double stopThreshold)
Sets the algorithm's stopThreshold. If the percentage decrease in approximation error becomes smaller than stopThreshold, the algorithm will stop.

Note: calculation of approximation error is quite costly. Setting the threshold to -1 turns off calculation of the approximation error and hence makes the algorithm do the maximum allowed number of iterations.
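
The stopping rule described above can be sketched as a small standalone check. This is an assumption-laden illustration, not the body of updateApproximationError(): it treats the "percentage decrease" as the fraction (previous - current) / previous, and it treats any negative threshold as "check disabled". StopCriterionSketch and shouldStop are hypothetical names.

```java
/** Hypothetical helper illustrating the stopThreshold rule. */
final class StopCriterionSketch {
    /**
     * Returns true when the iteration should stop early.
     * Assumptions: the decrease is measured relative to the previous error,
     * and a negative threshold disables the check.
     */
    static boolean shouldStop(double previousError, double currentError, double stopThreshold) {
        if (stopThreshold < 0) {
            return false; // check disabled: keep iterating until maxIterations
        }
        if (previousError == 0) {
            return true; // the approximation is already exact
        }
        double relativeDecrease = (previousError - currentError) / previousError;
        return relativeDecrease < stopThreshold;
    }
}
```

With the check disabled, the only remaining limit on the number of iterations is maxIterations, so that value should be chosen with the cost of each iteration in mind.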


getApproximationError

public double getApproximationError()
Returns the final approximation error, or -1 if the approximation error calculation has been turned off (see setStopThreshold(double)).

Specified by:
getApproximationError in interface IIterativeMatrixFactorization
Returns:
final approximation error or -1

getApproximationErrors

public double[] getApproximationErrors()
Returns an array of approximation errors recorded after subsequent iterations of the algorithm. Element 0 of the array contains the approximation error before the first iteration. The array is null if the approximation error calculation has been turned off (see setStopThreshold(double)).
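
A caller reading this array has to account for both the offset (element 0 precedes the first iteration) and the possible null. The helper below is a minimal sketch of that handling, assuming element i holds the error after iteration i; ErrorHistorySketch and perIterationDecrease are hypothetical names that are not part of this class.

```java
/** Hypothetical helper for interpreting the approximation error history. */
final class ErrorHistorySketch {
    /**
     * Returns the error decrease achieved by each iteration, where errors[0] is
     * the error before the first iteration. Returns an empty array when the
     * history is null (error calculation turned off) or has fewer than two entries.
     */
    static double[] perIterationDecrease(double[] errors) {
        if (errors == null || errors.length < 2) {
            return new double[0];
        }
        double[] decrease = new double[errors.length - 1];
        for (int i = 1; i < errors.length; i++) {
            decrease[i - 1] = errors[i - 1] - errors[i];
        }
        return decrease;
    }
}
```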


getIterationsCompleted

public int getIterationsCompleted()
Description copied from interface: IIterativeMatrixFactorization
Returns the number of iterations the algorithm has completed.

Specified by:
getIterationsCompleted in interface IIterativeMatrixFactorization
Returns:
the number of iterations the algorithm has completed

isOrdered

public boolean isOrdered()
Returns true when the factorization is set to generate an ordered basis.


setOrdered

public void setOrdered(boolean ordered)
Set to true to generate an ordered basis.


getAggregates

public double[] getAggregates()
Returns column aggregates for a sorted factorization, and null for an unsorted factorization.


getU

public org.apache.mahout.math.matrix.DoubleMatrix2D getU()
Description copied from interface: IMatrixFactorization
Returns the U matrix (base vectors matrix).

Specified by:
getU in interface IMatrixFactorization
Returns:
U matrix

getV

public org.apache.mahout.math.matrix.DoubleMatrix2D getV()
Description copied from interface: IMatrixFactorization
Returns the V matrix (coefficient matrix).

Specified by:
getV in interface IMatrixFactorization
Returns:
V matrix


Copyright (c) Dawid Weiss, Stanislaw Osinski