org.apache.mahout.cf.taste.impl.recommender.svd
Class ParallelSGDFactorizer

java.lang.Object
  extended by org.apache.mahout.cf.taste.impl.recommender.svd.AbstractFactorizer
      extended by org.apache.mahout.cf.taste.impl.recommender.svd.ParallelSGDFactorizer
All Implemented Interfaces:
Refreshable, Factorizer

public class ParallelSGDFactorizer
extends AbstractFactorizer

Minimalistic implementation of a parallel SGD factorizer, based on "Scalable Collaborative Filtering Approaches for Large Recommender Systems" and "Hogwild!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent".
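
A minimal usage sketch (the file name, rank, lambda, and epoch count below are illustrative assumptions, not defaults): the factorizer is typically passed to an SVDRecommender, which calls factorize() and uses the resulting Factorization to estimate preferences.

  import java.io.File;

  import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
  import org.apache.mahout.cf.taste.impl.recommender.svd.ParallelSGDFactorizer;
  import org.apache.mahout.cf.taste.impl.recommender.svd.SVDRecommender;
  import org.apache.mahout.cf.taste.model.DataModel;
  import org.apache.mahout.cf.taste.recommender.RecommendedItem;

  public class ParallelSGDExample {
    public static void main(String[] args) throws Exception {
      // ratings.csv holds userID,itemID,preference lines (illustrative file name)
      DataModel dataModel = new FileDataModel(new File("ratings.csv"));

      // rank-10 factorization, lambda = 0.01, 20 epochs (illustrative values)
      ParallelSGDFactorizer factorizer = new ParallelSGDFactorizer(dataModel, 10, 0.01, 20);

      SVDRecommender recommender = new SVDRecommender(dataModel, factorizer);

      // top-5 recommendations for user 42
      for (RecommendedItem item : recommender.recommend(42L, 5)) {
        System.out.println(item.getItemID() + " : " + item.getValue());
      }
    }
  }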


Nested Class Summary
protected static class ParallelSGDFactorizer.PreferenceShuffler
           
 
Field Summary
protected  double[][] itemVectors
          item features
protected  double[][] userVectors
          user features
 
Constructor Summary
ParallelSGDFactorizer(DataModel dataModel, int numFeatures, double lambda, int numEpochs)
           
ParallelSGDFactorizer(DataModel dataModel, int numFeatures, double lambda, int numIterations, double mu0, double decayFactor, int stepOffset, double forgettingExponent)
           
ParallelSGDFactorizer(DataModel dataModel, int numFeatures, double lambda, int numIterations, double mu0, double decayFactor, int stepOffset, double forgettingExponent, double biasMuRatio, double biasLambdaRatio)
           
ParallelSGDFactorizer(DataModel dataModel, int numFeatures, double lambda, int numIterations, double mu0, double decayFactor, int stepOffset, double forgettingExponent, double biasMuRatio, double biasLambdaRatio, int numThreads)
           
ParallelSGDFactorizer(DataModel dataModel, int numFeatures, double lambda, int numIterations, double mu0, double decayFactor, int stepOffset, double forgettingExponent, int numThreads)
           
 
Method Summary
 Factorization factorize()
           
protected  void initialize()
           
protected  void update(Preference preference, double mu)
          TODO: this is the vanilla SGD by Takács (2009); I speculate that using the scaling technique proposed in "Towards Optimal One Pass Large Scale Learning with Averaged Stochastic Gradient Descent", section 5, page 6, could be beneficial in terms of both speed and accuracy.
 
Methods inherited from class org.apache.mahout.cf.taste.impl.recommender.svd.AbstractFactorizer
createFactorization, itemIndex, refresh, userIndex
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

userVectors

protected volatile double[][] userVectors
user features


itemVectors

protected volatile double[][] itemVectors
item features

Constructor Detail

ParallelSGDFactorizer

public ParallelSGDFactorizer(DataModel dataModel,
                             int numFeatures,
                             double lambda,
                             int numEpochs)
                      throws TasteException
Throws:
TasteException

ParallelSGDFactorizer

public ParallelSGDFactorizer(DataModel dataModel,
                             int numFeatures,
                             double lambda,
                             int numIterations,
                             double mu0,
                             double decayFactor,
                             int stepOffset,
                             double forgettingExponent)
                      throws TasteException
Throws:
TasteException

ParallelSGDFactorizer

public ParallelSGDFactorizer(DataModel dataModel,
                             int numFeatures,
                             double lambda,
                             int numIterations,
                             double mu0,
                             double decayFactor,
                             int stepOffset,
                             double forgettingExponent,
                             int numThreads)
                      throws TasteException
Throws:
TasteException

ParallelSGDFactorizer

public ParallelSGDFactorizer(DataModel dataModel,
                             int numFeatures,
                             double lambda,
                             int numIterations,
                             double mu0,
                             double decayFactor,
                             int stepOffset,
                             double forgettingExponent,
                             double biasMuRatio,
                             double biasLambdaRatio)
                      throws TasteException
Throws:
TasteException

ParallelSGDFactorizer

public ParallelSGDFactorizer(DataModel dataModel,
                             int numFeatures,
                             double lambda,
                             int numIterations,
                             double mu0,
                             double decayFactor,
                             int stepOffset,
                             double forgettingExponent,
                             double biasMuRatio,
                             double biasLambdaRatio,
                             int numThreads)
                      throws TasteException
Throws:
TasteException
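
The mu0, decayFactor, stepOffset, and forgettingExponent parameters control how the learning rate is annealed over training. Purely as a hypothetical illustration of a schedule with that general shape (not the exact formula used by this class), geometric decay can be combined with a polynomial forgetting term:

  // Hypothetical sketch only; the exact schedule used by ParallelSGDFactorizer may differ.
  static double learningRate(double mu0, double decayFactor, int stepOffset,
                             double forgettingExponent, int epoch) {
    return mu0
        * Math.pow(decayFactor, epoch - 1)                  // geometric decay per epoch
        * Math.pow(stepOffset + epoch, forgettingExponent); // polynomial forgetting term
  }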
Method Detail

initialize

protected void initialize()
                   throws TasteException
Throws:
TasteException

factorize

public Factorization factorize()
                        throws TasteException
Throws:
TasteException
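
A short sketch of using the returned factorization directly, assuming Factorization exposes the getUserFeatures(long) and getItemFeatures(long) accessors of the Mahout Taste API (the IDs below are illustrative):

  Factorization factorization = factorizer.factorize();
  double[] userFeatures = factorization.getUserFeatures(42L); // illustrative user ID
  double[] itemFeatures = factorization.getItemFeatures(7L);  // illustrative item ID

  // the estimated preference is the dot product of the two feature vectors
  double estimate = 0.0;
  for (int k = 0; k < userFeatures.length; k++) {
    estimate += userFeatures[k] * itemFeatures[k];
  }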

update

protected void update(Preference preference,
                      double mu)
TODO: this is the vanilla SGD by Takács (2009). I speculate that using the scaling technique proposed in "Towards Optimal One Pass Large Scale Learning with Averaged Stochastic Gradient Descent", section 5, page 6, could be beneficial in terms of both speed and accuracy. Takács' method does not compute the gradient of the regularization term correctly: that gradient has non-zero elements everywhere in the matrix, yet Takács' method only updates a single row/column, so if one user has many preferences her vector is affected by regularization more heavily. Using an isolated scaling factor for both the user vectors and the item vectors removes this issue without adding update cost; it even reduces the cost a bit, since only one addition and one multiplication are performed.
BAD SIDE 1: the scaling factor decreases quickly, so it has to be scaled back up from time to time before it drops to zero or causes round-off error.
BAD SIDE 2: nobody has experimented with this before, and people generally use a very small lambda, so its impact on accuracy is still unknown.
BAD SIDE 3: it is not clear how to make it work for L1 regularization or "pseudorank" (sum of singular values) regularization.
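
As a rough illustration of the vanilla per-preference step described above (a simplified, bias-free sketch, not the exact Mahout code), each observed preference pulls the corresponding user and item vectors toward each other, while the lambda term regularizes only the two rows touched by that preference:

  // Simplified, bias-free sketch of one vanilla SGD step for a single preference;
  // the real update(...) also maintains bias terms and uses the annealed rate mu.
  void vanillaUpdate(double[] userVector, double[] itemVector,
                     double ratingValue, double mu, double lambda) {
    double prediction = 0.0;
    for (int k = 0; k < userVector.length; k++) {
      prediction += userVector[k] * itemVector[k];
    }
    double error = ratingValue - prediction;

    for (int k = 0; k < userVector.length; k++) {
      double uk = userVector[k];
      double ik = itemVector[k];
      // regularization is applied only to these two rows, which is exactly the
      // imbalance the isolated scaling-factor idea above tries to remove
      userVector[k] = uk + mu * (error * ik - lambda * uk);
      itemVector[k] = ik + mu * (error * uk - lambda * ik);
    }
  }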



Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.