org.apache.mahout.math.hadoop.stochasticsvd
Class SparseRowBlockAccumulator
java.lang.Object
org.apache.mahout.math.hadoop.stochasticsvd.SparseRowBlockAccumulator
- All Implemented Interfaces:
- Closeable, org.apache.hadoop.mapred.OutputCollector<Long,Vector>
public class SparseRowBlockAccumulator
- extends Object
- implements org.apache.hadoop.mapred.OutputCollector<Long,Vector>, Closeable
Aggregates incoming rows into blocks keyed by row number (a long). Rows may
be sparse (that is, their row numbers may be separated by large gaps) and need
not arrive in any particular order, but rows that belong to the same block
should arrive close together, so that by the time a block key is output, more
than one row has hopefully been aggregated into it.
If a block is large enough to hold all rows that a mapper may produce, the
accumulator never hits a spill at all, because the rows are already being
summed efficiently in the mapper.
For sparse inputs, it also works especially well when the transposed columns
of the left-side matrix and the corresponding rows of the right-side matrix
are sparse in the same elements.
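The aggregation described above can be sketched in plain Java. This is an illustrative sketch under stated assumptions, not the Mahout implementation: the class name BlockAccumulatorSketch, the BlockSink interface (a stand-in for the delegate collector), the use of double[] in place of Vector, and the rule that a row's block key is rowIndex / height are all hypothetical choices made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative sketch only; names and block-key rule are assumptions. */
public class BlockAccumulatorSketch implements AutoCloseable {

    /** Hypothetical stand-in for the delegate that receives finished blocks. */
    public interface BlockSink {
        void emit(long blockKey, Map<Integer, double[]> rows);
    }

    private final int height;          // number of row slots per block
    private final BlockSink sink;
    private long currentBlock = -1;    // no block open yet
    private Map<Integer, double[]> rows = new TreeMap<>();

    public BlockAccumulatorSketch(int height, BlockSink sink) {
        this.height = height;
        this.sink = sink;
    }

    /** Adds a row to the open block, flushing first if the block changes. */
    public void collect(long rowIndex, double[] v) {
        long blockKey = rowIndex / height;       // assumed block-key rule
        if (blockKey != currentBlock) {
            flush();
            currentBlock = blockKey;
        }
        int offset = (int) (rowIndex % height);  // row position inside the block
        double[] acc = rows.get(offset);
        if (acc == null) {
            rows.put(offset, v.clone());
        } else {
            // Same row seen again: sum element-wise (equal lengths assumed).
            for (int i = 0; i < v.length; i++) {
                acc[i] += v[i];
            }
        }
    }

    /** Emits the open block, if it holds any rows, and starts a fresh one. */
    private void flush() {
        if (!rows.isEmpty()) {
            sink.emit(currentBlock, rows);
            rows = new TreeMap<>();
        }
    }

    /** Emits whatever is still buffered. */
    @Override
    public void close() {
        flush();
    }

    /** Feeds four rows through the sketch and records what gets emitted. */
    public static List<String> demo() {
        List<String> emitted = new ArrayList<>();
        try (BlockAccumulatorSketch acc = new BlockAccumulatorSketch(10,
                (key, block) -> emitted.add(key + ":" + block.keySet()))) {
            acc.collect(3, new double[] {1, 0});
            acc.collect(7, new double[] {0, 2});
            acc.collect(3, new double[] {4, 0});   // summed into row 3
            acc.collect(25, new double[] {5, 5});  // new block, flushes block 0
        }
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [0:[3, 7], 2:[5]]
    }
}
```

In this sketch, rows 3 and 7 land in the same block and are emitted together under one key, which is the benefit of rows arriving in proximity. If rows from different blocks were interleaved instead, each switch would emit a nearly empty block. With a height large enough to cover every row index, nothing would be emitted until close().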
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
SparseRowBlockAccumulator
public SparseRowBlockAccumulator(int height,
org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.LongWritable,SparseRowBlockWritable> delegate)
collect
public void collect(Long rowIndex,
Vector v)
throws IOException
- Specified by:
collect
in interface org.apache.hadoop.mapred.OutputCollector<Long,Vector>
- Throws:
IOException
close
public void close()
throws IOException
- Specified by:
close
in interface Closeable
- Throws:
IOException
Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.