Package org.apache.mahout.utils

Interface Summary
SplitInput.SplitCallback Used to pass information back to a caller once a file has been split, without the need for a data object.
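
The callback idea described above can be sketched in plain Java. This is a minimal illustration, not the Mahout interface: `SplitListener`, its `splitComplete` parameters, and `splitAndReport` are hypothetical names chosen here, and the real `SplitInput.SplitCallback` signature should be taken from its own Javadoc page. The point is only that the split statistics travel as method arguments, so no result object has to be defined or allocated.

```java
import java.util.Arrays;
import java.util.List;

class CallbackSketch {

  /** Hypothetical callback: receives the outcome of a split as plain arguments. */
  public interface SplitListener {
    void splitComplete(String inputName, int totalCount, int trainingCount, int testCount);
  }

  /** Works out the sizes of a split and reports them through the listener. */
  public static void splitAndReport(String inputName, List<String> lines,
                                    int requestedTestCount, SplitListener listener) {
    int total = lines.size();
    // Clamp the requested test size to the range [0, total].
    int test = Math.max(0, Math.min(requestedTestCount, total));
    listener.splitComplete(inputName, total, total - test, test);
  }

  public static void main(String[] args) {
    List<String> lines = Arrays.asList("a", "b", "c", "d");
    // The caller supplies the callback as a lambda and receives the counts directly.
    splitAndReport("input.txt", lines, 1, (name, total, training, test) ->
        System.out.println(name + ": " + total + " items, "
            + training + " training, " + test + " test"));
  }
}
```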
 

Class Summary
Bump125 Helps with making nice intervals at arbitrary scale.
ConcatenateVectorsJob  
ConcatenateVectorsReducer  
MatrixDumper Export a Matrix in various text formats (CSV file). Input format: Hadoop SequenceFile containing one pair, with a Text key and a MatrixWritable value. TODO: Needs a class for the key value; should not hard-code to Text.
SequenceFileDumper  
SplitInput A utility for splitting input into training and test sets in order to perform cross-validation. The input can be files in the format used by the Bayes classifiers, anything else that has one item per line, or SequenceFiles (key/value).
SplitInputJob  
SplitInputJob.SplitInputComparator Randomly permutes key/value pairs
SplitInputJob.SplitInputMapper Mapper which downsamples the input by downsamplingFactor
SplitInputJob.SplitInputReducer Reducer which uses MultipleOutputs to randomly allocate key value pairs between test and training outputs
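
The downsample, permute, and allocate steps named in the SplitInput and SplitInputJob entries above can be sketched in a single process without Hadoop. This is an illustration under stated assumptions, not the Mahout implementation: `SplitSketch`, `downsample`, and `split` are hypothetical names, downsampling is assumed to keep every n-th item, and the random allocation is done by shuffling with a seeded `java.util.Random` and cutting the shuffled list at the test-set size.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class SplitSketch {

  /** Keeps every factor-th item, starting with the first (an assumed downsampling rule). */
  public static List<String> downsample(List<String> items, int factor) {
    if (factor < 1) {
      throw new IllegalArgumentException("factor must be at least 1");
    }
    List<String> kept = new ArrayList<>();
    for (int i = 0; i < items.size(); i += factor) {
      kept.add(items.get(i));
    }
    return kept;
  }

  /** Returns two lists: index 0 holds the training items, index 1 the test items. */
  public static List<List<String>> split(List<String> items, double testFraction, long seed) {
    // Shuffle a copy so the caller's list is left untouched.
    List<String> shuffled = new ArrayList<>(items);
    Collections.shuffle(shuffled, new Random(seed));

    // The first testCount items of the permutation become the test set.
    int testCount = (int) Math.round(items.size() * testFraction);
    List<String> test = new ArrayList<>(shuffled.subList(0, testCount));
    List<String> training = new ArrayList<>(shuffled.subList(testCount, shuffled.size()));
    return Arrays.asList(training, test);
  }

  public static void main(String[] args) {
    List<String> lines = Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h", "i", "j");
    List<List<String>> parts = split(downsample(lines, 2), 0.2, 42L);
    System.out.println("training: " + parts.get(0));
    System.out.println("test:     " + parts.get(1));
  }
}
```

Fixing the seed makes the split reproducible, which helps when comparing classifiers on the same training and test sets.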
 
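
For the Bump125 entry in the class summary, "nice intervals at arbitrary scale" is read here as the familiar 1, 2, 5, 10, 20, 50, ... progression suggested by the class name. That reading is an assumption, and the sketch below is not the Mahout class: `NiceIntervals` and `next` are hypothetical names. It shows one way to step through such a series, for example to log progress at ever wider intervals.

```java
class NiceIntervals {

  /**
   * Returns the smallest value of the form {1, 2, 5} * 10^k that is strictly
   * greater than current. Overflow for very large inputs is not handled.
   */
  public static long next(long current) {
    long scale = 1;
    while (true) {
      for (long multiplier : new long[] {1, 2, 5}) {
        long candidate = multiplier * scale;
        if (candidate > current) {
          return candidate;
        }
      }
      scale *= 10;
    }
  }

  public static void main(String[] args) {
    long value = 0;
    StringBuilder series = new StringBuilder();
    for (int i = 0; i < 10; i++) {
      value = next(value);
      series.append(value).append(' ');
    }
    // prints: 1 2 5 10 20 50 100 200 500 1000
    System.out.println(series.toString().trim());
  }
}
```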



Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.