org.apache.mahout.vectorizer
Class SimpleTextEncodingVectorizer

java.lang.Object
  extended by org.apache.mahout.vectorizer.SimpleTextEncodingVectorizer
All Implemented Interfaces:
Vectorizer

public class SimpleTextEncodingVectorizer
extends Object
implements Vectorizer

Runs a Map/Reduce job that encodes the input with a FeatureVectorEncoder and writes the result to the output as a SequenceFile.

Works only on plain text input, where each value in the input SequenceFile is a single blob of text.
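
As a rough illustration of what encoding a text blob into a fixed-size vector can look like, the sketch below hashes each token into one of `cardinality` slots and counts occurrences. It does not use Mahout's FeatureVectorEncoder and does not run a Map/Reduce job. The class name HashedTextEncoderSketch, its methods, the tokenization rule, and the use of String.hashCode() are all assumptions made for this example, not part of this class's API.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical stand-alone sketch of hashed text encoding (not Mahout code).
public class HashedTextEncoderSketch {

    // Maps a token to a slot in [0, cardinality). floorMod keeps the
    // result non-negative even when hashCode() is negative.
    static int slotFor(String token, int cardinality) {
        return Math.floorMod(token.hashCode(), cardinality);
    }

    // Encodes a blob of text as a vector of hashed token counts.
    // Tokens are lowercased runs of word characters (an assumed rule).
    static double[] encode(String text, int cardinality) {
        double[] vector = new double[cardinality];
        for (String token : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty()) {
                vector[slotFor(token, cardinality)] += 1.0;
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        double[] v = encode("the quick brown fox jumps over the lazy dog", 16);
        System.out.println(Arrays.toString(v));
    }
}
```

Because distinct tokens can hash to the same slot, a larger cardinality reduces collisions at the cost of a longer vector.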


Constructor Summary
SimpleTextEncodingVectorizer()
           
 
Method Summary
 void createVectors(org.apache.hadoop.fs.Path input, org.apache.hadoop.fs.Path output, VectorizerConfig config)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SimpleTextEncodingVectorizer

public SimpleTextEncodingVectorizer()

Method Detail

createVectors

public void createVectors(org.apache.hadoop.fs.Path input,
                          org.apache.hadoop.fs.Path output,
                          VectorizerConfig config)
                   throws IOException,
                          ClassNotFoundException,
                          InterruptedException
Specified by:
createVectors in interface Vectorizer
Throws:
IOException
ClassNotFoundException
InterruptedException


Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.