org.apache.mahout.text
Class MultipleTextFileInputFormat

java.lang.Object
  extended by org.apache.hadoop.mapreduce.InputFormat<K,V>
      extended by org.apache.hadoop.mapreduce.lib.input.FileInputFormat<K,V>
          extended by org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat<org.apache.hadoop.io.IntWritable,org.apache.hadoop.io.BytesWritable>
              extended by org.apache.mahout.text.MultipleTextFileInputFormat

public class MultipleTextFileInputFormat
extends org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat<org.apache.hadoop.io.IntWritable,org.apache.hadoop.io.BytesWritable>

Combines a large number of text files into a single input that one record reader can process. Intended for use together with the WholeFileRecordReader class.
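
The idea described above can be illustrated without Hadoop on the classpath. The following is a minimal sketch, not the Mahout implementation: WholeFileSketch, FileRecord, readCombined and tempFile are hypothetical names introduced here, and FileRecord only stands in for the (IntWritable, BytesWritable) pair named in this class's type parameters. It assumes the key is the file's position within the combined input and the value is the file's entire content, which this page does not state explicitly.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hadoop-free illustration: many small files become (index, bytes) records. */
public class WholeFileSketch {

    /** Hypothetical stand-in for an (IntWritable, BytesWritable) record. */
    static final class FileRecord {
        final int index;      // assumed key: position of the file in the combined input
        final byte[] content; // assumed value: the whole file as raw bytes

        FileRecord(int index, byte[] content) {
            this.index = index;
            this.content = content;
        }
    }

    /** Reads every file whole, the way a single reader would walk one combined input. */
    static List<FileRecord> readCombined(List<Path> files) {
        List<FileRecord> records = new ArrayList<>();
        for (int i = 0; i < files.size(); i++) {
            try {
                records.add(new FileRecord(i, Files.readAllBytes(files.get(i))));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return records;
    }

    /** Demo helper: writes the given text to a temporary file and returns its path. */
    static Path tempFile(String text) {
        try {
            Path path = Files.createTempFile("whole-file-sketch", ".txt");
            Files.write(path, text.getBytes(StandardCharsets.UTF_8));
            path.toFile().deleteOnExit();
            return path;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<Path> files = Arrays.asList(tempFile("first document"), tempFile("second document"));
        for (FileRecord record : readCombined(files)) {
            System.out.println(record.index + " -> "
                    + new String(record.content, StandardCharsets.UTF_8));
        }
    }
}
```

In an actual job the framework, not user code, groups the files and invokes the record reader; the sketch only shows the shape of the records a mapper might receive under the assumptions above.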


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.Counter
 
Field Summary
 
Fields inherited from class org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat
SPLIT_MINSIZE_PERNODE, SPLIT_MINSIZE_PERRACK
 
Constructor Summary
MultipleTextFileInputFormat()
           
 
Method Summary
 org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.IntWritable,org.apache.hadoop.io.BytesWritable> createRecordReader(org.apache.hadoop.mapreduce.InputSplit inputSplit, org.apache.hadoop.mapreduce.TaskAttemptContext taskAttemptContext)
           
 
Methods inherited from class org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat
createPool, createPool, getFileBlockLocations, getSplits, isSplitable, setMaxSplitSize, setMinSplitSizeNode, setMinSplitSizeRack
 
Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getFormatMinSplitSize, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, listStatus, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

MultipleTextFileInputFormat

public MultipleTextFileInputFormat()

Method Detail

createRecordReader

public org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.IntWritable,org.apache.hadoop.io.BytesWritable> createRecordReader(org.apache.hadoop.mapreduce.InputSplit inputSplit,
        org.apache.hadoop.mapreduce.TaskAttemptContext taskAttemptContext)
        throws IOException
Specified by:
createRecordReader in class org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat<org.apache.hadoop.io.IntWritable,org.apache.hadoop.io.BytesWritable>
Throws:
IOException


Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.