org.apache.mahout.utils.io
Class ChunkedWriter

java.lang.Object
  extended by org.apache.mahout.utils.io.ChunkedWriter
All Implemented Interfaces:
Closeable

public final class ChunkedWriter
extends Object
implements Closeable

Writes data split across multiple Hadoop sequence files of approximately equal size. The data must consist of key-value pairs in which both the key and the value are of type String. All sequence files are created in the same directory and are named "chunk-0", "chunk-1", and so on.


Constructor Summary
ChunkedWriter(org.apache.hadoop.conf.Configuration conf, int chunkSizeInMB, org.apache.hadoop.fs.Path output)
           
 
Method Summary
 void close()
           
 void write(String key, String value)
          Writes a new key-value pair, creating a new sequence file if necessary.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ChunkedWriter

public ChunkedWriter(org.apache.hadoop.conf.Configuration conf,
                     int chunkSizeInMB,
                     org.apache.hadoop.fs.Path output)
              throws IOException
Parameters:
conf - needed by Hadoop to know what filesystem implementation to use.
chunkSizeInMB - approximate size of each file, in megabytes.
output - directory where the sequence files will be created.
Throws:
IOException

Method Detail

write

public void write(String key,
                  String value)
           throws IOException
Writes a new key-value pair, creating a new sequence file if necessary.

Throws:
IOException
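
The "creating a new sequence file if necessary" behaviour can be pictured with a small stand-alone sketch. This is not Mahout's implementation: the class RollingChunkSketch, its byte-based threshold, its tab-separated text format, and the chunkCount() helper are all hypothetical, and plain text files stand in for Hadoop sequence files so that the example needs only the JDK. It assumes the writer rolls over to the next "chunk-N" file before a record would push the current file past the size limit; the actual rollover rule in ChunkedWriter may differ.

```java
import java.io.BufferedWriter;
import java.io.Closeable;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for illustration only: plain text files instead of
// Hadoop sequence files, and a byte threshold instead of chunkSizeInMB.
final class RollingChunkSketch implements Closeable {
  private final Path outputDir;
  private final long maxChunkBytes;
  private BufferedWriter writer;
  private long currentChunkBytes;
  private int currentChunkId;

  RollingChunkSketch(Path outputDir, long maxChunkBytes) throws IOException {
    this.outputDir = Files.createDirectories(outputDir);
    this.maxChunkBytes = maxChunkBytes;
    this.writer = open(0);
  }

  // Opens (or truncates) the file named "chunk-<id>" in the output directory.
  private BufferedWriter open(int chunkId) throws IOException {
    return Files.newBufferedWriter(
        outputDir.resolve("chunk-" + chunkId), StandardCharsets.UTF_8);
  }

  // Writes one key-value pair, starting a new chunk file if necessary.
  void write(String key, String value) throws IOException {
    String record = key + "\t" + value + "\n";
    long recordBytes = record.getBytes(StandardCharsets.UTF_8).length;
    // Roll over only if the current chunk is non-empty and would overflow.
    if (currentChunkBytes > 0 && currentChunkBytes + recordBytes > maxChunkBytes) {
      writer.close();
      currentChunkId++;
      writer = open(currentChunkId);
      currentChunkBytes = 0;
    }
    writer.write(record);
    currentChunkBytes += recordBytes;
  }

  // Number of chunk files created so far (hypothetical helper).
  int chunkCount() {
    return currentChunkId + 1;
  }

  @Override
  public void close() throws IOException {
    writer.close();
  }
}
```

Because a record is never split between files, each chunk can end up slightly under the limit, or over it when a single record is larger than the limit, which is one way file sizes end up only approximately equal.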

close

public void close()
           throws IOException
Specified by:
close in interface Closeable
Throws:
IOException


Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.