org.apache.mahout.common.iterator.sequencefile
Class SequenceFileDirIterator<K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable>

java.lang.Object
  extended by com.google.common.collect.ForwardingObject
      extended by com.google.common.collect.ForwardingIterator<Pair<K,V>>
          extended by org.apache.mahout.common.iterator.sequencefile.SequenceFileDirIterator<K,V>
All Implemented Interfaces:
Closeable, Iterator<Pair<K,V>>

public final class SequenceFileDirIterator<K extends org.apache.hadoop.io.Writable,V extends org.apache.hadoop.io.Writable>
extends com.google.common.collect.ForwardingIterator<Pair<K,V>>
implements Closeable

Like SequenceFileIterator, but iterates over many sequence files rather than just one. The input may be given as an explicit array of paths, as a directory of files to read, or as a glob pattern. In the directory and glob cases, the set of files may optionally be restricted with a PathFilter.
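The behaviour described above (select files, then present their records as one closeable iterator) can be illustrated with a small standard-library sketch. This is not Mahout's implementation: it reads lines of plain text files through java.nio rather than Hadoop SequenceFiles, and the names MultiFileLineIterator and MultiFileIteratorSketch are hypothetical. Sorting the selected paths stands in for an ordering Comparator, and the directory filter stands in for a PathFilter.

```java
import java.io.BufferedReader;
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stand-in: chains the lines of several text files into one closeable iterator. */
final class MultiFileLineIterator implements Iterator<String>, Closeable {
  private final Iterator<Path> files;
  private BufferedReader current; // reader for the file being consumed, or null
  private String nextLine;        // look-ahead buffer, null when nothing is buffered

  MultiFileLineIterator(Path dir, DirectoryStream.Filter<Path> filter) throws IOException {
    List<Path> selected = new ArrayList<>();
    try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir, filter)) {
      for (Path p : entries) {
        selected.add(p);
      }
    }
    Collections.sort(selected); // fixed order across files
    files = selected.iterator();
  }

  @Override
  public boolean hasNext() {
    try {
      while (nextLine == null) {
        if (current == null) {
          if (!files.hasNext()) {
            return false; // every file has been consumed
          }
          current = Files.newBufferedReader(files.next());
        }
        nextLine = current.readLine();
        if (nextLine == null) { // end of this file: move on to the next one
          current.close();
          current = null;
        }
      }
      return true;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  @Override
  public String next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    String line = nextLine;
    nextLine = null;
    return line;
  }

  /** Releases the reader that is still open if iteration stopped early. */
  @Override
  public void close() throws IOException {
    if (current != null) {
      current.close();
      current = null;
    }
  }
}

final class MultiFileIteratorSketch {
  /** Writes three small files to a temp directory and reads back only the "part-*" ones. */
  static List<String> demo() {
    try {
      Path dir = Files.createTempDirectory("seqdir-sketch");
      Files.write(dir.resolve("part-00000"), Arrays.asList("a", "b"));
      Files.write(dir.resolve("part-00001"), Arrays.asList("c"));
      Files.write(dir.resolve("_SUCCESS"), Arrays.asList("ignored"));
      List<String> out = new ArrayList<>();
      try (MultiFileLineIterator it = new MultiFileLineIterator(
          dir, p -> p.getFileName().toString().startsWith("part-"))) {
        while (it.hasNext()) {
          out.add(it.next());
        }
      }
      return out;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints [a, b, c]
  }
}
```

Because the sketch implements Closeable, it can be used in a try-with-resources statement, which closes the underlying reader even when the caller stops iterating before the last file is exhausted.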


Constructor Summary
SequenceFileDirIterator(org.apache.hadoop.fs.Path[] path, boolean reuseKeyValueInstances, org.apache.hadoop.conf.Configuration conf)
          Multi-file sequence file iterator in which the files are given explicitly as an array of paths.
SequenceFileDirIterator(org.apache.hadoop.fs.Path path, PathType pathType, org.apache.hadoop.fs.PathFilter filter, Comparator<org.apache.hadoop.fs.FileStatus> ordering, boolean reuseKeyValueInstances, org.apache.hadoop.conf.Configuration conf)
          Constructor that uses either FileSystem.listStatus(Path) or FileSystem.globStatus(Path), depending on the pathType parameter, to obtain the list of files to iterate over.
 
Method Summary
 void close()
           
protected  Iterator<Pair<K,V>> delegate()
           
 
Methods inherited from class com.google.common.collect.ForwardingIterator
hasNext, next, remove
 
Methods inherited from class com.google.common.collect.ForwardingObject
toString
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

SequenceFileDirIterator

public SequenceFileDirIterator(org.apache.hadoop.fs.Path[] path,
                               boolean reuseKeyValueInstances,
                               org.apache.hadoop.conf.Configuration conf)
                        throws IOException
Multi-file sequence file iterator in which the files are given explicitly as an array of paths.

Throws:
IOException

SequenceFileDirIterator

public SequenceFileDirIterator(org.apache.hadoop.fs.Path path,
                               PathType pathType,
                               org.apache.hadoop.fs.PathFilter filter,
                               Comparator<org.apache.hadoop.fs.FileStatus> ordering,
                               boolean reuseKeyValueInstances,
                               org.apache.hadoop.conf.Configuration conf)
                        throws IOException
Constructor that uses either FileSystem.listStatus(Path) or FileSystem.globStatus(Path), depending on the pathType parameter, to obtain the list of files to iterate over.

Throws:
IOException
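The difference between listing a directory and expanding a glob can be sketched with the standard library alone. The following is an analogy under stated assumptions, not the Hadoop or Mahout code path: the class PathResolutionSketch, its Mode enum, and the helper sampleDir are hypothetical, and the glob here is matched against file names inside a single local directory, whereas FileSystem.globStatus(Path) takes a full path pattern.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class PathResolutionSketch {
  /** Hypothetical stand-in for the two ways of interpreting the input path. */
  enum Mode { LIST, GLOB }

  /** LIST: every entry of dir. GLOB: only entries of dir whose name matches pattern. */
  static List<String> resolve(Path dir, String pattern, Mode mode) {
    List<String> names = new ArrayList<>();
    try (DirectoryStream<Path> entries = mode == Mode.GLOB
        ? Files.newDirectoryStream(dir, pattern)
        : Files.newDirectoryStream(dir)) {
      for (Path p : entries) {
        names.add(p.getFileName().toString());
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    Collections.sort(names); // directory streams have no guaranteed order
    return names;
  }

  /** Creates a temp directory holding two "part" files and one marker file. */
  static Path sampleDir() {
    try {
      Path dir = Files.createTempDirectory("pathtype-sketch");
      for (String name : Arrays.asList("part-00000", "part-00001", "_SUCCESS")) {
        Files.createFile(dir.resolve(name));
      }
      return dir;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path dir = sampleDir();
    System.out.println(resolve(dir, "part-*", Mode.LIST)); // [_SUCCESS, part-00000, part-00001]
    System.out.println(resolve(dir, "part-*", Mode.GLOB)); // [part-00000, part-00001]
  }
}
```

In list mode every directory entry is returned, so a separate filter is what excludes marker files such as _SUCCESS; in glob mode the pattern itself already narrows the selection.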

Method Detail

delegate

protected Iterator<Pair<K,V>> delegate()
Overrides:
delegate in class com.google.common.collect.ForwardingIterator<Pair<K,V>>

close

public void close()
           throws IOException
Specified by:
close in interface Closeable
Throws:
IOException


Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.