org.apache.mahout.math.neighborhood
Class Searcher

java.lang.Object
  extended by org.apache.mahout.math.neighborhood.Searcher
All Implemented Interfaces:
Iterable<Vector>
Direct Known Subclasses:
UpdatableSearcher

public abstract class Searcher
extends Object
implements Iterable<Vector>

Describes how to search a collection of vectors. The vectors can be of any type (weighted, sparse, ...), but only the values of the vectors matter when searching; weights, indices, and similar metadata do not. Iterating through a Searcher returns the Vectors that were added to it.


Field Summary
protected  DistanceMeasure distanceMeasure
           
 
Constructor Summary
protected Searcher(DistanceMeasure distanceMeasure)
           
 
Method Summary
abstract  void add(Vector vector)
          Add a new Vector to the Searcher that will be checked when getting the nearest neighbors.
 void addAll(Iterable<? extends Vector> data)
          Adds all the data elements to the Searcher.
 void addAllMatrixSlices(Iterable<MatrixSlice> data)
          Adds all the data elements to the Searcher.
 void addAllMatrixSlicesAsWeightedVectors(Iterable<MatrixSlice> data)
           
 void clear()
           
static org.apache.lucene.util.PriorityQueue<WeightedThing<Vector>> getCandidateQueue(int limit)
          Returns a bounded-size priority queue, in reverse order, that keeps track of the best nearest-neighbor vectors.
 DistanceMeasure getDistanceMeasure()
           
 boolean remove(Vector v, double epsilon)
           
 List<List<WeightedThing<Vector>>> search(Iterable<? extends Vector> queries, int limit)
           
abstract  List<WeightedThing<Vector>> search(Vector query, int limit)
          When querying the Searcher for the closest vectors, a list of WeightedThings is returned.
 List<WeightedThing<Vector>> searchFirst(Iterable<? extends Vector> queries, boolean differentThanQuery)
           
abstract  WeightedThing<Vector> searchFirst(Vector query, boolean differentThanQuery)
          Returns the closest vector to the query.
abstract  int size()
          Returns the number of WeightedVectors being searched for nearest neighbors.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface java.lang.Iterable
iterator
 

Field Detail

distanceMeasure

protected DistanceMeasure distanceMeasure
Constructor Detail

Searcher

protected Searcher(DistanceMeasure distanceMeasure)
Method Detail

getDistanceMeasure

public DistanceMeasure getDistanceMeasure()

add

public abstract void add(Vector vector)
Add a new Vector to the Searcher that will be checked when getting the nearest neighbors. The vector IS NOT CLONED. Do not modify the vector externally; otherwise, the internal Searcher data structures could be invalidated.
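The aliasing hazard described above can be illustrated with a minimal sketch. It does not use the Mahout classes: AliasingSketch is a hypothetical stand-in that stores plain double[] arrays by reference, which is assumed here to mirror the "not cloned" behavior of add.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Searcher that stores references without copying.
class AliasingSketch {
    private final List<double[]> stored = new ArrayList<>();

    // Like the documented add(), this keeps the caller's object, not a clone.
    void add(double[] vector) {
        stored.add(vector);
    }

    double[] get(int index) {
        return stored.get(index);
    }

    public static void main(String[] args) {
        AliasingSketch searcher = new AliasingSketch();

        double[] shared = {1.0, 2.0};
        searcher.add(shared);
        shared[0] = 99.0; // also changes the vector held by the searcher
        System.out.println(searcher.get(0)[0]);

        double[] original = {1.0, 2.0};
        searcher.add(original.clone()); // defensive copy made by the caller
        original[0] = 99.0;             // the stored copy is unaffected
        System.out.println(searcher.get(1)[0]);
    }
}
```

A caller that needs to keep mutating a vector after adding it can pass in a copy, as in the second half of main.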


size

public abstract int size()
Returns the number of WeightedVectors being searched for nearest neighbors.


search

public abstract List<WeightedThing<Vector>> search(Vector query,
                                                   int limit)
When querying the Searcher for the closest vectors, a list of WeightedThings is returned. The value of each WeightedThing is the neighbor, and the weight is the distance between the query and that neighbor (calculated by some metric; see a concrete implementation). The actual type of the vector in the pair is the same as that of the vector added to the Searcher.

Parameters:
query - the vector to search for
limit - the number of results to return
Returns:
the list of weighted vectors closest to the query
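The shape of the result can be sketched with a brute-force scan. This is an illustration only, not Mahout's implementation: BruteSearchSketch and its Weighted class are hypothetical stand-ins for a Searcher and WeightedThing, vectors are plain double[] arrays, and Euclidean distance is assumed as the metric.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical brute-force stand-in; the real Searcher subclasses may work differently.
class BruteSearchSketch {
    // Pairs a stored vector (value) with its distance to the query (weight).
    static final class Weighted {
        final double[] value;
        final double weight;

        Weighted(double[] value, double weight) {
            this.value = value;
            this.weight = weight;
        }
    }

    // Euclidean distance, assumed here as the metric.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Returns at most `limit` stored vectors, closest to the query first.
    static List<Weighted> search(List<double[]> data, double[] query, int limit) {
        List<Weighted> all = new ArrayList<>();
        for (double[] vector : data) {
            all.add(new Weighted(vector, distance(query, vector)));
        }
        all.sort((x, y) -> Double.compare(x.weight, y.weight));
        return new ArrayList<>(all.subList(0, Math.min(limit, all.size())));
    }

    public static void main(String[] args) {
        List<double[]> data = Arrays.asList(
            new double[] {0, 0}, new double[] {3, 4}, new double[] {1, 0});
        for (Weighted w : search(data, new double[] {0, 0}, 2)) {
            System.out.println(Arrays.toString(w.value) + " at distance " + w.weight);
        }
    }
}
```

Each returned value is the same object that was stored, which matches the statement that the type of vector in the pair is the type that was added.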

search

public List<List<WeightedThing<Vector>>> search(Iterable<? extends Vector> queries,
                                                int limit)

searchFirst

public abstract WeightedThing<Vector> searchFirst(Vector query,
                                                  boolean differentThanQuery)
Returns the closest vector to the query. When only the nearest vector is needed, use this method, NOT search(query, limit), because it is faster (less overhead).

Parameters:
query - the vector to search for
differentThanQuery - if true, returns the closest vector different from the query (this only matters if the query is among the searched vectors); otherwise, returns the closest vector to the query (even if it is the same vector).
Returns:
the weighted vector closest to the query
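The effect of differentThanQuery can be sketched as a single linear pass. This is a sketch under stated assumptions, not the library's code: SearchFirstSketch is a hypothetical class, vectors are plain double[] arrays, squared Euclidean distance is the assumed metric, and "different" is assumed to mean element-wise inequality.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical single-pass nearest-neighbor lookup; no result list or queue is built.
class SearchFirstSketch {
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    // Returns the closest stored vector, or null if no candidate qualifies.
    static double[] searchFirst(List<double[]> data, double[] query, boolean differentThanQuery) {
        double[] best = null;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (double[] vector : data) {
            // Assumption: a vector equal to the query element-wise is skipped.
            if (differentThanQuery && Arrays.equals(vector, query)) {
                continue;
            }
            double d = squaredDistance(vector, query);
            if (d < bestDistance) {
                bestDistance = d;
                best = vector;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<double[]> data = Arrays.asList(
            new double[] {1, 1}, new double[] {2, 2}, new double[] {5, 5});
        double[] query = {1, 1};
        System.out.println(Arrays.toString(searchFirst(data, query, false)));
        System.out.println(Arrays.toString(searchFirst(data, query, true)));
    }
}
```

Tracking only the current best avoids the sorting or queue maintenance that a limit-based search needs, which is one plausible reason for the lower overhead mentioned above.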

searchFirst

public List<WeightedThing<Vector>> searchFirst(Iterable<? extends Vector> queries,
                                               boolean differentThanQuery)

addAll

public void addAll(Iterable<? extends Vector> data)
Adds all the data elements to the Searcher.

Parameters:
data - an iterable of Vectors to add.

addAllMatrixSlices

public void addAllMatrixSlices(Iterable<MatrixSlice> data)
Adds all the data elements to the Searcher.

Parameters:
data - an iterable of MatrixSlices to add.

addAllMatrixSlicesAsWeightedVectors

public void addAllMatrixSlicesAsWeightedVectors(Iterable<MatrixSlice> data)

remove

public boolean remove(Vector v,
                      double epsilon)

clear

public void clear()

getCandidateQueue

public static org.apache.lucene.util.PriorityQueue<WeightedThing<Vector>> getCandidateQueue(int limit)
Returns a bounded-size priority queue, in reverse order, that keeps track of the best nearest-neighbor vectors.

Parameters:
limit - maximum size of the heap.
Returns:
the priority queue.
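The idea of a bounded, reverse-ordered queue can be sketched as follows. The sketch uses java.util.PriorityQueue as a stand-in rather than the Lucene class named in the signature, and CandidateQueueSketch and keepSmallest are hypothetical names. Because the queue is ordered largest-first, the worst retained candidate sits at the head and is the one evicted when the bound is exceeded.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration of keeping the `limit` smallest distances seen so far.
class CandidateQueueSketch {
    static List<Double> keepSmallest(double[] distances, int limit) {
        // Reverse ordering: the head is the largest (worst) distance retained.
        PriorityQueue<Double> queue = new PriorityQueue<>(Collections.reverseOrder());
        for (double d : distances) {
            queue.add(d);
            if (queue.size() > limit) {
                queue.poll(); // evict the current worst candidate
            }
        }
        // Copy out and sort ascending so the closest candidate comes first.
        List<Double> result = new ArrayList<>(queue);
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(keepSmallest(new double[] {5, 1, 4, 2, 3}, 3));
    }
}
```

With this layout each insertion costs time logarithmic in limit rather than in the number of vectors scanned.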


Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.