org.apache.mahout.clustering
Class AbstractCluster

java.lang.Object
  extended by org.apache.mahout.clustering.AbstractCluster
All Implemented Interfaces:
org.apache.hadoop.io.Writable, Cluster, Model<VectorWritable>, Parametered
Direct Known Subclasses:
DistanceMeasureCluster

public abstract class AbstractCluster
extends Object
implements Cluster


Nested Class Summary
 
Nested classes/interfaces inherited from interface org.apache.mahout.common.parameters.Parametered
Parametered.ParameteredGeneralizations
 
Field Summary
 
Fields inherited from interface org.apache.mahout.clustering.Cluster
CLUSTERED_POINTS_DIR, CLUSTERS_DIR, FINAL_ITERATION_SUFFIX, INITIAL_CLUSTERS_DIR
 
Fields inherited from interface org.apache.mahout.common.parameters.Parametered
log
 
Constructor Summary
protected AbstractCluster()
           
protected AbstractCluster(Vector point, int id2)
           
protected AbstractCluster(Vector center2, Vector radius2, int id2)
           
 
Method Summary
 String asFormatString(String[] bindings)
          Produce a custom, human-friendly, printable representation of the Cluster.
 Vector computeCentroid()
          Compute the centroid by averaging the pointTotals
 void computeParameters()
          Compute a new set of posterior parameters from the observations made since this model was created
 void configure(org.apache.hadoop.conf.Configuration job)
           
 void createParameters(String prefix, org.apache.hadoop.conf.Configuration jobConf)
          EXPERT: consumers should never have to call this method.
static String formatVector(Vector v, String[] bindings)
          Return a human-readable formatted string representation of the vector. It is not intended to be complete or usable as an input/output representation.
 Vector getCenter()
          Get the "center" of the Cluster as a Vector
 int getId()
          Get the id of the Cluster
abstract  String getIdentifier()
           
 long getNumObservations()
          Return the number of observations that this model has seen since its parameters were last computed
 Collection<Parameter<?>> getParameters()
           
 Vector getRadius()
          Get the "radius" of the Cluster as a Vector.
protected  double getS0()
           
protected  Vector getS1()
           
protected  Vector getS2()
           
 long getTotalObservations()
          Return the number of observations that this model has seen over its lifetime
 boolean isConverged()
           
 void observe(Model<VectorWritable> x)
          Observe the given model, retaining information about its observations
 void observe(Vector x)
           
 void observe(Vector x, double weight)
           
 void observe(VectorWritable x)
          Observe the given observation, retaining information about it
 void observe(VectorWritable x, double weight)
          Observe the given observation, retaining information about it
 void readFields(DataInput in)
           
protected  void setCenter(Vector center)
           
protected  void setId(int id)
           
protected  void setNumObservations(long l)
           
protected  void setRadius(Vector radius)
           
protected  void setS0(double s0)
           
protected  void setS1(Vector s1)
           
protected  void setS2(Vector s2)
           
protected  void setTotalObservations(long totalPoints)
           
 void write(DataOutput out)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.mahout.clustering.Model
pdf, sampleFromPosterior
 

Constructor Detail

AbstractCluster

protected AbstractCluster()

AbstractCluster

protected AbstractCluster(Vector point,
                          int id2)

AbstractCluster

protected AbstractCluster(Vector center2,
                          Vector radius2,
                          int id2)
Method Detail

write

public void write(DataOutput out)
           throws IOException
Specified by:
write in interface org.apache.hadoop.io.Writable
Throws:
IOException

readFields

public void readFields(DataInput in)
                throws IOException
Specified by:
readFields in interface org.apache.hadoop.io.Writable
Throws:
IOException
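
As a rough illustration of the write/readFields contract, the sketch below round-trips a small state holder through java.io streams. The class ClusterStateSketch, its fields, and the field order are hypothetical; this page does not document the serialized layout that AbstractCluster actually uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a cluster's serializable state; not Mahout's layout.
class ClusterStateSketch {
    int id;
    long numObservations;
    double[] center = new double[0];

    // Same shape as Writable.write(DataOutput): emit every field in a fixed order.
    void write(DataOutput out) throws IOException {
        out.writeInt(id);
        out.writeLong(numObservations);
        out.writeInt(center.length);
        for (double c : center) {
            out.writeDouble(c);
        }
    }

    // Same shape as Writable.readFields(DataInput): overwrite every field,
    // reading in exactly the order used by write.
    void readFields(DataInput in) throws IOException {
        id = in.readInt();
        numObservations = in.readLong();
        center = new double[in.readInt()];
        for (int i = 0; i < center.length; i++) {
            center[i] = in.readDouble();
        }
    }

    // Serialize to a byte array and read the bytes back into a fresh instance.
    static ClusterStateSketch roundTrip(ClusterStateSketch src) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            src.write(new DataOutputStream(bytes));
            ClusterStateSketch copy = new ClusterStateSketch();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The point of the sketch is only that readFields must consume fields in the same order and with the same types that write produced them.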

configure

public void configure(org.apache.hadoop.conf.Configuration job)
Specified by:
configure in interface Parametered

getParameters

public Collection<Parameter<?>> getParameters()
Specified by:
getParameters in interface Parametered

createParameters

public void createParameters(String prefix,
                             org.apache.hadoop.conf.Configuration jobConf)
Description copied from interface: Parametered
EXPERT: consumers should never have to call this method. It would be visible only to Parametered.ParameteredGeneralizations if Java supported that kind of access. Calling this method should create a new list of parameters.

Specified by:
createParameters in interface Parametered
Parameters:
prefix - ends with a dot if not empty.
jobConf - configuration used for retrieving values
See Also:
invoking method, invoking method

getId

public int getId()
Description copied from interface: Cluster
Get the id of the Cluster

Specified by:
getId in interface Cluster
Returns:
a unique integer

setId

protected void setId(int id)
Parameters:
id - the id to set

getNumObservations

public long getNumObservations()
Description copied from interface: Model
Return the number of observations that this model has seen since its parameters were last computed

Specified by:
getNumObservations in interface Model<VectorWritable>
Returns:
a long

setNumObservations

protected void setNumObservations(long l)
Parameters:
l - the number of observations to set

getTotalObservations

public long getTotalObservations()
Description copied from interface: Model
Return the number of observations that this model has seen over its lifetime

Specified by:
getTotalObservations in interface Model<VectorWritable>
Returns:
a long
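
The two counters described above differ only in their counting window. The sketch below shows one way they could relate, assuming the per-window count is reset when parameters are recomputed. The class ObservationCountSketch is hypothetical and the reset behavior is an assumption; this page does not say when AbstractCluster updates either value.

```java
// Hypothetical counter pair; the reset-on-recompute behavior is an assumption.
class ObservationCountSketch {
    private long sinceLastCompute; // plays the role of getNumObservations()
    private long lifetime;         // plays the role of getTotalObservations()

    // Record one observation in both windows.
    void observe() {
        sinceLastCompute++;
        lifetime++;
    }

    // Assumed: recomputing parameters starts a new counting window.
    void computeParameters() {
        sinceLastCompute = 0;
    }

    long getNumObservations() {
        return sinceLastCompute;
    }

    long getTotalObservations() {
        return lifetime;
    }
}
```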

setTotalObservations

protected void setTotalObservations(long totalPoints)

getCenter

public Vector getCenter()
Description copied from interface: Cluster
Get the "center" of the Cluster as a Vector

Specified by:
getCenter in interface Cluster
Returns:
a Vector

setCenter

protected void setCenter(Vector center)
Parameters:
center - the center to set

getRadius

public Vector getRadius()
Description copied from interface: Cluster
Get the "radius" of the Cluster as a Vector. Usually the radius is the standard deviation expressed as a Vector of size equal to the center. Some clusters may return zero values if not appropriate.

Specified by:
getRadius in interface Cluster
Returns:
a Vector
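
The description above says the radius is usually the standard deviation, one value per dimension of the center. A minimal sketch of that computation from running sums follows. The parameter names echo the getS0, getS1 and getS2 accessors, but reading them as the sum of weights, the weighted sum, and the weighted sum of squares is an assumption, and plain double[] arrays stand in for Vector.

```java
// Hypothetical helper: per-dimension standard deviation from running sums.
class RadiusSketch {
    // s0: assumed sum of weights; s1: assumed weighted sum of points;
    // s2: assumed weighted sum of squared points.
    static double[] radius(double s0, double[] s1, double[] s2) {
        double[] r = new double[s1.length];
        if (s0 <= 0) {
            return r; // nothing observed: all zeros
        }
        for (int i = 0; i < r.length; i++) {
            double mean = s1[i] / s0;
            double variance = s2[i] / s0 - mean * mean;
            // Clamp tiny negative values caused by floating-point rounding.
            r[i] = Math.sqrt(Math.max(0.0, variance));
        }
        return r;
    }
}
```

For the one-dimensional points 1 and 3 with unit weights, the sums are s0 = 2, s1 = 4 and s2 = 10, which gives a mean of 2 and a standard deviation of 1.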

setRadius

protected void setRadius(Vector radius)
Parameters:
radius - the radius to set

getS0

protected double getS0()
Returns:
the s0

setS0

protected void setS0(double s0)

getS1

protected Vector getS1()
Returns:
the s1

setS1

protected void setS1(Vector s1)

getS2

protected Vector getS2()
Returns:
the s2

setS2

protected void setS2(Vector s2)

observe

public void observe(Model<VectorWritable> x)
Description copied from interface: Model
Observe the given model, retaining information about its observations

Specified by:
observe in interface Model<VectorWritable>
Parameters:
x - a Model<VectorWritable>

observe

public void observe(VectorWritable x)
Description copied from interface: Model
Observe the given observation, retaining information about it

Specified by:
observe in interface Model<VectorWritable>
Parameters:
x - an Observation from the posterior

observe

public void observe(VectorWritable x,
                    double weight)
Description copied from interface: Model
Observe the given observation, retaining information about it

Specified by:
observe in interface Model<VectorWritable>
Parameters:
x - an Observation from the posterior
weight - a double weighting factor

observe

public void observe(Vector x,
                    double weight)

observe

public void observe(Vector x)
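
This page does not describe what the observe(Vector) and observe(Vector, double) overloads accumulate. One common approach, assumed here, is to keep three running sums: the total weight, the weighted sum of points, and the weighted sum of squared points. The class RunningSumsSketch is hypothetical and uses double[] in place of Vector.

```java
// Hypothetical accumulator; the meaning of s0, s1 and s2 is an assumption.
class RunningSumsSketch {
    double s0;         // sum of weights
    final double[] s1; // weighted sum of points, per dimension
    final double[] s2; // weighted sum of squared points, per dimension

    RunningSumsSketch(int dimensions) {
        s1 = new double[dimensions];
        s2 = new double[dimensions];
    }

    // Fold one weighted point into the running sums.
    void observe(double[] x, double weight) {
        s0 += weight;
        for (int i = 0; i < x.length; i++) {
            s1[i] += weight * x[i];
            s2[i] += weight * x[i] * x[i];
        }
    }

    // Unweighted form: treat the point as having weight 1.
    void observe(double[] x) {
        observe(x, 1.0);
    }
}
```

Keeping sums rather than the points themselves means the mean and standard deviation can be derived later without storing every observation.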

computeParameters

public void computeParameters()
Description copied from interface: Model
Compute a new set of posterior parameters from the observations made since this model was created

Specified by:
computeParameters in interface Model<VectorWritable>

asFormatString

public String asFormatString(String[] bindings)
Description copied from interface: Cluster
Produce a custom, human-friendly, printable representation of the Cluster.

Specified by:
asFormatString in interface Cluster
Parameters:
bindings - an optional String[] containing labels used to format the primary Vector/s of this implementation.
Returns:
a String

getIdentifier

public abstract String getIdentifier()

computeCentroid

public Vector computeCentroid()
Compute the centroid by averaging the pointTotals

Returns:
the new centroid
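
Averaging point totals amounts to dividing each component of the accumulated sum by the number (or total weight) of observations. A minimal sketch under that reading is below. The class CentroidSketch and its parameters are hypothetical, and rejecting an empty cluster is a choice made for the sketch, not documented behavior of computeCentroid.

```java
// Hypothetical helper: centroid as the per-dimension average of point totals.
class CentroidSketch {
    static double[] centroid(double count, double[] pointTotals) {
        if (count <= 0) {
            // Sketch-only choice: refuse to average over zero observations.
            throw new IllegalArgumentException("count must be positive");
        }
        double[] c = new double[pointTotals.length];
        for (int i = 0; i < c.length; i++) {
            c[i] = pointTotals[i] / count;
        }
        return c;
    }
}
```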

formatVector

public static String formatVector(Vector v,
                                  String[] bindings)
Return a human-readable formatted string representation of the vector. It is not intended to be complete or usable as an input/output representation.
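
To give a feel for how bindings can label vector components, here is a small sketch of a formatter in the same spirit. The class FormatVectorSketch and its output layout (three decimals, label:value pairs in brackets) are assumptions for illustration; the exact string produced by formatVector is not specified on this page.

```java
import java.util.Locale;
import java.util.StringJoiner;

// Hypothetical formatter; the output layout is illustrative, not Mahout's.
class FormatVectorSketch {
    static String format(double[] v, String[] bindings) {
        StringJoiner out = new StringJoiner(", ", "[", "]");
        for (int i = 0; i < v.length; i++) {
            // Locale.ROOT keeps the decimal separator stable across systems.
            String value = String.format(Locale.ROOT, "%.3f", v[i]);
            boolean hasLabel = bindings != null && i < bindings.length && bindings[i] != null;
            out.add(hasLabel ? bindings[i] + ":" + value : value);
        }
        return out.toString();
    }
}
```

With bindings {"x", "y"} the vector (1.0, 2.5) is rendered as [x:1.000, y:2.500]; with null bindings it is rendered as [1.000, 2.500].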


isConverged

public boolean isConverged()
Specified by:
isConverged in interface Cluster
Returns:
true if the receiver has converged, or false if that has no meaning for the implementation


Copyright © 2008–2014 The Apache Software Foundation. All rights reserved.