java.lang.Object

org.hipparchus.clustering.Clusterer<T>

org.hipparchus.clustering.KMeansPlusPlusClusterer<T>

Type Parameters: T - type of the points to cluster

public class KMeansPlusPlusClusterer<T extends Clusterable> extends Clusterer<T>

Clustering algorithm based on the k-means++ algorithm of David Arthur and Sergei Vassilvitskii.

See Also:

K-means++ (wikipedia)
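
For readers unfamiliar with k-means++, the sketch below illustrates the seeding step that distinguishes it from plain k-means: after a uniformly random first center, each further center is drawn with probability proportional to its squared distance from the nearest center chosen so far. This is a stand-alone illustration on raw double[] points; the class and method names (KMeansPlusPlusSeedingSketch, chooseInitialCenters) are hypothetical and this is not the Hipparchus source.

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative k-means++ seeding; hypothetical names, not the library implementation. */
public class KMeansPlusPlusSeedingSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Returns the indices of k distinct points chosen as initial centers. */
    static int[] chooseInitialCenters(double[][] points, int k, Random rng) {
        int n = points.length;
        if (k < 1 || k > n) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        int[] centers = new int[k];
        boolean[] taken = new boolean[n];
        double[] minDistSq = new double[n];
        Arrays.fill(minDistSq, Double.POSITIVE_INFINITY);

        // First center: uniform random choice.
        centers[0] = rng.nextInt(n);
        taken[centers[0]] = true;

        for (int c = 1; c < k; c++) {
            // Refresh each point's squared distance to its nearest chosen center.
            double[] latest = points[centers[c - 1]];
            double total = 0.0;
            for (int i = 0; i < n; i++) {
                minDistSq[i] = Math.min(minDistSq[i], squaredDistance(points[i], latest));
                total += minDistSq[i];
            }
            // Draw the next center with probability proportional to minDistSq.
            double target = rng.nextDouble() * total;
            double cumulative = 0.0;
            int chosen = -1;
            for (int i = 0; i < n; i++) {
                if (taken[i]) {
                    continue;
                }
                chosen = i; // fallback when every remaining point coincides with a center
                cumulative += minDistSq[i];
                if (minDistSq[i] > 0.0 && cumulative >= target) {
                    break;
                }
            }
            centers[c] = chosen;
            taken[chosen] = true;
        }
        return centers;
    }
}
```

The iterations that follow seeding (assign each point to its nearest center, then recompute the centers) are the same as in standard k-means.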

Nested Class Summary

Nested Classes

Modifier and Type

Class

Description

static enum

KMeansPlusPlusClusterer.EmptyClusterStrategy

Strategies to use for replacing an empty cluster.
Constructor Summary

Constructors

Constructor

Description

KMeansPlusPlusClusterer(int k)

Build a clusterer.

KMeansPlusPlusClusterer(int k, int maxIterations)

Build a clusterer.

KMeansPlusPlusClusterer(int k, int maxIterations, DistanceMeasure measure)

Build a clusterer.

KMeansPlusPlusClusterer(int k, int maxIterations, DistanceMeasure measure, RandomGenerator random)

Build a clusterer.

KMeansPlusPlusClusterer(int k, int maxIterations, DistanceMeasure measure, RandomGenerator random, KMeansPlusPlusClusterer.EmptyClusterStrategy emptyStrategy)

Build a clusterer.
Method Summary

Modifier and Type

Method

Description

List<CentroidCluster<T>>

cluster(Collection<T> points)

Runs the K-means++ clustering algorithm.

KMeansPlusPlusClusterer.EmptyClusterStrategy

getEmptyClusterStrategy()

Returns the KMeansPlusPlusClusterer.EmptyClusterStrategy used by this instance.

int

getK()

Returns the number of clusters this instance will use.

int

getMaxIterations()

Returns the maximum number of iterations this instance will use.

RandomGenerator

getRandomGenerator()

Returns the random generator this instance will use.

Methods inherited from class org.hipparchus.clustering.Clusterer
distance, getDistanceMeasure

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- KMeansPlusPlusClusterer
  
  public KMeansPlusPlusClusterer(int k)
  
  Build a clusterer.
  The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with the largest distance variance.
  The Euclidean distance is used as the default distance measure.
  
  Parameters:
  
  k - the number of clusters to split the data into
- KMeansPlusPlusClusterer
  
  public KMeansPlusPlusClusterer(int k, int maxIterations)
  
  Build a clusterer.
  The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with the largest distance variance.
  The Euclidean distance is used as the default distance measure.
  
  Parameters:
  
  k - the number of clusters to split the data into
  
  maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
- KMeansPlusPlusClusterer
  
  public KMeansPlusPlusClusterer(int k, int maxIterations, DistanceMeasure measure)
  
  Build a clusterer.
  The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with the largest distance variance.
  
  Parameters:
  
  k - the number of clusters to split the data into
  
  maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
  
  measure - the distance measure to use
- KMeansPlusPlusClusterer
  
  public KMeansPlusPlusClusterer(int k, int maxIterations, DistanceMeasure measure, RandomGenerator random)
  
  Build a clusterer.
  The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with the largest distance variance.
  
  Parameters:
  
  k - the number of clusters to split the data into
  
  maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
  
  measure - the distance measure to use
  
  random - random generator to use for choosing initial centers
- KMeansPlusPlusClusterer
  
  public KMeansPlusPlusClusterer(int k, int maxIterations, DistanceMeasure measure, RandomGenerator random, KMeansPlusPlusClusterer.EmptyClusterStrategy emptyStrategy)
  
  Build a clusterer.
  
  Parameters:
  
  k - the number of clusters to split the data into
  
  maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
  
  measure - the distance measure to use
  
  random - random generator to use for choosing initial centers
  
  emptyStrategy - strategy to use for handling empty clusters that may appear during algorithm iterations
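
One way to read the default empty-cluster strategy described in the constructors above is: for every cluster, compute the variance of the distances between its points and its center, then pick the cluster where that variance is largest; a point taken from that cluster would then seed the replacement for the empty one. The sketch below shows only the selection step on raw double[] data. The names (LargestVarianceSketch, distanceVariance, indexOfLargestVariance) are hypothetical, and the population variance is an assumption made for simplicity; the estimator used by the library may differ.

```java
import java.util.List;

/** Illustrative selection of the cluster with the largest distance variance; hypothetical names. */
public class LargestVarianceSketch {

    /** Euclidean distance between two points of equal dimension. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Population variance of the point-to-center distances; 0 for an empty cluster. */
    static double distanceVariance(double[] center, List<double[]> points) {
        if (points.isEmpty()) {
            return 0.0;
        }
        double mean = 0.0;
        for (double[] p : points) {
            mean += distance(p, center);
        }
        mean /= points.size();
        double sumSq = 0.0;
        for (double[] p : points) {
            double dev = distance(p, center) - mean;
            sumSq += dev * dev;
        }
        return sumSq / points.size();
    }

    /** Index of the cluster whose distance variance is largest, or -1 if there are no clusters. */
    static int indexOfLargestVariance(List<double[]> centers, List<List<double[]>> clusters) {
        int best = -1;
        double bestVariance = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < clusters.size(); i++) {
            double v = distanceVariance(centers.get(i), clusters.get(i));
            if (v > bestVariance) {
                bestVariance = v;
                best = i;
            }
        }
        return best;
    }
}
```
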
Method Details
- getK
  
  public int getK()
  
  Returns the number of clusters this instance will use.
  
  Returns:
  
  the number of clusters
- getMaxIterations
  
  public int getMaxIterations()
  
  Returns the maximum number of iterations this instance will use.
  
  Returns:
  
  the maximum number of iterations, or -1 if no maximum is set
- getRandomGenerator
  
  public RandomGenerator getRandomGenerator()
  
  Returns the random generator this instance will use.
  
  Returns:
  
  the random generator
- getEmptyClusterStrategy
  
  public KMeansPlusPlusClusterer.EmptyClusterStrategy getEmptyClusterStrategy()
  
  Returns the KMeansPlusPlusClusterer.EmptyClusterStrategy used by this instance.
  
  Returns:
  
  the KMeansPlusPlusClusterer.EmptyClusterStrategy
- cluster
  
  public List<CentroidCluster<T>> cluster(Collection<T> points) throws MathIllegalArgumentException, MathIllegalStateException
  
  Runs the K-means++ clustering algorithm.
  
  Specified by:
  
  cluster in class Clusterer<T extends Clusterable>
  
  Parameters:
  
  points - the points to cluster
  
  Returns:
  
  a list of clusters containing the points
  
  Throws:
  
  MathIllegalArgumentException - if the data points are null or the number of clusters is larger than the number of data points
  
  MathIllegalStateException - if an empty cluster is encountered and the emptyStrategy is set to ERROR
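
A minimal usage sketch follows, assuming the Hipparchus clustering module is on the classpath. It uses DoublePoint as the Clusterable implementation, EuclideanDistance as the measure, and Well19937c with a fixed seed so that the choice of initial centers is repeatable. The sample data and the class name KMeansPlusPlusUsageSketch are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.hipparchus.clustering.CentroidCluster;
import org.hipparchus.clustering.DoublePoint;
import org.hipparchus.clustering.KMeansPlusPlusClusterer;
import org.hipparchus.clustering.distance.EuclideanDistance;
import org.hipparchus.random.Well19937c;

/** Illustrative use of KMeansPlusPlusClusterer on two well-separated groups of points. */
public class KMeansPlusPlusUsageSketch {

    static List<CentroidCluster<DoublePoint>> clusterTwoBlobs() {
        // Made-up sample data: three points near the origin, three near (10, 10).
        List<DoublePoint> points = new ArrayList<>();
        points.add(new DoublePoint(new double[] {0.0, 0.0}));
        points.add(new DoublePoint(new double[] {0.0, 1.0}));
        points.add(new DoublePoint(new double[] {1.0, 0.0}));
        points.add(new DoublePoint(new double[] {10.0, 10.0}));
        points.add(new DoublePoint(new double[] {10.0, 11.0}));
        points.add(new DoublePoint(new double[] {11.0, 10.0}));

        // k = 2 clusters, at most 100 iterations, explicit measure, seeded generator,
        // and the empty-cluster strategy spelled out (ERROR would throw instead).
        KMeansPlusPlusClusterer<DoublePoint> clusterer = new KMeansPlusPlusClusterer<>(
                2, 100, new EuclideanDistance(), new Well19937c(42L),
                KMeansPlusPlusClusterer.EmptyClusterStrategy.LARGEST_VARIANCE);

        return clusterer.cluster(points);
    }

    public static void main(String[] args) {
        for (CentroidCluster<DoublePoint> cluster : clusterTwoBlobs()) {
            System.out.println(Arrays.toString(cluster.getCenter().getPoint())
                    + " <- " + cluster.getPoints().size() + " points");
        }
    }
}
```

Passing a collection with fewer points than k, or a null collection, is expected to raise MathIllegalArgumentException as documented above, so callers that build the input dynamically may want to check its size first.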
