Class KMeansPlusPlusClusterer<T extends Clusterable>

java.lang.Object
org.hipparchus.clustering.Clusterer<T>
org.hipparchus.clustering.KMeansPlusPlusClusterer<T>
Type Parameters:
T - type of the points to cluster

public class KMeansPlusPlusClusterer<T extends Clusterable> extends Clusterer<T>
Clustering algorithm based on the k-means++ algorithm of David Arthur and Sergei Vassilvitskii.
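The seeding step that gives k-means++ its name can be sketched as follows. This is a minimal, standalone illustration and not the Hipparchus implementation: the class name `KMeansPlusPlusSeedingSketch`, the method `chooseCenters`, and the use of `java.util.Random` and plain `double[]` points are assumptions made for the example. The first center is drawn uniformly; each later center is drawn with probability proportional to its squared distance from the nearest center already chosen.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

/** Illustrative k-means++ seeding; hypothetical names, not the library code. */
public class KMeansPlusPlusSeedingSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Chooses k initial centers; assumes 1 <= k <= points.size(). */
    static List<double[]> chooseCenters(List<double[]> points, int k, Random random) {
        List<double[]> centers = new ArrayList<>();
        // First center: uniform draw over all points.
        centers.add(points.get(random.nextInt(points.size())));

        // minSq[i] = squared distance from point i to its nearest chosen center.
        double[] minSq = new double[points.size()];
        for (int i = 0; i < minSq.length; i++) {
            minSq[i] = squaredDistance(points.get(i), centers.get(0));
        }

        while (centers.size() < k) {
            double total = 0.0;
            for (double d : minSq) {
                total += d;
            }
            int next = -1;
            if (total == 0.0) {
                // Every point coincides with a chosen center: fall back to a uniform draw.
                next = random.nextInt(points.size());
            } else {
                // Weighted draw: walk the cumulative sum until it exceeds r.
                double r = random.nextDouble() * total;
                double cumulative = 0.0;
                for (int i = 0; i < minSq.length; i++) {
                    if (minSq[i] > 0.0) {
                        cumulative += minSq[i];
                        next = i; // also serves as the fallback if rounding keeps the sum below r
                        if (cumulative > r) {
                            break;
                        }
                    }
                }
            }
            double[] chosen = points.get(next);
            centers.add(chosen);
            // Refresh nearest-center distances with the new center.
            for (int i = 0; i < minSq.length; i++) {
                minSq[i] = Math.min(minSq[i], squaredDistance(points.get(i), chosen));
            }
        }
        return centers;
    }

    public static void main(String[] args) {
        List<double[]> points = Arrays.asList(
            new double[] {0, 0}, new double[] {0, 1},
            new double[] {10, 10}, new double[] {10, 11});
        for (double[] c : chooseCenters(points, 2, new Random(42L))) {
            System.out.println(Arrays.toString(c));
        }
    }
}
```

Passing a seeded generator, as in `main`, makes the choice of initial centers repeatable between runs.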
  • Constructor Details

    • KMeansPlusPlusClusterer

      public KMeansPlusPlusClusterer(int k)
      Build a clusterer.

      The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with the largest distance variance.

      Euclidean distance is used as the default distance measure.

      Parameters:
      k - the number of clusters to split the data into
    • KMeansPlusPlusClusterer

      public KMeansPlusPlusClusterer(int k, int maxIterations)
      Build a clusterer.

      The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with the largest distance variance.

      Euclidean distance is used as the default distance measure.

      Parameters:
      k - the number of clusters to split the data into
      maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
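To illustrate the `maxIterations` convention, here is a small standalone sketch of a Lloyd-style loop on one-dimensional data. It is not the library's code: the class `MaxIterationsSketch` and method `iterate` are hypothetical, and treating `0` as "perform no iterations" is an assumption of the sketch. The only behavior taken from the documentation is that a negative value removes the cap.

```java
import java.util.Arrays;

/** Illustrative iteration cap handling; hypothetical names, not the library code. */
public class MaxIterationsSketch {

    /**
     * Runs assignment/update rounds on 1-D data, updating centers in place.
     * Returns the number of rounds performed, including the final round
     * that detects that no assignment changed.
     */
    static int iterate(double[] data, double[] centers, int maxIterations) {
        // A negative cap means "no maximum".
        final int limit = (maxIterations < 0) ? Integer.MAX_VALUE : maxIterations;
        int[] assignment = new int[data.length];
        Arrays.fill(assignment, -1);

        int iterations = 0;
        while (iterations < limit) {
            iterations++;
            boolean changed = false;

            // Assignment step: attach each value to its nearest center.
            for (int i = 0; i < data.length; i++) {
                int nearest = 0;
                for (int c = 1; c < centers.length; c++) {
                    if (Math.abs(data[i] - centers[c]) < Math.abs(data[i] - centers[nearest])) {
                        nearest = c;
                    }
                }
                if (nearest != assignment[i]) {
                    assignment[i] = nearest;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged before reaching the cap
            }

            // Update step: move each center to the mean of its values.
            for (int c = 0; c < centers.length; c++) {
                double sum = 0.0;
                int count = 0;
                for (int i = 0; i < data.length; i++) {
                    if (assignment[i] == c) {
                        sum += data[i];
                        count++;
                    }
                }
                if (count > 0) {
                    centers[c] = sum / count;
                }
            }
        }
        return iterations;
    }

    public static void main(String[] args) {
        double[] centers = {0.0, 3.0};
        int rounds = iterate(new double[] {1, 2, 9, 10}, centers, -1);
        System.out.println(rounds + " rounds, centers " + Arrays.toString(centers));
    }
}
```

With a cap of 1 the loop stops after a single update, so the centers may still be far from their converged positions.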
    • KMeansPlusPlusClusterer

      public KMeansPlusPlusClusterer(int k, int maxIterations, DistanceMeasure measure)
      Build a clusterer.

      The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with the largest distance variance.

      Parameters:
      k - the number of clusters to split the data into
      maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
      measure - the distance measure to use
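The choice of distance measure can change which center a point is assigned to. The sketch below shows this with a hypothetical `Distance` interface standing in for a pluggable measure; the interface, the two constants, and `nearestCenter` are all assumptions made for the example, not library types.

```java
/** Illustrative pluggable distance; hypothetical names, not the library code. */
public class DistanceMeasureSketch {

    /** Hypothetical stand-in for a distance measure between two points. */
    interface Distance {
        double compute(double[] a, double[] b);
    }

    /** Straight-line (L2) distance. */
    static final Distance EUCLIDEAN = (a, b) -> {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    };

    /** Sum of absolute coordinate differences (L1). */
    static final Distance MANHATTAN = (a, b) -> {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    };

    /** Returns the index of the center closest to the point under the given measure. */
    static int nearestCenter(double[] point, double[][] centers, Distance measure) {
        int best = 0;
        double bestDistance = measure.compute(point, centers[0]);
        for (int c = 1; c < centers.length; c++) {
            double d = measure.compute(point, centers[c]);
            if (d < bestDistance) {
                bestDistance = d;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] point = {0, 0};
        double[][] centers = {{3, 3}, {0, 5}};
        // Euclidean: sqrt(18) vs 5, so center 0 is nearer.
        System.out.println(nearestCenter(point, centers, EUCLIDEAN));
        // Manhattan: 6 vs 5, so center 1 is nearer.
        System.out.println(nearestCenter(point, centers, MANHATTAN));
    }
}
```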
    • KMeansPlusPlusClusterer

      public KMeansPlusPlusClusterer(int k, int maxIterations, DistanceMeasure measure, RandomGenerator random)
      Build a clusterer.

      The default strategy for handling empty clusters that may appear during algorithm iterations is to split the cluster with the largest distance variance.

      Parameters:
      k - the number of clusters to split the data into
      maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
      measure - the distance measure to use
      random - random generator to use for choosing initial centers
    • KMeansPlusPlusClusterer

      public KMeansPlusPlusClusterer(int k, int maxIterations, DistanceMeasure measure, RandomGenerator random, KMeansPlusPlusClusterer.EmptyClusterStrategy emptyStrategy)
      Build a clusterer.
      Parameters:
      k - the number of clusters to split the data into
      maxIterations - the maximum number of iterations to run the algorithm for. If negative, no maximum will be used.
      measure - the distance measure to use
      random - random generator to use for choosing initial centers
      emptyStrategy - strategy to use for handling empty clusters that may appear during algorithm iterations
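The default empty-cluster strategy named for the other constructors relies on finding the cluster with the largest distance variance. The sketch below shows one way that selection could be computed, assuming "distance variance" means the population variance of the Euclidean distances from each point to its cluster center. The class and method names are hypothetical, and the sketch stops at the selection: how the chosen cluster is then split is not described here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Illustrative largest-variance selection; hypothetical names, not the library code. */
public class LargestVarianceSketch {

    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Population variance of the point-to-center distances; assumes a non-empty cluster. */
    static double distanceVariance(List<double[]> points, double[] center) {
        double mean = 0.0;
        for (double[] p : points) {
            mean += euclidean(p, center);
        }
        mean /= points.size();

        double sumSq = 0.0;
        for (double[] p : points) {
            double dev = euclidean(p, center) - mean;
            sumSq += dev * dev;
        }
        return sumSq / points.size();
    }

    /** Returns the index of the non-empty cluster with the largest variance, or -1 if all are empty. */
    static int indexOfLargestVariance(List<List<double[]>> clusters, List<double[]> centers) {
        int best = -1;
        double bestVariance = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < clusters.size(); i++) {
            if (clusters.get(i).isEmpty()) {
                continue; // the empty cluster itself is never a candidate
            }
            double v = distanceVariance(clusters.get(i), centers.get(i));
            if (v > bestVariance) {
                bestVariance = v;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<List<double[]>> clusters = Arrays.asList(
            Arrays.asList(new double[] {0, 0}, new double[] {1, 0}),   // tight cluster
            Arrays.asList(new double[] {10, 0}, new double[] {20, 0}), // spread-out cluster
            Collections.<double[]>emptyList());                        // empty cluster
        List<double[]> centers = Arrays.asList(
            new double[] {0.5, 0}, new double[] {12, 0}, new double[] {50, 0});
        System.out.println(indexOfLargestVariance(clusters, centers));
    }
}
```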
  • Method Details