Class DescriptiveStatistics

java.lang.Object
org.hipparchus.stat.descriptive.DescriptiveStatistics
All Implemented Interfaces:
Serializable, DoubleConsumer, StatisticalSummary

public class DescriptiveStatistics extends Object implements StatisticalSummary, DoubleConsumer, Serializable
Maintains a dataset of values of a single variable and computes descriptive statistics based on stored data.

The windowSize property sets a limit on the number of values that can be stored in the dataset. The default value, INFINITE_WINDOW, puts no limit on the size of the dataset. This value should be used with caution, as the backing store will grow without bound in this case.

For very large datasets, StreamingStatistics, which does not store the dataset, should be used instead of this class. If windowSize is not INFINITE_WINDOW and more values are added than can be stored in the dataset, new values are added in a "rolling" manner, with new values replacing the "oldest" values in the dataset.
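
The rolling behavior described above can be illustrated with a minimal, self-contained sketch. The class RollingWindowSketch and its members are hypothetical stand-ins written for this example; they are not the Hipparchus implementation, and a plain ArrayDeque is assumed as the backing store.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical stand-in for the rolling window; not the Hipparchus implementation. */
class RollingWindowSketch {
    /** Sentinel meaning "no limit", mirroring INFINITE_WINDOW (-1). */
    static final int INFINITE_WINDOW = -1;

    private final int windowSize;
    private final Deque<Double> values = new ArrayDeque<>();

    RollingWindowSketch(int windowSize) {
        if (windowSize < 1 && windowSize != INFINITE_WINDOW) {
            throw new IllegalArgumentException("window size must be >= 1 or INFINITE_WINDOW");
        }
        this.windowSize = windowSize;
    }

    /** Adds a value, evicting the oldest one first if the window is already full. */
    void addValue(double v) {
        if (windowSize != INFINITE_WINDOW && values.size() == windowSize) {
            values.removeFirst();
        }
        values.addLast(v);
    }

    /** Returns a fresh array holding the stored values in insertion order. */
    double[] getValues() {
        return values.stream().mapToDouble(Double::doubleValue).toArray();
    }
}
```

With a window of 3, adding 1, 2, 3, 4, 5 in that order leaves 3, 4, 5 in the sketch's store.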

Note: this class is not thread-safe.

  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    protected static final int
    INFINITE_WINDOW
    Represents an infinite window size.
  • Constructor Summary

    Constructors
    Modifier
    Constructor
    Description
     
    DescriptiveStatistics()
    Construct a DescriptiveStatistics instance with an infinite window.
     
    DescriptiveStatistics(double[] initialDoubleArray)
    Construct a DescriptiveStatistics instance with an infinite window and the initial data values in double[] initialDoubleArray.
     
    DescriptiveStatistics(int size)
    Construct a DescriptiveStatistics instance with the specified window.
    protected
    DescriptiveStatistics(DescriptiveStatistics original)
    Copy constructor.
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    accept(double v)
    void
    addValue(double v)
    Adds the value to the dataset.
    double
    apply(UnivariateStatistic stat)
    Apply the given statistic to the data associated with this set of statistics.
    void
    clear()
    Resets all statistics and storage.
    DescriptiveStatistics
    copy()
    Returns a copy of this DescriptiveStatistics instance with the same internal state.
    double
    getElement(int index)
    Returns the element at the specified index.
    double
    getGeometricMean()
    Returns the geometric mean of the available values.
    double
    getKurtosis()
    Returns the kurtosis of the available values.
    double
    getMax()
    Returns the maximum of the available values.
    double
    getMean()
    Returns the arithmetic mean of the available values.
    double
    getMin()
    Returns the minimum of the available values.
    long
    getN()
    Returns the number of available values.
    double
    getPercentile(double p)
    Returns an estimate for the pth percentile of the stored values.
    double
    getPopulationVariance()
    Returns the population variance of the available values.
    double
    getQuadraticMean()
    Returns the quadratic mean of the available values.
    double
    getSkewness()
    Returns the skewness of the available values.
    double[]
    getSortedValues()
    Returns the current set of values in an array of double primitives, sorted in ascending order.
    double
    getStandardDeviation()
    Returns the standard deviation of the available values.
    double
    getSum()
    Returns the sum of the values that have been added.
    double
    getSumOfSquares()
    Returns the sum of the squares of the available values.
    double[]
    getValues()
    Returns the current set of values in an array of double primitives.
    double
    getVariance()
    Returns the variance of the available values.
    int
    getWindowSize()
    Returns the maximum number of values that can be stored in the dataset, or INFINITE_WINDOW (-1) if there is no limit.
    void
    removeMostRecentValue()
    Removes the most recent value from the dataset.
    double
    replaceMostRecentValue(double v)
    Replaces the most recently stored value with the given value.
    void
    setWindowSize(int windowSize)
    WindowSize controls the number of values that contribute to the reported statistics.
    String
    toString()
    Generates a text report displaying univariate statistics from values that have been added.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

    Methods inherited from interface java.util.function.DoubleConsumer

    andThen
  • Field Details

    • INFINITE_WINDOW

      protected static final int INFINITE_WINDOW
      Represents an infinite window size. When getWindowSize() returns this value, there is no limit to the number of data values that can be stored in the dataset.
  • Constructor Details

    • DescriptiveStatistics

      public DescriptiveStatistics()
      Construct a DescriptiveStatistics instance with an infinite window.
    • DescriptiveStatistics

      public DescriptiveStatistics(int size) throws MathIllegalArgumentException
      Construct a DescriptiveStatistics instance with the specified window.
      Parameters:
      size - the window size.
      Throws:
      MathIllegalArgumentException - if window size is less than 1 but not equal to INFINITE_WINDOW
    • DescriptiveStatistics

      public DescriptiveStatistics(double[] initialDoubleArray)
      Construct a DescriptiveStatistics instance with an infinite window and the initial data values in double[] initialDoubleArray.
      Parameters:
      initialDoubleArray - the initial double[].
      Throws:
      NullArgumentException - if the input array is null
    • DescriptiveStatistics

      protected DescriptiveStatistics(DescriptiveStatistics original)
      Copy constructor.

      Construct a new DescriptiveStatistics instance that is a copy of original.

      Parameters:
      original - DescriptiveStatistics instance to copy
      Throws:
      NullArgumentException - if original is null
  • Method Details

    • copy

      public DescriptiveStatistics copy()
      Returns a copy of this DescriptiveStatistics instance with the same internal state.
      Returns:
      a copy of this
    • addValue

      public void addValue(double v)
      Adds the value to the dataset. If the dataset is at the maximum size (i.e., the number of stored elements equals the currently configured windowSize), the first (oldest) element in the dataset is discarded to make room for the new value.
      Parameters:
      v - the value to be added
    • accept

      public void accept(double v)
      Specified by:
      accept in interface DoubleConsumer
    • clear

      public void clear()
      Resets all statistics and storage.
    • removeMostRecentValue

      public void removeMostRecentValue() throws MathIllegalStateException
      Removes the most recent value from the dataset.
      Throws:
      MathIllegalStateException - if there are no elements stored
    • replaceMostRecentValue

      public double replaceMostRecentValue(double v) throws MathIllegalStateException
      Replaces the most recently stored value with the given value. There must be at least one element stored to call this method.
      Parameters:
      v - the value to replace the most recent stored value
      Returns:
      replaced value
      Throws:
      MathIllegalStateException - if there are no elements stored
    • apply

      public double apply(UnivariateStatistic stat)
      Apply the given statistic to the data associated with this set of statistics.
      Parameters:
      stat - the statistic to apply
      Returns:
      the computed value of the statistic.
    • getMean

      public double getMean()
      Returns the arithmetic mean of the available values.
      Specified by:
      getMean in interface StatisticalSummary
      Returns:
      The mean or Double.NaN if no values have been added.
    • getGeometricMean

      public double getGeometricMean()
      Returns the geometric mean of the available values.

      See GeometricMean for details on the computing algorithm.

      Returns:
      The geometric mean, or Double.NaN if no values have been added or if any negative values have been added.
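
As an illustration of the return contract above, here is a minimal sketch that computes a geometric mean as the exponential of the mean of the natural logs. The class GeometricMeanSketch is hypothetical, and the log-sum formulation is an assumption made for this example; refer to GeometricMean for the library's actual algorithm.

```java
/** Hypothetical sketch of a geometric mean; not the Hipparchus GeometricMean class. */
class GeometricMeanSketch {
    /**
     * Computes exp(mean(ln x)).
     * Returns NaN for an empty array, and NaN if any value is negative,
     * because Math.log of a negative number is NaN and NaN propagates through the sum.
     */
    static double geometricMean(double[] values) {
        if (values.length == 0) {
            return Double.NaN;
        }
        double sumOfLogs = 0.0;
        for (double v : values) {
            sumOfLogs += Math.log(v);
        }
        return Math.exp(sumOfLogs / values.length);
    }
}
```

For the values 1, 4, 16 the product is 64, so the sketch returns a value within rounding error of 4.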
    • getStandardDeviation

      public double getStandardDeviation()
      Returns the standard deviation of the available values.
      Specified by:
      getStandardDeviation in interface StatisticalSummary
      Returns:
      The standard deviation, Double.NaN if no values have been added, or 0.0 for a single-value set.
    • getQuadraticMean

      public double getQuadraticMean()
      Returns the quadratic mean of the available values.
      Returns:
      The quadratic mean or Double.NaN if no values have been added.
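
The quadratic mean (root mean square) is the square root of the mean of the squared values. A minimal sketch follows; the class QuadraticMeanSketch is a hypothetical name used only for this example and is not part of the library.

```java
/** Hypothetical sketch of the quadratic mean (root mean square). */
class QuadraticMeanSketch {
    /** Returns sqrt(sum(x^2) / n), or NaN for an empty array. */
    static double quadraticMean(double[] values) {
        if (values.length == 0) {
            return Double.NaN;
        }
        double sumOfSquares = 0.0;
        for (double v : values) {
            sumOfSquares += v * v;
        }
        return Math.sqrt(sumOfSquares / values.length);
    }
}
```

For the values 1 and 7, the sum of squares is 50, the mean square is 25, and the quadratic mean is 5.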
    • getVariance

      public double getVariance()
      Returns the variance of the available values.
      Specified by:
      getVariance in interface StatisticalSummary
      Returns:
      The variance, Double.NaN if no values have been added, or 0.0 for a single-value set.
    • getPopulationVariance

      public double getPopulationVariance()
      Returns the population variance of the available values.
      Returns:
      The population variance, Double.NaN if no values have been added, or 0.0 for a single-value set.
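
To make the difference between getVariance() and getPopulationVariance() concrete, the sketch below computes both from the same data. It assumes that getVariance() uses the bias-corrected (n - 1) denominator, as is conventional for a sample variance, while the population variance divides by n. The class VarianceSketch and its simple two-pass computation are hypothetical and are not the library's algorithm.

```java
/** Hypothetical two-pass variance sketch; not the Hipparchus implementation. */
class VarianceSketch {
    /**
     * @param biasCorrected true to divide by (n - 1) (sample variance, assumed
     *                      to match getVariance()); false to divide by n
     *                      (population variance)
     * @return NaN for no values, 0.0 for a single value, else the variance
     */
    static double variance(double[] values, boolean biasCorrected) {
        int n = values.length;
        if (n == 0) {
            return Double.NaN;
        }
        if (n == 1) {
            return 0.0;
        }
        // First pass: arithmetic mean.
        double mean = 0.0;
        for (double v : values) {
            mean += v;
        }
        mean /= n;
        // Second pass: sum of squared deviations from the mean.
        double sumSquaredDev = 0.0;
        for (double v : values) {
            double d = v - mean;
            sumSquaredDev += d * d;
        }
        return sumSquaredDev / (biasCorrected ? n - 1 : n);
    }
}
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5 and the squared deviations sum to 32, so the population variance is 32/8 = 4 and the bias-corrected variance is 32/7.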
    • getSkewness

      public double getSkewness()
      Returns the skewness of the available values. Skewness is a measure of the asymmetry of a given distribution.
      Returns:
      The skewness, or Double.NaN if fewer than 3 values have been added.
    • getKurtosis

      public double getKurtosis()
      Returns the kurtosis of the available values. Kurtosis is a measure of the "peakedness" of a distribution.
      Returns:
      The kurtosis, or Double.NaN if fewer than 4 values have been added.
    • getMax

      public double getMax()
      Returns the maximum of the available values.
      Specified by:
      getMax in interface StatisticalSummary
      Returns:
      The max or Double.NaN if no values have been added.
    • getMin

      public double getMin()
      Returns the minimum of the available values.
      Specified by:
      getMin in interface StatisticalSummary
      Returns:
      The min or Double.NaN if no values have been added.
    • getSum

      public double getSum()
      Returns the sum of the values that have been added.
      Specified by:
      getSum in interface StatisticalSummary
      Returns:
      The sum or Double.NaN if no values have been added.
    • getSumOfSquares

      public double getSumOfSquares()
      Returns the sum of the squares of the available values.
      Returns:
      The sum of the squares or Double.NaN if no values have been added.
    • getPercentile

      public double getPercentile(double p) throws MathIllegalArgumentException
      Returns an estimate for the pth percentile of the stored values.

      The implementation provided here follows the first estimation procedure presented here.

      Preconditions:

      • 0 < p ≤ 100 (otherwise a MathIllegalArgumentException is thrown)
      • at least one value must be stored (returns Double.NaN otherwise)
      Parameters:
      p - the requested percentile (scaled from 0 - 100)
      Returns:
      An estimate for the pth percentile of the stored data
      Throws:
      MathIllegalArgumentException - if p is not a valid quantile
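
The preconditions above can be illustrated with a small sketch. The estimation rule used here is an assumption made for this example, since the source text does not spell the procedure out: the position is taken as pos = p(n + 1)/100 over the sorted data, with linear interpolation between the two neighboring elements. The class PercentileSketch is hypothetical, and the library's estimate may differ.

```java
import java.util.Arrays;

/** Hypothetical percentile sketch; the estimation rule is an assumption. */
class PercentileSketch {
    static double percentile(double[] values, double p) {
        // Precondition: 0 < p <= 100 (this also rejects NaN).
        if (!(p > 0 && p <= 100)) {
            throw new IllegalArgumentException("p must be in (0, 100], got " + p);
        }
        int n = values.length;
        if (n == 0) {
            return Double.NaN; // no stored values
        }
        if (n == 1) {
            return values[0];
        }
        double[] sorted = values.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);

        double pos = p * (n + 1) / 100.0; // assumed 1-based estimated position
        if (pos < 1) {
            return sorted[0];
        }
        if (pos >= n) {
            return sorted[n - 1];
        }
        double floorPos = Math.floor(pos);
        int i = (int) floorPos;           // 1-based index of the lower neighbor
        double lower = sorted[i - 1];
        double upper = sorted[i];
        // Interpolate linearly by the fractional part of the position.
        return lower + (pos - floorPos) * (upper - lower);
    }
}
```

Under this assumed rule, the values 5, 1, 4, 2, 3 give 3.0 for p = 50 (position 3.0) and 1.5 for p = 25 (position 1.5, halfway between 1 and 2).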
    • getN

      public long getN()
      Returns the number of available values.
      Specified by:
      getN in interface StatisticalSummary
      Returns:
      The number of available values
    • getWindowSize

      public int getWindowSize()
      Returns the maximum number of values that can be stored in the dataset, or INFINITE_WINDOW (-1) if there is no limit.
      Returns:
      The current window size, or -1 if it is infinite.
    • setWindowSize

      public void setWindowSize(int windowSize) throws MathIllegalArgumentException
      WindowSize controls the number of values that contribute to the reported statistics. For example, if windowSize is set to 3 and the values {1,2,3,4,5} have been added in that order, then the available values are {3,4,5} and all reported statistics will be based on these values. If this call decreases windowSize and the current dataset holds more than the new windowSize elements, values from the front of the array are discarded to reduce the dataset to windowSize elements.
      Parameters:
      windowSize - sets the size of the window.
      Throws:
      MathIllegalArgumentException - if window size is less than 1 but not equal to INFINITE_WINDOW
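
The trimming that happens when the window shrinks can be sketched as follows. The helper trimToWindow and the class WindowResizeSketch are hypothetical names for this example, and a Deque is assumed as the store; this is not the library's code.

```java
import java.util.Deque;

/** Hypothetical sketch of shrinking a window; not the Hipparchus implementation. */
class WindowResizeSketch {
    /** Sentinel meaning "no limit", mirroring INFINITE_WINDOW (-1). */
    static final int INFINITE_WINDOW = -1;

    /**
     * Discards values from the front (oldest first) until at most
     * newWindowSize values remain. An infinite window discards nothing.
     */
    static void trimToWindow(Deque<Double> values, int newWindowSize) {
        if (newWindowSize < 1 && newWindowSize != INFINITE_WINDOW) {
            throw new IllegalArgumentException("window size must be >= 1 or INFINITE_WINDOW");
        }
        if (newWindowSize == INFINITE_WINDOW) {
            return;
        }
        while (values.size() > newWindowSize) {
            values.removeFirst();
        }
    }
}
```

Trimming a store that holds 1, 2, 3, 4, 5 to a window of 3 leaves 3, 4, 5, matching the example in the method description.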
    • getValues

      public double[] getValues()
      Returns the current set of values in an array of double primitives. The order of addition is preserved. The returned array is a fresh copy of the underlying data -- i.e., it is not a reference to the stored data.
      Returns:
      the current set of numbers in the order in which they were added to this set
    • getSortedValues

      public double[] getSortedValues()
      Returns the current set of values in an array of double primitives, sorted in ascending order. The returned array is a fresh copy of the underlying data -- i.e., it is not a reference to the stored data.
      Returns:
      the current set of numbers, sorted in ascending order
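
Both getValues() and getSortedValues() are documented as returning fresh copies. The sketch below shows what that guarantee means for a caller, using a hypothetical ValueStoreSketch class with a plain array as the assumed internal store.

```java
import java.util.Arrays;

/** Hypothetical store illustrating defensive copies; not the Hipparchus class. */
class ValueStoreSketch {
    private final double[] stored;

    ValueStoreSketch(double[] initial) {
        this.stored = initial.clone(); // copy on the way in
    }

    /** Returns a fresh copy in insertion order. */
    double[] getValues() {
        return stored.clone();
    }

    /** Returns a fresh copy sorted in ascending order; the store keeps its order. */
    double[] getSortedValues() {
        double[] sorted = stored.clone();
        Arrays.sort(sorted);
        return sorted;
    }
}
```

Because each call returns a new array, a caller can sort or overwrite the result without affecting the stored data or the order of addition.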
    • getElement

      public double getElement(int index)
      Returns the element at the specified index.
      Parameters:
      index - the index of the element
      Returns:
      the element at the specified index
    • toString

      public String toString()
      Generates a text report displaying univariate statistics from values that have been added. Each statistic is displayed on a separate line.
      Overrides:
      toString in class Object
      Returns:
      String with line feeds displaying statistics