Class StreamingStatistics

java.lang.Object
org.hipparchus.stat.descriptive.StreamingStatistics
All Implemented Interfaces:
Serializable, DoubleConsumer, AggregatableStatistic<StreamingStatistics>, StatisticalSummary

public class StreamingStatistics extends Object implements StatisticalSummary, AggregatableStatistic<StreamingStatistics>, DoubleConsumer, Serializable
Computes summary statistics for a stream of data values added using the addValue method. The data values are not stored in memory, so this class can be used to compute statistics for very large data streams.

By default, all statistics other than percentiles are maintained. Percentile calculations use an embedded RandomPercentile, which carries more memory and compute overhead than the other statistics, so percentiles are disabled by default. To enable percentiles, either pass a non-null RandomGenerator to the two-argument constructor or use a StreamingStatistics.StreamingStatisticsBuilder to configure an instance with percentiles turned on. Other statistics can also be selectively disabled using StreamingStatistics.StreamingStatisticsBuilder.

Note: This class is not thread-safe.
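
The idea of computing statistics without storing the data can be sketched with a single-pass update. The class below is a hypothetical, JDK-only illustration (RunningMoments and its fields are not part of Hipparchus, and this is not necessarily how StreamingStatistics is implemented); it uses Welford's update and mirrors the empty and single-value conventions documented for getMean and getVariance.

```java
/** Hypothetical illustration class; not part of Hipparchus. */
final class RunningMoments {
    private long n;       // number of values seen so far
    private double mean;  // running arithmetic mean
    private double m2;    // running sum of squared deviations from the mean

    /** Folds one value into the running state (Welford's update); the value itself is not kept. */
    void addValue(double value) {
        n++;
        double delta = value - mean;
        mean += delta / n;
        m2 += delta * (value - mean);
    }

    long getN() {
        return n;
    }

    /** Mean, or NaN when no values have been added. */
    double getMean() {
        return n == 0 ? Double.NaN : mean;
    }

    /** Variance with an n - 1 denominator (an assumption): NaN when empty, 0.0 for one value. */
    double getVariance() {
        if (n == 0) {
            return Double.NaN;
        }
        return n == 1 ? 0.0 : m2 / (n - 1);
    }

    public static void main(String[] args) {
        RunningMoments stats = new RunningMoments();
        for (double v : new double[] {1.0, 2.0, 3.0, 4.0}) {
            stats.addValue(v);
        }
        System.out.println(stats.getN() + " values, mean = " + stats.getMean()
                + ", variance = " + stats.getVariance());
    }
}
```

Memory use stays constant however many values are added, which is what makes this style of accumulator suitable for very large streams.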

  • Constructor Details

    • StreamingStatistics

      public StreamingStatistics()
      Construct a new StreamingStatistics instance, maintaining all statistics other than percentiles.
    • StreamingStatistics

      public StreamingStatistics(double epsilon, RandomGenerator randomGenerator)
      Construct a new StreamingStatistics instance, maintaining all statistics other than percentiles, and maintaining percentiles only if randomGenerator is non-null.
      Parameters:
      epsilon - bound on quantile estimation error (see RandomPercentile)
      randomGenerator - PRNG used in sampling and merge operations (null if percentiles should not be computed)
      Since:
      2.3
  • Method Details

    • copy

      public StreamingStatistics copy()
      Returns a copy of this StreamingStatistics instance with the same internal state.
      Returns:
      a copy of this
    • getSummary

      public StatisticalSummary getSummary()
      Returns a StatisticalSummaryValues instance reporting current statistics.
      Returns:
      Current values of statistics
    • addValue

      public void addValue(double value)
      Adds a value to the data.
      Parameters:
      value - the value to add
    • accept

      public void accept(double value)
      Specified by:
      accept in interface DoubleConsumer
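
      Implementing DoubleConsumer lets an accumulator be handed directly to APIs such as DoubleStream.forEach. The sketch below is a hypothetical, JDK-only stand-in (CountingSum is not part of Hipparchus); it assumes that accept simply delegates to addValue, which the entry above does not state explicitly.

```java
import java.util.function.DoubleConsumer;
import java.util.stream.DoubleStream;

/** Hypothetical stand-in for an accumulator; not part of Hipparchus. */
final class CountingSum implements DoubleConsumer {
    private long n;      // number of values consumed
    private double sum;  // running sum of consumed values

    void addValue(double value) {
        n++;
        sum += value;
    }

    /** Assumed behaviour: delegate to addValue so the instance works as a DoubleConsumer. */
    @Override
    public void accept(double value) {
        addValue(value);
    }

    long getN() {
        return n;
    }

    double getSum() {
        return sum;
    }

    public static void main(String[] args) {
        CountingSum acc = new CountingSum();
        // A sequential stream calls accept() once per element, on the calling thread.
        DoubleStream.of(1.5, 2.5, 4.0).forEach(acc);
        System.out.println(acc.getN() + " values, sum = " + acc.getSum());
    }
}
```

      Because the accumulator is not thread-safe, it should only be fed from a sequential stream; a parallel stream would call accept from several threads at once.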
    • clear

      public void clear()
      Resets all statistics and storage.
    • getN

      public long getN()
      Returns the number of available values.
      Specified by:
      getN in interface StatisticalSummary
      Returns:
      The number of available values
    • getMax

      public double getMax()
      Returns the maximum of the available values.
      Specified by:
      getMax in interface StatisticalSummary
      Returns:
      The max or Double.NaN if no values have been added.
    • getMin

      public double getMin()
      Returns the minimum of the available values.
      Specified by:
      getMin in interface StatisticalSummary
      Returns:
      The min or Double.NaN if no values have been added.
    • getSum

      public double getSum()
      Returns the sum of the values that have been added.
      Specified by:
      getSum in interface StatisticalSummary
      Returns:
      The sum or Double.NaN if no values have been added
    • getSumOfSquares

      public double getSumOfSquares()
      Returns the sum of the squares of the values that have been added.

      Double.NaN is returned if no values have been added.

      Returns:
      The sum of squares
    • getMean

      public double getMean()
      Returns the arithmetic mean of the available values.
      Specified by:
      getMean in interface StatisticalSummary
      Returns:
      The mean or Double.NaN if no values have been added.
    • getVariance

      public double getVariance()
      Returns the variance of the available values.
      Specified by:
      getVariance in interface StatisticalSummary
      Returns:
      The variance, Double.NaN if no values have been added, or 0.0 for a single-value set.
    • getPopulationVariance

      public double getPopulationVariance()
      Returns the population variance of the values that have been added.

      Double.NaN is returned if no values have been added.

      Returns:
      the population variance
    • getGeometricMean

      public double getGeometricMean()
      Returns the geometric mean of the values that have been added.

      Double.NaN is returned if no values have been added.

      Returns:
      the geometric mean
    • getSumOfLogs

      public double getSumOfLogs()
      Returns the sum of the logs of the values that have been added.

      Double.NaN is returned if no values have been added.

      Returns:
      the sum of logs
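
      The sum of logs and the geometric mean are linked by the identity geometric mean = exp(sumOfLogs / n). A minimal JDK-only sketch of that relationship follows; GeometricMeanSketch is a hypothetical name, natural logarithms are assumed, and an array is used only to keep the arithmetic visible.

```java
/** Hypothetical illustration class; not part of Hipparchus. */
final class GeometricMeanSketch {

    /** Sum of natural logs of the values; NaN for an empty input. */
    static double sumOfLogs(double[] values) {
        if (values.length == 0) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += Math.log(v);
        }
        return sum;
    }

    /** Geometric mean computed as exp(sumOfLogs / n); NaN for an empty input. */
    static double geometricMean(double[] values) {
        if (values.length == 0) {
            return Double.NaN;
        }
        return Math.exp(sumOfLogs(values) / values.length);
    }

    public static void main(String[] args) {
        double[] data = {1.0, 4.0, 16.0};
        System.out.println("geometric mean = " + geometricMean(data));
    }
}
```

      Working with logs avoids the overflow that multiplying many values directly would risk.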
    • getSecondMoment

      public double getSecondMoment()
      Returns a statistic related to the Second Central Moment. Specifically, what is returned is the sum of squared deviations from the sample mean among the values that have been added.

      Returns Double.NaN if no data values have been added and returns 0 if there is just one value in the data set.

      Returns:
      second central moment statistic
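
      The sum of squared deviations is the common numerator of both variance flavours. The JDK-only sketch below shows that relationship under the assumption that getVariance divides by n - 1 and getPopulationVariance divides by n; SecondMomentSketch is a hypothetical name, and a two-pass loop over an array is used purely for readability.

```java
/** Hypothetical illustration class; not part of Hipparchus. */
final class SecondMomentSketch {

    /** Sum of squared deviations from the mean; NaN for empty input, 0 for a single value. */
    static double secondMoment(double[] values) {
        int n = values.length;
        if (n == 0) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / n;
        double m2 = 0.0;
        for (double v : values) {
            m2 += (v - mean) * (v - mean);
        }
        return m2;
    }

    /** Assumed convention: second moment divided by n - 1 (0.0 for a single value). */
    static double sampleVariance(double[] values) {
        int n = values.length;
        if (n == 0) {
            return Double.NaN;
        }
        return n == 1 ? 0.0 : secondMoment(values) / (n - 1);
    }

    /** Assumed convention: second moment divided by n. */
    static double populationVariance(double[] values) {
        int n = values.length;
        return n == 0 ? Double.NaN : secondMoment(values) / n;
    }

    public static void main(String[] args) {
        double[] data = {2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0};
        System.out.println("second moment = " + secondMoment(data)
                + ", sample variance = " + sampleVariance(data)
                + ", population variance = " + populationVariance(data));
    }
}
```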
    • getQuadraticMean

      public double getQuadraticMean()
      Returns the quadratic mean (a.k.a. root mean square) of the available values.
      Returns:
      The quadratic mean or Double.NaN if no values have been added.
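
      The quadratic mean is the square root of the sum of squares divided by the number of values. A minimal JDK-only sketch of that formula follows; QuadraticMeanSketch is a hypothetical name and the array input is for illustration only.

```java
/** Hypothetical illustration class; not part of Hipparchus. */
final class QuadraticMeanSketch {

    /** sqrt(sumOfSquares / n); NaN for an empty input. */
    static double quadraticMean(double[] values) {
        if (values.length == 0) {
            return Double.NaN;
        }
        double sumOfSquares = 0.0;
        for (double v : values) {
            sumOfSquares += v * v;
        }
        return Math.sqrt(sumOfSquares / values.length);
    }

    public static void main(String[] args) {
        System.out.println("quadratic mean = " + quadraticMean(new double[] {1.0, 7.0}));
    }
}
```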
    • getStandardDeviation

      public double getStandardDeviation()
      Returns the standard deviation of the values that have been added.

      Double.NaN is returned if no values have been added.

      Specified by:
      getStandardDeviation in interface StatisticalSummary
      Returns:
      the standard deviation
    • getMedian

      public double getMedian()
      Returns an estimate of the median of the values that have been entered. See RandomPercentile for a description of the algorithm used for large data streams.
      Returns:
      the median
    • getPercentile

      public double getPercentile(double percentile)
      Returns an estimate of the given percentile of the values that have been entered. See RandomPercentile for a description of the algorithm used for large data streams.
      Parameters:
      percentile - the desired percentile (must be between 0 and 100)
      Returns:
      estimated percentile
    • aggregate

      public void aggregate(StreamingStatistics other)
      Aggregates the provided instance into this instance.

      This method can be used to combine statistics computed over partitions or subsamples, i.e., the value of this instance after the operation should be the same as if a single statistic had been applied to the combined dataset. Statistics are aggregated only when both this and other are maintaining them. For example, if this.computeMoments is false but other.computeMoments is true, the moment data in other will be lost.

      Specified by:
      aggregate in interface AggregatableStatistic<StreamingStatistics>
      Parameters:
      other - the instance to aggregate into this instance
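
      Combining statistics from two partitions so that the result matches a single pass over all the data can be sketched as follows. This is a hypothetical, JDK-only illustration (MergeableMoments is not part of Hipparchus, and the library's own merge logic may differ); it uses the well-known pairwise update for count, mean, and sum of squared deviations.

```java
/** Hypothetical illustration class; not part of Hipparchus. */
final class MergeableMoments {
    private long n;       // number of values
    private double mean;  // arithmetic mean
    private double m2;    // sum of squared deviations from the mean

    void addValue(double value) {
        n++;
        double delta = value - mean;
        mean += delta / n;
        m2 += delta * (value - mean);
    }

    /** Folds the state of other into this instance; other is left unchanged. */
    void aggregate(MergeableMoments other) {
        if (other.n == 0) {
            return;                       // nothing to merge
        }
        if (n == 0) {                     // this is empty: copy the other state
            n = other.n;
            mean = other.mean;
            m2 = other.m2;
            return;
        }
        long total = n + other.n;
        double delta = other.mean - mean;
        m2 += other.m2 + delta * delta * ((double) n * other.n / total);
        mean += delta * other.n / total;
        n = total;
    }

    long getN() {
        return n;
    }

    double getMean() {
        return n == 0 ? Double.NaN : mean;
    }

    double getSecondMoment() {
        return n == 0 ? Double.NaN : m2;
    }

    public static void main(String[] args) {
        MergeableMoments a = new MergeableMoments();
        MergeableMoments b = new MergeableMoments();
        for (double v : new double[] {1.0, 2.0, 3.0}) {
            a.addValue(v);
        }
        for (double v : new double[] {4.0, 5.0, 6.0, 7.0}) {
            b.addValue(v);
        }
        a.aggregate(b);  // a now describes all seven values
        System.out.println(a.getN() + " values, mean = " + a.getMean()
                + ", second moment = " + a.getSecondMoment());
    }
}
```

      Each partition can be accumulated by its own thread with its own instance, and the results merged afterwards on a single thread, which sidesteps the lack of thread safety.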
    • toString

      public String toString()
      Generates a text report displaying summary statistics from values that have been added.
      Overrides:
      toString in class Object
      Returns:
      String with line feeds displaying statistics
    • equals

      public boolean equals(Object object)
      Returns true iff object is a StreamingStatistics instance and all statistics have the same values as this.
      Overrides:
      equals in class Object
      Parameters:
      object - the object to test equality against.
      Returns:
      true if object equals this
    • hashCode

      public int hashCode()
      Returns hash code based on values of statistics.
      Overrides:
      hashCode in class Object
      Returns:
      hash code
    • builder

      Returns a StreamingStatistics.StreamingStatisticsBuilder to source configured StreamingStatistics instances.
      Returns:
      a StreamingStatisticsBuilder instance