## Class StreamingStatistics

public class StreamingStatistics
Computes summary statistics for a stream of data values added using the addValue method. The data values are not stored in memory, so this class can be used to compute statistics for very large data streams.

By default, all statistics other than percentiles are maintained. Percentile calculations use an embedded RandomPercentile, which carries more memory and compute overhead than the other statistics, so they are disabled by default. To enable percentiles, either pass true to the constructor or use a StreamingStatistics.StreamingStatisticsBuilder to configure an instance with percentiles turned on. Other statistics can also be selectively disabled using StreamingStatistics.StreamingStatisticsBuilder.

Note: This class is not thread-safe.
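
To illustrate how statistics can be maintained without storing the data, here is a minimal sketch of a single-pass accumulator based on Welford's update. The class `RunningStats` and its method names are hypothetical and use only the JDK; this is not the implementation used by StreamingStatistics, only an example of the general technique.

```java
/** Hypothetical illustration class; not part of the documented library. */
final class RunningStats {
    private long n;                    // number of values seen so far
    private double mean;               // running arithmetic mean
    private double m2;                 // running sum of squared deviations from the mean
    private double min = Double.NaN;
    private double max = Double.NaN;

    /** Updates all statistics in O(1) time and memory; the value itself is not kept. */
    void add(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);      // Welford's update
        min = (n == 1) ? x : Math.min(min, x);
        max = (n == 1) ? x : Math.max(max, x);
    }

    long count() { return n; }

    double mean() { return n == 0 ? Double.NaN : mean; }

    /** Sample variance: NaN for no values, 0.0 for a single value. */
    double variance() {
        if (n == 0) {
            return Double.NaN;
        }
        return n == 1 ? 0.0 : m2 / (n - 1);
    }

    double min() { return min; }

    double max() { return max; }
}
```

Each call to `add` touches a fixed number of fields, so memory use does not grow with the length of the stream.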

static class  StreamingStatistics.StreamingStatisticsBuilder
Builder for StreamingStatistics instances.
StreamingStatistics()
Construct a new StreamingStatistics instance, maintaining all statistics other than percentiles.
StreamingStatistics(boolean computePercentiles)
Construct a new StreamingStatistics instance that maintains all non-percentile statistics and, if computePercentiles is true, percentiles as well.
void accept(double value)
void addValue(double value)
Adds a value to the data.
void aggregate(StreamingStatistics other)
Aggregates the provided instance into this instance.
static StreamingStatistics.StreamingStatisticsBuilder builder()
Returns a StreamingStatistics.StreamingStatisticsBuilder to source configured StreamingStatistics instances.
void clear()
Resets all statistics and storage.
StreamingStatistics copy()
Returns a copy of this StreamingStatistics instance with the same internal state.
boolean equals(Object object)
Returns true iff object is a StreamingStatistics instance and all statistics have the same values as this.
double getGeometricMean()
Returns the geometric mean of the values that have been added.
double getMax()
Returns the maximum of the available values.
double getMean()
Returns the arithmetic mean of the available values.
double getMedian()
Returns an estimate of the median of the values that have been entered.
double getMin()
Returns the minimum of the available values.
long getN()
Returns the number of available values.
double getPercentile(double percentile)
Returns an estimate of the given percentile of the values that have been entered.
double getPopulationVariance()
Returns the population variance of the values that have been added.
double getQuadraticMean()
Returns the quadratic mean, a.k.a. root-mean-square, of the available values.
double getSecondMoment()
Returns a statistic related to the Second Central Moment.
double getStandardDeviation()
Returns the standard deviation of the values that have been added.
double getSum()
Returns the sum of the values that have been added.
StatisticalSummary getSummary()
Returns a StatisticalSummaryValues instance reporting the current statistics.
double getSumOfLogs()
Returns the sum of the logs of the values that have been added.
double getSumOfSquares()
Returns the sum of the squares of the values that have been added.
double getVariance()
Returns the variance of the available values.
int hashCode()
Returns hash code based on values of statistics.
String toString()
Generates a text report displaying summary statistics from values that have been added.
• #### StreamingStatistics

public StreamingStatistics()
Construct a new StreamingStatistics instance, maintaining all statistics other than percentiles.
• #### StreamingStatistics

public StreamingStatistics(boolean computePercentiles)
Construct a new StreamingStatistics instance that maintains all non-percentile statistics and, if computePercentiles is true, percentiles as well.
Parameters:
computePercentiles - whether or not percentiles are maintained
• #### copy

public StreamingStatistics copy()
Returns a copy of this StreamingStatistics instance with the same internal state.
Returns:
a copy of this
• #### getSummary

public StatisticalSummary getSummary()
Returns a StatisticalSummaryValues instance reporting the current statistics.
Returns:
Current values of statistics

• #### addValue

public void addValue(double value)
Adds a value to the data.
Parameters:
value - the value to add
• #### accept

public void accept(double value)
Specified by:
accept in interface DoubleConsumer
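
Because accept is specified by DoubleConsumer, an accumulator of this kind can be handed directly to APIs that take a DoubleConsumer, such as DoubleStream.forEach. The sketch below shows the pattern with the JDK's own `java.util.DoubleSummaryStatistics`, which also implements DoubleConsumer; the class `ConsumerDemo` and its methods are hypothetical, and substituting a StreamingStatistics instance for the JDK class is an assumption based only on the interface named above.

```java
import java.util.DoubleSummaryStatistics;
import java.util.function.DoubleConsumer;
import java.util.stream.DoubleStream;

/** Hypothetical illustration class; uses a JDK accumulator as a stand-in. */
final class ConsumerDemo {

    /** Feeds every value into any DoubleConsumer, e.g. a statistics accumulator. */
    static void feed(double[] values, DoubleConsumer sink) {
        DoubleStream.of(values).forEach(sink);
    }

    /** Summarizes the values with the JDK's DoubleSummaryStatistics. */
    static DoubleSummaryStatistics summarize(double[] values) {
        DoubleSummaryStatistics stats = new DoubleSummaryStatistics();
        feed(values, stats);   // works because the accumulator is a DoubleConsumer
        return stats;
    }
}
```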
• #### clear

public void clear()
Resets all statistics and storage.
• #### getN

public long getN()
Returns the number of available values.
Specified by:
getN in interface StatisticalSummary
Returns:
The number of available values
• #### getMax

public double getMax()
Returns the maximum of the available values.
Specified by:
getMax in interface StatisticalSummary
Returns:
The max or Double.NaN if no values have been added.
• #### getMin

public double getMin()
Returns the minimum of the available values.
Specified by:
getMin in interface StatisticalSummary
Returns:
The min or Double.NaN if no values have been added.
• #### getSum

public double getSum()
Returns the sum of the values that have been added.
Specified by:
getSum in interface StatisticalSummary
Returns:
The sum or Double.NaN if no values have been added
• #### getSumOfSquares

public double getSumOfSquares()
Returns the sum of the squares of the values that have been added.

Double.NaN is returned if no values have been added.

Returns:
The sum of squares
• #### getMean

public double getMean()
Returns the arithmetic mean of the available values.
Specified by:
getMean in interface StatisticalSummary
Returns:
The mean or Double.NaN if no values have been added.
• #### getVariance

public double getVariance()
Returns the variance of the available values.
Specified by:
getVariance in interface StatisticalSummary
Returns:
The variance, Double.NaN if no values have been added or 0.0 for a single value set.
• #### getPopulationVariance

public double getPopulationVariance()
Returns the population variance of the values that have been added.

Double.NaN is returned if no values have been added.

Returns:
the population variance
• #### getGeometricMean

public double getGeometricMean()
Returns the geometric mean of the values that have been added.

Double.NaN is returned if no values have been added.

Returns:
the geometric mean
• #### getSumOfLogs

public double getSumOfLogs()
Returns the sum of the logs of the values that have been added.

Double.NaN is returned if no values have been added.

Returns:
the sum of logs
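
The sum of logs and the geometric mean are closely related: under the usual definition, the geometric mean of n values equals exp(sumOfLogs / n). The following sketch shows that relationship with JDK-only code. `LogStats` and its methods are hypothetical names, and the snippet is an illustration of the formula rather than the library's code.

```java
/** Hypothetical illustration class; not part of the documented library. */
final class LogStats {

    /** Sum of natural logs, or NaN if there are no values. */
    static double sumOfLogs(double[] values) {
        if (values.length == 0) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += Math.log(v);
        }
        return sum;
    }

    /** Geometric mean computed as exp(sumOfLogs / n); NaN if there are no values. */
    static double geometricMean(double[] values) {
        return Math.exp(sumOfLogs(values) / values.length);
    }
}
```

Working in log space avoids the overflow that multiplying many large values together could cause.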
• #### getSecondMoment

public double getSecondMoment()
Returns a statistic related to the Second Central Moment. Specifically, what is returned is the sum of squared deviations from the sample mean among the values that have been added.

Returns Double.NaN if no data values have been added and returns 0 if there is just one value in the data set.

Returns:
second central moment statistic
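
The sum of squared deviations described above is the common numerator of both variance flavours. Assuming the usual definitions (sample variance divides by n - 1, population variance divides by n), the relationship can be sketched as follows. `MomentDemo` and its methods are hypothetical, and the two-pass computation is chosen for readability, not because the library works this way.

```java
/** Hypothetical illustration class; not part of the documented library. */
final class MomentDemo {

    /** Sum of squared deviations from the sample mean; NaN if there are no values. */
    static double secondMoment(double[] values) {
        int n = values.length;
        if (n == 0) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / n;
        double m2 = 0.0;
        for (double v : values) {
            double d = v - mean;
            m2 += d * d;
        }
        return m2;
    }

    /** Sample variance: second moment divided by n - 1 (0.0 for a single value). */
    static double sampleVariance(double[] values) {
        int n = values.length;
        if (n == 0) {
            return Double.NaN;
        }
        return n == 1 ? 0.0 : secondMoment(values) / (n - 1);
    }

    /** Population variance: second moment divided by n. */
    static double populationVariance(double[] values) {
        int n = values.length;
        return n == 0 ? Double.NaN : secondMoment(values) / n;
    }
}
```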

• #### getQuadraticMean

public double getQuadraticMean()
Returns the quadratic mean, a.k.a. root-mean-square, of the available values.
Returns:
The quadratic mean or Double.NaN if no values have been added.
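
As a quick illustration of the definition, the root-mean-square is the square root of the sum of squares divided by the number of values. The helper below is a hypothetical, JDK-only sketch of that formula and makes no claim about how the library computes it.

```java
/** Hypothetical illustration class; not part of the documented library. */
final class RmsDemo {

    /** Root-mean-square: sqrt(sumOfSquares / n); NaN if there are no values. */
    static double quadraticMean(double[] values) {
        if (values.length == 0) {
            return Double.NaN;
        }
        double sumOfSquares = 0.0;
        for (double v : values) {
            sumOfSquares += v * v;
        }
        return Math.sqrt(sumOfSquares / values.length);
    }
}
```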
• #### getStandardDeviation

public double getStandardDeviation()
Returns the standard deviation of the values that have been added.

Double.NaN is returned if no values have been added.

Specified by:
getStandardDeviation in interface StatisticalSummary
Returns:
the standard deviation
• #### getMedian

public double getMedian()
Returns an estimate of the median of the values that have been entered. See RandomPercentile for a description of the algorithm used for large data streams.
Returns:
the median
• #### getPercentile

public double getPercentile(double percentile)
Returns an estimate of the given percentile of the values that have been entered. See RandomPercentile for a description of the algorithm used for large data streams.
Parameters:
percentile - the desired percentile (must be between 0 and 100)
Returns:
estimated percentile
• #### aggregate

public void aggregate(StreamingStatistics other)
Aggregates the provided instance into this instance.

This method can be used to combine statistics computed over partitions or subsamples: after this operation, the value of this instance should be the same as if a single instance had processed the combined dataset. Statistics are aggregated only when both this and other are maintaining them. For example, if this.computeMoments is false but other.computeMoments is true, the moment data in other will be lost.

Specified by:
aggregate in interface AggregatableStatistic<StreamingStatistics>
Parameters:
other - the instance to aggregate into this instance
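
To show why aggregation over partitions can reproduce the result of a single pass, here is a minimal sketch that merges count, mean, and sum of squared deviations using the pairwise update commonly attributed to Chan et al. The class `MergeableStats` and its methods are hypothetical; this is an example of the general idea, not the code behind aggregate.

```java
/** Hypothetical illustration class; not part of the documented library. */
final class MergeableStats {
    private long n;        // number of values
    private double mean;   // running mean
    private double m2;     // sum of squared deviations from the mean

    /** Adds one value using Welford's update. */
    void add(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }

    /** Folds the state of other into this instance; other is left unchanged. */
    void merge(MergeableStats other) {
        if (other.n == 0) {
            return;
        }
        if (n == 0) {
            n = other.n;
            mean = other.mean;
            m2 = other.m2;
            return;
        }
        long total = n + other.n;
        double delta = other.mean - mean;
        // Combine the squared deviations, correcting for the shift between the two means.
        m2 += other.m2 + delta * delta * ((double) n * other.n / total);
        mean += delta * other.n / total;
        n = total;
    }

    long count() { return n; }

    double mean() { return n == 0 ? Double.NaN : mean; }

    double secondMoment() { return n == 0 ? Double.NaN : m2; }
}
```

Merging two partitions this way yields the same count, mean, and second moment (up to floating-point rounding) as adding every value to one accumulator.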
• #### toString

public String toString()
Generates a text report displaying summary statistics from values that have been added.
Overrides:
toString in class Object
Returns:
String with line feeds displaying statistics
• #### equals

public boolean equals(Object object)
Returns true iff object is a StreamingStatistics instance and all statistics have the same values as this.
Overrides:
equals in class Object
Parameters:
object - the object to test equality against.
Returns:
true if object equals this
• #### hashCode

public int hashCode()
Returns hash code based on values of statistics.
Overrides:
hashCode in class Object
Returns:
hash code
• #### builder

public static StreamingStatistics.StreamingStatisticsBuilder builder()
Returns a StreamingStatistics.StreamingStatisticsBuilder to source configured StreamingStatistics instances.
Returns:
a StreamingStatisticsBuilder instance