Class StreamingStatistics
- All Implemented Interfaces:
Serializable, DoubleConsumer, AggregatableStatistic<StreamingStatistics>, StatisticalSummary
Computes summary statistics for a stream of data values added using the addValue method. The data values are not stored in memory, so this class can be used to compute statistics for very large data streams.
By default, all statistics other than percentiles are maintained. Percentile calculations use an embedded RandomPercentile, which carries more memory and compute overhead than the other statistics, so percentiles are disabled by default. To enable percentiles, either pass true to the constructor or use a StreamingStatistics.StreamingStatisticsBuilder to configure an instance with percentiles turned on. Other statistics can also be selectively disabled using StreamingStatisticsBuilder.
Note: This class is not thread-safe.
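The values do not need to be stored because the count, mean, and sum of squared deviations can each be updated in constant memory as a value arrives. The following is a minimal sketch of that idea using Welford's update. `RunningMoments` is a hypothetical class written for illustration only; it is not the Hipparchus implementation, and having `accept` forward to `addValue` is an assumption made here.

```java
import java.util.function.DoubleConsumer;
import java.util.stream.DoubleStream;

/** Hypothetical illustration class; not part of Hipparchus. */
final class RunningMoments implements DoubleConsumer {
    private long n;                 // number of values seen so far
    private double mean;            // running arithmetic mean
    private double m2;              // running sum of squared deviations from the mean
    private double min = Double.NaN;
    private double max = Double.NaN;

    /** Welford's update: constant memory, one pass. */
    void addValue(double value) {
        n++;
        double delta = value - mean;
        mean += delta / n;
        m2 += delta * (value - mean);
        if (n == 1) {
            min = value;
            max = value;
        } else {
            min = Math.min(min, value);
            max = Math.max(max, value);
        }
    }

    /** Assumed here to forward to addValue so the object can be used as a DoubleConsumer. */
    @Override
    public void accept(double value) {
        addValue(value);
    }

    long getN() { return n; }
    double getMean() { return n == 0 ? Double.NaN : mean; }
    double getMin() { return min; }
    double getMax() { return max; }

    /** Sample variance: NaN when empty, 0.0 for a single value. */
    double getVariance() {
        if (n == 0) {
            return Double.NaN;
        }
        return n == 1 ? 0.0 : m2 / (n - 1);
    }

    public static void main(String[] args) {
        RunningMoments stats = new RunningMoments();
        // Because the class is a DoubleConsumer, it can be fed straight from a stream.
        DoubleStream.of(1.0, 2.0, 3.0, 4.0).forEach(stats);
        System.out.println("n=" + stats.getN()
                + " mean=" + stats.getMean()
                + " variance=" + stats.getVariance());
    }
}
```

Like the class documented here, this sketch is not thread-safe: concurrent callers would need external synchronization or one accumulator per thread.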
-
Nested Class Summary
| Modifier and Type | Class | Description |
| --- | --- | --- |
| static class | StreamingStatistics.StreamingStatisticsBuilder | Builder for StreamingStatistics instances. |
-
Constructor Summary
| Constructor | Description |
| --- | --- |
| StreamingStatistics() | Construct a new StreamingStatistics instance, maintaining all statistics other than percentiles. |
| StreamingStatistics(double epsilon, RandomGenerator randomGenerator) | Construct a new StreamingStatistics instance, maintaining all statistics other than percentiles, with percentiles enabled or disabled according to the arguments. |
-
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| void | accept(double value) | |
| void | addValue(double value) | Add a value to the data. |
| void | aggregate(StreamingStatistics other) | Aggregates the provided instance into this instance. |
| static StreamingStatistics.StreamingStatisticsBuilder | builder() | Returns a StreamingStatistics.StreamingStatisticsBuilder to source configured StreamingStatistics instances. |
| void | clear() | Resets all statistics and storage. |
| StreamingStatistics | copy() | Returns a copy of this StreamingStatistics instance with the same internal state. |
| boolean | equals(Object object) | Returns true iff object is a StreamingStatistics instance and all statistics have the same values as this. |
| double | getGeometricMean() | Returns the geometric mean of the values that have been added. |
| double | getMax() | Returns the maximum of the available values. |
| double | getMean() | Returns the arithmetic mean of the available values. |
| double | getMedian() | Returns an estimate of the median of the values that have been entered. |
| double | getMin() | Returns the minimum of the available values. |
| long | getN() | Returns the number of available values. |
| double | getPercentile(double percentile) | Returns an estimate of the given percentile of the values that have been entered. |
| double | getPopulationVariance() | Returns the population variance of the values that have been added. |
| double | getQuadraticMean() | Returns the quadratic mean, a.k.a. root-mean-square, of the available values. |
| double | getSecondMoment() | Returns a statistic related to the Second Central Moment. |
| double | getStandardDeviation() | Returns the standard deviation of the values that have been added. |
| double | getSum() | Returns the sum of the values that have been added. |
| | getSummary() | Returns a StatisticalSummaryValues instance reporting current statistics. |
| double | getSumOfLogs() | Returns the sum of the logs of the values that have been added. |
| double | getSumOfSquares() | Returns the sum of the squares of the values that have been added. |
| double | getVariance() | Returns the variance of the available values. |
| int | hashCode() | Returns hash code based on values of statistics. |
| String | toString() | Generates a text report displaying summary statistics from values that have been added. |

Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.hipparchus.stat.descriptive.AggregatableStatistic
aggregate, aggregate
Methods inherited from interface java.util.function.DoubleConsumer
andThen
-
Constructor Details
-
StreamingStatistics
public StreamingStatistics()
Construct a new StreamingStatistics instance, maintaining all statistics other than percentiles.
-
StreamingStatistics
Construct a new StreamingStatistics instance, maintaining all statistics other than percentiles, with percentiles enabled or disabled according to the arguments.
- Parameters:
epsilon - bound on quantile estimation error (see RandomGenerator)
randomGenerator - PRNG used in sampling and merge operations (null if percentiles should not be computed)
- Since:
- 2.3
-
-
Method Details
-
copy
Returns a copy of this StreamingStatistics instance with the same internal state.
- Returns:
- a copy of this
-
getSummary
Returns a StatisticalSummaryValues instance reporting current statistics.
- Returns:
- Current values of statistics
-
addValue
public void addValue(double value)
Add a value to the data.
- Parameters:
value - the value to add
-
accept
public void accept(double value)
- Specified by:
accept in interface DoubleConsumer
-
clear
public void clear()
Resets all statistics and storage.
-
getN
public long getN()
Returns the number of available values.
- Specified by:
getN in interface StatisticalSummary
- Returns:
- The number of available values
-
getMax
public double getMax()
Returns the maximum of the available values.
- Specified by:
getMax in interface StatisticalSummary
- Returns:
- The max or Double.NaN if no values have been added.
-
getMin
public double getMin()
Returns the minimum of the available values.
- Specified by:
getMin in interface StatisticalSummary
- Returns:
- The min or Double.NaN if no values have been added.
-
getSum
public double getSum()
Returns the sum of the values that have been added.
- Specified by:
getSum in interface StatisticalSummary
- Returns:
- The sum or Double.NaN if no values have been added
-
getSumOfSquares
public double getSumOfSquares()
Returns the sum of the squares of the values that have been added.
Double.NaN is returned if no values have been added.
- Returns:
- The sum of squares
-
getMean
public double getMean()
Returns the arithmetic mean of the available values.
- Specified by:
getMean in interface StatisticalSummary
- Returns:
- The mean or Double.NaN if no values have been added.
-
getVariance
public double getVariance()
Returns the variance of the available values.
- Specified by:
getVariance in interface StatisticalSummary
- Returns:
- The variance, Double.NaN if no values have been added or 0.0 for a single value set.
-
getPopulationVariance
public double getPopulationVariance()
Returns the population variance of the values that have been added.
Double.NaN is returned if no values have been added.
- Returns:
- the population variance
-
getGeometricMean
public double getGeometricMean()
Returns the geometric mean of the values that have been added.
Double.NaN is returned if no values have been added.
- Returns:
- the geometric mean
-
getSumOfLogs
public double getSumOfLogs()
Returns the sum of the logs of the values that have been added.
Double.NaN is returned if no values have been added.
- Returns:
- the sum of logs
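The sum of logs and the geometric mean are closely related: the geometric mean of n values equals exp(sumOfLogs / n). The sketch below shows that relationship with a streaming accumulator. `LogSumAccumulator` is a hypothetical name used for illustration, and whether Hipparchus derives one statistic from the other internally is an assumption.

```java
/** Hypothetical illustration class; not part of Hipparchus. */
final class LogSumAccumulator {
    private long n;            // number of values seen
    private double sumOfLogs;  // running sum of natural logs

    void addValue(double value) {
        n++;
        sumOfLogs += Math.log(value);
    }

    /** NaN when no values have been added, matching the documented convention. */
    double getSumOfLogs() {
        return n == 0 ? Double.NaN : sumOfLogs;
    }

    /** Geometric mean as exp(sumOfLogs / n); NaN when empty. */
    double getGeometricMean() {
        return n == 0 ? Double.NaN : Math.exp(sumOfLogs / n);
    }

    public static void main(String[] args) {
        LogSumAccumulator acc = new LogSumAccumulator();
        for (double v : new double[] {1.0, 4.0, 16.0}) {
            acc.addValue(v);
        }
        System.out.println("sumOfLogs=" + acc.getSumOfLogs()
                + " geometricMean=" + acc.getGeometricMean());
    }
}
```

Working in log space avoids the overflow or underflow that multiplying many values directly could cause.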
-
getSecondMoment
public double getSecondMoment()
Returns a statistic related to the Second Central Moment. Specifically, what is returned is the sum of squared deviations from the sample mean among the values that have been added.
Returns Double.NaN if no data values have been added and returns 0 if there is just one value in the data set.
- Returns:
- second central moment statistic
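Since the second moment is the sum of squared deviations from the mean, the variance statistics follow from it by division. The sketch below illustrates this under the assumption that getVariance is the bias-corrected sample variance (denominator n - 1) and getPopulationVariance uses denominator n. `SecondMomentSketch` and its methods are hypothetical helpers, not Hipparchus API.

```java
/** Hypothetical illustration class; not part of Hipparchus. */
final class SecondMomentSketch {

    /** Sum of squared deviations from the mean (two-pass, for clarity); NaN when empty. */
    static double secondMoment(double[] values) {
        if (values.length == 0) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / values.length;
        double m2 = 0.0;
        for (double v : values) {
            m2 += (v - mean) * (v - mean);
        }
        return m2;
    }

    /** Assumed bias-corrected form: m2 / (n - 1); NaN when empty, 0.0 for one value. */
    static double sampleVariance(long n, double m2) {
        if (n == 0) {
            return Double.NaN;
        }
        return n == 1 ? 0.0 : m2 / (n - 1);
    }

    /** Population form: m2 / n; NaN when empty. */
    static double populationVariance(long n, double m2) {
        return n == 0 ? Double.NaN : m2 / n;
    }

    /** Standard deviation as the square root of the sample variance. */
    static double standardDeviation(long n, double m2) {
        return Math.sqrt(sampleVariance(n, m2));
    }

    public static void main(String[] args) {
        double[] data = {2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0};
        double m2 = secondMoment(data);
        System.out.println("secondMoment=" + m2
                + " sampleVariance=" + sampleVariance(data.length, m2)
                + " populationVariance=" + populationVariance(data.length, m2));
    }
}
```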
-
getQuadraticMean
public double getQuadraticMean()
Returns the quadratic mean, a.k.a. root-mean-square, of the available values.
- Returns:
- The quadratic mean or Double.NaN if no values have been added.
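The quadratic mean can be obtained from the running sum of squares as sqrt(sumOfSquares / n). The sketch below shows this with a hypothetical `SquareSumAccumulator`; it illustrates the formula and does not claim to match the Hipparchus internals.

```java
/** Hypothetical illustration class; not part of Hipparchus. */
final class SquareSumAccumulator {
    private long n;              // number of values seen
    private double sumOfSquares; // running sum of squared values

    void addValue(double value) {
        n++;
        sumOfSquares += value * value;
    }

    /** NaN when no values have been added. */
    double getSumOfSquares() {
        return n == 0 ? Double.NaN : sumOfSquares;
    }

    /** Root-mean-square: sqrt(sumOfSquares / n); NaN when empty. */
    double getQuadraticMean() {
        return n == 0 ? Double.NaN : Math.sqrt(sumOfSquares / n);
    }

    public static void main(String[] args) {
        SquareSumAccumulator acc = new SquareSumAccumulator();
        acc.addValue(3.0);
        acc.addValue(4.0);
        System.out.println("sumOfSquares=" + acc.getSumOfSquares()
                + " quadraticMean=" + acc.getQuadraticMean());
    }
}
```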
-
getStandardDeviation
public double getStandardDeviation()
Returns the standard deviation of the values that have been added.
Double.NaN is returned if no values have been added.
- Specified by:
getStandardDeviation in interface StatisticalSummary
- Returns:
- the standard deviation
-
getMedian
public double getMedian()
Returns an estimate of the median of the values that have been entered. See RandomPercentile for a description of the algorithm used for large data streams.
- Returns:
- the median
-
getPercentile
public double getPercentile(double percentile)
Returns an estimate of the given percentile of the values that have been entered. See RandomPercentile for a description of the algorithm used for large data streams.
- Parameters:
percentile - the desired percentile (must be between 0 and 100)
- Returns:
- estimated percentile
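To give a feel for how a percentile can be estimated without storing the whole stream, the sketch below keeps a fixed-size uniform random sample (reservoir sampling, Algorithm R) and reads a nearest-rank percentile from it. This is a deliberately simpler, swapped-in technique and is not the RandomPercentile algorithm; `ReservoirPercentile`, its capacity parameter, and the seed are all hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical illustration class; NOT the RandomPercentile algorithm. */
final class ReservoirPercentile {
    private final double[] sample; // fixed-size uniform sample of the stream
    private final Random rng;      // PRNG driving the sampling decisions
    private long seen;             // total number of values offered so far

    ReservoirPercentile(int capacity, long seed) {
        this.sample = new double[capacity];
        this.rng = new Random(seed);
    }

    /** Algorithm R: each value ends up in the sample with equal probability. */
    void addValue(double value) {
        if (seen < sample.length) {
            sample[(int) seen] = value;
        } else {
            // Uniform index in [0, seen]; replace only if it falls inside the reservoir.
            long j = (long) (rng.nextDouble() * (seen + 1));
            if (j < sample.length) {
                sample[(int) j] = value;
            }
        }
        seen++;
    }

    /** Number of values currently held in the sample. */
    int sampleSize() {
        return (int) Math.min(seen, sample.length);
    }

    /** Nearest-rank percentile of the sample; NaN when no values have been added. */
    double getPercentile(double percentile) {
        if (percentile < 0.0 || percentile > 100.0) {
            throw new IllegalArgumentException("percentile must be between 0 and 100");
        }
        int size = sampleSize();
        if (size == 0) {
            return Double.NaN;
        }
        double[] sorted = Arrays.copyOf(sample, size);
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(percentile / 100.0 * size);
        return sorted[Math.max(rank - 1, 0)];
    }

    public static void main(String[] args) {
        ReservoirPercentile estimator = new ReservoirPercentile(256, 42L);
        for (int i = 1; i <= 100_000; i++) {
            estimator.addValue(i);
        }
        System.out.println("estimated median=" + estimator.getPercentile(50.0));
    }
}
```

While the stream is shorter than the reservoir the result is exact; beyond that it is an estimate whose accuracy depends on the sample size.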
-
aggregate
Aggregates the provided instance into this instance.
This method can be used to combine statistics computed over partitions or subsamples, i.e., the value of this instance after this operation should be the same as if a single statistic had been computed over the combined dataset. Statistics are aggregated only when both this and other are maintaining them. For example, if this.computeMoments is false but other.computeMoments is true, the moment data in other will be lost.
- Specified by:
aggregate in interface AggregatableStatistic<StreamingStatistics>
- Parameters:
other - the instance to aggregate into this instance
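The partition-combining behavior described for aggregate can be illustrated with the standard pairwise update for count, mean, and sum of squared deviations (the formula of Chan et al.). `MergeableMoments` is a hypothetical class for illustration; how Hipparchus combines its internal statistics is not shown here and may differ.

```java
/** Hypothetical illustration class; not part of Hipparchus. */
final class MergeableMoments {
    private long n;       // number of values seen
    private double mean;  // running mean
    private double m2;    // running sum of squared deviations from the mean

    void addValue(double value) {
        n++;
        double delta = value - mean;
        mean += delta / n;
        m2 += delta * (value - mean);
    }

    /** Folds another partition into this one as if all values had been added here. */
    void aggregate(MergeableMoments other) {
        if (other.n == 0) {
            return; // nothing to merge
        }
        if (n == 0) {
            n = other.n;
            mean = other.mean;
            m2 = other.m2;
            return;
        }
        long total = n + other.n;
        double delta = other.mean - mean;
        // Cross term accounts for the distance between the two partition means.
        m2 += other.m2 + delta * delta * ((double) n * other.n / total);
        mean += delta * other.n / total;
        n = total;
    }

    long getN() { return n; }
    double getMean() { return n == 0 ? Double.NaN : mean; }
    double getSecondMoment() { return n == 0 ? Double.NaN : m2; }

    public static void main(String[] args) {
        MergeableMoments left = new MergeableMoments();
        MergeableMoments right = new MergeableMoments();
        for (double v : new double[] {1.0, 2.0}) {
            left.addValue(v);
        }
        for (double v : new double[] {3.0, 4.0, 5.0, 6.0}) {
            right.addValue(v);
        }
        left.aggregate(right);
        System.out.println("n=" + left.getN()
                + " mean=" + left.getMean()
                + " secondMoment=" + left.getSecondMoment());
    }
}
```

This is what makes per-partition or per-thread accumulation practical: each worker fills its own instance, and the results are merged once at the end.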
-
toString
Generates a text report displaying summary statistics from values that have been added.
-
equals
Returns true iff object is a StreamingStatistics instance and all statistics have the same values as this.
-
hashCode
public int hashCode()
Returns hash code based on values of statistics.
-
builder
Returns a StreamingStatistics.StreamingStatisticsBuilder to source configured StreamingStatistics instances.
- Returns:
- a StreamingStatisticsBuilder instance
-