Class Percentile
- java.lang.Object
  - org.hipparchus.stat.descriptive.AbstractUnivariateStatistic
    - org.hipparchus.stat.descriptive.rank.Percentile
- All Implemented Interfaces: Serializable, UnivariateStatistic, MathArrays.Function
public class Percentile extends AbstractUnivariateStatistic implements Serializable
Provides percentile computation.
There are several commonly used methods for estimating percentiles (a.k.a. quantiles) based on sample data. For large samples, the different methods agree closely, but when sample sizes are small, different methods will give significantly different results. The algorithm implemented here works as follows:
- Let n be the length of the (sorted) array and 0 < p <= 100 be the desired percentile.
- If n = 1, return the unique array element (regardless of the value of p); otherwise
- Compute the estimated percentile position pos = p * (n + 1) / 100 and the difference, d, between pos and floor(pos) (i.e. the fractional part of pos).
- If pos < 1, return the smallest element in the array.
- Else if pos >= n, return the largest element in the array.
- Else let lower be the element in position floor(pos) in the array and let upper be the next element in the array. Return lower + d * (upper - lower).
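The steps above can be sketched in plain Java as follows. This is a minimal illustration, not the Hipparchus implementation: the class name `LegacyPercentileSketch` is hypothetical, a full sort stands in for the selection-based approach the class actually uses, positions are read as 1-based, and the empty-array result follows the `evaluate` contract rather than the list above.

```java
import java.util.Arrays;

/** Illustrative sketch of the position-based estimate described above. */
class LegacyPercentileSketch {

    /** Estimates the p-th percentile (0 < p <= 100) of the given values. */
    static double percentile(double[] values, double p) {
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must satisfy 0 < p <= 100");
        }
        if (values.length == 0) {
            return Double.NaN;              // assumption: mirrors the evaluate() contract
        }
        double[] sorted = values.clone();   // never reorder the caller's array
        Arrays.sort(sorted);
        int n = sorted.length;
        if (n == 1) {
            return sorted[0];               // unique element, regardless of p
        }
        double pos = p * (n + 1) / 100.0;   // estimated (1-based) position
        double floorPos = Math.floor(pos);
        double d = pos - floorPos;          // fractional part of pos
        if (pos < 1) {
            return sorted[0];               // smallest element
        }
        if (pos >= n) {
            return sorted[n - 1];           // largest element
        }
        double lower = sorted[(int) floorPos - 1]; // element at 1-based position floor(pos)
        double upper = sorted[(int) floorPos];     // next element
        return lower + d * (upper - lower);
    }

    public static void main(String[] args) {
        System.out.println(percentile(new double[] {1, 2, 3, 4}, 50)); // 2.5
    }
}
```

For `{1, 2, 3, 4}` and `p = 50`, `pos` is 2.5, so the result interpolates halfway between the second and third sorted elements.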
To compute percentiles, the data must be at least partially ordered. Input arrays are copied and recursively partitioned using an ordering definition. The ordering used by Arrays.sort(double[]) is the one determined by Double.compareTo(Double). This ordering makes Double.NaN larger than any other value (including Double.POSITIVE_INFINITY). Therefore, for example, the median (50th percentile) of {0, 1, 2, 3, 4, Double.NaN} evaluates to 2.5.
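The ordering described above can be checked with the JDK alone. The following is a small sketch (the class name `NaNOrderingSketch` is hypothetical) showing that `Arrays.sort(double[])` places `NaN` after positive infinity, which is why a retained `NaN` behaves like the largest sample value.

```java
import java.util.Arrays;

/** Illustrative sketch of the total ordering used by Arrays.sort(double[]). */
class NaNOrderingSketch {

    /** Returns a sorted copy, leaving the input untouched. */
    static double[] sortedCopy(double[] values) {
        double[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        double[] sorted = sortedCopy(
            new double[] {Double.NaN, 4, Double.POSITIVE_INFINITY, 0, 2});
        // NaN sorts last, after Infinity
        System.out.println(Arrays.toString(sorted)); // [0.0, 2.0, 4.0, Infinity, NaN]
        // Double.compare agrees with that ordering
        System.out.println(Double.compare(Double.NaN, Double.POSITIVE_INFINITY) > 0); // true
    }
}
```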
Since percentile estimation usually involves interpolation between array elements, arrays containing NaN or infinite values will often result in NaN or infinite values being returned.
Further, to support the different estimation types (such as R1 and R2) described on the Wikipedia Quantile page, a type-specific NaN handling strategy is used to closely match the results typically observed from popular tools such as R (R1-R9) and Excel (R7).
Percentile uses only selection instead of complete sorting and caches selection algorithm state between calls to the various evaluate methods. This greatly improves efficiency, both for a single percentile and multiple percentile computations. To maximize performance when multiple percentiles are computed based on the same data, users should set the data array once using either one of the evaluate(double[], double) or setData(double[]) methods and thereafter call evaluate(double) with just the percentile provided.
Note that this implementation is not synchronized. If multiple threads access an instance of this class concurrently, and at least one of the threads invokes the increment() or clear() method, it must be synchronized externally.
- See Also:
- Serialized Form
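To illustrate what "selection instead of complete sorting" means, here is a generic quickselect sketch with a median-of-3 pivot. It is an assumption-laden illustration rather than the code of `KthSelector`: the class and method names are hypothetical, the ordering uses `Double.compare` so that NaN sorts last, and the caching of selection state between calls is not shown.

```java
/** Illustrative quickselect; finds the k-th smallest value without a full sort. */
class KthSelectSketch {

    /** Returns the k-th smallest element (0-based), partially reordering work in place. */
    static double select(double[] work, int k) {
        if (k < 0 || k >= work.length) {
            throw new IllegalArgumentException("k out of range");
        }
        int lo = 0;
        int hi = work.length - 1;
        while (lo < hi) {
            int p = partition(work, lo, hi, medianOf3(work, lo, hi));
            if (k == p) {
                return work[k];
            } else if (k < p) {
                hi = p - 1;   // k-th element lies left of the pivot
            } else {
                lo = p + 1;   // k-th element lies right of the pivot
            }
        }
        return work[k];
    }

    private static boolean less(double a, double b) {
        return Double.compare(a, b) < 0;
    }

    /** Index of the median of the first, middle and last elements of [lo, hi]. */
    private static int medianOf3(double[] a, int lo, int hi) {
        int mid = lo + (hi - lo) / 2;
        double x = a[lo];
        double y = a[mid];
        double z = a[hi];
        if (less(x, y)) {
            if (less(y, z)) {
                return mid;
            }
            return less(x, z) ? hi : lo;
        }
        if (less(x, z)) {
            return lo;
        }
        return less(y, z) ? hi : mid;
    }

    /** Lomuto partition around a[pivotIndex]; returns the pivot's final index. */
    private static int partition(double[] a, int lo, int hi, int pivotIndex) {
        double pivot = a[pivotIndex];
        swap(a, pivotIndex, hi);
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (less(a[i], pivot)) {
                swap(a, i, store++);
            }
        }
        swap(a, store, hi);
        return store;
    }

    private static void swap(double[] a, int i, int j) {
        double t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}
```

Only the partitions that contain the requested position are refined, so the array ends up partially ordered, which is all a percentile estimate needs.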
Nested Class Summary
| Modifier and Type | Class | Description |
| --- | --- | --- |
| static class | Percentile.EstimationType | An enum for the percentile estimation strategies described in the Wikipedia article on quantiles, with enum names matching the type names used there. |
Constructor Summary
| Modifier | Constructor | Description |
| --- | --- | --- |
| | Percentile() | Constructs a Percentile with the following defaults. |
| | Percentile(double quantile) | Constructs a Percentile with the specific quantile value and the following defaults: method type Percentile.EstimationType.LEGACY, NaN strategy NaNStrategy.REMOVED, and a KthSelector. |
| protected | Percentile(double quantile, Percentile.EstimationType estimationType, NaNStrategy nanStrategy, KthSelector kthSelector) | Constructs a Percentile with the specific quantile value, Percentile.EstimationType, NaNStrategy and KthSelector. |
| | Percentile(Percentile original) | Copy constructor, creates a new Percentile identical to the original. |
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| Percentile | copy() | Returns a copy of the statistic with the same internal state. |
| double | evaluate(double p) | Returns the result of evaluating the statistic over the stored data. |
| double | evaluate(double[] values, double p) | Returns an estimate of the p-th percentile of the values in the values array. |
| double | evaluate(double[] values, int start, int length) | Returns an estimate of the quantile-th percentile of the designated values in the values array. |
| double | evaluate(double[] values, int begin, int length, double p) | Returns an estimate of the p-th percentile of the values in the values array, starting with the element in (0-based) position begin in the array and including length values. |
| Percentile.EstimationType | getEstimationType() | Get the estimation type used for computation. |
| KthSelector | getKthSelector() | Get the kthSelector used for computation. |
| NaNStrategy | getNaNStrategy() | Get the NaN handling strategy used for computation. |
| PivotingStrategy | getPivotingStrategy() | Get the PivotingStrategy used in KthSelector for computation. |
| double | getQuantile() | Returns the value of the quantile field (determines what percentile is computed when evaluate() is called with no quantile argument). |
| protected double[] | getWorkArray(double[] values, int begin, int length) | Get the work array to operate on. |
| void | setData(double[] values) | Set the data array. |
| void | setData(double[] values, int begin, int length) | Set the data array. |
| void | setQuantile(double p) | Sets the value of the quantile field (determines what percentile is computed when evaluate() is called with no quantile argument). |
| Percentile | withEstimationType(Percentile.EstimationType newEstimationType) | Build a new instance similar to the current one except for the estimation type. |
| Percentile | withKthSelector(KthSelector newKthSelector) | Build a new instance similar to the current one except for the kthSelector instance specifically set. |
| Percentile | withNaNStrategy(NaNStrategy newNaNStrategy) | Build a new instance similar to the current one except for the NaN handling strategy. |
Methods inherited from class org.hipparchus.stat.descriptive.AbstractUnivariateStatistic
evaluate, getData, getDataRef
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.hipparchus.stat.descriptive.UnivariateStatistic
evaluate
Constructor Detail
Percentile
public Percentile()
Constructs a Percentile with the following defaults.
- default quantile: 50.0, can be reset with setQuantile(double)
- default estimation type: Percentile.EstimationType.LEGACY, can be reset with withEstimationType(EstimationType)
- default NaN strategy: NaNStrategy.REMOVED, can be reset with withNaNStrategy(NaNStrategy)
- a KthSelector that makes use of PivotingStrategy.MEDIAN_OF_3, can be reset with withKthSelector(KthSelector)

Percentile
public Percentile(double quantile) throws MathIllegalArgumentException
Constructs a Percentile with the specific quantile value and the following defaults:
- default method type: Percentile.EstimationType.LEGACY
- default NaN strategy: NaNStrategy.REMOVED
- a Kth Selector: KthSelector
- Parameters:
  quantile - the quantile
- Throws:
  MathIllegalArgumentException - if quantile is not greater than 0 and less than or equal to 100

Percentile
public Percentile(Percentile original) throws NullArgumentException
Copy constructor, creates a new Percentile identical to the original.
- Parameters:
  original - the Percentile instance to copy
- Throws:
  NullArgumentException - if original is null

Percentile
protected Percentile(double quantile, Percentile.EstimationType estimationType, NaNStrategy nanStrategy, KthSelector kthSelector) throws MathIllegalArgumentException
Constructs a Percentile with the specific quantile value, Percentile.EstimationType, NaNStrategy and KthSelector.
- Parameters:
  quantile - the quantile to be computed
  estimationType - one of the percentile estimation types
  nanStrategy - one of NaNStrategy to handle NaNs
  kthSelector - a KthSelector to use for pivoting during search
- Throws:
  MathIllegalArgumentException - if quantile is not within (0, 100]
  NullArgumentException - if the estimation type or NaNStrategy passed is null
Method Detail
setData
public void setData(double[] values)
Set the data array.
The stored value is a copy of the parameter array, not the array itself.
- Overrides:
  setData in class AbstractUnivariateStatistic
- Parameters:
  values - data array to store (may be null to remove stored data)
- See Also:
  AbstractUnivariateStatistic.evaluate()

setData
public void setData(double[] values, int begin, int length) throws MathIllegalArgumentException
Set the data array. The input array is copied, not referenced.
- Overrides:
  setData in class AbstractUnivariateStatistic
- Parameters:
  values - data array to store
  begin - the index of the first element to include
  length - the number of elements to include
- Throws:
  MathIllegalArgumentException - if values is null or the indices are not valid
- See Also:
  AbstractUnivariateStatistic.evaluate()

evaluate
public double evaluate(double p) throws MathIllegalArgumentException
Returns the result of evaluating the statistic over the stored data.
The stored array is the one set by the most recent call to setData(double[]).
- Parameters:
  p - the percentile value to compute
- Returns:
  the value of the statistic applied to the stored data
- Throws:
  MathIllegalArgumentException - if p is not a valid quantile value (p must be greater than 0 and less than or equal to 100)

evaluate
public double evaluate(double[] values, int start, int length) throws MathIllegalArgumentException
Returns an estimate of the quantile-th percentile of the designated values in the values array. The quantile estimated is determined by the quantile property.
- Returns Double.NaN if length = 0
- Returns (for any value of quantile) values[start] if length = 1
- Throws MathIllegalArgumentException if values is null, or start or length is invalid
See Percentile for a description of the percentile estimation algorithm used.
- Specified by:
  evaluate in interface MathArrays.Function
- Specified by:
  evaluate in interface UnivariateStatistic
- Specified by:
  evaluate in class AbstractUnivariateStatistic
- Parameters:
  values - the input array
  start - index of the first array element to include
  length - the number of elements to include
- Returns:
  the percentile value
- Throws:
  MathIllegalArgumentException - if the parameters are not valid

evaluate
public double evaluate(double[] values, double p) throws MathIllegalArgumentException
Returns an estimate of the p-th percentile of the values in the values array.
- Returns Double.NaN if values has length 0
- Returns (for any value of p) values[0] if values has length 1
- Throws MathIllegalArgumentException if values is null or p is not a valid quantile value (p must be greater than 0 and less than or equal to 100)
The default implementation delegates to evaluate(double[], int, int, double) in the natural way.
- Parameters:
  values - input array of values
  p - the percentile value to compute
- Returns:
  the percentile value or Double.NaN if the array is empty
- Throws:
  MathIllegalArgumentException - if values is null or p is invalid

evaluate
public double evaluate(double[] values, int begin, int length, double p) throws MathIllegalArgumentException
Returns an estimate of the p-th percentile of the values in the values array, starting with the element in (0-based) position begin in the array and including length values.
Calls to this method do not modify the internal quantile state of this statistic.
- Returns Double.NaN if length = 0
- Returns (for any value of p) values[begin] if length = 1
- Throws MathIllegalArgumentException if values is null, begin or length is invalid, or p is not a valid quantile value (p must be greater than 0 and less than or equal to 100)
See Percentile for a description of the percentile estimation algorithm used.
- Parameters:
  values - array of input values
  p - the percentile to compute
  begin - the first (0-based) element to include in the computation
  length - the number of array elements to include
- Returns:
  the percentile value
- Throws:
  MathIllegalArgumentException - if the parameters are not valid or the input array is null
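The contract of this range-based overload can be sketched without the library. In the following illustration the class name `RangePercentileSketch` is hypothetical, `IllegalArgumentException` stands in for `MathIllegalArgumentException`, and a sorted copy of the slice replaces the selection-based work array; it is meant only to make the edge cases concrete.

```java
import java.util.Arrays;

/** Illustrative sketch of the range-based evaluate contract. */
class RangePercentileSketch {

    static double evaluate(double[] values, int begin, int length, double p) {
        if (values == null) {
            throw new IllegalArgumentException("values is null");
        }
        if (begin < 0 || length < 0 || begin > values.length - length) {
            throw new IllegalArgumentException("invalid begin or length");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must satisfy 0 < p <= 100");
        }
        if (length == 0) {
            return Double.NaN;            // empty range
        }
        if (length == 1) {
            return values[begin];         // single element, for any p
        }
        // Work on a copy of the slice so the caller's array is not reordered.
        double[] work = Arrays.copyOfRange(values, begin, begin + length);
        Arrays.sort(work);
        double pos = p * (length + 1) / 100.0;
        if (pos < 1) {
            return work[0];
        }
        if (pos >= length) {
            return work[length - 1];
        }
        int i = (int) Math.floor(pos);    // 1-based position of the lower element
        double d = pos - i;
        return work[i - 1] + d * (work[i] - work[i - 1]);
    }
}
```

For example, `evaluate(new double[] {100, 4, 1, 3, 2, 100}, 1, 4, 50)` only considers `{4, 1, 3, 2}` and yields 2.5.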

getQuantile
public double getQuantile()
Returns the value of the quantile field (determines what percentile is computed when evaluate() is called with no quantile argument).
- Returns:
  the quantile set at construction or by setQuantile(double)

setQuantile
public void setQuantile(double p) throws MathIllegalArgumentException
Sets the value of the quantile field (determines what percentile is computed when evaluate() is called with no quantile argument).
- Parameters:
  p - a value such that 0 < p <= 100
- Throws:
  MathIllegalArgumentException - if p is not greater than 0 and less than or equal to 100

copy
public Percentile copy()
Returns a copy of the statistic with the same internal state.
- Specified by:
  copy in interface UnivariateStatistic
- Specified by:
  copy in class AbstractUnivariateStatistic
- Returns:
  a copy of the statistic

getWorkArray
protected double[] getWorkArray(double[] values, int begin, int length)
Get the work array to operate on. Makes use of prior storedData if it exists; otherwise checks for NaNs and copies the subset of the array defined by the begin and length parameters. The configured nanStrategy is used to retain, remove, or replace any NaNs present before returning the resultant array.
- Parameters:
  values - the array of numbers
  begin - index to start reading the array
  length - the length of array to be read from the begin index
- Returns:
  work array sliced from values in the range [begin, begin+length)
- Throws:
  MathIllegalArgumentException - if values or indices are invalid
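The slice-then-handle-NaNs step can be pictured with the sketch below. It is an illustration under stated assumptions: the class `WorkArraySketch` and the enum `NaNHandling` are hypothetical stand-ins (they are not `NaNStrategy`), only three behaviours are modelled, and the reuse of previously stored data is omitted.

```java
import java.util.Arrays;

/** Illustrative sketch of building a work array with NaN handling applied. */
class WorkArraySketch {

    /** Hypothetical options: keep NaNs, drop them, or treat them as the largest value. */
    enum NaNHandling { KEEP, REMOVE, REPLACE_WITH_MAX }

    static double[] workArray(double[] values, int begin, int length, NaNHandling handling) {
        if (values == null || begin < 0 || length < 0 || begin > values.length - length) {
            throw new IllegalArgumentException("invalid values, begin or length");
        }
        // Copy the range [begin, begin + length) so the input is never modified.
        double[] slice = Arrays.copyOfRange(values, begin, begin + length);
        switch (handling) {
            case REMOVE:
                return Arrays.stream(slice).filter(v -> !Double.isNaN(v)).toArray();
            case REPLACE_WITH_MAX:
                return Arrays.stream(slice)
                             .map(v -> Double.isNaN(v) ? Double.POSITIVE_INFINITY : v)
                             .toArray();
            default:
                return slice;
        }
    }
}
```

Note that removing NaNs shortens the work array, so the sample size used for the percentile position can be smaller than `length`.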

getEstimationType
public Percentile.EstimationType getEstimationType()
Get the estimation type used for computation.
- Returns:
  the estimationType set

withEstimationType
public Percentile withEstimationType(Percentile.EstimationType newEstimationType)
Build a new instance similar to the current one except for the estimation type.
This method is intended to be used as part of a fluent-type builder pattern. Building finely tuned instances should be done as follows:
    Percentile customized = new Percentile(quantile)
        .withEstimationType(estimationType)
        .withNaNStrategy(nanStrategy)
        .withKthSelector(kthSelector);
If any of the withXxx methods is omitted, the default value for the corresponding customization parameter will be used.
- Parameters:
  newEstimationType - estimation type for the new instance
- Returns:
  a new instance, with changed estimation type
- Throws:
  NullArgumentException - when newEstimationType is null
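The withXxx style used here returns a fresh instance rather than mutating the receiver. A generic sketch of that pattern follows; the class `FluentConfigSketch` and its string-valued fields are hypothetical and exist only to show the shape, with `Objects.requireNonNull` standing in for the library's null check.

```java
import java.util.Objects;

/** Illustrative immutable configuration using withXxx methods. */
class FluentConfigSketch {

    private final double quantile;
    private final String estimationType;
    private final String nanStrategy;

    /** Starts from defaults, as an omitted withXxx call would. */
    FluentConfigSketch(double quantile) {
        this(quantile, "LEGACY", "REMOVED");
    }

    private FluentConfigSketch(double quantile, String estimationType, String nanStrategy) {
        this.quantile = quantile;
        this.estimationType = estimationType;
        this.nanStrategy = nanStrategy;
    }

    /** Returns a new instance that differs only in the estimation type. */
    FluentConfigSketch withEstimationType(String newEstimationType) {
        Objects.requireNonNull(newEstimationType, "newEstimationType");
        return new FluentConfigSketch(quantile, newEstimationType, nanStrategy);
    }

    /** Returns a new instance that differs only in the NaN strategy. */
    FluentConfigSketch withNaNStrategy(String newNaNStrategy) {
        Objects.requireNonNull(newNaNStrategy, "newNaNStrategy");
        return new FluentConfigSketch(quantile, estimationType, newNaNStrategy);
    }

    double getQuantile() { return quantile; }
    String getEstimationType() { return estimationType; }
    String getNaNStrategy() { return nanStrategy; }
}
```

Because each call leaves the original untouched, a base configuration can be shared and specialised safely, for example `new FluentConfigSketch(95).withEstimationType("R_7")`.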

getNaNStrategy
public NaNStrategy getNaNStrategy()
Get the NaN handling strategy used for computation.
- Returns:
  NaN handling strategy set during construction

withNaNStrategy
public Percentile withNaNStrategy(NaNStrategy newNaNStrategy)
Build a new instance similar to the current one except for the NaN handling strategy.
This method is intended to be used as part of a fluent-type builder pattern. Building finely tuned instances should be done as follows:
    Percentile customized = new Percentile(quantile)
        .withEstimationType(estimationType)
        .withNaNStrategy(nanStrategy)
        .withKthSelector(kthSelector);
If any of the withXxx methods is omitted, the default value for the corresponding customization parameter will be used.
- Parameters:
  newNaNStrategy - NaN strategy for the new instance
- Returns:
  a new instance, with changed NaN handling strategy
- Throws:
  NullArgumentException - when newNaNStrategy is null

getKthSelector
public KthSelector getKthSelector()
Get the kthSelector used for computation.
- Returns:
  the kthSelector set

getPivotingStrategy
public PivotingStrategy getPivotingStrategy()
Get the PivotingStrategy used in KthSelector for computation.
- Returns:
  the pivoting strategy set

withKthSelector
public Percentile withKthSelector(KthSelector newKthSelector)
Build a new instance similar to the current one except for the kthSelector instance specifically set.
This method is intended to be used as part of a fluent-type builder pattern. Building finely tuned instances should be done as follows:
    Percentile customized = new Percentile(quantile)
        .withEstimationType(estimationType)
        .withNaNStrategy(nanStrategy)
        .withKthSelector(newKthSelector);
If any of the withXxx methods is omitted, the default value for the corresponding customization parameter will be used.
- Parameters:
  newKthSelector - KthSelector for the new instance
- Returns:
  a new instance, with changed KthSelector
- Throws:
  NullArgumentException - when newKthSelector is null