Class ZipfDistribution
java.lang.Object
org.hipparchus.distribution.discrete.AbstractIntegerDistribution
org.hipparchus.distribution.discrete.ZipfDistribution
All Implemented Interfaces:
Serializable, IntegerDistribution
Implementation of the Zipf distribution.
Parameters:
For a random variable X whose values are distributed according to this distribution, the probability mass function is given by
P(X = k) = 1 / (k^s * H(N,s)) for k = 1,2,...,N.
H(N,s) is the generalized harmonic number of order N of s, that is, the sum of 1 / j^s for j = 1,...,N. Dividing by it normalizes the probabilities so that they sum to 1.
N is the number of elements
s is the exponent
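The definition above can be checked numerically with a few lines of plain Java. The following is a minimal sketch, not the Hipparchus implementation: the class `ZipfPmfSketch` and its methods are hypothetical names introduced for illustration, and the PMF is taken to be (1 / k^s) divided by the generalized harmonic number H(N,s). For N = 3 and s = 1, H(N,s) = 11/6, so the three probabilities are 6/11, 3/11 and 2/11.

```java
// Illustrative sketch only; this is not the Hipparchus source code.
final class ZipfPmfSketch {

    private ZipfPmfSketch() { }

    /** Generalized harmonic number H(n, m): the sum of 1 / j^m for j = 1..n. */
    static double generalizedHarmonic(int n, double m) {
        double sum = 0.0;
        for (int j = 1; j <= n; j++) {
            sum += 1.0 / Math.pow(j, m);
        }
        return sum;
    }

    /** P(X = k) for k in 1..n; 0 outside the support. */
    static double pmf(int k, int n, double s) {
        if (k < 1 || k > n) {
            return 0.0;
        }
        return (1.0 / Math.pow(k, s)) / generalizedHarmonic(n, s);
    }

    public static void main(String[] args) {
        int n = 3;
        double s = 1.0;
        double total = 0.0;
        for (int k = 1; k <= n; k++) {
            double p = pmf(k, n, s);
            total += p;
            System.out.println("P(X = " + k + ") = " + p);
        }
        // The probabilities over the whole support should add up to 1.
        System.out.println("sum = " + total);
    }
}
```

The helper recomputes H(N,s) on every call to keep the sketch short; code that evaluates the PMF repeatedly would normally compute it once and cache it.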
-
Constructor Summary
Constructor - Description
ZipfDistribution(int numberOfElements, double exponent) - Create a new Zipf distribution with the given number of elements and exponent.
Method Summary
Modifier and Type, Method - Description
protected double calculateNumericalMean() - Used by getNumericalMean().
protected double calculateNumericalVariance() - Used by getNumericalVariance().
double cumulativeProbability(int x) - For a random variable X whose values are distributed according to this distribution, this method returns P(X <= x).
double getExponent() - Get the exponent characterizing the distribution.
int getNumberOfElements() - Get the number of elements (e.g. corpus size) for the distribution.
double getNumericalMean() - Use this method to get the numerical value of the mean of this distribution.
double getNumericalVariance() - Use this method to get the numerical value of the variance of this distribution.
int getSupportLowerBound() - Access the lower bound of the support.
int getSupportUpperBound() - Access the upper bound of the support.
boolean isSupportConnected() - Use this method to get information about whether the support is connected, i.e. whether all integers between the lower and upper bound of the support are included in the support.
double logProbability(int x) - For a random variable X whose values are distributed according to this distribution, this method returns log(P(X = x)), where log is the natural logarithm.
double probability(int x) - For a random variable X whose values are distributed according to this distribution, this method returns P(X = x).
Methods inherited from class org.hipparchus.distribution.discrete.AbstractIntegerDistribution
inverseCumulativeProbability, probability, solveInverseCumulativeProbability
-
Constructor Details
-
ZipfDistribution
Create a new Zipf distribution with the given number of elements and exponent.
Parameters:
numberOfElements - Number of elements.
exponent - Exponent.
Throws:
MathIllegalArgumentException - if numberOfElements <= 0 or exponent <= 0.
-
Method Details
-
getNumberOfElements
public int getNumberOfElements()
Get the number of elements (e.g. corpus size) for the distribution.
Returns:
the number of elements
-
getExponent
public double getExponent()
Get the exponent characterizing the distribution.
Returns:
the exponent
-
probability
public double probability(int x)
For a random variable X whose values are distributed according to this distribution, this method returns P(X = x). In other words, this method represents the probability mass function (PMF) for the distribution.
Parameters:
x - the point at which the PMF is evaluated
Returns:
the value of the probability mass function at x
-
logProbability
public double logProbability(int x)
For a random variable X whose values are distributed according to this distribution, this method returns log(P(X = x)), where log is the natural logarithm. In other words, this method represents the logarithm of the probability mass function (PMF) for the distribution. Note that, because of floating-point precision and under/overflow issues, this method will for some distributions be more precise and faster than computing the logarithm of IntegerDistribution.probability(int).
The default implementation simply computes the logarithm of probability(x).
Specified by:
logProbability in interface IntegerDistribution
Overrides:
logProbability in class AbstractIntegerDistribution
Parameters:
x - the point at which the PMF is evaluated
Returns:
the logarithm of the value of the probability mass function at x
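The underflow issue mentioned for logProbability can be illustrated with a small standalone sketch. It assumes the PMF (1 / k^s) / H(N,s), so the log-PMF can be written as -s * log(k) - log(H(N,s)) without ever forming the tiny probability itself. The class `ZipfLogPmfSketch` and its methods are hypothetical names, and this is not necessarily how Hipparchus computes the value.

```java
// Illustrative sketch only; this is not the Hipparchus source code.
final class ZipfLogPmfSketch {

    private ZipfLogPmfSketch() { }

    /** Generalized harmonic number H(n, m): the sum of 1 / j^m for j = 1..n. */
    static double generalizedHarmonic(int n, double m) {
        double sum = 0.0;
        for (int j = 1; j <= n; j++) {
            sum += 1.0 / Math.pow(j, m);
        }
        return sum;
    }

    /** P(X = k), computed directly; may underflow to 0 for large s. */
    static double pmf(int k, int n, double s) {
        if (k < 1 || k > n) {
            return 0.0;
        }
        return (1.0 / Math.pow(k, s)) / generalizedHarmonic(n, s);
    }

    /** log(P(X = k)), computed in log space to avoid underflow of k^(-s). */
    static double logPmf(int k, int n, double s) {
        if (k < 1 || k > n) {
            return Double.NEGATIVE_INFINITY;
        }
        return -s * Math.log(k) - Math.log(generalizedHarmonic(n, s));
    }

    public static void main(String[] args) {
        int n = 10;
        double s = 500.0;
        // 10^500 overflows a double, so the direct PMF underflows to 0.
        System.out.println("log of direct PMF: " + Math.log(pmf(10, n, s)));
        // The log-space form stays finite.
        System.out.println("log-space PMF:     " + logPmf(10, n, s));
    }
}
```

With N = 10 and s = 500, the direct route yields negative infinity because the probability underflows to zero, while the log-space form returns a finite value close to -500 * ln(10).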
-
cumulativeProbability
public double cumulativeProbability(int x)
For a random variable X whose values are distributed according to this distribution, this method returns P(X <= x). In other words, this method represents the (cumulative) distribution function (CDF) for this distribution.
Parameters:
x - the point at which the CDF is evaluated
Returns:
the probability that a random variable with this distribution takes a value less than or equal to x
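Because the PMF is proportional to 1 / k^s, summing it up to x gives a ratio of two generalized harmonic numbers: P(X <= x) = H(x,s) / H(N,s) for 1 <= x < N. The sketch below shows this relationship under that assumption; `ZipfCdfSketch` and its methods are hypothetical names and do not come from the library.

```java
// Illustrative sketch only; this is not the Hipparchus source code.
final class ZipfCdfSketch {

    private ZipfCdfSketch() { }

    /** Generalized harmonic number H(n, m): the sum of 1 / j^m for j = 1..n. */
    static double generalizedHarmonic(int n, double m) {
        double sum = 0.0;
        for (int j = 1; j <= n; j++) {
            sum += 1.0 / Math.pow(j, m);
        }
        return sum;
    }

    /** P(X <= x) for a Zipf distribution with n elements and exponent s. */
    static double cdf(int x, int n, double s) {
        if (x < 1) {
            return 0.0; // below the support
        }
        if (x >= n) {
            return 1.0; // at or above the upper bound of the support
        }
        return generalizedHarmonic(x, s) / generalizedHarmonic(n, s);
    }

    public static void main(String[] args) {
        int n = 3;
        double s = 1.0;
        for (int x = 0; x <= n + 1; x++) {
            System.out.println("P(X <= " + x + ") = " + cdf(x, n, s));
        }
    }
}
```

For N = 3 and s = 1 the value at x = 2 is (3/2) / (11/6) = 9/11, which matches the sum of the first two probabilities, 6/11 + 3/11.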
-
getNumericalMean
public double getNumericalMean()
Use this method to get the numerical value of the mean of this distribution. For number of elements N and exponent s, the mean is Hs1 / Hs, where Hs1 = generalizedHarmonic(N, s - 1), Hs = generalizedHarmonic(N, s).
Returns:
the mean or Double.NaN if it is not defined
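The mean formula Hs1 / Hs follows from summing k * (1 / k^s) / Hs over k = 1..N, since k / k^s = 1 / k^(s - 1). The sketch below compares the closed form with that direct sum. `ZipfMeanSketch` and its methods are hypothetical names, and `generalizedHarmonic` here is a local helper written for this example, assumed to compute the sum of 1 / j^m for j = 1..n.

```java
// Illustrative sketch only; this is not the Hipparchus source code.
final class ZipfMeanSketch {

    private ZipfMeanSketch() { }

    /** Generalized harmonic number H(n, m): the sum of 1 / j^m for j = 1..n. */
    static double generalizedHarmonic(int n, double m) {
        double sum = 0.0;
        for (int j = 1; j <= n; j++) {
            sum += 1.0 / Math.pow(j, m);
        }
        return sum;
    }

    /** Mean via the closed form Hs1 / Hs. */
    static double mean(int n, double s) {
        double hs1 = generalizedHarmonic(n, s - 1.0);
        double hs = generalizedHarmonic(n, s);
        return hs1 / hs;
    }

    /** Mean via the definition: the sum of k * P(X = k) over the support. */
    static double meanByDefinition(int n, double s) {
        double hs = generalizedHarmonic(n, s);
        double sum = 0.0;
        for (int k = 1; k <= n; k++) {
            sum += k * (1.0 / Math.pow(k, s)) / hs;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("closed form:   " + mean(3, 1.0));
        System.out.println("by definition: " + meanByDefinition(3, 1.0));
    }
}
```

For N = 3 and s = 1, Hs1 = 3 and Hs = 11/6, so both routes give 18/11.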
-
calculateNumericalMean
protected double calculateNumericalMean()
Used by getNumericalMean().
Returns:
the mean of this distribution
-
getNumericalVariance
public double getNumericalVariance()
Use this method to get the numerical value of the variance of this distribution. For number of elements N and exponent s, the variance is (Hs2 / Hs) - (Hs1^2 / Hs^2), where Hs2 = generalizedHarmonic(N, s - 2), Hs1 = generalizedHarmonic(N, s - 1), Hs = generalizedHarmonic(N, s).
Returns:
the variance (possibly Double.POSITIVE_INFINITY or Double.NaN if it is not defined)
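The expression (Hs2 / Hs) - (Hs1^2 / Hs^2) is the usual identity Var(X) = E[X^2] - (E[X])^2, with E[X^2] = Hs2 / Hs and E[X] = Hs1 / Hs. A minimal sketch of that computation follows; `ZipfVarianceSketch` and its methods are hypothetical names, and `generalizedHarmonic` is a local helper assumed to compute the sum of 1 / j^m for j = 1..n.

```java
// Illustrative sketch only; this is not the Hipparchus source code.
final class ZipfVarianceSketch {

    private ZipfVarianceSketch() { }

    /** Generalized harmonic number H(n, m): the sum of 1 / j^m for j = 1..n. */
    static double generalizedHarmonic(int n, double m) {
        double sum = 0.0;
        for (int j = 1; j <= n; j++) {
            sum += 1.0 / Math.pow(j, m);
        }
        return sum;
    }

    /** Variance as E[X^2] - (E[X])^2, expressed with harmonic numbers. */
    static double variance(int n, double s) {
        double hs2 = generalizedHarmonic(n, s - 2.0);
        double hs1 = generalizedHarmonic(n, s - 1.0);
        double hs = generalizedHarmonic(n, s);
        return (hs2 / hs) - (hs1 * hs1) / (hs * hs);
    }

    public static void main(String[] args) {
        System.out.println("variance for N = 3, s = 1: " + variance(3, 1.0));
    }
}
```

For N = 3 and s = 1, Hs2 = 6, Hs1 = 3 and Hs = 11/6, which gives 36/11 - (18/11)^2 = 72/121.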
-
calculateNumericalVariance
protected double calculateNumericalVariance()
Used by getNumericalVariance().
Returns:
the variance of this distribution
-
getSupportLowerBound
public int getSupportLowerBound()
Access the lower bound of the support. This method must return the same value as inverseCumulativeProbability(0). In other words, this method must return inf {x in Z | P(X <= x) > 0}.
The lower bound of the support is always 1 no matter the parameters.
Returns:
lower bound of the support (always 1)
-
getSupportUpperBound
public int getSupportUpperBound()
Access the upper bound of the support. This method must return the same value as inverseCumulativeProbability(1). In other words, this method must return inf {x in R | P(X <= x) = 1}.
The upper bound of the support is the number of elements.
Returns:
upper bound of the support
-
isSupportConnected
public boolean isSupportConnected()
Use this method to get information about whether the support is connected, i.e. whether all integers between the lower and upper bound of the support are included in the support. The support of this distribution is connected.
Returns:
true
-