Class ZipfDistribution
- java.lang.Object
-
- org.hipparchus.distribution.discrete.AbstractIntegerDistribution
-
- org.hipparchus.distribution.discrete.ZipfDistribution
-
- All Implemented Interfaces:
Serializable, IntegerDistribution
public class ZipfDistribution extends AbstractIntegerDistribution
Implementation of the Zipf distribution.
Parameters: For a random variable X whose values are distributed according to this distribution, the probability mass function is given by
P(X = k) = 1 / (k^s * H(N,s)) for k = 1,2,...,N.
H(N,s) is the generalized harmonic number of order N of s, H(N,s) = 1/1^s + 1/2^s + ... + 1/N^s, which normalizes the probabilities so that they sum to 1.
- N is the number of elements
- s is the exponent
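The formula above can be sketched in plain Java. This is a minimal illustration of the mathematics only, not the Hipparchus source; the class name `ZipfPmfSketch` and the helpers `generalizedHarmonic` and `pmf` are hypothetical names introduced for this example.

```java
// Illustrative sketch only; names are hypothetical and this is not the library's code.
final class ZipfPmfSketch {

    /** Generalized harmonic number H(n, s) = sum over j = 1..n of j^(-s). */
    static double generalizedHarmonic(int n, double s) {
        double sum = 0.0;
        for (int j = 1; j <= n; j++) {
            sum += Math.pow(j, -s);
        }
        return sum;
    }

    /** P(X = k) for k in 1..n, and 0 outside the support. */
    static double pmf(int k, int n, double s) {
        if (k < 1 || k > n) {
            return 0.0;
        }
        return 1.0 / (Math.pow(k, s) * generalizedHarmonic(n, s));
    }

    public static void main(String[] args) {
        int n = 3;
        double s = 1.0;
        double total = 0.0;
        for (int k = 1; k <= n; k++) {
            double p = pmf(k, n, s);
            total += p;
            System.out.println("P(X = " + k + ") = " + p);
        }
        // The probabilities should add up to 1, up to rounding.
        System.out.println("sum = " + total);
    }
}
```

For N = 3 and s = 1, H(3,1) = 1 + 1/2 + 1/3 = 11/6, so P(X = 1) = 6/11.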
-
-
Constructor Summary
ZipfDistribution(int numberOfElements, double exponent)
Create a new Zipf distribution with the given number of elements and exponent.
-
Method Summary
protected double calculateNumericalMean()
Used by getNumericalMean().
protected double calculateNumericalVariance()
Used by getNumericalVariance().
double cumulativeProbability(int x)
For a random variable X whose values are distributed according to this distribution, this method returns P(X <= x).
double getExponent()
Get the exponent characterizing the distribution.
int getNumberOfElements()
Get the number of elements (e.g. corpus size) for the distribution.
double getNumericalMean()
Use this method to get the numerical value of the mean of this distribution.
double getNumericalVariance()
Use this method to get the numerical value of the variance of this distribution.
int getSupportLowerBound()
Access the lower bound of the support.
int getSupportUpperBound()
Access the upper bound of the support.
boolean isSupportConnected()
Use this method to get information about whether the support is connected, i.e. whether all integers between the lower and upper bound of the support are included in the support.
double logProbability(int x)
For a random variable X whose values are distributed according to this distribution, this method returns log(P(X = x)), where log is the natural logarithm.
double probability(int x)
For a random variable X whose values are distributed according to this distribution, this method returns P(X = x).
-
Methods inherited from class org.hipparchus.distribution.discrete.AbstractIntegerDistribution
inverseCumulativeProbability, probability, solveInverseCumulativeProbability
-
-
-
-
Constructor Detail
-
ZipfDistribution
public ZipfDistribution(int numberOfElements, double exponent) throws MathIllegalArgumentException
Create a new Zipf distribution with the given number of elements and exponent.
- Parameters:
numberOfElements - Number of elements.
exponent - Exponent.
- Throws:
MathIllegalArgumentException - if numberOfElements <= 0 or exponent <= 0.
-
-
Method Detail
-
getNumberOfElements
public int getNumberOfElements()
Get the number of elements (e.g. corpus size) for the distribution.
- Returns:
- the number of elements
-
getExponent
public double getExponent()
Get the exponent characterizing the distribution.
- Returns:
- the exponent
-
probability
public double probability(int x)
For a random variable X whose values are distributed according to this distribution, this method returns P(X = x). In other words, this method represents the probability mass function (PMF) for the distribution.
- Parameters:
x - the point at which the PMF is evaluated
- Returns:
- the value of the probability mass function at x
-
logProbability
public double logProbability(int x)
For a random variable X whose values are distributed according to this distribution, this method returns log(P(X = x)), where log is the natural logarithm. In other words, this method represents the logarithm of the probability mass function (PMF) for the distribution. Note that due to floating point precision and under/overflow issues, this method will for some distributions be more precise and faster than computing the logarithm of IntegerDistribution.probability(int).
The default implementation simply computes the logarithm of probability(x).
- Specified by:
logProbability in interface IntegerDistribution
- Overrides:
logProbability in class AbstractIntegerDistribution
- Parameters:
x - the point at which the PMF is evaluated
- Returns:
- the logarithm of the value of the probability mass function at x
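The precision remark above can be illustrated with a small plain-Java sketch. It assumes the PMF P(X = k) = 1 / (k^s * H(N,s)) and compares taking the logarithm of the PMF with evaluating -s * ln(k) - ln(H(N,s)) directly. The class and method names are hypothetical, and this is not a claim about how the library implements the override.

```java
// Illustrative sketch only; names are hypothetical and this is not the library's code.
final class ZipfLogPmfSketch {

    /** Generalized harmonic number H(n, s) = sum over j = 1..n of j^(-s). */
    static double generalizedHarmonic(int n, double s) {
        double sum = 0.0;
        for (int j = 1; j <= n; j++) {
            sum += Math.pow(j, -s);
        }
        return sum;
    }

    /** Naive route for k in 1..n: evaluate the PMF first, then take its logarithm. */
    static double logOfPmf(int k, int n, double s) {
        double p = 1.0 / (Math.pow(k, s) * generalizedHarmonic(n, s));
        return Math.log(p);
    }

    /** Direct route: log P(X = k) = -s * ln(k) - ln(H(n, s)). */
    static double logPmf(int k, int n, double s) {
        if (k < 1 || k > n) {
            return Double.NEGATIVE_INFINITY;
        }
        return -s * Math.log(k) - Math.log(generalizedHarmonic(n, s));
    }

    public static void main(String[] args) {
        int n = 1000;
        double s = 200.0;
        int k = 1000;
        // 1000^200 exceeds the double range, so the PMF underflows to 0.
        System.out.println("log of PMF  = " + logOfPmf(k, n, s));
        System.out.println("direct form = " + logPmf(k, n, s));
    }
}
```

With N = 1000, s = 200 and k = 1000, the naive route yields negative infinity because the PMF underflows to zero, while the direct form stays finite.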
-
cumulativeProbability
public double cumulativeProbability(int x)
For a random variable X whose values are distributed according to this distribution, this method returns P(X <= x). In other words, this method represents the (cumulative) distribution function (CDF) for this distribution.
- Parameters:
x - the point at which the CDF is evaluated
- Returns:
- the probability that a random variable with this distribution takes a value less than or equal to x
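Because the PMF is proportional to 1 / k^s on 1..N, the CDF at x is the ratio of two generalized harmonic numbers, H(x,s) / H(N,s). The following plain-Java sketch shows that relationship; the names `ZipfCdfSketch`, `generalizedHarmonic` and `cdf` are hypothetical, and the code is not taken from the library.

```java
// Illustrative sketch only; names are hypothetical and this is not the library's code.
final class ZipfCdfSketch {

    /** Generalized harmonic number H(n, s) = sum over j = 1..n of j^(-s). */
    static double generalizedHarmonic(int n, double s) {
        double sum = 0.0;
        for (int j = 1; j <= n; j++) {
            sum += Math.pow(j, -s);
        }
        return sum;
    }

    /** P(X <= x): 0 below the support, 1 at or above n, H(x, s) / H(n, s) in between. */
    static double cdf(int x, int n, double s) {
        if (x <= 0) {
            return 0.0;
        }
        if (x >= n) {
            return 1.0;
        }
        return generalizedHarmonic(x, s) / generalizedHarmonic(n, s);
    }

    public static void main(String[] args) {
        int n = 3;
        double s = 1.0;
        for (int x = 0; x <= n + 1; x++) {
            System.out.println("P(X <= " + x + ") = " + cdf(x, n, s));
        }
    }
}
```

For N = 3 and s = 1, P(X <= 2) = (1 + 1/2) / (11/6) = 9/11.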
-
getNumericalMean
public double getNumericalMean()
Use this method to get the numerical value of the mean of this distribution. For number of elements N and exponent s, the mean is Hs1 / Hs, where Hs1 = generalizedHarmonic(N, s - 1) and Hs = generalizedHarmonic(N, s).
- Returns:
- the mean or Double.NaN if it is not defined
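The mean formula can be checked against the definition E[X] = sum of k * P(X = k). The sketch below is plain Java written for this illustration; `generalizedHarmonic` here is a hypothetical stand-in with the same meaning as in the text, not the library's own helper.

```java
// Illustrative sketch only; names are hypothetical and this is not the library's code.
final class ZipfMeanSketch {

    /** Generalized harmonic number H(n, m) = sum over j = 1..n of j^(-m); m may be zero or negative. */
    static double generalizedHarmonic(int n, double m) {
        double sum = 0.0;
        for (int j = 1; j <= n; j++) {
            sum += Math.pow(j, -m);
        }
        return sum;
    }

    /** Closed form from the text: Hs1 / Hs. */
    static double mean(int n, double s) {
        double hs1 = generalizedHarmonic(n, s - 1.0);
        double hs = generalizedHarmonic(n, s);
        return hs1 / hs;
    }

    /** Definition: sum over k = 1..n of k * P(X = k). */
    static double meanByDefinition(int n, double s) {
        double hs = generalizedHarmonic(n, s);
        double sum = 0.0;
        for (int k = 1; k <= n; k++) {
            sum += k * (Math.pow(k, -s) / hs);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("closed form   = " + mean(5, 1.5));
        System.out.println("by definition = " + meanByDefinition(5, 1.5));
    }
}
```

For N = 3 and s = 1, Hs1 = H(3,0) = 3 and Hs = 11/6, so the mean is 18/11.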
-
calculateNumericalMean
protected double calculateNumericalMean()
Used by getNumericalMean().
- Returns:
- the mean of this distribution
-
getNumericalVariance
public double getNumericalVariance()
Use this method to get the numerical value of the variance of this distribution. For number of elements N and exponent s, the variance is (Hs2 / Hs) - (Hs1^2 / Hs^2), where Hs2 = generalizedHarmonic(N, s - 2), Hs1 = generalizedHarmonic(N, s - 1) and Hs = generalizedHarmonic(N, s).
- Returns:
- the variance (possibly Double.POSITIVE_INFINITY or Double.NaN if it is not defined)
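The variance formula is E[X^2] - E[X]^2 written with generalized harmonic numbers: Hs2 / Hs is the second moment and Hs1 / Hs is the mean. A minimal plain-Java sketch, with hypothetical names and no claim to match the library's code, compares it with the central-moment definition:

```java
// Illustrative sketch only; names are hypothetical and this is not the library's code.
final class ZipfVarianceSketch {

    /** Generalized harmonic number H(n, m) = sum over j = 1..n of j^(-m); m may be zero or negative. */
    static double generalizedHarmonic(int n, double m) {
        double sum = 0.0;
        for (int j = 1; j <= n; j++) {
            sum += Math.pow(j, -m);
        }
        return sum;
    }

    /** Closed form from the text: (Hs2 / Hs) - (Hs1^2 / Hs^2). */
    static double variance(int n, double s) {
        double hs2 = generalizedHarmonic(n, s - 2.0);
        double hs1 = generalizedHarmonic(n, s - 1.0);
        double hs = generalizedHarmonic(n, s);
        return (hs2 / hs) - ((hs1 * hs1) / (hs * hs));
    }

    /** Definition: sum over k = 1..n of (k - mean)^2 * P(X = k). */
    static double varianceByDefinition(int n, double s) {
        double hs = generalizedHarmonic(n, s);
        double mean = 0.0;
        for (int k = 1; k <= n; k++) {
            mean += k * (Math.pow(k, -s) / hs);
        }
        double sum = 0.0;
        for (int k = 1; k <= n; k++) {
            double d = k - mean;
            sum += d * d * (Math.pow(k, -s) / hs);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("closed form   = " + variance(5, 1.5));
        System.out.println("by definition = " + varianceByDefinition(5, 1.5));
    }
}
```

For N = 3 and s = 1, Hs2 = 6, Hs1 = 3 and Hs = 11/6, so the variance is 36/11 - 324/121 = 72/121.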
-
calculateNumericalVariance
protected double calculateNumericalVariance()
Used by getNumericalVariance().
- Returns:
- the variance of this distribution
-
getSupportLowerBound
public int getSupportLowerBound()
Access the lower bound of the support. This method must return the same value as inverseCumulativeProbability(0). In other words, this method must return inf {x in Z | P(X <= x) > 0}.
The lower bound of the support is always 1 no matter the parameters.
- Returns:
- lower bound of the support (always 1)
-
getSupportUpperBound
public int getSupportUpperBound()
Access the upper bound of the support. This method must return the same value as inverseCumulativeProbability(1). In other words, this method must return inf {x in R | P(X <= x) = 1}.
The upper bound of the support is the number of elements.
- Returns:
- upper bound of the support
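The two set definitions for the support bounds can be checked by brute force on a small case. The sketch below is plain Java with hypothetical names; it scans integers only and assumes the CDF H(x,s) / H(N,s). It is meant as a sanity check of the definitions, not as the way the library computes the bounds. For very large exponents, rounding could make the scan for the upper bound stop early.

```java
// Illustrative sketch only; names are hypothetical and this is not the library's code.
final class ZipfSupportSketch {

    /** Generalized harmonic number H(n, s) = sum over j = 1..n of j^(-s). */
    static double generalizedHarmonic(int n, double s) {
        double sum = 0.0;
        for (int j = 1; j <= n; j++) {
            sum += Math.pow(j, -s);
        }
        return sum;
    }

    /** P(X <= x) for the assumed CDF. */
    static double cdf(int x, int n, double s) {
        if (x <= 0) {
            return 0.0;
        }
        if (x >= n) {
            return 1.0;
        }
        return generalizedHarmonic(x, s) / generalizedHarmonic(n, s);
    }

    /** Smallest integer x with P(X <= x) > 0, found by scanning upward from 0. */
    static int lowerBound(int n, double s) {
        for (int x = 0; x <= n; x++) {
            if (cdf(x, n, s) > 0.0) {
                return x;
            }
        }
        throw new IllegalStateException("no x with positive cumulative probability");
    }

    /** Smallest integer x with P(X <= x) = 1, found by scanning upward from 0. */
    static int upperBound(int n, double s) {
        for (int x = 0; x <= n; x++) {
            if (cdf(x, n, s) >= 1.0) {
                return x;
            }
        }
        throw new IllegalStateException("cumulative probability never reaches 1");
    }

    public static void main(String[] args) {
        System.out.println("lower bound = " + lowerBound(5, 1.5));
        System.out.println("upper bound = " + upperBound(5, 1.5));
    }
}
```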
-
isSupportConnected
public boolean isSupportConnected()
Use this method to get information about whether the support is connected, i.e. whether all integers between the lower and upper bound of the support are included in the support. The support of this distribution is connected.
- Returns:
true
-
-