Class OneWayAnova
Tests for differences between two or more categories of univariate data
(for example, the body mass index of accountants, lawyers, doctors and
computer programmers). When two categories are given, this is equivalent to
the TTest.
Uses the Hipparchus F Distribution implementation to estimate exact p-values.
This implementation is based on a description at One way Anova (dead link).
Abbreviations: bg = between groups, wg = within groups, ss = sum of squared deviations.
-
Constructor Summary
Constructors
- OneWayAnova(): Empty constructor.
Method Summary
Modifier and Type, Method, Description:
- double anovaFValue(Collection<double[]> categoryData): Computes the ANOVA F-value for a collection of double[] arrays.
- double anovaPValue(Collection<double[]> categoryData): Computes the ANOVA P-value for a collection of double[] arrays.
- double anovaPValue(Collection<StreamingStatistics> categoryData, boolean allowOneElementData): Computes the ANOVA P-value for a collection of StreamingStatistics.
- boolean anovaTest(Collection<double[]> categoryData, double alpha): Performs an ANOVA test, evaluating the null hypothesis that there is no difference among the means of the data categories.
-
Constructor Details
-
OneWayAnova
public OneWayAnova()
Empty constructor.
This constructor is not strictly necessary, but it prevents spurious javadoc warnings with JDK 18 and later.
- Since:
- 3.0
-
-
Method Details
-
anovaFValue
public double anovaFValue(Collection<double[]> categoryData) throws MathIllegalArgumentException, NullArgumentException
Computes the ANOVA F-value for a collection of double[] arrays.
Preconditions:
- The categoryData Collection must contain double[] arrays.
- There must be at least two double[] arrays in the categoryData collection and each of these arrays must contain at least two values.
This implementation computes the F statistic using the definitional formula
F = msbg / mswg
where
msbg = between group mean square
mswg = within group mean square
are as defined here
- Parameters:
categoryData - Collection of double[] arrays, each containing data for one category
- Returns:
- F value
- Throws:
NullArgumentException - if categoryData is null
MathIllegalArgumentException - if the categoryData collection contains fewer than two arrays, or a contained double[] array does not have at least two values
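The ratio F = msbg / mswg can be illustrated with a minimal plain-Java sketch. It assumes the standard one-way ANOVA definitions, msbg = ssbg / (k - 1) and mswg = sswg / (N - k), for k categories and N observations in total. The class name AnovaFSketch and the method fValue are hypothetical, the sketch performs no input validation, and it is not the Hipparchus source.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class AnovaFSketch {

    /** Returns msbg / mswg for the given categories (illustrative only, no validation). */
    static double fValue(Collection<double[]> categoryData) {
        int k = categoryData.size();   // number of categories
        int n = 0;                     // total number of observations
        double total = 0.0;
        for (double[] category : categoryData) {
            n += category.length;
            for (double x : category) {
                total += x;
            }
        }
        double grandMean = total / n;

        double ssbg = 0.0;  // between-groups sum of squared deviations
        double sswg = 0.0;  // within-groups sum of squared deviations
        for (double[] category : categoryData) {
            double sum = 0.0;
            for (double x : category) {
                sum += x;
            }
            double mean = sum / category.length;
            ssbg += category.length * (mean - grandMean) * (mean - grandMean);
            for (double x : category) {
                sswg += (x - mean) * (x - mean);
            }
        }
        double msbg = ssbg / (k - 1);  // between-groups mean square
        double mswg = sswg / (n - k);  // within-groups mean square
        return msbg / mswg;
    }

    public static void main(String[] args) {
        List<double[]> data = Arrays.asList(
                new double[] {1, 2, 3},
                new double[] {4, 5, 6},
                new double[] {7, 8, 9});
        // ssbg = 54, sswg = 6, so msbg = 27 and mswg = 1
        System.out.println(fValue(data)); // prints 27.0
    }
}
```

The sample data are made up for the illustration; with equal group sizes and unit spacing every intermediate value is an exact integer, which makes the arithmetic easy to check by hand.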
-
anovaPValue
public double anovaPValue(Collection<double[]> categoryData) throws MathIllegalArgumentException, NullArgumentException, MathIllegalStateException
Computes the ANOVA P-value for a collection of double[] arrays.
Preconditions:
- The categoryData Collection must contain double[] arrays.
- There must be at least two double[] arrays in the categoryData collection and each of these arrays must contain at least two values.
This implementation uses the Hipparchus F Distribution implementation to estimate the exact p-value, using the formula
p = 1 - cumulativeProbability(F)
where F is the F value and cumulativeProbability is the Hipparchus implementation of the F distribution.
- Parameters:
categoryData - Collection of double[] arrays, each containing data for one category
- Returns:
- P value
- Throws:
NullArgumentException - if categoryData is null
MathIllegalArgumentException - if the categoryData collection contains fewer than two arrays, or a contained double[] array does not have at least two values
MathIllegalStateException - if the p-value cannot be computed due to a convergence error
MathIllegalStateException - if the maximum number of iterations is exceeded
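To make the formula p = 1 - cumulativeProbability(F) concrete without depending on the library, the sketch below covers one special case only. With k categories and N observations the F distribution has k - 1 numerator and N - k denominator degrees of freedom; when there are exactly three categories (numerator degrees of freedom 2), the upper tail has the closed form (1 + 2F / dfwg)^(-dfwg / 2). The names AnovaPValueSketch and pValueThreeGroups are hypothetical, and this is not how Hipparchus evaluates the distribution in general.

```java
class AnovaPValueSketch {

    /**
     * Upper-tail probability 1 - CDF(f) of an F distribution with 2 numerator
     * and dfwg denominator degrees of freedom. Valid only for three categories.
     */
    static double pValueThreeGroups(double f, int dfwg) {
        return Math.pow(1.0 + 2.0 * f / dfwg, -dfwg / 2.0);
    }

    public static void main(String[] args) {
        // Assumed example: F = 27 from three categories of three values each,
        // so dfwg = 9 - 3 = 6 and p = (1 + 9)^(-3), about 0.001.
        System.out.println(pValueThreeGroups(27.0, 6));
    }
}
```

For any other number of categories the tail probability has no such simple form, and a general F distribution implementation is needed.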
-
anovaPValue
public double anovaPValue(Collection<StreamingStatistics> categoryData, boolean allowOneElementData) throws MathIllegalArgumentException, NullArgumentException, MathIllegalStateException
Computes the ANOVA P-value for a collection of StreamingStatistics.
Preconditions:
- The categoryData Collection must contain StreamingStatistics.
- There must be at least two StreamingStatistics in the categoryData collection and each of these statistics must contain at least two values.
This implementation uses the Hipparchus F Distribution implementation to estimate the exact p-value, using the formula
p = 1 - cumulativeProbability(F)
where F is the F value and cumulativeProbability is the Hipparchus implementation of the F distribution.
- Parameters:
categoryData - Collection of StreamingStatistics, each containing data for one category
allowOneElementData - if true, allow computation for one category only or for one data element per category
- Returns:
- P value
- Throws:
NullArgumentException - if categoryData is null
MathIllegalArgumentException - if the categoryData collection contains fewer than two StreamingStatistics, or a contained StreamingStatistics does not have at least two values
MathIllegalStateException - if the p-value cannot be computed due to a convergence error
MathIllegalStateException - if the maximum number of iterations is exceeded
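Working from summary statistics rather than raw arrays is possible because the F statistic needs only each category's count, mean and variance. The sketch below shows that arithmetic in plain Java, assuming the variance is the bias-corrected sample variance (divisor n - 1). The class AnovaSummarySketch and its parallel-array signature are hypothetical stand-ins for illustration, not the StreamingStatistics API.

```java
class AnovaSummarySketch {

    /**
     * F statistic from per-category summaries (illustrative only, no validation).
     * n[i], mean[i] and variance[i] describe category i; variance[i] is assumed
     * to be the sample variance with divisor n[i] - 1.
     */
    static double fValue(long[] n, double[] mean, double[] variance) {
        int k = n.length;
        long total = 0;
        double weightedSum = 0.0;
        for (int i = 0; i < k; i++) {
            total += n[i];
            weightedSum += n[i] * mean[i];
        }
        double grandMean = weightedSum / total;

        double ssbg = 0.0;  // between-groups sum of squared deviations
        double sswg = 0.0;  // within-groups sum of squared deviations
        for (int i = 0; i < k; i++) {
            double d = mean[i] - grandMean;
            ssbg += n[i] * d * d;
            sswg += (n[i] - 1) * variance[i];
        }
        return (ssbg / (k - 1)) / (sswg / (total - k));
    }

    public static void main(String[] args) {
        // Summaries of the made-up categories {1,2,3}, {4,5,6}, {7,8,9}
        long[] n = {3, 3, 3};
        double[] mean = {2.0, 5.0, 8.0};
        double[] variance = {1.0, 1.0, 1.0};
        System.out.println(fValue(n, mean, variance)); // prints 27.0
    }
}
```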
-
anovaTest
public boolean anovaTest(Collection<double[]> categoryData, double alpha) throws MathIllegalArgumentException, NullArgumentException, MathIllegalStateException
Performs an ANOVA test, evaluating the null hypothesis that there is no difference among the means of the data categories.
Preconditions:
- The categoryData Collection must contain double[] arrays.
- There must be at least two double[] arrays in the categoryData collection and each of these arrays must contain at least two values.
- alpha must be strictly greater than 0 and less than or equal to 0.5.
This implementation uses the Hipparchus F Distribution implementation to estimate the exact p-value, using the formula
p = 1 - cumulativeProbability(F)
where F is the F value and cumulativeProbability is the Hipparchus implementation of the F distribution.
True is returned iff the estimated p-value is less than alpha.
- Parameters:
categoryData - Collection of double[] arrays, each containing data for one category
alpha - significance level of the test
- Returns:
- true if the null hypothesis can be rejected with confidence 1 - alpha
- Throws:
NullArgumentException - if categoryData is null
MathIllegalArgumentException - if the categoryData collection contains fewer than two arrays, or a contained double[] array does not have at least two values
MathIllegalArgumentException - if alpha is not in the range (0, 0.5]
MathIllegalStateException - if the p-value cannot be computed due to a convergence error
MathIllegalStateException - if the maximum number of iterations is exceeded
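The decision rule described above (alpha restricted to (0, 0.5], rejection iff p < alpha) can be sketched in a few lines of plain Java. The class AnovaDecisionSketch and the method rejectNull are hypothetical, the p-value is taken as an input rather than computed, and IllegalArgumentException stands in for the library's MathIllegalArgumentException.

```java
class AnovaDecisionSketch {

    /** Returns true iff pValue < alpha, after checking that alpha lies in (0, 0.5]. */
    static boolean rejectNull(double pValue, double alpha) {
        if (!(alpha > 0.0 && alpha <= 0.5)) {
            throw new IllegalArgumentException("alpha must be in (0, 0.5]: " + alpha);
        }
        return pValue < alpha;
    }

    public static void main(String[] args) {
        System.out.println(rejectNull(0.001, 0.05)); // prints true
        System.out.println(rejectNull(0.2, 0.05));   // prints false
    }
}
```

Writing the range check as a negated conjunction also rejects NaN, since every comparison with NaN is false.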
-