Class MannWhitneyUTest


  • public class MannWhitneyUTest
    extends Object
    An implementation of the Mann-Whitney U test.

    The definitions and computing formulas used in this implementation follow those in the article Mann-Whitney U Test.

    In general, results correspond to (and have been tested against) the R wilcox.test function, with exact meaning the same thing in both APIs and correct uniformly TRUE in this implementation. For example, wilcox.test(x, y, alternative = "two.sided", mu = 0, paired = FALSE, exact = FALSE, correct = TRUE) will return the same p-value as mannWhitneyUTest(x, y, false). The minimum of the W values returned by R for wilcox.test(x, y, ...) and wilcox.test(y, x, ...) should equal mannWhitneyU(x, y).

    • Constructor Detail

      • MannWhitneyUTest

        public MannWhitneyUTest()
        Create a test instance where NaNs are left in place and ties get the average of applicable ranks.
      • MannWhitneyUTest

        public MannWhitneyUTest​(NaNStrategy nanStrategy,
                                TiesStrategy tiesStrategy)
        Create a test instance using the given strategies for NaN's and ties.
        Parameters:
        nanStrategy - specifies the strategy that should be used for Double.NaN's
        tiesStrategy - specifies the strategy that should be used for ties
    • Method Detail

      • mannWhitneyU

        public double mannWhitneyU​(double[] x,
                                   double[] y)
                            throws MathIllegalArgumentException,
                                   NullArgumentException
        Computes the Mann-Whitney U statistic comparing means for two independent samples possibly of different lengths.

        This statistic can be used to perform a Mann-Whitney U test evaluating the null hypothesis that the two independent samples have equal mean.

        Let Xi denote the i'th individual of the first sample and Yj the j'th individual in the second sample. Note that the samples can have different lengths.

        Preconditions:

        • All observations in the two samples are independent.
        • The observations are at least ordinal (continuous data are also ordinal).
        Parameters:
        x - the first sample
        y - the second sample
        Returns:
        Mann-Whitney U statistic (minimum of Ux and Uy)
        Throws:
        NullArgumentException - if x or y are null.
        MathIllegalArgumentException - if x or y are zero-length.
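
        The returned value, the minimum of Ux and Uy, can be illustrated by direct pair counting. The following is a minimal sketch, not this class's implementation: the class and method names are hypothetical, standard Java exceptions stand in for NullArgumentException and MathIllegalArgumentException, and NaN values are not handled. It assumes ties are split evenly between the two samples, which matches average ranking.

```java
final class MannWhitneyUSketch {

    /**
     * Computes min(Ux, Uy) by direct pair counting.
     * Illustrative only; hypothetical helper, not part of the library.
     */
    static double uStatistic(double[] x, double[] y) {
        if (x == null || y == null) {
            throw new NullPointerException("samples must not be null");
        }
        if (x.length == 0 || y.length == 0) {
            throw new IllegalArgumentException("samples must not be empty");
        }
        double ux = 0.0;
        for (double xi : x) {
            for (double yj : y) {
                if (xi > yj) {
                    ux += 1.0;      // x wins the pair
                } else if (xi == yj) {
                    ux += 0.5;      // a tie is split evenly
                }
            }
        }
        // Every pair is credited to one side or split, so Ux + Uy = n1 * n2.
        double uy = (double) x.length * y.length - ux;
        return Math.min(ux, uy);
    }
}
```

        Pair counting costs time proportional to the product of the sample sizes; it is used here only because it makes the definition easy to read.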
      • mannWhitneyUTest

        public double mannWhitneyUTest​(double[] x,
                                       double[] y)
                                throws MathIllegalArgumentException,
                                       NullArgumentException
        Returns the observed significance level, or p-value, associated with a Mann-Whitney U Test comparing means for two independent samples. The p-value is exact or asymptotic depending on the data, as described below.

        Let Xi denote the i'th individual of the first sample and Yj the j'th individual in the second sample.

        Preconditions:

        • All observations in the two samples are independent.
        • The observations are at least ordinal.

        If there are no ties in the data and the combined dataset is small (50 values or fewer), an exact test is performed; otherwise the test uses the normal approximation (with continuity correction).

        If the combined dataset contains ties, the variance used in the normal approximation is bias-adjusted using the formula in the reference above.
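
        For orientation, the normal approximation with continuity correction and tie adjustment is usually written as follows. This is the standard textbook form and is assumed, not confirmed, to be the one given in the referenced article. The symbols are introduced here for illustration: n_1 and n_2 are the sample sizes, n = n_1 + n_2, t_k is the number of observations sharing the k-th tied value, and \Phi is the standard normal distribution function.

```latex
\mu_U = \frac{n_1 n_2}{2}, \qquad
\sigma_U^2 = \frac{n_1 n_2}{12}\left[(n + 1) - \sum_k \frac{t_k^3 - t_k}{n(n - 1)}\right]

z = \frac{\lvert U - \mu_U \rvert - \tfrac{1}{2}}{\sigma_U}, \qquad
p \approx 2\,\bigl(1 - \Phi(z)\bigr)
```

        With no ties every t_k equals 1, the sum vanishes, and the variance reduces to n_1 n_2 (n + 1) / 12.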

        Parameters:
        x - the first sample
        y - the second sample
        Returns:
        approximate 2-sided p-value
        Throws:
        NullArgumentException - if x or y are null.
        MathIllegalArgumentException - if x or y are zero-length.
      • mannWhitneyUTest

        public double mannWhitneyUTest​(double[] x,
                                       double[] y,
                                       boolean exact)
                                throws MathIllegalArgumentException,
                                       NullArgumentException
        Returns the observed significance level, or p-value, associated with a Mann-Whitney U Test comparing means for two independent samples. The p-value is exact or asymptotic according to the exact argument.

        Let Xi denote the i'th individual of the first sample and Yj the j'th individual in the second sample.

        Preconditions:

        • All observations in the two samples are independent.
        • The observations are at least ordinal.

        If exact is true, the p-value reported is exact, computed using the exact distribution of the U statistic. The computation in this case requires storage on the order of the product of the two sample sizes, so this should not be used for large samples.

        If exact is false, the normal approximation is used to estimate the p-value.

        If the combined dataset contains ties and exact is true, MathIllegalArgumentException is thrown. If exact is false and ties are present, the variance used in the normal approximation is bias-adjusted using the formula in the reference above.
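
        The exact distribution of U for tie-free data can be built with a simple recurrence on the sample sizes. The sketch below is an illustration under stated assumptions, not this class's implementation: the class and method names are hypothetical, the two-sided p-value is assumed to be twice the lower tail probability at min(Ux, Uy), capped at 1, and no attempt is made to match the library's storage use.

```java
final class ExactUSketch {

    /**
     * Returns counts[u] = number of orderings of n1 x-values and n2 y-values
     * (no ties) in which exactly u pairs (x, y) satisfy x > y.
     */
    static double[] uCounts(int n1, int n2) {
        int maxU = n1 * n2;
        // prev[j] holds the counts for sample sizes (i - 1, j).
        double[][] prev = new double[n2 + 1][maxU + 1];
        for (int j = 0; j <= n2; j++) {
            prev[j][0] = 1.0;              // no x-values: U is always 0
        }
        for (int i = 1; i <= n1; i++) {
            double[][] cur = new double[n2 + 1][maxU + 1];
            cur[0][0] = 1.0;               // no y-values: U is always 0
            for (int j = 1; j <= n2; j++) {
                for (int u = 0; u <= i * j; u++) {
                    // Largest value is a y: it adds nothing to U.
                    double c = cur[j - 1][u];
                    // Largest value is an x: it exceeds all j y-values.
                    if (u >= j) {
                        c += prev[j][u - j];
                    }
                    cur[j][u] = c;
                }
            }
            prev = cur;
        }
        return prev[n2];
    }

    /** Two-sided p-value, assumed here to be 2 * P(U <= uMin), capped at 1. */
    static double exactTwoSidedP(int n1, int n2, double uMin) {
        double[] counts = uCounts(n1, n2);
        double total = 0.0;
        double lower = 0.0;
        for (int u = 0; u < counts.length; u++) {
            total += counts[u];
            if (u <= uMin) {
                lower += counts[u];
            }
        }
        return Math.min(1.0, 2.0 * lower / total);
    }
}
```

        For two samples of three values each with U = 0, only one of the 20 possible orderings is that extreme, so exactTwoSidedP(3, 3, 0) evaluates to 0.1.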

        Parameters:
        x - the first sample
        y - the second sample
        exact - true means compute the p-value exactly, false means use the normal approximation
        Returns:
        2-sided p-value (exact if exact is true, otherwise approximate)
        Throws:
        NullArgumentException - if x or y are null.
        MathIllegalArgumentException - if x or y are zero-length or if exact is true and ties are present in the data