Class AbstractMultipleLinearRegression

    • Constructor Detail

      • AbstractMultipleLinearRegression

        public AbstractMultipleLinearRegression()
        Empty constructor.

        This constructor is not strictly necessary, but it prevents spurious javadoc warnings with JDK 18 and later.

        Since:
        3.0
    • Method Detail

      • getX

        protected RealMatrix getX()
        Get the X sample data.
        Returns:
        the X sample data.
      • getY

        protected RealVector getY()
        Get the Y sample data.
        Returns:
        the Y sample data.
      • isNoIntercept

        public boolean isNoIntercept()
        Check if the model has no intercept term.
        Returns:
        true if the model has no intercept term; false otherwise
      • setNoIntercept

        public void setNoIntercept(boolean noIntercept)
        Set intercept flag.
        Parameters:
        noIntercept - true means the model is to be estimated without an intercept term
      • newSampleData

        public void newSampleData(double[] data,
                                  int nobs,
                                  int nvars)

        Loads model x and y sample data from a flat input array, overriding any previous sample.

        Assumes that rows are concatenated with y values first in each row. For example, an input data array containing the sequence of values (1, 2, 3, 4, 5, 6, 7, 8, 9) with nobs = 3 and nvars = 2 creates a regression dataset with two independent variables, as below:

           y   x[0]  x[1]
           --------------
           1     2     3
           4     5     6
           7     8     9
         

        Note that there is no need to add an initial unitary column (a column of 1s) when specifying a model that includes an intercept term. If isNoIntercept() is true, the X matrix is created without an initial column of 1s; otherwise this column is added.

        Throws an exception (see Throws for the exact types) if any of the following preconditions fail:

        • data is not null
        • data.length = nobs * (nvars + 1)
        • nobs > nvars
        Parameters:
        data - input data array
        nobs - number of observations (rows)
        nvars - number of independent variables (columns, not counting y)
        Throws:
        NullArgumentException - if the data array is null
        MathIllegalArgumentException - if the length of the data array is not equal to nobs * (nvars + 1)
        MathIllegalArgumentException - if nobs is less than nvars + 1
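
        The flat layout described above can be sketched in plain Java. This is an illustration of the documented row layout only, not the library's implementation: the class FlatSampleLayout and its methods are hypothetical, standard JDK exceptions stand in for NullArgumentException and MathIllegalArgumentException, and the intercept column is not handled here.

```java
import java.util.Arrays;

/** Illustrative only: mimics the documented flat layout, not the library's code. */
final class FlatSampleLayout {

    /** Returns the y column: the first value of each row of (nvars + 1) values. */
    static double[] extractY(double[] data, int nobs, int nvars) {
        checkShape(data, nobs, nvars);
        double[] y = new double[nobs];
        for (int i = 0; i < nobs; i++) {
            y[i] = data[i * (nvars + 1)];
        }
        return y;
    }

    /** Returns the x values: the remaining nvars values of each row. */
    static double[][] extractX(double[] data, int nobs, int nvars) {
        checkShape(data, nobs, nvars);
        double[][] x = new double[nobs][nvars];
        for (int i = 0; i < nobs; i++) {
            System.arraycopy(data, i * (nvars + 1) + 1, x[i], 0, nvars);
        }
        return x;
    }

    /** Stand-in for the documented preconditions; the exception types are assumptions. */
    private static void checkShape(double[] data, int nobs, int nvars) {
        if (data == null) {
            throw new NullPointerException("data is null");
        }
        if (data.length != nobs * (nvars + 1)) {
            throw new IllegalArgumentException("data.length must equal nobs * (nvars + 1)");
        }
        if (nobs <= nvars) {
            throw new IllegalArgumentException("nobs must be greater than nvars");
        }
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9};
        // prints [1.0, 4.0, 7.0]
        System.out.println(Arrays.toString(extractY(data, 3, 2)));
        // prints [[2.0, 3.0], [5.0, 6.0], [8.0, 9.0]]
        System.out.println(Arrays.deepToString(extractX(data, 3, 2)));
    }
}
```

        With nobs = 3 and nvars = 2, the nine input values split into the y column (1, 4, 7) and the x rows shown in the table above.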
      • newYSampleData

        protected void newYSampleData(double[] y)
        Loads new y sample data, overriding any previous data.
        Parameters:
        y - the array representing the y sample
        Throws:
        NullArgumentException - if y is null
        MathIllegalArgumentException - if y is empty
      • newXSampleData

        protected void newXSampleData(double[][] x)

        Loads new x sample data, overriding any previous data.

        The input x array should have one row for each sample observation, with columns corresponding to independent variables. For example, if

          x = new double[][] {{1, 2}, {3, 4}, {5, 6}} 

        then newXSampleData(x) results in a model with two independent variables and three observations:

           x[0]  x[1]
           ----------
             1    2
             3    4
             5    6
         

        Note that there is no need to add an initial unitary column (column of 1's) when specifying a model including an intercept term.

        Parameters:
        x - the rectangular array representing the x sample
        Throws:
        NullArgumentException - if x is null
        MathIllegalArgumentException - if x is empty
        MathIllegalArgumentException - if x is not rectangular
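
        The note about the unitary column can be illustrated with a short sketch of how a design matrix might be built from x. The class DesignMatrixSketch and its method are hypothetical, the noIntercept parameter stands in for the isNoIntercept() flag, and standard JDK exceptions replace the library's exception types; the library's actual implementation may differ.

```java
import java.util.Arrays;

/** Illustrative only: shows an intercept column being prepended to the x sample. */
final class DesignMatrixSketch {

    /**
     * Copies x into a new matrix, prepending a column of 1s unless noIntercept is true.
     * The checks mirror the documented failure cases (null, empty, not rectangular).
     */
    static double[][] toDesignMatrix(double[][] x, boolean noIntercept) {
        if (x == null) {
            throw new NullPointerException("x is null");
        }
        if (x.length == 0) {
            throw new IllegalArgumentException("x is empty");
        }
        int nvars = x[0].length;
        int offset = noIntercept ? 0 : 1;
        double[][] design = new double[x.length][nvars + offset];
        for (int i = 0; i < x.length; i++) {
            if (x[i].length != nvars) {
                throw new IllegalArgumentException("x is not rectangular");
            }
            if (!noIntercept) {
                design[i][0] = 1.0; // intercept column
            }
            System.arraycopy(x[i], 0, design[i], offset, nvars);
        }
        return design;
    }

    public static void main(String[] args) {
        double[][] x = {{1, 2}, {3, 4}, {5, 6}};
        // prints [[1.0, 1.0, 2.0], [1.0, 3.0, 4.0], [1.0, 5.0, 6.0]]
        System.out.println(Arrays.deepToString(toDesignMatrix(x, false)));
        // prints [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
        System.out.println(Arrays.deepToString(toDesignMatrix(x, true)));
    }
}
```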
      • validateSampleData

        protected void validateSampleData(double[][] x,
                                          double[] y)
                                   throws MathIllegalArgumentException
        Validates sample data.

        Checks that

        • Neither x nor y is null or empty;
        • The length (i.e. number of rows) of x equals the length of y;
        • x has at least one more row than it has columns (i.e. there is sufficient data to estimate regression coefficients for each of the columns in x plus an intercept).
        Parameters:
        x - the [n,k] array representing the x data
        y - the [n,1] array representing the y data
        Throws:
        NullArgumentException - if x or y is null
        MathIllegalArgumentException - if x and y do not have the same length
        MathIllegalArgumentException - if x or y is zero-length
        MathIllegalArgumentException - if the number of rows of x is not larger than the number of columns + 1 when the model has an intercept, or not larger than the number of columns when there is no intercept term
      • validateCovarianceData

        protected void validateCovarianceData(double[][] x,
                                              double[][] covariance)
        Validates that the x data and covariance matrix have the same number of rows and that the covariance matrix is square.
        Parameters:
        x - the [n,k] array representing the x sample
        covariance - the [n,n] array representing the covariance matrix
        Throws:
        MathIllegalArgumentException - if the number of rows in x is not equal to the number of rows in covariance
        MathIllegalArgumentException - if the covariance matrix is not square
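
        The two shape checks described above can be sketched as follows. The class CovarianceShapeCheck is hypothetical, and IllegalArgumentException stands in for MathIllegalArgumentException; this is an assumption-laden illustration, not the library's code.

```java
/** Illustrative only: the two documented shape checks on x and the covariance matrix. */
final class CovarianceShapeCheck {

    /** Throws if x and covariance differ in row count or covariance is not square. */
    static void check(double[][] x, double[][] covariance) {
        if (x.length != covariance.length) {
            throw new IllegalArgumentException(
                "x has " + x.length + " rows but covariance has " + covariance.length);
        }
        for (double[] row : covariance) {
            if (row.length != covariance.length) {
                throw new IllegalArgumentException("covariance matrix is not square");
            }
        }
    }

    public static void main(String[] args) {
        double[][] x = new double[3][2];          // n = 3 observations, k = 2 variables
        check(x, new double[3][3]);               // passes: [3,3] covariance
        try {
            check(x, new double[3][2]);           // fails: covariance is [3,2]
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```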
      • estimateResiduals

        public double[] estimateResiduals()
        Estimates the residuals, i.e. u = y - X*b.
        Specified by:
        estimateResiduals in interface MultipleLinearRegression
        Returns:
        The [n,1] array representing the residuals
      • estimateRegressionParametersVariance

        public double[][] estimateRegressionParametersVariance()
        Estimates the variance of the regression parameters, i.e. Var(b).
        Specified by:
        estimateRegressionParametersVariance in interface MultipleLinearRegression
        Returns:
        The [k,k] array representing the variance of b
      • estimateRegressandVariance

        public double estimateRegressandVariance()
        Returns the variance of the regressand, i.e. Var(y).
        Specified by:
        estimateRegressandVariance in interface MultipleLinearRegression
        Returns:
        The double representing the variance of y
      • estimateErrorVariance

        public double estimateErrorVariance()
        Estimates the variance of the error.
        Returns:
        estimate of the error variance
      • estimateRegressionStandardError

        public double estimateRegressionStandardError()
        Estimates the standard error of the regression.
        Returns:
        regression standard error
      • calculateBeta

        protected abstract RealVector calculateBeta()
        Calculates the beta of multiple linear regression in matrix notation.
        Returns:
        beta
      • calculateBetaVariance

        protected abstract RealMatrix calculateBetaVariance()
        Calculates the beta variance of multiple linear regression in matrix notation.
        Returns:
        beta variance
      • calculateYVariance

        protected double calculateYVariance()
        Calculates the variance of the y values.
        Returns:
        Y variance
      • calculateErrorVariance

        protected double calculateErrorVariance()

        Calculates the variance of the error term.

        Uses the formula
         var(u) = u · u / (n - k)
         
        where n and k are the row and column dimensions of the design matrix X.
        Returns:
        error variance estimate
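
        As a worked instance of the formula var(u) = u · u / (n - k), the following sketch computes the estimate from a plain residual array. The class ErrorVarianceSketch and its method are hypothetical, and the residual values are made-up illustration data; n is taken as the length of u and k is passed in as the column dimension of X.

```java
/** Illustrative only: evaluates var(u) = u · u / (n - k) on a plain array. */
final class ErrorVarianceSketch {

    /** u holds the n residuals; k is the number of columns of the design matrix X. */
    static double errorVariance(double[] u, int k) {
        double dot = 0.0;
        for (double r : u) {
            dot += r * r; // accumulate u · u
        }
        return dot / (u.length - k);
    }

    public static void main(String[] args) {
        double[] u = {1, -1, 2, -2};   // n = 4, u · u = 10
        // prints 5.0, since 10 / (4 - 2) = 5
        System.out.println(errorVariance(u, 2));
    }
}
```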
      • calculateResiduals

        protected RealVector calculateResiduals()
        Calculates the residuals of multiple linear regression in matrix notation.
         u = y - X * b
         
        Returns:
        The residuals [n,1] matrix
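
        The residual formula u = y - X * b can be written out with plain arrays, as in the sketch below. The class ResidualsSketch and its method are hypothetical and the sample values are made up for illustration; the library itself works with RealMatrix and RealVector rather than arrays.

```java
import java.util.Arrays;

/** Illustrative only: computes u = y - X * b with plain arrays. */
final class ResidualsSketch {

    /** x is the [n,k] design matrix, y the n observations, b the k coefficients. */
    static double[] residuals(double[][] x, double[] y, double[] b) {
        double[] u = new double[y.length];
        for (int i = 0; i < y.length; i++) {
            double fitted = 0.0;
            for (int j = 0; j < b.length; j++) {
                fitted += x[i][j] * b[j]; // row i of X times b
            }
            u[i] = y[i] - fitted;
        }
        return u;
    }

    public static void main(String[] args) {
        double[][] x = {{1, 1}, {1, 2}, {1, 3}}; // first column is the intercept column
        double[] b = {1, 2};                     // X * b = (3, 5, 7)
        double[] y = {3.5, 4.5, 7.0};
        // prints [0.5, -0.5, 0.0]
        System.out.println(Arrays.toString(residuals(x, y, b)));
    }
}
```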