Title: | Multivariate Linear Model with Analytic p-Values |
---|---|
Description: | Allows a user to conduct multivariate multiple regression using analytic p-values rather than classic approximate F-tests. |
Authors: | Daniel B. McArtor ([email protected]) [aut, cre] |
Maintainer: | Daniel B. McArtor <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.1.4 |
Built: | 2024-10-26 05:41:46 UTC |
Source: | https://github.com/dmcartor/mvlm |
The MVLM package is used to fit linear models with a multivariate outcome. It uses the asymptotic null distribution of the multivariate linear model test statistic to compute p-values (McArtor et al., under review). It therefore removes the need for approximate p-values based on Wilks' Lambda, Pillai's Trace, the Hotelling-Lawley Trace, and Roy's Greatest Root.
To access this package's tutorial, type the following line into the console:
vignette("mvlm-vignette")
There is one primary function in this package: mvlm, which regresses a multivariate outcome onto a set of predictors. Standard functions like summary, fitted, residuals, and predict can be called on an mvlm output object.
Davies, R. B. (1980). The Distribution of a Linear Combination of chi-square Random Variables. Journal of the Royal Statistical Society. Series C (Applied Statistics), 29(3), 323-333.
Duchesne, P., & De Micheaux, P.L. (2010). Computing the distribution of quadratic forms: Further comparisons between the Liu-Tang-Zhang approximation and exact methods. Computational Statistics and Data Analysis, 54(4), 858-862.
McArtor, D. B., Grasman, R. P. P. P., Lubke, G. H., & Bergeman, C. S. (under review). The asymptotic null distribution of the multivariate linear model test statistic. Manuscript submitted for publication.
data(mvlmdata)
Y <- as.matrix(Y.mvlm)
mvlm.res <- mvlm(Y ~ Cont + Cat + Ord, data = X.mvlm)
summary(mvlm.res)
fitted method for class mvlm.
## S3 method for class 'mvlm'
fitted(object, ...)
object | Output from mvlm |
... | Further arguments passed to or from other methods. |
A data frame of fitted values with the same dimension as the outcome data passed to mvlm.
Daniel B. McArtor ([email protected]) [aut, cre]
data(mvlmdata)
Y <- as.matrix(Y.mvlm)
mvlm.res <- mvlm(Y ~ Cont + Cat + Ord, data = X.mvlm)
Y.hat <- fitted(mvlm.res)
mvlm is used to fit linear models with a multivariate outcome. It uses the asymptotic null distribution of the multivariate linear model test statistic to compute p-values (McArtor et al., under review). It therefore removes the need for approximate p-values based on Wilks' Lambda, Pillai's Trace, the Hotelling-Lawley Trace, and Roy's Greatest Root.
mvlm(formula, data, n.cores = 1, start.acc = 1e-20,
  contr.factor = "contr.sum", contr.ordered = "contr.poly")
formula | An object of class formula |
data | Mandatory |
n.cores | Number of cores to use in parallelization through the |
start.acc | Starting accuracy of the Davies (1980) algorithm implemented in the |
contr.factor | The type of contrasts used to test unordered categorical variables that have type factor |
contr.ordered | The type of contrasts used to test ordered categorical variables that have type ordered |
Importantly, the outcome of formula must be a matrix, and the object passed to data must be a data frame containing all of the variables that are named as predictors in formula.
The conditional effects of variables of type factor or ordered in data are computed based on the type of contrasts specified by contr.factor and contr.ordered. If data contains an (ordered or unordered) factor with k levels, a k-1 degree of freedom test will be conducted corresponding to that factor and the specified contrast structure. If, instead, the user wants to assess the k-1 separate single-DF tests that comprise this omnibus effect (similar to the approach taken by lm), then the appropriate model matrix should be formed in advance and passed to mvlm directly in the data parameter. See the package vignette for an example by calling vignette('mvlm-vignette').
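The advice above about forming the model matrix in advance can be sketched as follows. This is a minimal illustration on made-up data, not the vignette's example: the data frame X, the object name X.single, and the simulated outcome Y are all hypothetical, and the final call assumes the MVLM package is installed. The expansion itself uses only base R's model.matrix.

```r
# Hypothetical predictors: one numeric variable and one 3-level factor.
set.seed(1)
X <- data.frame(
  Cont = rnorm(60),
  Cat  = factor(rep(c("a", "b", "c"), times = 20))
)

# Expand the factor into k - 1 = 2 sum-coded columns, mirroring the
# default contr.factor = "contr.sum". model.matrix() is base R (stats).
mm <- model.matrix(~ Cont + Cat, data = X,
                   contrasts.arg = list(Cat = "contr.sum"))

# Drop the intercept column and store the result as a data frame, so that
# each contrast column becomes an ordinary numeric predictor.
X.single <- as.data.frame(mm[, -1, drop = FALSE])
print(names(X.single))   # "Cont" "Cat1" "Cat2"

# Assumption: the MVLM package is installed. Each numeric column is then a
# separate term in the model rather than part of one omnibus factor test.
if (requireNamespace("MVLM", quietly = TRUE)) {
  Y <- matrix(rnorm(60 * 2), nrow = 60)   # hypothetical 2-variable outcome
  res <- MVLM::mvlm(Y ~ ., data = X.single)
  print(summary(res))
}
```

Because the contrast columns are plain numeric variables by the time they reach mvlm, the contr.factor and contr.ordered arguments no longer affect them; the coding is fixed at the model.matrix step.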
An object with nine elements and a summary function. Calling summary(mvlm.res) produces a data frame comprised of:
Statistic | Value of the corresponding test statistic. |
Numer DF | Numerator degrees of freedom for each test statistic. |
Pseudo R2 | Size of the corresponding (omnibus or conditional) effect on the multivariate outcome. Note that the intercept term does not have an estimated effect size. |
p-value | The p-value for each (omnibus or conditional) effect. |
In addition to the information in the four columns comprising summary(mvlm.res), the mvlm.res object also contains:
p.prec | A data.frame reporting the precision of each p-value. These are the maximum error bounds of the p-values reported by the |
y.rsq | A matrix containing in its first row the overall variance explained by the model for each variable comprising Y (columns). The remaining rows list the variance of each outcome that is explained by the conditional effect of each predictor. |
beta.hat | Estimated regression coefficients. |
adj.n | Adjusted sample size used to determine whether or not the asymptotic properties of the model are likely to hold. See McArtor et al. (under review) for more detail. |
data | Original input data and the |
formula | The formula passed to mvlm |
Note that the printed output of summary(mvlm.res) will truncate p-values to the smallest trustworthy values, but the object returned by summary(mvlm.res) will contain the p-values as computed. If the error bound of the Davies algorithm is larger than the p-value, the only conclusion that can be drawn with certainty is that the p-value is smaller than (or equal to) the error bound.
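The rule in the note above can be illustrated with a small base R sketch. The numbers are invented for illustration and are not output from mvlm; the vector names p.val and reported are hypothetical, and only p.prec borrows its name from the element described above. The sketch does not reproduce the package's own printing code; it only shows the comparison a reader would make by hand.

```r
# Hypothetical p-values and Davies error bounds for three effects
# (illustrative numbers, not output from mvlm).
p.val  <- c(Omnibus = 3e-12, Cont = 0.0421, Cat = 1e-25)
p.prec <- c(Omnibus = 1e-20, Cont = 1e-20,  Cat = 1e-20)

# A p-value is trustworthy only if it exceeds its error bound; otherwise
# all that can be said is that it is no larger than the bound.
trustworthy <- p.val > p.prec

reported <- ifelse(trustworthy,
                   formatC(p.val, format = "g", digits = 4),
                   paste0("<= ", formatC(p.prec, format = "g", digits = 1)))

print(data.frame(p.val, p.prec, reported))
```

In this made-up case only the third effect falls below its bound, so it would be reported as an inequality rather than as the computed value.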
Daniel B. McArtor ([email protected]) [aut, cre]
Davies, R. B. (1980). The Distribution of a Linear Combination of chi-square Random Variables. Journal of the Royal Statistical Society. Series C (Applied Statistics), 29(3), 323-333.
Duchesne, P., & De Micheaux, P.L. (2010). Computing the distribution of quadratic forms: Further comparisons between the Liu-Tang-Zhang approximation and exact methods. Computational Statistics and Data Analysis, 54(4), 858-862.
McArtor, D. B., Grasman, R. P. P. P., Lubke, G. H., & Bergeman, C. S. (under review). A new approach to conducting linear model hypothesis tests with a multivariate outcome.
data(mvlmdata)
Y <- as.matrix(Y.mvlm)

# Main effects model
mvlm.res <- mvlm(Y ~ Cont + Cat + Ord, data = X.mvlm)
summary(mvlm.res)

# Include two-way interactions
mvlm.res.int <- mvlm(Y ~ .^2, data = X.mvlm)
summary(mvlm.res.int)
predict method for class mvlm. To predict using new data, the predictor data frame passed to newdata must have the same number of columns as the data used to fit the model, and the names of each variable must match the names of the original data.
## S3 method for class 'mvlm'
predict(object, newdata, ...)
object | Output from mvlm |
newdata | Data frame of observations on the predictors used to fit the model. |
... | Further arguments passed to or from other methods. |
A data frame of predicted values.
Daniel B. McArtor ([email protected]) [aut, cre]
data(mvlmdata)
Y.train <- as.matrix(Y.mvlm[1:150,])
X.train <- X.mvlm[1:150,]
mvlm.res <- mvlm(Y.train ~ ., data = X.train)
X.test <- X.mvlm[151:200,]
Y.predict <- predict(mvlm.res, newdata = X.test)
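Since predict requires newdata to match the training predictors in column count and variable names, it can help to check this before predicting. The helper below is a hypothetical base R sketch, not part of the package: check_newdata and the small data frames are made up for illustration. Reordering the columns to the training order is an extra convenience, not something the documentation above requires.

```r
# Hypothetical helper: verify that newdata has the same number of columns
# and the same variable names as the training predictors.
check_newdata <- function(train, newdata) {
  if (ncol(newdata) != ncol(train)) {
    stop("newdata has ", ncol(newdata), " columns; expected ", ncol(train))
  }
  missing.vars <- setdiff(names(train), names(newdata))
  if (length(missing.vars) > 0) {
    stop("newdata is missing: ", paste(missing.vars, collapse = ", "))
  }
  # Return the columns in the same order as the training data.
  newdata[, names(train), drop = FALSE]
}

# Made-up training and test predictors with the columns in a different order.
X.train <- data.frame(Cont = c(0.1, 1.2, -0.4),
                      Cat  = factor(c("a", "b", "a")))
X.test  <- data.frame(Cat  = factor(c("b", "a")),
                      Cont = c(0.7, -1.1))

X.test.ok <- check_newdata(X.train, X.test)
print(names(X.test.ok))
```

The checked frame could then be passed as newdata, for example predict(mvlm.res, newdata = X.test.ok), assuming mvlm.res was fit on predictors with these names.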
print method for class mvlm.
## S3 method for class 'mvlm'
print(x, ...)
x | Output from mvlm |
... | Further arguments passed to or from other methods. |
p-value | Analytical p-values for the omnibus test and each predictor. |
Daniel B. McArtor ([email protected]) [aut, cre]
residuals method for class mvlm.
## S3 method for class 'mvlm'
residuals(object, ...)
object | Output from mvlm |
... | Further arguments passed to or from other methods. |
A data frame of residuals with the same dimension as the outcome data passed to mvlm.
Daniel B. McArtor ([email protected]) [aut, cre]
data(mvlmdata)
Y <- as.matrix(Y.mvlm)
mvlm.res <- mvlm(Y ~ Cont + Cat + Ord, data = X.mvlm)
Y.resid <- resid(mvlm.res)
summary method for class mvlm.
## S3 method for class 'mvlm'
summary(object, ...)
object | Output from mvlm |
... | Further arguments passed to or from other methods. |
Calling summary(mvlm.res) produces a data frame comprised of:
Statistic | Value of the corresponding test statistic. |
Numer DF | Numerator degrees of freedom for each test statistic. |
Pseudo R2 | Size of the corresponding (omnibus or conditional) effect on the multivariate outcome. Note that the intercept term does not have an estimated effect size. |
p-value | The p-value for each (omnibus or conditional) effect. |
In addition to the information in the four columns comprising summary(mvlm.res), the mvlm.res object also contains:
p.prec | A data.frame reporting the precision of each p-value. These are the maximum error bounds of the p-values reported by the |
y.rsq | A matrix containing in its first row the overall variance explained by the model for each variable comprising Y (columns). The remaining rows list the variance of each outcome that is explained by the conditional effect of each predictor. |
beta.hat | Estimated regression coefficients. |
adj.n | Adjusted sample size used to determine whether or not the asymptotic properties of the model are likely to hold. See McArtor et al. (under review) for more detail. |
data | Original input data and the |
Note that the printed output of summary(mvlm.res) will truncate p-values to the smallest trustworthy values, but the object returned by summary(mvlm.res) will contain the p-values as computed. If the error bound of the Davies algorithm is larger than the p-value, the only conclusion that can be drawn with certainty is that the p-value is smaller than (or equal to) the error bound.
Daniel B. McArtor ([email protected]) [aut, cre]
Davies, R. B. (1980). The Distribution of a Linear Combination of chi-square Random Variables. Journal of the Royal Statistical Society. Series C (Applied Statistics), 29(3), 323-333.
Duchesne, P., & De Micheaux, P.L. (2010). Computing the distribution of quadratic forms: Further comparisons between the Liu-Tang-Zhang approximation and exact methods. Computational Statistics and Data Analysis, 54(4), 858-862.
McArtor, D. B., Grasman, R. P. P. P., Lubke, G. H., & Bergeman, C. S. (under review). A new approach to conducting linear model hypothesis tests with a multivariate outcome.
data(mvlmdata)
Y <- as.matrix(Y.mvlm)

# Main effects model
mvlm.res <- mvlm(Y ~ Cont + Cat + Ord, data = X.mvlm)
summary(mvlm.res)

# Include two-way interactions
mvlm.res.int <- mvlm(Y ~ .^2, data = X.mvlm)
summary(mvlm.res.int)
See package vignette by calling vignette('mvlm-vignette').
X.mvlm
An object of class data.frame with 200 rows and 3 columns.
See package vignette by calling vignette('mvlm-vignette').
Y.mvlm
An object of class matrix with 200 rows and 5 columns.