Multivariate Analysis of Variance (MANOVA) in R

KJY / 2021-06-17


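Multivariate analysis of variance (MANOVA) is an extension of ANOVA for comparing several groups on more than one outcome measure at the same time. Instead of testing each measure separately, it uses a single overall test to assess the effect of a factor on all of the dependent variables jointly, which yields one unified conclusion. The underlying principle is the same as in univariate ANOVA: it tests whether the means of the dependent variables differ significantly across the levels of the factor.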
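
Stated formally, the overall test compares group mean vectors. The notation here is introduced only for illustration and does not come from the original post: assume g groups and p dependent variables, with μ_k the vector of the p population means in group k.

```latex
H_0:\ \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2 = \cdots = \boldsymbol{\mu}_g,
\qquad \boldsymbol{\mu}_k \in \mathbb{R}^p
\\
H_1:\ \boldsymbol{\mu}_j \neq \boldsymbol{\mu}_k \ \text{for at least one pair } j \neq k
```

Rejecting the null hypothesis says that at least two groups differ on at least one dependent variable or on some combination of them. It does not say which one.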

For example, we might run an experiment in which two groups of mice receive different treatments (A and B), and we then measure the weight and height of each mouse.

The weight and height of mice are two dependent variables in this example, and our hypothesis is that the difference in treatment affects both.

That is, the treatment may affect weight and height at the same time.

If the global multivariate test is significant, we conclude that the corresponding effect (treatment) is significant.

That is, the treatment has an effect on the dependent variables considered jointly.

The next step is to figure out whether the treatment impacts only the weight, only the height, or both. To put it another way, we want to figure out which dependent variables led to the substantial global effect.

To see which specific variable is affected, we can then test each one with a one-way ANOVA.

We can evaluate each dependent variable separately using one-way ANOVA (or univariate ANOVA) to address this question.

Next, let's try this out in code.

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
data <- iris

sample_n(data, 10) # randomly pick 10 rows to inspect
##    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
## 1           6.8         3.2          5.9         2.3 virginica
## 2           6.9         3.2          5.7         2.3 virginica
## 3           6.7         3.1          5.6         2.4 virginica
## 4           5.0         3.3          1.4         0.2    setosa
## 5           6.4         2.8          5.6         2.2 virginica
## 6           6.4         2.7          5.3         1.9 virginica
## 7           7.7         3.8          6.7         2.2 virginica
## 8           4.7         3.2          1.6         0.2    setosa
## 9           6.7         3.0          5.2         2.3 virginica
## 10          5.7         3.8          1.7         0.3    setosa
res.man <- manova(cbind(Sepal.Length, Petal.Length) ~ Species, data = iris)
summary(res.man)
##            Df Pillai approx F num Df den Df    Pr(>F)    
## Species     2 0.9885   71.829      4    294 < 2.2e-16 ***
## Residuals 147                                            
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
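
The summary above reports Pillai's trace, which is the default statistic for `summary()` on a `manova` fit. As a minimal sketch, the same model can be summarized with the other three statistics that the `test` argument accepts. The loop variable `stat` is just an illustrative name.

```r
# Refit the same model as above on the built-in iris data.
res.man <- manova(cbind(Sepal.Length, Petal.Length) ~ Species, data = iris)

# Pillai's trace is the default; the other statistics are chosen via `test`.
for (stat in c("Pillai", "Wilks", "Hotelling-Lawley", "Roy")) {
  print(summary(res.man, test = stat))
}
```

If the four statistics lead to the same conclusion, the choice between them matters little for reporting.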

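The MANOVA test is highly significant (Pillai's trace = 0.9885, p < 2.2e-16), so Species has a significant effect on Sepal.Length and Petal.Length taken together. The global test alone does not tell us which of the two variables drives that effect.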

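Next, summary.aov() runs a univariate ANOVA on each response and shows that each variable on its own also differs significantly across species.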

summary.aov(res.man)
##  Response Sepal.Length :
##              Df Sum Sq Mean Sq F value    Pr(>F)    
## Species       2 63.212  31.606  119.26 < 2.2e-16 ***
## Residuals   147 38.956   0.265                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Response Petal.Length :
##              Df Sum Sq Mean Sq F value    Pr(>F)    
## Species       2 437.10 218.551  1180.2 < 2.2e-16 ***
## Residuals   147  27.22   0.185                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
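
The per-response tables printed by `summary.aov()` correspond to fitting a separate one-way ANOVA for each dependent variable. A minimal sketch of that equivalent approach follows. The object names `fit.sepal` and `fit.petal` are illustrative and not from the original post.

```r
# One-way ANOVA for each dependent variable, fitted separately
# on the built-in iris data.
fit.sepal <- aov(Sepal.Length ~ Species, data = iris)
fit.petal <- aov(Petal.Length ~ Species, data = iris)

# Each summary should show the same F test as the matching
# "Response" table from summary.aov(res.man).
summary(fit.sepal)
summary(fit.petal)
```

Because this runs one test per dependent variable, you may want to adjust the p-values for multiple comparisons when there are many responses.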

Last modified on 2021-06-17