讀進資料
其中年級由 2 開始,製造新的年級變項,讓截距對應起點
dta <- read.csv("nlsy.csv",header=T)
dta$grade_c2 <- dta$grade - 2
dta$female <- as.factor(dta$female)
head(dta,12)
## id female low_birth_weight math grade anti grade_c2
## 1 201 1 0 38 3 FALSE 1
## 2 201 1 0 55 5 FALSE 3
## 3 303 1 0 26 2 TRUE 0
## 4 303 1 0 33 5 TRUE 3
## 5 2702 0 0 56 2 FALSE 0
## 6 2702 0 0 58 4 TRUE 2
## 7 2702 0 0 80 8 TRUE 6
## 8 4303 1 0 41 3 TRUE 1
## 9 4303 1 0 58 4 TRUE 2
## 10 5002 0 0 46 4 TRUE 2
## 11 5002 0 0 54 6 TRUE 4
## 12 5002 0 0 66 8 TRUE 6
- 每次觀察是 level 1,個人是 level 2。年級是level
1變項,性別是level2變項。
- 每個人的觀察次數不一定相同,有些人只有兩次
載入 lme4 與畫圖套件
library(pacman)
## Warning: 套件 'pacman' 是用 R 版本 4.2.3 來建造的
p_load(lme4,ggplot2)
畫張圖,看看隨年級變化(成長)情形
ggplot(data=dta,
aes(x = grade, y = math, group = id)) +
geom_line() +
theme_bw() +
scale_x_continuous(breaks = 2:8, name = "Grade") +
scale_y_continuous(name = "Mathematics")
先只處理每個人的資料
rst1 <- lmer(math ~ 1 + (1 | id), data = dta, REML = FALSE)
summary(rst1)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: math ~ 1 + (1 | id)
## Data: dta
##
## AIC BIC logLik deviance df.resid
## 17497.9 17515.0 -8746.0 17491.9 2218
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.79835 -0.57828 0.05408 0.57886 2.52690
##
## Random effects:
## Groups Name Variance Std.Dev.
## id (Intercept) 46.92 6.85
## Residual 116.68 10.80
## Number of obs: 2221, groups: id, 932
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 45.9147 0.3237 141.8
把年級加進去
rst2 <- lmer(math ~ 1 + grade_c2 + (1 + grade_c2 | id), data = dta, REML = FALSE)
summary(rst2)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: math ~ 1 + grade_c2 + (1 + grade_c2 | id)
## Data: dta
##
## AIC BIC logLik deviance df.resid
## 15949.4 15983.6 -7968.7 15937.4 2215
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.03039 -0.52586 0.00162 0.54076 2.54222
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 64.5563 8.0347
## grade_c2 0.7322 0.8557 -0.03
## Residual 36.2315 6.0193
## Number of obs: 2221, groups: id, 932
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 35.26736 0.35499 99.35
## grade_c2 4.33932 0.08734 49.69
##
## Correlation of Fixed Effects:
## (Intr)
## grade_c2 -0.532
- 請注意,每個人會有自己的截距(起點)、自己的斜率(進步速度)
紀錄每個人的截距(起點)與斜率(進步速度)
IntSlope <- coef(rst2)$id
IntSlope$id <- rownames(IntSlope)
names(IntSlope)[1:2] <- c('Intercept','Slope')
head(IntSlope)
## Intercept Slope id
## 201 36.94689 4.534009 201
## 303 26.03608 4.050755 303
## 2702 49.70142 4.594245 2702
## 4303 41.04192 4.548051 4303
## 5002 37.01256 4.496700 5002
## 5005 37.68796 4.324230 5005
看一下男生與女生的平均截距(起點)與斜率(進步速度)
dta_F <- aggregate(female~id,data=dta,FUN=head, 1)
names(dta_F) <- c('id','female')
dta_F$female <- as.factor(dta_F$female)
IntSlope <- merge(IntSlope,dta_F,by="id")
IntSlope_m <- aggregate(cbind(Intercept,Slope)~female,mean,data=IntSlope)
IntSlope_m
## female Intercept Slope
## 1 0 35.70421 4.353250
## 2 1 34.81719 4.324966
IntSlope_s <- aggregate(cbind(Intercept,Slope)~female,sd,data=IntSlope)
IntSlope_s
## female Intercept Slope
## 1 0 7.001527 0.3002893
## 2 1 6.970847 0.3282111
畫個圖看看
ggplot(IntSlope, aes(x=Intercept, y=Slope, color=female)) +
geom_point(alpha=0.5) +
labs(x= "Intercept", y="Slope(grade effect)")+
geom_smooth()+
geom_point(data=IntSlope_m, size=5, shape=24)
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
以性別解釋起點與速度
rst3 <- lmer(math ~ 1 + female + female:grade_c2 +(1 + grade_c2 | id), data = dta, REML = FALSE)
summary(rst3)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: math ~ 1 + female + female:grade_c2 + (1 + grade_c2 | id)
## Data: dta
##
## AIC BIC logLik deviance df.resid
## 15949.4 15995.0 -7966.7 15933.4 2213
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.02610 -0.52455 0.00425 0.53540 2.51795
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 64.2674 8.0167
## grade_c2 0.7328 0.8561 -0.03
## Residual 36.2120 6.0176
## Number of obs: 2221, groups: id, 932
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 35.7507 0.4994 71.591
## female1 -0.9818 0.7091 -1.385
## female0:grade_c2 4.3818 0.1234 35.494
## female1:grade_c2 4.2962 0.1235 34.782
##
## Correlation of Fixed Effects:
## (Intr) femal1 fm0:_2
## female1 -0.704
## fml0:grd_c2 -0.537 0.378
## fml1:grd_c2 0.000 -0.375 0.000
根據模型,畫畫看預測值