Hierarchical Linear Modeling

HLM: Hierarchical Linear Modeling

Katy Pearce, CRRC Armenia, May 15-16, 2008

Introduction

Katy Pearce, current PhD student in Communication at University of California, Santa Barbara.

Communication is sociology + psychology.

Studies technology and how cultural characteristics can moderate technology adoption, attitudes, and use.

Introduction

Data with nested structures are frequently observed in behavioral/social sciences.

For example:

Educational settings: Students are nested within classes; classes are nested within schools.

Organizational studies: Workers are nested within departments; departments are nested within organizations.

Cross-cultural research: People are nested within countries.

But we often ignore these structures.

Example 1

• Educational achievement: Imagine 5 little boys who are very similar: parental education is the same (low), parental income is the same (low), IQ is the same (low), etc. These 5 boys go to 5 different schools: an excellent school, a very good school, a good school, a poor school, and a very poor school. With HLM we can compare the impact of these different types of schools on the boys’ educational achievement (test scores, grades, etc.). One can imagine that mean parental education, parental income, and IQ are low at the very poor school and high at the excellent school. With HLM we can account for variance at both the individual level and the school (mean) level.

But first, a brief review of other statistical techniques

ANOVA: 1 IV with 2+ levels -> DV, to compare means among the 2+ groups. These means are compared by analyzing the variance in the DV.

Linear regression: linear relationship between two variables so that 1 may predict the other. 1 predictor variable -> 1 criterion variable

Multiple regression: 2+ predictor variables -> 1 criterion variable
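As a reminder of what "1 predictor variable -> 1 criterion variable" means in practice, here is a minimal sketch of ordinary least squares for a single predictor. The function name and the toy numbers are illustrative assumptions and do not come from the workshop data.

```python
from statistics import mean

def simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("the predictor has no variance")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Toy data (hypothetical): the criterion rises by 2 for each unit of the predictor.
intercept, slope = simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
```

This fit treats every observation as independent, which is exactly the assumption that nested data violate and that HLM relaxes.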

Example 2

World Values Survey: Trust and satisfaction

Trust and satisfaction with one’s life have been shown to be related. However, it is possible that the “mean” trust level in a society can moderate this relationship.

L1 (individual): trust generally -> satisfaction with one’s life

L2 (society): “mean” trust level

First, we need to get the data ready

Step 1: prepare the file

1. The World Values Survey is too big for the student version of HLM, so let’s take ~10% of the sample and save the file.

2. Sort by nation [v2], save the file.

3. Aggregate the data: the “break variable” is nation [v2] and the “aggregate variables” are life satisfaction [v81] and take advantage [v26]. Be sure to create a new data file.
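The three preparation steps above are done in the statistics package, but the same logic can be sketched in plain Python to show what each file ends up containing. This is only an illustration under assumptions: the function name, the list-of-dicts data layout, and the toy rows are hypothetical, and the "_1" suffix on the aggregated means simply mirrors the V26_1 name that appears in the HLM output. Missing values are not handled here.

```python
import random
from collections import defaultdict
from statistics import mean

def prepare_hlm_files(rows, group="v2", variables=("v81", "v26"), frac=0.10, seed=0):
    """Return (level1, level2): a sorted random sample and its group means."""
    rng = random.Random(seed)
    # Step 1: take a random sample of roughly `frac` of the respondents.
    n = max(1, round(len(rows) * frac))
    level1 = rng.sample(rows, n)
    # Step 2: sort the sample by the grouping variable (nation).
    level1.sort(key=lambda r: r[group])
    # Step 3: aggregate to one row per group, holding the mean of each variable.
    by_group = defaultdict(list)
    for r in level1:
        by_group[r[group]].append(r)
    level2 = [
        {group: g, **{f"{v}_1": mean(r[v] for r in members) for v in variables}}
        for g, members in sorted(by_group.items())
    ]
    return level1, level2

# Toy data (hypothetical): two nations with 20 respondents each.
toy = [{"v2": nation, "v81": 1 + (i % 10), "v26": 1 + (i % 2)}
       for nation in (1, 2) for i in range(20)]
level1, level2 = prepare_hlm_files(toy, frac=0.5)
```

The L1 file keeps one row per respondent and the L2 file one row per nation, and both share the v2 identifier that HLM uses to link them.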

HLM program

Step 2: create HLM file

1. Open the HLM program

2. Go to the File menu and select the following options: Make new MDM file... Stat package input

3. For the L1 file, open your WVS random file

4. For the L2 file, open your WVS aggregate file

HLM program 2

5. Now you must select the variables. In the L1 file, the “ID” is v2 (nation) and the other two variables are in MDM. In the L2 file, the “ID” is also v2 and the two variables in the MDM are v26 and v81.

6. Select “yes” for missing data and “delete missing data while making MDM”

7. Save the file

8. Click “Make MDM”

9. Click “Done”

Effects

Before we get to the actual data analysis, let’s talk about effects in HLM.

Fixed effects: the levels of the variable that appear in the study are the only levels the researcher is interested in.

Random effects: the levels that appear in the study are a subset of all possible levels of the variable, and the researcher wants to generalize to levels that were not observed.

For example, let’s say that we set up a school where, in different classrooms, some of the students receive special tutoring and others are in a control group. A fixed effect variable would be which group the student was in: control or treatment. Only two groups exist. A random effect variable would be the classroom that the student was in: the classrooms in the study are only a sample of possible classrooms, and which particular ones were used shouldn’t matter to the study.

HLM analysis – Means as Outcomes

9. Let’s start with specifying the L1 model. First we need to tell the program what our DV is: life satisfaction [v81]. Click on v81 and select “outcome variable.”

10. Now we need to tell the program what our fixed and random effects are. V26 (trust) is a fixed effect, because we care about it. The intercept has a random effect (U0); in the model reported in the output, the V26 slope has no random term.

11. Repeat for L2.

12. Click “Run analysis”

Output

13. Go to the file menu, click on “View Output”

The output shows us the model:

Summary of the model specified (in equation format)

---------------------------------------------------

Level-1 Model

Y = B0 + B1*(V26) + R

Level-2 Model

B0 = G00 + G01*(V26_1) + U0

B1 = G10 + G11*(V26_1)
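Substituting the two Level-2 equations into the Level-1 equation gives a single combined model, which makes the cross-level interaction (the moderation effect) explicit. The notation below is a sketch under assumptions: the subscripts i (person) and j (nation) are added here for clarity, the Greek letters stand for the program's B, G, U, and R terms, and V26_1 is read as the nation mean of V26 produced by the aggregate step.

```latex
\begin{aligned}
Y_{ij} &= \beta_{0j} + \beta_{1j}\,\mathrm{V26}_{ij} + r_{ij} \\
\beta_{0j} &= \gamma_{00} + \gamma_{01}\,\overline{\mathrm{V26}}_{j} + u_{0j} \\
\beta_{1j} &= \gamma_{10} + \gamma_{11}\,\overline{\mathrm{V26}}_{j} \\[4pt]
Y_{ij} &= \gamma_{00}
        + \gamma_{01}\,\overline{\mathrm{V26}}_{j}
        + \gamma_{10}\,\mathrm{V26}_{ij}
        + \gamma_{11}\,\overline{\mathrm{V26}}_{j}\,\mathrm{V26}_{ij}
        + u_{0j} + r_{ij}
\end{aligned}
```

The G11 term multiplies the individual score by the national mean, so it is the coefficient that tests whether society-level trust moderates the individual-level relationship.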

Output 2

Sigma_squared = 82.48620

Tau
INTRCPT1,B0 4.21449

Tau (as correlations)
INTRCPT1,B0 1.000

----------------------------------------------------

Random level-1 coefficient Reliability estimate

----------------------------------------------------

INTRCPT1, B0 0.845

----------------------------------------------------

The value of the likelihood function at iteration 5 = -1.747747E+004

The outcome variable is V81

Output 3

Final estimation of fixed effects:

----------------------------------------------------------------------------
                                     Standard               Approx.
Fixed Effect          Coefficient    Error       T-ratio    d.f.     P-value
----------------------------------------------------------------------------
For INTRCPT1, B0
   INTRCPT2, G00        7.652744     0.870587      8.790      38     0.000
   V26_1, G01          -0.440045     0.404599     -1.088      38     0.284
For V26 slope, B1
   INTRCPT2, G10        0.333436     0.195697      1.704    4806     0.088
   V26_1, G11          -0.070027     0.078756     -0.889    4806     0.374
----------------------------------------------------------------------------
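To read the fixed-effects table, it helps to see that each T-ratio is simply the coefficient divided by its standard error, and that the four coefficients together give a predicted life-satisfaction score. The sketch below copies the coefficients and standard errors from the table; the function name and the example input values are hypothetical, since the scale of V26 is not stated in the slides.

```python
# (coefficient, standard error) pairs copied from the fixed-effects table.
fixed_effects = {
    "G00": (7.652744, 0.870587),   # intercept
    "G01": (-0.440045, 0.404599),  # effect of national mean trust on the intercept
    "G10": (0.333436, 0.195697),   # individual-level trust slope
    "G11": (-0.070027, 0.078756),  # effect of national mean trust on the slope
}

# T-ratio = coefficient / standard error, rounded as in the program output.
t_ratios = {name: round(coef / se, 3) for name, (coef, se) in fixed_effects.items()}

def predicted_satisfaction(v26, v26_mean):
    """Fixed part of the model: B0 + B1 * V26, with B0 and B1 from Level 2."""
    g = {name: coef for name, (coef, _) in fixed_effects.items()}
    b0 = g["G00"] + g["G01"] * v26_mean
    b1 = g["G10"] + g["G11"] * v26_mean
    return b0 + b1 * v26

# Hypothetical inputs: an individual score of 1 in a nation whose mean is 2.
example = predicted_satisfaction(v26=1, v26_mean=2)
```

The prediction ignores the random terms U0 and R, so it describes the average respondent in an average nation with those values rather than any particular person.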

Output 4

The outcome variable is V81

Final estimation of fixed effects (with robust standard errors)

----------------------------------------------------------------------------
                                     Standard               Approx.
Fixed Effect          Coefficient    Error       T-ratio    d.f.     P-value
----------------------------------------------------------------------------
For INTRCPT1, B0
   INTRCPT2, G00        7.652744     0.670477     11.414      38     0.000
   V26_1, G01          -0.440045     0.309190     -1.423      38     0.163
For V26 slope, B1
   INTRCPT2, G10        0.333436     0.212963      1.566    4806     0.117
   V26_1, G11          -0.070027     0.075376     -0.929    4806     0.353
----------------------------------------------------------------------------

Output 5

Final estimation of variance components:

-----------------------------------------------------------------------------
Random Effect       Standard      Variance      df     Chi-square    P-value
                    Deviation     Component
-----------------------------------------------------------------------------
INTRCPT1, U0         2.05292       4.21449      38     288.90950     0.000
level-1, R           9.08219      82.48620
-----------------------------------------------------------------------------

Statistics for current covariance components model
--------------------------------------------------
Deviance = 34954.948408
Number of estimated parameters = 2

What to do with this output?

First, we must calculate the intraclass correlation.

ρ = τ00 / (τ00 + σ²)

= 4.21449 / (4.21449 + 82.48620)

= 4.21449 / 86.70069

= 0.0486096477

Which means that ~5% of the variance is at the national level (L2), and that ~95% of the variance is at the individual (L1) level.
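Because the same calculation is repeated for every new pair of variables, it can be handy to wrap it in a small helper. This is a minimal sketch; the function name is an assumption, and the two inputs are the Tau (intercept variance) and Sigma_squared values reported in the output.

```python
def intraclass_correlation(tau00, sigma2):
    """Share of total variance that lies between groups: tau00 / (tau00 + sigma2)."""
    if tau00 < 0 or sigma2 < 0:
        raise ValueError("variance components cannot be negative")
    total = tau00 + sigma2
    if total == 0:
        raise ValueError("total variance is zero")
    return tau00 / total

# Values taken from the variance components output.
icc = intraclass_correlation(tau00=4.21449, sigma2=82.48620)
level2_share = round(icc * 100, 1)   # percent of variance at the national level
level1_share = round(100 - icc * 100, 1)  # percent at the individual level
```

The two shares always sum to 100%, so a small ICC means that most of the variation in the outcome lies between individuals rather than between nations.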

Let’s try some different WVS examples

• Family important [v4] -> Work important [v8]

~6% of the variance is at the national level.

• Democracy isn’t good [v171] -> Having army rule [v166]

~57% of the variance is at the national level.

CRRC DI

3 countries (AM, AZ, and GE) are technically too few groups to compare, but we can compare regions within a country.

First, Armenia only, sort by quadrant.

What variables would differ by quadrant?

• English language knowledge level [e9_2] -> political cooperation with U.S. [p15_6]

3% of variance is at the quadrant level

Your own data

Your own data set:

• Needs to have 10+ groups

• Continuous variables or categorical, but preferably with a larger scale

If you don’t have your own data, you’re welcome to use the WVS or CRRC DI. Or, if there is a topic that you’re interested in, get a data set before tomorrow or give me a sense of your interests and I’ll find one.

Other datasets freely available

http://www.icpsr.umich.edu/ : archive of thousands of datasets

http://unstats.un.org/unsd/default.htm : United Nations Statistics

http://www.worldbank.org/data : World Bank data

