# How to center in multilevel models

Ever thought about centering your variables before running an regression based analysis? From my personal experiences with students, chances are high that you haven’t. Although a useful data transformation procedure for many different types of statistical analyses, centering is seldom taught in fundamental statistic courses. Although the mathematical principles behind it may seem arbitrary, its effects on results are oftentimes quite strong. Centering is useful in standard statistical modeling (e.g., OLS regression), but its usefulness is particularly evident in multilevel analyses. Although some great papers on this topic exits, I felt that there is not enough hands-on R code and a corresponding documentation of how to apply it to actual multilevel problems. In the following, I will hence discuss different centering approaches and how they affect the results in multilevel model while providing example R code. I did this mostly for myself, but maybe someone might find it useful, too?

So what does centering actually mean? It generally refers to establishing a meaningful zero point on scales that otherwise lack such a value.1 In communication science, for example, we often measure some sort of psychological concept (e.g., attitudes, concerns, feelings…) on rather arbitrary metrics (e.g., 1 = do not at all agree to 5 = Agree fully and completely). The range of a mean index computed from such variables only includes values between 1 and 5. Why is this problematic? Let us consider a basic linear regression example.

### Centering in linear regression analysis

We quickly simulate two correlated variables which we can use to test a classical centering approach. We use the function `rnorm()` to simulate a normally distributed variable `x` and then write a random regression formula for variable `y`

``````library(tidyverse) # for data wrangling
library(multilevel) # for simulating multilevel data
library(lme4) # for estimating multilevel models
library(texreg) # for better outputs
select <- dplyr::select # to avoid clashes with the MASS package

set.seed(1000)
x <- rnorm(100, 2, 0.5)
y <- 3 + 3*x + rnorm(100, 2, .65)
d <- cbind(x,y) %>%
as.tibble
summary(d)``````
``````##        x                y
##  Min.   :0.8388   Min.   : 7.363
##  1st Qu.:1.7204   1st Qu.:10.052
##  Median :2.0206   Median :11.000
##  Mean   :2.0082   Mean   :11.090
##  3rd Qu.:2.2858   3rd Qu.:12.068
##  Max.   :3.3350   Max.   :15.375``````

As we can see, both variables have a range above 0. Let us now estimate a simple linear regression model.

``````lm1 <- lm(y~x, d)

# Results
lm1\$coef %>%
round(2)``````
``````## (Intercept)           x
##        5.20        2.93``````

We get two coefficients. Whereas the second parameter is easily interpretable (1 unit increase in `x` results in an 2.93 unit increase in `y`), the intercepts remains somewhat mysterious. For now, let’s assume that `x` represents attitude towards environmental protection on a scale from 1 to 5 (with higher values indicating a positige attitude towards environmental protection) and `y` represents the frequency of taking public transport per month on a scale from 1 to 15. In this case, the intercept could be interpretated as follows: Participants with a value of `x` = 0 would take the bus 5.20 times per months. The problem is obvious: such a person does not exit in the sample. The intercept is not meaningful.

This is were centering comes into play. In this case, we could simply center around the population’s mean. This can be done by substracting the populations mean from each person’s individual value. Such a procedure does not change the variable per se and most importanty not the variables relation to `y`, yet the values due of course differ.

``````# Centering by substracting mean
d\$x.c <- d\$x-mean(d\$x)
``````## # A tibble: 6 x 2
##          x         x.c
##      <dbl>       <dbl>
## 1 1.777111 -0.23107874
## 2 1.397072 -0.61111789
## 3 2.020563  0.01237355
## 4 2.319694  0.31150460
## 5 1.606723 -0.40146679
## 6 1.807255 -0.20093426``````

We can plot both variables to get a clear understanding of their differences: The distribution remains the same (as it should), yet the means (shown as dashed lines in the plot below) differ. Whereas the original variable has mean close to 2, the centered variable (as it should) has mean of 0.

``````# Visualizing differences in the variables
d %>% gather(Variable, Value, c("x", "x.c")) %>%
ggplot(aes(x = Value,
group = Variable,
fill = Variable)) +
geom_density(alpha = .4) +
labs(x = "Variable X" ,
y = "Count") +
geom_vline(xintercept = mean(d\$x),
linetype = "dashed") +
geom_vline(xintercept = mean(d\$x.c),
linetype = "dashed") +
xlim(-1.5, 4)
`````` In a next step, we estimate the same linear regression model with the mean-centered `x.c` variable and compare the results with the model estimated above.

``````lm2 <- lm(y~x.c, d)

# Comparing result from lm1 and lm2
cbind(uncentered = lm1\$coef,
centered = lm2\$coef) %>%
round(2)``````
``````##             uncentered centered
## (Intercept)       5.20    11.09
## x                 2.93     2.93``````

As we can see, the relationship between both variables remains the same (which of course it should, because we don’t want to mess with that relationship). However, the intercept changed. It is still the value of `y` when `x` equals 0. But 0 in the second model equals the population’s mean. We can thus interpret: Participants with an average value of `x` (e.g., an average attitude toward environmental protection) have a value of 11.09 on the `y` variable (e.g., the frequency of using public transport). This is a meanigful value. A value that provides useful information about the central tendency of the studied sample.

Centering around the population’s mean is the most frequently used centering approach. It becomes particularly important when we include interaction terms in classical linear regression models. When we conduct such moderation analyses, we receive conditional effects that (who would have guessed) represent the relationship between variables for those participants who have a zero on the moderator variable. It thus makes sense to center the independent variables (including the moderator) to get more meaninfully interpretable coefficients.

Centering thus helps us to establish meanigful zero points which, in turn, affects our regression output. It is important to note that centering is not only meaningful for scales that measure some abstract construct on arbitrary metrics. When metric variables (e.g., age or TV use duration in minutes) are included in a regression model, we likewise get intercepts (or conditional effects in moderation analyses) that are bound to their original zero value (e.g., participants whose age is 0 years or who use the TV 0 minutes per day). Also bear in mind that we do not have to center around a population’s mean (although this approach is quite common). We could similarly centered around the minimum or the maximum, one standard deviation below or above the population’s mean, and so on. The appropriate centering strategy again depends on the question of interest.

### Centering in multilevel analyses

Although mean-centering is pretty straight-forward in simple linear regression models with non-hierarchical data, it becomes a bit more complex in multilevel models.2 For the following example, let us assume we conducted an experience sampling study in which 100 participants who answered 10 situational questionnaires (e.g., over the course of 5 days). Repeated measurements (level 1) are hence nested in persons (level 2). Of course, we could also think of other multilevel structures (e.g., pupils in schools, articles in newspapers, citizens in countries, diaries of persons…)

Let us further assume that we want to estimate the relationship between two level 1 variables that were assessed in each situational questionnaire (simulated as b = .50). As the assumption of independent observations is violated because situational measures within each person show higher correlations than situational measures between persons (expressed in the intraclass correlation coefficient = .30), we need to estimate a multilevel model. To simulate the data, we can use the package “multilevel” and the function “sim.icc”.

``````# Simulation of multilevel data with two level 1 variables
set.seed(15324)
d <- sim.icc(gsize = 10,
ngrp = 100,
icc1 = .30,
nitems = 2,
item.cor = .50)

# Renaming variables
d <- d %>%
rename(id = GRP,
dv = VAR1,
iv = VAR2) %>%
as.tibble
d``````
``````## # A tibble: 1,000 x 3
##       id         dv         iv
##    <dbl>      <dbl>      <dbl>
##  1     1 -2.6192245 -1.9856413
##  2     1 -4.4665198 -2.6625285
##  3     1 -2.6292235 -1.8483439
##  4     1 -2.0857113 -1.9805144
##  5     1 -1.4762146 -1.4971770
##  6     1 -1.3775514 -1.9684235
##  7     1 -2.3314514 -0.6334999
##  8     1 -2.5274275 -2.0188760
##  9     1 -0.6930214 -0.6905164
## 10     1 -1.4852694 -0.4586449
## # ... with 990 more rows``````

The resulting data set is organized in a long format: Each row represents one situational assessement. The `id` variable shows us which situational assessments belong to which person. We can quickly check how well the simulation worked and estimate the nullmodel and used the variance parameters to estimate the ICC.3

``````m0 <- lmer(dv ~ 1 + (1|id), d)

# Function to compute the intraclass correlation coefficient (ICC)
compute_icc <- function(lmer_object){
var_dat <- lmer_object %>% VarCorr %>% as.data.frame
icc <- var_dat\$vcov/(var_dat\$vcov+var_dat\$vcov)
return(icc)
}
compute_icc(m0) %>%
round(2)``````
``##  0.32``

The ICC of the dependent variable is .32 and thus very close to the value that we wanted to simulate. It means that about 32% of the variance in this variable may be explainable by interindividual differences (i.e., person-related characteristic). This also means that a larger part of the variance is potentially explainable by situational characteristics (e.g., by other variables measured on the level 1).

If we would have variables on level 2, we could center them around the population mean (just as we have done earlier in the linear regression example). Level 1 variables can be centered in two ways:4

1. Centering around the grand mean (Grand mean centering = GMC)
2. Centering around the person’s mean (also centering within cluster = CWC)

In the first case, we ignore that several situational assessment are clustered within persons. We hence just take the overal mean of all situations regardless of any interindividual differences. In the second case, we center around each person’s mean. Bear in mind that we have to create each person’s mean of the situational variable in order to be able to subtract it from the situational variable to create the person-mean centered variable. It is useful to keep the person’s mean as an additional variable, because we will need it later. Creating a person’s mean is also know as aggregation. We take a level 1 variable and aggregate it to the next higher level (in this case, the person’s level or level 2). As this new variable is essential a level 2 variable, we can should again center it around the grand mean.

``````d <- d %>%
# Grand mean centering (CMC)
mutate(iv.gmc = iv-mean(iv)) %>%
# Person mean centering (more generally, centering within cluster)
group_by(id) %>%
mutate(iv.cm = mean(iv),
iv.cwc = iv-iv.cm) %>%
ungroup %>%
# Grand mean centering of the aggregated variable
mutate(iv.cmc = iv.cm-mean(iv.cm))

# Comparing the results of the centering approaches
d %>% select(iv, iv.gmc, iv.cm, iv.cmc, iv.cmc) %>%
summary``````
``````##        iv               iv.gmc             iv.cm
##  Min.   :-4.11272   Min.   :-4.09607   Min.   :-1.57442
##  1st Qu.:-0.80006   1st Qu.:-0.78342   1st Qu.:-0.51000
##  Median :-0.03193   Median :-0.01529   Median : 0.04509
##  Mean   :-0.01665   Mean   : 0.00000   Mean   :-0.01665
##  3rd Qu.: 0.79000   3rd Qu.: 0.80665   3rd Qu.: 0.47743
##  Max.   : 3.78723   Max.   : 3.80388   Max.   : 1.48612
##      iv.cmc
##  Min.   :-1.55777
##  1st Qu.:-0.49336
##  Median : 0.06173
##  Mean   : 0.00000
##  3rd Qu.: 0.49407
##  Max.   : 1.50276``````

First we can see that both GMC (`iv.gmc`) and CWC (`iv.cwc`) can be used to establish a meaningful zero point. However, bear in mind that these zeros differ: In the GMC variable that value zero represents the overall mean across all situations. In the CWC variable, zero represents the average value across one person’s situations. It is important to understand that the variable representing the persons’ mean (`iv.cm`) and the variable that was person mean centered (`iv.cwc`) represents two parts of the variance that was originally contained in the raw variable (`iv`). The first is the trait or stable component that stays the same across all situations. It differs between persons but not between situations. The second represents situational deviations from this person mean across the 10 situations. We thus have separated the variable iv into level 2 and level 1 variance.

We use these newly created variables and estimate several multilevel models. By the way, we can use the great package “texreg”5 to produce a really neat looking html-table.

``````
m1.nc <- lmer(dv ~ 1 + iv + (1|id), d) # no centering
m1.gmc <- lmer(dv ~ 1 + iv.gmc + (1|id), d) # grand mean centering
m1.cwc <- lmer(dv ~ 1 + iv.cwc + (1|id), d) # centering within cluster
m1.cmc <- lmer(dv ~ 1 + iv.cm + iv.cwc + (1|id), d) # including the person's mean (cluster's mean)

# Creating a nice output (use screenreg() to get a readable output in the console)
htmlreg(list(m1.nc, m1.gmc, m1.cwc, m1.cmc),
single.row = T,
stars = numeric(0),
caption = "",
custom.note = "Model 1 = Uncentered predictor,
Model 2 = grand-mean centered predictor,
Model 3 = person-mean centered predictor,
Model 4 = person-mean centered predictor and aggregated person mean")``````
Model 1 Model 2 Model 3 Model 4
(Intercept) 0.07 (0.04) 0.06 (0.04) 0.06 (0.07) 0.07 (0.04)
iv 0.57 (0.03)
iv.gmc 0.57 (0.03)
iv.cwc 0.50 (0.03) 0.50 (0.03)
iv.cm 0.90 (0.05)
AIC 2616.85 2616.85 2719.30 2578.19
BIC 2636.48 2636.48 2738.93 2602.73
Log Likelihood -1304.43 -1304.43 -1355.65 -1284.09
Num. obs. 1000 1000 1000 1000
Num. groups: id 100 100 100 100
Var: id (Intercept) 0.11 0.11 0.47 0.05
Var: Residual 0.72 0.72 0.71 0.71
Model 1 = Uncentered predictor,
Model 2 = grand-mean centered predictor,
Model 3 = person-mean centered predictor,
Model 4 = person-mean centered predictor and aggregated person mean

Several things in the results table are notworthy: First, Model 1 and 2 only differ with regard to the intercept coefficients. Due to the GMC, however, the intercept in Model 2 can be interpreted more meaningfully. It is the value of the dependant variable when the independent variable equals the mean across all situations. The relationship coefficient is unfortunately pretty hard to interpret in both models because it represents a combination of both trait (level 2) and state (level 1) influence of the independent variable on both variance components of the dependent variable. Furthermore, the model fit indices (e.g., AIC, BIC, etc.) show that both models are identical.

Second, we can observe that using only the CWC variable (Model 3) changes the coefficients and also the model fit. The intercept now represents the average value of dv when iv equals the person’s average value on iv. The relationship coefficients represents that situational effect of `iv` on the level 1 variance of `dv`. Hence a positive 1 unit deviation from the person’s mean results in a 0.50 increase in the dependent variable `dv`. It is important to bear in mind that we quantify only the situational effects in this model. It is hence not surprising that the model fit is a bit worse compared to the other two models.

Third, we can also include both components of the `iv` variable (`iv.cwc` and `iv.cm`) in one model (Model 4). In this case, we get two coefficients which represent both trait and state influences of `iv` on `dv`. We can now see that the stable variance component of `iv` is also positively related to `dv` (b = 0.90, SE = 0.05). In other word, if a person scores 1 unit higher on its mean of `iv` than another person, he or she will also generally score 0.90 units higher on the `dv` compared to the other person. The situational effect is — of course — similar to the one obtained in model 3. Interestingly, this last model (in which both variance components are neatly separated) fits better than model 1 and 2.

Finally, we can look at the amount of explained variance for each model.

``````# Variance reduction approach according to Hox, 2010, pp. 69-78
var_reduction = function(m0, m1){
library(tidyverse)
VarCorr(m0) %>%
as.data.frame %>%
select(grp, var_m0 = vcov) %>%
left_join(VarCorr(m1) %>%
as.data.frame %>%
select(grp, var_m1 = vcov)) %>%
mutate(var_red = 1 - var_m1 / var_m0)
}

# Create comparison table
cbind(M1 = var_reduction(m0, m1.nc)[,4],
M2 = var_reduction(m0, m1.gmc)[,4],
M3 = var_reduction(m0, m1.cwc)[,4],
M4 = var_reduction(m0, m1.cmc)[,4]) %>%
round(2)``````
``````##        M1   M2    M3   M4
## [1,] 0.76 0.76 -0.05 0.88
## [2,] 0.24 0.24  0.25 0.25``````

The table shows the explained variance on level 2 (first row) and on level 1 (second row). As we can see, the first model – which included the raw variable – explains 76% on the person level and 24% on the situational level. The second model in which we used the GMC variable provides similar results. This is not surprising because we did not change the variable per se. We only changed their metrics. The third model, however, provides a different results. We only included the CWC variable which represents merely situational deviations from the person’s mean. It should hence not be too surprising that it explains only variance on level 1. The negative variance on level 2 should not be to alarming as this is a known problem in multilevel analyses 6. Finally, the last model – which includes both the person’s mean (the aggregated variable `iv.cm`) and the situational deviations (`iv.cwc`) – of course explains variance on both levels. Interestingly, however, it is able to explain more variance than model 1 and 2. This is due to the better model fit already identified by looking at the AIC and BIC which results from the clean seperation of both variance components and thus the reduction of covariance. This further emphasizes the usefulness of this last approach.

### Conclusion

If we are interested in relationships between level 1 variables, it makes sense to separate the dependant variable into level 1 and level 2 variance. We can do so by first aggregating the variable to the second level (i.e., estimating the cluster’s mean) and second, by using this aggregated variable to produce a cluster mean centered variable (i.e., substracting the cluster mean from the original variable) to produce the level 1 deviations from the cluster’s mean. This way, we can investigate both the influence of level 2 and level 1 variance (e.g., trait and state variance in experience sampling studies, school and pupil variance in school surveys, person and wave variance in panel studies, and so on) on the dependent variable and thereby acquire more precise estimates of the multilevel relationship.

#### Footnotes

1. e.g., Enders & Tofighi, 2007; Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Newbury Park, CA: Sage.
2. For more information on multilevel modelling, see e.g., Hox, J. J. (2010). Multilevel analysis: Techniques and applications (2nd ed.). Quantitative methodology series. New York: Routledge; Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). Advanced quantitative techniques in the social sciences series. Thousand Oaks, CA: Sage Publications; Gelman, G., & Hill, J. (2009). Data analysis using regression and multilevel/hierarchical models: Analytical methods for social research (11th ed.). Cambridge: Cambridge University Press.
3. Hox, 2010, p. 15.
4. Depending on the value that we center around, we could of course distinguish many more ways. Yet, in general, we either center on the first or the second level.
5. Leifeld, P. (2017). Package ‘texreg’. https://cran.r-project.org/web/packages/texreg/texreg.pdf
6. Hox, 2010, p. 72-73.

## 11 thoughts on “How to center in multilevel models”

1. Hello,
I am running a binomial logistic regression model and it turns out that one of my IVs has an extremely small Beta coefficient in the models such as 0.0000014 and this is difficult to report in the tables. This IV has extreme min and max values and therefore, a very high range. I thought of centering this variable while keeping other IVs at their original values, but I am not sure whether this is the right thing to do. Would you mind expressing your idea on this issue and maybe giving an advice?
Thank you very much in advance.

1. Hey Mleda,
In your particular case, I would not necessarily recommend centering. Instead, you should rescale your variable so that it is more interpretable. Let’s say you have an IV that has values from 1 to 1000 and your DV ranges from 1 to 5. Assuming a small effect size, a regression would probably reveal a very small unstandarized coefficient. Why is that so? It is because 1 change in the IV (e.g., 105 to 106) is not a big differences. So why not simply transform the IV so that the change that we get a coefficient for is more meaningfull. In this case, we could e.g., divide the IV by 100 and would thus get a variable ranging from 1 to 10. The resulting coefficient would then refer to a change of 100 in the IV. As long as you bear that in mind in your interpretation, everythings fine.

In your particular case, I would rescale the IV and then compute Odds ratio (Exp(b)). Unstandardized coefficients in logistic regressions are hard to interpret anyway.

Hope this helps! 😉

2. Could it be possible that you interchanged some variables at and after the summary? CMC is summarized and not mentioned, while CWC is not summarized but mentioned. Furthermore, you use the CM in model 4, but calling it the *centered* aggregated mean, which sounds like it should be CMC? Great subject anyway, I am just a bit confused.

1. EDIT: you named the model variables correctly. Then what’s left over of my confusion is the summary part. Where did we lose CMC on the line? I think this is a more valuable variable if one implements interactions, compared to the absolute CM.

1. Dear Hub, please excuse the late reply. I am sorry for your confusion. I just realized that my model naming convention (m1.nc, m1.gmc…) is indeed annoyingly confusing and unnecessary. And you’re right, I used CM (cluster mean or person mean) instead of CMC (grand-mean centered cluster mean or person mean). Although the results should not change (at least not the resulting slope), it would change the intercept. And I do agree that it many cases, using a centered person-mean makes a lot of sense.

Will update as soon as I have some time. Thanks for your comment! 😉

1. I spot a small error in this line “d %>% select(iv, iv.gmc, iv.cm, iv.cmc, iv.cmc) %>%”, the first “iv.cmc” should be “iv.cwc”. I guess this could explain Hub’s question a bit.

3. Amazing tutorial on the effects of centering, Philipp! Very clear and easy to follow. In fact, it’s probably the only tutorial that includes with clear and well-documented R code I can find online. Do you know of any other tutorials or resources (with R code) that explain the effects of centering? Your tutorial is incredible, but it’s always good to read explanations from different perspectives!

4. Very clear and useful explanation!! Thank You!
I have one question: When I create level1 interaction terms is it wrong to create the interaction term from the raw score and then center it later around the group mean? I have multilevel data from employees nested in departments, and the two versions (centering the L1 variables before or after creating the interaction term) changes the results quite a lot! I am curious what you think about this.

5. Dear Phillip,

Thank you for such a great post, I think that’s a great explanation.

I just have one comment regarding mean centering: is not the same while using a wide format dataset compared with a long format dataset.

Thus, I share with you my code to perform gran mean centering in a wide format dataset (gran mean centering is computed for each wave):

library(dplyr)

# Grand mean centering function
gmc <- function (column) {
column – mean(column, na.rm=TRUE)
}

# Mean centering main predictor: "predict1"
data %
mutate(w1_predict1_gmc = gmc(w1_predict1),
w2_predict1_gmc = gmc(w2_predict1),
w3_predict1_gmc = gmc(w3_predict1),
)

# Show new mean values together with old ones
data %>% select(w1_predict1, w1_predict1_gmc, w2_predict1, w2_predict1_gmc, w3_predict1, w3_predict1_gmc) %>%
summary

1. Dear Armando,

yes, it is of course not the same. Thanks for the extra code .

6. Really nice and interesting post. Keep posting. Thanks for sharing.

360DigiTMG