Degree Name

Doctor of Philosophy (PhD)

Document Type

Doctoral Dissertation




Education Sciences

First Advisor

Dr. Xin Ma


To improve research methodology for comparing group characteristics, two advanced multilevel models were developed and introduced, allowing a deeper and more refined look at sex differences in reading achievement.

The first model was a restricted multivariate multilevel model, with individuals nested within institutions, for estimating institutional effects on multiple groups of individuals. Using data from the 2009 OECD Programme for International Student Assessment (PISA), an application illustrated how to examine whether school reading environment had the same effect on the reading achievement of boys and girls. In this two-level model, level 1 was a multivariate model representing students’ average reading achievement for each sex group (through two dichotomous variables), and level 2 consisted of two linear regression equations, one for boys and one for girls. The effect of each of five school reading environment variables (diversity of reading, enjoyment of reading, stimulators of reading, daily reading hours, and online reading hours) was constrained to be the same for boys and girls. A significance test was performed to examine whether each restriction held. In the PISA 2009 data, the effects of enjoyment of reading and online reading hours on reading achievement differed significantly between boys and girls. The model is an effective omnibus statistical technique for examining institutional effects on multiple groups of individuals: it unmasks group-specific dynamics in institutional effects, is broadly applicable, and is convenient to execute.
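The structure described above can be written out as a short set of equations. This is only a sketch under assumed notation: the symbols (B and G for the boy and girl indicators, W for the five school reading environment variables, gamma for the level-2 coefficients) are illustrative and are not taken from the dissertation itself.

```latex
% Level 1: student i in school j; B_{ij} and G_{ij} are the two dichotomous
% (0/1) indicators for boys and girls, so there is no separate intercept.
\begin{align}
Y_{ij} &= \beta_{1j} B_{ij} + \beta_{2j} G_{ij} + r_{ij} \\
% Level 2: one regression equation per sex group, with the five school
% reading environment variables W_{qj}, q = 1, \dots, 5 (assumed notation).
\beta_{1j} &= \gamma_{10} + \sum_{q=1}^{5} \gamma_{1q} W_{qj} + u_{1j} \\
\beta_{2j} &= \gamma_{20} + \sum_{q=1}^{5} \gamma_{2q} W_{qj} + u_{2j} \\
% Restriction examined for each school variable q:
H_0 &: \gamma_{1q} = \gamma_{2q}
\end{align}
```

In this notation, rejecting the restriction for a given q would correspond to a school effect that differs between boys and girls, as reported for enjoyment of reading and online reading hours.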

The second model was a multilevel model with a heterogeneous sigma squared function for comparing the distributional properties of multiple groups. A good understanding of distributional properties across groups is an essential part of making group comparisons, and the combination of central tendency and variability is the preferred way to describe (and compare) distributions across groups. An advanced multilevel model with an embedded analytic function, referred to as heterogeneous sigma squared, was developed to test for differences in means and variances across multiple groups at the same time, which made it convenient to examine distributional properties comprehensively and simultaneously. Using the 2009 OECD PISA data, an application illustrated how to examine the distributional properties of reading achievement for boys and girls. In the two-level model, level 1 had sex as the categorical independent variable (dummy coded as boys = 0 and girls = 1), and level 2 had the random intercept modeled by school background variables. Girls performed significantly better than boys in reading achievement, but boys and girls had similar variance in reading achievement. A violin plot revealed that girls had a higher mean and occupied the very top of the distribution of reading achievement, while boys had a lower mean and occupied the very bottom. The distribution for girls was near normal, but the distribution for boys had two peaks, indicating that it was not normal. The full model explained nearly a third of the variance in reading achievement.
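A compact way to read this second model is sketched below. The notation is assumed for illustration, and two choices in particular are assumptions rather than details from the abstract: the log-linear form for the level-1 variance, which is a common way to specify heterogeneous level-1 variance in multilevel software, and the fixed (non-random) slope for sex.

```latex
% Level 1: student i in school j; G_{ij} = 0 for boys, 1 for girls.
\begin{align}
Y_{ij} &= \beta_{0j} + \beta_{1j} G_{ij} + r_{ij},
  \qquad r_{ij} \sim N\!\left(0, \sigma^2_{ij}\right) \\
% Heterogeneous sigma squared: the level-1 variance depends on sex
% (log-linear form assumed here).
\ln \sigma^2_{ij} &= \alpha_0 + \alpha_1 G_{ij} \\
% Level 2: random intercept modeled by school background variables W_{qj};
% the sex slope is treated as fixed in this sketch.
\beta_{0j} &= \gamma_{00} + \sum_{q} \gamma_{0q} W_{qj} + u_{0j} \\
\beta_{1j} &= \gamma_{10}
\end{align}
```

Under this sketch, the test of gamma_10 = 0 compares the group means and the test of alpha_1 = 0 compares the group variances. The reported findings would correspond to a significant positive gamma_10 and a non-significant alpha_1.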

These advanced multilevel models can be easily extended to examine other equity issues in education. The author hopes that they will inspire statistical efforts to develop other advanced models. Results from such models, being based on more robust and precise empirical evidence, may promote more credible educational reforms by prompting a revisit of educational policies and practices concerning equity in education.
