Estimation methods for generalized linear mixed models with binary outcomes from small clusters

Book Contribution - Book Abstract Conference Contribution

Generalized linear mixed models (GLMMs) are widely used to model longitudinal and clustered data in medical research, the social and behavioural sciences, and many other disciplines. Statistical inference for GLMMs, however, is hampered by the random effects: the likelihood function requires integrating these effects out of the joint density of the responses and random effects, an integral that is analytically intractable except in a few special cases. To tackle this intractability, numerous likelihood-based approximation methods have been proposed. The Laplace approximation is one of the most popular. Alternatives are penalized quasi-likelihood (PQL), which uses Taylor expansions to reduce the estimation of a GLMM to that of an approximating linear mixed model, and adaptive Gaussian quadrature (AGQ). A second class of methods, next to these three likelihood-based approximations, takes a Bayesian approach in which MCMC methods, e.g. Gibbs sampling, are used to make inferences from the posterior distribution of the parameters. Bayesian methods show good frequentist properties when the model is correct, but they are known to be computationally intensive. To reduce this burden, hybrid methods using integrated nested Laplace approximations (INLA) were recently proposed to approximate the posterior marginals of latent Gaussian models; they have been shown to require far less computation than MCMC algorithms. A third class of methods originates in the Structural Equation Modelling (SEM) framework, where a limited-information diagonally weighted least squares (DWLS) estimation procedure has been suggested. In this presentation, we focus on the analysis of binary clustered data with small cluster sizes, a setting that is known to be especially challenging for the available GLMM methods.
Such data structures may arise, for example, from crossover studies or dyadic studies with binary outcomes. Our aim is to explore how the methods described above perform in this setting, as implemented in the statistical computing environment R. More specifically, we consider the following functions from their respective R packages: glmer from lme4 (Laplace, AGQ), glmmPQL from MASS (PQL), MCMCglmm from MCMCglmm (MCMC), inla from R-INLA (hybrid), and sem from lavaan (DWLS). Since the performance of many of these methods has so far been assessed only in isolation or within a single class of methods, we present an overarching comparison through simulation studies. The approaches are compared in terms of bias, mean squared error and coverage. These criteria are examined across different sample sizes (with fixed cluster size), different intra-cluster correlations, dichotomous versus continuous predictors, within-cluster and between-cluster predictors with varying effect sizes, and different event rates.
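
As a rough illustration of the kind of simulation design described in the abstract, the following Python sketch generates binary outcomes for small clusters from a random-intercept logistic model and computes the three comparison criteria. It is not the authors' code (the analyses in the abstract use R). The function names simulate_clusters, latent_icc and summarize, the default parameter values, and the choice of a single dichotomous within-cluster predictor are assumptions made for illustration. The model-fitting step is deliberately left out: the estimates and standard errors passed to summarize would come from whichever estimation method is being evaluated.

```python
import numpy as np


def simulate_clusters(n_clusters, cluster_size=2, beta0=-0.5, beta1=1.0,
                      sigma_b=1.0, rng=None):
    """Simulate binary outcomes from a random-intercept logistic model.

    logit P(y_ij = 1 | b_i) = beta0 + beta1 * x_ij + b_i,  b_i ~ N(0, sigma_b^2)
    All default values are illustrative assumptions.
    """
    rng = np.random.default_rng(rng)
    # Dichotomous within-cluster predictor (e.g. treatment in a crossover study)
    x = rng.integers(0, 2, size=(n_clusters, cluster_size))
    # One random intercept per cluster, shared by all members of that cluster
    b = rng.normal(0.0, sigma_b, size=(n_clusters, 1))
    eta = beta0 + beta1 * x + b
    p = 1.0 / (1.0 + np.exp(-eta))
    y = rng.binomial(1, p)
    return x, y


def latent_icc(sigma_b):
    """Intra-cluster correlation on the latent (logistic) scale."""
    return sigma_b**2 / (sigma_b**2 + np.pi**2 / 3.0)


def summarize(estimates, std_errors, truth, z=1.96):
    """Bias, mean squared error and Wald-interval coverage over replications."""
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    bias = estimates.mean() - truth
    mse = np.mean((estimates - truth) ** 2)
    lower = estimates - z * std_errors
    upper = estimates + z * std_errors
    coverage = np.mean((lower <= truth) & (truth <= upper))
    return {"bias": bias, "mse": mse, "coverage": coverage}
```

In a full study, simulate_clusters would be called once per replication for each combination of design factors (number of clusters, sigma_b and hence intra-cluster correlation, predictor type, effect size, event rate via beta0), each data set would be fitted with every method, and summarize would be applied to the collected estimates of beta1.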
Book: 10th International Multilevel Conference, Abstracts
Number of pages: 1
Publication year: 2015