A Multivariate Negative-Binomial Model with Random Effects for Differential Gene-Expression Analysis of Correlated mRNA Sequencing Data

Experimental designs such as matched-pair or longitudinal studies yield mRNA sequencing (mRNA-Seq) counts that are correlated across samples. Most approaches to the analysis of correlated mRNA-Seq data are restricted to a specific design and/or to balanced data (with the same number of samples in each group). We propose a model that is applicable to correlated mRNA-Seq data of different types: paired, clustered, longitudinal, or others. Any combination of explanatory variables, as well as unbalanced data, can be handled within the proposed modeling framework. The model assumes that the exon counts of a particular gene in an individual sample jointly follow a multivariate negative-binomial distribution. Additional correlation between exon counts, for example among individual samples within the same pair or cluster, is taken into account by including a normally distributed cluster-level random effect in the model. An interesting feature of the model is that it provides an explicit expression for the marginal correlation between exon counts at different levels. The performance of the model is evaluated in a simulation study and in an analysis of two real-life data sets: a paired mRNA-Seq experiment for 24 patients with clear-cell renal-cell carcinoma and a longitudinal mRNA-Seq experiment for 29 patients with Lyme disease.
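The two levels of correlation described in the abstract can be illustrated with a small simulation. The following is only a minimal sketch under assumptions that are not taken from the paper: the multivariate negative-binomial distribution is generated as a Poisson mixture with a gamma factor shared by all exons of a sample, the cluster-level normal random effect enters on the log scale, and the function name `simulate_counts` and all parameter values are hypothetical.

```python
import numpy as np


def simulate_counts(n_clusters=200, samples_per_cluster=2,
                    exon_means=(50.0, 80.0, 120.0),
                    dispersion=0.2, sigma=0.5, seed=0):
    """Simulate exon counts of one gene (illustrative parameterization only).

    Returns an integer array of shape
    (n_clusters, samples_per_cluster, n_exons).
    """
    rng = np.random.default_rng(seed)
    mu = np.asarray(exon_means, dtype=float)

    # Cluster-level normal random effect (assumed to act on the log scale).
    b = rng.normal(0.0, sigma, size=n_clusters)

    # Sample-level gamma factor with mean 1 and variance `dispersion`.
    # Sharing it across the exons of a sample makes their counts jointly
    # negative-binomial given the cluster effect.
    g = rng.gamma(shape=1.0 / dispersion, scale=dispersion,
                  size=(n_clusters, samples_per_cluster))

    # Poisson rate for every cluster, sample, and exon.
    rate = mu[None, None, :] * np.exp(b)[:, None, None] * g[:, :, None]
    return rng.poisson(rate)


counts = simulate_counts()

# Correlation between two exons of the same sample.
within_sample = np.corrcoef(counts[:, 0, 0], counts[:, 0, 1])[0, 1]

# Correlation between the same exon in two samples of the same cluster.
within_cluster = np.corrcoef(counts[:, 0, 0], counts[:, 1, 0])[0, 1]

print(within_sample, within_cluster)
```

In this sketch the exons of one sample share both the gamma factor and the cluster effect, whereas two samples of the same cluster share only the cluster effect, so the within-sample correlation is expected to exceed the within-cluster correlation.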
Journal: Journal of Computational Biology
ISSN: 1066-5277
Issue: 12
Volume: 26
Pages: 1339 - 1348
Year of publication: 2019
Keywords: correlated data