Publication

Data Representativeness: Issues and Solutions

Journal Contribution - Journal Article

In its control programmes on maximum residue level compliance and exposure assessment, EFSA requires participating countries to submit results from specific numbers of food item samples analyzed in each country. These data are used to obtain estimates such as the proportion of samples exceeding the maximum residue limits and, for exposure assessment, the mean and maximum residue concentration per food item. An important consideration is the design and analysis of the programmes. In this report, we combine elements of survey sampling methodology and statistical modeling into a benchmark framework for the programmes, from the translation of research questions into statistical problems through to the statistical analysis and interpretation. Particular focus is placed on issues that could affect the representativeness of the data, and remedial procedures are proposed. For example, in the absence of information on the sampling design, a sensitivity analysis across a range of designs is proposed. To address selection bias, weighted generalized linear mixed models, and generalized linear mixed models combining both conjugate and normal random effects, are proposed. Likelihood-based analysis methods are proposed to address missing and censored data problems. Suggestions for improving the design and analysis of the programmes are also identified and discussed. For instance, stratified sampling methodology is proposed for determining both the total number of samples and their allocation to the participating countries. Throughout the report, the proposed statistical analysis models properly take into account the hierarchical (and thus correlated) structure in which the data are collected.
Journal: EFSA Supporting Publications
ISSN: 2397-8325
Issue: 2
Volume: 12
Pages: 1 - 159
Number of pages: 159
Publication year: 2015
Keywords: bias, censoring, clustering, likelihood, missing data, stratification, linear mixed models, generalized linear mixed models
Accessibility: Open