Project

Advanced screening techniques for ultra high-dimensional data (R-4478)

Statisticians are frequently confronted with massive data sets from a wide range of scientific domains. Fields such as genomics, neuroscience, finance and earth sciences differ in subject matter but share a common theme: they rely heavily on extracting useful information from massive data in which the number of predictors can be huge compared with the sample size. In such a situation, the parameters are identifiable only when the number of predictors relevant to the response is small. To exploit this sparsity, variable selection techniques are needed. Sure independence screening (SIS) is a powerful method for variable selection when the number of explanatory variables is massive. In this project we study alternatives to the existing SIS method that can be used for massive data with additional complications.
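As a rough illustration of the screening idea for a linear model, the sketch below ranks predictors by the absolute value of their marginal correlation with the response and keeps the top d. It is a minimal sketch on simulated data and not the method developed in this project. The function names (marginal_correlation, sis_screen, simulate), the simulated design, and the choice d = 10 are assumptions made for illustration only.

```python
import math
import random


def marginal_correlation(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no marginal signal
    return cov / math.sqrt(var_x * var_y)


def sis_screen(columns, y, d):
    """Return the (sorted) indices of the d predictors with the largest |correlation| with y."""
    scores = [abs(marginal_correlation(col, y)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:d])


def simulate(n=100, p=500, seed=0):
    """Hypothetical sparse setting: p > n, only predictors 0 and 1 affect y."""
    rng = random.Random(seed)
    columns = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
    y = [3 * columns[0][i] - 3 * columns[1][i] + rng.gauss(0, 0.5)
         for i in range(n)]
    return columns, y


columns, y = simulate()
selected = sis_screen(columns, y, d=10)  # d = 10 is an arbitrary illustrative choice
print(selected)
```

The screening step only reduces 500 candidate predictors to 10; in practice a second-stage variable selection method would normally be applied to the retained set.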

Date: 1 Feb 2013 → 31 Dec 2013

Keywords: big data, variable selection

Disciplines: Mathematical sciences and statistics
