Project

Statistically and Computationally Efficient Hypothesis Tests for Similarity and Dependency

The dissertation presents novel, statistically and computationally efficient hypothesis tests for relative dependency, similarity, and precision matrix estimation. The key methodology adopted in this thesis is the class of U-statistic estimators, which allows minimum-variance unbiased estimation of a parameter. We make use of the asymptotic distributions and strong consistency of U-statistic estimators to develop novel non-parametric statistical hypothesis tests.
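As a minimal illustration of what a U-statistic is (not taken from the thesis), the sketch below averages a symmetric kernel over all pairs of distinct sample points. The helper names `u_statistic` and `variance_kernel` are hypothetical; the kernel h(x, y) = (x - y)^2 / 2 is the textbook choice whose U-statistic equals the unbiased sample variance.

```python
import itertools

import numpy as np


def u_statistic(sample, kernel, order=2):
    """Average a symmetric kernel over all subsets of `order` distinct points."""
    values = [kernel(*subset) for subset in itertools.combinations(sample, order)]
    return float(np.mean(values))


def variance_kernel(x, y):
    """Symmetric kernel whose expectation is the population variance."""
    return 0.5 * (x - y) ** 2


# Synthetic data, used only for illustration.
rng = np.random.default_rng(0)
sample = rng.normal(loc=1.0, scale=2.0, size=50)

u_var = u_statistic(sample, variance_kernel)

# The order-2 U-statistic with this kernel coincides with the usual
# unbiased sample variance (denominator n - 1).
print(u_var, np.var(sample, ddof=1))
```

The brute-force loop over pairs costs quadratic time in the sample size; it is written this way for readability rather than speed.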

In the first part of the thesis, we focus on developing a novel non-parametric statistical hypothesis test for relative dependency. Tests of dependence are important tools in statistical analysis. For many problems in data analysis, however, the question of whether dependence exists is secondary: there may be multiple dependencies, and the question becomes which dependence is the strongest. We present a statistical test which determines whether one variable is significantly more dependent on a first target variable or a second. Dependence is measured via the Hilbert-Schmidt Independence Criterion (HSIC).
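To make the dependence measure concrete, here is a rough sketch of an unbiased, quadratic-time HSIC estimate computed between one source variable and two candidate targets. It assumes Gaussian kernels with a fixed bandwidth and uses the standard unbiased estimator built from Gram matrices with zeroed diagonals; the function names and the synthetic data are hypothetical. It only compares the two empirical values and does not reproduce the thesis's significance test, which additionally requires the joint asymptotic distribution of the two estimates.

```python
import numpy as np


def gaussian_gram(x, bandwidth=1.0):
    """Gram matrix of a Gaussian kernel (bandwidth is an assumed choice)."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    sq_norms = np.sum(x ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * x @ x.T
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth ** 2))


def hsic_unbiased(x, y, bandwidth=1.0):
    """Unbiased HSIC estimate from paired samples, in O(n^2) time."""
    n = len(x)
    if n < 4:
        raise ValueError("the unbiased estimator needs at least 4 samples")
    K = gaussian_gram(x, bandwidth)
    L = gaussian_gram(y, bandwidth)
    np.fill_diagonal(K, 0.0)
    np.fill_diagonal(L, 0.0)
    trace_term = np.sum(K * L)  # tr(KL), since K and L are symmetric
    total_term = K.sum() * L.sum() / ((n - 1) * (n - 2))
    cross_term = 2.0 * (K.sum(axis=0) @ L.sum(axis=1)) / (n - 2)  # 1'KL1 term
    return (trace_term + total_term - cross_term) / (n * (n - 3))


# Synthetic data: target_a depends on the source, target_b does not.
rng = np.random.default_rng(0)
source = rng.normal(size=200)
target_a = source + 0.1 * rng.normal(size=200)
target_b = rng.normal(size=200)

hsic_a = hsic_unbiased(source, target_a)
hsic_b = hsic_unbiased(source, target_b)
print(hsic_a, hsic_b, hsic_a - hsic_b)
```

A positive difference suggests the source is more dependent on the first target, but deciding whether that difference is statistically significant is exactly what the test described above is for.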

The thesis also addresses the problem of model selection. Probabilistic generative models provide a powerful framework for representing data. Model selection in this generative setting can be challenging, particularly when likelihoods are not easily accessible. To address this issue, we provide a novel non-parametric hypothesis test of relative similarity, which tests whether a first candidate model generates samples significantly closer to a reference validation set than a second candidate model does. Our model selection criterion is based on the Maximum Mean Discrepancy (MMD) and measures the distance of the generated samples to some reference target set.
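The criterion can be sketched as follows, under stated assumptions: a Gaussian kernel with a fixed bandwidth, the standard unbiased quadratic-time estimate of the squared MMD, and two hypothetical "models" that are simply shifted Gaussians. The function names are illustrative, and the code only compares the two empirical distances; the significance test itself relies on the asymptotic distribution of their difference and is not reproduced here.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between two samples (bandwidth is assumed)."""
    a = np.asarray(a, dtype=float).reshape(len(a), -1)
    b = np.asarray(b, dtype=float).reshape(len(b), -1)
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth ** 2))


def mmd2_unbiased(x, y, bandwidth=1.0):
    """Unbiased estimate of the squared MMD between samples x and y."""
    m, n = len(x), len(y)
    k_xx = gaussian_kernel(x, x, bandwidth)
    k_yy = gaussian_kernel(y, y, bandwidth)
    k_xy = gaussian_kernel(x, y, bandwidth)
    # Within-sample terms exclude the diagonal (i == j) to stay unbiased.
    term_xx = (k_xx.sum() - np.trace(k_xx)) / (m * (m - 1))
    term_yy = (k_yy.sum() - np.trace(k_yy)) / (n * (n - 1))
    return term_xx + term_yy - 2.0 * k_xy.mean()


# Synthetic stand-ins for a reference set and two candidate generators.
rng = np.random.default_rng(0)
reference = rng.normal(loc=0.0, size=300)
model_a_samples = rng.normal(loc=0.1, size=300)  # close to the reference
model_b_samples = rng.normal(loc=1.5, size=300)  # far from the reference

mmd_a = mmd2_unbiased(model_a_samples, reference)
mmd_b = mmd2_unbiased(model_b_samples, reference)
print(mmd_a, mmd_b)
```

Because the estimate is unbiased, it can be slightly negative when the two samples come from nearly identical distributions.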

The resulting tests of relative dependence and relative similarity are consistent and unbiased (being based on U-statistics) and can be computed in quadratic time.

Finally, a novel method for estimating the precision matrix is proposed. Methods for structure discovery in the literature typically make restrictive distributional or sparsity assumptions that may not apply to a data sample of interest, and direct estimation of the uncertainty of an estimate of the precision matrix for general distributions remains challenging. Consequently, we derive a new test that makes use of results for U-statistics and applies them to the covariance matrix. The resulting test enables one to answer with statistical significance whether an entry in the precision matrix is non-zero, and convergence results are known for a wide range of distributions. The computational complexity is linear in the sample size.

Date: 20 Nov 2015 → 30 Jan 2017
Keywords: hypothesis testing, statistical dependency
Disciplines: Nanotechnology, Design theories and methods
Project type: PhD project