Lei Sun

University of Toronto, Department of Statistical Sciences

“Testing a vector of parameters with applications to genetic association studies”

Date: Thursday, March 30, 2017

In many scientific studies, a vector of location and/or scale parameters may be of inferential interest. For example, genetic association studies between an outcome and multiple genetic variants (also known as gene-based or set-based association analyses) simultaneously investigate a vector of location parameters. In another setting, where heteroscedasticity may be present due to unaccounted-for interaction effects, joint analyses of both location and scale parameters can be more powerful. In the context of gene-based association analyses, we show that many existing methods fall into one of two classes, linear statistics or quadratic statistics, and that each class is powerful in only part of the high-dimensional parameter space (Derkach et al. 2014, Statistical Science). Consequently, we derive a more robust class of hybrid test statistics by combining evidence from the competing but complementary linear and quadratic test statistics. Similarly, we develop a joint location-scale testing framework to test the global null hypothesis of no mean and no variance heterogeneity (Soave et al. 2015, American Journal of Human Genetics; Soave and Sun, in press, Biometrics). Fisher’s method is commonly used in meta-analyses to combine p-values of the same test applied to different samples; here we apply it to combine p-values of different tests (i.e., linear and quadratic, or location and scale) applied to the same sample. In both settings, we show that the two classes of tests are asymptotically independent of each other under the global null hypothesis. Thus, we can evaluate the significance of the resulting Fisher’s test statistic using the chi-squared distribution with four degrees of freedom, with no need for resampling; this is a desirable feature for analyzing big data. In addition to theoretical results, we also provide empirical results from extensive simulation studies and multiple data applications.
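
As a rough illustration of the combination step described in the abstract, the sketch below applies Fisher’s method to two p-values. It is not the speaker’s implementation: the function name `fisher_combine_two` and its arguments are hypothetical, and the two input p-values are simply assumed to be independent under the null. With two p-values, Fisher’s statistic W = -2(ln p_a + ln p_b) is referred to a chi-squared distribution with four degrees of freedom, whose upper tail has the closed form exp(-W/2)(1 + W/2), so no statistics library is needed.

```python
import math


def fisher_combine_two(p_a: float, p_b: float) -> float:
    """Combine two p-values with Fisher's method (hypothetical helper).

    Assumes the two p-values are independent under the global null,
    e.g. one from a linear test and one from a quadratic test.
    """
    if not (0.0 < p_a <= 1.0 and 0.0 < p_b <= 1.0):
        raise ValueError("p-values must lie in (0, 1]")
    # Fisher's statistic: chi-squared with 2 * 2 = 4 degrees of freedom
    # under the null when the two p-values are independent.
    w = -2.0 * (math.log(p_a) + math.log(p_b))
    # Upper tail of chi-squared with 4 df: exp(-w/2) * (1 + w/2).
    return math.exp(-w / 2.0) * (1.0 + w / 2.0)


if __name__ == "__main__":
    # Illustrative inputs only, not results from the talk.
    print(fisher_combine_two(0.05, 0.05))
```

Two individually borderline p-values of 0.05 combine to roughly 0.0175 here, which shows how evidence from two complementary tests can accumulate.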
