Statistical Challenges in Genetic Analysis of Biobank Data

Note unusual time

With Hongyu Zhao (Yale University)

The past two decades have seen great advances in human genetics, with hundreds of thousands of genomic regions identified as associated with thousands of traits and diseases through genome-wide association studies (GWAS) that collect phenotype and genotype data from large cohorts and biobanks. For example, the UK Biobank has over 500,000 participants, and the Million Veteran Program in the US has recruited more than 900,000 veterans. These cohorts offer rich phenotypes (e.g., thousands of clinical traits, lab test results, imaging data, and wearable device data) and omics data (e.g., genotype data, whole exome sequencing, whole genome sequencing, gene expression, epigenetics, proteomics, and metabolomics data). These data present great opportunities for identifying functional genes and variants for different traits and diseases, inferring the specific tissues and cell types relevant to a trait, characterizing the genetic architecture of complex diseases, developing disease risk prediction models that capture the joint effects of genetic and environmental factors, investigating genetic similarities and differences across groups (e.g., different ancestral populations), and studying causal relationships among diseases and traits. In this presentation, we will review the statistical methods that have been developed to address the challenges these analyses raise, and the significant gaps that remain in analyzing and interpreting these rich data.
