Genome-wide genotyping in the UKHLS

Presenter: Bram Prins, Wellcome Trust Sanger Institute

Author: Bram Prins

Understanding Society provides a unique, deeply phenotyped resource to investigate genetic components and modifying environmental factors contributing to individual health, behaviour and wellbeing. To this end, for a subset of individuals who took part in a nurse health assessment, blood samples were taken and genomic DNA extracted. Of these, 10484 samples were genotyped at the Wellcome Trust Sanger Institute using the Illumina Infinium HumanCoreExome BeadChip Kit®, which includes a panel of > 240000 common and rare exonic markers in addition to > 250000 highly informative genome-wide tagging single nucleotide polymorphisms (SNPs).

Genotype calling was performed using the Illumina GenCall software. We then performed sample-level quality control (QC), excluding samples that met any of the following criteria: call rate < 98%; autosomal heterozygosity outliers (> 3 SD from the mean); gender mismatches; duplicates, as established by identity by descent (IBD) analysis (PI_HAT > 0.9); and ethnic outliers, as determined by combining the data with 1000 Genomes Project data and carrying out IBD analysis followed by multidimensional scaling. In total, 9965 samples passed QC.
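
The sample-level filters above can be illustrated with a minimal Python sketch. It is not the pipeline used for this study: the record layout (`id`, `call_rate`, `het_rate`, `reported_sex`, `genetic_sex`, `ancestry_outlier`) and the function name `sample_qc` are hypothetical, the ancestry flag is assumed to have been derived beforehand from the multidimensional scaling step, and the rule for choosing which member of a duplicate pair to drop is an assumption, since the text does not state one.

```python
from statistics import mean, stdev

# Thresholds quoted in the text.
MIN_CALL_RATE = 0.98
MAX_HET_SD = 3.0
DUPLICATE_PI_HAT = 0.9


def sample_qc(samples, ibd_pairs):
    """Return {sample_id: [reasons]} for samples failing any filter.

    samples:   list of dicts with the hypothetical keys id, call_rate,
               het_rate, reported_sex, genetic_sex, ancestry_outlier.
    ibd_pairs: list of (id_a, id_b, pi_hat) tuples from an IBD analysis.
    """
    # Assumption: mean and SD of heterozygosity are taken over all input samples.
    het = [s["het_rate"] for s in samples]
    het_mean, het_sd = mean(het), stdev(het)
    by_id = {s["id"]: s for s in samples}
    failed = {}

    def fail(sample_id, reason):
        failed.setdefault(sample_id, []).append(reason)

    for s in samples:
        if s["call_rate"] < MIN_CALL_RATE:
            fail(s["id"], "low_call_rate")
        if abs(s["het_rate"] - het_mean) > MAX_HET_SD * het_sd:
            fail(s["id"], "heterozygosity_outlier")
        if s["reported_sex"] != s["genetic_sex"]:
            fail(s["id"], "sex_mismatch")
        if s["ancestry_outlier"]:
            fail(s["id"], "ancestry_outlier")

    for id_a, id_b, pi_hat in ibd_pairs:
        if pi_hat > DUPLICATE_PI_HAT:
            # Assumption: drop the pair member with the lower call rate.
            drop = min((id_a, id_b), key=lambda i: by_id[i]["call_rate"])
            fail(drop, "duplicate")

    return failed
```

Recording a list of reasons per sample, rather than a single pass/fail flag, makes it possible to report how many samples each filter removed.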

Next, we performed variant-level QC. We first mapped all 538448 variants to the human reference genome build 37. Variants with a Hardy-Weinberg equilibrium p-value < 1×10⁻⁴, a call rate below 98% or poor genotype clustering values (< 0.4) were removed, as were Y-chromosome and mitochondrial variants, leaving 525314 variants passing QC. We have carried out genome-wide association scans for cardiometabolic traits and will present the results at the meeting. The genotype data are accessible through the European Genome-phenome Archive (https://www.ebi.ac.uk/ega/home).
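
As an illustration of the variant-level filters described above, the following Python sketch flags a variant that fails any one of them. It rests on several assumptions: the text does not say which Hardy-Weinberg test was used, so a one-degree-of-freedom chi-square goodness-of-fit test stands in here (exact tests are also common), and the record keys (`chrom`, `genotype_counts`, `call_rate`, `cluster_score`), the chromosome labels `"Y"` and `"MT"`, and the function names are all hypothetical.

```python
from math import erfc, sqrt

# Thresholds quoted in the text.
MIN_HWE_P = 1e-4
MIN_CALL_RATE = 0.98
MIN_CLUSTER_SCORE = 0.4
EXCLUDED_CHROMS = {"Y", "MT"}  # hypothetical labels for chrY and mitochondrial DNA


def hwe_chisq_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) p-value for Hardy-Weinberg equilibrium from genotype counts."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic variant: the test is not informative
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return erfc(sqrt(chi2 / 2.0))


def variant_fail_reasons(variant):
    """Return the list of filters a variant fails (empty list means it passes)."""
    reasons = []
    if variant["chrom"] in EXCLUDED_CHROMS:
        reasons.append("chrY_or_mito")
    if hwe_chisq_pvalue(*variant["genotype_counts"]) < MIN_HWE_P:
        reasons.append("hwe")
    if variant["call_rate"] < MIN_CALL_RATE:
        reasons.append("low_call_rate")
    if variant["cluster_score"] < MIN_CLUSTER_SCORE:
        reasons.append("poor_clustering")
    return reasons
```

For example, a variant with genotype counts (50, 0, 50) has no heterozygotes despite an allele frequency of 0.5, so it fails the Hardy-Weinberg filter, whereas counts of (25, 50, 25) match the expected proportions exactly and pass.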