
Introduction to genetics and epigenetics (omics) data

The rich longitudinal data in Understanding Society allows social science researchers to study socially patterned health outcomes at the population level. 

At Understanding Society, we collect health and biomarkers data and produce a number of omics datasets. For a more detailed description of the omics data, see our User Guides area. 

Genetics and epigenetics data 

A genome-wide scan (Illumina human core exome array) has been conducted on DNA samples from approximately 10,000 people.   

Methylation profiling has been conducted on DNA samples from approximately 3,650 people: 1,425 individuals from the British Household Panel Survey component of Understanding Society and another 2,230 from the General Population Sample. Using the Illumina Methylation EPIC BeadChip, we have measured over 850,000 methylation sites across the genome.

The omics data are available to researchers by application only. From these data we have produced some derived variables made available via the UK Data Service. 

If you have an established GWAS or EWAS consortium and are interested in incorporating phenotypes available in Understanding Society, please contact us to discuss this by emailing genetics@understandingsociety.ac.uk.

Proteomic panels

Proteomics is the analysis of a large set of protein molecules. Data from Understanding Society have been used to create two proteomic panels: a cardiometabolic panel and a neurology panel.

The cardiometabolic panel focuses on proteins related to cardiovascular health (heart and blood vessels) and metabolic health, such as diabetes. The neurology panel focuses on proteins involved in areas such as brain development and neurodegenerative disease. Read more about the proteomic panels in the User Guide.

Epigenetic ageing algorithms (epigenetic clocks) 

Epigenetic ageing algorithms are built from CpG sites whose DNA methylation levels are associated with chronological age or age-related health outcomes. They have attracted a lot of research interest for their potential to quantify rates of biological ageing.
 
The difference between a person’s chronological age and the epigenetic age calculated by these ‘clocks’ has been used as an indicator of whether an individual is ageing faster or slower biologically than expected, given their actual age.
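
As a minimal sketch of the comparison described above, the snippet below computes the gap between epigenetic and chronological age. The function name, the example values and the sign convention (positive means epigenetically older) are illustrative assumptions, not variables from the Understanding Society datasets. Some studies use a regression residual rather than a simple difference.

```python
def age_acceleration(epigenetic_age: float, chronological_age: float) -> float:
    """Return epigenetic age minus chronological age, in years.

    Positive: epigenetic age exceeds chronological age (ageing "faster").
    Negative: epigenetic age is below chronological age (ageing "slower").
    """
    return epigenetic_age - chronological_age


# Hypothetical respondents: (chronological age, epigenetic age from a clock)
respondents = [(50.0, 54.5), (50.0, 47.0), (62.0, 62.0)]

gaps = [age_acceleration(epi, chrono) for chrono, epi in respondents]
print(gaps)  # [4.5, -3.0, 0.0]
```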

Differences between actual age and biological age may be related to life circumstances and environment. For the Understanding Society data, five epigenetic ageing algorithms have been constructed. You can read more about epigenetic ageing algorithms in the User Guide.

Ageing algorithm and proteomic panel data are available from the UK Data Service in the Nurse Health Assessment dataset, SN: 7251.

Polygenic scores 

A polygenic score is a continuous variable that reflects an individual’s propensity towards a given trait. These traits can include disease status, behaviours, and blood levels of biomolecules, among many others.  
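
One common way to build such a score is as a weighted sum of allele counts, with the weights taken from a published genome-wide association study. The sketch below assumes this construction purely for illustration: the variant IDs, weights and dosages are made up, and the User Guide remains the reference for how the Understanding Society scores were actually derived.

```python
def polygenic_score(dosages: dict[str, float], weights: dict[str, float]) -> float:
    """Sum effect-size-weighted allele dosages over variants present in both inputs."""
    shared = sorted(dosages.keys() & weights.keys())
    return sum(weights[variant] * dosages[variant] for variant in shared)


# Hypothetical per-variant effect sizes (not real GWAS estimates)
weights = {"rs_a": 0.5, "rs_b": -0.25, "rs_c": 1.0}

# Hypothetical allele dosages: 0, 1 or 2 copies of the effect allele
person = {"rs_a": 2, "rs_b": 1, "rs_c": 0}

score = polygenic_score(person, weights)
print(score)  # 0.75
```

Variants missing from either input are skipped here. That is a design choice made for the sketch, not a recommendation for handling missing genotypes.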

A number of polygenic score variables have been derived from the genomic dataset and are available for the first time in the Special Licence version of the Nurse Health Assessment dataset (SN: 7587). Polygenic scores generated in Understanding Society can be used as explanatory variables in a range of analyses, including investigating the causal effect of the polygenic score trait on health or social outcomes. However, any findings from analyses that use polygenic scores as explanatory variables require careful interpretation. Please consult the User Guide for information on using these data appropriately.

Polygenic scores data are available from the UK Data Service in the Special Licence version of the Nurse Health Assessment dataset, SN: 7587.
