Main Survey

Data management syntax files

These syntax files show you how to perform some common data management tasks using Understanding Society data files, such as merging two files or matching the information of respondents with that of their partners.

You can also find syntax files from other researchers who have used Understanding Society data. Please note that these syntax files have not been checked for accuracy and you should contact the author of the file if you have any queries. 

Common commands

Commonly used commands for working with the data.

Stata


Merging individual files across waves into wide format

To match individual level files across two waves into a wide format, use the following syntax (for more waves, add the prefix of each extra wave to the foreach statement).

Stata
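The logic of a wide format merge can also be sketched in Python with a few toy records. This is an illustration only and is not the linked Stata syntax: the records, the prefixes a_ and b_ and the variable names a_age and b_age are assumptions made for the example, and keeping everyone seen in either wave is a choice made here, not necessarily what the syntax file does.

```python
# Toy stand-ins for two wave-specific individual files (made-up values).
# In wide format every variable keeps its wave prefix, so one row per
# person holds one column per wave.
wave_a = [
    {"pidp": 1, "a_age": 30},
    {"pidp": 2, "a_age": 45},
]
wave_b = [
    {"pidp": 2, "b_age": 46},
    {"pidp": 3, "b_age": 19},
]


def merge_wide(*waves):
    """Outer join on pidp: one row per person, columns from every wave."""
    merged = {}
    for records in waves:
        for row in records:
            merged.setdefault(row["pidp"], {"pidp": row["pidp"]}).update(row)
    return [merged[pidp] for pidp in sorted(merged)]


wide = merge_wide(wave_a, wave_b)
for row in wide:
    print(row)
```

People who appear in only one wave simply lack the other wave's columns, which corresponds to missing values in a statistical package.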


Merging individual files across waves into long format

To match individual level files across two waves into a long format, use the following syntax (for more waves, add the prefix of each extra wave to the foreach statement).

Stata
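As a rough Python sketch of the same idea (not the linked Stata syntax), a long format file can be built by stripping the wave prefix from each variable name, recording the wave in its own variable and stacking the files. The records, the prefixes a and b and the variable name age are assumptions made for the example.

```python
# Toy wave-specific files keyed by wave prefix (made-up values).
wave_files = {
    "a": [{"pidp": 1, "a_age": 30}, {"pidp": 2, "a_age": 45}],
    "b": [{"pidp": 2, "b_age": 46}, {"pidp": 3, "b_age": 19}],
}


def strip_prefix(name, prefix):
    """Remove a leading wave prefix such as 'a_' if it is present."""
    return name[len(prefix):] if name.startswith(prefix) else name


def stack_long(files):
    """Stack the waves: one row per person per wave, unprefixed names."""
    rows = []
    for prefix, records in files.items():
        for row in records:
            clean = {strip_prefix(k, prefix + "_"): v for k, v in row.items()}
            clean["wave"] = prefix  # wave identifier
            rows.append(clean)
    return sorted(rows, key=lambda r: (r["pidp"], r["wave"]))


long_rows = stack_long(wave_files)
for row in long_rows:
    print(row)
```

The pidp is not prefixed, so it is left unchanged and identifies the same person in every row.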


Distributing household level information to the individual level

In this example we will distribute household level information to individuals in those households. We can do this by merging a household level file (such as w_hhresp) with an individual level file (such as w_indresp) from the same wave.

Stata   SPSS   SAS   R
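The merge described here is a many-to-one match: several individuals share one household record. A minimal Python sketch with toy records is shown below. It is not one of the linked syntax files, and the household identifier a_hidp and the variable a_hhsize are names assumed for the example.

```python
# Toy household and individual files for one wave (made-up values).
hhresp = [
    {"a_hidp": 100, "a_hhsize": 2},
    {"a_hidp": 200, "a_hhsize": 1},
]
indresp = [
    {"pidp": 1, "a_hidp": 100},
    {"pidp": 2, "a_hidp": 100},
    {"pidp": 3, "a_hidp": 200},
]

# Index households by their identifier, then copy each household's
# variables onto every individual living in it (many-to-one merge).
hh_by_id = {hh["a_hidp"]: hh for hh in hhresp}
ind_with_hh = [
    {**person, **hh_by_id.get(person["a_hidp"], {})} for person in indresp
]

for row in ind_with_hh:
    print(row)
```

An individual whose household is missing from the household file keeps only their own variables.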


Summarising individual level information at the household level

In this example we will summarise individual level information within a household (number of 18-24-year-olds in the household) from an individual level file and then add it to the household level file.

Stata   SPSS  SAS  R
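The two steps, counting within each household and then attaching the count to the household file, can be sketched in Python as follows. This is an illustration with toy records and not one of the linked syntax files; the names a_hidp, a_age and n_18_24 are assumptions made for the example.

```python
from collections import Counter

# Toy individual and household files for one wave (made-up values).
indresp = [
    {"pidp": 1, "a_hidp": 100, "a_age": 19},
    {"pidp": 2, "a_hidp": 100, "a_age": 24},
    {"pidp": 3, "a_hidp": 100, "a_age": 52},
    {"pidp": 4, "a_hidp": 200, "a_age": 40},
]
hhresp = [{"a_hidp": 100}, {"a_hidp": 200}]

# Step 1: count the 18 to 24 year olds in each household.
n_18_24 = Counter(p["a_hidp"] for p in indresp if 18 <= p["a_age"] <= 24)

# Step 2: add the count to the household file. A Counter returns 0 for
# households that have no one in the age range.
hh_summary = [{**hh, "n_18_24": n_18_24[hh["a_hidp"]]} for hh in hhresp]

for row in hh_summary:
    print(row)
```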


Matching individuals within a household

In this example we will match the information of respondents who live with a partner or spouse to the information of that partner or spouse.

Stata   SPSS   SAS   R
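One way to picture this match is as a merge of the individual file with itself, using the partner's identifier as the key. The Python sketch below uses toy records and is not one of the linked syntax files. The partner identifier a_ppid, the variable a_jbstat and the sp_ prefix for partner variables are assumptions made for the example.

```python
# Toy individual file: a_ppid holds the pidp of the co-resident partner,
# or None when there is no partner in the household (made-up values).
indresp = [
    {"pidp": 1, "a_ppid": 2, "a_jbstat": "employed"},
    {"pidp": 2, "a_ppid": 1, "a_jbstat": "retired"},
    {"pidp": 3, "a_ppid": None, "a_jbstat": "student"},
]

by_pidp = {person["pidp"]: person for person in indresp}

couples = []
for person in indresp:
    partner = by_pidp.get(person["a_ppid"])
    if partner is None:
        continue  # no partner, or the partner is not in the file
    row = dict(person)
    # Copy the partner's variables under an sp_ prefix so that they do
    # not overwrite the respondent's own values.
    row.update(
        {"sp_" + k: v for k, v in partner.items() if k not in ("pidp", "a_ppid")}
    )
    couples.append(row)

for row in couples:
    print(row)
```

Each partnered respondent ends up with one row holding their own variables and their partner's.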


Using the egoalt file to create household composition variables

In this example we will create a variable that measures the number of siblings in the household using the egoalt file. The resulting file can be merged with any individual level file.

Stata   SAS   R
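The counting step can be sketched in Python with a toy relationship file that has one row per pair of household members. This is an illustration only and not one of the linked syntax files: the variable names apidp and relationship and the relationship labels are assumptions made for the example.

```python
from collections import Counter

# Toy egoalt-style file: one row per (ego, alter) pair in a household.
# "relationship" is the relationship of the ego (pidp) to the alter (apidp).
egoalt = [
    {"pidp": 1, "apidp": 2, "relationship": "sibling"},
    {"pidp": 2, "apidp": 1, "relationship": "sibling"},
    {"pidp": 1, "apidp": 3, "relationship": "child"},
    {"pidp": 3, "apidp": 1, "relationship": "parent"},
    {"pidp": 2, "apidp": 3, "relationship": "child"},
    {"pidp": 3, "apidp": 2, "relationship": "parent"},
]

# Count, for every ego, the rows in which the alter is a sibling.
sibling_counts = Counter(
    row["pidp"] for row in egoalt if row["relationship"] == "sibling"
)

# One row per person, ready to be merged with an individual level file on pidp.
egos = sorted({row["pidp"] for row in egoalt})
nsibs = [{"pidp": pidp, "nsibs": sibling_counts[pidp]} for pidp in egos]

for row in nsibs:
    print(row)
```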


Merging individual files from harmonised BHPS and UKHLS in long format

To match individual level files from the harmonised BHPS and Understanding Society in long format, you need to remove the wave prefixes in the two sets of files and generate a wave identifier that works across both sets of files. The pidp will work as the unique cross-wave identifier across both sets of files. This code only keeps individuals who took part in BHPS and drops those who joined as part of Understanding Society.

Stata
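The three steps described above (remove the wave prefixes, generate a common wave identifier, keep only BHPS sample members) can be sketched in Python with toy records. This is not the linked Stata syntax. The prefixes ba to br for BHPS and a, b and so on for Understanding Society, the variable name age and the wave numbering are all assumptions made for the example; any numbering that orders the waves consistently would serve.

```python
import string

# Toy files keyed by wave prefix (made-up values). Person 9 joined as
# part of Understanding Society and never took part in BHPS.
files = {
    "ba": [{"pidp": 1, "ba_age": 30}],
    "br": [{"pidp": 1, "br_age": 47}],
    "b": [{"pidp": 1, "b_age": 49}, {"pidp": 9, "b_age": 22}],
}


def is_bhps(prefix):
    """Assumed convention: two-letter prefixes starting with b are BHPS."""
    return len(prefix) == 2 and prefix[0] == "b"


def wave_number(prefix):
    """Illustrative running index: BHPS 1 to 18, Understanding Society 19 on."""
    if is_bhps(prefix):
        return string.ascii_lowercase.index(prefix[1]) + 1
    return string.ascii_lowercase.index(prefix) + 1 + 18


def strip_prefix(name, prefix):
    return name[len(prefix):] if name.startswith(prefix) else name


# Everyone who appears in at least one BHPS wave.
bhps_pidps = {
    row["pidp"] for prefix, rows in files.items() if is_bhps(prefix) for row in rows
}

long_rows = []
for prefix, rows in files.items():
    for row in rows:
        if row["pidp"] not in bhps_pidps:
            continue  # drop people who joined as part of Understanding Society
        clean = {strip_prefix(k, prefix + "_"): v for k, v in row.items()}
        clean["wave"] = wave_number(prefix)
        long_rows.append(clean)

long_rows.sort(key=lambda r: (r["pidp"], r["wave"]))
for row in long_rows:
    print(row)
```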


Long-term depression following stressful life events: feeling 'worthless' shows the slowest recovery

Research paper: John Simister (2019) ‘Long-term depression following stressful life events: feeling ‘worthless’ shows the slowest recovery’, Archives of Psychology 3(6):1-20. https://doi.org/10.31296/aop.v3i6

The syntax files used to create the variables for this analysis are Simister_syntaxfile_1 and Simister_syntaxfile_2, to be run in that order. Simister_syntaxfile_1.sps combines the INDRESP and HHRESP files across 18 waves of BHPS and 8 waves of UKHLS into one long format individual level file and creates a number of variables: clinicalDepression, earn_net, earn_gross, sp_cCare, LIVEcnty, numAdult, region. The resulting data file is saved as “BHPS and UndS subset.sav”. Simister_syntaxfile_2.sps uses “BHPS and UndS subset.sav” to create additional variables which were used in the analysis in the accompanying paper. Disclaimer: The syntax has not been quality checked by the Understanding Society team and is the responsibility of the author. For any questions about it, please contact Dr John Simister.

Simister_syntaxfile_1

Simister_syntaxfile_2