Data management syntax files
These syntax files show you how to perform some common data management tasks using Understanding Society data files, e.g., merging two files or matching respondents' information with that of their partners.
You can also find syntax files from other researchers who have used Understanding Society data. Please note that these syntax files have not been checked for accuracy and you should contact the author of the file if you have any queries.
Commonly-used commands to work with data.
Merging individual files across waves into wide format
To match individual-level files across two waves into a wide format, do the following (for more waves, add the wave-specific prefixes to the foreach statement).
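The wide-format merge can be sketched as follows. This is an illustrative Python sketch using only the standard library, not the Stata syntax file itself. The tiny in-memory "files" and their values are made up, and the variable names other than pidp (a_hidp, a_sex and so on) are assumptions chosen for illustration.

```python
# Hypothetical miniature extracts of two wave files (values are made up).
# Wave-specific variables keep their wave prefix (a_, b_); pidp has none.
a_indresp = [
    {"pidp": 1, "a_hidp": 10, "a_sex": 1},
    {"pidp": 2, "a_hidp": 10, "a_sex": 2},
]
b_indresp = [
    {"pidp": 2, "b_hidp": 20, "b_sex": 2},
    {"pidp": 3, "b_hidp": 21, "b_sex": 1},
]


def merge_wide(*waves):
    """Outer join on pidp: one row per person, with columns from every wave."""
    merged = {}
    for rows in waves:
        for row in rows:
            # Prefixed names never clash across waves, so update() is safe.
            merged.setdefault(row["pidp"], {}).update(row)
    return [merged[pidp] for pidp in sorted(merged)]


wide = merge_wide(a_indresp, b_indresp)
```

Because the wave prefixes are kept, each person ends up with one row and a separate set of columns per wave. People who responded in only one wave simply lack the other wave's columns. To add more waves, pass more files to merge_wide.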
Merging individual files across waves into long format
To match individual-level files across two waves into a long format, do the following (for more waves, add the wave-specific prefixes to the foreach statement).
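A minimal Python sketch of the long-format approach: strip the wave prefix from each variable name, add a wave identifier, and stack the files. The data values, the helper name to_long and the variable names other than pidp are assumptions made for illustration.

```python
# Hypothetical miniature extracts of two wave files (values are made up).
a_indresp = [
    {"pidp": 1, "a_hidp": 10, "a_sex": 1},
    {"pidp": 2, "a_hidp": 10, "a_sex": 2},
]
b_indresp = [
    {"pidp": 2, "b_hidp": 20, "b_sex": 2},
    {"pidp": 3, "b_hidp": 21, "b_sex": 1},
]


def to_long(rows, prefix, wave):
    """Drop the wave prefix from variable names and tag each row with its wave."""
    stem = prefix + "_"
    out = []
    for row in rows:
        new_row = {"wave": wave}
        for name, value in row.items():
            if name.startswith(stem):
                name = name[len(stem):]
            new_row[name] = value
        out.append(new_row)
    return out


# Stack the waves: one row per person per wave.
long_file = to_long(a_indresp, "a", 1) + to_long(b_indresp, "b", 2)
long_file.sort(key=lambda r: (r["pidp"], r["wave"]))
```

In the long file, pidp and wave together identify a row, and each variable appears once under its unprefixed name.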
Distributing household level information to the individual level
In this example we will distribute household-level information to the individuals in those households. We can do this by merging a household-level file (such as w_hhresp) with an individual-level file (such as w_indresp) from the same wave.
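As a rough Python sketch, this is a many-to-one merge on the household identifier. The sketch assumes Wave 1 files (prefix a_), a household identifier named a_hidp and a household variable named a_hhsize. All values are made up.

```python
# Hypothetical miniature household and individual files for one wave.
a_hhresp = [
    {"a_hidp": 10, "a_hhsize": 2},
    {"a_hidp": 11, "a_hhsize": 1},
]
a_indresp = [
    {"pidp": 1, "a_hidp": 10},
    {"pidp": 2, "a_hidp": 10},
    {"pidp": 3, "a_hidp": 11},
]

# Index households by their identifier so each lookup is direct.
households = {hh["a_hidp"]: hh for hh in a_hhresp}

# Many-to-one merge: every individual receives a copy of their household's row.
# Individuals whose household is missing from a_hhresp keep only their own data.
ind_with_hh = [
    {**person, **households.get(person["a_hidp"], {})} for person in a_indresp
]
```

Each household row is copied to every member of that household, so the result keeps one row per individual.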
Summarising individual level information at the household level
In this example we will summarise individual level information within a household (number of 18-24-year-olds in the household) from an individual level file and then add it to the household level file.
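A short Python sketch of this aggregation step, under stated assumptions: the age variable is taken to be a_dvage and the household identifier a_hidp. The new variable name n_18_24 is hypothetical and the data are made up.

```python
from collections import Counter

# Hypothetical miniature individual and household files for one wave.
a_indresp = [
    {"pidp": 1, "a_hidp": 10, "a_dvage": 19},
    {"pidp": 2, "a_hidp": 10, "a_dvage": 23},
    {"pidp": 3, "a_hidp": 10, "a_dvage": 50},
    {"pidp": 4, "a_hidp": 11, "a_dvage": 30},
]
a_hhresp = [{"a_hidp": 10}, {"a_hidp": 11}]

# Step 1: collapse to household level by counting members aged 18 to 24.
n_18_24 = Counter(
    person["a_hidp"] for person in a_indresp if 18 <= person["a_dvage"] <= 24
)

# Step 2: add the count to the household file; households with no such
# members get 0 rather than a missing value.
hh_with_count = [
    {**hh, "n_18_24": n_18_24.get(hh["a_hidp"], 0)} for hh in a_hhresp
]
```

Setting the count to 0 for households without any 18-24-year-olds is a design choice in this sketch. It distinguishes "none" from households that could not be matched.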
Matching individuals within a household
In this example we will match the information of respondents living with partners/spouses with that of their partners/spouses.
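One way to sketch this in Python is a self-merge of the individual file: look up each respondent's partner by household identifier and person number, then attach the partner's values under a suffix. The variable names a_pno (person number) and a_ppno (partner's person number), the convention that a_ppno is 0 when no partner lives in the household, and the _sp suffix are all assumptions for illustration. Check them against the data documentation.

```python
# Hypothetical miniature individual file (values are made up).
a_indresp = [
    {"pidp": 1, "a_hidp": 10, "a_pno": 1, "a_ppno": 2, "a_dvage": 40},
    {"pidp": 2, "a_hidp": 10, "a_pno": 2, "a_ppno": 1, "a_dvage": 38},
    {"pidp": 3, "a_hidp": 11, "a_pno": 1, "a_ppno": 0, "a_dvage": 70},
]

# Index every respondent by (household, person number).
by_position = {(p["a_hidp"], p["a_pno"]): p for p in a_indresp}

couples = []
for person in a_indresp:
    # The partner is the household member whose person number equals a_ppno.
    partner = by_position.get((person["a_hidp"], person["a_ppno"]))
    if partner is None:
        continue  # no co-resident partner, or partner not in the file
    row = dict(person)
    row["pidp_sp"] = partner["pidp"]        # partner's identifier
    row["a_dvage_sp"] = partner["a_dvage"]  # partner's age
    couples.append(row)
```

The result keeps only respondents whose partner is found in the same file. Each of those rows carries the respondent's own values and the partner's values side by side.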
Using the egoalt file to create household composition variables
In this example we will create a variable that measures the number of siblings in the household using the egoalt file. The resulting file can be merged with any individual level file.
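The counting logic can be sketched in Python as below. The sketch assumes the egoalt file has one row per ordered pair of household members, with pidp for the ego, apidp for the other person and a_relationship_dv for the ego's relationship to them. The relationship codes shown, including the set treated as "sibling", are placeholders and must be checked against the data dictionary.

```python
# Placeholder codes: which values of a_relationship_dv count as siblings
# is an assumption here, not taken from the documentation.
SIBLING_CODES = {14, 15, 16, 17, 18}

# Hypothetical egoalt rows for one household: a parent (1) and two children (2, 3).
a_egoalt = [
    {"pidp": 1, "apidp": 2, "a_relationship_dv": 9},
    {"pidp": 1, "apidp": 3, "a_relationship_dv": 9},
    {"pidp": 2, "apidp": 1, "a_relationship_dv": 4},
    {"pidp": 2, "apidp": 3, "a_relationship_dv": 14},
    {"pidp": 3, "apidp": 1, "a_relationship_dv": 4},
    {"pidp": 3, "apidp": 2, "a_relationship_dv": 14},
]

# Count, for each ego, how many of their relationship rows are sibling rows.
counts = {}
for rel in a_egoalt:
    is_sibling = rel["a_relationship_dv"] in SIBLING_CODES
    counts[rel["pidp"]] = counts.get(rel["pidp"], 0) + int(is_sibling)

# One row per individual, ready to merge onto an individual-level file by pidp.
nsibs_file = [{"pidp": pidp, "a_nsibs": n} for pidp, n in sorted(counts.items())]
```

Every ego in the file gets a row, including those with zero siblings, so a later merge on pidp does not confuse "no siblings" with "not in the egoalt file".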
Merging individual files from harmonised BHPS and UKHLS in long format
To match individual level files from the harmonised BHPS and Understanding Society in long format, you need to remove the wave prefixes in the two sets of files and generate a wave identifier that works across both sets of files. The pidp will work as the unique cross-wave identifier across both sets of files. This code only keeps individuals who took part in BHPS and drops those who joined as part of Understanding Society.
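The steps described above might look like this in a Python sketch. It assumes that harmonised BHPS files use two-letter prefixes starting with b (ba_, bb_, ...) and Understanding Society files use one-letter prefixes (a_, b_, ...). It also assumes one possible common wave numbering: BHPS waves 1 to 18, followed by Understanding Society wave n numbered n + 18. Both the numbering and the miniature data are illustrative assumptions.

```python
import string


def wave_number(prefix, is_bhps):
    """Common wave index: BHPS 1-18, then Understanding Society from 19 (assumed)."""
    index = string.ascii_lowercase.index(prefix[-1]) + 1
    return index if is_bhps else index + 18


def strip_prefix(row, prefix):
    """Remove the wave prefix from every variable name that carries it."""
    stem = prefix + "_"
    return {(k[len(stem):] if k.startswith(stem) else k): v for k, v in row.items()}


# Hypothetical miniature files: (prefix, is_bhps, rows). Values are made up.
files = [
    ("br", True, [{"pidp": 1, "br_hidp": 10}, {"pidp": 2, "br_hidp": 11}]),
    ("b", False, [{"pidp": 1, "b_hidp": 20}, {"pidp": 3, "b_hidp": 21}]),
]

# Stack all files in long format with a wave identifier that spans both studies.
long_rows = []
for prefix, is_bhps, rows in files:
    for row in rows:
        new_row = strip_prefix(row, prefix)
        new_row["wave"] = wave_number(prefix, is_bhps)
        long_rows.append(new_row)

# Keep only people who appear in at least one BHPS file.
bhps_pidps = {row["pidp"] for _, is_bhps, rows in files if is_bhps for row in rows}
long_rows = [r for r in long_rows if r["pidp"] in bhps_pidps]
long_rows.sort(key=lambda r: (r["pidp"], r["wave"]))
```

Here pidp 3 appears only in the Understanding Society file and is dropped, while pidp 1 contributes one row from each study.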
Long-term depression following stressful life events: feeling 'worthless' shows the slowest recovery.
Research paper: John Simister (2019) ‘Long-term depression following stressful life events: feeling ‘worthless’ shows the slowest recovery’, Archives of Psychology 3(6):1-20. https://doi.org/10.31296/aop.v3i6
The syntax files used to create the variables in this analysis are Simister_syntaxfile_1 and Simister_syntaxfile_2, to be run in that order. The syntax file Simister_syntaxfile_1.sps combines the INDRESP and HHRESP files across 18 waves of BHPS and 8 waves of UKHLS into one long-format individual-level file and creates a number of variables: clinicalDepression, earn_net, earn_gross, sp_cCare, LIVEcnty, numAdult, region. The resulting data file is saved as “BHPS and UndS subset.sav”. The syntax file Simister_syntaxfile2.sps uses “BHPS and UndS subset.sav” to create additional variables which were used in the analysis in the accompanying paper. Disclaimer: The syntax has not been quality checked by the Understanding Society team and is the responsibility of the author. For any questions about this, please contact Dr John Simister.