Main Survey

Variable naming convention

Understanding Society has a distinct naming convention for its data files that identifies which wave the data are from and the source of the data.

The naming convention for variables follows the same rules as file names. Variable names have the same root name, which is fixed over time, and begin with a prefix that reflects the wave in which the data were collected (“a_” for the first wave, “b_” for the second wave; in this user guide we use “w_” to denote waves in general).

For example, current employment status collected from interviews with responding adults is a_jbstat in Wave 1 (both years: 2009 and 2010) and b_jbstat in Wave 2 (both years: 2010 and 2011).
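The prefix rule can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the code assumes that the wave letters continue alphabetically beyond the two waves spelled out above (“c_” for the third wave, “d_” for the fourth, and so on).

```python
import string


def wave_prefix(wave: int) -> str:
    """Return the wave prefix for a wave number, e.g. 1 -> 'a_'.

    Assumption: letters continue alphabetically after 'a_' and 'b_'.
    """
    if not 1 <= wave <= 26:
        raise ValueError("wave must be between 1 and 26")
    return string.ascii_lowercase[wave - 1] + "_"


def variable_name(root: str, wave: int) -> str:
    """Combine a fixed root name with the prefix of a given wave."""
    return wave_prefix(wave) + root


def split_variable(name: str) -> tuple[int, str]:
    """Split a wave-specific variable name into (wave number, root name)."""
    prefix, sep, root = name.partition("_")
    valid_prefix = len(prefix) == 1 and prefix in string.ascii_lowercase
    if sep != "_" or not valid_prefix or not root:
        raise ValueError(f"{name!r} does not start with a wave prefix")
    return string.ascii_lowercase.index(prefix) + 1, root


print(variable_name("jbstat", 1))   # a_jbstat
print(split_variable("b_jbstat"))   # (2, 'jbstat')
```

Because the root name is fixed over time, the same root can be combined with different wave prefixes to locate the corresponding variable in each wave.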

To ease identification of groups of variables, a number of additional general naming conventions have been applied. For instance, following the wave prefix, information from the UKHLS Wave 1 and Wave 2 self-completion interview with adults starts with the prefix “sc”; information from the interview with young adults generally starts with the prefix “ya”; and information from the child development module starts with the prefix “cd”. Similarly, we have attempted to include in the variable name the acronym of well-known instruments such as the Strengths and Difficulties Questionnaire (SDQ) or the General Health Questionnaire (GHQ). See, for example, c_ypsdqa to c_ypsdqy on data file c_youth or d_scghq2_dv on data file d_indresp.
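As a rough illustration, the group prefixes listed above can be looked up from a variable name. The mapping and the function name below are hypothetical, and the result should be read as a hint only: the conventions are applied "generally", so a root name may begin with these letters for unrelated reasons.

```python
from typing import Optional

# Group prefixes mentioned in the text; the labels are paraphrases.
GROUP_PREFIXES = {
    "sc": "adult self-completion",
    "ya": "young adult interview",
    "cd": "child development module",
}


def group_hint(name: str) -> Optional[str]:
    """Return a likely variable group based on the two letters after the wave prefix."""
    wave_letter, sep, root = name.partition("_")
    if sep != "_" or len(wave_letter) != 1:
        raise ValueError(f"{name!r} does not start with a wave prefix")
    # Look at the first two characters of the root name only.
    return GROUP_PREFIXES.get(root[:2])


print(group_hint("d_scghq2_dv"))  # adult self-completion
print(group_hint("a_jbstat"))     # None
```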

The prefix “ff_” following the wave prefix identifies variables that were fed forward from previous waves to route respondents appropriately in the questionnaire. To aid data collection, information reported at an earlier interview is fed forward to personalize the question. Rather than ask a question about current occupation, the question might say:

“The last time you were interviewed you said you were ‘specific occupation’. Are you still ‘specific occupation’?”

Feed-forward variables are used at both the household and individual levels. For example:

b_ff_hhsize feeds forward the household size from the previous wave (Wave 1).

b_ff_plbornc is the country of birth of the respondent fed forward from the previous wave. 
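Feed-forward variables can be recognised from their names alone. The sketch below uses a hypothetical helper name and assumes the name has the form wave letter, "ff", then root name, as in the two examples above. It returns the wave in which the variable appears together with the root name.

```python
from typing import Optional, Tuple


def parse_feed_forward(name: str) -> Optional[Tuple[int, str]]:
    """Return (wave number, root name) for a feed-forward variable, else None."""
    parts = name.split("_", 2)
    if len(parts) != 3:
        return None
    wave_letter, marker, root = parts
    is_wave_letter = len(wave_letter) == 1 and "a" <= wave_letter <= "z"
    if not is_wave_letter or marker != "ff" or not root:
        return None
    # Assumption: wave letters map alphabetically to wave numbers (a -> 1, b -> 2).
    return ord(wave_letter) - ord("a") + 1, root


print(parse_feed_forward("b_ff_hhsize"))   # (2, 'hhsize')
print(parse_feed_forward("a_jbstat"))      # None
```

In the examples above the information comes from the immediately preceding wave, so a variable found in Wave 2 carries Wave 1 information. The variable notes in the online documentation remain the authoritative source for where a fed-forward value originated.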

Note that the variable name does not change over time so long as the underlying question does not change substantially. Analysts are advised to read the variable notes in the online documentation carefully to keep track of any definitional changes or changes in the code frame that may affect study results. An example is the derived variable w_qfhigh_dv, which provides only limited information about continuing BHPS sample members (from Wave 2 onward) and IEMB sample members (from Wave 6 onward), because the underlying code frames for the initial conditions questions in BHPS Waves 1-18, UKHLS Waves 1-7 and IEMB Wave 1 (as part of UKHLS Wave 6) do not perfectly align.