Derived variables

Derived variables are variables that are computed from one or more other variables.

Some are computed during the interview to control routing within the questionnaire; these can be identified by searching the questionnaires for “Compute”. Others are computed after fieldwork for the purpose of analysis; they are positioned last in the data files and carry the suffix _dv in the variable name. There are exceptions to this rule. For example, pointers to significant others in the household (such as the natural parents), which are based on edited information in the household grid, end in the familiar suffixes _pidp and _pno.
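The naming conventions above can be turned into a rough filter over a file's column names. The following is a minimal sketch, not an official tool: the function names, the example column names other than agegr5_dv, and the assumption that a wave prefix is a single letter followed by an underscore are all illustrative. Because the rule has exceptions, the result should be treated as a first guess and checked against the variable search.

```python
def strip_wave_prefix(name: str) -> str:
    """Drop a leading wave prefix such as 'a_' (assumed: one letter plus underscore)."""
    if len(name) > 2 and name[0].isalpha() and name[1] == "_":
        return name[2:]
    return name


def classify_variable(name: str) -> str:
    """Heuristically classify a variable name by its suffix."""
    stem = strip_wave_prefix(name)
    if stem.endswith("_dv"):
        return "derived post-field"
    if stem.endswith(("_pidp", "_pno")):
        return "pointer"
    return "other"


# Hypothetical column names, for illustration only.
columns = ["a_hidp", "pidp", "a_pno", "a_agegr5_dv", "a_father_pidp", "a_mother_pno"]

by_kind = {}
for col in columns:
    by_kind.setdefault(classify_variable(col), []).append(col)

for kind, names in by_kind.items():
    print(kind, names)
```

Stripping the wave prefix first keeps the identifiers pidp and w_pno themselves from being mistaken for pointers.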

Variable search

Derived variables are documented in the online variable search, where each entry includes notes on how the variable was derived (see, for example, variable agegr5_dv).

Tips for analysts

Information about how a derived variable is produced is shown in the Derived Variable Note field of the variable. The Variable Search provides descriptive statistics for each variable and, in the Origin field, lists the variables used in the computation of the derived variable. For variables that were computed during the interview, additional information is available in the questionnaires.

Identifiers and other useful variables

Households are identified by “w_hidp”, a wave-specific variable whose prefix differs for each wave. As shown in the table below, “w_hidp” can be used to link information about a household from different records within a wave, but it cannot be used to link information across waves. Since the composition of households changes between waves, the data do not include a longitudinal household identifier.

Individuals are identified by the personal identifier (“pidp”), which is constant across all waves. It can be used to link information about a person from different records belonging to one wave, or to link information from different waves. Individuals are also identified by “w_pno”, the person number within the household. Within a wave, the combination of “w_hidp” and “w_pno” is unique for each individual and can also be used to link information about individuals.
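The two linking strategies can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a recommended workflow: the records, identifier values, and the variable a_dvage/b_dvage values are invented, the prefixes a and b stand in for two waves, and the helper names are hypothetical. In practice the same joins would usually be done in a statistical package.

```python
# Invented example records for two waves (prefixes 'a' and 'b').
wave_a = [
    {"pidp": 1001, "a_hidp": 500, "a_pno": 1, "a_dvage": 40},
    {"pidp": 1002, "a_hidp": 500, "a_pno": 2, "a_dvage": 38},
]
wave_b = [
    {"pidp": 1001, "b_hidp": 730, "b_pno": 1, "b_dvage": 41},
    {"pidp": 1003, "b_hidp": 730, "b_pno": 2, "b_dvage": 29},
]


def link_across_waves(first, second):
    """Inner-join two waves on pidp, the only identifier constant across waves."""
    second_by_pidp = {row["pidp"]: row for row in second}
    return [
        {**row, **second_by_pidp[row["pidp"]]}
        for row in first
        if row["pidp"] in second_by_pidp
    ]


def index_within_wave(rows, prefix):
    """Index one wave's records by (w_hidp, w_pno), unique within that wave."""
    return {(row[f"{prefix}_hidp"], row[f"{prefix}_pno"]): row for row in rows}


linked = link_across_waves(wave_a, wave_b)
index_a = index_within_wave(wave_a, "a")

print(linked)
print(index_a[(500, 2)]["pidp"])
```

Note that the linked person has a different household identifier in each wave, which is why the household identifier is not used as a cross-wave key.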

Useful variables

Variable | Description | Available in File
w_hidp | Household identifier | All files
pidp | Cross wave person identifier | All EXCEPT w_hhsamp_ip, w_hhresp_ip
w_gor_dv | Government office region (Wave 1) | w_hhsamp_ip, w_hhresp_ip, w_indall_ip, w_indresp_ip
w_pno | Person number within the household | All EXCEPT w_hhsamp_ip, w_hhresp_ip
w_sex | Sex | w_indall_ip, w_indresp_ip
w_dvage | Age | w_indall_ip, w_indresp_ip
w_hgpart | PNO of spouse/civil partner | w_indall_ip, w_indresp_ip
a_psnenip_xd | Cross-sectional person design weight | a_indresp_ip, a_indall_ip, a_youth_ip
a_hhdenip_xd | Cross-sectional household design weight | a_hhsamp_ip, a_hhresp_ip
w_indinip_lw | Longitudinal adult main interview weight | w_indresp_ip
w_psu | Primary sampling unit | w_hhsamp_ip, w_hhresp_ip, w_indall_ip, w_indsamp_ip, w_indresp_ip, w_youth_ip
w_strata | Sampling strata | w_hhsamp_ip, w_hhresp_ip, w_indall_ip, w_indsamp_ip, w_indresp_ip, w_youth_ip
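Pointer variables such as w_hgpart hold a person number, so they resolve to another record only in combination with w_hidp. The sketch below shows one way to follow such a pointer within a wave. The records are invented, the helper name find_partner is hypothetical, and treating values of zero or below as "no partner recorded" is an assumption that should be checked against the coding of the variable.

```python
def find_partner(rows, prefix, person):
    """Return the record of a person's spouse/civil partner in the same wave, or None."""
    hidp_key = f"{prefix}_hidp"
    pno_key = f"{prefix}_pno"
    partner_pno = person[f"{prefix}_hgpart"]
    # Assumption: zero or negative values mean no partner is recorded.
    if partner_pno <= 0:
        return None
    for row in rows:
        # A person number is only meaningful within its own household.
        if row[hidp_key] == person[hidp_key] and row[pno_key] == partner_pno:
            return row
    return None


# Invented records from a single wave (prefix 'a').
indall = [
    {"pidp": 1001, "a_hidp": 500, "a_pno": 1, "a_hgpart": 2},
    {"pidp": 1002, "a_hidp": 500, "a_pno": 2, "a_hgpart": 1},
    {"pidp": 1003, "a_hidp": 500, "a_pno": 3, "a_hgpart": 0},
    {"pidp": 2001, "a_hidp": 610, "a_pno": 2, "a_hgpart": 0},
]

partner = find_partner(indall, "a", indall[0])
print(partner["pidp"])
```

The record with pidp 2001 shares person number 2 with the partner but belongs to a different household, so matching on the person number alone would be ambiguous.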

Occupation codes

Understanding Society collects free-text information on respondents’ job titles and industry. Industry descriptions are coded to the ONS Standard Industrial Classification 2007 (SIC 2007). Job titles are coded to the ONS Standard Occupational Classification 2000 (SOC 2000). Coding is undertaken using the Computer Assisted Structured Coding Tool (CASCOT) system. To derive further occupational classifications, we use the look-up files between SOC 2000 and other classifications provided on the CAMSIS website.

We provide the following classifications: International Standard Classification of Occupations (ISCO88), Registrar General Social Class (RGSC), National Statistics Socio-economic Classification (NS-SEC), Employment Status (ES), and Socio-economic Group (SEG).
