Can machine learning (ML) improve our understanding of non-response in Understanding Society and means of tackling it?
Machine learning is a disruptive technology with implications for many fields of research, particularly where there is interest in classifying people and outcomes (e.g. whether to grant a loan or invite a candidate to interview). It may offer advantages over conventional statistical methods where the aim is to predict a particular outcome, such as the likelihood of non-response in Understanding Society. Key advantages include less need to pre-process data (for example, to deal with outliers); less need to specify functional forms; and the ability to include a very large number of independent variables in a model. The project did not presuppose that ML would generate better predictions than statistical models, but existing evidence suggested that this was likely, and the study tested how far predictive accuracy could be improved. Two types of model, random forests and gradient boosting, have been found to perform very strongly in prediction tasks, so these were the priorities to test against familiar statistical models such as binary logistic regression.
Understanding Society currently develops its non-response models using backward stepwise methods (Knies, 2018), which have known issues. Machine learning models also generate measures of the importance of particular variables in predicting non-response. The project aimed to update the literature on patterns of non-response to social surveys and to consider how improved predictions of future non-response might be useful to survey organisers. The predicted response probabilities may be used to create weights using the inverse probability approach (which may be compared with existing practice) and/or to guide data collection.
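To illustrate the inverse probability approach, the following is a minimal sketch in Python of how predicted response probabilities could be turned into weights. It is not the project's or Understanding Society's weighting procedure: the function name, the `floor` used to cap very large weights, the rescaling to a mean of 1, and the example probabilities are all assumptions made for illustration. In practice the probabilities would come from a fitted model such as a logistic regression, random forest or gradient boosting model.

```python
from typing import List, Sequence


def inverse_probability_weights(
    response_probs: Sequence[float],
    responded: Sequence[bool],
    floor: float = 0.05,
) -> List[float]:
    """Return weights for respondents only, rescaled to average 1.

    response_probs: predicted probability of responding, one per sample member.
    responded:      whether each sample member actually responded.
    floor:          hypothetical lower bound on the probability, used here
                    only to stop a few cases receiving extreme weights.
    """
    if len(response_probs) != len(responded):
        raise ValueError("response_probs and responded must have the same length")
    if not 0.0 < floor <= 1.0:
        raise ValueError("floor must be in (0, 1]")

    # Raw weight is the inverse of the (floored) predicted response probability.
    # Non-respondents contribute no data, so they receive no weight.
    raw = [
        1.0 / max(p, floor)
        for p, r in zip(response_probs, responded)
        if r
    ]
    if not raw:
        return []

    # Rescale so that the respondent weights have a mean of 1.
    mean_raw = sum(raw) / len(raw)
    return [w / mean_raw for w in raw]


# Illustrative (made-up) probabilities: the fourth person did not respond.
example_probs = [0.8, 0.5, 0.25, 0.9]
example_responded = [True, True, True, False]
example_weights = inverse_probability_weights(example_probs, example_responded)
```

Respondents with a low predicted probability of responding receive larger weights, so that they stand in for similar sample members who did not respond. Weights built from different models' predictions can then be compared with one another and with existing practice.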
Outputs
Conference presentations
- Presentation at Essex 4 July 2019. See the presentation here.
- Presentation to European Survey Research Association, Croatia, 18 July 2019. See the presentation here.
Find out more about Steve’s work on his profile page.



