Using prior wave information and paradata: can they help to predict response outcomes and call sequence length in a longitudinal study?

Publication type

Journal Article

Published in

Journal of Official Statistics


Authors

Gabriele B. Durrant, Olga Maslovskaya and Peter W.F. Smith

Publication date


In recent years the use of paradata for nonresponse investigations has risen significantly. One key question is how useful paradata, including call record data and interviewer observations, from the current and previous waves of a longitudinal study, as well as previous wave survey information, are in predicting response outcomes in a longitudinal context. This article aims to address this question. Final response outcomes and sequence length (the number of calls/visits to a household) are modelled both separately and jointly for a longitudinal study. Being able to predict length of call sequence and response can help to improve both adaptive and responsive survey designs and to increase efficiency and effectiveness of call scheduling. The article also identifies the impact of different methodological specifications of the models, for example different specifications of the response outcomes. Latent class analysis is used as one of the approaches to summarise call outcomes in sequences. To assess and compare the models in their ability to predict, indicators derived from classification tables, ROC (Receiver Operating Characteristic) curves, discrimination and prediction are proposed in addition to the standard approach of using the pseudo-R² value, which is not a sufficient indicator on its own. The study uses data from Understanding Society, a large-scale longitudinal survey in the UK. The findings indicate that basic models (including geographic, design and survey data from the previous wave), although commonly used in predicting and adjusting for nonresponse, do not predict the response outcome well. Conditioning on previous wave paradata, including call record data, interviewer observation data and indicators of change, improves the fit of the models slightly. A significant improvement can be observed when conditioning on the most recent call outcome, which may indicate that the nonresponse process predominantly depends on the most current circumstances of a sample unit.

Volume and page numbers

33, 801-833





Open Access; © 2017 Gabriele B. Durrant et al., published by De Gruyter Open. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License (CC BY-NC-ND 3.0).