Introducing data description chapters

In my last blog post, I gave a quick summary of the expansion of the Innovation Panel competition to explicitly include new survey questions, and introduced some of the chapters in the IP16 working paper summarising findings from these new questions.

Another new development in the Innovation Panel wave 16 (IP16) working paper is the introduction of data description chapters. There are two of these chapters in the working paper, one on Child development measures from the ‘red book’, and the other on Sea Hero Quest: Spatial navigation data linked to the Innovation Panel Wave 16. So what are these, and what are they for?

Some scholarly journals now accept articles providing descriptions of research datasets, sometimes termed data descriptor articles or data papers. Our data description chapters are intended to fill a similar niche within the Understanding Society working paper series.

Data description chapters are intended to serve (at least) three purposes:

improve the visibility of new data
make it easier for data users to understand novel data that is being released
provide a way for data users to acknowledge and cite specific aspects of the data, alongside their main data citation.

Improving the visibility of new data

Understanding Society is, at its heart, a data-generating endeavour. As a core investment of the ESRC and the UK government in social sciences, it produces a lot of data. The best thing for the data we produce is for it to be used.
The data we produce is made available for researchers via the UK Data Service (see, for example, the Innovation Panel data release). As the data can be found in the UK Data Service catalogue, on some level it is already visible. By publishing data description chapters, we hope to enhance that visibility, giving prospective data users more opportunities to discover data that might be useful in their research.

Overall, as with journals publishing data papers, we hope these will help to accelerate dissemination of datasets and foster their reuse.

Understanding novel data

For most variables produced by Understanding Society, the ultimate documentation about the data generating process is available in the form of the questionnaires. These are all published on the Understanding Society website, and data users can always refer to them if they want to dig into the nuance of how the information was obtained from respondents.

But in a few cases, the data variables we make available do not come from questionnaire responses. In those cases, a data description chapter can fulfil a similar function to the published questionnaires, giving data users a way to understand what a variable is and where it comes from.

As well as making the data more understandable to direct users of the data, increasing clarity of what the data variables are and how they were generated can ultimately contribute to the transparency of any research conducted using them. If data users cite the data description chapters — as we hope they will — readers of their publications will be able to follow up on the data source if they have any queries.

Citing the data

Quite rightly, the citation of datasets in academic papers has become far more common in recent years. Not only does this mean readers can see where the data underpinning a paper has come from, it also provides a way for data producers to track citations and understand where their datasets are being used. Funders of research infrastructure projects like Understanding Society can get a better sense of how their investments are contributing to a breadth of scholarship if published articles can be linked back to the data that made them possible. Understanding Society asks data users to cite the data in their outputs, and we continue to do so.

Understanding Society is a multi-decade endeavour, and whilst the general data citation serves many purposes, it doesn’t particularly facilitate the recognition of contributors who have created specific novel datasets that complement, extend, and enhance the core Understanding Society data.

We encourage data users who make use of these novel datasets to cite, in addition to the overarching Innovation Panel data citation, the respective data description chapters for the data they have used. There are suggested citations for the ‘red book’ (child development) data and the Sea Hero Quest data in the relevant chapters of the working paper.

The future

Future waves of the Innovation Panel will see more datasets produced that might be well-served by being the subject of data description chapters — there is at least one such study in the seventeenth wave of the Innovation Panel (IP17).
With more likely to come, we would welcome any comments you might have on these first two chapters. Do they do a good job of introducing you to the data? Did they spark any ideas about how you might use these datasets? Is there anything additional that you would like future data description chapters to cover?

And further into the future, the Understanding Society Innovation Panel will continue to be a platform for producing new data. February 2025 will see the launch of the next Innovation Panel competition, offering researchers the opportunity to pitch their ideas for studies. If you have an idea for a study that might generate a novel dataset that would merit special documentation (or any other ideas that are a good fit for the Innovation Panel), sign up to the Understanding Society mailing list to be informed when the next competition is announced.

Find out more about the Innovation Panel

Authors

Jim Vine

Jim is a Senior Research Officer at Understanding Society
