The Data Processing team at DANS manages an extensive collection of datasets with municipal data from the Dutch National Census of 1947, stored in EASY. The collection was the subject of a recent article, ‘Detailed Tables from the Dutch Census 1947: Experiences and Lessons Learned in Publishing a Large Dataset’, which appeared in the Research Data Journal for the Humanities and Social Sciences.
Digitized census data
Census records have a long history. In the Netherlands, censuses have been taken since 1795, usually every 10 years. Up to and including 1971 these were mainly traditional counts, in which enumerators went from house to house and filled in questionnaires together with the residents. The results were published in print, totalling some 42,000 pages over the period 1795-1971. Since 1997, these publications have been digitized in various projects. In 2004 all of the digital tables were made available at DANS in a processable format.
Detailed data from 1947 Census
In addition to the publications, detailed data from a number of censuses have been kept at Statistics Netherlands (CBS). For the 1947 Census this amounts to approximately 30,000 sheets of A4 paper, containing data per municipality, and partly also per district and neighbourhood. They cover a wide range of topics, from age and nationality to denomination, from occupation to commuting, and more.
How this massive amount of data was made reusable
Scanned images of the 30,000 sheets had already been created. Their contents were transferred to Excel worksheets through data entry. The challenge was to publish this mass of data in a way that makes it easily available for new historical research. The article describes in detail how this was achieved. The guiding principle was to make the entire dataset FAIR, with particular focus on the R for Reusable.
Example of new analysis
The reusability of the detailed dataset for the 1947 Census is illustrated with an example. It concerns the number of residents without Dutch citizenship per 10,000, per municipality, in 1947. The article concludes with suggestions for further use of the dataset.
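As an illustration of the indicator in this example, the calculation can be sketched in a few lines of Python. This is only a minimal sketch: the function name `rate_per_10000`, the placeholder municipalities and all of the counts are made up for illustration and are not taken from the 1947 dataset or from the article.

```python
def rate_per_10000(count, population):
    """Return the number of people in a group per 10,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply first so that whole-number results stay exact.
    return count * 10_000 / population


# Hypothetical figures, NOT real census data:
# municipality -> (residents without Dutch citizenship, total residents)
sample = {
    "Municipality A": (150, 30_000),
    "Municipality B": (12, 8_000),
}

# Compute the indicator for every municipality, rounded to one decimal.
rates = {
    name: round(rate_per_10000(count, population), 1)
    for name, (count, population) in sample.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate} per 10,000 residents")
```

Expressing the figure per 10,000 residents rather than as an absolute count makes small and large municipalities comparable with one another.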