Integrated Census Microdata (I-CeM)
Getting started
Census taking in Britain began in 1801, although it was not until 1841 that the names and details of individuals were collected. In comparison to later enumerations, the information collected in 1841 was still limited; for example, birthplace data in England and Wales were confined to whether individuals were born in the same county in which they were enumerated or elsewhere. Likewise, information on occupations for 1841 was often confined to the head of the household. Furthermore, analysis of household structure is seriously restricted by the absence of a field recording relationship to the head of household. As a result, the I-CeM data collection only covers the period 1851 to 1921, after which the census returns are currently closed to public inspection. The 1931 census does not survive and no enumeration was taken in 1941, meaning that it is likely no further census data will be available until 2051. The English and Welsh census of 1871 has not yet been fully transcribed, so only the Scottish data for that year are included in I-CeM. Similarly, the Scottish data for 1911 are currently being transcribed and may be added to I-CeM in coming years.
Before beginning work with the I-CeM dataset, it is important to recognize what it is and what it is not.

Image: © British Library Board P.P.7611
The strength of census data as a source for historical research is their comprehensiveness: they contain information on every person in the country on census night. However, this information must be treated with caution, as the raw census data may in many cases be erroneous, contradictory, or missing. This is a consequence of the enumeration process, which involved an army of enumerators distributing household schedules and millions of household heads interpreting the census questions in different ways. Additionally, the census data represent a series of snapshots in time, which may not be representative of what people were doing or where they were living for the rest of the year. Early in the period of census-taking, illiteracy and innumeracy could result in non-standard spellings of names and in age heaping, where people reported their ages rounded to the nearest five or ten years because they did not know their exact date of birth. The variety of reported occupations and their degree of specificity also vary hugely. For example, the data include over three hundred different ways of expressing the occupation 'Blacksmith' and over two hundred ways of recording 'Hammersmith' as a birthplace. This problem becomes more pronounced in the later censuses. Nor were all of the I-CeM data produced from the same kind of source. The original household schedules do not survive before 1911, so I-CeM for 1851-1901 (and 1851-1921 in the case of Scotland) is instead based upon the Census Enumerators' Books. These were produced by copying out the schedules, which involved a certain level of standardisation that is not present in the 1911 and 1921 English and Welsh data, which derive from transcription of the actual household schedules.
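Age heaping of the kind described above is commonly measured with Whipple's index, which compares the number of reported ages ending in 0 or 5 with the number expected if ages were spread evenly. The following is a minimal sketch in Python, assuming the ages are available as a simple list of integers; the function name is illustrative and is not part of I-CeM.

```python
def whipple_index(ages):
    """Return Whipple's index for an iterable of integer ages.

    Only ages 23 to 62 inclusive are considered, following the usual
    convention. A value of 100 indicates no preference for ages ending
    in 0 or 5; a value of 500 means every reported age ends in 0 or 5.
    """
    in_range = [a for a in ages if 23 <= a <= 62]
    if not in_range:
        raise ValueError("no ages between 23 and 62 to analyse")
    # Count ages that are multiples of five (ending in 0 or 5).
    heaped = sum(1 for a in in_range if a % 5 == 0)
    # One in five ages would end in 0 or 5 if there were no heaping.
    return 100.0 * 5 * heaped / len(in_range)


# Evenly spread ages show no heaping.
even_ages = list(range(23, 63))
print(whipple_index(even_ages))  # 100.0
```

Values well above 100 suggest that many people reported rounded ages, which is worth checking before using age as an exact variable.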
For these reasons, the raw census strings are usually not what researchers will want to use. Instead, the I-CeM data reconciliation and enrichment processes have involved significant work to standardize these data into a useable form, including coding occupations, relationships, and household structures into internationally recognized classifications or bespoke categorisation systems. Thus, for most purposes researchers will only need these coded variables.
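To illustrate the general idea of coding variant raw strings to a single standard value, here is a minimal sketch in Python. It is not the I-CeM coding procedure: the lookup table, the code labels, and the function names are all hypothetical and chosen only for the example.

```python
import re

# Hypothetical lookup from a normalised raw string to an illustrative code.
# These labels are invented for this example and are not I-CeM codes.
OCCUPATION_CODES = {
    "blacksmith": "OCC_BLACKSMITH",
    "black smith": "OCC_BLACKSMITH",
    "blacksmith journeyman": "OCC_BLACKSMITH",
}


def normalise(raw):
    """Lower-case the string, replace punctuation with spaces, collapse spaces."""
    cleaned = re.sub(r"[^a-z ]+", " ", raw.lower())
    return " ".join(cleaned.split())


def code_occupation(raw):
    """Return the illustrative code for a raw string, or 'UNCODED' if unknown."""
    return OCCUPATION_CODES.get(normalise(raw), "UNCODED")


for raw in ["  Black-Smith ", "BLACKSMITH (Journeyman)", "Farmer"]:
    # The raw string is kept alongside the code rather than overwritten.
    print(repr(raw), "->", code_occupation(raw))
```

Analysis can then group on the coded value, while the raw string remains available for checking individual cases.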
From the standardised variables, a wide range of data enhancements and derived variables have been constructed to augment the transcribed census data. For example, all households containing servants are directly identifiable, as are those that contain a married child living with their parents. The raw census data entries have been preserved throughout, with indicator variables to show where changes have been made during the enrichment process. For a summary of types of variables and their chronological coverage, see the I-CeM variable list.
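A household-level derived variable of this kind can be thought of as a flag computed from the coded person-level records. The sketch below, in Python, shows the principle only; the field names (`household_id`, `relationship`, `marital_status`) and the code values are assumptions made for the example and do not correspond to actual I-CeM variable names.

```python
# Hypothetical person-level records with coded variables.
people = [
    {"household_id": 1, "relationship": "HEAD", "marital_status": "MARRIED"},
    {"household_id": 1, "relationship": "SERVANT", "marital_status": "SINGLE"},
    {"household_id": 2, "relationship": "HEAD", "marital_status": "WIDOWED"},
    {"household_id": 2, "relationship": "CHILD", "marital_status": "MARRIED"},
    {"household_id": 3, "relationship": "HEAD", "marital_status": "SINGLE"},
]


def households_where(records, predicate):
    """Return the set of household ids with at least one matching person."""
    return {p["household_id"] for p in records if predicate(p)}


# Households containing at least one servant.
with_servants = households_where(
    people, lambda p: p["relationship"] == "SERVANT"
)

# Households in which a child of the head is recorded as married.
with_married_child = households_where(
    people,
    lambda p: p["relationship"] == "CHILD" and p["marital_status"] == "MARRIED",
)

print(sorted(with_servants))       # [1]
print(sorted(with_married_child))  # [2]
```

Because the flags are built from coded variables rather than raw strings, the same test works regardless of how the relationship was originally written down.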
The Cambridge Group for the History of Population and Social Structure