On 18th December 2015, the ABS put out this media release on privacy of Census data. I didn’t notice, and I follow the ABS quite closely. Most people didn’t notice, possibly due to the proximity to Christmas. Just in the last two weeks, a few media commentators have picked up on it.
The media release says, in a nutshell, that the ABS is abandoning the long-standing (50 year) tradition of destroying name-identified information on Census forms, and for 2016 will be keeping names and addresses. Previously names and addresses were collected, but were destroyed after all the information was entered into the system, unless (since 2001) the respondent answered “Yes” to keeping the information in the 99 year time capsule for future use. The ABS stresses that it will be keeping names and addresses in a separate database to the main body of Census data (the Census Unit Record File), but that they will be used “to provide a richer and dynamic statistical picture of Australia through the combination of Census data with other survey and administrative data”.
What that means, in short, is “data linking”. Say you have a survey which collects information on the employment outcomes of school and university leavers in Australia. Linking the survey respondents to records from the Census enables the cross-classification of these outcomes with all the demographic characteristics collected in the Census. For instance, you would be able to see whether parental income or education levels affect the type of jobs students get after completing school. Importantly, you can see how previous parental income affects this, something not easy to do in a survey. In this way linking survey with Census data can also provide a longitudinal picture unavailable to either collection on its own.
Linking is pretty important for improving survey results and providing more useful output, and the ABS already do it. But they do it by probabilistic pattern matching, without reference to name and address – simply using the combination of Census answers to find the most likely matching individual in the survey. It’s not the names and addresses in themselves which are important in this new initiative, but that they provide a greater degree of certainty in linking the records between Census and survey results, and minimise the number of records for which no match can be found.
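To make the difference concrete, here is a minimal sketch in Python. It is not the ABS's actual method: the records, field names and the 0.75 threshold are all invented for illustration, and the attribute scoring is a much simplified stand-in for real probabilistic linkage, which typically weights each field by how informative it is. It only shows why two people who look identical on their Census answers cannot be separated without a name and address.

```python
from dataclasses import dataclass

# Hypothetical record layout, invented for this illustration only.
@dataclass(frozen=True)
class Record:
    record_id: str
    name: str
    address: str
    age: int
    sex: str
    postcode: str
    occupation: str

def exact_link(survey, census):
    """Link on name and address (case-insensitive exact match)."""
    index = {(c.name.lower(), c.address.lower()): c.record_id for c in census}
    links = {}
    for s in survey:
        key = (s.name.lower(), s.address.lower())
        if key in index:
            links[s.record_id] = index[key]
    return links

def similarity(s, c):
    """Fraction of answers that agree; name and address are not used."""
    fields = ("age", "sex", "postcode", "occupation")
    return sum(getattr(s, f) == getattr(c, f) for f in fields) / len(fields)

def attribute_link(survey, census, threshold=0.75):
    """Link each survey record to its best-scoring Census record,
    but only if the score clears the threshold and there is no tie."""
    links = {}
    for s in survey:
        scored = sorted(((similarity(s, c), c.record_id) for c in census),
                        reverse=True)
        best_score, best_id = scored[0]
        no_tie = len(scored) == 1 or scored[1][0] < best_score
        if best_score >= threshold and no_tie:
            links[s.record_id] = best_id
    return links

# Made-up data: C1 and C2 give identical answers to every question.
census = [
    Record("C1", "Ann Lee", "1 High St", 34, "F", "3000", "nurse"),
    Record("C2", "Bea Kim", "2 High St", 34, "F", "3000", "nurse"),
    Record("C3", "Col Roy", "9 Low Rd", 51, "M", "3121", "teacher"),
]
survey = [
    Record("S1", "Ann Lee", "1 High St", 34, "F", "3000", "nurse"),
    Record("S2", "Col Roy", "9 Low Rd", 51, "M", "3121", "teacher"),
]

print(attribute_link(survey, census))  # S1 is ambiguous, so it stays unlinked
print(exact_link(survey, census))      # name and address resolve the tie
```

In this toy data the attribute-only approach leaves S1 unmatched because two Census records fit equally well, while the name-and-address approach links both survey records. In practice names and addresses are messy (typos, nicknames, people moving house), so exact matching like this would itself miss genuine matches.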
Isn’t this an invasion of privacy?
Many commentators think so. See, for instance, Chris Berg’s article in The Drum last week.
However, the ABS never release personal information from the Census or any other survey, and they really aren’t interested in anyone’s personal details – it is purely for enhancing statistical collections. They have stressed that the names and addresses will be held in a different database to the full Census dataset, and secured and encrypted.
Nevertheless, with recent hacking scandals (e.g. Ashley Madison), it’s understandable that people could be worried about this data being released even with the ABS having the best intentions of keeping it secure.
Is the Census data that sensitive though?
I don’t actually think the Census questions are particularly intrusive. The question most people are wary of is the income question – presumably afraid that the ABS would share it with the ATO, Centrelink etc. They don’t – but even if they did, this wouldn’t be much use – as Census income is collected in broad ranges (eg. $1500-$1999 per week) – and relates to income from all sources, not annual taxable income anyway. Very useful for understanding the range of incomes and socio-economic status of an area – useless for catching tax evaders.
The other question many object to as being too personal is the Religion question – it generates more chatter around Census time than the rest of the Census questions combined. However, this question has always been optional anyway – people are free not to answer it.
To be honest, I applied for a credit card recently, and had to answer far more detailed and intrusive questions about my family situation and income position, which are now on record with the bank and credit reference agencies, with far fewer checks on privacy than the ABS has. I guess the difference, though, is that applying for a credit card is optional (though it’s hard to live without one these days!), while Census is an official collection which is required of Australians once every 5 years.
So what’s the problem with name and address retention?
The ABS security is certainly excellent and they have a strong commitment to the privacy of all their data collections, enshrined in legislation. The problem really is if there is a PERCEPTION of an invasion of privacy, and the risk that people will not answer Census questions truthfully if they fear their information will be kept and possibly released.
The reason the ABS collected names and addresses in previous Censuses at all, even though they were only going to be destroyed, is that there is a great deal of evidence that respondents answer questions more truthfully when their real name is attached to the form. This is a simple cognitive tool which leads people to be more honest and gives a better dataset as a result.
The danger is that, if people are concerned about the security of their information under this new development, it could backfire, and they could then falsify information on the Census form. We certainly hope no-one does this, but it is a concern if there is a negative reaction to this development.
The Census is the only source of reliable data for planning at the small area level in Australia. It is of vital importance to all levels of government, business, and indeed individuals, and raises the understanding of demographics and social change in the community. .id, and through us over 250 Local Government clients throughout Australia, use this data every day and need it to be accurate.
The job for the ABS, then, is to convince the population, in the relatively short time we have before Census day, that the data will be safe and confidential despite this change, drawing on its good track record in this area. If it cannot, the quality of the Census may suffer. If data quality is reduced, any enhancement gained from being able to link survey and Census datasets will be eliminated by the greater non-response rate or the statistical noise generated from false answers. And this is a worse outcome for Australia.
Who else is this change good for?
There should be one group of the population who are very happy about this change, and promote it widely as a good thing.
Every 5 years, the ABS receives more submissions to its consultation process asking FOR the compulsory retention of names and addresses than on any other Census-related topic (Religion-related submissions are a close second). This may seem odd, given the privacy concerns now that it’s been announced, but the people making these submissions have a very good reason for wanting name-identified information.
I’m speaking, of course, about genealogists, who regularly use historic Census records from the Australian colonies, and other nations, to research their family histories. They have long bemoaned the loss of the historic record since the 1960s, when the name-identified information began to be destroyed as a matter of policy. They should be quite pleased with this outcome, though whether the data will ever be released, even in the distant future, in a format which allows for genealogical research is still an open question.
Census is happening this year, on the 9th of August. The vitally important statistical information derived from it will begin to be incorporated in our Local Government client sites (profile.id, atlas.id and economy.id) from mid-2017. .id believes that Census is the cornerstone of our democracy and the data is of vital importance to all Australians. We encourage everyone to answer the Census honestly, and will be monitoring this issue closely.