Staff and teams are writing in their individual capacity and the views are not necessarily a "Treasury" view. Please read our disclaimer.

The New Zealand research community has been the scene of a revolution over recent years. A quiet and sedate revolution, but a revolution nonetheless. The creation of Stats NZ’s IDI (or Integrated Data Infrastructure), a treasure trove of linked data, sparked the revolution, and its ongoing development drives it along.

The IDI doesn’t collect anything new. Instead it gathers together data that is already collected, links it at a person level, anonymises it, and makes it available to researchers in government, academia, and beyond. Keeping people’s information safe and secure is critical, and is achieved by restricting access to bona fide researchers working in approved, secure facilities. Researchers can only take summarised statistical information out of this secure environment, and Stats NZ’s team of confidentiality experts checks all results before they are released.
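To make the idea of person-level linking and anonymisation concrete, here is a minimal sketch in Python. It is illustrative only and is not a description of how Stats NZ builds the IDI: the datasets, the field names (`person_id`, `name`, `income`, `gp_visits`), the shared identifier, and the keyed-hash approach are all assumptions made for the example. Real linkage is considerably more involved, not least because sources rarely share a clean common identifier.

```python
import hashlib
import hmac

# Hypothetical secret held only by the data custodian, never by researchers.
SECRET_KEY = b"hypothetical-secret-held-by-the-data-custodian"

# Fields assumed (for this sketch) to identify a person directly.
IDENTIFYING_FIELDS = ("person_id", "name")


def pseudonym(person_id: str) -> str:
    """Replace a real identifier with a keyed hash that cannot be reversed without the key."""
    digest = hmac.new(SECRET_KEY, person_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def link_and_deidentify(datasets):
    """Group records from several sources under one pseudonymous key per person.

    datasets: dict mapping a source name to a list of record dicts.
    Returns a dict mapping pseudonym -> {source name -> non-identifying fields}.
    """
    linked = {}
    for source_name, records in datasets.items():
        for record in records:
            key = pseudonym(record["person_id"])
            # Drop the identifying fields before the record joins the linked view.
            fields = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
            linked.setdefault(key, {})[source_name] = fields
    return linked


# Made-up example data, not drawn from any real source.
tax = [{"person_id": "A1", "name": "Ana", "income": 52000}]
health = [
    {"person_id": "A1", "name": "Ana", "gp_visits": 3},
    {"person_id": "B2", "name": "Ben", "gp_visits": 1},
]
linked = link_and_deidentify({"tax": tax, "health": health})
```

In this toy example the person with identifier `A1` ends up with a single linked record carrying both their tax and health fields, while their name and original identifier are no longer present.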

The IDI is essentially a traditional research database, but on a much larger scale. It is big data with a little ‘b’. Unlike true ‘Big data’, it doesn’t contain billions of records of real-time transactional data that require special techniques to analyse. The administrative and survey datasets of which it is composed are all essentially research datasets in their own right. But by bringing them together in one place, the resulting whole is far, far greater than the sum of its parts.

In its combination of administrative data, Census data, and sample survey data, the IDI opens up a new world of opportunities for public good research to improve policies and services for all New Zealanders. While “revolution” may seem a strong word, at its heart a revolution involves a “dramatic and wide-reaching change”, and it is easy to argue that the development of the IDI is facilitating change that is both of those things.

Dramatic and wide-reaching

The IDI had a gradual genesis, from the creation of the Linked Employer-Employee Data (LEED) in 2005, to the IDI prototype in 2011, and the IDI itself in 2013. Since 2013, its growth has been far more rapid. Where there were only a handful of users in its early years, there are now hundreds of people using IDI data to help answer thorny questions across the full range of social and economic research domains. The IDI is incredibly powerful for research, and has a number of important strengths.

  • Longitudinal – Providing a picture of people’s lives over time, crucial for understanding the effect of policies and services.
  • A full enumeration – Incorporating administrative data for almost all New Zealanders, enabling a focus on minority groups and small geographic areas.
  • Accessible – By making data available to researchers at relatively low cost, agencies are no longer gatekeepers of the data they collect, and a culture of sharing in the research community is encouraged.
  • Cross-sectoral – Allowing researchers to explore the relationships between different aspects of people’s lives that may be invisible to individual agencies.
  • Secure – Stats NZ manages the IDI to keep individuals’ data safe and their identities protected.

As powerful as it is, the IDI also has its limitations, and will never answer all of our questions. We will always need to interpret the results with caution, complement them with qualitative sources of data, and view them as just one contributor to the policy development process. Giving due weight to the things we cannot measure in the IDI will be as critical as analysing the things we can.

Learning to use the data also involves a steep learning curve, requiring an investment of considerable time and effort. In this respect, the complexity of the IDI is a weakness as well as a strength. Improved documentation, shared examples of code, and standardised data will lower these barriers, but this will take time. Even then, strong statistical skills will continue to be important to making well-informed and effective use of the data.

Treasury’s Analytics and Insights team was set up in 2013 to champion the development of IDI infrastructure and capability. While that role has evolved over time, the team has had a consistent focus on generating and publishing evidence about cross-sectoral issues to inform Treasury’s advice to the government of the day. Since 2015, we have been making new data and analysis available through our Insights page, as a complement to our traditional research papers. Insights contains a range of online data analysis tools and provides detailed information about the New Zealand population in an easy-to-use and interactive way.

Our changing population

The Analytics and Insights team recently added two new tools to Insights to help understand changes in the population of New Zealand and its territorial authority areas. This has provided the first detailed picture of New Zealand internal migration patterns outside of Census years. The tools use address records, birth and death records, and immigration data to identify who is in the country, and to identify where they might be living. Rules developed from Census and survey data then help us determine the most likely location of each person in each year, resulting in detailed descriptions of population change.
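The address-based part of this approach can be sketched roughly as follows. This is a simplified illustration, not the rules the team actually uses: the source names, the `SOURCE_PRIORITY` ranking, the 30 June reference date, and the example people are all hypothetical, and the step of deciding who is in the country at all (from birth, death, and immigration records) is left out.

```python
from collections import Counter
from datetime import date

# Hypothetical ranking of how much each address source is trusted (higher = more trusted).
SOURCE_PRIORITY = {"health": 3, "tax": 2, "education": 1}


def likely_location(notifications, year):
    """Return the most likely area for one person at 30 June of `year`, or None.

    notifications: list of (date, source, area) address notifications.
    Rule assumed here: take the latest notification on or before the reference
    date, breaking ties on date in favour of the more trusted source.
    """
    reference = date(year, 6, 30)
    candidates = [n for n in notifications if n[0] <= reference]
    if not candidates:
        return None
    best = max(candidates, key=lambda n: (n[0], SOURCE_PRIORITY.get(n[1], 0)))
    return best[2]


def internal_moves(people, year_from, year_to):
    """Count people whose likely area differs between two years, by (origin, destination)."""
    moves = Counter()
    for notifications in people.values():
        origin = likely_location(notifications, year_from)
        destination = likely_location(notifications, year_to)
        if origin and destination and origin != destination:
            moves[(origin, destination)] += 1
    return moves


# Made-up people and address notifications for illustration only.
people = {
    "p1": [(date(2014, 3, 1), "tax", "Auckland"), (date(2016, 2, 10), "health", "Tauranga")],
    "p2": [(date(2015, 1, 5), "health", "Auckland")],
    "p3": [
        (date(2015, 5, 1), "tax", "Auckland"),
        (date(2015, 5, 1), "health", "Hamilton"),
        (date(2016, 4, 1), "tax", "Hamilton"),
    ],
}
moves = internal_moves(people, 2015, 2016)
```

Only summary counts such as `moves` would ever be reported. In this toy data one person is counted as moving from Auckland to Tauranga between the two years, while the other two stay where they were.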

The new analysis helps local councils, central government agencies, and other decision-makers to understand our changing population. The true strength of Insights is its flexibility to answer questions that are specific to the user, filtering the tens of millions of data points, and presenting the information you want in a way that is accessible and useful to you. Amongst a great many other things, Insights reveals that:

  • 38,000 people left Auckland to move to other areas of New Zealand between 2015 and 2016, with the most common destinations being Tauranga, Hamilton, Waikato District, and Whangarei,
  • the population of Christchurch City became more male-dominated following the 2010 and 2011 earthquakes, with migrants from the Philippines, India, China, and the UK arriving to work on the rebuild,
  • there were three times as many 65 to 69-year-olds as 20 to 24-year-olds living in Thames-Coromandel District in 2016 (up from twice as many in 2008),
  • the population of Queenstown-Lakes District grew at a faster rate than any other area between 2008 and 2016, mainly due to the arrival of large numbers of young migrants, especially from the UK.

Onwards and upwards

The number of IDI projects is continually increasing, as the IDI’s potential to shed light on issues of public importance grows. Stats NZ’s new online research database highlights the huge breadth of research underway for the benefit of all. While the addition of new data sources has slowed somewhat, the passage of time brings an ever-longer back-series of data, and the addition of survey data provides a critical complement to the administrative data that forms the spine of the IDI. This will be particularly important as our focus broadens beyond the things that agencies typically measure.

Innovative use of a combination of survey and administrative data in the IDI will be a critical contributor to realising the current Government’s wellbeing vision, and to successfully applying the Treasury’s Living Standards Framework to practical investment decisions. Vive la révolution!