What Is Alternative Data and How Can It Help Efforts to Leave No One Behind?
Official statistics and measures of poverty do not fully capture the causes of marginalization and how they intersect and interact. The 2030 Agenda is catalyzing a shift in how the world thinks about data and the use of "non-official data sources" to better reflect the needs of the most marginalized.
The commitment of the 2030 Agenda to leave no one behind and to address the needs of the “furthest behind first” acknowledges that previous efforts to reduce poverty and end marginalization have failed to reach some of the individuals, communities, and countries that need them the most. While poverty has been reduced in many countries, the most marginalized have seen little to no benefit. One reason is that official statistics and measures of poverty do not fully capture the causes of marginalization and how they intersect and interact. The 2030 Agenda is catalyzing a shift in how the world thinks about data to better reflect the needs of the most marginalized.
Recognizing that better data will be required to achieve the SDGs while leaving no one behind, the UN Statistics Division established the World Data Forum on Sustainable Development Data. The Forum is intended to be a platform for improved cooperation between data stakeholders at the national and international levels to mobilize data for sustainable development and to fill data gaps. At its first session in 2017, the Data Forum adopted the Cape Town Global Action Plan for Sustainable Development Data, which calls for integrating new and innovative data generated outside the official statistical system (including administrative data and geospatial data) into official statistics. The Plan also encourages the development of multistakeholder partnerships involving national statistical offices (NSOs), governments, academia, civil society, the private sector, and other stakeholders involved in the production and use of data for sustainable development.
In subsequent discussions, participants in the Data Forum increasingly recognized that NSOs must collaborate with the entire data ecosystem—that is, all stakeholders involved in producing and using data, including communities, government, business, and civil society—to produce data fit for the task of leaving no one behind. Participants highlighted innovative data sources and citizen-driven data as essential tools to “fill data gaps on the status and needs of people by income, sex, age, race, ethnicity, migratory status, disability and geographic location and other characteristics.” The discussion also shifted from a focus on “integrating” non-official data sources into statistical systems, which requires other data stakeholders to apply standards and procedures used by NSOs, to a focus on complementing official data with data from alternative sources using their respective standards.
The concept of alternative data thus encompasses any data collected by stakeholders other than the NSO that meets a minimum set of standards to ensure privacy, confidentiality, transparency, and accessibility. This broad definition makes it possible to draw on a wide variety of potentially useful data sources, several of which are emerging as particularly important for leaving no one behind.
- Citizen-generated data, where the individuals concerned participate in the development of frameworks and in data collection, and decide on the use of data that describes them. Citizen-generated data is purpose driven and provides important insights into the drivers of marginalization affecting certain groups or localities.
- Human rights data, which includes data on human rights cases and data on legislative review. This data helps identify where marginalization is the consequence of systemic racism or a failure to protect the rights of individuals and groups.
- Geospatial data, which in combination with other statistical data can identify where marginalized groups live and how geography and locally specific factors influence marginalization. Geospatial data can overcome challenges of data collection arising from the fact that marginalized people often live in informal settlements, lack a permanent address, or are reluctant to share their data for fear of further marginalization.
- Administrative data, which is collected by government agencies and non-governmental organizations serving marginalized groups as part of routine operations. While not intended for statistical purposes, this data can be turned into datasets that can fill specific data gaps in official data sources.
- Private sector data, which companies collect as part of efforts to report on their environmental, social, and governance (ESG) impacts. ESG data can enable companies to assess the impact that their activities, as well as their employees, have on marginalized groups. However, public access to this data is often limited, and common foundations for impact measurement that would enable broader use of ESG data are still being developed.
These are just some examples from a rapidly growing field of alternative data sources and innovative uses of existing data to leave no one behind. Beyond the question of which alternative data can complement existing sources, the 2030 Agenda is also catalyzing a discussion on how data should be used. Traditionally, NSOs or equivalent institutions act as the main data steward for a country, responsible for collecting and publishing high-quality data that adheres to agreed standards for data privacy and safety. This means that decisions on what data is collected, how it is disaggregated, and ultimately how it is used are centralized in a top-down fashion.
This model is coming under scrutiny as mounting evidence shows that data can be used far more effectively if people have a say in the collection and use of data describing them. Participation in decisions on what data is collected and how it is disaggregated and communicated ensures that data reflects the experience, values, and perspectives of marginalized groups and that data collection ultimately benefits those who agreed to share data about themselves. Several initiatives support this transformation in data governance from different perspectives, including principles for a human rights-based approach to data, the definition of common data values, and best practices for the responsible use of data.
The latest edition of the World Data Forum, held in Bern, Switzerland, in late 2021, also captured these trends in its final declaration, the Bern Data Compact for the Decade of Action on the SDGs. The compact appeals to all members of the data ecosystem to develop data partnerships and urges investments in data literacy and trust in data to better understand the world through data and leave no one behind. Speakers at the Forum echoed these ambitions, noting that “data is power” and we “have it in our hands to give that power to the people.”