By Ana Acosta, Aspasea Mckenna, Liz Sukhinenko, and McKenzie Horwitz


Ana Acosta, Aspasea Mckenna, Liz Sukhinenko, and McKenzie Horwitz are second-year International Development students who traveled to India to develop a Data Quality Framework (DQF) which assesses India’s FSM data ecosystem with Athena Infonomics and the Bill and Melinda Gates Foundation.

The IDEV Practicum allows students to work directly with public, private and non-governmental organizations as a capstone to their graduate studies. The 2020 IDEV Practicum Blog is a seven-part series that chronicles the travels of IDEV students who take on client projects over winter break.


High quality data is essential for development researchers and practitioners to implement effective interventions and for the development community as a whole to meet goals like the UN’s Sustainable Development Goals (SDGs). Data helps measure whether development programs and policies - and the resources needed to implement them - are achieving their outcomes and creating the intended impact in the lives of the people they serve. Effective data use is especially important for development practitioners who work with scarce programmatic resources as it allows them to go beyond good intentions and intuition when deciding which programs to implement and how. However, data is only informative if it is of high quality, such as lacks inconsistencies and data gaps, has relevant metadata, and is accessible to its users. Good-quality data enables users to make better-informed decisions about current and future program implementation. When the opposite is true, data can hinder, if not be detrimental to, the decision-making process.

In India, Narendra Modi’s administration has invested INR 1.96 lakh crore (USD 28 billion) in improving sanitation infrastructure and practices across the country through an initiative called Swachh Bharat (Clean India).[1] This involved the planned construction of around 100 million toilets across the country to eliminate open defecation. This initiative focuses on these first phases of the shit-flow diagram, a tool that sanitation experts and urban planners use to map out excreta from initial generation and containment to treatment and disposal. However, the later stages regarding the proper disposal and treatment of fecal sludge cannot be ignored if India is to enhance sanitation across the country as a whole. To plan these later stages, reliable population and sanitation data is essential. However, national surveys, like the Census and those administered through the Swachh Bharat initiative, include unreliable self-reported sanitation data, do not include sanitation infrastructure data, and, in the former case, is outdated as it is administered on a ten-year basis. This poses problems for organizations operating in the fecal sludge management (FSM) space in India, like the Bill and Melinda Gates Foundation (BMGF) and the local organizations that they fund, as they lack the data they need to understand the state of FSM in the country for planning and monitoring purposes.. Standardizing these indicators and ensuring that quality data is being collected is imperative to ensure that India continues to make progress on SDG sanitation goals.

In this context, our practicum team worked with Athena Infonomics and BMGF to develop a Data Quality Framework (DQF) which assesses India’s FSM data ecosystem.

The DQF defines data quality through six dimensions – accessibility, completeness, consistency, reliability, integrity, and timeliness. The six dimensions are operationalized into a set of questions that are then applied to each indicator to produce a final score. In this way, states can use the framework to assess their data quality strengths and weaknesses on an ordinal scale, as well as compare data quality across states, themes, and dimensions. We piloted the DQF by applying it to data collected by three BMGF state implementing partners in 2018-2019.

In order to triangulate and provide richer context to the quantitative results of the DQF, our team traveled to New Delhi and Bangalore to conduct structured interviews with the three state-level implementing partners for which the DQF was applied. These implementing partners provided the team with insights on the secondary and primary data collection process in their states, their individual role in state data systems, and the changes they thought would most improve their data ecosystem. This process helped us identify a few gaps, such as the cumbersome top-down process of gaining access to government data, as well as the data collection roadblocks from informally-operating FSM collection trucks. The interviews also shed light on ways in which the data quality can be improved in a sustainable way.

The research we conducted is only a preliminary step towards understanding the complexities of this data ecosystem and harmonizing quality data collection processes across the country. More work is needed to understand how various agents in this ecosystem interact in different states, and how Athena and BMGF can help their local partners improve the functioning of this ecosystem. Understanding this data ecosystem is a critical step towards radically improving FSM service delivery and improving the state of sanitation for India’s population.

[1] Das, Anjana, and Subhash Narayan. “Nearing Completion Swachh Bharat Sees Cut in Allocation: Official.” Https://Www.outlookindia.com/, Outlookindia.com, 12 July 2019, www.outlookindia.com/newsscroll/nearing-completion-swachh-bharat-sees-cut-in-allocation-official/1573421.


To read about the work that other IDEV Practicum teams did this year, visit this page.



Photo Credit: Ana Acosta, Aspasea Mckenna, Liz Sukhinenko, and McKenzie Horwitz

Comment