Mining for Data Gold!

Avoiding the data trap blog series

‘Avoiding the Data Trap’ is a 3-part blog series developed by Pamoja to highlight a new approach to impact evaluation called Contribution Tracing. The series explains key steps in Contribution Tracing that can guide evaluators, and those commissioning evaluations, to avoid common data traps by identifying and gathering only the strongest data. It draws on a live case study: a Contribution Tracing pilot evaluation of Ghana’s Strengthening Accountability Mechanisms (GSAM) project. This pilot forms part of a learning partnership called the Capturing Complex Change Project, between Pamoja, CARE UK International and CARE Ghana and Bangladesh Country Offices.

Part 1: Mining for Data Gold!

With Monitoring and Evaluation now a standard feature of development projects, NGO staff and evaluation practitioners are charged with the sometimes daunting task of gathering evidence to prove the influence of programming on complex social change. Examples of what NGOs such as CARE are doing to tackle poverty and address social injustice are endless. Often, we can see change happening in the communities we serve. However, the process of showing the ‘how’ often results in pages and pages of ‘data’ that yield little reliable evidence. It is frustrating to believe strongly that programming has made a difference for the better, and then fail to capture data that supports a clear cause and effect relationship. We struggle to claim with confidence just how our work actually contributed to positive change. How many of us have been here too many times before?

What is the data trap?

When evaluating a claim made by a project or programme about the role it may have played in contributing to an observable change, it is crucial to gather evidence that strengthens our confidence in that claim. All too often when substantiating ‘contribution claims’, strengthening our confidence in the claim is confused with simply collecting an abundance of data. We miss the mark by failing to focus on the relative strength (or weakness) of such data. Wasting time, energy and resources collecting data that does nothing to increase confidence in the claim is what we like to call a data trap.

Enter Contribution Tracing: a new theory-based impact evaluation approach. It combines the principles and tests found in Process Tracing with Bayesian Updating. Contribution Tracing helps sort the data wheat from the chaff! Most importantly, it changes the way we look at data, encouraging us to identify and seek out the best quality data with the highest probative power. Contribution Tracing gives us a clear strategy for avoiding the data trap, supporting evaluators instead to mine for data gold.

So how does it work? To illustrate, let’s draw from a live Contribution Tracing evaluation which is part of the Capturing Complex Change learning partnership. Ghana’s Strengthening Accountability Mechanisms (GSAM) is a USAID-funded, multi-year intervention led by CARE with partners IBIS and ISODEC. The ultimate aim of GSAM is to support citizens to demand accountability from their local government officials.

The GSAM evaluation team are currently testing the following claim, using Contribution Tracing:

GSAM’s facilitation of citizens’ oversight on capital projects has improved District Assemblies’ responsiveness to citizens’ concerns.

Essentially, this claim states that a range of activities provided by, or funded by, GSAM has supported citizens to become more engaged in scrutinising government-funded building projects. As a result, District Assemblies (local government) have become more responsive to concerns presented by citizens about the quality, performance and/or specification of on-going capital projects in their communities, such as the construction of new schools or roads.

To test this claim, we need to unpack the mechanism that provides a causal explanation for how the project’s range of facilitation activities contributes to the outcome of District Assemblies becoming more responsive to citizens’ concerns.

In Contribution Tracing, causality is thought of as being transmitted along the mechanism, with each interlocking component being a necessary part. A mechanism component comprises two essential elements: an entity (such as an individual, community or organisation) and an activity or behaviour that the entity performs, or particular knowledge, attitudes or beliefs that it holds.

One of the necessary components identified by the GSAM evaluation team is shown below:

The GSAM project (entity) delivered training to Civil Society Organisations (activity) to increase their knowledge and skills in engaging with District Assemblies on the planning and implementation processes of capital projects.

In Contribution Tracing, the role of the evaluator is to identify evidence that tests whether each component in the mechanism for a particular claim actually exists, or not. If sufficient empirical evidence can be identified and gathered for each component in a claim’s mechanism, we can update our confidence in the claim, quantitatively.

But wait! Before running off to gather whatever data we can lay our hands on, in Contribution Tracing we take several initial steps to help design our data collection (Box 1). These steps focus our attention on only gathering specific data that supports testing the existence of each component of our claim’s mechanism. Why is this important?

  • It saves a lot of effort in gathering essentially useless data, in respect of our claim;
  • It saves limited resources e.g. staff time, finance, etc;
  • It’s more ethical because we are not asking key informants to spend their precious time providing information that we won’t use; and
  • It produces more rigorous findings.

This blog is focused on step 1, with later blogs in the series describing the other steps.


To begin the data design process in Contribution Tracing, we ask: “If the component of the claim is true, what evidence would we expect to find?” In other words, if the GSAM project really did provide Civil Society Organisations with training, what evidence should be readily available if we look for it? Some examples of such ‘expect to find’ evidence are shown in Box 2.

The logic behind identifying ‘expect to find’ evidence is simple. If the component of the claim is true (if the project really did deliver its training programme), the evaluator should be able to find such evidence easily. Failure to find ‘expect to find’ evidence diminishes the evaluator’s confidence in the existence of the component of the claim (and perhaps in the claim overall). ‘Expect to find’ evidence, therefore, becomes powerful only when it is not found.

In addition to ‘expect to find’ evidence, we must also try to identify ‘love to find’ evidence. This is evidence which is harder to identify and find but which, if found, greatly increases our confidence in the component of the claim (and perhaps in the claim overall). We can think of ‘love to find’ evidence as highly unique to the component of the claim. Box 3 shows an example.

While we would love to find video footage of the training event being delivered, it is not an expectation. It is not usual practice to film such events in this context, but if filming did take place, and the evaluation team could gather such evidence, it would confirm the component of the claim. So, while ‘expect to find’ evidence only becomes powerful when not found, ‘love to find’ evidence becomes powerful when it is found.
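For readers who like to see the logic in numbers, here is a minimal sketch of the asymmetry between the two kinds of evidence, using plain Bayes' rule. Everything in it is an assumption made for illustration: the starting confidence of 0.5, the function name `posterior`, and all of the probabilities are invented, and none of them come from the GSAM evaluation or from the Contribution Tracing method itself.

```python
def posterior(prior, p_e_if_true, p_e_if_false, found):
    """Bayes' rule: confidence in a component after searching for one item of evidence.

    prior         -- confidence in the component before looking (0 to 1)
    p_e_if_true   -- chance of finding the evidence if the component is true
    p_e_if_false  -- chance of finding the evidence if the component is false
    found         -- True if the evidence was found, False if it was not
    """
    if not found:
        # Not finding the evidence is also an observation: use the complements.
        p_e_if_true, p_e_if_false = 1 - p_e_if_true, 1 - p_e_if_false
    numerator = prior * p_e_if_true
    return numerator / (numerator + (1 - prior) * p_e_if_false)


PRIOR = 0.5  # assumed starting confidence, purely illustrative

# Hypothetical 'expect to find' evidence: almost certain to exist if the
# training happened, but quite likely to turn up even if it did not.
EXPECT = {"p_e_if_true": 0.95, "p_e_if_false": 0.80}

# Hypothetical 'love to find' evidence (e.g. video footage): rarely available
# even if the training happened, but almost never available if it did not.
LOVE = {"p_e_if_true": 0.10, "p_e_if_false": 0.001}

results = {
    "expect_found": posterior(PRIOR, found=True, **EXPECT),
    "expect_not_found": posterior(PRIOR, found=False, **EXPECT),
    "love_found": posterior(PRIOR, found=True, **LOVE),
    "love_not_found": posterior(PRIOR, found=False, **LOVE),
}

for label, value in results.items():
    print(f"{label}: {value:.2f}")
```

With these made-up numbers, finding the ‘expect to find’ evidence barely moves our confidence (0.50 to about 0.54), while failing to find it drops confidence to 0.20. The ‘love to find’ evidence behaves the other way round: finding it lifts confidence to about 0.99, while not finding it leaves us close to where we started (about 0.47).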

This step in Contribution Tracing helps the evaluation team to begin the process of focusing on identifying data gold, but it is only the first step. In the next blog, we will explore how we use probabilities to be even more targeted in our search for data gold.

Part 2 of the blog series will be published on 31 July 2017.