
Randomised Controlled Trials

The best-intended social policies can have little or even negative effect if large portions of the resources are syphoned off. While a consensus seems to be emerging that reducing leakage, defined as both corruption and misuse of resources, is a priority in achieving social policy objectives, relatively little is known about how to do so. A new approach to answering questions in social policy is taking hold across the field of international development: randomised controlled trials (RCTs). Borrowed from medical trials for new drugs, RCTs are considered the most rigorous methodology available for estimating the real impact of an intervention. RCTs can help us understand what kinds of programs work best in tackling corruption.

Some Questions

What are the most cost-effective ways of achieving transparency and accountability in basic service delivery? For example, how can we ensure that teachers and health workers in rural areas face the right incentives to do their work? How can one guarantee that funds intended to build community roads are actually used that way? More generally, how can one make sure that resources reach the intended beneficiaries without being captured by elites?

There is a wide range of possible solutions, from incentive-based pay for civil servants to community monitoring, publication of budget allocations, enhanced auditing, integrity pledges, and legal reform. But how do we know which intervention works best under which circumstances? How do we know which interventions are the most cost-effective? How can we make sure we use the limited resources available for anti-corruption work in the most effective manner?

Randomized controlled trials are beginning to shed some light on some of these questions.

Why use RCTs to address these questions?

RCTs are considered the most rigorous way to assess the causal effects of an intervention. When we think about impact, we are really asking two questions: First, what would have happened in the absence of the intervention? Second, how does the situation with the intervention compare to that counterfactual? Answering the first question is non-trivial. The world we live in is complex: many different influences operate at the same time, making it difficult to attribute any observed change to a single event.

Things get even more complicated when the uptake of a program is self-initiated. Different types of governments, communities, or individuals select into different programs, making any before-after comparison prone to bias. Say we want to know the impact of publishing budget allocations in local communities, where uptake of the program is up to the local leadership. A study compares the change in the percentage of funds leaked across communities that adopted the program and those that did not, and finds that adopters significantly reduced the share of leaked funds. What can we conclude about the effectiveness of the program? Unfortunately, nothing. Communities that adopt the program are likely to be different in the first place: they may have a more active civil society, politicians with better intentions, or face more pressure to ‘clean up’. This may mean that they were on a different trend all along. It may also mean that they adopted other measures simultaneously, precisely because they had an interest in improving accountability.

RCTs get around these challenges by constructing a valid counterfactual. For example, to evaluate this program one would identify a pool of two hundred eligible communities and randomly select one hundred to receive the program (the treatment group) and one hundred not to receive it, at least not at this time (the control group). Comparing outcomes between the treatment and control groups, say two years later, isolates the actual impact of the program itself. Because of the randomization, unobservable characteristics such as intrinsic motivation or the activity of civil society are, in expectation, the same across the two groups. General time trends, such as other government programs or economic growth, affect both groups equally and therefore cancel out of the comparison.
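As an illustration, the assignment and comparison described above can be sketched in a few lines of code. All the community data here is invented purely for the example; the point is the logic of random assignment and the difference-in-means comparison.

```python
import random
import statistics

random.seed(42)  # fix the seed so the illustration is reproducible

# Hypothetical pool of 200 eligible communities, identified by index.
communities = list(range(200))

# Randomly assign half to treatment (receives the program), half to control.
random.shuffle(communities)
treatment = set(communities[:100])
control = set(communities[100:])

# Hypothetical outcome measured two years later: the share of funds leaked.
# (Invented numbers: baseline leakage between 10% and 40%, with the program
# assumed to reduce leakage by 5 percentage points on average.)
leakage = {c: random.uniform(0.10, 0.40) - (0.05 if c in treatment else 0.0)
           for c in range(200)}

# Because assignment was random, the estimated impact is simply the
# difference in mean outcomes between the two groups.
mean_treat = statistics.mean(leakage[c] for c in treatment)
mean_ctrl = statistics.mean(leakage[c] for c in control)
effect = mean_treat - mean_ctrl
print(f"Estimated effect on leakage: {effect:.3f}")
```

A naive before-after comparison among self-selected adopters would confound the program with everything else adopters do differently; random assignment is what licenses the simple subtraction in the last step.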

Some answers, and more questions

In recent years, development economists and political scientists have started addressing some of the questions around governance and accountability by measuring the impact of different interventions through randomized controlled trials. A few findings from these studies and remaining open questions are highlighted below.

The first question is whether participatory community monitoring initiatives or government audits are more effective in curbing corruption, and in which types of programs. Ben Olken’s experimental work in Indonesia suggests that community monitoring can be very effective if (a) leakage is easily detectable by laypeople and (b) the monitoring concerns the provision of private goods (such as salaries in a public works program). If, on the other hand, the monitored service relates to public goods (such as roads), or if leakage is hard for laypeople to detect, its effectiveness is limited. Olken’s work suggests that in these circumstances independent audits are more effective.1 More work is needed to explore whether these patterns hold in other contexts.

Second, what institutional arrangements for community monitoring are the most effective? Should everyone be included, in order to maximize participation, or is it more effective to include only a few elected focal people from the community in order to minimize free-riding? Alternatively, can we use modern technology to have service providers monitor themselves? For example, in India, Esther Duflo et al. found that requesting teachers to take pictures of themselves in front of their class every morning and afternoon with a disposable camera with tamper-proof time stamps, and making salary payments contingent on it, dramatically reduced teacher absenteeism.2

Third, is community monitoring alone sufficient? Björkman and Svensson found community monitoring of health service provision in rural Uganda to be highly effective. Part of the intervention was training communities in how to act on observed misuse of resources, establishing fora with health service providers and politicians, etc.3 A related intervention in India, community monitoring of schools, did not have any effect.4 One possible explanation is that it was not paired with an intervention enabling communities to act on information about misuse of funds. More work is needed to explore this systematically, for example by directly comparing the effectiveness of community monitoring ‘only’ with community monitoring ‘plus’, i.e. combined with an avenue to take action and exert pressure on politicians.

Another set of open questions centers around norms: What role do changes in norms among either community members or politicians and bureaucrats play in reducing corruption? Can they be induced from the outside, for example through community awareness campaigns or integrity pledges for government officials?5


Limitations

All methodologies have their shortcomings, and RCTs are no exception. While they are considered the gold standard for estimating the causal effects of an intervention, they only work in certain circumstances. First of all, one needs to think about evaluation before the project starts – once it is underway it is too late (this caveat really applies to any serious evaluation).

Second, the approach relies on randomizing a sufficiently large number of units into treatment and control groups in order to establish a meaningful counterfactual. This implies that (a) it must be possible to target the intervention to a specific user group, and (b) effects must be local. For example, an integrity pledge aimed at reducing corruption at the local level fits the bill. National legal reform meets neither criterion: it targets a national legal system, and its expected effects are national. Advocacy campaigns aimed at changing national policy-making fail for the same reason: while implementation may be localized, the expected effect occurs at the national level.
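To give a feel for what ‘sufficiently large’ means, the standard two-arm sample-size formula can be sketched as follows. The effect size and outcome variability used here are illustrative assumptions, not figures from any of the studies discussed.

```python
import math

def n_per_arm(effect_size: float, sd: float,
              z_alpha: float = 1.96,   # two-sided 5% significance level
              z_beta: float = 0.84     # 80% power
              ) -> int:
    """Units needed in each arm to detect `effect_size`, given the
    standard deviation `sd` of the outcome, using the standard
    two-sample formula n = 2 * (z_alpha + z_beta)^2 * (sd / effect)^2."""
    n = 2 * (z_alpha + z_beta) ** 2 * (sd / effect_size) ** 2
    return math.ceil(n)

# E.g. to detect a 5-percentage-point drop in leakage when leakage varies
# across communities with a standard deviation of 10 percentage points:
print(n_per_arm(effect_size=0.05, sd=0.10))  # 63 communities per arm
```

The practical upshot is the one the text makes: interventions whose effects materialize only at the national level offer a sample of one, so no amount of careful design can produce the dozens of comparable units the formula demands.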

A third caveat concerns ethics. While we can evaluate additional programs – and, I would argue, are ethically obliged to do so in order to ensure that limited resources are used efficiently – evaluating programs to which everyone has a basic right, such as an investigation in response to an accusation, is problematic. We can always evaluate the impact of doing additional investigations, but should never take existing ones away from anyone.

Last but not least comes external validity. Will an intervention that worked in country A also work in country B? In a sense, issues related to governance and corruption are the hardest to study – precisely because they are so political and because the effectiveness of any intervention is highly contingent on the environment in which it is implemented and the political will behind it.

In order to really learn, we need to study anti-corruption programs in a strategic manner. This means designing studies that (a) allow us to disentangle the effects of different components of an intervention through crosscutting designs6 and (b) speak directly to existing theory.7 It also implies consistent replication of studies in different countries.


Author: Pia Raffler, Yale University

Pia Raffler is a PhD student in Political Science at Yale University and former Country Director of Innovations for Poverty Action in Uganda. She uses field experiments to study questions about local institutional structures and their relationship to public accountability, corruption and efficient service delivery in developing countries.





1. B. Olken, “Monitoring corruption. Evidence from a Field Experiment in Indonesia”, Journal of Political Economy 115:200–49, 2007

2. E. Duflo et al., “Monitoring works. Getting Teachers to Come to School”, CEPR Discussion Paper no. DP6682, 2008

3. M. Björkman and J. Svensson, “Power to the People. Evidence from a Randomized Experiment of a Citizen Report Card Project in Uganda”, Quarterly Journal of Economics 124, 2009

4. A. Banerjee et al., “Pitfalls of Participatory Programs. Evidence from a Randomized Evaluation in Education in India”, American Economic Journal: Economic Policy, 2:1, 1–30, 2010

5. For a comprehensive review of the literature on governance and open questions, see the forthcoming white paper by the J-PAL Governance Initiative. For a forthcoming review of corruption-related experiments, see Peisakhin 2011

6. Crosscutting designs break a bundled intervention into several treatment arms. For example, to measure the impact of a program that combines community reporting on corruption with training for communities on how to exert pressure on local government officials, one would have one treatment group in which only the community reporting intervention is introduced and another in which both the community reporting and the training are introduced, in addition to the control group. This makes it possible to measure (a) the impact of the combined intervention, (b) the impact of community reporting alone, and (c) the added benefit of implementing the training on top of the community reporting.
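The three-arm assignment this note describes can be sketched as follows (group sizes and arm names are illustrative, not taken from any actual study):

```python
import random

random.seed(0)  # reproducible illustration

# Hypothetical pool of 300 eligible communities, split evenly into three arms.
communities = list(range(300))
random.shuffle(communities)

arms = {
    "control": communities[:100],
    "reporting_only": communities[100:200],
    "reporting_plus_training": communities[200:],
}

# Comparisons this design supports:
#   reporting_plus_training vs control        -> impact of the bundled intervention
#   reporting_only vs control                 -> impact of reporting alone
#   reporting_plus_training vs reporting_only -> added benefit of the training
for arm, members in arms.items():
    print(arm, len(members))
```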

7. Please see C. Blattman: Impact Evaluation 2.0. Talk at DfID, 2008, on how to design useful experiments.


20 Apr 2011