What empirical microeconomics tells us about reparations

Ta-Nehisi Coates argues that the United States government should pay reparations to African-Americans for slavery and institutionalized racism. The essay is long and full of supporting evidence, and generally makes a strong case that the US government bears responsibility for oppressing blacks for hundreds of years. While Coates digresses occasionally – into claims of broader guilt by all Americans, or all whites, or into arguments that America’s current prosperity depends on its history of oppressing blacks – those claims are not necessary for his main point to hold water. That point is fairly straightforward: the US government was complicit in a moral evil, and it should take steps to make amends for that evil, as it did, for example, for the internment of Japanese Americans during World War II.

Leaving aside the merits of the underlying idea, and the task of pinning down what the value of the reparations would be and how to allocate them, I wanted to discuss the practical aspect: what would providing reparations accomplish? Could transferring money to blacks help close the yawning gaps between them and whites that exist across a broad range of social indicators? Reparations need not be cash transfers – Coates cites Charles Ogletree’s idea of reparations in the form of job training programs – but usually the term is associated with the payment of cash to the afflicted group. This raises a key economic question: what would happen if the US government made a massive financial transfer to every black person in America?

In some sense the right answer to this is “we don’t know”. We have never tried doing this, let alone in an experimental framework that would allow us to measure its effects. Coates does list one empirical example – the German payment of Holocaust reparations to the Israeli government, which is credited with funding the country through a tough spell and contributing to substantial economic growth. But most of those payments went to the government, not to individuals, so it is unclear how those effects would translate to the context of reparations to blacks in the US.

Even though no one has ever run this experiment, we do have evidence on what happens when people receive large cash transfers. The best evidence comes from a paper by Hoyt Bleakley (who is joining Michigan’s economics faculty in the fall) and Joseph Ferrie, about a lottery that distributed land at random to adult white males in Georgia (ungated working paper version).* The winners of this lottery received land worth approximately as much as the entire wealth holdings of the median person at the time. Given that the average black family has one sixth the wealth of the average white family, this is actually pretty close to the magnitude of the transfer we’d be talking about.


This image (from Wonkblog) shows that the black-white wealth divide has widened rather than narrowing over time

Large cash transfers help: they make the recipients richer. But they don’t have the long-term social ramifications that you might hope for. The children and grandchildren of lottery winners end up no wealthier and no better-educated than non-winners. The big caveat with this comparison is that the Bleakley and Ferrie paper studies people from the 19th century, so the sample and context are quite different than they are today. However, I’d actually expect those differences to lead to larger effects than we’d see from targeting a poorer, more disadvantaged group. Overall, this suggests that wealth transfers – even massive ones – will not have transformational effects on socioeconomic status that last across generations.

On the other hand, a wide range of evidence suggests that, contrary to stereotypes, people (even poor people) do not “waste” cash transfers on alcohol, cigarettes, or other vices.** Those results are for transfers on a scale much smaller than reparations would operate on, and are for much poorer populations than the typical black American. But implicit in the stereotype that money will go toward alcohol is the notion that poorer people should have bigger problems with this.*** Since even very poor people seem to have no problem refraining from potentially-problematic spending, it is unlikely that this would be an issue for a reparations program.

Taken together, the evidence from empirical economics tells us that reparations, if done as pure financial transfers, would make blacks richer, with few downsides – but that they would not have transformative effects on the long-run gaps in outcomes between whites and blacks. While wealth is inherited, wealthy people also propagate success through their family lines by passing down other attributes – from education to behaviors to social connections to their race – that end up washing out the effects of wealth alone. To fix the black-white gap in a permanent way, we need to address all sorts of other differences as well; addressing wealth alone is not enough.

What about other ways of providing reparations? The literature on job training programs for marginalized groups is fairly discouraging, so I’m not convinced that Ogletree’s proposal would work well (although maybe we need to work on developing better job training). Another possibility is to work through the education system. Roland Fryer’s research has shown that improving middle-school educational outcomes for blacks helps them close gaps in other social outcomes. At the college level, there is robust, although not necessarily causal, evidence that high-quality colleges help blacks quite a bit (and matter much less for whites). One policy that might work is to replace affirmative action with an official reparations program, funded by the federal government, that creates additional slots at all universities to accommodate black students. This would reduce the racial tension that is stirred up by the current system, where people perceive that they are being denied admission based on their race, and where the moral and legal justification for the scheme is not made clear. It might reduce opposition to the program as well.

More broadly, we still need more evidence about what kinds of programs help generate permanent reductions in the black-white social divide. If reparations end up being taken seriously, then the government should fund and promote experimental and regression-discontinuity research into a wide range of possible programs in order to see which ones work. Financial transfers alone may not work – but we have the empirical tools needed to figure out what does.

*In a dark irony, this land came from one of the worst crimes against humanity the US government has ever committed.
**As I’ve discussed on this blog in the past, it is not obvious that these purchases are wasteful; we need to take seriously the idea that people have agency – that they can be trusted to make their own decisions.
***Basic economic theory actually suggests the opposite, since poorer people have a tighter budget constraint. But it also tells us that people’s unaltered choices maximize their own welfare, so this is a non-problem. Chris Blattman believes that the homeless in the US are fundamentally different from similar-looking populations in Africa, but that would only apply to the poorest black people who received reparations.
Posted in Uncategorized

"People think it’s easy to contract HIV. That’s a good thing, right? Maybe not."

That’s the title of my guest post on the World Bank’s Development Impact blog, describing my job market paper. Here’s a bit of the introduction:

People are afraid of HIV. Moreover, people around the world are convinced that the virus is easier to get than it actually is. The median person thinks that if you have unprotected sex with an HIV-positive person a single time, you will get HIV for sure. The truth is that it’s not nearly that easy to get HIV – the medical literature estimates that the transmission rate is actually about 0.1% per sex act, or 10% per year.

One way of interpreting these big overestimates of risks is that HIV education is working. […] The classic risk compensation model says this should be causing reductions in unprotected sex.

Unfortunately, the risk compensation story doesn’t seem to be reflected in actual behavior – at least not in sub-Saharan Africa, where the HIV epidemic is at its worst. […] If people are so scared, why don’t they seem to be compensating away from the risk of HIV infection? I tackle that question in my job market paper, “The Effect of HIV Risk Beliefs on Risky Sexual Behavior: Scared Straight or Scared to Death?” My answer is surprising.
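The per-act and per-year figures quoted in the excerpt are linked by simple compounding. As a rough sketch, assuming each act carries an independent 0.1% transmission risk (my simplifying assumption, not something the excerpt states), we can back out how many acts per year a 10% annual risk implies. The function names here are hypothetical.

```python
import math

def cumulative_risk(per_act_risk: float, n_acts: int) -> float:
    """Probability of at least one transmission over n independent acts."""
    return 1.0 - (1.0 - per_act_risk) ** n_acts

def implied_acts(per_act_risk: float, cumulative: float) -> float:
    """Number of independent acts needed to reach a given cumulative risk."""
    return math.log(1.0 - cumulative) / math.log(1.0 - per_act_risk)

PER_ACT = 0.001  # 0.1% per act, from the excerpt
ANNUAL = 0.10    # 10% per year, from the excerpt

# Acts per year that make the two quoted figures consistent,
# under the independence assumption above.
acts_per_year = implied_acts(PER_ACT, ANNUAL)

# Risk from a single act: far from the "for sure" that the median person believes.
single_act = cumulative_risk(PER_ACT, 1)
```

Under that assumption the two figures line up at a little over a hundred acts per year, or about twice a week.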

You can read the whole thing on their site by following this link. My post is part of their annual Blog Your Job Market Paper series, which features summaries of research from development economics Ph.D. students on the job market. People who follow this blog should check out that series, which has featured some really awesome research this year. More generally, Development Impact is by popular acclamation the best development-focused blog out there; I read every post.


Making the Grade: The Sensitivity of Education Program Effectiveness to Input Choices and Outcome Measures





I’m very happy to announce that my paper with Rebecca Thornton, “Making the Grade: The Sensitivity of Education Program Effectiveness to Input Choices and Outcome Measures”, has been accepted by the Review of Economics and Statistics. An un-gated copy of the final pre-print is available here.

Here’s the abstract of the paper:

This paper demonstrates the acute sensitivity of education program effectiveness to the choices of inputs and outcome measures, using a randomized evaluation of a mother-tongue literacy program. The program raises reading scores by 0.64SDs and writing scores by 0.45SDs. A reduced-cost version instead yields statistically-insignificant reading gains and some large negative effects (-0.33SDs) on advanced writing. We combine a conceptual model of education production with detailed classroom observations to examine the mechanisms driving the results; we show they could be driven by the program initially lowering productivity before raising it, and potentially by missing complementary inputs in the reduced-cost version. 

The program we study, the Northern Uganda Literacy Project, is one of the most effective education interventions in the world. It is at the 99th percentile of the distribution of treatment effects in McEwan (2015), and would rank as the single most effective for improving reading. It improves reading scores by 0.64 standard deviations. Using the Evans and Yuan equivalent-years-of-schooling conversion, that is as much as we’d expect students to improve in three years of school under the status quo. It is over four times as much as the control-group students improve from the beginning to the end of the school year in our study.


Effects of the NULP intervention on reading scores (in control-group SDs)

It is also expensive: it costs nearly $20 per student, more than twice as much as the average intervention for which cost data is available. So we worked with Mango Tree, the organization that developed it, to design a reduced-cost version. This version cut costs by getting rid of less-essential materials, and also by shifting to a train-the-trainers model of program delivery. It was somewhat less effective for improving reading scores (see above), and for the basic writing skill of name-writing, but actually backfired for some measures of writing skills:


Effects of the NULP intervention on writing scores (in control-group SDs)

This means that the relative cost-effectiveness of the two versions of the program is highly sensitive to which outcome measure we use. Focusing just on the most-basic skill of letter name recognition makes the cheaper version look great—but its cost effectiveness is negative when we look at writing skills.
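To make the sensitivity concrete, here is a minimal sketch of the cost-effectiveness arithmetic. The effect sizes and the roughly $20 cost come from the text above; the cost of the reduced-cost version is not stated in this post, so `REDUCED_COST` is a purely hypothetical placeholder, and `sd_per_100_dollars` is a name I made up for illustration.

```python
def sd_per_100_dollars(effect_sd: float, cost_per_student: float) -> float:
    """Cost-effectiveness: SDs of learning gained per $100 spent per student."""
    return 100.0 * effect_sd / cost_per_student

FULL_COST = 20.0     # "nearly $20 per student" (from the post)
REDUCED_COST = 5.0   # HYPOTHETICAL placeholder, not a figure from the paper

# Full-cost program, effects in control-group SDs (from the abstract).
full_reading = sd_per_100_dollars(0.64, FULL_COST)
full_writing = sd_per_100_dollars(0.45, FULL_COST)

# Reduced-cost program, advanced writing (from the abstract).
reduced_adv_writing = sd_per_100_dollars(-0.33, REDUCED_COST)
```

Whatever cost is plugged in for the cheaper version, a negative effect yields a negative ratio at any positive cost, which is why the cheaper version cannot look cost-effective on that outcome.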


Why did this happen? The intervention was delivered as a package, and we couldn’t test the components separately for two reasons. Resource constraints meant that we didn’t have enough schools to test all the many different combinations of inputs. More important, practical constraints make it hard to separate some inputs from one another. For example, the intervention involves intensive teacher training and support. That training relies on the textbooks, and could not be delivered without them.

Instead, we develop a model of education production with multiple inputs and outputs, and show that there are several mechanisms that could lead to a reduction in inputs not just lowering the treatment effects of the program, but actually leading to declines in some education outcomes. First, if the intervention raises productivity for one outcome more than another, this can lead to a decline in the second outcome due to a substitution effect. Second, a similar pattern can occur if inputs are complements in producing certain skills and one is omitted. Third, the program may actually make teachers less productive in the short term, as part of overhauling their teaching methods – a so-called “J-curve”.

We find the strongest evidence for this third mechanism. Productivity for writing, in terms of learning gains per minute, actually falls in the reduced-cost schools. It is plausible that the reduced-cost version of the program pushed teachers onto the negative portion of the J-curve, but didn’t do enough to get them into the region of gains. In contrast, for reading (and for both skills in the full-cost version) the program provided a sufficient push to achieve gains.

There is also some evidence that missing complementary inputs were important for the backfiring of the reduced-cost program. Some of the omitted inputs are designed to be complements – for example, slates that students can use to practice writing with chalk. Moreover, we find that classroom behaviors by teachers and students have little predictive power for test scores when entered linearly, but allowing for non-linear terms and interactions leads to a much higher R-squared. Notably, the machine-learning methods we apply indicate that the greatest predictive power comes from interactions between variables.
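To see why interactions can matter so much for fit, here is a small simulation. It uses made-up data rather than our classroom observations, and plain OLS rather than the machine-learning methods in the paper. The two inputs are hypothetical standardized behaviors that only matter jointly, which is the extreme case of complementarity.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Two simulated classroom behaviors (hypothetical, standardized).
x1 = rng.standard_normal(n)
x2 = rng.standard_normal(n)

# Outcome driven purely by the interaction: the inputs are complements.
y = x1 * x2 + 0.5 * rng.standard_normal(n)

def r_squared(X: np.ndarray, y: np.ndarray) -> float:
    """R-squared from an OLS fit of y on X (an intercept is added here)."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

# Behaviors entered linearly versus with their interaction term.
r2_linear = r_squared(np.column_stack([x1, x2]), y)
r2_interacted = r_squared(np.column_stack([x1, x2, x1 * x2]), y)
```

In this stylized setup the linear specification explains almost nothing, while adding the single interaction term explains most of the variance.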

These findings are an important cautionary tale for policymakers who are interested in using successful education programs, but worried about their costs. Cutting costs by stripping out inputs may not just reduce a program’s effectiveness, but actually make it worse than doing nothing at all.

For more details, check out the paper here. Comments are welcome—while this paper is already published, Rebecca and I (along with Julie Buhl-Wiggers and Jeff Smith) are working on a number of followup papers based on the same dataset.


A Nobel Prize for Development Economics as an Experimental Science

Fifteen years ago I was an undergrad physics major, and I had just finished a summer spent teaching schoolchildren in Tanzania about HIV. The trip was both inspiring and demoralizing. I had gotten involved because I knew AIDS was important and thought addressing it was a silver bullet to solve all of sub-Saharan Africa’s problems. I came away from the trip having probably accomplished little, but learned a lot about the tangled constellation of challenges facing Tanzanians. They lacked access to higher education, to power, to running water. AIDS was a big problem, but one of many. And could we do anything about these issues? Most of my courses on international development were at best descriptive and at worst defeatist. There were lots of problems, and colonialism was to blame. Or maybe the oil curse. Or trade policy. It was hard to tell.

Just as I was pondering these problems and what I could do about them, talk began to spread about the incredible work being done by Abhijit Banerjee and Esther Duflo. They had started an organization, J-PAL, that was running actual experiments to study solutions to economic and social problems in the world’s poorest places. At this point, my undergraduate courses still emphasized that economics was not an experimental science. But I started reading about this new movement to change that, in development economics in particular, by using RCTs to test the effects of programs and answer first-order economic questions.

At the same time, I also learned about the work being done by Michael Kremer, another of the architects of the experimental revolution in development economics. One of the first development RCT papers I read remains my all-time favorite economics paper: Ted Miguel and Kremer’s Worms. This paper has it all. They study a specific & important program, and answer first-order questions in health economics. They use a randomized trial, but their analysis is informed by economic theory: because intestinal worm treatment has positive externalities, you will drastically understate the benefits of treatment if you ignore that in your data analysis. And the results were hugely influential: Deworm the World is now implementing school-based deworming around the world. I was sold: I changed career paths and started pursuing development economics. And I became what is often called a randomista, a researcher focused on using randomized trials to study economic issues and solve policy problems in poor countries. Kremer is in fact my academic grandfather: he advised Rebecca Thornton, who in turn advised me.

When the Nobel Prize in Economics was awarded to Banerjee, Duflo, and Kremer this Monday, a major reason was their tremendous influence on hundreds if not thousands of people with stories like mine. Without their influence, the field of development economics would look entirely different. A huge share of us wouldn’t be economists at all, and if we were we would be doing entirely different things. Beyond development economics per se, the RCT revolution spilled over into other fields. We increasingly think of economics as an experimental science (which was the title of my dissertation) – even when we cannot run actual experiments, we think about our data analysis as approximating an experimental ideal. Field experiments have been used in economics for a long time, but this year’s prize-winners helped make them into the gold standard for empirical work in the field.

They also helped make experiments the gold standard in studying development interventions, and this has been a colossal change in how we try to help the poor. Whereas once policymakers and donors had to be convinced by researchers that rigorous impact evaluations were important, now they actually seek out research partners to study their ideas. This has meant that we increasingly know what actually works in development, and even more important, what doesn’t work. We can rigorously show that many silver bullets aren’t so shiny after all – for example, additional expansions of microcredit do not transform the lives of the poor.

What is particularly striking and praiseworthy about this award is how early it came. There was a consensus that this trio would win a Nobel prize at some point, but these awards tend to be handed out well after the fact, once time has made researchers’ impact on the field clearer. It is a testament to their tremendous impact on the field of economics that it was already obvious that Duflo, Banerjee, and Kremer were worthy of the Nobel prize, and a credit to the committee that they saw fit to recognize the contributions so quickly. I think it’s fitting that Duflo is now the youngest person ever to win a Nobel prize in economics – given her influence on the field, it’s hard to believe she is just 46 years old.


“Pay Me Later”: A simple, cheap, and surprisingly effective savings technology

Why would you ask your employer not to pay you yet? This is something I would personally never do. If I don’t want to spend money yet, I can just keep it in a bank account. But it’s a fairly common request in developing countries: my own field staff have asked this of me several times, and dairy farmers in Kenya will actually accept lower pay in order to put off getting paid.

The logic here is simple. In developed economies, savings earn a positive return, but in much of the developing world, people face a negative effective interest rate on their savings. Banks are loaded with transaction costs and hidden fees, and money hidden elsewhere could be stolen or lost. So deferred wages can be a very attractive way to save money until you actually want to spend it.

Lasse Brune, Eric Chyn, and I just finished a paper that takes that idea and turns it into a practical savings product for employees of a tea company in Malawi. Workers could choose to sign up and have a fraction of their pay withheld each payday, to be paid out in a lump sum at the end of the three-month harvest season.  About 52% of workers chose to sign up for the product; this choice was implemented at random for half of them. Workers who signed up saved 14% of their income in the scheme and increased their net savings by 24%.


Accumulation of money in the deferred wages account over the course of the harvest season. The lump-sum payout was on April 30th.
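The mechanics of the scheme are simple enough to sketch in a few lines. This is an illustration rather than our payroll implementation: the pay amount and number of paydays are hypothetical, and using 14% as the withholding share just echoes the savings rate reported above rather than any particular worker's choice.

```python
from typing import List

def deferred_wage_balances(pay_per_payday: float,
                           withhold_share: float,
                           n_paydays: int) -> List[float]:
    """Running balance of the deferred-wage account after each payday."""
    balances = []
    balance = 0.0
    for _ in range(n_paydays):
        # The withheld share is added automatically; no manual deposit is needed.
        balance += withhold_share * pay_per_payday
        balances.append(balance)
    return balances

PAY = 10_000.0   # pay per payday in local currency (HYPOTHETICAL)
SHARE = 0.14     # withholding share (assumption, echoing the 14% in the post)
N_PAYDAYS = 6    # paydays in the three-month season (HYPOTHETICAL)

balances = deferred_wage_balances(PAY, SHARE, N_PAYDAYS)
lump_sum = balances[-1]              # paid out in one go at the end of the season
take_home = (1.0 - SHARE) * PAY      # cash received on each regular payday
```

The balance rises in equal steps each payday and is then released all at once, which is the pattern shown in the figure.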

The savings product has lasting effects on wealth. Workers spent a large fraction of their savings on durables, especially goods used for home improvements. Four months after the scheme ended, they owned 10% more assets overall, and 34% more of the iron sheeting used to improve roofs. We then let treatment-group workers participate in the savings product two more times, and followed up ten months after the lump sum payout for the last round. Treatment-group workers ended up 10% more likely to have improved metal roofs on their homes.*

This “Pay Me Later” product was unusually popular and successful for a savings intervention; such interventions usually have low takeup and utilization and rarely have downstream effects.** What made this product work so well? We ran a set of additional choice experiments to figure out which features drove the high demand for this form of savings.

The first key feature is paying out the savings in a lump sum. When we offered a version of the scheme that paid out the savings smoothly (in six weekly installments) takeup fell to just 36%. The second is the automatic “deposits” that are built into the design. We offered some workers an identical product that differed only in that deposits were manual: a project staffer was located adjacent to the payroll site to accept deposits. Signup matched the original scheme but actual utilization was much lower.

On the other hand, the seasonal timing of the product was much less important for driving demand: it was just about as popular during the offseason as the main harvest season. The commitment savings aspect of the product also doesn’t matter much. When we offered a version of the product where workers could access the funds at any time during the season, it was just as popular as the original version where the funds were locked away.

In summary, letting people opt in to get paid later is a very promising way to help them save money. It can be run at nearly zero marginal cost, once the payroll system is designed to accommodate it and people are signed up. The benefits are substantial: it’s very popular and leads to meaningful increases in wealth.  It could potentially be deployed not just by firms but also by governments running cash programs and workfare schemes.

The success of “Pay Me Later” highlights the importance of paying attention to the solutions people in developing countries are already finding to the malfunctioning markets hindering their lives. Eric, Lasse, and I did a lot of work to design the experiment, and our field team and the management at the Lujeri Tea Estate deserve credit for making the research and the project work. But a lot of credit also should go to the workers who asked us not to pay them yet – this is their idea, and it worked extremely well.

Check out the paper for more about the savings product and our findings (link).

*These results are robust to correction for multiple hypothesis testing using the FWER adjustment of Haushofer and Shapiro (2016).
**A partial exception is Schaner (2018), which finds that interest rate subsidies on savings accounts lead to increases in assets and income. However, the channel appears to be raising entrepreneurship rather than utilization of the accounts.

How Important is Temptation Spending? Maybe Less than We Thought

Poor people often have trouble saving money for a number of reasons: the banks they have access to are low-quality and expensive (and far away), saving is risky, and money that they do save is often eaten away by kin taxes. One reason that has featured prominently in theoretical explanations of poverty traps is “temptation spending” – goods like alcohol or tobacco that people can’t resist buying even though they’d really prefer not to. Intuitively, exposure to temptation reduces saving in two ways. First, it directly drains people’s cash holdings, so money they might have saved gets “wasted” on the good in question. Second, people realize that their future self will just waste their savings on temptation goods, so they don’t even try to save.

But how important is temptation spending in the economic lives of the poor? Together with Lasse Brune and my student Qingxiao Li, I have just completed a draft of a paper that tackles this question using data from a field experiment in Malawi. The short answer is: probably not very important after all.

One of our key contributions in the paper is to measure temptation spending by letting people define it for themselves. We do this in two ways: first, we allow our subjects to list goods they are often tempted to buy or feel they waste money on, and then match that person-specific list of goods to a separate enumeration of items that they purchased. Second, we let people give the simple sum of money they spent that they felt was wasted. We also present several other potential definitions of temptation spending that are common in the literature, including the alcohol & tobacco definition, and also a combined index across all the definitions. The correlations between these measures are not very high: spending on alcohol & tobacco correlates with spending on self-designated temptation goods at just 0.07.


This is the result of people picking very different goods than policymakers or researchers might select as “temptation goods”. For example, people commonly listed clothes as a temptation good, whereas alcohol was fairly uncommon.
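The person-specific measure described above amounts to matching each respondent's own list of tempting goods against their purchases. A minimal sketch, with a made-up respondent and hypothetical function names (the actual survey instruments and data cleaning in the paper are more involved):

```python
from typing import Dict, Set

def self_designated_temptation_spending(temptation_goods: Set[str],
                                        purchases: Dict[str, float]) -> float:
    """Total spending on the goods this person listed as tempting."""
    return sum(amount for good, amount in purchases.items()
               if good in temptation_goods)

def alcohol_tobacco_spending(purchases: Dict[str, float]) -> float:
    """Total spending under the conventional alcohol & tobacco definition."""
    return sum(purchases.get(good, 0.0) for good in ("alcohol", "tobacco"))

# HYPOTHETICAL respondent: lists clothes and sweets as tempting, buys no alcohol.
listed = {"clothes", "sweets"}
purchases = {"maize": 1200.0, "clothes": 800.0, "soap": 150.0}

own_measure = self_designated_temptation_spending(listed, purchases)
conventional_measure = alcohol_tobacco_spending(purchases)
```

For a respondent like this one the two definitions disagree completely, which is the kind of mismatch that drives the low correlation between measures.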

We also show that direct exposure to a tempting environment does not significantly affect spending on temptation goods – let alone downstream outcomes. Our subjects were workers who received extra cash income during the agricultural offseason as part of our study. All workers received their pay at the largest local trading center, and some were randomly assigned to receive their pay during the market day (as opposed to the day before). This was the most-tempting environment commonly reported by the people in our study. Getting paid at the market didn’t move any of our measures of temptation spending and we can rule out meaningful effect sizes.

Why not? We go through a set of six possible explanations and find support for two of them. The first is substitution bias: the market where workers were paid was just one of several in the local area, some of which operated on the day the untreated workers were paid. It was feasible for them to go to the other markets to seek out temptation goods to buy, effectively undoing the treatment. This implies a very different model of temptation than we usually have in mind: it would mean that the purchases tempt you even if they are far away and you have to go seek them out.*

The second is pre-commitment to spending plans. If workers can find a way to mentally “tie their hands” by committing to spend their earnings on specific goods or services, they can mitigate the effects of temptation. We see some empirical evidence for this: the effects of the treatment are heterogeneous by whether workers have children enrolled in school. School fees are a common pre-planned expense in our setting; consistent with workers pre-committing to pay school fees, we see zero treatment effects for workers with children in school, and substantial positive effects for other workers.

Both of these explanations suggest that temptation spending is much less of a policy concern than we might have thought. The first story implies that specific exposure to a tempting environment may not matter at all – people will seek out tempting goods whether they are near them or not. The latter suggests that people can use either mental accounting or actual financial agreements to shield themselves from the risk of temptation spending.

There is much more in the paper, “How Important is Temptation Spending? Maybe Less than We Thought” – check it out by clicking here. Feedback and suggestions are very welcome!

*I have personally experienced this sort of temptation for Icees, which aren’t good for me but which I will go out of my way to obtain.

We can do better than just giving cash to poor people. Here’s why that matters.

Cash transfers are an enormously valuable, and increasingly widespread, development intervention. Their value and popularity have driven a vast literature studying how various kinds of cash transfers (conditional, unconditional, cash-for-work, remittances) affect all sorts of outcomes (finances, health, education, job choice). I work in one small corner of this literature myself: Lasse Brune and I just finished a revision of our paper on how the frequency of cash payouts affects savings behavior, and we are currently studying (along with Eric Chyn) how to use that approach as an actual savings product.

After all the excitement over their potential benefits, a couple of recent results have taken a bit of the luster off of cash transfers. First, the three-year followup of the GiveDirectly evaluation in Kenya showed evidence that many effects had faded out, although asset ownership was still higher. Then came a nine-year (!!) followup of a cash grant program in Uganda, where initial gains in earnings had disappeared (but again, asset ownership remained higher).

One question raised by these results is whether we can do any better than just giving people cash. A new paper by McIntosh and Zeitlin tackles this question head-on, with careful comparisons between a sanitation-focused intervention and a cost-equivalent cash transfer. They actually tried a bunch of cash transfers in a range so that they could get the exact cost-equivalency through regression adjustment. In their study, there’s no clear rank ordering between cost-equivalent cash and the actual program; neither has big impacts, and they change different things (though providing a larger cash transfer does appear to dominate the program across all outcomes).
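The idea of recovering a cost-equivalent comparison from a range of transfer sizes can be sketched as follows. This is only a stylized illustration of regression adjustment, not McIntosh and Zeitlin's actual specification, and every number in it is hypothetical.

```python
import numpy as np

# HYPOTHETICAL cash arms: transfer size (USD) and mean outcome in each arm.
cash_amounts = np.array([40.0, 80.0, 120.0, 160.0])
cash_outcomes = np.array([0.02, 0.05, 0.07, 0.10])

PROGRAM_COST = 100.0    # HYPOTHETICAL per-beneficiary cost of the in-kind program
PROGRAM_OUTCOME = 0.06  # HYPOTHETICAL program effect on the same outcome

# Fit outcome = intercept + slope * amount across the cash arms.
# np.polyfit returns coefficients from the highest degree down.
slope, intercept = np.polyfit(cash_amounts, cash_outcomes, deg=1)

# Predicted effect of a cash transfer costing exactly as much as the program,
# even though no arm received exactly that amount.
cash_at_program_cost = intercept + slope * PROGRAM_COST

# Positive means the program beats cost-equivalent cash on this outcome.
program_minus_cash = PROGRAM_OUTCOME - cash_at_program_cost
```

Because the cash arms span the program's cost, the comparison is an interpolation rather than an extrapolation; with these made-up numbers the program and cost-equivalent cash come out roughly tied.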

This is just one program, though – can any program beat cash? It turns out that the answer is yes! At MIEDC this spring, I saw Dean Karlan present results from a “Graduation” program that provided a package of interventions (training, mentoring, cash, and a savings group) in several different countries. The Uganda results, available here, show that the program significantly improved a wide range of poverty metrics, while a cost-equivalent cash transfer “did not appear to have meaningful impacts on poverty outcomes”.

This is a huge deal. The basic neoclassical model predicts that a program can never beat giving people cash; the best you can do is tie.* People know what they need and can use money to buy it. If you spend the same amount of money, you could achieve the same benefits for them if you happen to hit on exactly what they want, but if you pick anything else you would have done better to just hand them money. (This is the logic behind the annual Christmas tradition of journalists trotting out some economist to explain to the world why giving gifts is inefficient. And economists wonder why no one likes us!)

The fact that we can do better than just handing out cash to people is a rejection of that model in favor of models with multiple interlocking market failures – some of which may be psychological or “behavioral” in nature. That’s a validation of our basic understanding of why poor places stay poor. In a standard model, a simple lack of funds, or even the failure of one market, is not enough to drive a permanent poverty trap. You need multiple markets failing at once to keep people from escaping from poverty. For example, a lack of access to credit is bad, and will hurt entrepreneurs’ ability to make investments. But even without credit, they could instead save money to eventually make the same investments. A behavioral or social constraint that keeps them from saving, in contrast, can keep them from making those investments at all.

McIntosh and Zeitlin refer to Das, Do, and Ozler, who point out that “in the absence of external market imperfections, intra-household bargaining concerns, or behavioral inconsistencies, the outcomes moved by cash transfers are by definition those that maximize welfare impacts.” While their study finds that neither cash nor the program was a clear winner, the Graduation intervention package, in contrast, clearly beats an equivalent amount of cash on a whole host of metrics. We can account for this in two ways. One view is that the cash group actually was better off – people would really prefer to spend a windfall quickly than make a set of investments that pay off with longer-term gains. The other, which I subscribe to, is that there are other constraints at work here. Under this model, the cash group just couldn’t make those investments – they didn’t have the access to savings markets, or there is a missing market in training/skill development, etc.

There is an important practical implication as well. The notion of “benchmarking” development interventions by comparing them to handing out cash is growing in popularity, and it’s an important movement. Indeed, the McIntosh and Zeitlin study makes major contributions by figuring out how to do this benchmarking correctly, and by pushing the envelope on getting development agencies to think about cash as a benchmark.** But what do we do when there is no obvious way to benchmark via cash? In particular, when we are studying education interventions, who should we be thinking about making the cash transfers to? McIntosh and Zeitlin talk about a default of targeting the cash to the people targeted by the in-kind program. In many education programs, the teachers are the people targeted directly. In others, it is the school boards that are the direct recipients of an intervention. Neither group of people is really the aim of an education program: we want students to learn. And, perhaps unsurprisingly, direct cash transfers to teachers and school boards don’t do much to improve learning. You could change the targeting in this case, and give the cash to the students, or to their parents, or maybe just to their mothers – there turn out to be many possible ways of doing this.

So it’s really important that we now have an example of a program that clearly did better than a direct cash transfer. From a theoretical perspective, this is akin to Jensen and Miller’s discovery of Giffen goods in their 2008 paper about rice and wheat in China: it validates the way we have been trying to model persistent poverty. From the practical side, it raises our confidence that the other interventions we are doing are worthwhile, in contexts where benchmarking to cash is impractical, overly complicated, or simply hasn’t been tried. Perhaps we haven’t proven that teacher training is better than a cash transfer, but we do at least know that high-quality programs can be more valuable than simply handing out money.

EDIT: Ben Meiselman pointed out a typo in the original version of this post (I was missing “best” in “the best you can do is tie”), which I have corrected.

*I am ignoring spillovers onto people who don’t get the cash here, which, as Berk Ozler has pointed out, can be a big deal – and are often negative.

**Doing this remains controversial in the development sector – so much so that many of the other projects that are trying cash benchmarking are doing it in “stealth mode”.

Posted in Uncategorized

How to quickly convert Powerpoint slides to Beamer (and indent the code nicely too)

Like most economists, I like to present my research using Beamer. This is in part for costly signaling reasons – doing my slides via TeX proves that I am smart/diligent enough to do that. But it’s also for stylistic reasons: Beamer can automatically put a little index at the top of my slides so people know where I am going, and I like the default fonts and colors.

Moreover, Beamer forces me to obey the First Law of Slidemaking: get all those extra words off your slides. Powerpoint will happily rescale things and let you put tons of text on the screen at once. Beamer – unless you mess with it heavily – simply won’t, and so forces you to make short, parsimonious bullet points (and limit how many you use).

Not everyone is on the same page about which tool to use all the time, which in the past has occasionally meant I needed to take my coauthor’s Powerpoint slides and copy them into Beamer line-by-line. Fortunately, today I found a solution for automating that process.

StackExchange user Louis has a post where he shares VBA code that can quickly move your Powerpoint slides over to Beamer. His code is great but I wasn’t totally happy with the output so I made a couple of tweaks to simplify it a bit. You can view and download my code here; I provide it for free with no warranties, guarantees, or promises. Use at your own risk.

Here is how to use it:

  1. Convert your slides to .ppt format using “Save As”. (The code won’t work on .pptx files.)
  2. Put the file in its own folder that contains nothing else. WARNING: If files with the same names as those used by the code are in this folder they will be overwritten.
  3. Download the VBA code here (use at your own risk).
  4. Open up the Macros menu in Powerpoint (You can add it via “Customize the Ribbon”. Hit “New Group” on the right and rename it “Macros”, then select “Macros” on the left and hit “Add”.)
  5. Type “ConvertToBeamer” under “Macro name”, then hit “Create”
  6. Select all the text in the window that appears and delete it. Paste the VBA code in.
  7. Save, then close the Microsoft Visual Basic for Applications window.
  8. Hit the Macros button again, select “ConvertToBeamer” and run it.
  9. There will now be a .txt file with the Beamer code for your slides in it. (It won’t compile without an appropriate header.) If your file is called “MySlides.ppt” the text file will be “MySlides.txt”
  10. You need to manually fix a few special characters, as always when importing text into TeX. Look out for $, %, carriage returns, and all types of quotation marks and apostrophes. I also found that some tables came through fine while others needed manual tweaking.
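
The manual cleanup in the last step can be partly scripted. The snippet below is a rough sketch, not part of the VBA macro: the function name `clean_for_tex` is made up for this example, and it only handles the characters mentioned above. It escapes every unescaped `$`, so it will break any math you typed into your slides on purpose; check the result by hand.

```python
import re

def clean_for_tex(text: str) -> str:
    """Fix a few characters that commonly break TeX after a Powerpoint import."""
    # Normalize carriage returns to plain newlines.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Escape dollar and percent signs that are not already escaped.
    text = re.sub(r"(?<!\\)([$%])", r"\\\1", text)
    # Convert curly quotes and apostrophes to TeX-style quotes.
    for old, new in {"“": "``", "”": "''", "‘": "`", "’": "'"}.items():
        text = text.replace(old, new)
    return text

print(clean_for_tex("“Savings” rose 50% to $10"))
```

Straight double quotes are left alone because there is no reliable way to tell an opening quote from a closing one without looking at the context.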

One issue I had with the output was that it didn’t have any indentations, making it hard to recognize nested bullets. Fortunately I found this page that will indent TeX code automatically.

I found this to be a huge time saver. Even with figuring it out for the first time, tweaking the code, and writing this post, it still probably saved me several hours of work. Hopefully others find this useful as well.


Randomization inference vs. bootstrapping for p-values

It’s a common conundrum in applied microeconomics. You ran an experiment on the universe of potential treatment schools in a given region, and you’re looking at school-level outcomes. Alternatively, you look at a policy that was idiosyncratically rolled out across US states, and you have the universe of state outcomes for your sample. What do the standard errors and p-values for your results even mean? After all, there’s no sampling error here, and the inference techniques we normally use in regression analyses are based on sampling error.

The answer is that the correct p-values to use are ones that capture uncertainty in terms of which units in your sample are assigned to the treatment group (instead of to the control group). As Athey and Imbens put it in their new handbook chapter on the econometrics of randomized experiments, “[W]e stress randomization-based inference as opposed to sampling-based inference. In randomization-based inference, uncertainty in estimates arises naturally from the random assignment of the treatments, rather than from hypothesized sampling from a large population.”

Athey and Imbens (2017) is part of an increasing push for economists to use randomization-based methods for doing causal inference. In particular, people looking at the results of field experiments are beginning to ask for p-values from randomization inference. As I have begun using this approach in my own work, and discussing it with my colleagues, I have encountered the common sentiment that “this is just bootstrapping”, or that it is extremely similar (indeed, it feels quite similar to me). While the randomization inference p-values are constructed similarly to bootstrapping-based p-values, there is a key difference that boils down to the distinction between the sampling-based and randomization-based approaches to inference:

Bootstrapped p-values are about uncertainty over the specific sample of the population you drew, while randomization inference p-values are about uncertainty over which units within your sample are assigned to the treatment.

When we bootstrap p-values, we appeal to the notion that we are working with a representative sample of the population to begin with. So we re-sample observations from our actual sample, with replacement, to simulate how sampling variation would affect our results.

In contrast, when we do randomization inference for p-values, this is based on the idea that the specific units in our sample that are treated are random. Thus there is some chance of a treatment-control difference in outcomes of any given magnitude simply based on which units are assigned to the treatment group – even if the treatment has no effect. So we re-assign “treatment” at random, to compute the probability of differences of various magnitudes under the null hypothesis that the treatment does nothing.

To be explicit about what this distinction means, below I lay out the procedure for computing p-values both ways, using my paper with Rebecca Thornton about a school-based literacy intervention in Uganda as an example data-generating process.

Randomization inference p-values

1. Randomly re-assign “treatment” in the same way that it was actually done. This was within strata of three schools (2 treatments and 1 control per cell). As we do this, the sample stays fixed.

2. Use the fake treatments to estimate our regression model:

[math] y_{is}= \beta_0 +\beta_1 T1_s + \beta_2 T2_s + \textbf{L}^\prime_s\gamma +\eta y^{baseline}_{is} + \varepsilon_{is} [/math]

[math]\textbf{L}[/math] are strata fixed effects.

The fake treatments have no effect (on average) by construction. There is some probability that they appear to have an effect by random chance. Our goal is to see where our point estimates lie within the distribution of “by random chance” point estimates from these simulations.

3. Store the estimates for [math]\beta_1[/math] and [math]\beta_2[/math].

4. Repeat 1000 times.

5. Look up the point estimates for our real data in the distribution of the 1000 fake treatment assignment simulations. Compute the share of the fake estimates that are larger in absolute value than our point estimates. This is our randomization inference p-value.
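
The five steps above can be sketched in a few lines of Python. This is a simplified illustration on simulated data, not the code from the paper: it uses one outcome per school rather than pupil-level data, drops the baseline score control, and the number of strata and the effect size are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 12 strata of 3 schools, one school-level outcome each.
n_strata = 12
strata = np.repeat(np.arange(n_strata), 3)

def assign(rng):
    """Within each stratum, randomly order the arms 0 (control), 1 (T1), 2 (T2)."""
    return np.concatenate([rng.permutation(3) for _ in range(n_strata)])

def estimate(y, arm):
    """OLS of y on T1, T2, and strata fixed effects; returns (beta_1, beta_2)."""
    fixed_effects = strata[:, None] == np.arange(n_strata)
    X = np.column_stack([arm == 1, arm == 2, fixed_effects]).astype(float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[:2]

# Simulated "real" data: T1 has an effect of 3, T2 has none.
arm = assign(rng)
y = rng.normal(size=strata.size) + 3.0 * (arm == 1)
actual = estimate(y, arm)

# Steps 1-4: re-assign fake treatments 1000 times, holding the sample fixed.
fake = np.array([estimate(y, assign(rng)) for _ in range(1000)])

# Step 5: share of fake estimates larger in absolute value than the real ones.
p_values = (np.abs(fake) > np.abs(actual)).mean(axis=0)
print(p_values)
```

Note that `y` never changes inside the loop. Only the treatment labels are reshuffled, which is exactly what separates this from the bootstrap.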

Bootstrapped p-values

1. Randomly re-sample observations in the same way they were actually sampled. This was at the level of a school, which was our sampling unit. In every selected school we keep the original sample of kids.

This re-sampling is done with replacement, with a total sample equal to the number of schools in our actual dataset (38). Therefore almost all re-sampled datasets will have repeated copies of the same school. As we do this, the treatment status of any given school stays fixed.

2. Use the fake sample to estimate our regression model:

[math] y_{is}= \beta_0 +\beta_1 T1_s + \beta_2 T2_s + \textbf{L}^\prime_s\gamma +\eta y^{baseline}_{is} + \varepsilon_{is} [/math]

[math]\textbf{L}[/math] are strata fixed effects.

The treatments should in principle have the same average effect as they do in our real sample. Our goal is to see how much our point estimates vary as a result of sampling variation, using the re-sampled datasets as a simulation of the actual sampling variation in the population.

3. Store the estimates for [math]\beta_1[/math] and [math]\beta_2[/math].

4. Repeat 1000 times.

5. Compute the standard deviation of the estimates for [math]\beta_1[/math] and [math]\beta_2[/math] across the 1000 point estimates. This is our bootstrapped standard error. Use these, along with the point estimate from the real dataset, to do a two-sided t-test; the p-value from this test is our bootstrapped p-value.*
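
For comparison, here is the same kind of sketch for the bootstrap procedure. Again this is simulated school-level data with made-up effect sizes. To keep it short it drops the strata fixed effects and the baseline control, and it uses a normal approximation in place of the t-test.

```python
import math
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 38 schools split across control (0), T1 (1), and T2 (2).
n_schools = 38
arm = rng.permutation(np.arange(n_schools) % 3)
y = rng.normal(size=n_schools) + 3.0 * (arm == 1)  # T1 effect of 3, T2 none

def estimate(y, arm):
    """OLS of y on an intercept, T1, and T2; returns (beta_1, beta_2)."""
    X = np.column_stack([np.ones(len(y)), arm == 1, arm == 2]).astype(float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:3]

actual = estimate(y, arm)

# Steps 1-4: re-sample schools with replacement 1000 times. Each school
# carries its own treatment status with it, so assignment stays fixed.
boot = np.empty((1000, 2))
for b in range(1000):
    idx = rng.integers(0, n_schools, size=n_schools)
    boot[b] = estimate(y[idx], arm[idx])

# Step 5: the bootstrap standard error is the spread of the estimates;
# the two-sided p-value here uses a normal approximation.
se = boot.std(axis=0, ddof=1)
p_values = np.array([math.erfc(abs(z) / math.sqrt(2)) for z in actual / se])
print(se, p_values)
```

Here the rows of the dataset change on every draw while each school's treatment status travels with it, the mirror image of the randomization inference loop.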


I found Matthew Blackwell’s lecture notes to be a very helpful guide on how randomization inference works. Lasse Brune and Jeff Smith provided useful feedback and comments on the randomization inference algorithm, but any mistakes in this post are mine alone. If you do spot an error, please let me know so I can fix it!

EDIT: Guido Imbens shared a new version of his paper with Alberto Abadie, Susan Athey, and Jeffrey Wooldridge about the issue of what standard errors mean when your sample includes the entire population of interest (link). Reading an earlier version really helped with my own understanding of this issue, and I have often recommended it to friends who are struggling to understand why they even need standard errors for their estimates if they have all 50 states, every worker at a firm, etc.

*There are a few other methods of getting bootstrapped p-values but the spirit is the same.

Where is Africa’s Economic Growth Coming From?

I recently returned from a two-week* trip to Malawi to oversee a number of research projects, most importantly a study of savings among employees at an agricultural firm in the far south of the country. For the first time in years, however, I also took the time to visit other parts of Malawi. One spot I got back to was the country’s former capital, Zomba, where I spent an extended period in graduate school collecting data for my job market paper. This was my first time back there in over four years.

The break in time between my visits to the city made it possible to see how the city has grown and changed. I was happy to see signs of growth and improvement everywhere:

  • There are far more guest houses than I recall.
  • The prices at my favorite restaurant have gone up, and their menu has expanded by about a factor of five.
  • They finally got rid of the stupid stoplight in the middle of town. I used to complain that traffic flowed better when it was broken or the power was out; things definitely seem to work better without it. (Okay, this might not technically be economic development but it’s a huge improvement.)
  • Whole new buildings full of shops and restaurants have gone up. I was particularly blown away to see a Steers. In 2012, I could count the international fast-food chain restaurants in Malawi on one hand. This Steers is the only fast-food franchise I’ve seen outside of Lilongwe (the seat of government) and Blantyre (the second-largest city and commercial capital).

The new Steers in Zomba

What’s driving this evident economic growth? It’s really hard to say. Zomba is not a boomtown with growth driven by the obvious and massive surge of a major industry. Instead, it seems like everything is just a little bit better than it was before. The rate of change is so gradual that you probably wouldn’t notice it if you were watching the whole time. Here’s a graph that shows snapshots of the consumption distribution for the whole country, in 2010 (blue) and 2013 (red), from the World Bank’s Integrated Household Panel Survey:

Consumption CDFs

For most of the distribution, the red line is just barely to the right of the blue one.** That means that for a given percentile of the consumption distribution (on the y-axis) people are a tiny bit better off. It would be very easy to miss this given the myriad individual shocks and seasonal fluctuations that people in Malawi face. It’s probably an advantage for me to come back after a break of several years – it implicitly smooths out all the fluctuations and lets me see the broader trends.
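
Reading a pair of CDFs this way amounts to comparing consumption at matching percentiles. A small sketch of that comparison, using simulated lognormal draws rather than the actual survey data (the sample sizes and the size of the shift are invented):

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated consumption, NOT the real survey: the later year is shifted
# slightly to the right in logs.
cons_2010 = rng.lognormal(mean=0.0, sigma=0.6, size=20000)
cons_2013 = rng.lognormal(mean=0.05, sigma=0.6, size=20000)

# Compare consumption at the same percentile of each year's distribution.
percentiles = np.arange(10, 100, 10)
q_2010 = np.percentile(cons_2010, percentiles)
q_2013 = np.percentile(cons_2013, percentiles)
gain = q_2013 / q_2010 - 1

for p, g in zip(percentiles, gain):
    print(f"{p}th percentile: {g:+.1%}")
```

A gain of a few percent at each percentile is what “just barely to the right” looks like in numbers, and it is small next to the spread within either year.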

These steady-but-mysterious improvements in livelihoods are characteristic of Africa as a whole. The conventional wisdom on African economic growth is that it is led by resource booms – discoveries of oil, rises in the oil price, etc. That story is wrong. Even in resource-rich countries, growth is driven as much by other sectors as by natural resources:

Nigeria is known as the largest oil exporter in Africa, but its growth in agriculture, manufacturing, and services is either close to or higher than overall growth in GDP per capita. (Diao and McMillan 2015)

Urbanization is also probably not an explanation. Using panel data to track changes in individuals’ incomes when they move to cities, Hicks et al. (2017) find that “per capita consumption gaps between non-agricultural and agricultural sectors, as well as between urban and rural areas, are also close to zero once individual fixed effects are included.”

So what could be going on? One candidate explanation is the steady diffusion of technology. Internet access is more widely available than ever in Malawi: more people have internet-enabled smartphones, and more cell towers have fiber-optic cables linked to them. While in Malawi I was buying internet access for $2.69 per gigabyte. In the US, I pay AT&T $17.68 per GB (plus phone service but I rarely use that). Unsurprisingly, perhaps, better internet leads to more jobs and better ones. Hjort and Poulsen (2017) show that when new transoceanic fiber-optic cables were installed, the countries serviced by them experienced declines in low-skilled employment and larger increases in high-skilled jobs. Other technologies are steadily diffusing into Africa as well, and presumably also leading to economic growth.

Another explanation that I find compelling is that Africa has seen steady improvements in human capital, led by massive gains in maternal and child health and the rollout of universal primary education. Convincing evidence on the benefits of these things is hard to come by, but one example comes from the long-run followup to the classic “Worms” paper. Ten years after the original randomized de-worming intervention, the authors track down the same people and find that treated kids are working 12% more hours per week and eating 5% more meals.

But the real answer is that we just don’t know. Economics as a discipline has gotten quite good at determining the effects of causes: how does Y move when I change X? The causes of effects (“Why is Y changing?”) are fundamentally harder to study. Recent research on African economic growth has helped rule out some just-so stories – for example, it’s not just rents from mining, and even agriculture is showing increased productivity – but we still don’t have the whole picture. What we do have, however, is increasing evidence on levers that can be used to help raise incomes, such as investing in children’s health and education, or making it easier for new technologies to diffuse across the continent.

*I spent two weeks on the trip, but the travel times are long enough that that amounted to just under 11 days in the country.
**The blue line is farther to the right at the very highest percentiles, but that’s all based on a very small portion of the data and household surveys usually capture the high end of the income/consumption distribution poorly. Even if we take it literally, this graph implies benefits for many of the poor and a cost for a small share of the rich, which seems like a positive tradeoff.