Tuesday, May 27, 2014

A whole fleet of gremlins: Looking more carefully at Richard Tol’s twice-corrected paper, “The Economic Effects of Climate Change”

by Andrew Gelman, "Statistical Modelling, Causal Inference, and Social Science," May 27, 2014


We had a discussion the other day of a paper, “The Economic Effects of Climate Change,” by economist Richard Tol.

The paper came to my attention after I saw a notice from Adam Marcus that it was recently revised because of data errors. But after looking at the paper more carefully, I see a bunch of other problems that, to me, make the whole analysis close to useless as it stands.
I think this is worth discussing because the paper has been somewhat influential (so far cited 328 times, according to Google Scholar) and has even been cited in the popular press as evidence that “Climate change has done more good than harm so far and is likely to continue doing so for most of this century . . . There are many likely effects of climate change: positive and negative, economic and ecological, humanitarian and financial. And if you aggregate them all, the overall effect is positive today — and likely to stay positive until around 2080. That was the conclusion of Professor Richard Tol of Sussex University after he reviewed 14 different studies of the effects of future climate trends.” Once the data errors were corrected, it turns out the above quote is incorrect: of the studies cited by Tol, all but one projected negative or essentially zero economic effects of climate change, with the only paper giving a positive estimate being an earlier one of Tol himself, so clearly no consensus of a positive effect—although the science writer could be excused for thinking that, based on the earlier published paper that had the errors.

Tol himself has written different things on climate change. In his 2009 paper he wrote of “considerable uncertainty about the economic impact of climate change . . . negative surprises are more likely than positive ones. . . . The policy implication is that reduction of greenhouse gas emissions should err on the ambitious side.” Then in 2014 he wrote that the revised estimate based on the new data “is relevant because the benefits of climate policy are correspondingly revised downwards.”

So these matters are not trivial, at least according to Tol and the above-linked journalist. Let’s try to track down what’s happening from a statistical perspective.

Problems with the data

Tol’s paper is a meta-analysis in which he combines several published projections of the economic effects of global warming, in order to produce some sort of consensus estimate. During the years after publication of the article, several different people pointed out different data errors corresponding to mischaracterizations of the projections from various of the papers in the meta-analysis. On some of the estimates the signs were changed, which looks like a typo, and in another case it appears that Tol had misread the paper. There were only 14 points in the original data analysis so this is a disturbingly high error rate. Perhaps even more disturbingly, Tol’s correction notice was itself re-corrected after an error was noted in the correction:

[Screenshot: the re-corrected correction notice]

Tol attributed the errors to “gremlins,” but I’m guessing that they happened when he was typing data into a file. (They couldn’t simply be errors that were introduced in the journal’s editing or production process because some of them were entered into Tol’s graphs and analyses, not just his data table.) Bob Ward provides a convenient list of errors and data sources here.

And, after this, one more thing. Rachael Jonassen noticed “a subtle but meaningful difference between the labels of the x-axis in the original Figure 1 of Tol (2009) and the updated Figures 1 and 2”:
“In the original Figure 1, the x-axis is labelled temperature change relative to ‘today.’ In the new Figures, the x-axis is labelled as temperature change relative to pre-industrial temperatures. The term ‘pre-industrial’ derives from climate scientist’s interest in anthropogenic influences on CO2 levels and they usually assign a date of 1750 as the last time natural CO2 levels were observed. A temperature change of 0 relative to pre-industrial temperatures, existed in 1750. On the newly labelled x-axis, we are ‘today’ at +0.8C.”
This would seem to make a hash of everything, if even the uncorrected points are all at the wrong place on the x-axis. Jonassen continues:
“It would be difficult to imagine economists interested in (or reporting) the effect of climate change on GDP in comparison to the GDP at pre-industrial times, temperatures, and CO2 levels. More likely the economic analyses were performed relative to the GDP of ‘today’ at the time of each publication (e.g. at +0.8C relative to pre-industrial levels in 2014) but calibrated with climate models that expressed results relative to 1750 levels (usually taken at 285ppm). . . .
“Due to non-linearities and lags in the earth system, a +1C change relative to 1750 has a different impact than a +1C change relative to today’s temperature (already at +0.8C relative to 1750). Presumably, economists calibrate their analyses using available climate model results. . . .
“Given the shift in labelling of the x-axes, with no adjustments in the position of the data points between the 2009 and current versions of the plots, it is not clear that the author considered the importance of these distinctions so caution in interpreting the relation of the underlying data to published plots is in order.”
Here’s what she’s talking about. From the caption to Figure 1 of the 2009 paper:

[Screenshot: caption to Figure 1 of the 2009 paper]

And from the corresponding figure in the update:

[Screenshot: the corresponding figure in the update]

This seems like a huge difference, changing the interpretation of everything, but it’s not discussed at all in the correction note.
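A small sketch of why the relabelling matters. The only number taken from the discussion above is the +0.8C of warming already realized relative to pre-industrial; the function names and the example x-values are made up for illustration and are not Tol's data.

```python
# Warming already realized "today" relative to pre-industrial (1750),
# as quoted in the comment above. Everything else here is illustrative.
TODAY_OFFSET_C = 0.8


def to_preindustrial(warming_vs_today_c, offset_c=TODAY_OFFSET_C):
    """Convert warming measured from 'today' to warming measured from 1750."""
    return warming_vs_today_c + offset_c


def to_today(warming_vs_preindustrial_c, offset_c=TODAY_OFFSET_C):
    """Convert warming measured from 1750 to warming measured from 'today'."""
    return warming_vs_preindustrial_c - offset_c


if __name__ == "__main__":
    # Hypothetical x-axis positions, in degrees C.
    for x in (1.0, 2.5, 3.0):
        print(
            f"x = {x:.1f}: if relative to today, that is "
            f"{to_preindustrial(x):.1f} relative to pre-industrial; "
            f"if relative to pre-industrial, that is "
            f"{to_today(x):.1f} relative to today"
        )
```

The same plotted point refers to two different amounts of warming depending on which label is right, so changing the label without moving the points changes what every point means.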

One scary thing, when we see a paper that has had so many errors, not publicly corrected until five years after publication, is that it can make us wonder how many errors are sitting there in various other articles. As we discussed in the context of the notorious Reinhart and Rogoff paper, we find out about these errors in influential papers because others go to the trouble of checking—but even in that case, it was years before the errors came to light.
This sort of thing is one motivation for the movement toward more openness and transparency in scientific publication. There’s no reason to think that Tol, Reinhart, and Rogoff made these errors on purpose. They just weren’t careful, which is too bad, given that these were highly-publicized analyses that had important practical implications. At some point, I think an “I’m sorry” would be appropriate, if nothing else to acknowledge all the time people have wasted tracking all these problems down. But really I see it as a larger problem with the scientific communication system, the idea that once something is published in a journal, it is presumed to be true and it takes a lot of work to dislodge even gross errors.

Problems with the analysis

For convenience I’ll repeat some things that I wrote on the sister blog the other day. The short story is that the problems with Tol’s analysis go beyond a simple miscoding of some data points.

One problem which Tol didn’t note was the role of the changing minus signs in interpreting the estimates that were not garbled. In particular, his estimate of a big positive impact at 1 degree is a clear outlier in his analysis. Did he look into that in the original paper? I took a look, and here’s what he wrote, back in 2009:
“Given that the studies in Table 1 use different methods, it is striking that the estimates are in broad agreement on a number of points—indeed, the uncertainty analysis displayed in Figure 1 reveals that no estimate is an obvious outlier.”
In this way, a misclassification of a couple of points can affect the interpretation of a third point.

Thus, there was possibly a cascading effect: Tol’s existing estimate of +2.3% made him receptive to the idea that other researchers could estimate large positive economic effects from global warming, and then once he made the mistake and flipped some signs from negative to positive, this made his own +2.3% estimate not stand out so much.

Tol also wrote:
“The fitted line in Figure 1 suggests that the turning point in terms of economic benefits occurs at about 1.1 degrees Celsius warming (with a standard deviation of 0.7 degrees Celsius).”
This turning point has disappeared in his new Figure 2, so, again, I do think the new analysis has changed his conclusions in a real way.
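For reference, here is where a turning point comes from in a quadratic fit. This is a sketch under an assumed functional form (a quadratic in warming $T$ with no intercept and coefficients $\beta_1, \beta_2$); the symbols are mine, not taken from the paper.

```latex
% Assumed form: impact D as a quadratic in warming T, with \beta_2 < 0.
D(T) = \beta_1 T + \beta_2 T^2

% The turning point T^* is where the slope is zero:
\frac{dD}{dT} = \beta_1 + 2\beta_2 T = 0
\quad\Longrightarrow\quad
T^* = -\frac{\beta_1}{2\beta_2}

% An interior turning point (T^* > 0) therefore requires
% \beta_1 > 0 and \beta_2 < 0, that is, a positive initial effect.
```

So a turning point exists only if the fitted linear term is positive. If the corrected data no longer support a positive initial effect, the turning point goes away with it.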

The other big problem is that when Tol wrote, “The assessment of the impacts of profound climate change has been revised: We are now less pessimistic than we used to be,” and “the benefits of climate policy are correspondingly revised downwards,” these claims are entirely based on (a) the feature of the quadratic that when it goes up and then down, it has to go down even faster, and (b) his extrapolation of his original model (with data points only going up to 3 degrees) to 5.5 degrees.
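Points (a) and (b) can be seen with a toy calculation. The coefficients below are hypothetical, chosen only so that the curve rises and then falls; they are not Tol's fitted values, and the no-intercept quadratic form is an assumption.

```python
def quadratic(t, b1, b2):
    """Assumed no-intercept quadratic: impact at warming t (degrees C)."""
    return b1 * t + b2 * t * t


def slope(t, b1, b2):
    """Derivative of the quadratic with respect to t."""
    return b1 + 2.0 * b2 * t


# Hypothetical coefficients (not estimates from any paper).
B1, B2 = 2.0, -1.0

if __name__ == "__main__":
    turning_point = -B1 / (2.0 * B2)
    print(f"turning point at T = {turning_point}")
    # 3.0 stands in for the edge of the data; 5.5 is the extrapolation target.
    for t in (0.0, turning_point, 3.0, 5.5):
        print(
            f"T = {t}: impact = {quadratic(t, B1, B2)}, "
            f"slope = {slope(t, B1, B2)}"
        )
```

With these made-up numbers the curve starts with a slope of +2, is already falling at a rate of 4 per degree at 3 degrees, and falls at a rate of 9 per degree at 5.5 degrees. The steepness out at 5.5 degrees comes entirely from the curvature estimated where the data are. Nothing observed at that level of warming informs it.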

Tol writes that the fit of his quadratic model “is destroyed by the new observations: -11.2% for 3.2K and -4.6% for 5.4K. The former suggests a non-linearity that is much stronger than second degree; the latter suggests linearity.” (I assume the -11.2% he refers to is the -11.5% in his paper.) When he writes this, he is showing an incredible faith in his model. But it’s a strange model, as it’s not a model of the impact of warming, it’s a model of other people’s estimates of the impact of warming. To suggest that one paper’s estimate of -11.2 provides evidence of a strong nonlinearity . . . I don’t buy it.

Problems with the model

We had a good discussion of the Tol paper on the blog last week which motivated me to think more about all this. In particular, the problem is not so much with the quadratic functional form as with the conceptual model, the y_i = g(x_i|theta) + epsilon_i model that’s driving the whole thing.

The implied model of Tol’s meta-analysis is that the published studies represent the true curve plus independent random errors with mean 0. I think it would make more sense to consider the different published studies as each defining a curve, and then to go from there. In particular, I’m guessing that the +2.3 and the -11.5 we keep talking about are not evidence for strong nonmonotonicity in the curve but rather represent entirely different models of the underlying process.

In short, I don’t think the analysis can be fixed by just playing with the functional form; I think it needs to be re-thought. You just can’t think of these as representing 14 or 21 different data points as if they represent observations of economic impact at different temperatures. The data being used by Tol come from some number of forecasting models, each of which implies its own curve.
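Here is a toy sketch of how treating one point per forecasting model as observations of a single curve can mislead. Everything in it is hypothetical: the three "models," their slopes, and the warming levels at which each is reported are invented for illustration. The straight-line form of each model and the no-intercept quadratic used for the pooled fit are assumptions too.

```python
def fit_quadratic_through_origin(temps, impacts):
    """Least-squares fit of impact = b1*T + b2*T**2 via the normal equations."""
    s11 = sum(t ** 2 for t in temps)
    s12 = sum(t ** 3 for t in temps)
    s22 = sum(t ** 4 for t in temps)
    r1 = sum(t * y for t, y in zip(temps, impacts))
    r2 = sum(t ** 2 * y for t, y in zip(temps, impacts))
    det = s11 * s22 - s12 ** 2
    b1 = (r1 * s22 - r2 * s12) / det
    b2 = (s11 * r2 - s12 * r1) / det
    return b1, b2


# Hypothetical forecasting models: each is a straight line through the
# origin, impact = slope * T, so every model is monotone in warming T.
MODEL_SLOPES = {"model_a": 1.0, "model_b": -0.5, "model_c": -3.0}
# Each model is reported at one, different, level of warming (degrees C).
REPORTED_AT = {"model_a": 1.0, "model_b": 2.5, "model_c": 3.0}

temps = [REPORTED_AT[name] for name in MODEL_SLOPES]
impacts = [MODEL_SLOPES[name] * REPORTED_AT[name] for name in MODEL_SLOPES]

# Pool the three points as if they were noisy draws from one curve.
b1, b2 = fit_quadratic_through_origin(temps, impacts)
turning_point = -b1 / (2.0 * b2)

if __name__ == "__main__":
    print("pooled points:", list(zip(temps, impacts)))
    print(f"pooled quadratic: b1 = {b1:.2f}, b2 = {b2:.2f}")
    print(f"apparent turning point at T = {turning_point:.2f}")
```

The pooled fit has a positive linear term, a negative quadratic term, and an apparent turning point inside the range of the data, even though none of the three underlying curves turns at all. The curvature reflects disagreement between the models and says nothing about how impact responds to warming within any one of them.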

Reforming the process of scientific publication and review

The Journal of Economic Perspectives is well respected, and my impression is that Tol’s 2009 paper, even with all errors aside, is far below the quality of usual empirical papers that get published in top economics journals. I typically see econ papers as being pretty serious about model misspecification but the model here just doesn’t make sense. And Tol’s remark that outliers provide evidence of nonlinearity (rather than, as we would usually think in such a situation, evidence that the outlying data points are different from the others in some important ways) indicates a lack of understanding of the relevant statistical issues. That’s not so terrible—Tol does not claim to be an econometrician—but, again, it makes me wonder how the paper got through the review process.

My guess (and it’s just a guess; others can feel free to correct me on this) is that the paper was accepted because it was on an important topic. The economic effects of global warming: you can’t get much more important than this. So perhaps the journal editors felt an obligation to publish a paper on this, even if it was weak. The down side is, once the paper was published, it became influential.

This is where a more open review process might come in handy. Ultimately I can’t get upset with the journal for publishing the paper: the editors are busy people, and at first glance the paper looks reasonable. It’s only on reflection that the problems become clear, and indeed the problems with the model become much clearer once the data have been corrected and augmented. But what if the referees’ and associate editors’ reports were public? Then, we might see something like, “This paper is weak but we should publish it because the topic is important.” Or maybe not, I don’t know. But this sort of information could be useful to people. Again, we all make errors so the point is to catch and fix them sooner, not to avoid making them entirely or to punish people who make mistakes.


The Tol paper had many data errors, some of which do not seem to be acknowledged even in the latest corrected version. The errors affect the paper’s substantive conclusions as well as the interpretation of the remaining data points. Beyond this, the regression model used by Tol does not make sense to me. I think a better approach would be to consider each of the forecasts of the effects of climate to be a curve rather than a single point at a single hypothesized level of warming. I don’t object to Tol’s goal of performing a meta-analysis of published forecasts but I think to do it right, you have to get a bit more out of each forecast than a single number. This is not my area of expertise, though, and ultimately my points are statistical, not derived from climate science. The statistical model should be appropriate for the problem and data being studied, especially for a problem as important as this one.
