Monday, May 19, 2014

Referees' reports on Lennart Bengtsson paper submitted to ERL: Statement from IOP Publishing on story in The Times

Bristol, UK, 16 May 2014; Updated 19 May 2014

Dr. Nicola Gulley, Editorial Director at IOP Publishing, says, “The draft journal paper by Lennart Bengtsson that Environmental Research Letters declined to publish, which was the subject of this morning’s front page story of The Times, contained errors, in our view did not provide a significant advancement in the field, and therefore could not be published in the journal.”
“The decision not to publish had absolutely nothing to do with any ‘activism’ on the part of the reviewers or the journal, as suggested in The Times’ article; the rejection was solely based on the content of the paper not meeting the journal’s high editorial standards,” she continues.
“The referees selected to review this paper were of the highest calibre and are respected members of the international science community. The comments taken from the referee reports were taken out of context and therefore, in the interests of transparency, we have worked with the reviewers to make the full reports available.”
The full quote actually said “Summarising, the simplistic comparison of ranges from AR4, AR5, and Otto et al, combined with the statement they are inconsistent is less than helpful, actually it is harmful as it opens the door for oversimplified claims of "errors" and worse from the climate sceptics media side.”
“As the referee's report states, ‘The overall innovation of the manuscript is very low.’ This means that the study does not meet ERL’s requirement for papers to significantly advance knowledge of the field.”
“Far from denying the validity of Bengtsson’s questions, the referees encouraged the authors to provide more innovative ways of undertaking the research to create a useful advance.”
“As the report reads, ‘A careful, constructive, and comprehensive analysis of what these ranges mean, and how they come to be different, and what underlying problems these comparisons bring would indeed be a valuable contribution to the debate.’”
“Far from hounding ‘dissenting’ views from the field, Environmental Research Letters positively encourages genuine scientific innovation that can shed light on complicated climate science.”
“The journal Environmental Research Letters is respected by the scientific community because it plays a valuable role in the advancement of environmental science – for unabashedly not publishing oversimplified claims about environmental science, and encouraging scientific debate.”
“With current debate around the dangers of providing a false sense of ‘balance’ on a topic as societally important as climate change, we’re quite astonished that The Times has taken the decision to put such a non-story on its front page.”
Please find the reviewers' reports below, exactly as sent to Lennart Bengtsson.
We have full permission from the referees of this paper to make their reviews public.
The manuscript uses a simple energy budget equation (as employed e.g. by Gregory et al. 2004, 2008, Otto et al. 2013) to test the consistency between three recent "assessments" of radiative forcing and climate sensitivity (not really equilibrium climate sensitivity in the case of observational studies).

The study finds significant differences between the three assessments and also finds that the independent assessments of forcing and climate sensitivity within AR5 are not consistent if one assumes the simple energy balance model to be a perfect description of reality.

The overall innovation of the manuscript is very low, as the calculations made to compare the three studies are already available within each of the sources, most directly in Otto et al.

The finding of differences between the three "assessments" and within the assessments (AR5), when assuming the energy balance model to be right, and compared to the CMIP5 models are reported as apparent inconsistencies.

The paper does not make any significant attempt at explaining or understanding the differences, it rather puts out a very simplistic negative message giving at least the implicit impression of "errors" being made within and between these assessments, e.g., by emphasising the overlap of authors on two of the three studies.

What a paper with this message should have done instead is recognising and explaining a series of "reasons" and "causes" for the differences.

- The comparison between observation based estimates of ECS and TCR (which would have been far more interesting and less impacted by the large uncertainty about the heat content change relative to the 19th century) and model based estimates is comparing apples and pears, as the models are calculating true global means, whereas the observations have limited coverage. This difference has been emphasised in a recent contribution by Kevin Cowtan, 2013.
- The differences in the forcing estimates used, e.g., between Otto et al. 2013 and AR5 are not some "unexplainable change of mind of the same group of authors" but are following two different logics, and also two different (if only slightly) methods of compiling aggregate uncertainties relative to the reference period, i.e., the Otto et al. forcing is deliberately "adjusted" to represent more closely recent observations, whereas AR5 has not put so much weight on these satellite observations, due to still persisting potential problems with this new technology.
- The IPCC process itself explains potential inconsistencies under the strict requirement of a simplistic energy balance: The different estimates for temperature, heat uptake, forcing, and ECS and TCR are made within different working groups, at slightly different points in time, and with potentially different emphasis on different data sources. The IPCC estimates of different quantities are not based on single data sources, nor on a fixed set of models, but by construction are expert based assessments based on a multitude of sources. Hence the expectation that all expert estimates are completely consistent within a simple energy balance model is unfounded from the beginning.
- Even more so, as the very application of the Kappa model (the simple energy balance model employed in this work, in Otto et al., and Gregory 2004) comes with a note of caution, as it is well known (and stated in all these studies) to underestimate ECS, compared to a model with more time-scales and potential non-linearities (hence again no wonder that CMIP5 doesn't fit the same ranges).
Summarising, the simplistic comparison of ranges from AR4, AR5, and Otto et al., combined with the statement they are inconsistent is less than helpful, actually it is harmful as it opens the door for oversimplified claims of "errors" and worse from the climate sceptics media side.
One cannot and should not simply interpret the IPCC's ranges for AR4 or 5 as confidence intervals or pdfs and hence they are not directly comparable to observation based intervals (as, e.g., in Otto et al.).
In the same way that one cannot expect a nice fit between observational studies and the CMIP5 models.

A careful, constructive, and comprehensive analysis of what these ranges mean, and how they come to be different, and what underlying problems these comparisons bring would indeed be a valuable contribution to the debate.
I have rated the potential impact in the field as high, but I have to emphasise that this would be a strongly negative impact, as it does not clarify anything but puts up the (false) claim of some big inconsistency, where no consistency was to be expected in the first place.
And I can't see an honest attempt of constructive explanation in the manuscript.

Thus I would strongly advise rejecting the manuscript in its current form.
I would be interested in learning whether or not there are internal inconsistencies in estimates of climate sensitivity and forcing in individual studies and in learning if there are substantial differences among the studies. I would be even more interested in understanding why any apparent inconsistencies and differences might exist. On this second point, the manuscript has little to offer (other than some speculation that aerosol forcing estimates have changed). And unfortunately on the first point, the authors have only superficially demonstrated possible inconsistencies.  Moreover, in addressing the question of “committed warming,” the authors have inexplicably used the wrong equation. For all these reasons, I recommend the paper be rejected.
The authors use the wrong equation to calculate the "committed warming." In their equation 3, they should use the equilibrium climate sensitivity, not the transient climate sensitivity. This would then yield the climate system’s eventual equilibrium temperature increase (relative to pre‐industrial temperature) for a given forcing, which they take to be present day GHG forcing. Since the transient climate sensitivity is quite a bit lower than the equilibrium climate sensitivity, they have substantially underestimated the committed warming.
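The direction of this error can be illustrated with a minimal sketch. The manuscript's equation 3 is not reproduced in the report, so the linear scaling used here (warming = sensitivity * F / F(2xCO2)) is an assumption, as are the function name and every numerical value; the numbers are placeholders chosen only to show the effect and are not taken from the manuscript or from any assessment.

```python
def committed_warming(sensitivity_k, forcing_wm2, forcing_2xco2_wm2):
    """Warming (K) implied by a fixed forcing, assuming it scales linearly
    with the sensitivity per CO2 doubling (an assumed form, not the
    manuscript's actual equation 3)."""
    return sensitivity_k * forcing_wm2 / forcing_2xco2_wm2


# Placeholder values for illustration only (hypothetical, not from the paper).
F_2XCO2 = 3.7  # forcing for doubled CO2, W/m^2
F_GHG = 2.8    # present-day GHG forcing, W/m^2
ECS = 3.0      # equilibrium climate sensitivity, K
TCR = 1.8      # transient climate response, K

with_ecs = committed_warming(ECS, F_GHG, F_2XCO2)
with_tcr = committed_warming(TCR, F_GHG, F_2XCO2)

# Using TCR in place of ECS shrinks the result by the factor TCR / ECS.
print(f"with ECS: {with_ecs:.2f} K, with TCR: {with_tcr:.2f} K")
```

Under this assumed scaling, whenever the transient value is lower than the equilibrium value the committed warming is underestimated by exactly the ratio of the two, whatever forcing is chosen.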
Even before making this error, there is a troubling shallowness in the arguments describing apparent discrepancies in estimates of forcing and equilibrium climate sensitivity. Here are a few suggestions on how to improve this part of the manuscript.
  1. The casting of ECS in the odd units of K/(W/m**2) is completely unnecessary and not only is confusing, but makes it difficult to check some of the numerical values reported. ECS should be reported in K since it is a temperature change in response to 2xCO2 forcing. Instead of equation 1, simply write ECS = F(2xCO2) * delta-T/(F – N).
  2. The present manuscript is unacceptably unclear about exactly what values are used in constructing fig. 1. For the 4 cases considered (AR4, AR5, Otto et al., and the CMIP5 models), you should construct a table (source of info. vs. value of each parameter) providing all the values (or range of values) used for ECS, delta-T(2xCO2), F(2xCO2), F, and N. Indicate which values (if any) were not reported by the referenced study itself, but were adopted for use from some other source. With this information you might be able to convince the reader that there are in fact differences and inconsistencies in each of the studies. As the manuscript stands, I am left wondering whether the apparent discrepancies might actually be explained more by differences in ocean heat uptake values used as opposed to uncertainty/differences in forcing and ECS. I note that many of the discrepancies disappear if AR5 assumed a somewhat larger value for N than in the other studies.
  3. For clarity (and strict correctness), log-log plots should show the relationship between non‐dimensional quantities only (because taking the log of a dimension yields nonsense).  So fig. 1 should show log(ECS/delta-T) as a function of log( (F-N) / F(2xCO2) ).  This will also make it easy for the reader to understand the meaning of the numerical values plotted: the ordinate indicates by what factor the GMST equil. response exceeds the temperature difference between some perturbed state and the control (preindustrial) states (in this case warming since pre‐industrial times).  On the abscissa, F-N would appear normalized by 2xCO2 forcing. An equilibrium climate with 2xCO2 will by construction be plotted at the origin (i.e., ECS/delta_T(2xCO2) = 1 ).
  4. In the current manuscript, important assumptions are that uncertainty in delta-T (obs) and N(obs) is negligible and that the values should be the same for use in all 4 studies. I’m not sure this is valid, since the estimates of “present-day” forcing are for different time periods (I think). Moreover, it does not seem consistent to evaluate N over the period from 1971 to 2010 and GMST change from 1850-1900 to 2003-2012. For this to be an appropriate comparison, you must assume the rate of ocean heat uptake is the same during 2003-2012 as it is during 1971-2010. You also must assume that in the period 1850-1900 the system is in equilibrium (with N=0 during that period). I note that Otto et al. assume heat uptake of 0.08 ± 0.03 W/m**2 for their reference period (1860-1879). I suspect for 1850-1900, the comparable number might be somewhat larger, which would reduce your N by a non-negligible fraction. In any case N is highly uncertain, and you should discuss how this affects your results. Similarly, you need to consider uncertainty in GMST. Although this quantity is reasonably well measured over the historical period, we cannot expect it on short time-scales to necessarily exactly be related to net radiative flux by a constant. These and other uncertainties lead to the range of values shown, for example, in Otto et al. (for the decade 2000-2010 values range around delta-T = 0.7 K with a standard error estimate of about 20%, at least as best I can determine from their fig. 1). Again how these uncertainties affect each of the 4 studies you consider should be discussed; it’s possible that the discrepancies could disappear if different studies used different values of N and GMST change within the accepted uncertainties.
  5. One way to better indicate uncertainties on your graph would be to replace your log-log plot with a plot of (F-N)/F(2xCO2) along the ordinate and 1/ECS on the abscissa, perhaps labeled non-linearly with values of (1/6, 1/5, ¼, ½, and 1) so the reader could directly read the temperature. On this plot you could then indicate the region compatible with temperature observation uncertainty by plotting a couple of lines emanating from the origin with slope equal to different values of delta-T [nb. delta-T /ECS = (F-N)/F(2xCO2)]. You could also indicate how the uncertainty in N affects your projection lines corresponding to the current diagonal lines in your fig. 1 by plotting at their central point a vertical error-bar line (vertical because recall I’ve put F-N on the y-axis). This figure would resemble fig. 1 of Otto et al., but with their obs. change in GMST replaced by 1/ECS and their shaded diagonal lines of ECS replaced by GMST. You would probably only have to display 2 diagonal lines indicating the uncertainty in obs. GMST change.
  6. In your current discussion you imply that differences in F-N across different studies are attributable to differences in F, but N could also be responsible.
  7. The study would be much more valuable if it attempted to also begin to address the 4 questions posed in the conclusions. I suspect the answers are really quite mundane, although the tone of the discussion implies otherwise.
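
The relationships discussed in the numbered suggestions above can be sketched in a few lines. The formula is the one written out in the first suggestion, ECS = F(2xCO2) * delta-T/(F - N), and the log-log coordinates follow the non-dimensional form in the third. The function names and all input values below are hypothetical placeholders for illustration; they are not values from the manuscript, AR4, AR5, or Otto et al.

```python
import math


def ecs_energy_budget(delta_t, forcing, heat_uptake, forcing_2xco2):
    """ECS in K from the simple energy budget: F(2xCO2) * delta-T / (F - N)."""
    net_forcing = forcing - heat_uptake
    if net_forcing <= 0:
        raise ValueError("F - N must be positive for this estimate")
    return forcing_2xco2 * delta_t / net_forcing


def nondimensional_log_coordinates(delta_t, forcing, heat_uptake, forcing_2xco2):
    """Return (log10((F - N) / F(2xCO2)), log10(ECS / delta-T))."""
    ecs = ecs_energy_budget(delta_t, forcing, heat_uptake, forcing_2xco2)
    x = math.log10((forcing - heat_uptake) / forcing_2xco2)
    y = math.log10(ecs / delta_t)
    return x, y


# Placeholder inputs (hypothetical): warming in K, forcings in W/m^2.
DELTA_T = 0.8
FORCING = 2.3
FORCING_2XCO2 = 3.7

# Vary only the heat uptake N to see how strongly the inferred ECS depends on it.
ecs_by_uptake = {
    n: ecs_energy_budget(DELTA_T, FORCING, n, FORCING_2XCO2)
    for n in (0.4, 0.6, 0.8)
}
for n, ecs in ecs_by_uptake.items():
    print(f"N = {n:.1f} W/m^2 -> ECS = {ecs:.2f} K")
```

Because ECS/delta-T equals F(2xCO2)/(F - N) in this model, the two log coordinates always sum to zero, and a case with F = F(2xCO2) and N = 0 lands at the origin. Holding F and delta-T fixed while varying N shows that differences in F - N between studies need not come from F alone.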
