A universal error source in past climate estimates derived from tree rings

In this study (hereafter: RN()) the authors start out to prove that paucities and age-class gaps in the Torneträsk MXD chronology (Schweingruber 1988, Briffa et al., 1992) are responsible for a low frequency (red) bias. To prove this RN() have developed a method of MXD chronology construction that purportedly accounts for paucities and age-class gaps. They evidence their method’s advantage by comparing a reconstruction of the historical Tornedalen temperature record (Klingbjer and Moberg, 2003) to a reconstruction produced by their method. As evidence for the low frequency bias in the RCS chronology produced by Briffa et al., 1992 (hereafter: BF92), the authors rely on a simple visual comparison.


Introduction
Tree-ring chronologies, as they are used for climate reconstructions, are typically constructed by averaging measurements from multiple trees for each year (Cook and Kairiukstis, 1990).The idea behind it is that generating a mean will cancel out the random noise recorded by the individual trees and that the remaining variations represent the common signal, e.g.climate.However, merging all measurements for each year is also a necessity because every year a varying subset of single trees is available.The age structure of trees in each year is likely to change, especially further back in the past, when fewer samples are available or in the last decades, if scientists did not sample very young trees (Fig. 1).
The single tree measurements, for climate reconstructions mostly tree-ring width (TRW) or maximum latewood density (MXD), show an age-growth relationship, so that young trees generate wider rings than older trees.These effects are usually removed by subtraction of an average growth function.If there are climatic trends or low-frequency variations (in this study used for multi-decadal to centennial variability) it has been difficult to separate this variability from the growth trend and to keep the low-frequency changes in the remaining residuals.
The dendrosciences community has been well aware of these challenges (e.g.Cook et al., 1997).They developed methods such as regional curve standardization (RCS) to preserve low frequency variability in the age-detrending process (e.g.Mitchell, 1967;Briffa, 1996;Esper et. al., 2003).Additionally, the response of young trees to climate may differ from that of old trees.Also, the seasonality of signals or the sensitivity to temperature or precipitation limitation may change over time (Esper et al., 2008).Despite all efforts it remains uncertain if most RCS based reconstructions show the correct amount of low frequency variability.In fact, Franke et al. (2013) succeeded in showing that many present day state-of-the- art reconstructions still tend to be biased in the low frequency part of the spectra.To uncover potential reasons of the bias and to make case studies requires new approaches.In order to study the bias in more detail, we introduce a new method of building tree-ring chronologies in this study.This method does not rely on detrending for age-growth relationships and simultaneously allows for low-frequency error estimation.
In the following chapter, we introduce the new method theoretically.It is to point out that the aim is not produce better temperature reconstructions.Instead, erroneous long-term oscillations and their reasons are studied.An illustrative case study is based on the well-known tree-ring MXD data from Torneträsk (Schweingruber et al., 1988) because this temperature reconstruction is known to show erroneous long oscillations (Rinne et al., 2014).

Preliminary considerations of the computations
The measurements (e.g.maximum densities of the tree rings) are given by M(b,t), where b and t are the age class of the tree ring and the calendar year ( AD 441-1980), respectively.The age class refers to the cambial age of the ring of any tree.The unit of b is year following the convention.Thus, if b=1, the tree ring is the first one from the pith.b=b max is the highest age class accepted into the computations.The measurements available during any calendar year t are M(b i ,t), where i=1,2,3,…, n and b i < b i+1 .Thus the yearly measurements are ordered according to their age class and the youngest and oldest age class are b 1 and b n <=b max , respectively.The number of measurements (n) varies from year to year.
Width and density of the tree rings vary with the biological age of the trees.Thus, though the climate was stable, the tree-ring measurements would change with their age.It has therefore been crucial to identify the age dependence of the treering measurements in dendrochronological studies.The principal solution to this problem has been to average the available tree ring measurements age class by age class (Briffa, 1996).It has then been assumed that this eliminates the contribution of the climate and that the result solely describes the age dependence of the tree rings.If that is not true, then the age dependence found may be contaminated by the climate changes.
In this study, no age dependence of the tree rings will be determined.As a consequence the temperature estimates are determined independently for each calendar year.The computations must for each year apply the same frame.
In the computations, it is implied that measurements for all age classes b 1 =1,2,3,…,b max are available.Then the climate indicator for each calendar year is constructed by averaging the measurement values.In practice this is not possible as only a limited number of measurements (n) are yearly available.Therefore the missing values should first be interpolated or extrapolated.The final climate indicator will be the average of the interpolated, extrapolated and measured values of the calendar year.In practice it is not necessary to explicitly determine the interpolated or extrapolated values as shown in the following and illustrated in Fig. 2. Before computations, the upper limit of the age classes must be selected.Sensitivity experiments show that b max =270 years (Fig. 1) is a reasonable choice for our case study.All older age classes will be excluded.) is to be estimated.However, to avoid too complicated formulae, in these exceptional cases the simpler formulae given will here be applied.By adding the corrections, the yearly average of the interpolated, extrapolated and measured values becomes approximately (2) M int in fact presents a weighted average of the measurements available.It can be seen by reordering the terms of M int as follows (3) where use of two formal additional parameters is made, namely b 0 =b 1 and b n+1 =b n .

Error estimation
Assuming that the errors in the measurements in Eq. ( 3) (being from different trees) are independent of each other and that their error variance is S 2 , the error variance of the sum M int becomes from Eq. (3) as follows (4) Adding error variances arising from the extrapolations, the error variance of the final average becomes (5) The variance estimate consists now of two kind of factors, those unrelated [r 1 2 , S 2 , r n 2 ] and those related [ to the distribution of age classes.To equalize their dimensions, they are multiplied and divided by b max , respectively and can be rewritten as to The latter group is now dimensionless and describes relative variance terms.They present a new kind of error source, which depends on the age classes, i.e.only on the distribution of the measurements over the trees and years (Fig. 1).
It has been assumed that the yearly measurements do not overlap (b i <b i+1 ).During periods of high sample counts there may be several trees with the same birth year (the year of the first ring).In such cases the yearly measurements overlap (b i <=b i+1 ).Because the number of the trees tends to be high, the random and relative variance terms become small.From the point of view of this study, such cases are not essential and can be omitted in the error studies.
As the structure of the tree populations normally varies slowly in time, the new relative error variance terms mainly describe slow variations.Thus they serve as a tool for studying (erroneous) longer variations in temperature reconstructions.
In the following we study, whether they could explain the errors observed in reconstructions and uncover reasons of those errors.To study the impacts of the paucities and data gaps is tedious on the basis of Eq. ( 5) or graphs like in Fig. 3d.The coefficients r 1 , S and r n are unknown and the corresponding terms must be studied separately.In the following they are combined by making an ad hoc assumption that r 1 =r n = 0.005S.Then the variance estimate can be written as follows

A practical formula to trace data gaps and paucities
where the relative, dimensionless variance term depends only on the distribution of the age classes over the trees and years as follows The additional scaling factor 0.7 2 is explained in the following.
If S 2 represents the variance of random errors of the observations and their distribution is a Gaussian one, the variance of the average of the observations can be estimated from S 2 /N, where N is the number of observations.In Eq. ( 6) it is represented by 1/Var rel .Accordingly, 1/Var rel can thus be seen to represent the degrees of freedom (DOF).These are shown in Fig. 3c.The scaling factor (here 0.7 2 ) is to be selected so that DOF are as high as possible but not exceeding N.
Note, it is only necessary to know the actual distribution of the age classes (Fig. 1) to derive the DOF by approximation.
The DOF can be essentially seen as a parameter that controls the effect of the paucity and data gaps of the original data on the final reconstruction.
The degrees of freedom, 1/Var rel , are designed to work in data periods with data gaps and paucities.Their use is especially recommended as a warning tool in such cases of missing data.

Torneträsk case study
The classical record of MXD [g/cm 3 ] from Torneträsk consist of 65 trees grown between 441 to 1980 AD (Schweingruber et al., 1988).The scaling from MXD to temperatures can be achieved by multiplying the MXD measurements by 15.92 (Rinne et al., 2014).The same scaling factor will be used here.Thus, no explicit calibration against the instrumental observations was necessary.The temperature reconstruction (hereafter b270 refers to the selection of b max =270) is shown in Fig. 3a.
To further illustrate the performance of our reconstruction method, a comparison with instrumental observations for the summer months (JJA) roughly after 1800 will be made.As the focus is in the erroneous long-term biases, smoothing will be applied.The case is challenging because of a wide data gap due to the missing trees during recent years (Fig. 1).component in the instrumental observations had been deduced from the tree ring estimates.Nevertheless, temperatures before ca.1850 may still contain biases (Melvin et al., 2013;Grudd, 2008).
Our method allows to choose b max optimally for each consideration.For the full period b max =270 was seen to be the optimal selection.However, for the shorter, more recent period 1802-1980 b max =370 seems to be better.Such a choice makes it possible to include into the computations numerous older tree rings available during recent decades (Fig. 1).As a result, the systematic underestimation becomes lower (Fig. 3a).
The resulting tree-ring reconstruction is shown in Fig. 4 by adjusting its mean to the observation mean during 1850-1950.Otherwise no calibration against observations is made.Instead, the scaling factor of 15.92, selected beforehand, is applied.The weather variations are outside of the scope of the present work so that both curves have been smoothed (26years cubic splines).Note here that the geographic areas of the estimates and observations differ somewhat.Taking into account the high number of missing younger age classes, the temperature estimate during the recent years can be seen to be satisfactory, too.

Bias of the reconstructions
In the tree-ring studies the age-growth function is in a central role.It is a problematic case if measurements deviate significantly from the age-growth function applied, i.e. there is a tree with biased growth.In our approach the age-growth function is by-passed.For comparison, in Fig. 3a the Torneträsk reconstruction made in Briffa et al. (1992) using the classical approach is shown together with the present approach using the same Torneträsk data.
The reconstructions in Fig. 3a are astonishingly close to each other, in spite of the very different computational methods.All cases show a (biased) quasi cycle of a wave of ca.350 years in length.Importantly, the bias is not in the tree ring measurements but in the temperature estimates and is due to missing trees.Melvin et al. (2013, their Fig. 4) present several variants of the reconstructions, including those derived from the tree ring widths.All of them show a similar biased structure as long as only the Torneträsk data is included.This is because they are all derived from the data with the same structure (Fig. 1).These include in a compressed form the error variances due to random variations in the measurements.Paucities in the tree population or periods of missing trees (Fig. 1) cause successively data gaps in young, intermediate and old age classes leading to variance variations in Fig. 3d.The increased variance decrease the degrees of freedom and so makes the bias more probable, which is then seen as erroneous long-term oscillations in the reconstructions.This chain of events is in the following discussed in detail.

Explanation of the bias observed
To further study the connection between the relative error variances and long-term error, the bias is estimated with the aid the temperature estimate in Esper et al. (2012).The original Torneträsk data used in this study consist of 65 trees.
The yearly numbers of tree-ring measurements may vary strongly being sometimes very low (Fig. 3c) and never over 25.
However, when the degrees of freedom in Fig. 3c are e.g. over 14, the bias nearly vanishes.In Esper et al. (2012), the number of trees is as high as 578.Between 650 and 2000 AD the numbers of yearly measurements are only exceptionally less than 25, normally clearly more (their Fig. 2).Accordingly, it can be expected that the degrees of freedom are high enough so that their reconstruction can be seen to be more bias free.A more detailed evaluation could be done following our Eq.7 but it is beyond the scope of this study.Thus the difference between the b270 reconstruction and the one in Esper et al.
(2012) can be seen to represent approximately the temperature bias (Fig. 3b) due to the Torneträsk data.
Fig. 3b shows how the long-term variations in the reconstructions (Fig. 3a) are mostly erroneous.An explanation of this bias, derived from the presented theory (Eq.7), is given in Fig. 3c.Some minima and maxima of the DOF are indicated by full and open circles, respectively.It is seen, that strong minima (maxima) of the DOF well predict following biased (unbiased) values which are reflected in reconstructions (Fig. 3a).
To illustrate, a population change is seen in 1356 AD (Fig. 1) where the oldest age class (<270) drops suddenly down to b=116.As a consequence, variances of the older and intermediate age classes are peaked in Fig. 3d, further lowering strongly the DOF in Fig. 3c.This is reflected in the bias (Fig. 3b) and in the reconstruction (Fig. 3a, smoothed values in Figs. 3a and 3b naturally lag the sudden changes in Figs.3c and 3d).
Correspondingly the optimal distribution of the measurements around 869 and 1500 AD (Fig. 1) is reflected in low error variances (Fig. 3d), high number of the DOF (Fig. 3c) and unbiased temperature estimates (Fig. 3b).
Low values of the DOF (<5) are related to extrema of the bias.High values of the DOF (>14) indicate vanishing bias.The latter ones are seldom and therefore bias-free cases are seen only temporarily in Fig. 3b.
In Fig. 1, there is a repetitive similarity between the two longer periods with nearly continuous samples of trees (AD 1250-1500 and 1600-1800) and two periods without new trees (AD 1500(AD -1600(AD and 1800(AD -1980)).The corresponding terms of the relative variance (Eq.5) will be discussed in the following.
The large first peaks of variance around 1356 AD (Fig. 3d the paucities due to missing trees in the data cause alternating data gaps leading to varying behavior in temperature reconstructions.Especially biases around 1600 and 1950 can be explained by the error terms connected to the missing younger age classes (due to the missing trees).The latter case is known as "divergence", a systematic underestimation of the temperatures (Briffa et al. 1998, D'Arrigo et al. 2008).As there is no formal difference between the cases of 1600 and 1950, it is natural to refer to "divergence" in both cases.
In summary, the long-term biased oscillations originate from the unsuitable distribution of the measurements over age classes and years and are independent of the reconstruction methods and the existing measurements

Conclusions
The presented method to estimate past temperatures from tree ring measurements is a new approach, where no age dependence of the tree rings is estimated.The method allows to estimate systematic errors due to the uneven distribution of the measurements caused by missing trees, i.e., by data gaps and paucities in the data.Because the structure of the tree populations varies slowly, such errors are mainly seen as long oscillations.
In the Torneträsk case studied, the biased temperature oscillations were well connected to the theoretical error terms derived.Especially the "divergence" cases (a systematic underestimation of the temperatures) around 1600 and 1950 could be explained by the error terms connected to the paucities in the data seen as missing younger age classes.
Several Torneträsk studies presented in the literature, in spite of clear differences between the different approaches, suffer similarly of paucities in the data.All of them show similar spurious low-frequency oscillations.Franke et al. (2013) observed that the low-frequency biases can be present in studies applying maximum densities of the tree rings as well as those studying tree ring widths.The bias can further be present both in estimates of precipitation and temperature.The presence of the low-frequency error was independent of the methods applied or aim of the work.This study strongly suggests that the paucities in the data are one explanation for such a general presence of the error.The approximate method derived here to estimate the degrees of freedom can be used to trace the potential impact of such paucities.
Fig. 2 for b max =300.The extrapolation corrections are more complicated if most of the older age classes are unknown (b n <30) or the impact of the missing youngest age classes (b 1 <≈20) is to be estimated.However, to avoid too complicated Clim.Past Discuss., doi:10.5194/cp-2016-27,2016 Manuscript under review for journal Clim.Past Published: 23 March 2016 c Author(s) 2016.CC-BY 3.0 License.
Clim.Past Discuss., doi:10.5194/cp-2016-27,2016 Manuscript under review for journal Clim.Past Published: 23 March 2016 c Author(s) 2016.CC-BY 3.0 License.Klingberj and Moberg (2003) composited a series of instrumental observations made in the Tornedalen region, some 300 km from Torneträsk.Their construction begins with instrumental data from Övertorneå (1802-1838).The observation hours were not stated explicitly for 1826-38 and the authors had to assume them.Here we correct that assumption by increasing their JJA temperature reconstructions by 1°C.The procedure can be interpreted as if an unknown Clim.Past Discuss., doi:10.5194/cp-2016-27,2016 Manuscript under review for journal Clim.Past Published: 23 March 2016 c Author(s) 2016.CC-BY 3.0 License.The yearly variations of the dimensionless variance terms for the Torneträsk case are shown in Fig. 3d.Note that they are relative error terms and the final error variance depends on the dimensional portions (b max r 1 ) 2 , (b max r n ) 2 and S 2 .
) result from missing intermediate (interpolated) and old age classes.These damp slowly out and new peaks are seen at 1600 AD.Here they are due to the intermediate (interpolated) and young age classes.The same structure is repeated between 1600-1800 AD and 1800-1980 AD.In this way Clim.Past Discuss., doi:10.5194/cp-2016-27,2016 Manuscript under review for journal Clim.Past Published: 23 March 2016 c Author(s) 2016.CC-BY 3.0 License.

Figure 1 .Figure 2 .
Figure 1.The distribution of the measurements or age classes vs. calendar year in the Torneträsk MXD data.Years with better (869 and 1500 AD) and worse (1365 AD) distributions of the measurements are shown.Upper limits of the age classes (270 and 370 years) applied in the computations are indicated.

Figure 3 .
Figure 3.The theoretical explanation of biased long-term oscillations between 500 and 1975 AD.Panel (a): Smoothed (85year spline) temperature reconstructions derived from Torneträsk MXD data 441-1980.Shown are the estimate from Briffa et al. 1992 and the present ones both with b max =270 and b max =370.Panel (b): The smoothed (85-year spline) difference between the present temperature estimate (b max =270) and that in Esper et al. 2012.Panel (c): Comparison of the sample count (number of observations) and degrees of freedom.Indicated are the seven and two cases of DOF<5 and DOF>14, 5 respectively.Panel (d): components of relative variances.Some years mentioned in the text and shown in Fig. 1 are indicated in panels (b), (c) and (d).

Figure 4 .
Figure 4. Comparison of Torneträsk temperature reconstruction and Tornedalen temperature observations 1802-1977.10 Reconstruction mean has been adjusted to fit the corresponding observational mean during 1850-1950 and both curves have been smoothed.