Temperature is the most frequently measured subsurface ocean variable. Historically, a variety of instruments have been used to measure temperature, with differing accuracies, precisions, and sampling depths. Both the mix of instruments and the overall sampling patterns have changed in time and space (Boyer et al., 2009), complicating efforts to determine and interpret long-term change. The evolution of the observing system for ocean temperature is summarized in Appendix 3.A. Upper ocean temperature (hence heat content) varies over multiple time scales including seasonal (e.g., Roemmich and Gilson, 2009), interannual (e.g., associated with El Niño, which has a strong influence on ocean heat uptake; Roemmich and Gilson, 2011), decadal (e.g., Carson and Harrison, 2010), and centennial (Gouretski et al., 2012; Roemmich et al., 2012). Ocean data assimilation products using these data exhibit similar significant variations (e.g., Xue et al., 2012). Sparse historical sampling, coupled with large-amplitude variations on shorter time and spatial scales, raises challenges for estimating globally averaged upper ocean temperature changes. Uncertainty analyses indicate that the historical data set begins to be reasonably well suited for this purpose starting around 1970 (e.g., Domingues et al., 2008; Lyman and Johnson, 2008; Palmer and Brohan, 2011). Uncertainty estimates for upper (0 to 700 m) ocean heat content (UOHC) shrink after 1970 with improved sampling, so this assessment focuses on changes since 1971. Estimates of UOHC have been extended back to 1950 by averaging over longer time intervals, such as 5-year running means, to compensate for sparse data distributions in earlier time periods (e.g., Levitus et al., 2012). These estimates may be most appropriate in the deeper ocean, where strong interannual variability in upper ocean temperature distributions such as that associated with El Niño (Roemmich and Gilson, 2011) is less likely to be aliased.
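As a minimal illustration of the time averaging mentioned above, the following Python sketch computes a centered running mean of an annual series. The function name `running_mean`, the choice of a centered window, and the sample values are assumptions made for illustration only; they are not taken from Levitus et al. (2012) or any other analysis cited here.

```python
def running_mean(values, window=5):
    """Centered running mean of an evenly spaced series.

    values: list of numbers, one per time step (e.g., one per year).
    window: odd number of time steps to average over (5 for a pentadal mean).

    Only positions with a full window are returned, so the output is
    shorter than the input by (window - 1) points.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    return [
        sum(values[i - half:i + half + 1]) / window
        for i in range(half, len(values) - half)
    ]


# Hypothetical annual anomalies (arbitrary units), not real data.
annual = [1, 2, 3, 4, 5, 6, 7]
smoothed = running_mean(annual, window=5)
```

Averaging over a longer window trades temporal resolution for a larger number of observations per estimate, which is why it helps where sampling is sparse.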
Since AR4, measurement biases in some widely used instruments, namely the expendable bathythermograph (XBT), the mechanical bathythermograph (MBT), and a subset of Argo floats, have been recognized as having a significant impact on estimates of changes in ocean temperature and upper (0 to 700 m) ocean heat content (hereafter UOHC) (Gouretski and Koltermann, 2007; Barker et al., 2011). Careful comparison of measurements from the less accurate instruments with those from the more accurate ones has allowed some of the biases to be identified and reduced (Wijffels et al., 2008; Ishii and Kimoto, 2009; Levitus et al., 2009; Gouretski and Reseghetti, 2010; Hamon et al., 2012). One major consequence of this bias reduction has been to diminish an artificial decadal variation in upper ocean heat content that was apparent in the observational assessment for AR4, in notable contrast to climate model output (Domingues et al., 2008). Substantial time-dependent XBT and MBT biases introduced spurious warming in the 1970s and cooling in the early 1980s in the analyses assessed in AR4. Most ocean state estimates that assimilated biased data (Carton and Santorelli, 2008) also showed this artificial decadal variability, while one (Stammer et al., 2010) apparently rejected these data on dynamical grounds. More recent estimates assimilating better-corrected data sets (Giese et al., 2011) also show reduced artificial decadal variability during this time period.
Recent estimates of upper ocean temperature change also differ in their treatment of unsampled regions. Some studies (e.g., Ishii and Kimoto, 2009; Levitus et al., 2012) effectively assume a temperature anomaly of zero in these regions, while other studies (Palmer et al., 2007; Lyman and Johnson, 2008) assume that the averages of sampled regions are representative of the global mean in any given year, and yet others (Smith and Murphy, 2007; Domingues et al., 2008) use ocean statistics (from numerical model output and satellite altimeter data, respectively) to extrapolate temperature anomalies in sparsely sampled areas and estimate uncertainties. These differences in approach, coupled with choice of background climatology, can lead to significant divergence in basin-scale averages (Gleckler et al., 2012), especially in sparsely sampled regions (e.g., the extratropical Southern Hemisphere (SH) prior to Argo), and as a result can produce different global averages (Lyman et al., 2010). However, for well-sampled regions and times, the various analyses of temperature changes yield results in closer agreement, as do reanalyses (Xue et al., 2012).
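To make the effect of the first two infilling assumptions concrete, the following Python sketch computes an area-weighted mean anomaly from a partly sampled grid under each assumption. The function name `global_mean_anomaly`, the `fill` option names, and the four-box example are hypothetical and chosen for illustration; the sketch does not reproduce the method of any study cited above, and the third approach (extrapolation using ocean statistics) is not shown.

```python
def global_mean_anomaly(anomalies, areas, fill="zero"):
    """Area-weighted mean temperature anomaly over all grid boxes.

    anomalies: list of floats, with None marking unsampled boxes.
    areas: list of box areas in any consistent unit.
    fill: "zero" treats unsampled boxes as having zero anomaly;
          "representative" assumes the mean of the sampled boxes
          applies to the unsampled boxes as well.
    """
    if len(anomalies) != len(areas):
        raise ValueError("anomalies and areas must have the same length")
    sampled = [(a, w) for a, w in zip(anomalies, areas) if a is not None]
    if not sampled:
        raise ValueError("at least one box must be sampled")

    weighted_sum = sum(a * w for a, w in sampled)
    sampled_area = sum(w for _, w in sampled)
    total_area = sum(areas)

    if fill == "zero":
        # Unsampled boxes add area to the denominator but nothing to the sum.
        return weighted_sum / total_area
    if fill == "representative":
        # Equivalent to filling unsampled boxes with the sampled mean.
        return weighted_sum / sampled_area
    raise ValueError("fill must be 'zero' or 'representative'")


# Hypothetical grid: four equal-area boxes, two of them unsampled.
anoms = [0.5, 0.25, None, None]
areas = [1.0, 1.0, 1.0, 1.0]
zero_fill = global_mean_anomaly(anoms, areas, fill="zero")            # 0.1875
rep_fill = global_mean_anomaly(anoms, areas, fill="representative")   # 0.375
```

In this example, with half of the area unsampled, the zero-anomaly assumption yields a mean half as large as the representative-mean assumption. When every box is sampled the two results coincide, consistent with the closer agreement among analyses for well-sampled regions and times.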