Impact of Land Initial States Uncertainty on Subseasonal Surface Air Temperature Prediction in CFSv2 Reforecasts

The NCEP CFSv2 ensemble reforecasts initialized with different land surface analyses for the period of 1979–2010 have been conducted to assess the effect of uncertainty in land initial states on surface air temperature prediction. The two observation-based land initial states are adapted from the NCEP CFS Reanalysis (CFSR) and the NASA GLDAS-2 analysis; atmosphere, ocean, and ice initial states are identical for both reforecasts. This identical-twin experiment confirms that the prediction skill of surface air temperature is sensitive to the uncertainty of land initial states, especially in soil moisture and snow cover. There is no distinct characteristic that determines which set of the reforecasts performs better. Rather, the better performer varies with the lead week and location for each season. Estimates of soil moisture between the two land initial states are significantly different with an apparent north–south contrast for almost all seasons, causing predicted surface air temperature discrepancies between the two sets of reforecasts, particularly in regions where the magnitude of initial soil moisture difference lies in the top quintile. In boreal spring, inconsistency of snow cover between the two land initial states also plays a critical role in enhancing the discrepancy of predicted surface air temperature from week 5 to week 8. Our results suggest that a reduction of the uncertainty in land surface properties among the current land surface analyses will be beneficial to improving the prediction skill of surface air temperature on subseasonal time scales. Implications of a multiple land surface analysis ensemble are also discussed.


Introduction
Useful predictability of deterministic weather forecasts is usually no more than 2 weeks, limited by the sensitivity to the atmospheric initial state, while longer memory from ocean heat content plays a dominant role in the climate predictability on seasonal and longer time scales (e.g., Lorenz 1963, 1975; Shukla 1985; Lorenz 1993). There is a gap between the two time scales of weather and climate predictions, where inertia in the land surface, such as soil moisture, snow, and vegetation states, can provide a source of predictability (Dirmeyer et al. 2015, 2018b).
Land surface memory relevant to subseasonal to seasonal prediction is typically defined based on anomalies of soil moisture. Since soil moisture anomalies in nature can persist from a week up to two months or more (Vinnikov et al. 1996; Entin et al. 2000; Mahanama and Koster 2003; Seneviratne et al. 2006), the influence of soil moisture anomalies on atmospheric variability (viz., surface air temperature and precipitation) has been explored using climate models in many previous studies (e.g., Shukla and Mintz 1982; Delworth and Manabe 1989; Hong and Kalnay 2000; Douville et al. 2001; Wu and Dickinson 2004; Koster et al. 2004, 2006; Guo et al. 2006; Dirmeyer and Halder 2016, 2017; Dirmeyer et al. 2018b; Halder et al. 2018). Soil moisture anomalies modulate near-surface air temperature: positive moisture anomalies in soil can give rise to more evaporation, which leads to increased evaporative cooling of the surface and decreased sensible heating of the overlying air, and vice versa. Therefore, positive (negative) soil moisture anomalies result in cooling (warming) of the lower troposphere, i.e., negative feedback between the soil moisture anomaly and air temperature tendency (Fischer et al. 2007; Koster et al. 2009a).
If a coupled forecast model is able to reasonably capture the land-atmosphere coupling in nature with realistically initialized land surface (soil moisture) states, the contribution of soil moisture memory to enhanced predictability of the atmospheric system might be realized. Consequently, studies in the past have attempted to quantify the impact of realistic soil moisture initialization on subseasonal and seasonal prediction skill (e.g., Fennessy and Shukla 1999; Dirmeyer 2000; Douville 2004, 2010; Koster et al. 2010, 2011; Guo et al. 2011; van den Hurk et al. 2012; Materia et al. 2014; Prodhomme et al. 2016; Dirmeyer and Halder 2016, 2017; Dirmeyer et al. 2018b; Halder et al. 2018). Model fidelity in representing coupled land-atmosphere processes is also necessary, including proper simulation of variability, covariability, sensitivity, and critical transitions in the chain of processes linking land surface states to surface fluxes, near-surface atmospheric states, boundary layer characteristics, cloud formation, and precipitation (Dirmeyer and Halder 2017; Santanello et al. 2018).
Realistic soil moisture initialization has been standard practice in coupled climate forecast systems for about a decade (e.g., Vitart et al. 2008). Nonetheless, land surface initial states in many current weather and climate forecast systems are far from perfect (Vitart et al. 2017), mainly because the land surface, unlike the atmosphere and ocean surface, lacks operational near-real-time monitoring. Satellite data assimilation shows great promise to address this shortcoming (Carrera et al. 2015; Al-Yaari et al. 2017; Reichle et al. 2019). A number of recent studies have investigated weather and climate models' ability to accurately represent various aspects of land-atmosphere coupled processes in nature using much improved datasets of land surface states in terms of their spatial and temporal coverage and quality (e.g., Trigo et al. 2015; Levine et al. 2016; Dirmeyer et al. 2016, 2018a).
In this paper, we introduce "identical twin" sets of 32-yr (1979-2010) reforecasts initialized with land initial states based on two independent observation-based land surface analyses but with the same initial states for other components such as atmosphere, ocean, and sea ice. One land surface analysis is from the National Centers for Environmental Prediction (NCEP) Climate Forecast System (CFS) Reanalysis (CFSR; Saha et al. 2010) and the other is the National Aeronautics and Space Administration (NASA) Global Land Data Assimilation System Version 2.0 (GLDAS-2) analysis (Rodell et al. 2004; Rodell and Beaudoing 2015; Rui and Beaudoing 2015). Using these identical-twin sets of CFSv2 reforecasts, we investigate the uncertainty of soil moisture states between the two land surface analyses for 32 years (1979-2010) and quantify its impact on prediction skill and predictability of near-surface air temperature on subseasonal time scales. We compare two different "realistic" land initializations in this study, whereas previous studies have compared realistic initialization to more idealized (e.g., climatological) land initial conditions. This study sheds light on how uncertainties in land initialization may affect forecast skill. In our companion paper, we specifically examine the sensitivity of U.S. drought prediction skill to land initial states (Shin et al. 2020).
Section 2 describes the coupled model used in this study and the identical-twin experiment design in detail. Evaluation of 2-m air temperature prediction skill is presented in section 3, and soil moisture uncertainty between the two land surface analyses and its influence on 2-m air temperature prediction as a function of lead time are presented in section 4. A summary and discussion are given in section 5.

Model and identical-twin experiments
CFS version 2 (CFSv2) is a fully coupled dynamical climate system that has been used for operational seasonal prediction at NCEP (Saha et al. 2014). The atmospheric model of the CFSv2 is a lower resolution version of the Global Forecast System (GFS), which has a spectral horizontal resolution of T126 (equivalent to about 1° grid spacing) and 64 vertical levels. The oceanic component is the Geophysical Fluid Dynamics Laboratory (GFDL) Modular Ocean Model version 4 (MOM4; Griffies et al. 2004). It has 40 vertical levels and a 0.5° horizontal grid spacing poleward of 30° latitude, increasing to 0.25° within 10° latitude of the equator. The sea ice component is a three-layer global interactive dynamical sea ice model with predicted fractional ice cover and thickness (Winton 2000), while the land surface component is the Noah land surface model (LSM) version 2.7.1 (Ek et al. 2003) with four soil layers whose interfaces lie at depths of 0.1, 0.4, 1.0, and 2.0 m. The version of the model used in this study follows the revisions described in Huang et al. (2015).
The Center for Ocean-Land-Atmosphere Studies (COLA) has recently produced a 60-yr (1958-2017) set of CFSv2 ensemble reforecasts of 12-month duration initialized at the beginning of January, April, July, and October (Huang et al. 2017, 2019). For the whole 60-yr period, the ocean initial states came from the instantaneous restart files of the ECMWF Ocean Reanalysis System 4 (ORA-S4) with a set of five-member ensemble assimilation runs (Balmaseda et al. 2013). After 1979, the atmosphere, land, and sea ice initial states were taken from the restart files of the CFSR. Twenty-member ensemble reforecasts were generated by matching each of the five ocean initial states at 0000 UTC on the first of each initial month with the atmospheric and land initial states at 0000 UTC of the first four days, but the same sea ice initial state at 0000 UTC on the first was used for all ensemble members. More details of the initialization procedure for the whole 60-yr period can be found in Huang et al. (2017, 2019), which also examined the prediction skill and predictability of ENSO for 1958-2014 and U.S. seasonal precipitation for 1958-2017, respectively.
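The member-matching rule described above (five ocean states crossed with four atmosphere/land start days, one shared sea ice state) can be sketched as follows. This is a minimal illustration only: the labels and the `build_members` helper are hypothetical and do not correspond to any actual CFSv2 workflow script.

```python
from itertools import product

# Hypothetical labels: five ORA-S4 ocean states valid at 0000 UTC on day 1,
# and atmosphere/land states valid at 0000 UTC on days 1-4 of the initial month.
ocean_members = [f"ocean_{i}" for i in range(1, 6)]
atmos_land_days = [1, 2, 3, 4]
sea_ice_day = 1  # the same sea ice state is shared by all members


def build_members(oceans, days, ice_day):
    """Pair every ocean state with every atmosphere/land start day."""
    return [
        {"ocean": o, "atmos_land_day": d, "sea_ice_day": ice_day}
        for o, d in product(oceans, days)
    ]


members = build_members(ocean_members, atmos_land_days, sea_ice_day)
print(len(members))  # 5 ocean states x 4 start days = 20 members
```

Crossing the two lists rather than pairing them one-to-one is what yields 20 distinct members from only nine distinct component states.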
As a companion experiment to the original 60-yr CFSv2 reforecasts with respect to land initialization, we more recently completed a set of CFSv2 reforecasts for 1979-2010 using land initial states based on the NASA GLDAS-2 dataset, which are referred to as the GLDAS reforecasts. The 20-member ensemble GLDAS reforecasts were initialized in early January, April, July, and October, and the integration length of each is 12 months. This new set of GLDAS reforecasts complements the 60-yr CFSv2 reforecasts in two ways. For the set of 60-yr CFSv2 reforecasts, the land initial states were adapted from the NASA GLDAS-2 analysis before 1979 and from the NCEP CFSR after 1979. Therefore, its pre-1979 runs and the new GLDAS runs can be combined into a continuous 53-yr (1958-2010) set of reforecasts with GLDAS land initialization. More importantly, the original CFSv2 reforecasts initialized with NCEP CFSR land states for the common period of 1979-2010 (32 years) have the same initial conditions as the new GLDAS reforecasts except for the land states (hereafter, they are referred to as the CFSR reforecasts), forming a pair of identical-twin experiments.
The Noah LSM (Ek et al. 2003), having the same vertical soil layers (0-10, 10-40, 40-100, and 100-200 cm) as in CFSv2, was used to generate both the CFSR land surface analysis at T126 spectral spatial resolution (~1°) and the GLDAS-2.0 data at 1° × 1° spatial resolution. For the CFSR land surface analyses, the Noah LSM was modified to have the identical setup as in the fully coupled CFS-Noah LSM, which has 13-category SiB vegetation classes, 9-category Zobler soil types, and associated vegetation and soil parameters (cf. Saha et al. 2010). The same modified Noah LSM was used to prepare the GLDAS-2.0 data, but utilizing the 20-category modified IGBP-MODIS vegetation classes and the STATSGO-FAO 16-category soil texture classes (Rui and Beaudoing 2015; H. Beaudoing 2016, personal communication). The state variables used for initialization of the land surface in CFSv2 were soil moisture and temperature at the standard LSM model layers, snow liquid water equivalent, skin temperature, and canopy water storage (see ftp://ftp.emc.ncep.noaa.gov/mmb/gcp/ldas/noahlsm/ver_2.7.1). For each day of our model experiments, the state variables were interpolated from their native grid to the T126 reduced Gaussian grid of the CFSv2 model using the nearest neighbor approach.
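As an illustration of the nearest neighbor approach, the sketch below regrids a field between two rectilinear latitude-longitude grids using NumPy. It is a simplification under stated assumptions: the actual target is a reduced Gaussian grid, which is not rectilinear, and the function names here are hypothetical rather than taken from the CFSv2 preprocessing code.

```python
import numpy as np


def nearest_index(src, dst, period=None):
    """Index of the nearest source coordinate for each target coordinate."""
    diff = np.abs(dst[:, None] - src[None, :])
    if period is not None:
        # Longitudes wrap around, so 350 deg is only 10 deg away from 0 deg.
        diff = np.minimum(diff, period - diff)
    return diff.argmin(axis=1)


def regrid_nearest(field, src_lat, src_lon, dst_lat, dst_lon):
    """Nearest-neighbor regrid of a (lat, lon) field on rectilinear grids."""
    i = nearest_index(src_lat, dst_lat)
    j = nearest_index(src_lon, dst_lon, period=360.0)
    return field[np.ix_(i, j)]


# Tiny example: a 3 x 3 source field sampled at a single target point.
src_lat = np.array([-1.0, 0.0, 1.0])
src_lon = np.array([0.0, 120.0, 240.0])
field = np.arange(9.0).reshape(3, 3)
out = regrid_nearest(field, src_lat, src_lon, np.array([0.4]), np.array([350.0]))
```

Nearest-neighbor selection copies source values unchanged, so it never produces soil moisture values that the source analysis did not contain, at the cost of a blockier field than bilinear interpolation would give.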
We focus on ensemble mean prediction in this study. Operational forecasts are usually based on the ensemble mean, so it is more representative of how operational forecasts would be affected by differences in land initialization. In the remainder of the paper, any variable from the identical-twin experiments (i.e., the CFSR reforecasts and GLDAS reforecasts) refers to its own 20-member ensemble mean prediction. We will also introduce all-inclusive 40-member ensemble mean predictions in the following section, which are referred to as "Grand Ensemble (GE) reforecasts."
For verification, NOAA Climate Prediction Center (CPC) 0.5° × 0.5° global daily 2-m air temperature is used from 1 January 1979 to 31 December 2010, which is provided by the NOAA/OAR/ESRL PSD, Boulder, Colorado, from their website at https://www.esrl.noaa.gov/psd/. These data were built upon a gridded monthly climatology of CRU (Climate Research Unit, University of East Anglia, United Kingdom), which may be replaced with PRISM over regions where PRISM is available. A gridded analysis of temperature anomalies was derived by interpolating GTS (Global Telecommunication System) station values through the Shepard algorithm, a distance-weighting technique with directional correction. Finally, gridded analyses of total temperature were computed by adding the anomaly to the CRU climatology (ftp://ftp.cpc.ncep.noaa.gov/precip/PEOPLE/wd52ws/global_temp/CPC-GLOBAL-T.pdf). More details about the data, including maps of typical station distribution, can be found there. As one may expect, these verification data are far from ideal and, therefore, are associated with some uncertainty.

Evaluation of 2-m air temperature prediction skill in the identical-twin experiments
We validate prediction skill of weekly mean 2-m air temperature in both the CFSR and GLDAS reforecasts over 32 years (1979-2010). Figures 1-4 display global anomaly correlation maps of 2-m air temperature from week 1 to week 4 for the reforecasts starting from early January, April, July, and October, respectively. Note that week 1 covers the fourth to the tenth of each initial month, week 2 represents the 7-day mean from the eleventh to the seventeenth of each month, and so on. In both the CFSR and GLDAS reforecasts with January initial conditions (ICs), the correlation skill at week 1 is good with statistical significance over almost the entire globe, yet, as expected, the skill degrades quickly with increasing lead time (left and center columns of Fig. 1). At weeks 3 and 4, the skill in both sets of reforecasts decreases relatively faster over Europe, Russia, the southern United States, northwestern Africa, and Australia (left and center columns of Figs. 1c,d).
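A minimal sketch of the two steps behind these maps, weekly averaging and the anomaly correlation across years, is given below. It assumes that each week is seven consecutive days beginning on day 4 of the initial month and that anomalies are taken relative to each dataset's own mean over the years supplied; the exact anomaly definition used for the figures is not specified here, and the function names are hypothetical.

```python
import numpy as np


def weekly_means(daily, first_day=4, n_weeks=4):
    """Average daily values into consecutive 7-day weeks.

    daily: array whose axis 0 is the day of the forecast, with index 0 = day 1.
    Week 1 covers days 4-10, week 2 covers days 11-17, and so on.
    """
    start = first_day - 1
    return np.array(
        [daily[start + 7 * w : start + 7 * (w + 1)].mean(axis=0) for w in range(n_weeks)]
    )


def anomaly_correlation(forecast, observed):
    """Correlation of forecast and observed anomalies over years (axis 0)."""
    fa = forecast - forecast.mean(axis=0)
    oa = observed - observed.mean(axis=0)
    num = (fa * oa).sum(axis=0)
    den = np.sqrt((fa ** 2).sum(axis=0) * (oa ** 2).sum(axis=0))
    return num / den
```

Applied per grid cell to 32 years of week-k ensemble mean forecasts and the matching verification data, `anomaly_correlation` returns one map per lead week.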
For longer lead forecasts, a skill discrepancy between the CFSR and GLDAS reforecasts is apparent, especially at week 4. The CFSR reforecasts show higher skill over the Middle East than the GLDAS reforecasts, whereas the latter display higher skill than the former over regions such as Canada and the northern United States (including Alaska), northern China and Mongolia, eastern Russia, and eastern Australia (left and center columns of Fig. 1d). It is interesting to note that over those regions where the identical-twin experiments exhibit quite different skill, the spatial distribution of prediction skill in the GE reforecasts tends to be very close to that of the set of reforecasts with the better skill (right column of Fig. 1).
General features seen in Fig. 1 are also commonly found in the identical-twin experiments starting from early April, July, and October (Figs. 2-4). That is, correlation skill of both the CFSR and GLDAS reforecasts is reasonably good up to week 2 over most of the globe and then continuously decreases, although the rate of degradation varies by season and location for each set of reforecasts. Overall, skill in the reforecasts with July and October ICs declines faster at weeks 3 and 4, with smaller spatial coverage of statistical significance (Figs. 3c,d and 4c,d), compared to the other seasons.
For each starting month, we highlight some areas where the skill discrepancy between the CFSR and GLDAS reforecasts is largest. At weeks 3 and 4 of the April IC runs, the GLDAS reforecasts perform better than the CFSR ones over southern Africa and the Canadian Shield, but the opposite is true over Kazakhstan and northern Russia (left and center columns of Figs. 2c,d). For the July IC runs, the CFSR reforecasts show higher skill at week 3 over the western United States, Alaska, and northeastern Russia, whereas the GLDAS reforecasts have higher skill over the northern United States and central Africa (left and center columns of Fig. 3c). In boreal fall, the CFSR reforecasts still show statistically significant correlation skill at week 4 over northeastern Europe, Mongolia, and northern China in contrast to the GLDAS reforecasts, whereas the latter display better skill over Alaska and Canada than the former (left and center columns of Fig. 4d). It is noteworthy that over almost all of North America, the correlation of 2-m air temperature in the CFSR reforecasts is statistically insignificant at week 3 and becomes negative at week 4, showing much faster degradation of skill than the GLDAS reforecasts, although skill in the latter is also not good at week 4.
It is clearly seen that the better performer between the CFSR and GLDAS reforecasts varies area by area and changes by season even in the same region. However, predictive skill in the GE reforecasts always tends to be as good as that of the better performer in regions where a skill difference between the two sets of reforecasts is noticeable, as we described above for each starting month (Figs. 2-4). To substantiate this argument, we focus further on North America at week 3 for all seasons (Fig. S1 in the online supplemental material). The CFSR reforecasts are the better performer over the southeastern United States and the northern Great Plains for January ICs, the Midwest and Alaska for April ICs, and the western U.S. coastal region and the southwestern and southeastern United States for July ICs. On the other hand, the GLDAS reforecasts show better skill than the CFSR reforecasts, for example, over the southern Great Plains and the southwestern United States for January ICs, most of Canada and the southeastern United States for April ICs, the northern United States for July ICs, and almost all of North America for October ICs. It is also confirmed that predictive skill in the GE reforecasts looks nearly equivalent to that of the better performer over North America, although the better performer changes with region and season (Fig. S1).
For a more quantitative comparison, we display percentages of the land grid cells with statistically significant skill over the globe (60°S-70°N) from week 1 to week 8 in Fig. 5. Both the CFSR and GLDAS reforecasts have significant skill over more than 90% of the global land grid cells at week 1 for all four ICs (orange and blue bars in Fig. 5), and over more than about 80% at week 2 for January and April ICs and about 70% at week 2 for July and October ICs, which are far above the percentage from a persistence forecast (black curves in Fig. 5).
Here, we define a persistence forecast as the anomaly of the initial states continued throughout the forecast lead time. For January ICs, percentages gradually decrease as lead time increases and fall below 30% from week 5, but remain at least about 10% above that of the persistence forecast (Fig. 5a). Percentages for April ICs are below 30% from week 4, and the extra skill relative to the persistence forecast is marginal from week 7 for both the CFSR and GLDAS reforecasts (Fig. 5b). Percentages rapidly drop from week 3 for both July and October ICs; they are below 20%, with little extra skill relative to the persistence forecast, from week 4 for July ICs but only from week 6 for October ICs (Figs. 5c,d).
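The percentage metric and the persistence baseline can be sketched as below. The significance test is an assumption made for illustration, since the text does not state which test or level was used: a two-sided Student's t test at the 95% level with 30 degrees of freedom (32 years), for which the critical t value is about 2.042. Grid cells are counted without area weighting, following the "percentage of land grid cells" wording. All function names are hypothetical.

```python
import numpy as np


def critical_r(n_years, t_crit=2.042):
    """Critical correlation for a t test; t_crit=2.042 assumes 30 degrees of freedom."""
    return t_crit / np.sqrt(n_years - 2 + t_crit ** 2)


def significant_percentage(corr, land_mask, n_years=32):
    """Percentage of land cells whose correlation exceeds the critical value."""
    significant = corr[land_mask] > critical_r(n_years)
    return 100.0 * significant.mean()


def persistence_forecast(initial_anomaly, n_leads):
    """Hold the initial anomaly unchanged at every forecast lead."""
    return np.repeat(initial_anomaly[None, ...], n_leads, axis=0)
```

With 32 years, `critical_r(32)` is roughly 0.35, so only cells with a correlation above that value count toward the percentage in this sketch.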
More importantly, differences in the percentage of significant area between the CFSR and GLDAS reforecasts are also evident (orange and blue bars in Fig. 5). For instance, the GLDAS reforecasts are significant over a greater area than the CFSR reforecasts at weeks 2, 4, and 5 for January ICs, at week 3 for April ICs, at weeks 4 and 5 for July ICs, and at weeks 5 and 6 for October ICs. In contrast, significant areas of the CFSR reforecasts are greater than those of the GLDAS reforecasts at weeks 3, 7, and 8 for January ICs, at week 4 for April ICs, at weeks 3, 6, and 7 for July ICs, and at weeks 3 and 8 for October ICs. This demonstrates that the prediction skill of 2-m air temperature in CFSv2 is sensitive to the land initial states on subseasonal time scales.
It is noteworthy that percentages of the GE reforecasts (gray bars in Fig. 5) are equivalent to or even greater than the higher of the two sets of reforecasts for almost all lead times and all seasons. This suggests that a forecast ensemble based on multiple land surface analyses may reduce the impact of uncertainty in land initial states, resulting in better surface temperature prediction skill and reliability. This is similar to previous results showing that compositing of model soil moisture analyses improves skill over individual models (Guo et al. 2007) and that multiocean analyses ensemble initialization leads to better sampling of uncertainty in ocean initial states, which improves predictive skill and reliability of ENSO and associated Asian summer monsoon rainfall forecasts (Zhu et al. 2012, 2013; Shin et al. 2019).
We also analyzed a new ensemble, "GE_reduced," that has the same number of ensemble members as each of the two sets of reforecasts (yellow bars in Fig. 5). It was constructed by randomly taking 10 ensemble members from the set of reforecasts with CFSR land ICs and 10 members from the other set with GLDAS land ICs. The GE_reduced shows higher percentages than both CFSR and GLDAS for almost all lead weeks for April and October ICs (Figs. 5b,d) and for more than half of all eight lead weeks for January ICs (i.e., weeks 1-3, 5, and 6 in Fig. 5a) and July ICs (i.e., weeks 1-3 and 7 in Fig. 5c). Therefore, this demonstrates that the best performance of the GE reforecasts is not simply due to a larger ensemble (i.e., 40 members versus 20 members) but mainly due to sampling multiple land surface analyses, which is also evident in the spatial pattern of predictive skill over North America at week 3 for all seasons (two right columns in Fig. S1).
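The construction of a size-matched mixed ensemble can be sketched as follows. This is an illustration under assumptions: the text does not say whether the random draw was made once or repeated per start date, so the sketch draws once with a fixed seed, and `reduced_grand_ensemble` is a hypothetical name.

```python
import numpy as np


def reduced_grand_ensemble(cfsr, gldas, n_each=10, seed=0):
    """Mean of n_each randomly chosen members from each set (member axis = 0)."""
    rng = np.random.default_rng(seed)
    i = rng.choice(cfsr.shape[0], size=n_each, replace=False)
    j = rng.choice(gldas.shape[0], size=n_each, replace=False)
    combined = np.concatenate([cfsr[i], gldas[j]], axis=0)
    return combined.mean(axis=0), combined.shape[0]


# Toy data: 20 members per set on a 3-cell grid.
cfsr_demo = np.zeros((20, 3))
gldas_demo = np.ones((20, 3))
ge_reduced_mean, n_members = reduced_grand_ensemble(cfsr_demo, gldas_demo)
```

Because the result has 20 members, any skill gain over the individual 20-member sets cannot be attributed to ensemble size alone.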

Effect of soil moisture uncertainty between two different land initial states on surface air temperature prediction
In this section, we examine the differences in soil moisture between the NOAA CFSR and NASA GLDAS-2 land surface analyses and, ultimately, their effect on the prediction of 2-m air temperature anomalies on subseasonal time scales. Figure 6 shows the spatial distribution of volumetric soil moisture differences in the first 10 cm below the surface and their standard deviation for 1979-2010. For January ICs, blue areas in the north are clearly separated from red areas in the south (left panel of Fig. 6a), indicating that the CFSR land surface analysis is wetter than GLDAS-2 north of about 30°N but drier to the south. This north-south contrast of 0-10-cm soil moisture is still evident for April and October ICs, but the boundary between the blue and red areas marches northward for April ICs, especially over North America and Europe, moves farther north toward the polar regions for July ICs, and then moves back southward for October ICs (left panels of Fig. 6). In general, relatively larger year-to-year variations of soil moisture inconsistencies between the two land initial states seem to be related to the melting of snowpack in the extratropics of the winter hemisphere and in mountainous areas such as the Himalayas and Andes (right panels of Fig. 6). Inconsistencies in soil moisture and snow cover are primarily driven by the precipitation forcing in the two land surface analyses, but the other differences in soil and vegetation classes and associated parameters may also be responsible. The estimates of soil moisture at 0-10 cm in the NOAA CFSR land ICs and the NASA GLDAS-2 land ICs are thus quite different for all seasons, even in their 32-yr climatologies, suggesting large uncertainty of soil moisture initial states each year between the CFSR and GLDAS reforecasts.
The temporal and spatial evolution of the climatological difference of predicted soil moisture (0-10 cm) between the CFSR and GLDAS reforecasts is presented in Fig. 7. For January ICs, the spatial coverage and magnitude of the blue area in the land ICs show little change with lead time, whereas those of the red area in the land ICs rapidly reduce after week 1 (Fig. 7a versus Fig. 6a). Therefore, the CFSR reforecasts maintain wetter soil conditions than the GLDAS reforecasts to the north of about 30°N from the initial states to week 8. For April ICs, it is interesting to see the enhancement and northward expansion of the red area over eastern Canada as well as the gradual northeastward propagation and expansion of the red area over the Eurasian continent from northern Europe to Russia continuously up to week 8 (Fig. 7b and right column of Fig. 11). This is associated with the time gap of snowmelt between the two reforecasts, which is discussed below.
In boreal summer, on the other hand, the Northern Hemisphere (NH) displays little difference of predicted soil moisture at 0-10 cm as lead time increases, although the CFSR reforecasts consistently predict lower (higher) soil moisture content over the eastern United States (Kazakhstan), compared to the GLDAS reforecasts (Fig. 7c). This is probably because the persistence time scale of soil moisture in the summer hemisphere is shorter due to larger insolation, so forecasts from the two different analyses converge faster to the model climatology. For October ICs, the CFSR reforecasts tend to predict higher soil moisture (blue color) in high latitudes of the NH (e.g., north of 60°N) as seen in the initial states, while the GLDAS reforecasts show wetter soil conditions (red color) over the eastern United States, northern India, and eastern China, although the magnitude becomes smaller with lead time (Fig. 7d). The spatial distribution of predicted soil moisture difference at week 5 tends to persist up to week 8 (not shown) except in the extratropics of the NH for April ICs. It is also clearly seen that the GLDAS reforecasts predict wetter soil than the CFSR reforecasts in the Southern Hemisphere (SH) for almost all seasons (Fig. 7).
We next analyze how 2-m air temperature and soil moisture anomalies are associated with each other in the climate forecast system (CFSv2). Figure 8 shows correlation maps between 2-m air temperature and soil moisture anomaly (0-10 cm) in the CFSR reforecasts for 1979-2010, and it is evident that they are overall negatively correlated with each other. The drier the land is, the warmer the surface air temperature, and vice versa. In the NH for January ICs, the negative correlation intensifies with lead time over the United States and Europe, where soil moisture differences between the CFSR and GLDAS reforecasts are less severe. On the other hand, little correlation is found over Canada and a majority of the Eurasian continent, where soil moisture differences appear larger but snow cover decouples soil moisture from the atmosphere (Fig. 7a versus Figs. 6a and 8a). Negative correlation in the NH seems to peak for April ICs and is still relatively strong for July ICs (Figs. 8b,c). For October ICs, little correlation is also found in high latitudes of the NH, where dark blue areas predominate in Figs. 6d and 7d (Fig. 8d). In the extratropics of the SH, negative correlations between 2-m air temperature and soil moisture anomalies seem to be strongest for October ICs and weakest for July ICs. The correlation maps in Fig. 8 may indicate the spatial pattern of land-atmosphere coupling strength and its change with lead time in the CFSR reforecasts, which, in addition to soil moisture memory, can determine how significant the impact of underlying soil moisture anomalies on atmospheric predictability is (Guo et al. 2011). Note that the patterns of correlation in Fig. 8 are generally similar to those of the GLDAS reforecasts (not shown).
To examine more quantitatively the contributions of soil moisture differences in the land ICs to 2-m air temperature prediction, we first divide all land grid cells (60°S-70°N) into five equal-size subsets, based on absolute values of the soil moisture differences between the two land surface analyses in the left panels of Fig. 6. As a result, each grid cell is assigned to one of the five groups according only to the magnitude of the initial soil moisture difference without regard to its sign. In Fig. 9, the 5th quintile of soil moisture difference for each month is above the red curve, the 4th quintile is between the red and orange curves, the 3rd quintile is between the orange and green curves, the 2nd quintile is between the green and blue curves, and the 1st quintile is below the blue curve. The initial soil moisture differences are largest in the January land ICs, second largest in the April ICs, and smallest in the July ICs.
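The grouping step can be sketched with NumPy quantiles as below. This is a minimal illustration assuming the soil moisture difference and a land mask are available as 2-D arrays; `quintile_labels` is a hypothetical helper, and ties at a quantile edge are assigned to the upper group.

```python
import numpy as np


def quintile_labels(sm_diff, land_mask):
    """Label land cells 1-5 by the magnitude of sm_diff; non-land cells get 0."""
    magnitude = np.abs(sm_diff[land_mask])  # the sign is deliberately discarded
    edges = np.quantile(magnitude, [0.2, 0.4, 0.6, 0.8])
    labels = np.zeros(sm_diff.shape, dtype=int)
    labels[land_mask] = np.searchsorted(edges, magnitude, side="right") + 1
    return labels


# Toy example: ten land cells with differences of alternating sign.
demo_diff = np.array([[-1.0, 2.0, -3.0, 4.0, -5.0], [6.0, -7.0, 8.0, -9.0, 10.0]])
demo_labels = quintile_labels(demo_diff, np.ones_like(demo_diff, dtype=bool))
```

Label 5 marks the cells with the largest absolute initial difference, regardless of which analysis is wetter.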
We hypothesize that a greater difference in initial soil moisture results in a larger divergence of predicted 2-m air temperature between the two sets of reforecasts. Thus, we calculate the absolute value of the climatological difference of predicted 2-m air temperature between the CFSR and GLDAS reforecasts at each land grid cell and then average over all grid cells of each quintile from week 1 to week 8 (Fig. 10). The colored curves in Fig. 10 are well stratified for all ICs from the bottom to the top, that is, from the 1st quintile with the smallest values to the 5th quintile with the largest values, supporting our hypothesis. One exception is for week 5 to week 7 of January ICs (Fig. 10a). It should be noted that although the differences of soil moisture between the two land surface analyses are greatest for January ICs (Fig. 9), the largest difference of predicted 2-m air temperature between the CFSR and GLDAS reforecasts appears for April ICs (Fig. 10b). For instance, the predicted temperature difference of the 5th quintile at week 1 is close to 1.0°C for April ICs, about 0.8°C for January and October ICs, and about 0.7°C for July ICs. For January ICs, the value of the red curve decreases relatively more quickly as lead time increases and becomes smaller than that of the 4th quintile (orange curve), breaking the order of vertical color arrangement (i.e., the one exception mentioned above). This is because for January ICs, the grid cells that lie in the 5th quintile largely show little correlation between overlying air temperature and underlying soil moisture anomalies, particularly in the NH (Fig. 8a), although a quite large difference of predicted soil moisture seems long-lasting there, mainly under the snow cover (Fig. 7a).
FIG. 9. The 32-yr mean difference of volumetric soil moisture at 0-10 cm (fraction) between the land initial conditions (ICs) of the GLDAS and CFSR reforecasts (GLDAS minus CFSR), averaged over the land grid cells that lie in each quintile, calculated based on its magnitude without regard to its sign, for each starting month. The abscissa is the starting month from January to October. (See the text for more details.)
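The averaging behind curves such as those in Fig. 10 can be sketched as follows: given an integer label (1-5) for each land cell and the climatological 2-m air temperature from each set of reforecasts at every lead week, the absolute difference is averaged within each group. The array layout and the function name are assumptions for illustration, and no area weighting is applied.

```python
import numpy as np


def mean_abs_diff_by_quintile(t2m_a, t2m_b, labels):
    """Mean |t2m_a - t2m_b| per quintile and lead.

    t2m_a, t2m_b: climatological forecasts with shape (lead, lat, lon).
    labels: integer array (lat, lon), 1-5 on land and 0 elsewhere.
    Returns an array of shape (5, lead).
    """
    abs_diff = np.abs(t2m_a - t2m_b)
    return np.stack(
        [abs_diff[:, labels == q].mean(axis=1) for q in range(1, 6)]
    )
```

Taking the absolute value before averaging prevents warm and cold differences within a group from canceling each other.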
The colored curves of the 1st-4th quintiles for July and October ICs are close to each other and have much smaller magnitude (below 0.4°C) than those for January and April ICs. For example, the temperature differences of the 4th quintile (orange curves) for July and October ICs are approximately equal to or even less than those of the 2nd quintile (blue curves) for January and April ICs for almost all lead times up to week 8 (Fig. 10), while the magnitudes of soil moisture differences of the 4th quintile in the two land initial states are larger for July and October ICs than those of the 2nd quintile for January and April ICs (Fig. 9). This suggests that the magnitude of predicted 2-m air temperature on subseasonal time scales in the CFSv2 reforecasts is more sensitive to the initial soil moisture anomalies for January and April ICs than for July and October ICs. This may partly be because most of the land areas in the NH have monsoon-like systems in boreal summer and anomalously wet soil moisture conditions during the postmonsoon season (i.e., October-November). Namely, the atmosphere is insensitive to soil moisture variations in the moist, energy-limited regimes where soil moisture content lies above the critical value below which soil moisture becomes limiting for evapotranspiration (e.g., Koster et al. 2009a; Seneviratne et al. 2010).
As the initial soil moisture difference becomes reduced in intensity with increasing lead time (left panels of Fig. 6 versus Fig. 7), the magnitude of predicted 2-m air temperature difference, particularly for the 5th quintile, also declines with lead time. However, for April ICs, the difference of predicted 2-m air temperature of the 3rd-5th quintiles reintensifies beginning at week 5 (Fig. 10b). In particular, the mean difference of predicted 2-m air temperature at week 8 becomes about 0.95°C with a 0.2°C increase from week 5 for the 5th quintile (red curve of Fig. 10b) and is about 0.8°C, the greatest magnitude for 8-week lead time, for the 4th quintile (orange curve of Fig. 10b).
What causes April (or boreal spring) to become a notable exception? To address this question, we further examine the temporal and spatial evolution of predicted 2-m air temperature difference for April ICs in Fig. 11. The expansion of the orange-red area with its enhanced magnitude from weeks 4 and 6 to week 8 is clearly seen over Canada and Alaska and northeastern Russia, while the blue-purple area over eastern Canada and along a band from northern Europe to central Russia seems to remain the same with lead time (left panels of Fig. 11). If we compare predicted snow depth at week 2 between the CFSR and GLDAS reforecasts (Fig. 12a), it is noticeable that along the yellow curves (0°C of 2-m air temperature in the GLDAS reforecasts, identical to the green curves in Fig. 11), the snow cover of the CFSR reforecasts is thinner than that of the GLDAS reforecasts over northeastern Canada, western Russia, and northern Europe where the CFSR reforecasts are warmer at the surface (blue color in the left panel of Fig. 11a) and drier (red color in the right panel of Fig. 11a) than the GLDAS reforecasts. On the other hand, over Mongolia and northern China, the CFSR reforecasts display thicker snow cover, colder surface air temperature and higher soil moisture than the GLDAS reforecasts (Figs. 11a and 12a). In addition, the snow depth of the GLDAS reforecasts is much thinner (less than 10 cm in some places) over Alaska and northern Canada than that of the CFSR reforecasts, where the former displays warmer temperature at the surface than the latter.
Snow cover plays a role in modulating surface air temperature anomalies for the period of snowpack melting, for example, from boreal spring to summer (e.g., Xu and Dirmeyer 2011). Once the snow starts melting and is gradually thinned, shortwave radiative insolation at the surface continuously increases as albedo is diminished, which results in warming up surface air temperature. Since this radiative (or albedo) effect is sensitive to snow cover change, differences of initial snow cover between the CFSR and GLDAS reforecasts give rise to quite large discrepancies of predicted 2-m air temperature at lead times of weeks 1 and 2. As the yellow curve propagates northward in lead time due to increasing solar insolation during boreal spring, it results in further snowmelt to the south of the yellow curve (Fig. 12); the center of areas with large 2-m air temperature difference also marches poleward in week 4 and week 6 (Figs. 11b,c).

FIG. 12. ... ICs, and (right) the difference between the GLDAS and CFSR reforecasts (GLDAS minus CFSR). Yellow curves denote 0°C of 2-m air temperature in the GLDAS reforecasts, identical to the green curves in Fig. 11.
When the snow cover becomes thin enough, melting is not only accelerated by the radiative (or albedo) effect, but existing differences in soil moisture due to different land initial states can also exacerbate snowmelt, e.g., by rendering dry soils even drier. Negative (positive) soil moisture anomalies lead to less (more) evaporation and therefore more warming (cooling) of the near-surface air during the model forecast, which in turn accelerates (slows down) snowmelt. Consequently, if this positive feedback of land-atmosphere coupling starts earlier in one set of reforecasts over some specific regions, it would more likely lead to increasing departure of predicted 2-m air temperature there, compared to the other set of reforecasts. It is also noteworthy that enhancement of 2-m air temperature differences over southern Canada and Kazakhstan (i.e., orange area) from week 6 to week 8 is accompanied by a strengthening of soil moisture differences (i.e., light blue area) (Figs. 11c,d).
To exclude the positive feedback of land-atmosphere coupling associated with melting snow, we also perform the same analysis as in Figs. 9 and 10 only over the land cells that are not snow covered or frozen, simply by focusing on the areas between 50°S and 35°N, instead of between 60°S and 70°N (Figs. S2 and S3). Compared to those in Fig. 9, the initial soil moisture differences between the two land surface analyses become much reduced for January and April ICs but increase slightly for July and October ICs (Fig. S2), which results in relatively little change of the initial soil moisture differences across seasons. As a consequence, predicted 2-m air temperature discrepancies between the CFSR and GLDAS reforecasts look similar to each other for all seasons including boreal spring, and generally diminish over lead time (Fig. S3). This again confirms that the effect of melting snow in boreal spring causes predictions of 2-m air temperature to diverge again after week 5, as shown in Fig. 10b (i.e., April ICs).
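Restricting the analysis domain as described above amounts to intersecting the land mask with a latitude band before the quintiles are recomputed. The helper below is a hypothetical sketch; its name, the regular latitude-longitude layout, and the inclusive band limits are assumptions for illustration only.

```python
import numpy as np

def band_mask(lat, land_mask, south=-50.0, north=35.0):
    """Land cells with south <= latitude <= north.

    lat       : (nlat,) latitudes in degrees north (negative in the SH).
    land_mask : (nlat, nlon) boolean array, True over land.
    The defaults correspond to the 50°S-35°N band; pass south=-60.0 and
    north=70.0 for the 60°S-70°N domain.
    """
    in_band = (lat >= south) & (lat <= north)
    # Broadcast the latitude test across all longitudes
    return land_mask & in_band[:, None]
```

Indexing the difference fields with this mask before binning would exclude the seasonally snow-covered or frozen high latitudes from every quintile.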
Additionally, we examine soil moisture differences for a deeper layer, down to 1-m depth (i.e., the root-zone soil moisture), between the two land surface analyses and their contributions to discrepancies of 2-m air temperature predictions over the global land grid cells. The north-south contrasts of the climatological root-zone soil moisture between the two land ICs are overall similar to those of the surface layer (0-10 cm) soil moisture for all seasons, with lower interannual variability in the former than the latter (Fig. S4 versus Fig. 6). Compared to the surface layer soil moisture, however, the differences of the root-zone soil moisture increase to the south of about 40°N, while they become much reduced in the high latitudes of the NH, especially for January and April ICs (left of Fig. S4 and left of Fig. 6). This indicates that relative soil wetness (dryness) in the GLDAS-2 land ICs compared to the CFSR land ICs, shown in the red (blue) areas, is more enhanced (less severe) for the deeper layer of 10-100 cm than the surface layer. Compared with the 0-10-cm soil moisture difference (Fig. 9), therefore, the absolute magnitudes of initial soil moisture difference for the deep layer decrease for the 4th and 5th quintiles of January and April ICs, but increase for those quintiles of July and October ICs (Fig. S7).
The initial differences of the root-zone soil moisture (left of Fig. S4) seem to persist for predictions up to week 5 (Fig. S5), in contrast to those for the surface layer that tend to diminish gradually in lead time except in the NH extratropics for January ICs (Fig. 7). In general, predicted 2-m air temperature anomalies are also negatively correlated with predicted anomalies of the root-zone soil moisture (Fig. S6). More importantly, this negative correlation looks similar to that of the surface layer in the SH for all seasons in terms of its magnitude and pattern, while the predicted 2-m temperature is less correlated with the root-zone soil moisture in the NH, especially in the extratropics for April ICs, as compared to the 0-10-cm soil moisture (Fig. S6 versus Fig. 8). This explains the main features of the area-averaged absolute values of predicted 2-m air temperature difference accompanied by the initial soil moisture difference for the deep layer (Fig. S8), which are consistent with what we found in Fig. 10. First, divergence of predicted 2-m air temperature between the two sets of reforecasts is largest at all lead times over the regions where the greatest disparity of soil moisture ICs is exhibited (i.e., red curves in Fig. S8). Second, the influence of the initial soil moisture difference on the predicted 2-m temperature discrepancy generally diminishes over lead time, particularly for the 5th quintile. Last, a noticeable exception is again apparent for April ICs, that is, the difference of predicted 2-m temperature diverges again after week 5 (Fig. S8b)¹ due to the aforementioned positive feedback of land-atmosphere coupling associated with melting snow in boreal spring.

Summary and discussion
We conducted NCEP CFSv2 ensemble reforecasts initialized with two land surface analyses for the period of 1979-2010. The two observation-based land initial states are adapted from the NCEP CFS Reanalysis (CFSR) and the NASA GLDAS-2 analysis, and the 20-member ensemble means of the corresponding reforecasts are referred to as the CFSR and GLDAS reforecasts, respectively. Since atmosphere, ocean, and sea ice initial states are identical for both reforecasts, the discrepancy in predicted 2-m air temperature between the CFSR and GLDAS reforecasts should result solely from the difference between the two land initial states. As a consequence, these identical-twin sets of 32-yr CFSv2 reforecasts enable us to evaluate the effect of the uncertainty in the land initial states on the prediction of the atmospheric surface temperature variability at subseasonal time scales.
We confirm that prediction skill of weekly mean 2-m air temperature is sensitive to the uncertainty in land initial states. When we compare regions with statistically significant skill between the CFSR and GLDAS reforecasts, a skill disparity between the two sets of reforecasts becomes evident from week 3 for all seasons. There is no distinct characteristic that determines which set of reforecasts performs better. Rather, the better performer varies with the lead week and location for each season. It is interesting to note that predictive skill in the grand ensemble reforecasts (i.e., all 40 ensemble members from both land surface initializations) tends to be as good as that of the better 20-member ensemble in regions where a skill difference between the two sets of reforecasts is noticeable for each starting month. Percentages of the land grid cells with statistically significant skill over the globe (60°S-70°N) provide a more quantitative comparison of 2-m air temperature prediction skill from week 1 to week 8 (Fig. 5). One set of reforecasts does not always show a higher percentage than the other. Instead, the higher one varies with lead time for the same starting month and also differs among seasons. Again, percentages of skillful land area in the grand ensemble reforecasts are equivalent to or even greater than the higher ones between the two sets of reforecasts for almost all lead times and all seasons. This suggests that initialization with multiple land surface analyses in a forecast ensemble may reduce the effect of uncertainty in land initial states, resulting in more reliable and better prediction of 2-m air temperature.
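The skillful-area percentage and the grand ensemble comparison can be illustrated with a short sketch. The function names, the array layout, and in particular the significance threshold are assumptions: a two-sided 95% t test on the correlation with 30 degrees of freedom is used here purely for illustration and may differ from the test behind Fig. 5.

```python
import numpy as np

def acc(fcst, obs):
    """Temporal anomaly correlation at each grid cell.

    fcst, obs : (nyear, ncell) arrays; anomalies are taken about the
                multi-year mean of each array.
    """
    fa = fcst - fcst.mean(axis=0)
    oa = obs - obs.mean(axis=0)
    num = (fa * oa).sum(axis=0)
    den = np.sqrt((fa ** 2).sum(axis=0) * (oa ** 2).sum(axis=0))
    return num / den

def percent_skillful(ens, obs, r_crit):
    """Percent of cells whose ensemble-mean ACC exceeds r_crit.

    ens : (nmember, nyear, ncell) ensemble of reforecasts.
    """
    return 100.0 * np.mean(acc(ens.mean(axis=0), obs) > r_crit)

# Assumed threshold: two-sided 95% t test, n = 32 years (df = 30)
T_CRIT = 2.042
R_CRIT = T_CRIT / np.sqrt(30 + T_CRIT ** 2)  # roughly 0.35
```

Under these assumptions, a grand ensemble would be formed by pooling the members of both sets, for example `np.concatenate([ens_cfsr, ens_gldas], axis=0)` with hypothetical arrays `ens_cfsr` and `ens_gldas`, and then passed to `percent_skillful` in the same way as either 20-member set.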
The CFS Reanalysis displays much higher soil moisture in the NH extratropics but lower soil moisture elsewhere for almost all seasons, compared to the GLDAS-2 land reanalysis, and the boundary of the north-south contrast in soil moisture migrates northward from boreal winter to summer and moves back southward from boreal summer to winter (Fig. 6). This indicates that estimates of soil moisture at 0-10 cm and in the root zone (1-m depth) between the two land ICs are indeed quite different even in the 32-yr climatology, implying large uncertainty of soil moisture initial states each year between the CFSR and GLDAS reforecasts. Area-averaged absolute values of predicted 2-m air temperature difference substantiate that over the regions where the greatest disparity of soil moisture ICs is exhibited, divergence of predicted 2-m air temperature between the sets of reforecasts is largest at all lead times up to week 8 (i.e., 5th quintile in Fig. 10).
A greater impact of the uncertainty in 0-10-cm soil moisture on surface air temperature prediction on subseasonal time scales is found for April ICs than for January ICs, even though the latter show the largest magnitude of initial soil moisture differences. This is mainly due to there being little land-atmosphere coupling (i.e., no correlation between soil moisture and 2-m air temperature anomalies) where there is snow cover for January ICs. More importantly, although the discrepancy of predicted 2-m air temperature naturally decreases with lead time as forecasts lose the memory of initial soil moisture and the model drifts toward its climatology, a noticeable exception is found for April ICs, which diverge again after week 5. This feature is also obvious in the influence of the uncertainty in the root-zone soil moisture on surface air temperature prediction. However, the difference of predicted 2-m temperature between the two sets of reforecasts for April ICs shows very similar patterns to those of the other starting months if only the regions that are neither frozen nor under snow cover are considered (Fig. S3).
As solar insolation increases in the NH extratropics in boreal spring, the initial difference of snow cover is responsible for a difference in the timing of snowmelt over some regions, and consequently the radiative (or albedo) effect of snow cover discrepancies gives rise to relatively faster warming of surface air temperature in one set of reforecasts relative to the other set. Once the snow cover melts in one set of reforecasts, the soil moisture-evaporation-surface air temperature feedback sets in, resulting in enhanced differences of predicted 2-m air temperature between the CFSR and GLDAS reforecasts afterward. Therefore, in addition to inconsistencies of soil moisture, uncertainty of snow cover in the land initial states also influences predictability of near-surface air temperature in boreal spring at high latitudes of the NH. This suggests that more effort should be made to reduce the uncertainty of land surface properties among the current land surface analyses, which will be beneficial to improving prediction skill of surface air temperature on subseasonal and seasonal time scales.

¹ Divergence of predicted 2-m temperature discrepancies after week 5 clearly appears for the 2nd and 3rd quintiles (green and blue curves in Fig. S8b), instead of the 4th and 5th quintiles in Fig. 10. This is because the largest differences of predicted 2-m temperature between the CFSR and GLDAS reforecasts for April ICs (left of Fig. 11) are largely located in the purple/dark-blue areas in Fig. 6b (left) but light-blue areas in Fig. S4b (left), roughly corresponding to the 4th and 5th quintiles of the initial difference of the 0-10-cm soil moisture and the 2nd and 3rd quintiles of the root-zone soil moisture difference, respectively.
Last, we note that the CFSR soil moisture ICs will be more consistent with the land surface model climatology in this forecast model than GLDAS-2, mainly because of the use of similar soil and vegetation characteristics and parameters (cf. Saha et al. 2010) as explained in section 2,² even when differences in variability and means are taken into account (cf. Koster et al. 2009b). Due to biases in coupled land-atmosphere feedback processes in the forecast model, the most accurate and realistic soil moisture initialization does not necessarily result in the best forecast. Initial states with errors that compensate for forecast model errors may actually provide better forecasts. This is an unsatisfying strategy, however; land data assimilation is the best way to produce consistent initial states (Al-Yaari et al. 2017), although data assimilation also has limitations in terms of consistency due to the evolution in the observing system (e.g., advent of new satellites). Validation of forecast models regarding their coupled process behavior, becoming possible now due to increases in the availability of the necessary observational data over land (Dirmeyer et al. 2016; Balsamo et al. 2018), can lead to informed model improvements and development, which will further enhance the harvest of predictability from land surface states.

FIG. 1. Anomaly correlation coefficient maps of weekly mean 2-m air temperature for 1979-2010 from (a) week 1 through (d) week 4 in (left) the CFSR reforecasts, (center) the GLDAS reforecasts, and (right) the GE reforecasts with January initial conditions (ICs). Dashed curves denote the 95% confidence level. See the text for more details about the CFSR, GLDAS, and GE reforecasts.

FIG. 5. Percentages of the land grid cells with statistically significant skill of 2-m air temperature over the globe (60°S-70°N) for (a) January ICs, (b) April ICs, (c) July ICs, and (d) October ICs. Blue (orange) bars are for the CFSR (GLDAS) reforecasts and gray (yellow) bars are for the GE (GE_reduced) reforecasts. Black curves represent the percentage of the persistent forecast. The abscissa is the lead time from week 1 to week 8. See the text about the GE_reduced reforecasts.

FIG. 6. (left) The 32-yr mean difference of volumetric soil moisture at 0-10 cm (fraction) between the land initial conditions (ICs) of the GLDAS and CFSR reforecasts (GLDAS minus CFSR) and (right) its standard deviation during the period of 1979-2010 for (a) January ICs, (b) April ICs, (c) July ICs, and (d) October ICs.

FIG. 8. (a) Anomaly correlation coefficient between predicted soil moisture at 0-10 cm and predicted 2-m air temperature in the CFSR reforecasts at (left) week 1, (center) week 3, and (right) week 5 for January ICs. (b)-(d) As in (a), but for April ICs, July ICs, and October ICs, respectively. Note that the correlation coefficients are calculated based on the ensemble mean of each variable.

FIG. 10. (a) Area-averaged magnitude of the 32-yr mean difference of predicted 2-m air temperature (°C) between the GLDAS and CFSR reforecasts (GLDAS minus CFSR) over the land grid cells of each quintile determined in Fig. 9 for January ICs. (b)-(d) As in (a), but for April ICs, July ICs, and October ICs, respectively. The abscissa is the lead time from week 1 to week 8. (See the text for more details.)

FIG. 11. (left) The 32-yr mean difference of predicted 2-m air temperature (°C) between the GLDAS and CFSR reforecasts (GLDAS minus CFSR) for April ICs, and (right) as in the left panels, but for predicted volumetric soil moisture at 0-10 cm (fraction) at (a) week 2, (b) week 4, (c) week 6, and (d) week 8. Green curves denote 0°C of 2-m air temperature in the GLDAS reforecasts at each lead week.
