With their origins in numerical weather prediction and climate modeling, land surface models aim to accurately partition the surface energy balance. An overlooked challenge in these schemes is the role of model parameter uncertainty, particularly at unmonitored sites. This study provides global parameter estimates for the Noah land surface model using 85 eddy covariance sites in the global FLUXNET network. The at‐site parameters are first calibrated using a Latin Hypercube‐based ensemble of the most sensitive parameters, determined by the Sobol method, to be the minimum stomatal resistance (*r*_{s,min}), the Zilitinkevich empirical constant (*C*_{zil}), and the bare soil evaporation exponent (*fx*_{exp}). Calibration leads to an increase in the mean Kling‐Gupta Efficiency performance metric from 0.54 to 0.71. These calibrated parameter sets are then related to local environmental characteristics using the Extra‐Trees machine learning algorithm. The fitted Extra‐Trees model is used to map the optimal parameter sets over the globe at a 5 km spatial resolution. The leave‐one‐out cross validation of the mapped parameters using the Noah land surface model suggests that there is the potential to skillfully relate calibrated model parameter sets to local environmental characteristics. The results demonstrate the potential to use FLUXNET to tune the parameterizations of surface fluxes in land surface models and to provide improved parameter estimates over the globe.

Terrestrial evapotranspiration links the water, energy, and carbon cycles. The incoming solar radiation can heat the land surface, be reflected back into the atmosphere, and evaporate available surface water including ponding water, soil moisture, water in the canopy, and water in the plant stomata. The partitioning of incoming solar radiation into latent heat and sensible heat is of special interest to atmospheric modelers since it is known to play a critical role in the diurnal growth and decay of the planetary boundary layer [*Betts*, ; *Roundy et al*., ; *Santanello et al*., ]. As a result, models used in numerical weather prediction, seasonal forecasting, and climate modeling use land surface models to simulate the water and energy exchange between the land surface and the atmosphere. Over the years, land surface models have grown in sophistication from simple bucket models to include multilayer hydrology, multilayer canopies, and carbon and nitrogen cycling [*Pitman*, ]. However, they continue to struggle to accurately simulate latent heat at subhourly time scales. This can lead to considerable uncertainty when trying to accurately assess and capture the interactions between the land and the atmosphere.

Most global land surface models use evapotranspiration parameterizations that capture bare soil evaporation, wet canopy evaporation, and transpiration. The plant transpiration schemes have evolved significantly over the years, but many continue to rely on big leaf models [*Deardorff*, ]. These parameterizations assume that a single leaf approximates all leaves on a plant; the leaf area index is used to scale up the single leaf to the entire canopy. The most common big leaf model used in land surface modeling is Penman‐Monteith [*Monteith*, ]. This scheme is attractive since it is physically based, computationally efficient, and does not require an estimate of the leaf surface temperature. It is also appealing because it can capture the constraints of environmental conditions on plant transpiration through parameterizations of aerodynamic and canopy resistances. Monin‐Obukhov similarity theory is the most widely used parameterization of aerodynamic resistance [*Monin and Obukhov*, ]. It relates turbulent fluxes (e.g., latent heat) to the differences in mean wind speed, temperature, and humidity between two levels (i.e., the surface and the measurement height). The predominant parameterization of canopy resistance in global land surface models continues to be the Jarvis‐type formulation [*Jarvis*, ; *Noilhan and Planton*, ]. These schemes determine canopy resistance primarily as a product of a minimum canopy resistance and a set of empirical functions that aim to simulate plant stress due to environmental conditions [*Alfieri et al*., ]. These are slowly being replaced in land surface models with more physically based formulations that model canopy resistances as a function of CO_{2} assimilation rate, surface temperature, and ambient temperature [*Ball et al*., ; *Kumar et al*., ].

One of the often overlooked challenges with these parameterizations is prescribing the model parameters. In current global land surface models there can be over 20 parameters that play a role either directly or indirectly in evapotranspiration. Depending on the model sensitivity to the parameters, the parameter estimates can lead to considerable uncertainty in the model results. One of the primary appeals of land surface models is that they are physically based; in theory their parameters can be measured through fieldwork. However, in practice these parameters are hard to define since they can be highly spatially heterogeneous and vary according to climate, environmental conditions, and plant species. Current land surface models rely on outdated lookup tables that relate these model parameters to land cover type and soil type. Previous studies have shown how this practice is insufficient and should be revisited to provide improved estimates of the surface water and energy cycles [*Rosero et al*., ; *Hou et al*., , ].

It should be noted that lookup tables were the best that could be done given the available field data when land surface modeling was in its infancy. However, the wealth of fieldwork and satellite remote sensing products now available suggests that more robust parameter estimation techniques are feasible. One exciting possibility is the use of FLUXNET, a global network of eddy covariance sites [*Baldocchi*, ]. These sites provide not only measurements of the surface fluxes but also the different components that drive the evapotranspiration modules in land surface models. These data also carry uncertainty due to scale mismatch and problems in closing the surface energy budget. However, they are a source of ground truth that should not be ignored since they allow a previously unachievable, robust validation of modeled surface fluxes over the globe.

The primary objective of this study is to use the FLUXNET network of eddy covariance sites to develop a global map of calibrated parameters for the Noah land surface model [*Ek et al*., ]. This goal is accomplished by first performing a Sobol sensitivity analysis at the FLUXNET sites, which identifies the model parameters that drive the model uncertainty. A Latin Hypercube Sample of 1000 parameter sets is then used to thoroughly sample the sensitive parameters and to select the top performing parameter sets per eddy covariance site. Finally, in an attempt to move beyond classic parameter lookup tables, the Extra‐Trees machine learning algorithm is used to develop a relationship between the calibrated parameter sets and readily available global land and meteorological data sets. The final product is a global 5 km product of the three most sensitive model parameters (*r*_{s,min}, *C*_{zil}, and *fx*_{exp}). These results are then analyzed to assess the improvement in model performance after optimization and to assess the physical meaning of the parameters after optimization and regionalization.

FLUXNET is an active global network of meteorological sites composed of over 650 sites in 30 regional networks covering 5 continents. These sites measure the exchange of water vapor, carbon dioxide, and energy between terrestrial ecosystems and the atmosphere using the eddy covariance method [*Baldocchi*, ]. In an effort to provide a data set that can be used by the global Earth science community, the FLUXNET community has harmonized, standardized, and gap filled these sites to create the La Thuile data set. This summary database contains 253 eddy covariance stations with a total of 960 site years of data at a 30 min time resolution [*ORNL DAAC*, ]. These data are invaluable for land surface modeling as they provide high‐quality input data including incoming shortwave and longwave radiation, air temperature, wind speed, specific humidity, and precipitation. More importantly, they provide data to diagnose the land surface model output including latent heat, sensible heat, net radiation, ground heat flux, surface friction velocity, and outgoing longwave radiation, among others. The data for each site include a number of parameters obtained from the 1 km collocated grid cell of the moderate‐resolution imaging spectroradiometer (MODIS) global databases including land cover type and leaf area index. This study uses the free fair‐use subset of the La Thuile data set that contains 154 stations. Figure shows the spatial coverage of the free fair‐use subset of the FLUXNET data set and the number of years of data available per site; the average temporal coverage per site is ~4 years. Most stations are in North America and Europe. The most common land cover types per station are evergreen needleleaf forest, woodland, and cropland. Deciduous needleleaf forests and urban areas do not have stations in this database.

FLUXNET provides a wealth of information to run and validate land surface models. However, uncertainties in these data can lead to misleading conclusions. To ensure the land surface model output is being compared to ground truth, the data are quality controlled to minimize the uncertainties in the 30 min latent heat measurements. The following flags are applied at each site: (1) time steps with no data or low to medium quality gap‐filled estimates of latent heat are discarded; (2) time steps in which the error in energy balance closure is above 10% are discarded; (3) incoming shortwave radiation must be above 150 W/m^{2} to focus on daytime measurements; and (4) the air temperature must be above freezing (to focus on warm season processes). Stations that do not have at least 500 time steps that meet these criteria are discarded. These requirements lead to a reduction in the number of sites from 154 to 85. Although random errors will persist in the data after these constraints, the quality‐controlled database provides high‐quality data to robustly evaluate and improve the performance of the Noah land surface model at a subhourly time step.
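The four screening rules and the 500‐step threshold can be sketched as a boolean mask. This is a minimal illustration, not the authors' code: the function and argument names are hypothetical, the quality flag is assumed to be already reduced to a true/false value per time step, and the energy balance closure error is assumed to be precomputed as a fraction, since the text does not say how it is calculated.

```python
import numpy as np

def quality_mask(le, le_quality_ok, closure_error, sw_down, air_temp_c):
    """Boolean mask of 30 min time steps that pass all four screens."""
    le = np.asarray(le, dtype=float)
    # (1) latent heat must exist and must not be a low/medium quality gap fill
    mask = np.isfinite(le) & np.asarray(le_quality_ok, dtype=bool)
    # (2) energy balance closure error of at most 10% (assumed fractional)
    mask &= np.asarray(closure_error, dtype=float) <= 0.10
    # (3) daytime only: incoming shortwave radiation above 150 W/m^2
    mask &= np.asarray(sw_down, dtype=float) > 150.0
    # (4) air temperature above freezing (assumed to be in degrees Celsius)
    mask &= np.asarray(air_temp_c, dtype=float) > 0.0
    return mask

def site_is_usable(mask, min_steps=500):
    """A site is kept only if enough time steps survive the screens."""
    return int(np.count_nonzero(mask)) >= min_steps

# Synthetic five-step example (not FLUXNET data); only the first step passes.
example_mask = quality_mask(
    le=[100.0, np.nan, 120.0, 80.0, 90.0],
    le_quality_ok=[True, True, False, True, True],
    closure_error=[0.05, 0.05, 0.05, 0.20, 0.05],
    sw_down=[400.0, 400.0, 400.0, 400.0, 100.0],
    air_temp_c=[10.0, 10.0, 10.0, 10.0, 10.0],
)
```

Keeping the mask separate from the site‐level threshold makes it possible to reuse the same mask later when restricting the model evaluation to valid time steps.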

In addition to latent heat measurements, the eddy covariance sites provide meteorological information. Using these data minimizes the uncertainty in the land surface model input data, which keeps the focus on improving the model parameterizations. However, the meteorological forcing data available at each site are rarely temporally continuous and have missing data at multiple time steps. This is a challenge for land surface models since they need to be run in continuous form to accurately capture the temporal dynamics in soil moisture, heat storage, and canopy storage. This requires continuous 30 min data of incoming longwave radiation, incoming shortwave radiation, wind speed, air temperature, precipitation, and specific humidity. We address this challenge by gap filling the missing time steps with the Princeton Global Forcing data set that provides global meteorological information at a 3 h temporal resolution and 1.0° spatial resolution between 1948 and 2010 [*Sheffield et al*., ]. Prior to gap filling, this data set is temporally downscaled to a 30 min resolution using linear interpolation. Furthermore, the data are bias corrected by matching the empirical cumulative distribution function of the data from the site's collocated grid cell to the site's available meteorological data [*Reichle and Koster*, ]. Although this technique removes the bias, it does not remove the random errors and cannot be trusted to provide high‐quality 30 min meteorological information. As such, these bias‐corrected meteorological data are only used to connect time steps that do not have input data to ensure the model can be run in continuous form. The model estimates of latent heat are only compared to the measured latent heat at time steps at which the FLUXNET site provides all the input meteorological data.
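The two preprocessing steps, linear interpolation from 3 h to 30 min and empirical CDF matching, can be sketched as follows. This is an illustration under stated assumptions rather than the study's implementation: the function names are hypothetical, the CDFs are approximated with 101 quantiles, and the approach as written suits continuous variables such as air temperature (a variable with many ties, such as precipitation, would need extra care).

```python
import numpy as np

def downscale_linear(t_coarse_h, values, step_h=0.5):
    """Linearly interpolate a coarse series (e.g., 3 h) onto a finer time step."""
    t_coarse_h = np.asarray(t_coarse_h, dtype=float)
    t_fine = np.arange(t_coarse_h[0], t_coarse_h[-1] + 1e-9, step_h)
    return t_fine, np.interp(t_fine, t_coarse_h, values)

def cdf_match(grid_values, grid_reference, site_reference, n_quantiles=101):
    """Map grid-cell values onto the site distribution by matching empirical CDFs.

    grid_reference and site_reference are samples from the period in which
    both the grid cell and the site have data.
    """
    probs = np.linspace(0.0, 1.0, n_quantiles)
    grid_q = np.quantile(grid_reference, probs)  # grid-cell quantiles
    site_q = np.quantile(site_reference, probs)  # site quantiles
    # A value at a given grid quantile is replaced by the site value
    # at the same quantile (values outside the range are clamped).
    return np.interp(grid_values, grid_q, site_q)

# Synthetic demonstration (not study data): a grid cell that runs 2 degrees warm.
rng = np.random.default_rng(0)
site_temp = rng.normal(15.0, 5.0, size=2000)
grid_temp = site_temp + 2.0
corrected = cdf_match(grid_temp, grid_temp, site_temp)

t_fine, temp_fine = downscale_linear([0.0, 3.0, 6.0], [10.0, 13.0, 16.0])
```

In the synthetic case the constant offset is removed, but any random mismatch between grid cell and site would remain, which is the limitation the text describes.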

Noah is a physically based vertical one‐dimensional land surface model that captures the primary subsurface and surface terrestrial water and energy processes. It has its origins in the Oregon State University (OSU) model, which couples a diurnally dependent Penman equation, a two layer soil hydrologic model, and a simple big leaf canopy model [*Mahrt and Ek*, ; *Mahrt and Pan*, ; *Pan and Mahrt*, ]. During the 1990s, the National Centers for Environmental Prediction (NCEP) expanded its land surface modeling collaborations, transforming the OSU model into the Noah land surface model [*Ek et al*., ]. Over the years, there have been numerous improvements that have led to the current land surface model. In the latter 1990s, upon realizing the role that subgrid soil moisture heterogeneity can play in macroscale estimates of surface fluxes, the runoff generating processes in the model were improved through the implementation of the simple water balance model [*Chen et al*., ; *Schaake et al*., ]. This was followed by an improvement in cold season processes [*Koren et al*., ] and the implementation of a bare soil evaporation formulation suitable for macroscale land surface modeling [*Betts et al*., ], among many other model enhancements. For a comprehensive overview of the model's development, see [*Ek et al*., ].

Given that the primary focus here is to understand the strengths and weaknesses of the evapotranspiration module in the Noah land surface model, this section provides an overview of the module's parameterizations. Evapotranspiration is calculated as the sum of three components: direct evaporation from bare soil (*E*_{b}), wet canopy evaporation (*E*_{c}), and plant transpiration (*E*_{t}). *E*_{c}: The wet canopy evaporation parameterization uses the ratio of the water content (cmc) and the maximum water content possible in the canopy (cmc

The primary source of uncertainty in this formulation is *B*_{c}. In simple terms, it accounts for the environmental controls that limit plants from transpiring at the potential rate [

The *r*_{s,min} parameter is the smallest possible resistance that the plant will impose on transpiration—a lower value means higher maximum transpiration rates. The remaining parameters control how the incoming shortwave radiation limits photosynthesis (i.e., transpiration) (*F*_{1}), how the stomata react to the ambient vapor pressure (*F*_{2}), how the stomata react to ambient temperature (*F*_{3}), and the constraint that soil moisture content plays on transpiration (*F*_{4}).
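The role of *r*_{s,min} and the four stress functions can be illustrated with a short sketch. It assumes the commonly cited Jarvis‐type form in which the canopy resistance equals *r*_{s,min} divided by the product of the leaf area index and the factors *F*_{1} to *F*_{4}, each between 0 and 1; this form, the function name, and the small floor used to avoid division by zero are assumptions, not taken from the text.

```python
def canopy_resistance(rs_min, lai, f1, f2, f3, f4, floor=1e-4):
    """Jarvis-type canopy resistance (s/m), assumed form rs_min / (LAI * F1*F2*F3*F4).

    f1..f4 are stress factors in (0, 1]; 1 means no stress from that control.
    """
    stress = 1.0
    for factor in (f1, f2, f3, f4):
        # Guard against a zero factor, which would give an infinite resistance.
        stress *= max(factor, floor)
    return rs_min / (lai * stress)

# Unstressed canopy versus the same canopy with a soil moisture limitation.
r_unstressed = canopy_resistance(rs_min=40.0, lai=4.0, f1=1.0, f2=1.0, f3=1.0, f4=1.0)
r_dry_soil = canopy_resistance(rs_min=40.0, lai=4.0, f1=1.0, f2=1.0, f3=1.0, f4=0.5)
```

Under this form, halving any single factor doubles the resistance, and *r*_{s,min} scales the result linearly, which is one way to see why an optimizer can use it to absorb biases from the other terms.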

Table provides a summary of all the relevant parameters in Noah's evapotranspiration module, a brief explanation, and the parameter ranges that are commonly found in the literature. These ranges are used in both the Sobol sensitivity analysis and the Latin Hypercube Sample.

As discussed in the previous section, Noah's evapotranspiration estimates are impacted by a large number of uncertain model parameters. Given the limited number of eddy covariance sites in FLUXNET, tuning the nine parameters at each site appears to be ill advised. This is especially true since the end goal is to develop a functional relationship between the optimal parameter sets and a site's local environmental characteristics. A plausible intermediate step is to reduce the number of tunable model parameters by identifying the most sensitive parameters. For the insensitive parameters, we can rely on the model's lookup table values [*Rosero et al*., ]. To determine the sensitivity of the model parameters, we use the Sobol sensitivity analysis. Prior studies have successfully used this technique to discern the role of the Noah model parameters at a limited number of sites [*Rosero et al*., ; *Hou et al*., ].

The Sobol sensitivity analysis [*Sobol*, ] is a global method that decomposes the variance of the model output *Y* into contributions from each parameter *X*_{i} and its interactions with other parameters:
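In the standard form of the method, the decomposition for $k$ parameters can be written as shown here. The symbols $V_i$, $V_{ij}$, and $k$ are assumptions about notation, since they are not defined in the surrounding text.

```latex
V(Y) = \sum_{i=1}^{k} V_i
     + \sum_{i=1}^{k} \sum_{j>i}^{k} V_{ij}
     + \cdots + V_{1,2,\ldots,k},
\qquad
V_i = V_{X_i}\!\left[ E_{X_{\sim i}}\!\left( Y \mid X_i \right) \right]
```

Here $V_i$ is the variance explained by $X_i$ alone, $V_{ij}$ is the additional variance explained by the interaction of $X_i$ and $X_j$, and $X_{\sim i}$ denotes all parameters except $X_i$.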

The first‐order index *S*_{i} represents the expected reduction in variance if parameter

Estimators for *S*_{i} and

After calculating the first‐order and total‐effect sensitivity indices for each site, the results are summarized across all eddy covariance stations to assess the parameter sensitivity across the FLUXNET network.
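Because the estimator equations are not reproduced above, the following is a minimal sketch rather than the authors' implementation. It assumes the widely used two‐matrix sampling design with a Saltelli‐type first‐order estimator and the Jansen total‐effect estimator; `sobol_indices`, `model`, and the toy linear function are hypothetical. In the study, `model` would return the KGE of a Noah run for a given parameter set. This design needs N(k + 2) model runs, so a base sample of 100 with 9 parameters would give 1100 runs, although the text does not confirm that this is how the run count was chosen.

```python
import numpy as np

def sobol_indices(model, lower, upper, n_base=1024, seed=0):
    """Estimate first-order (S1) and total-effect (ST) Sobol indices."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    k = lower.size
    # Two independent uniform sample matrices over the parameter ranges.
    A = lower + (upper - lower) * rng.random((n_base, k))
    B = lower + (upper - lower) * rng.random((n_base, k))
    yA = np.array([model(x) for x in A])
    yB = np.array([model(x) for x in B])
    y_all = np.concatenate([yA, yB])
    var_y = np.var(y_all)
    yB_centered = yB - y_all.mean()  # centering reduces estimator noise
    S1 = np.empty(k)
    ST = np.empty(k)
    for i in range(k):
        AB = A.copy()
        AB[:, i] = B[:, i]  # A with only column i taken from B
        yAB = np.array([model(x) for x in AB])
        S1[i] = np.mean(yB_centered * (yAB - yA)) / var_y  # first-order
        ST[i] = 0.5 * np.mean((yA - yAB) ** 2) / var_y     # Jansen total effect
    return S1, ST

# Toy check on a linear function whose exact indices are 0.8, 0.2, and 0.
def toy_model(x):
    return 2.0 * x[0] + 1.0 * x[1] + 0.0 * x[2]

S1, ST = sobol_indices(toy_model, [0, 0, 0], [1, 1, 1], n_base=4096)
```

For an additive function such as the toy example, the first‐order and total‐effect indices coincide; a gap between them signals parameter interactions.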

Reducing the number of parameters from the original nine parameters simplifies the calibration exercise at each eddy covariance site. To obtain approximately optimal model performance while assessing the role of model parameter equifinality [*Beven*, ], the Latin Hypercube Sampling (LHS) technique [*McKay et al*., ] is used to assess model performance across the reduced model parameter space. This scheme generally outperforms simple random sampling since it more evenly samples the parameter space. LHS accomplishes this goal by splitting the distribution of each parameter (uniform in this study) into *n*_{L} regions of equal probability;

Having calibrated the parameters at the 85 eddy covariance sites in the FLUXNET network, we construct a functional relationship between the optimal parameter sets and each site's local environmental characteristics (e.g., LAI, annual temperature, and canopy height, among others). The end goal is to then use this functional relationship to estimate global maps of the optimized Noah model parameters using available global data sets of the environmental characteristics (see Table ). To represent this likely nonlinear functional relationship, an ensemble of *T* extremely randomized trees is used (Extra‐Trees [*Geurts et al*., ]). This method has been shown to outperform other tree‐based methods including decision tree regressors, bagging of decision trees, and random forests. For more information on extremely randomized trees, see *Geurts et al*. [] and *Galelli and Castelletti* []. The output of Extra‐Trees is the arithmetic average of the output of the *T* extremely randomized trees.

Given the small sample size (85 calibrated parameter sets), instead of splitting the data into a single validation data set and a single training data set, a leave‐one‐out cross validation is performed. To accomplish this goal, the Extra‐Trees model (*T* extremely randomized trees) is fit 85 times—each time, the optimal parameter set for one tower is left out for validation. For this study, each Extra‐Trees model has 100 trees. The final model that results from combining the 85 models is equivalent to an Extra‐Trees model with 8500 trees. The model skill is assessed via the *R*^{2} of the 85 validation estimates of the *C*_{zil}, *r*_{s,min}, and *fx*_{exp} Noah LSM model parameters.
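The leave‐one‐out procedure can be sketched with scikit‐learn's `ExtraTreesRegressor`. The text does not name the software used in the study, so the library choice, the function name, and the synthetic predictors are assumptions. The sketch fits one parameter at a time and uses a small synthetic data set instead of the 85 sites.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import LeaveOneOut

def leave_one_out_predictions(X, y, n_trees=100, seed=0):
    """Fit one Extra-Trees model per held-out site; return held-out predictions."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    held_out = np.empty_like(y)
    models = []
    for train_idx, test_idx in LeaveOneOut().split(X):
        model = ExtraTreesRegressor(n_estimators=n_trees, random_state=seed)
        model.fit(X[train_idx], y[train_idx])
        # Prediction for the single site that this model never saw.
        held_out[test_idx] = model.predict(X[test_idx])
        models.append(model)
    return held_out, models

# Synthetic stand-in (not study data): 25 "sites", 3 environmental predictors.
rng = np.random.default_rng(0)
X_demo = rng.random((25, 3))
y_demo = 100.0 * X_demo[:, 0] + rng.normal(0.0, 5.0, size=25)
held_out, models = leave_one_out_predictions(X_demo, y_demo, n_trees=10)
r2_validation = r2_score(y_demo, held_out)
n_trees_total = sum(len(m.estimators_) for m in models)
```

With 85 sites and 100 trees per fit, the same loop would yield the 8500 trees mentioned in the text.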

It is acknowledged that calibrating imperfect hydrologic and land surface models can lead to dissimilar parameter sets that are practically indistinguishable based on a given performance metric [*Beven*, ; *Chaney et al*., ]. This may limit the performance of the parameter regionalization algorithm since the differences in performance among the best performing parameter sets could be minimal. To address this concern, it is initially assumed that any of the top five parameter sets at each site are equally optimal. In other words, the difference in the performance of the Noah model among them is negligible. To reduce the negative impact of parameter equifinality, we create 3200 parameter data sets and determine the data set that leads to the best Extra‐Trees model. Each parameter data set consists of 85 parameter sets where each parameter set is drawn at random from the 5 optimal parameter sets at each site. The leave‐one‐out cross validation is performed for each of the 3200 Noah model parameter data sets.
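Drawing the candidate parameter data sets can be sketched in a few lines. The array layout (sites, top sets, parameters) and the function name are hypothetical, and the placeholder values stand in for the calibrated parameter sets.

```python
import numpy as np

def draw_parameter_data_sets(top_sets, n_data_sets, seed=0):
    """Draw parameter data sets by picking one of the top sets at each site.

    top_sets has shape (n_sites, n_top, n_params); the result has shape
    (n_data_sets, n_sites, n_params).
    """
    rng = np.random.default_rng(seed)
    top_sets = np.asarray(top_sets, dtype=float)
    n_sites, n_top, _ = top_sets.shape
    # For every data set and site, choose which of the top sets to use.
    picks = rng.integers(0, n_top, size=(n_data_sets, n_sites))
    return top_sets[np.arange(n_sites), picks]

# Placeholder values: 85 sites, 5 top sets per site, 3 parameters.
top_sets_demo = np.arange(85 * 5 * 3, dtype=float).reshape(85, 5, 3)
data_sets = draw_parameter_data_sets(top_sets_demo, n_data_sets=3200)
```

Each of the 3200 data sets would then be scored with the leave‐one‐out cross validation, and the one giving the best Extra‐Trees model retained.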

To assess the sensitivity of the model parameters outlined in Table , the Sobol method is applied to the quality‐controlled and gap‐filled 85 FLUXNET sites. At each site, the Noah land surface model is run on 1100 different parameter sets; previous work has shown that 10^{2} model runs using models with a similar number of parameters as this study is sufficient to screen the dominant model parameters [*Sarrazin et al*., ]. For each parameter set, the KGE performance metric is used to compare the simulated latent heat flux to the measured latent heat flux. The KGE values at each site are then used to compute both the first‐order indices (individual parameter sensitivity) and total‐effect indices (individual parameter sensitivity and parameter interactions).

Figure shows the results for 10 representative sites. Each site's panel shows the first‐order indices and total‐effect indices for the 9 model parameters. For almost all these sites, the minimum stomatal resistance (*r*_{s,min}) and Zilitinkevich constant (*C*_{zil}) are the most sensitive parameters. In the case of *r*_{s,min} and *C*_{zil}, the largest sensitivity can be attributed to the first‐order index with a relatively small increase due to parameter interaction as apparent in the total‐effect index. The only other parameter that appears to be sensitive at these sites is *fx*_{exp}. At some sites, its sensitivity is negligible while at others the model skill is highly sensitive to this parameter—especially when considering parameter interactions. This result can most likely be attributed to the varying role that bare soil evaporation plays in the Noah land surface model across the 85 sites. In regions where the green vegetation fraction is close to 1 all year (e.g., evergreen needleleaf and broadleaf forests), the contribution from bare soil evaporation is negligible. Except for a small role that *cmc*_{max} plays at a few sites, the Noah land surface model is insensitive to the rest of the parameters at these 10 sites.

To test whether the conclusions drawn from Figure generalize to all 85 sites, Figure summarizes each parameter's total‐effect indices for all sites. The boxplots in the figure summarize all total‐effect index values in the network per parameter. The results are very similar to those for the 10 representative sites. Noah's ability to accurately simulate latent heat flux is highly sensitive to both *r*_{s,min} and *C*_{zil}. The same is true for *fx*_{exp} when considering sites that have a strong seasonality in vegetation coverage or have minimal vegetation coverage year round. The model performance is largely insensitive to the rest of the parameters. This provides a strong argument that calibration should focus on the *r*_{s,min}, *C*_{zil}, and *fx*_{exp} parameters, fixing the remaining parameters with values from the current model lookup tables.

Having reduced the model parameters down to *r*_{s,min}, *C*_{zil}, and *fx*_{exp} simplifies model parameter calibration. To this end, Latin Hypercube Sampling is used to assemble 1000 parameter sets using the ranges defined in Table . The Noah land surface model is then run at each of the 85 sites for each of these 1,000 parameter sets; this leads to a total of 85,000 model simulations. For each simulation, the KGE metric and its components are calculated to compare the model estimates of latent heat flux at a 30 min temporal resolution to the quality controlled measurements.
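A Latin Hypercube Sample over uniform parameter ranges can be generated as in the sketch below. The function name is hypothetical, and the ranges in the example are placeholders rather than the values from the parameter table, which is not reproduced here.

```python
import numpy as np

def latin_hypercube(n_samples, lower, upper, seed=0):
    """Latin Hypercube Sample for independent uniform parameters."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    k = lower.size
    # One point drawn uniformly inside each of n_samples equal-probability strata.
    u = (np.arange(n_samples)[:, None] + rng.random((n_samples, k))) / n_samples
    # Shuffle the strata independently for every parameter.
    for j in range(k):
        u[:, j] = rng.permutation(u[:, j])
    return lower + (upper - lower) * u

# 1000 parameter sets for three parameters; the ranges below are placeholders.
samples = latin_hypercube(1000, lower=[0.0, 0.0, 0.0], upper=[1.0, 1.0, 1.0])
```

Every parameter range is split into 1000 strata and each stratum is used exactly once, which is why the sample covers the space more evenly than simple random sampling of the same size.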

Figure shows the model performance at the 10 representative sites used in section 4.1 and the KGE metric after optimization at each of the 85 sites. For each site, a scatterplot is used to compare the simulated latent heat flux (lookup table and calibrated parameter values) to the measurements. The primary outcome of calibration at each site is a minimal bias. It is also apparent that calibrating these three parameters does little to reduce the spread around the one‐to‐one line. This suggests that calibration struggles to minimize the random errors. As will be discussed later, this can most likely be attributed to both the inability to capture the 30 min temporal dynamics of the canopy and aerodynamic resistances using oversimplistic parameterizations and the random noise in the observations. There are no apparent spatial patterns in the KGE metric after calibration.

Figure summarizes the performance at each of the 85 sites by showing histograms of the KGE metric and its three components (mean bias, temporal variability, and linear correlation) for the simulations using the lookup table and optimized parameter values. The results confirm those in Figure . Optimization leads to a large reduction in the bias in the mean and the standard deviation at each site. The network average of the mean bias shifts from 24% to 9%; the network average of the standard deviation bias shifts from 18% to 6%. Unfortunately, the improvement in the linear correlation is not as dramatic: the network average linear correlation shifts only from 0.70 to 0.75. Upon combining these components to create the KGE metric, there is a shift in the network average KGE from 0.54 to 0.71. The network median of the KGE shifts from 0.63 to 0.73. The difference between the mean and the median suggests that there are a number of sites that perform very poorly using the lookup table parameter values and improve significantly after optimization. The inability to improve many stations that have a KGE under 0.5 after optimization could be indicative of biases and uncertainties in the input data or structural deficiencies in Noah's evapotranspiration module and its parameterizations.
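How the three components combine into the KGE can be sketched as follows. The definition is not reproduced in this part of the text, so the code assumes the original Kling‐Gupta formulation, with the variability term taken as the ratio of standard deviations and the bias term as the ratio of means; the function name is hypothetical.

```python
import numpy as np

def kge(sim, obs):
    """Kling-Gupta Efficiency and its components (assumed original formulation)."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]     # linear correlation
    alpha = np.std(sim) / np.std(obs)   # variability ratio
    beta = np.mean(sim) / np.mean(obs)  # mean bias ratio
    score = 1.0 - np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)
    return score, r, alpha, beta

# Synthetic check: a perfect simulation, and one that doubles every value.
obs_demo = np.array([1.0, 2.0, 3.0, 4.0])
kge_perfect = kge(obs_demo, obs_demo)[0]
kge_doubled = kge(2.0 * obs_demo, obs_demo)[0]
```

Because the score is a Euclidean distance from the ideal point, a simulation with perfect correlation can still score poorly if its mean and variability are biased, which is consistent with calibration improving the KGE mainly through the bias terms.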

As mentioned in section 3.4, calibrating an imperfect land surface model can lead to dissimilar parameter sets that are practically indistinguishable based on a given performance metric. Figure evaluates model parameter equifinality at the 10 representative sites used in Figures and by plotting the value of the performance metric for each of the 1000 LHS simulations against the corresponding values for each parameter. Model parameter equifinality plays an important role for most of the parameters at many of the sites. For example, at the US‐Bar site, in many cases there is a similar model performance for *r*_{s,min} values between 10 and 100. This can most likely be explained by *r*_{s,min} and *C*_{zil} exchanging roles, since both act to reduce the overall model bias. Parameter equifinality is even more discernible for the *fx*_{exp} parameter at many of the sites. This is most likely due to most of these sites being insensitive to this model parameter (see Figure ). Furthermore, it is immediately apparent that the sites do not converge on similar parameter values; this provides further evidence that different parameter values are necessary at each of the sites. The strong presence of model parameter equifinality suggests that it cannot be disregarded in the parameter regionalization.

Although parameter calibration leads to a large reduction in bias in Noah's estimates of latent heat flux, it does not ensure that the optimized parameters (*r*_{s,min}, *C*_{zil}, and *fx*_{exp}) can be transferred to other sites using climate, vegetation, and soil characteristics. To test whether this is possible, an Extra‐Trees model with 8500 extremely randomized trees is fit using the local environmental characteristics (see Table ) to the optimal parameter sets. The model is validated using a leave‐one‐out cross validation for the 85 eddy covariance sites. The role of model parameter equifinality is accounted for by running the cross validation on 3200 plausible optimal parameter data sets. Each parameter data set consists of one parameter set per site; the parameter set is chosen at random from the top five performing parameter sets. Although not shown here, the impact of parameter equifinality is small relative to the role of the environmental characteristics. The remainder of this study uses the parameter data set that leads to the optimal Extra‐Trees model.

The leave‐one‐out cross‐validation results for the optimal parameter data set are shown in Figure . When the 8500 trees are used to estimate the Noah model parameters, the coefficient of determination is above 0.94 for all three parameters. When only the trees that were not trained on a given site (100 trees) are used to estimate the parameters at that site, the skill decreases. In this case, the coefficient of determination of *r*_{s,min} is 0.20, *C*_{zil} is 0.25, and *fx*_{exp} is 0.21. This provides a robust evaluation of the model's ability to estimate the optimal parameters at sites not included in the training process. Although there is clearly room for improvement, the cross‐validation results suggest that a skillful functional relationship between the optimized model parameters and the local environmental characteristics does exist.

Another method to assess the parameter regionalization algorithm is to compare the performance of the Noah land surface model using the lookup table parameter values, the optimized parameters, and the estimated parameters. Figure shows the histogram of KGE values for both the training (8500 trees) and validation (100 trees per site) cases. When using the parameters estimated from the 8500 trees, the model performance decreases only slightly with respect to the optimized parameter sets. However, when using the parameters that are estimated using the 100 trees that were not trained on a given site, there is a noticeable decrease in the network average KGE compared to when using the optimal parameter values. Nonetheless, there is a noticeable improvement compared to when using the lookup table model parameter values. These results coincide with the parameter comparison shown in Figure . Overall, although Noah's performance is not as good as when using the optimized parameter values, it does appear that this parameter estimation technique leads to noticeable model improvement when compared to using the parameters from the existing lookup tables.

The use of global data sets of environmental characteristics allows these parameters to be estimated over the globe using the optimal Extra‐Trees model. Figure shows the mapped estimates for both the mean (prediction) and the standard deviation (uncertainty estimate) of *r*_{s,min}, *C*_{zil}, and *fx*_{exp} at a 5 km spatial resolution over the globe. The mean and standard deviation are calculated at each grid cell from the predictions of the 8500 decision trees in the fitted Extra‐Trees model. The spatial patterns of *C*_{zil} do not agree with the physical understanding of the parameter [*Chen and Zhang*, ]; we would expect the highest values to be over dry climates and short vegetation, but this is not the case. The spatial patterns of *r*_{s,min} are not physically consistent either. We would expect the highest values in the water and energy limited regions and the lowest in the areas that are not water or energy limited, but this does not seem to be the case. Although further investigation is required, this suggests that the optimized *r*_{s,min} values might not reflect the parameter's physical meaning; instead, they may act as a bias correction term that absorbs other sources of uncertainty in Noah's estimates of evapotranspiration, including the errors in the resistance functions in the model's Jarvis‐type formulation of canopy resistance. Finally, the *fx*_{exp} parameter shows distinct spatial patterns. The values are highest in regions where we do not expect a large role of bare soil in the latent heat flux. This suggests that the parameter optimization attempts to shut off the signal of bare soil evaporation in the model. Over drier regions, there is a tendency toward lower *fx*_{exp} parameter values; it appears that the model attempts to increase the contribution of bare soil evaporation by allowing it to extract more water from the soil's top layer. This outcome for both the high and low *fx*_{exp} values could be indicative of model weaknesses in the reliance on the green vegetation fraction to define the bare soil contribution to evaporation.
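The per-grid-cell mean and standard deviation described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation used in the study: the per-tree predictions are assumed to be already available as plain lists (in scikit-learn they could be gathered from the fitted trees in `estimators_`), the population standard deviation is an assumed choice, and the function name is hypothetical.

```python
import statistics


def ensemble_mean_and_spread(tree_predictions):
    """Summarize an ensemble of tree predictions at every grid cell.

    tree_predictions is a list with one entry per decision tree; each entry is a
    list holding that tree's predicted parameter value at every grid cell.
    Returns (means, spreads), each with one value per grid cell.
    """
    n_cells = len(tree_predictions[0])
    means, spreads = [], []
    for cell in range(n_cells):
        # Collect the value predicted by every tree at this grid cell.
        values = [tree[cell] for tree in tree_predictions]
        means.append(statistics.fmean(values))     # mapped prediction
        spreads.append(statistics.pstdev(values))  # mapped uncertainty estimate
    return means, spreads
```

With this summary, grid cells where the trees disagree strongly have a large spread, which is how regions that are poorly constrained by the available sites would appear in the uncertainty maps.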

The results from the Sobol sensitivity analysis in section 4.1 show that the evapotranspiration module in the Noah land surface model is only sensitive to a subset of the model parameters described in Table . These results coincide with previous studies that suggest that *r*_{s,min}, *C*_{zil}, and *fx*_{exp} are the most sensitive parameters in the Noah land surface model [*Rosero et al*., ; *Hou et al*., ]. The large number of eddy covariance towers used in our study provides evidence that these results extend to multiple climates and land cover types.

The results in section 4.2 show that calibrating *r*_{s,min}, *C*_{zil}, and *fx*_{exp} leads to noticeable improvements in both the mean bias and temporal variability. Given that three components (mean bias, temporal variability, and linear correlation) contribute to the KGE metric, a parameter that improves the linear correlation, but whose effect on the KGE is small relative to the bias reduction achieved by *r*_{s,min}, *C*_{zil}, and *fx*_{exp}, could be diagnosed as insensitive. To determine whether other parameters could help improve the temporal dynamics, future research should use different performance metrics to assess whether the sensitivity of the model parameters varies. Furthermore, although the quality control constraints used in section 2.1.1 are able to provide high‐quality 30 min observations of latent heat, random noise will persist in the measurements [*Richardson et al*., ]. Future work should seek to understand the role that random errors in the measurements play in limiting the linear correlation between the observed and simulated latent heat.
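To make the three components concrete, the sketch below computes the KGE in its commonly used form, KGE = 1 − sqrt((r − 1)² + (α − 1)² + (β − 1)²), where r is the linear correlation, α is the ratio of simulated to observed standard deviation, and β is the ratio of simulated to observed mean. That this exact form matches the one used in the study is an assumption, and the function name is hypothetical.

```python
import math
import statistics


def kling_gupta_efficiency(simulated, observed):
    """Return (kge, r, alpha, beta) for two equal-length series."""
    if len(simulated) != len(observed) or len(observed) < 2:
        raise ValueError("need two equal-length series with at least 2 values")
    mu_s, mu_o = statistics.fmean(simulated), statistics.fmean(observed)
    sd_s, sd_o = statistics.pstdev(simulated), statistics.pstdev(observed)
    if sd_s == 0 or sd_o == 0 or mu_o == 0:
        raise ValueError("KGE is undefined for constant series or zero observed mean")
    # Population covariance, consistent with the population standard deviations.
    cov = sum((s - mu_s) * (o - mu_o) for s, o in zip(simulated, observed)) / len(observed)
    r = cov / (sd_s * sd_o)   # linear correlation
    alpha = sd_s / sd_o       # temporal variability ratio
    beta = mu_s / mu_o        # mean bias ratio
    kge = 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)
    return kge, r, alpha, beta
```

Returning the components alongside the score shows why a parameter that only affects r can appear insensitive: when α and β are far from 1, the bias and variability terms dominate the distance from the ideal point.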

It is also important to note that Noah's evapotranspiration module has other parameters that were excluded from the sensitivity analysis (e.g., leaf area index (LAI), the green vegetation fraction (GVF), and the soil hydraulic parameters). This was primarily due to an interest in focusing on parameters that are not estimated in existing global data sets. However, the uncertainties in these data sets likely impact the overall results; the sensitivity of model parameters might be conditioned by the biases in the prescribed model parameters. Future work should address these concerns by investigating the sensitivity of the model to uncertainties in the prescribed vegetation indices and the soil hydraulic parameters. It is likely that the optimized *r*_{s,min}, *C*_{zil}, and *fx*_{exp} parameters absorb these additional uncertainties; this would help explain the lack of physical meaning in the optimized global parameter maps and would suggest that they are model dependent.

The parameter optimization results (see section 4.2) show a considerable improvement in model performance. However, these improvements are limited to bias reduction, while changes in linear correlation between the simulated and observed latent heat (see Figure ) are small. As discussed in section 5.1, it is probable that identifying the model parameters that improve linear correlation would lead to further improvement in model performance. However, this approach cannot fully address one of the primary weaknesses: model structure. The three‐source model of evapotranspiration in the Noah land surface model is overly simplistic; in the case of canopy transpiration, the treatment of vegetation as a single big leaf disregards the effects that shading can have on available solar radiation throughout the canopy [*Niu et al*., ]. More importantly, the improvement achievable with Jarvis‐type formulations of stomatal resistance is limited since they do not explicitly model the relationship between carbon assimilation and transpiration. A clear path forward is the use of Ball‐Berry type formulations. This has already been implemented in the Noah‐MP [*Niu et al*., ] and Noah‐GEM models [*Kumar et al*., ] and could be validated using FLUXNET. Finally, although the evapotranspiration module accounts for bare soil evaporation, it still disregards the interaction between the canopy and the underlying bare soil. Using schemes that account for the interactions between the canopy, the understory, and the bare soil should ameliorate this problem. Future work could use the sensitivity analysis and calibration method presented in this study with these updated schemes to assess whether addressing these known model structure deficiencies leads to improved model performance.

This study presents a path toward moving beyond the overly simplistic lookup tables in land surface models. A statistical model is used to relate existing high‐resolution environmental data to each site's optimized parameters. As shown in section 4.3, this approach can provide skillful estimates of the model parameters and subsequently improve latent heat simulations using the Noah land surface model at the 85 FLUXNET sites used in this study. These global parameter maps have the potential to improve the modeling of the water and energy cycle and land‐atmosphere interactions. Future work should evaluate the change in model performance by running the Noah land surface model over the globe using the updated parameter maps. This will help clarify the benefits of moving beyond lookup tables in land surface models.

More research is also necessary to understand the apparent lack of physical meaning in the global maps of the parameters (see Figure ) and the discrepancy between the performance of the fitted Extra‐Trees model on the training and validation data sets. This will help guide future efforts to improve the parameter calibration and regionalization algorithms. Future work should also explore using other parameter regionalization algorithms [e.g., *Samaniego et al*., ]; such approaches might be more suitable to avoid the challenges of regionalizing model parameters under parameter equifinality (see section 3.4 for more details). There should also be a more rigorous selection of environmental covariates for the statistical model. The variables chosen for this study (see Table ) could be supplemented with other satellite remote sensing data sets including other MODIS variables, the recently released global Web‐enabled Landsat data, the Advanced Very High Resolution Radiometer, and the Advanced Spaceborne Thermal Emission and Reflection Radiometer, among others.

Future updates to the global parameters should also include an increase in the number of eddy covariance stations used to train the regionalization model—especially over the tropics and the Southern Hemisphere. The lack of sites in these regions most likely explains the large uncertainty in the parameter estimates (see Figure ). The recently released update of the FLUXNET database may ameliorate this problem by providing a more complete temporal and spatial coverage.

It is possible that the suggested improvements will not lead to substantial differences in the skill of the regionalization algorithm; this will be the case if the tuned model parameters are mostly site dependent. Future research that incorporates these improvements will provide a clearer picture of the viability of these methods to provide skillful global maps of land surface model parameters.

This study uses 85 stations from the global network of eddy covariance sites (FLUXNET) to validate and improve the parameters of the evapotranspiration module in the Noah land surface model. A comprehensive sensitivity analysis using Sobol's method shows that the model's skill in simulating evapotranspiration is strongly tied to the *r*_{s,min}, *C*_{zil}, and *fx*_{exp} model parameters. A 1000‐member Latin Hypercube sample is then used to find the optimal values of these three parameters at each of the 85 quality‐controlled eddy covariance sites. Overall, calibration of the most sensitive parameters leads to a large decrease in biases in both the mean and temporal variability with a relatively small improvement in linear correlation. The Extra‐Trees machine learning algorithm is then used to relate the optimal parameter sets at each site to local environmental characteristics. Evaluation of the fitted Extra‐Trees model shows that this parameter estimation technique can lead to improved simulations of the latent heat fluxes both at sites that were used to train the parameter regionalization model and at those that were left out for validation. Finally, this functional relationship is used to produce 5 km maps of the *r*_{s,min}, *C*_{zil}, and *fx*_{exp} parameters over the globe. The parameter regionalization technique developed in this study has the potential to enable the land surface modeling community to move beyond outdated parameter lookup tables. This research demonstrates the potential for integrating the extensive and growing network of FLUXNET observations to improve the parameterization and process representation of terrestrial evapotranspiration in macroscale land surface models.

This study was supported by funding from NOAA grant NA11OAR4310175 (Improving land evaporative processes and land‐atmosphere interactions in the NCEP Global Forecast System (GFS) and Climate Forecast System (CFS)). We wish to give a special thanks to the FLUXNET community for making the free fair‐use subset of the La Thuile database available for this study. The data used in this study are hosted at Princeton University and are available from the authors upon request (