Lidar‐measured snow depth and model‐estimated snow density can be combined to map snow water equivalent (SWE). This approach has the potential to transform research and operations in snow‐dominated regions, but sources of uncertainty need quantification. We compared relative uncertainty contributions from lidar depth measurement and density modeling to SWE estimation, utilizing lidar data from the Tuolumne Basin (California). We found a density uncertainty of 0.048 g cm^{−3} by comparing output from four models. For typical lidar depth uncertainty (8 cm), density estimation was the dominant source of SWE uncertainty when snow exceeded 60 cm depth, representing >70% of snow cover and 90% of SWE volume throughout the basin in both 2014 and 2016. Density uncertainty accounts for 75% of the SWE uncertainty for a broader range of snowpack characteristics, as measured at SNOTEL stations throughout the western U.S. Reducing density uncertainty is essential for improved SWE mapping with lidar.

In many catchments worldwide, seasonal snowpack is an important determinant of the timing and magnitude of water availability for human use and natural ecosystems. A key variable is the spatial distribution of snow water equivalent (SWE, the amount of water in the snowpack), but there is a historical lack of SWE data [*Bales et al*., ]. The high spatial variability of snowpack makes extrapolation of available SWE data problematic [*Clark et al*., ; *Rice et al*., ]. Satellite remote sensing does not provide direct measurements of SWE in all settings [*Nolin*, ; *Dozier*, ]. Hence, advances in snowpack monitoring are needed to refine understanding of SWE distributions for research and watershed operations [*Bales et al*., ; *Viviroli et al*., ].

A path for quantifying SWE variations across large basins is through measuring snow depth with airborne lidar and estimating bulk snowpack density with models (e.g., NASA Jet Propulsion Laboratory Airborne Snow Observatory (ASO) [*Painter et al*., ]). Airborne lidar can measure submeter variations in depth, with typical vertical uncertainties reported in the 2 to 30 cm range [*DeBeer and Pomeroy,* ; *Grünewald et al*., ; *Deems et al*., ; *Harpold et al*., ; *Grünewald and Lehning*, ; *Painter et al*., ]. Lidar snow depth uncertainty varies with factors such as vegetation, topography, flight characteristics, sensor specifications, and horizontal resolution of the gridded snow depth [*Deems et al*., ]. Generally, snow depth uncertainty decreases as lidar data are aggregated to coarser scales. Other depth measurement approaches [*Kinar and Pomeroy*, ; *Sturm*, ] are available at equivalent or lower uncertainty and are appropriate at different spatiotemporal scales (see Text S1 and Figure S1 in the supporting information) [*Larson et al*., ; *Varhola et al*., ; *Kerkez et al*., ; *Parajka et al*., ; *Pohl et al*., ; *Vander Jagt et al*., ; *Buhler et al*., ]. These generally are unable to match the capacity of lidar to map snow depth in space.

In contrast to this technological revolution in snow depth measurement, there has been no concurrent advance in the measurement of snowpack bulk density across space. Snow pit profiles remain the most reliable measurement of bulk density [*Kinar and Pomeroy*, ], but measurements can differ by 10% [*Conger and McClung*, ; *Proksch et al*., ]. For a 100 cm snowpack with 0.30 g cm^{−3} density (30 cm SWE), this translates to uncertainties of 0.03 g cm^{−3} in density and 3 cm in SWE. Although snow depth varies more spatially than density [*Balk and Elder*, ; *López‐Moreno et al.,* ; *Wetlaufer et al*., ], density measurement in snow pits is disproportionately limited in space relative to depth measurement. Intensive field campaigns can sample density at fewer than 10^{2} snow pits per day [*Elder et al*., ], whereas airborne lidar systems can sample snow depth at 10^{5} points per second [*Deems et al*., ]. Ground‐penetrating radars (GPR) [e.g., *Marshall and Koh*, ] can increase spatial sampling of density, but are not reliable in all conditions [*Lundberg et al*., ]. Furthermore, GPR remains a specialized research tool and yields data that are time and labor intensive to postprocess.

A solution is to model snow density across the domain of the snow depth data set [e.g., *Painter et al*., ]. Statistical [*Elder et al*., ; *Wetlaufer et al*., ], empirical [*Jonas et al*., ; *Sturm et al*., ; *Bormann et al*., ; *McCreight and Small*, ], and physically based models [*Jordan*, ; *Feng et al*., ; *Shi et al*., ; *Painter et al*., ] have been developed and evaluated against observations (see Text S2 for review). While uncertainty depends on the model, location, evaluation period, and metric, density model uncertainty is generally in the 0.04 to 0.10 g cm^{−3} range for root‐mean‐square differences (RMSD) and the 0.02 to 0.08 g cm^{−3} range for mean absolute differences (MAD). For context, continental snowpack density typically starts near 0.15 g cm^{−3} early in the season and increases to values exceeding 0.35 g cm^{−3} during snowmelt. The ~10% uncertainty in manual density measurements implies that minimum uncertainties in modeled density must be 0.025 g cm^{−3} or greater. Hence, uncertainty in density measurements imposes a fundamental challenge on improving snow density models.

Quantifying and minimizing the uncertainty of SWE estimated from snow depth and density requires identifying the dominant sources of uncertainty. To date, there has been little attention to the relative uncertainty contributions of depth and density. Given that density measurements are limited in number and tend to be biased toward easily accessible locations (e.g., flat forest clearings), the full range of conditions is not usually sampled. Hence, density models are difficult to test everywhere in a basin. A more spatially comprehensive approach to characterize uncertainty in density is to examine variations across multiple models, a common approach in climate and hydrology studies [e.g., *Rodell et al*., ]. This approach is appropriate for SWE estimation from lidar snow depth, because any such application requires selecting a density model (and associated parameters) with limited evaluation data to guide those decisions.

Here we compare uncertainty in lidar snow depth measurement to that from modeled snow density for SWE estimation across basins. We analyze airborne lidar snow depth data from ASO near‐peak conditions during two years (2014 and 2016) over the Tuolumne River Basin (California). We consider a range of uncertainties in lidar measurement based on the literature. For tractability, we focus on density model selection rather than uncertainty due to meteorological data [*Raleigh et al*., , ] or model parameters [*Reba et al*., ]. The analysis is relevant to current efforts to map SWE with lidar, including the ASO campaign and the NASA SnowEx experiment.

We used airborne lidar snow depth and elevation data from ASO [*Painter et al*., ] over the Tuolumne River Basin (California), gridded to a horizontal resolution of 50 m (Figure S2). The stated uncertainty of the 50 m snow depth data is RMSD < 2 cm, which is at the bottom of the range in reported lidar depth uncertainty. The drainage basin has an area of 1180 km^{2}, ranges in elevation from 1080 m to 3940 m, and has been described previously [e.g., *Rice et al*., ; *Lundquist et al*., ].

We examined two ASO data sets for the Tuolumne Basin: 7 April 2014 and 16 April 2016. We selected early April for our analyses because this is near the typical peak in snow accumulation. After filtering out thin snow cover (<5 cm depth), the mean snow depth across the basin was 94 cm on 7 April 2014 and 117 cm on 16 April 2016. In both cases, approximately 80% of the basin had snow deeper than 10 cm. The snow volume during the 2013–2014 winter ranked second lowest in the 30 year period from 1985 to 2015 [*Margulis et al*., ], while the 2015–2016 snowpack was near average (i.e., ~10% below average).

As a measure of snowpack conditions near the 2014 ASO acquisition date, we examined snow density and SWE measured with a federal sampler at seven snow courses across the basin by the California Department of Water Resources (CDWR). The snow courses ranged in elevation from 2042 to 2987 m and were generally flat meadows. Snow course data were taken 1.5 to 2 weeks before the 7 April ASO flight. The CDWR snow course data were not appropriate for density model evaluation because approximately 50 to 150 mm (water equivalent) of new snow fell in the final week of March, followed by compaction and melt. However, these data offered a glimpse of density variations in the basin.

To place the 2 year ASO analysis in the context of a broader range of snowpack conditions, we also examined snow depth and density near‐peak accumulation from NRCS SNOTEL data [*Serreze et al*., ]. We screened SNOTEL sites with valid SWE and depth data, resulting in 811 sites and *n* = 9013 station years. For each sample with valid data, we first found peak SWE and then divided that value by depth to estimate bulk snowpack density. We excluded cases with depth less than 20 cm at peak SWE.
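The screening and density calculation described above can be sketched as follows. This is a minimal illustration rather than the processing code used in the study: the function name `peak_bulk_density`, the input layout (daily SWE and depth series in cm for one station year), and the sample values are hypothetical, and the sketch assumes depth is taken on the first day that SWE reaches its seasonal maximum.

```python
from typing import Optional, Sequence

MIN_DEPTH_CM = 20.0  # exclusion threshold stated in the text


def peak_bulk_density(swe_cm: Sequence[float],
                      depth_cm: Sequence[float]) -> Optional[float]:
    """Bulk density (g cm^-3) at peak SWE for one station year.

    Returns None when the record is unusable or depth at peak SWE
    is below the 20 cm threshold.
    """
    if not swe_cm or len(swe_cm) != len(depth_cm):
        return None
    # Index of the first day on which SWE reaches its seasonal maximum
    i_peak = max(range(len(swe_cm)), key=lambda i: swe_cm[i])
    depth = depth_cm[i_peak]
    if depth < MIN_DEPTH_CM:
        return None
    # SWE (cm of water) divided by depth (cm of snow) gives density
    # relative to liquid water, i.e., g cm^-3
    return swe_cm[i_peak] / depth


# Hypothetical station year (values invented for illustration)
swe = [5.0, 20.0, 33.0, 30.0]
depth = [25.0, 80.0, 100.0, 85.0]
rho = peak_bulk_density(swe, depth)  # 33 cm SWE over 100 cm depth
```

Applying the function to every station year and discarding the `None` results would reproduce the structure of the screening, although the criteria for "valid" SWE and depth data are not specified here and would need to be added.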

Estimating SWE from lidar requires selecting a density model. Every model will yield different density estimates, and thus, model selection introduces uncertainty to the final SWE estimates. To gauge snow density uncertainty due to model selection, we applied two empirical models and two physically based models to simulate spatial variations in density across the basin on the analysis dates. We randomly selected 1000 analysis points in the basin where ASO snow depth was greater than 5 cm and ran all four models at each of these points. These analysis points reasonably represented snow depth across the basin (Figure S2).
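The point selection can be illustrated with a short sketch. It assumes simple random sampling without replacement from all grid cells exceeding the depth threshold; the function name, the dictionary layout of the depth grid, the seed, and the synthetic depths are all hypothetical and are not taken from the study.

```python
import random
from typing import Dict, List, Tuple

Cell = Tuple[int, int]  # (row, column) index of a 50 m grid cell


def sample_analysis_points(depth_by_cell: Dict[Cell, float],
                           n_points: int = 1000,
                           min_depth_cm: float = 5.0,
                           seed: int = 0) -> List[Cell]:
    """Randomly pick up to n_points cells with snow deeper than min_depth_cm."""
    eligible = [cell for cell, d in depth_by_cell.items() if d > min_depth_cm]
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    return rng.sample(eligible, min(n_points, len(eligible)))


# Synthetic 50 x 50 grid with depths of 0 to 11 cm (illustration only)
depth_by_cell = {(r, c): float((7 * r + 3 * c) % 12)
                 for r in range(50) for c in range(50)}
points = sample_analysis_points(depth_by_cell)
```

Comparing the depth histogram of the sampled cells with that of all eligible cells is one way to check that the sample represents the basin, as the text reports for Figure S2.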

We selected the empirical models of *Sturm et al.* [] and *Jonas et al.* []. Both models require snow depth (taken from ASO lidar data) and day of year. Additionally, the Sturm model requires seasonal snow climate classification, which was taken from the *Sturm et al.* [] map at 0.5° resolution. We classified 83% of the 1000 analysis points in the maritime class and 17% in the alpine class. The Jonas model applies different parameters depending on elevation zone. Because over 99% of the 1000 points were in the Jonas high‐elevation zone, we simplified the model by classifying all analysis points as high elevation.
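For readers unfamiliar with the Sturm model, the sketch below shows its general form as we understand it: bulk density rises from an initial value toward a maximum as an exponential function of snow depth and day of year. The parameter values are quoted from memory of the published parameter table and should be verified against the original before any use; the function name and dictionary layout are hypothetical, and the day-of-year convention (counted from 1 January, negative for October to December) is likewise an assumption to confirm.

```python
import math

# (rho_max, rho_0, k1, k2) by snow class. ASSUMPTION: values recalled from
# the published Sturm et al. parameter table; verify before use.
STURM_PARAMS = {
    "alpine":   (0.5975, 0.2237, 0.0012, 0.0038),
    "maritime": (0.5979, 0.2578, 0.0010, 0.0038),
}


def sturm_density(depth_cm: float, doy: int, snow_class: str) -> float:
    """Bulk density (g cm^-3) from snow depth (cm), day of year, and snow class."""
    rho_max, rho_0, k1, k2 = STURM_PARAMS[snow_class]
    # Density approaches rho_max as depth and day of year increase
    return (rho_max - rho_0) * (1.0 - math.exp(-k1 * depth_cm - k2 * doy)) + rho_0


# 7 April (day 97 of a non-leap year) with 94 cm of snow
rho_maritime = sturm_density(94.0, 97, "maritime")
rho_alpine = sturm_density(94.0, 97, "alpine")
```

Under these assumed parameters, the maritime case evaluates to roughly 0.38 g cm^{−3}. Because depth and date are the only inputs that vary within a snow class, a model of this form produces little spatial variability on a single date.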

For physically based models, we selected Snobal [*Marks and Dozier*, ; *Marks et al*., , ] and SHAW [*Flerchinger and Saxton*, , ]. We selected Snobal for consistency with ASO model selection [*Painter et al*., ]. We selected SHAW to include a more detailed multilayer snow model of compaction and densification, in contrast to the two‐layer Snobal model. Whereas Snobal uses empirical density‐time curves to represent snow densification with compaction and snowmelt [*Sandells et al*., ], SHAW utilizes physically based parameterizations similar to *Anderson* []. The models also represent new snowfall density differently: Snobal indexes a look‐up table based on dewpoint temperature, while SHAW uses an empirical relationship with wetbulb temperature. We used standard IPW routines for computing albedo and net shortwave radiation as input into Snobal, while SHAW simulated albedo within the model. Although SHAW is more complex, model complexity does not guarantee improved representation of bulk snowpack properties.

Both Snobal and SHAW were run at an hourly time step with the same meteorological forcing data at each of the 1000 study points. To ensure identical forcing between models at each point, we disabled the canopy in SHAW and hence vegetation had no influence on model forcing or outputs. Forcing data originated from the 1/8° gridded NLDAS‐2 data set [*Xia et al*., ], which we downscaled to the 50 m ASO grid using the forcing preprocessor of the Alpine3D system [*Lehning et al*., ; *Bavay and Egger*, ] to account for fine‐scale topographic influences and gradients (see Text S4 and Figures S3 and S4). By supplying Snobal and SHAW with the same forcing, the differences in density were due to differences in snow model physics and parameters. An auxiliary analysis found that a simplified downscaling of meteorological forcing (i.e., based on lapse rates from PRISM [*Daly et al.*, ]) had a negligible effect (<0.01 g cm^{−3}) on density differences between models (see Text S4 and Figure S5).

At each study point, we calculated the “best estimate” of SWE and associated uncertainty (ΔSWE), with SWE based on lidar snow depth (*H*_{s}) and a multimodel mean estimate of density (*ρ*_{mean}), normalized by the density of liquid water (*ρ*_{w} = 1 g cm^{−3}):

$$\mathrm{SWE} = H_{s}\,\frac{\rho_{\mathrm{mean}}}{\rho_{w}}$$

We use delta (Δ) to denote uncertainty of a variable. ΔSWE was calculated in quadrature (i.e., assuming independent errors) based on fractional uncertainties in lidar snow depth and modeled density:

$$\Delta\mathrm{SWE} = \mathrm{SWE}\,\sqrt{\left(\frac{\Delta H_{s}}{H_{s}}\right)^{2} + \left(\frac{\Delta\rho}{\rho_{\mathrm{mean}}}\right)^{2}}$$

From the ΔSWE expression, we calculated the relative (i.e., percent) contribution of snow depth (*f*_{H}) and density (*f*_{ρ}) uncertainties to SWE uncertainty at each point. For the physically based models, we used lidar‐measured snow depth, not the modeled snow depth, to estimate both SWE and ΔSWE.
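As a worked illustration of the quadrature combination described above, the sketch below computes SWE, ΔSWE, and the two relative contributions for a single point. The definition of *f*_{H} and *f*_{ρ} as shares of the summed squared fractional uncertainties is an assumption made for this example, since the text does not spell it out; the function name is hypothetical. The inputs are the basin‐mean depth and density for 2014, the 8 cm depth uncertainty, and the 0.048 g cm^{−3} density uncertainty quoted in the text.

```python
import math

RHO_WATER = 1.0  # g cm^-3, so SWE comes out in cm of water


def swe_with_uncertainty(depth_cm, rho_mean, d_depth_cm, d_rho):
    """Return (SWE, dSWE, f_H, f_rho) for one point.

    Errors in depth and density are assumed independent, so their
    fractional uncertainties add in quadrature.
    """
    swe = depth_cm * rho_mean / RHO_WATER
    rel_h = d_depth_cm / depth_cm   # fractional depth uncertainty
    rel_rho = d_rho / rho_mean      # fractional density uncertainty
    total_sq = rel_h ** 2 + rel_rho ** 2
    d_swe = swe * math.sqrt(total_sq)
    # ASSUMPTION: contributions defined as shares of the squared terms
    f_h = rel_h ** 2 / total_sq
    f_rho = rel_rho ** 2 / total_sq
    return swe, d_swe, f_h, f_rho


swe, d_swe, f_h, f_rho = swe_with_uncertainty(94.0, 0.321, 8.0, 0.048)
```

For these inputs the sketch gives SWE of about 30.2 cm, ΔSWE of about 5.2 cm, and a density share of about 0.76. A single calculation at basin‐mean values will not exactly reproduce statistics obtained by averaging point‐by‐point results.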

Different metrics of uncertainty (Δ) are used in the literature for both snow depth and density. We used RMSD because it is a commonly reported metric for both variables. Quantification of lidar snow depth uncertainty (Δ*H*_{s}) is still an active area of research and thus we considered uncertainty ranging from 2 to 30 cm. Within this range, we highlighted RMSD values reported in five specific studies: (i) 2 cm [*Painter et al*., ], (ii) 5 cm [*Grünewald et al*., ], (iii) 8 cm [*Painter et al*., ], (iv) 17 cm [*DeBeer and Pomeroy*, ], and (v) 23 cm [*Harpold et al*., ]. The smaller uncertainties correspond to grid resolutions of 10^{1} to 10^{2} m and the larger uncertainties to the meter scale. The analysis focused on the median of these five values (8 cm). We applied snow depth uncertainty uniformly to all points, regardless of snow depth or geophysical characteristics (e.g., elevation and vegetation). For snow density uncertainty (Δ*ρ*), we examined RMSD across all four models, an approach similar to *Rodell et al.* []. In each case, we treated the mean across all four models as the best estimate of density (*ρ*_{mean}), similar to studies that examined density measurement uncertainty [*Conger and McClung*, ; *Proksch et al*., ].
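The multimodel spread at a single point can be sketched as below. The sketch treats the multimodel mean as the reference and divides by the number of models (a population form); whether the study divided by *N* or *N* − 1 is not stated, so that choice is an assumption, as are the function name and the four illustrative density values.

```python
import math
from typing import Sequence, Tuple


def multimodel_spread(densities: Sequence[float]) -> Tuple[float, float, float]:
    """Return (mean, RMSD, MAD) of model densities about their own mean."""
    n = len(densities)
    rho_mean = sum(densities) / n
    # Root-mean-square difference from the multimodel mean (divides by n)
    rmsd = math.sqrt(sum((r - rho_mean) ** 2 for r in densities) / n)
    # Mean absolute difference from the multimodel mean
    mad = sum(abs(r - rho_mean) for r in densities) / n
    return rho_mean, rmsd, mad


# Four hypothetical model densities (g cm^-3) at one analysis point
rho_mean, rmsd, mad = multimodel_spread([0.27, 0.30, 0.34, 0.38])
```

Averaging the point‐wise RMSD over all analysis points would then give a basin‐level Δ*ρ* in the sense used here.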

We focused on the dry, low‐snowpack year of 2014, as this represented a lower bound on the effects of density uncertainty on SWE estimation (described below). The four density models yielded contrasting distributions of snowpack density across the basin (Figure ). The two physically based models produced lower mean density (Snobal = 0.267 g cm^{−3}, SHAW = 0.298 g cm^{−3}) but higher standard deviations in space (0.041 g cm^{−3} for Snobal and 0.033 g cm^{−3} for SHAW). SHAW density was greater than Snobal density in part because SHAW had more snowmelt by early April 2014. In contrast, the empirical models had higher mean density (Jonas model = 0.339 g cm^{−3}, Sturm model = 0.381 g cm^{−3}) and lower standard deviation in space (0.005 g cm^{−3} for Jonas and 0.012 g cm^{−3} for Sturm). Because the empirical models were developed for regional applications, they did not explicitly represent processes that influence density at local scales (e.g., radiation variations with slope). Across all models and analysis points, the mean density was 0.321 g cm^{−3} and the mean uncertainty across density models (Δ*ρ*) was 0.048 g cm^{−3} in terms of RMSD (0.041 g cm^{−3} for MAD). The same level of density uncertainty (0.048 g cm^{−3}) was found in the near‐average snowpack of 2016 (Text S5 and Figure S6).

The CDWR snow courses from late March 2014 showed density varying from 0.292 g cm^{−3} to 0.417 g cm^{−3} (Figure c), with a mean of 0.351 g cm^{−3} and a standard deviation of 0.043 g cm^{−3}. The empirical models captured the mean CDWR density better while the physically based models captured the CDWR spatial variability better. Given limitations in the CDWR data (e.g., small sample size and differences in acquisition date), these comparisons were only qualitative.

We applied the mean multimodel density and lidar snow depth at each point to estimate SWE magnitude (Figure d and equation ) and uncertainty (Figure e and equation ). In this analysis, we assumed a lidar depth uncertainty of 8 cm. Other magnitudes of depth uncertainty are discussed below and in the supporting information (Text S6 and Figures S7 and S8). Mean basin‐wide SWE was 30.3 ± 5.6 cm (18.5% relative SWE uncertainty) in 2014. Not surprisingly, the spatial and statistical distributions of SWE were similar to those of snow depth (compare Figure S2 and Figure ). SWE uncertainty (Figure e) exhibited a spatial pattern similar to that of SWE magnitude (Figure d), and most locations (86% in 2014, 74% in 2016) had relative SWE uncertainty between 10 and 30%.

Absolute SWE uncertainty was greater at locations with deeper snow (gray points in Figure a). The fraction of ΔSWE due to Δ*ρ* was also greater at locations with deeper snow (red points in Figure a), and thus the contribution of Δ*H*_{s} to ΔSWE diminished with increasing depth (blue points). Figure a shows a specific case with Δ*H*_{s} = 8 cm at all points and Δ*ρ* characterized by the spread of the four models at each point (Figure b). We found a crossover point at a snow depth of 60 cm, above which Δ*ρ* (0.048 g cm^{−3} on average) dominated ΔSWE and below which Δ*H*_{s} (8 cm) dominated. Seventy percent of the analysis points in the 2014 analysis and 80% in the 2016 analysis had snow depths exceeding 60 cm and were hence in the zone where Δ*ρ* was dominant. Because the role of Δ*ρ* was greater in deeper snowpacks that also store more water (Figure a), Δ*ρ* was the most important determinant of uncertainty in SWE volume for the basin. For these density and depth uncertainties, density was the dominant source of uncertainty for 90% of the SWE volume basin‐wide (Figure c). The results were replicated in the 2016 analysis (no figures shown).
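The share of points and of SWE volume falling in the density‐dominated zone can be tallied as in the sketch below. It assumes equal‐area grid cells (so summed SWE depth is proportional to volume) and a uniform depth uncertainty, and it flags a point as density dominated when its fractional density uncertainty exceeds its fractional depth uncertainty. The function name and the four sample points are hypothetical.

```python
from typing import Iterable, Tuple


def density_dominated_shares(points: Iterable[Tuple[float, float, float]],
                             d_depth_cm: float) -> Tuple[float, float]:
    """Return (share of points, share of SWE volume) where density dominates.

    Each point is (depth_cm, rho_mean, d_rho); cells are assumed equal in area.
    """
    n_total = n_dom = 0
    swe_total = swe_dom = 0.0
    for depth_cm, rho_mean, d_rho in points:
        swe = depth_cm * rho_mean  # cm of water, with water density 1 g cm^-3
        dominated = (d_rho / rho_mean) > (d_depth_cm / depth_cm)
        n_total += 1
        swe_total += swe
        if dominated:
            n_dom += 1
            swe_dom += swe
    return n_dom / n_total, swe_dom / swe_total


# Hypothetical points: (depth in cm, mean density, density spread)
sample = [(20.0, 0.30, 0.05), (40.0, 0.31, 0.05),
          (100.0, 0.32, 0.05), (200.0, 0.34, 0.05)]
point_share, volume_share = density_dominated_shares(sample, d_depth_cm=8.0)
```

In this toy case half of the points are density dominated, yet they hold about 84% of the SWE, which mirrors the reasoning in the text: deeper snow is both more likely to be density dominated and holds more water.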

Across a wider range of uncertainties in snow depth and density, the crossover point where *f*_{ρ} > *f*_{H} shifted to lower snow depths with decreasing Δ*H*_{s} or increasing Δ*ρ* (Figure b). This corresponded to changes in the percent of basin SWE volume where Δ*ρ* dominated (Figure c). For example, Δ*H*_{s} = 17 cm had a crossover point of 125 cm snow depth (Figure b). Approximately 25% of the basin in 2014 (38% in 2016) had snow depth exceeding 125 cm, and this zone comprised 50% of the SWE volume (both years) in the basin (Figure c). If Δ*H*_{s} was reduced to 5 cm (e.g., through spatial aggregation), the crossover point reduced to 35 cm snow depth (Figure b). About 72% of the basin snowpack was deeper than 35 cm in both years, and this comprised >95% of the basin SWE volume in both years (Figure c). As accuracy in snow depth improved, density uncertainty increased in importance.
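The dependence of the crossover depth on the two uncertainties follows from setting the fractional depth and density uncertainties equal, which gives a crossover depth of Δ*H*_{s} × *ρ*_{mean} / Δ*ρ*. The sketch below evaluates this for the three depth uncertainties discussed, using the 2014 basin‐mean density (0.321 g cm^{−3}) and mean Δ*ρ* (0.048 g cm^{−3}) as single representative values. That simplification, and the function name, are assumptions of the example.

```python
def crossover_depth_cm(d_depth_cm: float, rho_mean: float, d_rho: float) -> float:
    """Snow depth at which fractional depth and density uncertainties are equal.

    Solves d_depth / H = d_rho / rho_mean for H.
    """
    return d_depth_cm * rho_mean / d_rho


# Representative basin-mean values quoted in the text for 2014
RHO_MEAN = 0.321  # g cm^-3
D_RHO = 0.048     # g cm^-3

crossovers = {dh: crossover_depth_cm(dh, RHO_MEAN, D_RHO)
              for dh in (5.0, 8.0, 17.0)}
```

With these inputs the crossover depths are about 33, 54, and 114 cm, which are close to but somewhat below the 35, 60, and 125 cm reported from the point‐wise analysis. The gap presumably arises because density and its spread vary from point to point rather than taking basin‐mean values everywhere.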

We compared the relative contributions of Δ*H*_{s} and Δ*ρ* to ΔSWE across a broader range of snowpack conditions and uncertainty levels (Figures , S7, and S8). In Figure , the red region denotes conditions where *f*_{ρ} > *f*_{H}, the blue region denotes conditions where *f*_{ρ} < *f*_{H}, and the dashed line is the crossover point (see Figure ). The gray contours show absolute ΔSWE (cm). Generally, there was a greater range of possible conditions where Δ*ρ* dominated over Δ*H*_{s} (compare red versus blue areas). In terms of absolute uncertainty, ΔSWE was greatest for snowpack with higher depth and density (Figure ). ΔSWE and the relative zones of dominant uncertainty changed with different levels of Δ*H*_{s} and Δ*ρ* (see Figures S7 and S8).

The competing contributions to ΔSWE were compared for the typical depth and density conditions found at peak SWE in the SNOTEL network (Figure ). The vast majority of observations from times of peak SWE fell in the zone where Δ*ρ* dominated ΔSWE. Approximately 70–90% of SWE uncertainty was due to density uncertainty. The SWE uncertainty was typically about 5 to 7 cm for snowpack conditions sampled by SNOTEL. The results from the 2014 and 2016 ASO surveys were within the range of the SNOTEL data. The contribution from Δ*ρ* was slightly greater in the 2016 ASO survey than for 2014. Snow depth was greater in 2016, and thus the depth errors were smaller on a percentage basis. This supported the general result that the contribution of density uncertainty to SWE uncertainty increased at greater snow depths, either in portions of a basin where snow was deep or in years when snow accumulation was higher.

Lidar makes it possible to map snow depth with uncertainty of ~10 cm across a catchment. Our results show that snow density estimation becomes the dominant source of uncertainty in lidar‐based SWE mapping when snow depth exceeds ~60 cm. The importance of snow density uncertainty increases with snow depth, greatly exceeding snow depth uncertainty in areas (or years) with deeper snowpack and hence zones with greater SWE volume (Figures and ). Snow density uncertainty exceeds depth uncertainty even in a historically dry and low snowpack year (2014). Because snow depth uncertainty is typically greater for lidar than for many other techniques, density uncertainty will also dominate SWE uncertainty when SWE is estimated with those other approaches.

The typical snow density uncertainty across the four models here (~0.05 g cm^{−3}) falls within the range of model uncertainty reported in the literature (see Text S2 and Table S1). This level of uncertainty was found in both an extremely dry year (2014) and a near‐average year (2016). We therefore consider our results a reasonable approximation of the uncertainty expected from density estimation. This result is based on the assumption that differences between models can be used as a proxy for uncertainty and that the specific models we selected portray typical intermodel differences. In actual applications, the uncertainty in modeled density may be higher. We considered only one source of uncertainty (model selection) and ignored other factors, including canopy influences on snow density and errors in forcing data. The physically based models typically differed by 0.06 g cm^{−3} (based on MAD), which is lower than differences documented in other studies (on the order of 0.10 g cm^{−3}) [*Feng et al*., ]. This illustrates the importance of model uncertainty and suggests that density model selection can introduce large uncertainty into SWE estimates in the absence of site‐specific tuning. Prior model intercomparison studies [*Etchevers et al*., ; *Rutter et al*., ] evaluated representation of SWE, snow depth, and energy balance variables, but there has been less attention to modeled bulk density. A more systematic analysis [e.g., *Essery et al*., ] is needed to isolate the structural and parametric reasons for these large differences in modeled density.

The uncertainty in modeled density (prior to elevation correction) documented by *Painter et al.* [] is about half of the model uncertainty reported here and is the lowest found in our literature review (see Text S2). Uncertainty in modeled density can be reduced when in situ measurements of snow density are available to tune model parameters or develop a model correction [*Painter et al*., ]. Our analysis assumed no knowledge of density conditions on the ground and hence reflected a general scenario of SWE mapping. While in situ data can constrain snow density models, there are shortcomings to this practice. First, a correction is only straightforward when the model residuals exhibit a coherent relationship with a geophysical parameter (e.g., elevation). Second, corrections are not likely transferable to other catchments. Third, density uncertainty is likely to be underestimated, given the tendency for sampling locations to be biased toward easily accessible flat clearings, so corrections based on these data may not be applicable to other areas in a basin. Finally, corrections are bound to be model‐specific and do little to identify specific model deficiencies.

Our multi‐model approach can spatially map model agreement (Figure b) to guide selection of field evaluation sites for more targeted testing of models. There have been few efforts to evaluate models by sampling snow density systematically across a range of model uncertainty levels and physiographic settings [*Bormann et al*., ]. With the development of technologies like lidar that measure snow depth through space, it is important to assess how models represent the mean and spatial variation of snow density for mapping SWE.

We recognize that uncertainties in lidar‐derived snow depth are not constant in space, as assumed above. Lidar uncertainty varies with landscape and measurement characteristics [*Deems et al*., ]. Uncertainty in lidar snow depth may increase by 10 cm or more in forests [*Deems et al*., ; *Harpold et al*., ]. Ongoing research is quantifying variability in lidar uncertainty across diverse landscapes. Uncertainty in snow density estimation may also be enhanced in these same areas (e.g., slopes and forests), as most prior studies evaluated density models in flat clearings, and thus more targeted evaluations are needed.

Historically, in situ measurements of snow depth have outnumbered SWE and density measurements by a factor of 30 [*Sturm et al*., ]. As the snow depth measurement revolution continues, the disparity in availability of snow depth versus density data will widen by many more orders of magnitude. Likewise, the accuracy of snow depth measurements will improve with technological advances. Considering the dominance of snow density uncertainty over depth uncertainty and the ongoing proliferation of depth measurements, advances are needed in the measurement and modeling of snow density to resolve specific landscape influences on density. More plentiful and more accurate density measurements in space are essential for process understanding and for reducing uncertainty in modeled density.

This work was supported by NSF‐EAR 1521474. The data used in the ASO analysis are described and included in the supporting information. We thank Tom Painter and the ASO team for sharing the lidar snow depth and elevation data. Thanks also to Danny Marks for providing a thoughtful review that improved the manuscript and for making the Snobal code available. Thanks to Gerald Flerchinger for making the SHAW model code available. We thank Ryan Webb, John Knowles, and Dave Barnard for providing preliminary feedback on the manuscript.