The Surface Water and Ocean Topography (SWOT) satellite mission planned for launch in 2020 will map river elevations and inundated area globally for rivers >100 m wide. In advance of this launch, we here evaluated the possibility of estimating discharge in ungauged rivers using synthetic, daily “remote sensing” measurements derived from hydraulic models corrupted with minimal observational errors. Five discharge algorithms were evaluated, as well as the median of the five, for 19 rivers spanning a range of hydraulic and geomorphic conditions. Reliance upon a priori information, and thus applicability to truly ungauged reaches, varied among algorithms: one algorithm employed only global limits on velocity and depth, while the other algorithms relied on globally available prior estimates of discharge. We found at least one algorithm able to estimate instantaneous discharge to within 35% relative root‐mean‐squared error (RRMSE) on 14/16 nonbraided rivers despite out‐of‐bank flows, multichannel planforms, and backwater effects. Moreover, we found RRMSE was often dominated by bias; the median standard deviation of relative residuals across the 16 nonbraided rivers was only 12.5%. SWOT discharge algorithm progress is therefore encouraging, yet future efforts should consider incorporating ancillary data or multialgorithm synergy to improve results.

Rivers link atmospheric, terrestrial, and oceanic processes and route approximately two fifths of the global total rainfall over land back into the ocean [*Vörösmarty et al*., ; *Oki and Kanae*, ]. In doing so, they represent an important resource for agriculture and urban development as well as a major hazard during flood events. In these contexts, accurate estimation of river discharge (also “streamflow” or “runoff,” units of volume per unit time) is vital, as it quantifies the amount of water resources available for human consumption, defines the quantity of water that must be routed during a flood event, and indicates overall watershed response to atmospheric forcing. Despite its importance, our knowledge of global river discharge is surprisingly poor. This lack of knowledge represents an acute problem, given the possible acceleration of the water cycle due to global warming [*Huntington*, ]. Improved river discharge estimates, with greater spatial coverage of the global system of rivers, are needed to develop process‐based scientific understanding of runoff at large spatial scales (i.e., how water is routed into and through rivers) and to calibrate and constrain hydrologic models to forecast effects of future changes in the terrestrial hydrologic cycle. The forthcoming Surface Water and Ocean Topography (SWOT) mission measurements of river water surface elevation (WSE), top width, and free‐surface slope may allow periodic estimation of river discharge for all rivers wider than 100 m, with a goal of estimating discharge for rivers wider than 50 m [*Biancamaria et al*., ; *Pavelsky et al*., ]. Accurate discharge estimates from these measurements would enable tremendous advances in global hydrologic studies.

Given that discharge is the product of flow area and velocity, in situ measurement of river discharge requires spatially explicit measurements of the vertical velocity profile using a current meter in a transect orthogonal to river flow [*Turnipseed and Sauer*, ]. While in situ measurements of discharge can be highly accurate, they are time consuming and impractical for continuous monitoring. Such monitoring is often achieved via river gauges that leverage periodic, simultaneous measurements of river stage (height above some arbitrary datum) and discharge to develop a “rating curve.” With this rating curve defined, measurement of river stage is performed (usually) by pressure transducer, allowing for nearly continuous prediction of river discharge via the rating curve.

SWOT‐based discharge will never be a replacement for in situ discharge measurements. SWOT overpasses have a 21 day cycle, and will sample midlatitude locations irregularly in time approximately 3 times per cycle [*Biancamaria et al*., ], rather than nearly continuously, as in a gauge estimate. While this is adequate for addressing scientific questions related to the global water cycle, it is inadequate for many local‐scale questions on rivers where SWOT may not fully observe temporal dynamics [*Biancamaria et al*., ], and for which data are often required at subhourly time scales. Additionally, SWOT discharge estimates are not expected to be as precise as gauged discharge. On the other hand, the spatially continuous nature of the SWOT measurements will provide data in currently ungauged basins as well as measurements of spatially distributed phenomena such as the propagation of floodwaves along rivers [*Durand et al*., ; *Pavelsky et al*., ; *Paiva et al*., ]. SWOT will also complement river discharge modeling. Global water balance models use climate forcings to determine river runoff, but rely on gauges for parameter tuning [*Hunger and Doell*, ]. SWOT can provide discharge estimates at continental scale in order to help reduce current model annual runoff errors that range from 10 to 80%, and are commonly 40% [*Oki et al*., ; *Rawlins et al*., ; *Widen‐Nilsson et al*., ; *Gosling and Arnell*, ]. SWOT will complement both gauges and water balance modeling and can form a key component in understanding the global water budget—if discharge algorithms with sufficient accuracy can be developed.

River discharge estimation from satellite remote sensing of river hydraulic variables including width, stage, slope, surface velocity, and channel pattern has been explored and discussed in recent decades [*Smith et al*., ; *Smith*, ; *Bjerklie et al*., ; *Kouraev et al*., ; *Bjerklie et al*. ; *Dingman and Bjerklie*, ; *Bjerklie*, ; *Birkinshaw et al*., ; *Michailovsky et al*., ]. In some of these studies, altimetry measurements were utilized to estimate discharge with a rating curve available from an in situ discharge gauge [e.g., *Kouraev et al*., ]. Other studies envisioned estimating discharge in ungauged rivers and utilized traditional flow laws where only a subset of the hydraulic quantities specified by the flow laws were observed remotely; these studies pointed to the need for estimating the unobserved variables, such as roughness coefficient or river bathymetry [e.g., *Bjerklie et al*., ]. *Roux and Dartus* [] proposed methods for estimating unobserved hydraulic parameters from water surface width observations, on a synthetic case and using observed maximum flood extents. *Roux and Dartus* [] estimated a synthetic flood hydrograph by minimizing the distance between flood extent observations and 1‐D model outputs, assuming the channel geometry and roughness are known with a given uncertainty. *Lai and Monnier* [] assimilated water levels into a 2‐D model and demonstrated that the inflow hydrograph could be estimated. The anticipated SWOT data have sparked new efforts to develop discharge estimation methods (discussed in detail, below), which draw from these previous studies and decades of heritage in the fields of hydraulics, remote sensing, and fluvial geomorphology.

Despite the future promise of the SWOT mission, discharge algorithms designed to utilize its data have not been systematically tested across a range of river types. *Bonnema et al*. [] have made one such test, but only for three rivers in the same basin. In this paper, we compare six discharge algorithms designed for SWOT output (averaged observations of WSE, slope and inundation area over ∼10 km reaches [*Fjortoft et al*., ]) to estimate river discharge. Algorithms include the previously published at‐many‐stations hydraulic geometry (AMHG) method [*Gleason and Smith*, ; *Gleason et al*., ], GaMo [*Garambois and Monnier*, ], MetroMan [*Durand et al*., ], and the novel mean flow and geomorphology (MFG) and the mean flow and constant roughness (MFCR) algorithms, in addition to an ensemble median product. We use synthetic daily observations of WSE and width generated from hydrodynamic models for 19 rivers covering a wide range of hydrologic regimes as a stand‐in for SWOT data. We first describe each of the algorithms in detail and then compare their estimation results before concluding with a summary of algorithm strengths and weaknesses and a prognosis for the future SWOT mission.

We used hydrodynamic model output from 19 rivers to assess algorithm performance in this study (Figure ). For each river, daily synthetic measurements of water surface slope, elevation, and top width corresponding to different flows were generated by a hydraulic model forced by in situ bathymetry, simulated or gauged inflows at the top of the reach, and downstream water elevation boundary conditions. The philosophy of the experiment design was to evaluate discharge estimation under essentially ideal conditions: e.g., daily observations were utilized, although SWOT will measure less frequently—see *Biancamaria et al*. [] for discussion of SWOT space‐time coverage. Moreover, only minimal random measurement errors were added to the height, width, and slope time series. Future studies will build upon what is shown here to examine the effect of SWOT measurement errors, and space‐time sampling.

In some cases, multiple models on the same river have been used (e.g., an upstream and downstream reach of the Garonne River are both included); each of the 19 is hereafter referred to as a “river,” as each is an independent set of data representing a different range of hydraulic conditions that can be used to evaluate algorithm performance. Outputs from six different modeling platforms were used in this study, including HEC‐RAS (12 rivers) [*Brunner*, ], LISFLOOD‐FP (2 rivers) [*Bates et al*., ], H2D2 (2 rivers) [*Heniche et al*., ], BreZo (1 river) [*Kim et al*., ], Mascaret (1 river) [*Goutal and Maurel*, ], and ProSe (1 river) [*Vilmin et al*., ]. Table provides references for each individual model, and Table summarizes simulation time, reach lengths, and hydraulic characteristics for the 19 rivers in this study.

The models used here were developed for purposes other than those in this study, and so each uses different procedures to generate the hydraulic variables of interest and each is run with different spatial and temporal resolutions. A full discussion of the differing model solution schemes and solvers is outside the scope of this paper, and the interested reader is referred to the citations for each model given above and in Tables and for further information. However, there are a number of core similarities across all of the models. All of the models represented study reaches with discrete units: either cross sections perpendicular to flow or 2‐D grid elements. At each of these units, the models solve for conservation of energy or momentum, using variants of either the 1‐D or 2‐D St. Venant or shallow water equations. Note that all models were built using field measurements of bathymetry. In all cases, an inflow boundary condition (gauged or simulated) was imposed at the top of the reach and a water surface elevation or “free‐surface slope” boundary condition was imposed at the downstream end of the reach, allowing the models to apply their particular solver to attain a stable solution for the hydraulics of each solution unit. Models were calibrated by adjusting the roughness coefficient in order to match water surface elevation and discharge measurements. These numerical solutions to the conservation equations, coupled with the model's geometric representation of the river channel, yield the hydraulic parameters of interest for this study at each unit: water surface elevation, top width, and water surface slope. In the case of multiple channels, hydraulic quantities were summed across channels to derive a single value, which matches *Schubert et al*.'s [] “integrated” form for multiple channels.
Reaches on the order of 5–10 km were defined based on inflection points in the water surface elevation data; methods to automatically identify optimal reach boundaries are needed, and are currently in development. Reach‐averaged hydraulic quantities were utilized in the Manning's‐based discharge algorithms, while cross‐section data were used directly in the hydraulic geometry‐based approach. Time series of simulated river height, width, and slope for all rivers were produced by adding Gaussian errors with standard deviations of 5 cm, 5 m, and 0.1 cm/km, respectively; time series of height, width, and slope are shown in Figures , and , respectively. The error standard deviations are admittedly somewhat arbitrary, but were chosen to be small enough to resolve temporal variations visible in Figures .
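The error model described above can be sketched in a few lines. This is a minimal illustration, assuming independent zero‐mean Gaussian errors at each time step; the function name `add_gaussian_error`, the seed, and the toy height series are hypothetical and not taken from the study.

```python
import random

# Error standard deviations quoted in the text, converted to SI units.
SIGMA_HEIGHT_M = 0.05    # 5 cm
SIGMA_WIDTH_M = 5.0      # 5 m
SIGMA_SLOPE = 1.0e-6     # 0.1 cm/km expressed as m/m

def add_gaussian_error(series, sigma, rng):
    """Return a copy of `series` with independent N(0, sigma^2) errors added."""
    return [value + rng.gauss(0.0, sigma) for value in series]

# Hypothetical "true" daily water surface elevations (m) for one reach.
rng = random.Random(42)
true_height = [100.0 + 0.01 * day for day in range(365)]
noisy_height = add_gaussian_error(true_height, SIGMA_HEIGHT_M, rng)
errors = [noisy - true for noisy, true in zip(noisy_height, true_height)]
```

The same helper would be applied to the width and slope series with `SIGMA_WIDTH_M` and `SIGMA_SLOPE`.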

Hydraulic regime of each river reach was characterized by two nondimensional numbers: the Froude number and the kinematic wave number, as defined by *Vieira* [], and used, e.g., by *Trigg et al*. []. These nondimensional numbers were calculated for each day for each reach, then averaged in time for each reach. The mean Froude and kinematic wave numbers across all reaches, as well as the minimum and maximum values across all reaches, are shown in Table for each river. Based on comparison with the regime diagram shown in *Vieira* [], the hydraulics of nearly all reaches can, perhaps unsurprisingly, be considered to be diffusive. The only river that can be considered kinematic is the Platte River, which exhibited both the highest average Froude and kinematic wave numbers (0.28 and 426.6, respectively). All reaches exhibit subcritical and gradually varied flow, and no reach showed fully dynamic wave behavior.
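For readers who want to reproduce this kind of regime classification, the sketch below computes the two numbers for a single reach and day. The Froude number is the standard ratio of velocity to shallow‐water wave celerity. The kinematic wave number is written here as *k* = *S*_{0}*L*/(*y*_{0}*F*_{0}^{2}), which is our assumption about the definition being used; the function names and input values are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def froude_number(velocity, depth):
    """Froude number F = v / sqrt(g * d) for mean velocity v and depth d."""
    return velocity / math.sqrt(G * depth)

def kinematic_wave_number(bed_slope, reach_length, depth, froude):
    """Kinematic wave number k = S0 * L / (y0 * F0^2) (assumed definition)."""
    return bed_slope * reach_length / (depth * froude ** 2)

# Hypothetical reach: 0.5 m/s, 2 m deep, slope 1e-4, 10 km long.
fr = froude_number(0.5, 2.0)
k = kinematic_wave_number(1.0e-4, 10_000.0, 2.0, fr)
```

In the study these values were computed daily and then averaged in time per reach; the averaging step is omitted here.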

While each data set corresponds to a particular reach of a real‐world river, our use of simulated inflows and often sampled (rather than acquired via side‐scan sonar) or simplified channel and floodplain geometry results in model outputs that may not necessarily represent true hydrologic conditions for each reach: these data are model‐simulated representations of fluvial behavior rather than observations. This is an important distinction for this study, as using these model outputs allowed us a large degree of control over algorithm inputs and allowed us the ability to test algorithm performance without considering the effect of measurement error or noise on the data. Further studies utilizing field data and airborne swath altimetry are in process. Despite this caveat, validation and benchmark tests of hydraulic models built, calibrated, and validated using field and remotely sensed observations do show surprisingly good and consistent performance [e.g., *Hunter et al*., ], with water elevation predictions accurate to <10 cm [e.g., *Jung et al*., ] and inundation extent prediction accuracies up to 90% [*Bates et al*., ].

The experiment here is purposefully highly restrictive, as every algorithm under consideration was required to use the exact same input data and no river‐specific assumptions or ancillary data (aside from mean annual flow, as described below) were allowed. All the discharge algorithms (reporting on a given river) employed identical station data and reach lengths over the same period of record (although note that AMHG operated on station data, whereas the other algorithms utilized reach‐averaged data). For the future SWOT mission, and in other river discharge studies, ancillary data will form one of the pillars of discharge estimation: it is foolish to not leverage all available information. However, in this study, we seek to answer the most basic of questions regarding discharge algorithm performance, and therefore to make a fair comparison all methods are restricted to the same input data. This enables an honest assessment of the base principles of discharge estimation solely from remotely sensed data.

We assessed discharge estimation performance according to a suite of nine error metrics proposed by *Bjerklie et al*. []. Of these, we found the RRMSE, the mean of the relative residuals (MRR), and the standard deviation of the relative residuals (SDRR) to be of greatest value in discriminating algorithm performance. These metrics allow fair comparison between rivers and allow us to assess how much total discharge error was due to bias; e.g., a large MRR with low SDRR indicates that an algorithm correctly matched river dynamics but exhibits an offset from true flow. We computed all three of these metrics by first computing a spatial average of discharge across all reaches, for each time step, and in each algorithm. Relative residuals are first computed via

$$r_t = \frac{\hat{Q}_t - Q_t}{Q_t},$$

where $\hat{Q}_t$ is the estimated and $Q_t$ the true reach‐averaged discharge at time step $t$, for $t = 1, \dots, N_t$. The three statistics then follow as

$$\mathrm{MRR} = \frac{1}{N_t}\sum_{t=1}^{N_t} r_t, \qquad \mathrm{SDRR} = \sqrt{\frac{1}{N_t}\sum_{t=1}^{N_t}\left(r_t - \mathrm{MRR}\right)^{2}}, \qquad \mathrm{RRMSE} = \sqrt{\frac{1}{N_t}\sum_{t=1}^{N_t} r_t^{2}}.$$

Note that RRMSE^{2} = MRR^{2} + SDRR^{2}, which allows discussion of the proportion of the RRMSE due to bias versus time‐varying errors.
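The three metrics are simple to compute. The sketch below assumes relative residuals defined as estimate minus truth, divided by truth, and a population (divide‐by‐*N*) standard deviation, under which the identity RRMSE^{2} = MRR^{2} + SDRR^{2} holds exactly; the function names and the three‐day toy series are hypothetical.

```python
import math

def relative_residuals(q_est, q_true):
    """Relative residual (estimate minus truth, over truth) per time step."""
    return [(est - true) / true for est, true in zip(q_est, q_true)]

def mrr(residuals):
    """Mean of the relative residuals (bias)."""
    return sum(residuals) / len(residuals)

def sdrr(residuals):
    """Population standard deviation of the relative residuals."""
    mean = mrr(residuals)
    return math.sqrt(sum((r - mean) ** 2 for r in residuals) / len(residuals))

def rrmse(residuals):
    """Relative root-mean-squared error."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

# Hypothetical reach-averaged discharges (m^3/s) on three days.
q_true = [100.0, 200.0, 400.0]
q_est = [90.0, 210.0, 380.0]
res = relative_residuals(q_est, q_true)
print(mrr(res), sdrr(res), rrmse(res))
```

Using a sample (divide‐by‐*N*−1) standard deviation instead would make the decomposition only approximate.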

Beginning approximately a decade ago, several approaches were developed using virtual SWOT observations to test discharge estimation schemes under the aegis of the SWOT virtual mission. *Andreadis et al.* [] and *Biancamaria et al*. [] assimilated virtual SWOT observations into a hydraulic model, assuming that river bathymetry and the Manning's friction *n* were known. In further developments, *Durand et al*. [] and *Yoon et al*. [] used assimilation approaches to estimate river bathymetry and discharge simultaneously; *n* was assumed to be known a priori [*Yoon et al*., ] or known to vary within a relatively small range [*Durand et al*., ]. The computational burden of these data assimilation schemes and difficulty in estimating bathymetry, roughness, and discharge within the assimilation scheme led to a search for simpler methods, which would be more amenable to global application, despite the continued importance of assimilation in regional and local‐scale applications. *Durand et al*. [] first showed that river bathymetry and depth could be estimated without running a hydraulic model via analysis of a virtual SWOT observation time series, if *n* were assumed known. This represented the first demonstration of river discharge estimation with minimal a priori data requirements. *Mersel et al*. [] then demonstrated that SWOT observations could be used to estimate channel bed elevation, provided observations were made at enough different stages, yielding another means of obtaining prior information. Following these early developments, there are now five proposed algorithms for use in the SWOT mission; the median of these five is evaluated as a sixth algorithm herein. The basics of the algorithms are summarized in Table ; each algorithm is described below.

At‐a‐station hydraulic geometry (AHG) relates width (*w*), depth (*d*), and velocity (*v*) at a cross section to discharge (*Q*) through the power laws

$$w = aQ^{b}, \qquad d = cQ^{f}, \qquad v = kQ^{m},$$

where *a, b, c, f, k*, and *m* are empirical best fit parameters. These relations were first described by *Leopold and Maddock* [], and are an often‐used framework in river remote sensing [e.g., *Smith et al*., ; *Smith and Pavelsky*, ; *Pavelsky*, ]. AMHG is a recently discovered geomorphic phenomenon holding that the coefficients and exponents in traditional AHG are stably and predictably related for a given river, thus linking individual cross sections to one another along a river [*Gleason and Smith*, ]. Gleason and Smith found that the relationship between these quantities takes a semilog form, and used in situ measurements of *w*, *d*, *v*, and *Q* to demonstrate the phenomenon. *Gleason and Wang* [] further showed that AMHG arises because individual, independent rating curves all pass through the same values of *w* and *Q*, given in practice by the spatial modes of time mean *w* and *Q* at each cross section.
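To make the AHG width relation concrete, the sketch below fits *a* and *b* in *w* = *aQ*^{*b*} by ordinary least squares in log‐log space and then inverts the fitted relation for *Q*. It assumes paired width and discharge observations are available at one cross section, which is the gauged setting and not the AMHG algorithm itself; the function names and synthetic data are hypothetical.

```python
import math

def fit_ahg_width(discharges, widths):
    """Least-squares fit of log(w) = log(a) + b * log(Q); returns (a, b)."""
    xs = [math.log(q) for q in discharges]
    ys = [math.log(w) for w in widths]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(y_mean - b * x_mean)
    return a, b

def invert_width(width, a, b):
    """Invert w = a * Q^b for discharge."""
    return (width / a) ** (1.0 / b)

# Synthetic cross section that follows w = 10 * Q^0.3 exactly.
discharges = [50.0, 120.0, 300.0, 800.0]
widths = [10.0 * q ** 0.3 for q in discharges]
a_fit, b_fit = fit_ahg_width(discharges, widths)
q_back = invert_width(widths[2], a_fit, b_fit)
```

Because *Q* enters the inversion with exponent 1/*b*, small *b* values amplify width errors strongly, which is consistent with the poor performance reported for low‐*b* rivers.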

The AMHG discharge algorithm's base assumptions are that AHG parameters are constant in time and that mass is conserved in a reach. Discharge is calculated by estimating *a* and *b* in the width AHG relation (*w* = *aQ*^{*b*}) and inverting to solve for *Q* in a pairwise permutation assuming that *Q* is constant across all pairs. Thus, this algorithm only requires inputs of remotely sensed width at‐a‐station, which differentiates it from the other methods in this study. From a remotely sensed standpoint, this system is underconstrained even in a mass conserved reach: there are four variables per cross section (i.e., *w*, *Q*, *a*, and *b*), and only one of them can be remotely sensed (*w*). Thus, when assuming cross sections share a common discharge given a time series of width observations, there are always 2*N*_{c} + 1 unknowns for *N*_{c} cross sections at each time step: *a* and *b* at every cross section plus the shared *Q*.

Importantly, *Gleason et al*. [] recommended “global” parameterizations of the AMHG method for use in ungauged basins, and we follow these parameters here. *Gleason et al*. [] also showed that rivers in arid regions, braided rivers, and low‐*b* rivers (where all AHG *b* exponents are less than 0.1) reliably resulted in poor discharge inversion. We expect that these same exclusions will apply here, so we also include some “blind” data filters intended to improve AMHG performance. Since equation breaks down during overbank/floodplain flow, we first filter any widths that are 1–3 standard deviations above the mean, depending on the shape of the distribution of input width data. Second, in order to increase the number of rivers available to AMHG, we filter out cross sections that have a coefficient of variation (standard deviation divided by mean) in observed widths less than 10%; this is a much less stringent filter than is typically used, although it still results in exclusion of 8 of the 19 rivers by AMHG. Thus, we are able to estimate discharge for some rivers whose width data would suggest that they are unsuitable for AMHG estimation (as they are low *b*).
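The two “blind” filters can be sketched as follows. The threshold multiplier `k_sd` stands in for the 1–3 standard deviation choice that the text says depends on the shape of the width distribution; how that choice is made is not specified here, so it is left as a parameter. Function names and sample widths are hypothetical.

```python
import statistics

def filter_overbank_widths(widths, k_sd=2.0):
    """Drop widths more than k_sd standard deviations above the mean."""
    mean = statistics.mean(widths)
    sd = statistics.pstdev(widths)
    return [w for w in widths if w <= mean + k_sd * sd]

def passes_variability_filter(widths, min_cv=0.10):
    """Keep a cross section only if its width coefficient of variation >= min_cv."""
    cv = statistics.pstdev(widths) / statistics.mean(widths)
    return cv >= min_cv

# Hypothetical cross section with one overbank observation (300 m).
raw = [100, 105, 110, 95, 102, 98, 104, 101, 99, 300]
in_bank = filter_overbank_widths(raw)
keep_section = passes_variability_filter(in_bank)

# A second, more variable cross section.
keep_variable = passes_variability_filter([50, 80, 120, 60, 100])
```

In this toy case the first cross section loses its overbank width and is then rejected for low variability, while the second is retained.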

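Both the GaMo and MetroMan algorithms described later utilize the following form of Manning's equation:

$$Q = \frac{1}{n}\left(A_{0} + \delta A\right)^{5/3} W^{-2/3} S^{1/2},$$

where *n* is the Manning's roughness coefficient, *A*_{0} is the unobserved cross‐sectional flow area at a reference stage, δ*A* is the change in cross‐sectional area relative to *A*_{0} (which can be computed from the height and width observations), *W* is the top width, and *S* is the water surface slope. Both algorithms estimate *n* and *A*_{0}; *n* is assumed to be a constant in both space and time, while *Garambois and Monnier* [] showed that including inertia terms in the inverse hydraulic model would require high spatial resolution (i.e., subreach) knowledge of the river bathymetry profile.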
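As a concrete illustration, the sketch below evaluates Manning's equation in the wide‐channel form *Q* = (1/*n*)(*A*_{0} + δ*A*)^{5/3} *W*^{−2/3} *S*^{1/2}, which we assume is the form intended here (hydraulic radius approximated by area over top width). The function name and input values are hypothetical.

```python
def manning_discharge(n, a0, delta_a, width, slope):
    """Discharge (m^3/s) from Manning's equation with hydraulic radius ~ A / W.

    n       : Manning roughness coefficient
    a0      : unobserved reference cross-sectional area (m^2)
    delta_a : observed change in area relative to a0 (m^2)
    width   : top width (m)
    slope   : water surface slope (m/m), must be positive
    """
    area = a0 + delta_a
    if area <= 0.0 or slope <= 0.0:
        raise ValueError("flow area and slope must be positive")
    return (1.0 / n) * area ** (5.0 / 3.0) * width ** (-2.0 / 3.0) * slope ** 0.5

# Hypothetical reach: 500 m^2 of flow area, 100 m wide, slope of 10 cm/km.
q = manning_discharge(n=0.03, a0=500.0, delta_a=0.0, width=100.0, slope=1.0e-4)
```

The explicit check on slope matters in practice: with noisy observations on very flat reaches, the observed slope can become zero or negative, and the equation is then undefined.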

The GaMo algorithm described by *Garambois and Monnier* [] invokes continuity among reaches, assumes that flow is constant in space (*N _{R}* and

MetroMan begins with the same form of Manning's equation as GaMo and is similar to GaMo in that it analyzes a time series of height, width, and slope in order to optimally estimate *n* and *A*_{0} [*Durand et al*., ]. A difference between the two comes in the formulation of the set of equations to be minimized, where Manning's equation is substituted into the reach‐averaged continuity equation

$$\frac{\partial A}{\partial t} + \frac{\partial Q}{\partial x} = q,$$

where *q* represents lateral inflows into the reach; for all cases considered herein, *q* is zero. Note that ∂*A*/∂*t* = *W* ∂*H*/∂*t*, so the storage term can be evaluated from the observed width and height time series. This formulation leads to a set of equations with more constraints than unknowns when Manning's equation is substituted into the mass balance equation.
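A finite‐difference view of the mass balance may help. The sketch below evaluates the residual of the 1‐D continuity equation, ∂*A*/∂*t* + ∂*Q*/∂*x* = *q*, for a single reach, approximating the storage term as *W* ∂*H*/∂*t*. This discretization is our own illustrative assumption and not necessarily the one used in MetroMan; all names and values are hypothetical.

```python
def continuity_residual(width, dh_dt, q_upstream, q_downstream,
                        reach_length, lateral_inflow=0.0):
    """Residual of dA/dt + dQ/dx - q for one reach (m^2/s).

    width          : reach-averaged top width (m)
    dh_dt          : rate of change of water surface elevation (m/s)
    q_upstream     : discharge entering the reach (m^3/s)
    q_downstream   : discharge leaving the reach (m^3/s)
    reach_length   : reach length (m)
    lateral_inflow : lateral inflow per unit length (m^2/s)
    """
    storage_change = width * dh_dt                                  # dA/dt ~ W * dH/dt
    flux_divergence = (q_downstream - q_upstream) / reach_length    # dQ/dx
    return storage_change + flux_divergence - lateral_inflow

# Rising stage stores water, so outflow is smaller than inflow.
residual = continuity_residual(width=100.0, dh_dt=1.0e-5,
                               q_upstream=500.0, q_downstream=490.0,
                               reach_length=10_000.0)
```

When the discharges are themselves computed from Manning's equation with trial values of *n* and *A*_{0}, driving residuals like this toward zero across reaches and times is what constrains the unknown parameters.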

To solve for the optimal *n* and *A*_{0}, a Bayesian approach [*Metropolis et al*., ; *Gelman et al*., ] is invoked, to obtain both an optimal parameter estimate and an estimate of uncertainty. A Markov Chain Monte Carlo (MCMC) approach is utilized. MetroMan was demonstrated on the Severn River using water elevation measured at three gauges and cross‐sectional area measured in situ [*Durand et al*., ]. Several experiments with different assumptions regarding *q* led to relative RMSE ranging from 10% to 36%. A follow‐on study compared and contrasted algorithm performance on the Sacramento and Garonne Rivers with synthetic data and demonstrated that discharge can be estimated to within 9% and 15% RMSE with MetroMan, and also demonstrated different sensitivity to measurement errors [*Yoon et al*., ]. The MetroMan approach has been tested on a small range of rivers with good success in previously published literature, so success was expected here. Note that no roughness coefficient variability with stage is included within the algorithm, so when those conditions exist (e.g., during out‐of‐bank flow events), the algorithm is not expected to perform well.
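For readers unfamiliar with MCMC, the sketch below shows a generic random‐walk Metropolis sampler applied to a toy one‐parameter target: a Gaussian centered on *n* = 0.03 and restricted to positive values. It is not the MetroMan posterior, which involves *A*_{0}, *n*, and the mass balance constraints; the target, step size, seed, and function names are all hypothetical.

```python
import math
import random

def metropolis(log_post, x0, step_sd, n_iter, rng):
    """Random-walk Metropolis sampler for a scalar parameter.

    Returns the chain of samples and the fraction of accepted proposals.
    """
    samples = []
    x, lp = x0, log_post(x0)
    accepted = 0
    for _ in range(n_iter):
        proposal = x + rng.gauss(0.0, step_sd)
        lp_prop = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            x, lp = proposal, lp_prop
            accepted += 1
        samples.append(x)
    return samples, accepted / n_iter

def log_target(n):
    """Toy log density: Gaussian (mean 0.03, sd 0.005), positive n only."""
    if n <= 0.0:
        return float("-inf")
    return -0.5 * ((n - 0.03) / 0.005) ** 2

rng = random.Random(1)
samples, acceptance = metropolis(log_target, x0=0.03, step_sd=0.005,
                                 n_iter=20_000, rng=rng)
kept = samples[2_000:]                 # discard burn-in
posterior_mean = sum(kept) / len(kept)
```

The spread of `kept` provides the uncertainty estimate that accompanies the point estimate, which is the practical appeal of the Bayesian formulation.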

We changed two relatively minor components of the algorithm for this study. First, MetroMan was adapted to utilize a prior estimate of mean annual flow and *n*; previous studies utilized prior estimates of *A*_{0} and *n*, directly. A prior *n* value of 0.03 was assumed for each reach. The prior mean and standard deviation of *A*_{0} were calculated for each reach using the same approach as the MFCR (described in section 3.6). This was done using a Monte Carlo method, assuming that uncertainty in the mean annual flow estimate followed a lognormal distribution with coefficient of variation equal to one. Possible mean flow values were simulated from this distribution; for each possible mean flow value, a corresponding *A*_{0} value was calculated by solving equation , yielding a mean and standard deviation for *A*_{0}. Values of *A*_{0} are furthermore limited to be greater than the minimum *A*_{0}; this has the nontrivial advantage that base flow is the same for all reaches within a given river, and therefore many fewer iterations are required to explore a wide range of flow values. The prior for base flow is calculated from the prior on mean flow, using similar methods to those described for *A*_{0}.
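The Monte Carlo construction of the *A*_{0} prior can be sketched as follows. A lognormal distribution with mean *m* and coefficient of variation CV has log‐space variance ln(1 + CV^{2}) and log‐space mean ln *m* minus half that variance; with CV = 1 the variance is ln 2. Each sampled mean flow is converted to an area by inverting the wide‐channel Manning's equation at time‐mean width and slope, treating the mean δ*A* as zero, which is a simplifying assumption of this sketch. Function names and inputs are hypothetical.

```python
import math
import random
import statistics

def lognormal_params(mean, cv):
    """Log-space (mu, sigma) of a lognormal with the given mean and CV."""
    sigma2 = math.log(1.0 + cv ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)

def a0_from_mean_flow(q_mean, n, width, slope):
    """Invert Q = (1/n) A^(5/3) W^(-2/3) S^(1/2) for the flow area A."""
    return (q_mean * n * width ** (2.0 / 3.0) / math.sqrt(slope)) ** 0.6

def a0_prior(q_prior, n, width, slope, n_draws, rng, cv=1.0):
    """Prior mean and standard deviation of A0 from sampled mean flows."""
    mu, sigma = lognormal_params(q_prior, cv)
    draws = [a0_from_mean_flow(rng.lognormvariate(mu, sigma), n, width, slope)
             for _ in range(n_draws)]
    return statistics.mean(draws), statistics.stdev(draws)

rng = random.Random(0)
a0_mean, a0_sd = a0_prior(q_prior=500.0, n=0.03, width=100.0,
                          slope=1.0e-4, n_draws=5_000, rng=rng)
```

Because area scales with discharge to the 3/5 power, the resulting *A*_{0} prior is narrower in relative terms than the prior on mean flow.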

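The MFG algorithm uses the so‐called wide‐channel approximation [*Tinkler*, ], leading to a form of Manning's equation that approximates river depth as the difference between WSE (*H*) and the cross‐sectional average river bathymetry (*B*):

$$Q = \frac{1}{n}\, w \left(H - B\right)^{5/3} S^{1/2},$$

where *w* is the top width, *S* is the water surface slope, and *n* is the roughness coefficient.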

MFG assumes that an acceptably accurate estimate of mean annual flow will be available for SWOT rivers. Mean annual flow values are calculated by averaging daily streamflow predictions of the water balance model (WBM) [e.g., *Wisser et al*., ] at a spatial resolution of 6 min, from 1961 to 2010. Grid resolution is on the order of 100 km^{2}, and thus this model is adequate to provide a mean annual flow estimate for rivers SWOT will observe, which typically have a drainage area on the order of 10,000 km^{2} or greater [*Pavelsky et al*., ]. An estimate of the roughness coefficient is computed from a relation that scales a mean value of the roughness from observations of width and stage. The relation, given by equation , was derived from analysis of field data from a number of USGS gauges. In it, *c*_{0} and *x* are empirical coefficients, *n*_{0} is a static, reference value of *n*, and the overbar indicates time averages. The value of *x* is typically less than zero, such that *n* increases at low flow, in agreement with expectations. The MFG algorithm is not applicable during overbank flows, and does not allow, e.g., *n* to increase at high flow to account for out‐of‐bank conditions. Values of *x* are calculated from time series observations of *w* and *H*; *B* is calculated in order to match the time series of discharge estimates with an expected value of the mean annual flow, derived in this study from WBM.

MFG relies on a series of observations of width and stage that covers a range of flow conditions such that mean values of the flow dynamics begin to be approximated.

Equation was developed from a database of streamflow measurement data published in *Bjerklie et al*. []. With these relations, the variable Manning's *n* and cross‐section shape are estimated, and the value of *B* is calibrated by fitting the mean of the observed time series estimates of discharge to the mean discharge obtained from a water balance model or some other independent source.
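The calibration of *B* against a prior mean flow can be sketched with a bisection search. The sketch assumes the wide‐channel form *Q* = (1/*n*) *w* (*H* − *B*)^{5/3} *S*^{1/2} and, for simplicity, a constant *n*, whereas MFG itself lets *n* vary with flow. Mean discharge decreases monotonically as *B* rises, so bisection is sufficient. The function names, the `max_depth` search bound, and the synthetic data are hypothetical.

```python
def mfg_discharge(n, width, height, bed, slope):
    """Wide-channel Manning discharge with depth = height - bed."""
    return (1.0 / n) * width * (height - bed) ** (5.0 / 3.0) * slope ** 0.5

def calibrate_bed(q_prior_mean, n, widths, heights, slopes,
                  max_depth=100.0, tol=1.0e-6):
    """Find bed elevation B so the mean estimated flow equals q_prior_mean."""
    hi = min(heights)        # bed cannot lie above the lowest observed WSE
    lo = hi - max_depth      # deepest bed considered (assumed search bound)

    def mean_q(bed):
        flows = [mfg_discharge(n, w, h, bed, s)
                 for w, h, s in zip(widths, heights, slopes)]
        return sum(flows) / len(flows)

    if not (mean_q(hi) <= q_prior_mean <= mean_q(lo)):
        raise ValueError("prior mean flow is outside the searchable range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_q(mid) > q_prior_mean:
            lo = mid         # too much flow: raise the bed
        else:
            hi = mid         # too little flow: lower the bed
    return 0.5 * (lo + hi)

# Synthetic test: generate flows with a known bed, then recover it.
widths = [200.0, 210.0, 220.0, 205.0]
heights = [100.0, 101.0, 102.0, 100.5]
slopes = [1.0e-4] * 4
true_flows = [mfg_discharge(0.03, w, h, 95.0, s)
              for w, h, s in zip(widths, heights, slopes)]
prior_mean = sum(true_flows) / len(true_flows)
bed = calibrate_bed(prior_mean, 0.03, widths, heights, slopes)
```

Any bias in the prior mean flow passes directly into *B* and hence into every discharge estimate, which is why results for this family of methods track the quality of the prior.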

Both the MetroMan and GaMo algorithms perform optimization not on the entire time series, but rather on a subset. This is done in general to reduce the computation costs of matrix inversions [see *Garambois and Monnier*, ; *Durand et al*., ]. In this paper, the identification periods for both algorithms for each river are chosen in order to include significant changes in stage, and to avoid extreme out of bank events. Identification periods are given for each algorithm in Table . For a given river, *A*_{0} and *n* inverted from the identification period are then used to calculate *Q* over the whole time series.

The “mean flow with constant roughness” (MFCR) approach simply assumes that *n* is 0.03 and uses the WBM mean annual flow estimate described in the previous section. Then, the MFCR algorithm estimates *A*_{0} in equation such that Manning's equation applied to the average height, width, and slope gives a mean discharge equal to the WBM prior mean, subject to the constraint that the sum of *A*_{0} and any of the *Smith et al*. []. The second additional approach is an ensemble product, calculated by taking the median of the flow produced by each of the five algorithms at each time step. We hypothesize that this ensemble product will be more stable than any single algorithm, and discharge estimated this way should be the most consistent from river to river.
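The ensemble product is straightforward to compute. The sketch below takes the median across algorithms at each time step; skipping algorithms that produced no estimate (marked `None`, as might happen where AMHG was not run) is our assumption about how gaps would be handled. The function name and toy values are hypothetical.

```python
import statistics

def ensemble_median(estimates):
    """Median discharge across algorithms at each time step.

    estimates maps algorithm name -> list of discharges (None = no estimate).
    """
    n_steps = len(next(iter(estimates.values())))
    medians = []
    for t in range(n_steps):
        values = [series[t] for series in estimates.values()
                  if series[t] is not None]
        medians.append(statistics.median(values))
    return medians

# Hypothetical two-day estimates (m^3/s) from the five algorithms.
estimates = {
    "AMHG": [100.0, 200.0],
    "GaMo": [110.0, 190.0],
    "MetroMan": [90.0, 210.0],
    "MFG": [300.0, 50.0],
    "MFCR": [105.0, 205.0],
}
median_flow = ensemble_median(estimates)
```

The median is insensitive to a single outlying algorithm (here, the hypothetical MFG values), which is the motivation for expecting it to be stable. By construction it reproduces one member's value at each time step when the count is odd, so it need not have the lowest error on any given river.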

Time series of river discharge estimated on each river are shown in Figure . Table summarizes the performance of each of the six algorithms for every river according to each of the three metrics described in section 2. RRMSE is perhaps the most important assessment metric, as it measures how well each algorithm performed across the entire time series of observations. Thus, RRMSE can be considered the central “accuracy” metric in this study, and is directly comparable across scales for the 19 rivers here. We use an RRMSE of 35% as a threshold for algorithm performance; this number is admittedly arbitrary, but was chosen to improve upon the accuracy values cited for global models (40%) in the introduction. RRMSE values ranged from 5% to greater than 100% across all rivers and all algorithms. Viewed collectively, at least one algorithm had an RRMSE less than 35% for 14/19 rivers, and the grand median RRMSE (across all six algorithms and 19 rivers) was 55%; note that some of the algorithms yielded highly biased results, such that the mean across all rivers and algorithms is 87%. Five of the six algorithms had the best estimation RRMSE for at least one river, with MetroMan having the best performance for eight rivers, MFCR performing best for four rivers, GaMo and MFG performing best for three rivers each, and AMHG giving the best estimate on one river. Interestingly, the ensemble median algorithm did not give the best result on any river, despite selecting the median flow value of the other five algorithms at every point in time. At least one algorithm had an RRMSE greater than 100% in 9/19 rivers. Figure summarizes and compares the performance of each algorithm.

Bold typeface indicates best performer among algorithms for each river and each metric.

Braided rivers were particularly difficult to estimate, as none of the Ganges, Platte, or Tanana Rivers had an algorithm register an RRMSE less than 35%. In addition, the Ganges and Platte Rivers each had three algorithms record an RRMSE greater than 100%. This mirrors the findings of *Gleason et al*. [], and further confirms the difficulty of estimating discharge via remote sensing in braided rivers. Removing these braided rivers from consideration results in 14/16 rivers having at least one discharge RRMSE less than 35%.

The Cumberland and Kanawha Rivers are the two nonbraided rivers for which no algorithm hits the 35% RRMSE mark or better. The case of the Cumberland River highlights the unforgiving nature of the RRMSE metric in characterizing algorithm performance in periods of low flows. As can be seen in Figure , beginning from approximately day 90, discharge was reduced to less than 250 m^{3}/s, though discharge had averaged 1481 m^{3}/s from days 1 to 90, and all of the algorithms capture this rapid drop in flow. For example, from days 1 to 90, MetroMan performance (characteristic of other algorithms for the Cumberland) was arguably quite good: MRR was −21.8%, and SDRR was 2.8%, giving an RRMSE of 22.0%. From days 91 to 162, MetroMan performance was quite poor: MRR was only −5.2%, but SDRR was 62.5%. Indeed, from days 91 to 162, low‐head dams on the Cumberland constrained the slope to less than 1 cm/km, as shown in Figure : the average slope from days 91 to 162 was 0.4 cm/km. These low‐slope conditions generally were still handled fairly well by Manning's equation; note that the upstream reaches (e.g., reach 1 on the Cumberland in Figure ) have steeper slopes than downstream, a fact that the algorithms can exploit. However, for a period of 20 days (days 125–144), the slope dropped to below 0.1 cm/km for all four reaches, and was sometimes negative; the average during this time was 0.07 cm/km. Recall that a slope error of 0.1 cm/km was added to all slope observations. Manning's equation led to very high relative errors, as both slopes and flows approached zero, even though absolute errors were quite minor, during this time. This combination of hydraulic conditions meant that overall RRMSE for MetroMan was 44.6%. In contrast to the Cumberland, the SDRR for the Kanawha was generally negligible for all algorithms; the RRMSE is dominated by bias, with MetroMan again representative, with an MRR of −59.6%. 
This poor performance can likely be attributed to a poor prior estimate of flow. The MFCR in this case has an MRR of −60.2%; this is the most negative estimate except for the Tanana River. Sensitivity tests have shown that the MetroMan estimation ability is especially impacted by low‐biased prior estimates of mean streamflow. Thus, MetroMan estimation accuracy is expected to be better for a prior with a positive bias than for a negative bias. AMHG was not run for the Kanawha, as only 1.5% of cross sections passed the low‐b filter. This combination of a strongly negative bias in the prior and the inability to use AMHG highlights a weakness in the overall set of algorithms.
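The Cumberland numbers quoted above can be cross-checked with a short sketch. It assumes that MRR and SDRR are the mean and (population) standard deviation of the relative residuals, and that RRMSE is the root-mean-square of the same residuals, so that RRMSE² = MRR² + SDRR². The function names are hypothetical and are not taken from the study's code.

```python
import math

def rrmse_from_parts(mrr, sdrr):
    """Combine bias (MRR) and spread (SDRR) of relative residuals, in percent.

    Assumes RRMSE^2 = MRR^2 + SDRR^2, which holds when SDRR is the
    population standard deviation of the relative residuals.
    """
    return math.hypot(mrr, sdrr)

def pooled_rrmse(periods):
    """Pool RRMSE over several periods given (n_days, mrr, sdrr) tuples."""
    total_days = sum(n for n, _, _ in periods)
    mean_sq = sum(n * (mrr ** 2 + sdrr ** 2) for n, mrr, sdrr in periods) / total_days
    return math.sqrt(mean_sq)

# MetroMan on the Cumberland, values as quoted in the text
early = (90, -21.8, 2.8)   # days 1 to 90
late = (72, -5.2, 62.5)    # days 91 to 162

print(round(rrmse_from_parts(-21.8, 2.8), 1))  # 22.0, matching the reported value
print(round(rrmse_from_parts(-5.2, 62.5), 1))  # 62.7 for the low-flow period
print(round(pooled_rrmse([early, late]), 1))   # 44.9, close to the reported 44.6
```

Under these assumptions the pooled value differs from the reported 44.6% only slightly, which is plausibly due to rounding of the quoted MRR and SDRR. The sketch also makes the asymmetry explicit: the first period is almost entirely bias, while the second is almost entirely spread.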

In addition to overall performance, the stability of the algorithms across rivers is of critical importance to the SWOT mission and to discharge inversion more generally. Stability is assessed by SDRR and its relationship to the MRR per river, and also by summarizing RRMSE across rivers. For all 19 rivers, at least one algorithm had an SDRR less than 31%. We had hypothesized that the ensemble median would be the most stable algorithm, and indeed it had an SDRR less than 30% in 13 cases, tying for best SDRR performance with GaMo, which also had 13 rivers less than 30% SDRR. For the ensemble median algorithm, the median SDRR value across the 16 nonbraided rivers was 12.5%. Considering the summary metrics in Table , MFG and MFCR, two approaches that require an a priori flow, performed quite well; one of these two had the lowest SDRR for 8/19 rivers.

Overall, most algorithms had an even distribution of negative and positive residual bias across rivers, as indicated by MRR. However, AMHG had a positive bias for 7/11 rivers (note that eight rivers were considered unsuitable for AMHG estimation as they were either braided or did not pass the width variability filters described in section 3.1), and MFG had a strong negative bias, underestimating flow in 16/19 cases. The difference between the WBM mean annual flow estimate and the mean true flow for each river (see Table ) can be compared with the MRR results in Table . Across all 19 rivers, the median WBM bias is −37.9%. Five of the six algorithms (AMHG, GaMo, MetroMan, MFCR, and the median) have a median MRR that is less than WBM.

Figure shows that certain rivers were more easily estimated than others. In particular, all algorithms had an RRMSE less than 45% for the Seine, confirming it as the most easily estimated river. In addition, we might expect that the MetroMan and GaMo algorithms would perform similarly from river to river, as each is based on solving for unknown parameters in Manning's equation. This is confirmed in our results, as these two algorithms both had an RRMSE less than 35% for the Downstream Sacramento, Seine, Po, Upstream Mississippi, and Ohio Rivers (Figure ). Also, while estimations for the Cumberland and Kanawha Rivers were poorer than for other rivers, all algorithms but AMHG had very similar estimation accuracies in these rivers. Beyond predictably poor performance in braided rivers, there are no apparent trends in algorithm performance based on river size, flow range, WSE variability, latitude, or hydraulic regime. A possible exception here is the Platte, for which algorithms performed quite poorly and which was the only kinematic river analyzed. Note that kinematic rivers show no time variability in water surface slope, which for some algorithms similar to those tested here leads to poor performance [*Durand et al*., ]. However, since the Platte is also braided, and other kinematic rivers were not included, no firm conclusion can be drawn; further work is required. Given that the data here are models of rivers forced with imposed flows, comparison of algorithm performance against morphology is inappropriate. However, *Gleason et al*. [] found no apparent trends in estimation accuracy for the AMHG algorithm based on river morphology in their study, so the lack of clear physical controls on discharge estimation accuracy is perhaps unsurprising.

This discharge algorithm intercomparison has highlighted both successes and failures of the algorithms thus far. First, at least one algorithm estimated discharge to within 35% RRMSE for 14/16 nonbraided rivers. Second, bias has proven to be a significant component of these errors; the median algorithm had an SDRR less than 30% for 13/19 rivers, but an RRMSE less than 30% for only 5/19 rivers. Third, no single algorithm has emerged as the ideal approach. Instead, we hypothesize that the community will likely need to move forward with multiple approaches. These results can be used to map future algorithm developments within the SWOT community.

Simply put, the addition of more a priori data and site‐specific assumptions should make each of the algorithms more effective. However, this study was truly blind. Thus, in all cases, there was information we knew about certain rivers that we purposefully did not include in our algorithms: all methods received the same data and were forced to make the same blind assumptions. Some prior estimate of roughness, AHG, discharge, or some other variable is available for every measured or modeled river on the planet. Leveraging these data should improve the performance of the methods here, as will using these methods in conjunction with hydrologic models in assimilation schemes, which is the likely way forward for using SWOT to estimate global river discharge.

The variable nature of algorithm performance in this study confirms the difficulty of the problem and the value of such a large‐scale comparison. Each of the AMHG, GaMo, and MetroMan algorithms performed worse than their previously published accuracies in some cases, although *Gleason et al*. [] also found varied performance for the AMHG algorithm in their study of 34 rivers. These mixed results occur despite the fact that inputs to the algorithms were “perfect,” i.e., not degraded to match expected temporal sampling and observation error likely from SWOT. *Yoon et al*. [] found that MetroMan accuracy was not significantly affected by using SWOT temporal sampling and expected measurement errors on the Sacramento River as compared with daily sampling, but this result will likely vary among algorithms and rivers. When faced with real‐world data, these algorithms will behave differently, and likely worse, than they have here if algorithms are not improved or ancillary data as described above are not included.

Our results strongly point to the need for algorithm synergy when conceiving of a SWOT discharge product that will be viable for all of the world's rivers. We have tested the easiest likely approach for this synergy (the ensemble median), but each of these algorithms should be able to inform and constrain one another. For example, we may find that some algorithms tend to perform poorly for certain kinds of rivers, and in these cases, we ought to exclude those algorithms from the ensemble. As another example, it may be beneficial to create hybrid approaches built by combining components of the various algorithms; for instance, the variable *n* of MFG could be combined with MetroMan or GaMo. Exploring these kinds of synergy should be a key goal of future algorithm development, together with identification of the best ancillary data for use with these methods.

It is critical to begin to be able to predict algorithm performance based on river characteristics. Perhaps surprisingly, the hydraulic regime analysis (Froude and kinematic wave numbers shown in Figure ) did not correlate with algorithm performance, except that the one truly kinematic river (the Platte) performed quite poorly. We continue to explore this by discussing several conditions represented in the test cases: floodplain flow, low‐head dams, and multiple channels. It had been hypothesized prior to this algorithm intercomparison that performance would suffer during floodplain flow. An interesting case in this regard is the Po River. Figure clearly shows out‐of‐bank flows: some reaches increase their width by an order of magnitude during the high flow event following day 300 of the simulation. The MFG algorithm is formulated with the assumption of in‐bank flow, and overestimates flow by a large margin during this period; this leads to an underprediction of flow during the low‐flow periods, an overall MRR of −66.1%, and an RRMSE of 73.6%. However, the GaMo and MetroMan algorithms perform well, with RRMSE of 29.8% and 13.9%, respectively. Thus, the simple Manning formulations appear adequate in and of themselves, even in the presence of significant floodplain flow. This is in spite of the fact that the calibrated hydraulic models used to generate the synthetic observations in this study typically used different values of *n* in the floodplain versus the channel, a fact not accounted for in the discharge algorithms. As a caveat to this point, it should be noted that the Po River model is built in 1‐D; thus, we can really only conclude that 1‐D floodplain flow is resolved by these algorithms.

Performance in the presence of low‐head dams and human management was an open question prior to this study. Significant dams are present on the Ohio River, leading to quite low slopes, as shown in Figure . However, the MetroMan algorithm performs adequately on the Ohio, with RRMSE = 33.5% and SDRR = 11.5%. Manning's equation captured the friction losses, even in this case with low slopes and significant backwater profiles. Multiple channels were also expected to degrade performance. The Seine is an example of a multichannel river; throughout the domain utilized here, the river often splits into two channels. The simplest solution of merging the top widths and averaging the water surface heights [*Schubert et al*., ] led to adequate performance here: AMHG, GaMo, and MetroMan had RRMSE of 33.9%, 22.5%, and 9.1%, respectively. Thus, the presence of multiple channels does not necessarily cause substantial degradation of algorithm performance.
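The multichannel treatment described for the Seine can be illustrated with a minimal sketch. It assumes that "merging" means summing the top widths of the parallel channels and that "averaging" is an unweighted mean of their water surface elevations; a width-weighted mean would be an equally plausible reading. The function name and the sample values are hypothetical.

```python
def merge_channels(widths_m, wse_m):
    """Collapse parallel channels at one cross section into one equivalent channel.

    widths_m: top width of each channel (m)
    wse_m:    water surface elevation of each channel (m)
    Returns (total top width, mean water surface elevation).
    """
    if not widths_m or len(widths_m) != len(wse_m):
        raise ValueError("need one width and one elevation per channel")
    total_width = sum(widths_m)
    # Unweighted mean; this is an assumption about how heights were averaged
    mean_wse = sum(wse_m) / len(wse_m)
    return total_width, mean_wse

# Hypothetical two-channel cross section
width, wse = merge_channels([80.0, 60.0], [24.10, 24.30])
print(width, round(wse, 2))
```

The merged width and elevation can then be passed to any of the single-channel algorithms unchanged, which is what makes this the simplest available option.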

Finally, one source of error not considered or evaluated here is error in the hydraulic models themselves. As each model is composed of approximate hydraulic solutions imposed on in situ data of varying quality at varying spatial sampling and with simplified physics compared to real‐world flows, it is quite likely, indeed almost certain, that reported widths and WSE values corresponding to reported discharges contain some error. How this error contributes to discharge estimation will vary by algorithm and by river, and assessing this effect on our estimations is well outside our purposes here. However, to the extent that the flow laws as implemented in the algorithm represent reality (such as with the empirical components of MFG), this effect should contribute little to discharge algorithm error.

AMHG performed much more poorly in this study than in previous publications of the method, and this can be almost exclusively attributed to issues caused by the greatly increased number of observations used here. Previously, the method had only been tested with a maximum of 20 observations, and the increase to yearly or longer periods of observations caused significant issues with both AHG and AMHG. This increased number of observations also rendered the previous remotely sensed proxy for AMHG [*Gleason and Smith*, ] ineffective at adequately characterizing AMHG, as it is mathematically constrained to a value of 0.5 when there are order‐of‐magnitude changes in observed widths, which occurred in numerous study rivers. An incorrect AMHG proxy gives an incorrect relationship between AHG parameters, and therefore inverted AHG curves are constrained to a hydraulic space outside observed values. In addition, the minimum and maximum imposed flows in the AMHG algorithm were developed and tested on rivers other than those here (*Gleason and Smith* [] give “global” values of *Q*_{min} = minimum observed width × 0.5 m depth × 0.1 m/s velocity,

There are also hydrologic and geomorphological reasons why more observations lead to poorer AMHG performance. For example, *Bonnema et al*. [] show that for the Ganges River, AMHG exhibits a sharp improvement in accuracy when wet and dry seasons are estimated separately, while *Gleason and Hamdan* [] show the same result for the Ganges (using different data from *Bonnema et al*. []), noting a reduction from 56% to 28% RRMSE when considering dry‐season flows only. This is because these wet and dry flows (or flood and nonflood regime for temperate rivers) exhibit different AHG behavior and have much different time mean *w* and *Q* values, leading to a situation where discharge cannot be inverted [*Gleason and Wang*, ]. Additionally, when flows vary widely, AHG breaks down and must be represented with multiple power laws: one for each distinct channel geometry. Thus, it is recommended that AMHG be performed in a stepwise manner when many observations are available, as binning observations into those with similar magnitude widths should assure a single AHG curve per bin. As with all other algorithms, including these kinds of river‐specific constraints will improve AMHG performance.
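To illustrate why binning by width can help, the sketch below fits a separate at-a-station power law, assumed here to take the form w = a·Q^b, to observations below and above a width threshold. The data are synthetic and the discharge is known only for that reason; in an actual AMHG inversion the binning would have to rely on observed widths alone. All names, the threshold, and the coefficients are hypothetical.

```python
import math

def fit_power_law(q, w):
    """Least-squares fit of w = a * q**b in log-log space; returns (a, b)."""
    x = [math.log(v) for v in q]
    y = [math.log(v) for v in w]
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = math.exp(y_mean - b * x_mean)
    return a, b

def fit_ahg_by_width_bin(q, w, width_break):
    """Fit one power law to narrow observations and another to wide ones."""
    narrow = [(qi, wi) for qi, wi in zip(q, w) if wi < width_break]
    wide = [(qi, wi) for qi, wi in zip(q, w) if wi >= width_break]
    return {
        "narrow": fit_power_law(*zip(*narrow)),
        "wide": fit_power_law(*zip(*wide)),
    }

# Synthetic two-stage channel: in-bank b = 0.1, out-of-bank b = 0.6
q_low = [100.0, 200.0, 400.0, 800.0]
q_high = [1600.0, 3200.0, 6400.0]
q = q_low + q_high
w = [50.0 * v ** 0.1 for v in q_low] + [2.0 * v ** 0.6 for v in q_high]

fits = fit_ahg_by_width_bin(q, w, width_break=120.0)
single = fit_power_law(q, w)
print(fits["narrow"], fits["wide"], single)
```

In this synthetic case each bin recovers its own exponent, whereas the single fit over all observations returns an intermediate exponent that describes neither flow regime.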

The GaMo algorithm relied upon bounds for *n* taken from usual values found in the literature [e.g., *Chow*, ], and bounds for low‐flow cross‐sectional area derived from observed cross‐section variations. Presumably, it is because of the use of these bounds that some of the very large RRMSE values observed with MetroMan are not observed with GaMo (compare Table ), even though they are based on similar hydraulic flow laws. Note too, that despite relying on

Future improvements to GaMo could involve adding additional constraining equations, or adding additional prior information, for example, to reduce the optimization zone in hydraulic parameter space. If discharge was supposed constant in space, the number of unknowns is *Garambois and Monnier*, ]. This approach could be of interest for (large) rivers with small changes in flow between several reaches. *Garambois and Monnier* [] also derived a robust and accurate inference method in the case that one in situ water depth measurement is available. Indeed, in this case, an explicit expression for the channel bed elevation is available as a function of the water surface slope and width, independent of *n*. Next, given this channel bed elevation, an approach similar to GaMo allows accurate computation of the inflow discharge.

This study highlighted one limitation of MetroMan in its current formulation, namely its sensitivity to the prior estimate of discharge. From a Bayesian point of view, this is not a drawback, but a philosophical decision: if prior information about the river is available, it ought to be used, as in any of the algorithms. However, this does represent a limitation in this study; if the prior estimate of discharge is poor, then the MetroMan discharge estimate will also be poor. This places MetroMan in a middle ground of the algorithms, as AMHG utilizes no prior information and MFG and MFCR completely rely on the prior estimate of flow. Having these different philosophies as part of the overall algorithm suite is important at this point in discharge algorithm development. Interestingly, MetroMan appears to be more sensitive to an underestimation of the prior mean annual flow than to overestimation. Sensitivity tests on the Po, Sacramento Downstream, and the Tanana indicated that MetroMan performs far better for a 50% overestimation of mean annual flow than it performs for a 50% underestimation. The cause of this behavior will be explored in future work, and could in principle be addressed by running several MCMC chains with different mean values.

An additional drawback of MetroMan (and indeed MFCR and GaMo, as well) is that it assumes that *n* (or AHG in AMHG) does not change in time, i.e., with flow depth. Thus, any changes in bed or bank material, increases in debris, or reorganization of channel geometry will lead to suboptimal discharge estimation. This decision was originally made in algorithm formulation in order to limit the number of unknowns, but it agrees with the hydraulic models in this study. In most of these models, *n* does not vary in time at each cross section, except for marginal changes during out of bank flow. Therefore, this assumption is secure for these model data where debris and changes in substrate are generally not considered, but less secure for future observational data.

An additional drawback is that MetroMan invokes Manning's equation at a river reach, not at a cross section. It is possible that Manning's equation does not hold for reaches, even when using the true reach‐averaged roughness coefficient, and even if Manning's equation holds at each cross section. To analyze this, we estimated the “effective” reach average roughness coefficient for each river by calculating it at each time step from reach‐average hydraulic quantities:
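One way such an effective coefficient could be computed is sketched below. It assumes the wide-channel form of Manning's equation, Q = (1/n) A^(5/3) W^(−2/3) S^(1/2), with reach-averaged cross-sectional area A, top width W, and slope S; this is an illustrative form and not necessarily the exact expression used in the study. The function names and sample values are hypothetical.

```python
import math

def manning_discharge(n, area, width, slope):
    """Wide-channel Manning's equation: Q = (1/n) * A^(5/3) * W^(-2/3) * S^(1/2)."""
    return area ** (5.0 / 3.0) * width ** (-2.0 / 3.0) * math.sqrt(slope) / n

def effective_roughness(q, area, width, slope):
    """Invert the same equation for n using reach-averaged quantities.

    q: discharge (m^3/s); area: cross-sectional area (m^2);
    width: top width (m); slope: water surface slope (dimensionless).
    """
    if min(q, area, width, slope) <= 0:
        raise ValueError("all inputs must be positive (zero or negative slopes cannot be inverted)")
    return area ** (5.0 / 3.0) * width ** (-2.0 / 3.0) * math.sqrt(slope) / q

# Round trip on hypothetical reach-averaged values, one time step
q = manning_discharge(n=0.03, area=500.0, width=200.0, slope=1e-4)
print(effective_roughness(q, area=500.0, width=200.0, slope=1e-4))
```

Applied at every time step, a calculation of this kind yields a time series of effective roughness that can be compared with the average of the cross-section roughness values in the same reach.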

In Figure (left), the effective roughness *n*; the former is calculated from (8), while the latter is calculated as the average value of the roughness coefficient across all cross sections in a given reach. As flow decreases,

Figure (right) shows that the more *n* in MetroMan, and presumably other algorithms; work to incorporate the time‐varying *n* from MFG into MetroMan framework is ongoing.

It would be reasonable to assume that the significant increase in *n* to be temporally invariant, and solve for *A*_{0} and *n* by requiring continuity among reaches. However, the GaMo discharge results do not diverge for high values of

Due to the empirical determination of some parameters, this method is not expected to work well for braided rivers and tidal reaches, where the controls on the flow resistance (and thus Manning's *n*) are very different from those in single or slightly anastomosed channels. Additionally, the accuracy and applicability of the algorithm are expected to be limited by the need for the prior mean annual flow. By design, the MRR for the MFG algorithm is highly correlated (R^{2} = 0.86) with the WBM bias. For example, the WBM mean annual flow estimate for the upstream Garonne underestimates the flow during the experiment by 65%, while the MFG MRR is −63%. One additional limitation of MFG is that it is not expected to work well for floodplains, since out‐of‐bank flow was not used in deriving the relations. Indeed, MFG performs relatively poorly for the Po (RRMSE = 73.6%) due in part to significant overestimation of high flows (see Figure ).

For the application of this algorithm, future efforts should focus on improving the empirical aspects of the method with additional site‐specific information, and over time with more observations. This is particularly important in reaches that are braided or tidally influenced, where unique, more specific relations may be developed. It is also possible that other geomorphologic information about the channel planform (including meander length, sinuosity, and channel type) can inform the estimation of Manning's *n*, as indicated by *Bjerklie* []. Additionally, the Manning equation, which forms the physical basis for this method, also forms the basis for the MetroMan, GaMo, and MFCR algorithms, and as such the three independent methods should eventually converge to similar values provided the assumptions and relations used in each corroborate one another. For example, the optimization scheme used in MetroMan or GaMo to optimize *A*_{0} and *n* could also be used to optimize the value of *B*, and the value of *n* in MetroMan and GaMo could be estimated a priori from various empirical relations and channel morphology.

The philosophy of the MFCR algorithm was to preserve the mean flow estimated by WBM, while simply assuming a default *n* value of 0.03. Any error in WBM flow should propagate directly into error in the MFCR algorithm. However, there are cases where the WBM estimate is too low but the MFCR is biased too high, e.g., in the case of the Connecticut River. The mean HEC‐RAS flow for the Connecticut in this study (June–December, 2011) was 1208 m^{3}/s, whereas the WBM estimated flow for this time was 394.1 m^{3}/s. Therefore, the MFCR method should have resulted in a lower estimated discharge, but this was not the case, as MRR for the Connecticut was 150%. This resulted because a stage‐invariant roughness coefficient of 0.03 was applied to all cross sections in the MFCR. However, there was dramatic variability among the cross sections being averaged together to create river reaches, leading to issues like those described above for MetroMan. Moreover, there were relatively few cross sections available: for the first reach of 3.7 km in length, only three cross sections were used to build the reach, with time‐averaged top width varying from 241 to 608 m, further increasing the within‐reach hydraulic variability. This resulted in maximum

Intriguingly, there is a slight dependency of the overall MFCR bias on the average Froude number of each river: rivers with lower Froude numbers tended to have a low bias, and vice versa; the mean Froude number explains approximately 53% of the variance in the bias of the median estimate. In general, larger, higher Strahler order streams have lower slopes and lower Froude numbers, and these rivers generally had more negative bias. Presumably, this has to do with the WBM simulations used to estimate the MFCR mean annual flow.

These results highlight interesting conclusions about simple “prior” type flow estimations that point to the need for the more complex algorithms discussed before. First, the low value of mean flow from WBM did not agree with the large observed cross‐sectional area changes, underscoring the dangers of relying on prior conditions to estimate discharge. Second, discharge estimation via the prior resulted in the opposite bias as WBM, pointing to how critical it is to develop a better understanding of how

Thus far, it is difficult to predict which algorithm will perform best on each river; thus, the ensemble median is a highly attractive option. It is somewhat unexpected that the ensemble median is not the top performer on any of the 19 rivers, on the basis of RRMSE. However, it is quite encouraging that overall, the median is the least biased of any of the six estimates; the ensemble median algorithm has a mean MRR across the 19 rivers of just 4.8%. Nonetheless, the standard deviation of the ensemble median algorithm's MRR is 57.1%, suggesting that algorithm performance at each river varied enough to render the ensemble median unreliable. It is expected that this issue will be addressed in the future, as the described major issues with reach averaging are explored, and prior and/or river‐specific information is incorporated in discharge estimation. As individual algorithms improve, so too will their ensemble products.
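A minimal sketch of an ensemble median follows. It assumes the median is taken across the available algorithm estimates at each time step, and that an algorithm that was not run for a river (as with AMHG in some cases) is simply left out. The function name, the `None` convention, and the sample discharges are hypothetical.

```python
from statistics import median

def ensemble_median(estimates):
    """Median across algorithms at each time step.

    estimates: dict mapping algorithm name to a list of discharge
    estimates (m^3/s), one per time step, with None where that
    algorithm produced no estimate.
    """
    n_steps = len(next(iter(estimates.values())))
    result = []
    for t in range(n_steps):
        # Keep only the algorithms that produced a value at this time step
        values = [series[t] for series in estimates.values() if series[t] is not None]
        result.append(median(values))
    return result

# Hypothetical estimates from three algorithms over two time steps
example = {
    "algorithm_a": [100.0, 200.0],
    "algorithm_b": [120.0, None],
    "algorithm_c": [90.0, 260.0],
}
print(ensemble_median(example))  # [100.0, 230.0]
```

Because the median ignores the magnitude of outlying estimates, one badly failing algorithm does not dominate the result, but a bias shared by most algorithms passes straight through to the ensemble.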

We are encouraged by the current state of SWOT discharge algorithms: 14/16 nonbraided rivers had an algorithm estimate discharge within 35% RRMSE of true flow. These results include rivers with complex real‐world hydraulics like 1‐D floodplain flow, low‐head dams, and simple multichannel rivers. Some of these complex hydraulics and geomorphologies can have a strong effect on the methods discussed here: extreme flood events, braided rivers, and two‐stage AHG all lead to decreased performance. Discharge algorithms must be improved to handle these cases, and methods to quantitatively predict algorithm performance based on river morphology must be developed. Moreover, the experiments conducted in this study used idealized synthetic daily observations with minimal noise; future studies designed to test SWOT's ability to estimate discharge need to use SWOT‐like temporal sampling and error characteristics and work with observed field and airborne data sets when possible. Our results from this experiment also indicate that algorithm improvement is needed if a robust, global discharge product is to be delivered from SWOT, as neither any single algorithm nor the ensemble median performed with consistently accurate results. The MetroMan algorithm comes closest, as it was the most accurate algorithm in 9/16 nonbraided rivers, but even this algorithm is subject to very large discharge errors in other cases. We conclude that our restrictive experiment design, where no ancillary data or river‐specific assumptions were allowed, is a likely cause of many of the poor results, a conclusion substantiated by previous studies using each of the AMHG, GaMo, and MetroMan algorithms. Future work seeking to estimate discharge from remotely sensed platforms should include as much ancillary data as are available, and also seek to develop multialgorithm synergy to improve stability and accuracy of derived discharge retrievals.

Funding for this work was provided by NASA SWOT Science Definition Team grants NNX13AD96G and NNX13AD88G, NASA Terrestrial Hydrology Program grant NNX13AD05G, NASA SWOT Algorithm Definition Team, and CNES SWOT Science Definition Team grant (TOSCA). The authors thank Alison Macneil of the NOAA/National Weather Service Northeast River Forecast Center for providing the Connecticut River HEC‐RAS model, and Albert Kettner for providing WBM discharge estimates. Mike Jasinski and two anonymous reviewers provided comments that helped improve the quality of the manuscript. If interested in gaining access to data or codes utilized in this study, contact Michael Durand (