This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

A method to apply an empirical feature track correction (FTC) in a new observation operator for atmospheric motion vectors (AMVs) is proposed. The FTC AMV observation operator determines the background estimate of the observed AMV vector wind, adjusting the background profile by determining an optimal height adjustment, averaging the profile over a layer of optimal thickness, and applying a linear correction to the averaged profile wind. The FTC observation operator is tested in the context of a collocation study between AMVs projected onto the collocated Aeolus horizontal line‐of‐sight (HLOS) and the Aeolus HLOS wind profiles. This study is a prototype for a variational FTC for numerical weather prediction data assimilation systems in which the Aeolus wind profiles take the place of the background in the FTC observation operator. Compared to a collocation where the Aeolus profile is interpolated linearly in height to the AMV height, a simple

A feature track correction (FTC) observation operator for atmospheric motion vectors (AMVs) is proposed and tested. The FTC has four degrees of freedom corresponding to wind speed multiplicative and additive corrections ($\gamma $ and $\delta \mathbf{V}$), an estimate of the depth of the layer that contributes to the AMV ($\mathrm{\Delta}z$), and a vertical height assignment correction (*h*). In practice, a regular vertical grid results in discretized values for $\mathrm{\Delta}z$ and *h*. As a result, optimizing these four parameters requires an inner optimization for $\gamma $ and $\delta \mathbf{V}$ that uses the standard linear model and an outer optimization (pictured) for $\mathrm{\Delta}z$ and *h* that uses a brute force search. In a collocation study in terms of horizontal line‐of‐sight vector winds (HLOSV), in which the Aeolus HLOSV profiles take the place of the background in the FTC observation operator, the variance of the misfit is reduced by 43%.

**Funding information** NOAA, NA14NES4320003; NA19NES4320002

The need for high‐quality wind observations documented by the NRC (2007) remains valid. This need has been the motivation for many proposed space‐based Doppler wind lidar (DWL) missions (Baker *et al*., 2014). Since 2018 the Aeolus mission has made this a reality (Stoffelen *et al*., 2005; Rennie *et al*., 2021). Note that Rennie *et al*. (2021) give a very useful and detailed description of the Aeolus data processing and observation characteristics. On the other hand, for decades there have been plentiful atmospheric motion vectors (AMVs) created by tracking features in imagery from a variety of platforms and centres (Key *et al*., 2003; Velden *et al*., 2005; Santek *et al*., 2019a). AMVs include any type of feature‐tracked wind including cloud track winds (CTWs) as well as so‐called 3D winds created by tracking features in retrieved humidity imagery (Santek *et al*., 2019b). Several million AMVs are produced daily in centres around the world. However, AMVs have complex error characteristics – often attributed to height assignment errors – which hinder their use in data assimilation (DA) for numerical weather prediction (NWP) (e.g., Rao *et al*., 2002; Velden and Bedka, 2009; Salonen *et al*., 2015; Lee and Song, 2017; Cordoba *et al*., 2017), and only a tiny fraction are used for that purpose.

A variational DA system makes use of all information presented to it, including all observations, the model forecast, and *a priori* constraints. With an appropriate bias model, a DA system can also predict and thus mitigate the effects of observational error bias by estimating the coefficients in the bias model as auxiliary parameters in the overall minimization. In what follows we describe a feature track correction (FTC) observation operator for AMVs that includes such a bias model. The variational bias correction (VarBC) for radiances (Zhu *et al*., 2014) is an example of a very successful bias correction scheme. VarBC uses a number of predictors to estimate and correct the biases of individual channels and sensors within the variational minimization. These corrections include linear and quadratic terms for predictors such as lapse rate and incidence angle. It is critical in VarBC that there is a sufficient number of unbiased observations that serve to anchor the analysis. In the same way that global navigation satellite system radio occultation (GNSS/RO) observations have provided a source of highly accurate observations, which are necessary to make VarBC of radiances successful, it had been anticipated that space‐based DWL winds would do the same for AMVs, provided that the DWL observations are very accurate and bias‐free. The parallels would be striking: GNSS/RO and DWL observations have global but sparse coverage, are (or should be) extremely accurate, and have high vertical resolution. Unfortunately, the Aeolus DWL observations have larger than anticipated random error and noticeable biases, which so far have only been corrected via comparison with an NWP system (Weiler *et al*., 2021), and so cannot be considered anchoring observations.
In any case, variational FTC (VarFTC) would, like VarBC, take into account all observations used by the DA system, including wind anchoring observations from radiosonde and aircraft reports and (hopefully one day) DWL observations.

The quality of recent and current Aeolus wind products is still under study, but recent published results (e.g., Baars *et al*., 2020) are encouraging. As significant efforts have been made in terms of additional calibration and enhancements to the Aeolus processing system, the resulting data stream may eventually prove suitable to act as high‐quality anchoring observations for AMV VarFTC. Furthermore, comparisons to AMVs are valuable, because Aeolus measurements should not have a height assignment error, but rather are directly related to an observing level. (In our discussion we refer to the Aeolus making observations at levels, but it should be kept in mind that these are really the mid‐levels of the Aeolus observing volumes, which range from 0.5 to 1.0 to 2.0 km in thickness as elevation increases.) Therefore, in this study we prototype and test an FTC for AMVs that compares AMVs projected onto the Aeolus horizontal line‐of‐sight (HLOS) to the corresponding Aeolus HLOS wind profiles.

Both radiances and AMVs are imperfectly calibrated and have horizontally correlated errors (Bormann *et al*., 2003; Le Marshall *et al*., 2004; Cordoba *et al*., 2017; Lee and Song, 2017) based on geophysical variables not properly accounted for. In the case of AMVs, it is expected that the errors depend on a number of factors including tracking algorithm, cloud type, height assignment algorithm, and the channel used (visible, infrared, water vapour). There are notable differences in the observation functions and form of the bias correction for radiances and AMVs (Bormann *et al*., 2014; Hernandez‐Carrascal and Bormann, 2014). First, it is thought that the most critical bias of AMVs is due to height assignment errors; e.g., Folger and Weissmann (2016). Further, Salonen and Bormann (2013) and Salonen *et al*. (2015) note that height assignment errors are the main source of errors in AMVs. This will entail the correction of the coordinate, not the value of the observation. Second, AMVs undoubtedly have additional wind speed biases once height assignments are corrected. Bresky *et al*. (2012) discuss the slow speed bias of some AMVs. Third, AMVs are representative of a layer of the atmosphere and not the cloud top. An estimate of the layer depth may help to correct biases. It is possible that the cloud motion is representative of an atmospheric layer different from the cloud layer. For clear water vapour AMVs the thickness of the layer is expected to be related to the width of the layer contributing to the radiance of the channel used. In either case (CTWs or clear‐sky AMVs), the weighting through the representative layer may not be (is probably not) uniform. The method presented here could make use of any known weighting function, or even estimate such a weighting function, which would likely depend on cloud type in the case of CTWs (Hasler *et al*., 1977).

There are several important limitations of the collocation dataset used here. First, of the four modes of Aeolus operation, only the Rayleigh clear observations are used. The Aeolus project recommends that Rayleigh cloudy and Mie clear observations should not be used. Since the objective of the FTC observation operator is to compare wind profile data to AMVs, the Aeolus Rayleigh clear observations, which provide continuous vertical profiles, are appropriate to use here while the Mie cloudy observations, which are few and scattered in the vertical, are not. This is clear from the curtain plot of Aeolus L2B HLOS vector wind observations for one orbit depicted in Figure 1 of Rennie *et al*. (2021). However the Mie winds are known to be more accurate than the Rayleigh winds. Further, we are comparing Aeolus observations in clear scenes to nearby AMVs, which except for clear water vapour AMVs are necessarily from cloudy scenes. Both of these effects are expected to increase the difference between the collocated Aeolus and AMV observations. While it is possible to combine Mie and Rayleigh winds in one profile, we have not done this since it would result in an inhomogeneous collocation dataset due to differences in error characteristics between levels within a single profile and between pure Rayleigh and mixed Mie–Rayleigh profiles. (Of course, both types of Aeolus winds would be used quite naturally within a VarFTC implementation.) Second, the Aeolus data are known to have some biases even after the M1 mirror temperature bias correction (Weiler *et al*., 2021). Third, Aeolus is in a polar twilight orbit, with all observations near dawn or dusk. Additional AMV limitations are discussed by Lukens *et al*. (2021) who further analyze and discuss the collocation dataset used here. 
In spite of the limitations of the current study, we suggest that the FTC approach for AMVs has potential for improving (a) collocation studies comparing AMVs with wind profiles from various sources, (b) understanding of AMVs and the characterization of their errors, and (c) the use of AMVs in DA and NWP.

The FTC AMV observation operator determines the background estimate ($\widehat{\mathbf{V}}$) of the observed AMV vector wind, adjusting the background profile by determining an optimal height adjustment, averaging the profile over a layer of optimal thickness, and applying a linear correction to the averaged profile wind. We note at the outset that estimating these different sources of AMV biases simultaneously may produce ambiguous results. For example, in the case of AMVs that are too slow, the FTC could reduce the mismatch with a constant correction, with a multiplicative factor, by shifting the height, or by changing the layer averaged over. The last two adjustments are possible because of the general increase of wind speed with height. Restricting the degrees of freedom in the general formulation of FTC may be appropriate in different settings.

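The general form of the FTC observation operator is

$$\widehat{\mathbf{V}}=\int w(z)\,\mathbf{V}(z)\,\mathrm{d}z+\delta \mathbf{V}.\qquad (1)$$

Here the vertical coordinate (*z*) may be any vertical coordinate, but in this study will be geometric height (km) relative to the reported height of the AMV. Positive values of *z* correspond to levels above the AMV. Note that, except for $\delta \mathbf{V}$, all the adjustments made by the FTC observation operator are related to a vertical weighting function $w(z)$. The weights $w(z)$ will usually be non‐zero only near the reported height of the AMV, i.e., for small $|z|$. It is useful to normalize $w(z)$ so that the integral corresponds to a weighted average of $\mathbf{V}$. Then Equation (1) may be written as

$$\widehat{\mathbf{V}}=\gamma \,\overline{\mathbf{V}}+\delta \mathbf{V},\qquad (2)$$

where $\gamma =\int w(z)\,\mathrm{d}z$ and

$$\overline{\mathbf{V}}=\int {w}^{\prime}(z)\,\mathbf{V}(z)\,\mathrm{d}z,\qquad {w}^{\prime}(z)=w(z)/\gamma ,\qquad (3)$$

is the weighted average of the profile.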

The case where $w(z)$ is a boxcar shape corresponds to $\overline{\mathbf{V}}$ being a simple average over the boxcar layer. In this case, the free parameters are $\mathrm{\Delta}z$, the width of the boxcar (i.e., the thickness of the averaging layer), *h*, the midpoint of the averaging layer with respect to the reported height of the AMV, $\gamma $, the height of the boxcar (i.e., the multiplicative correction term), and $\delta \mathbf{V}$, the additive constant correction term. In the present study, $\mathbf{V}$ is the HLOS vector wind (HLOSV), and $\delta \mathbf{V}$ is a scalar. Different shapes could be specified for $w(z)$ such as triangular or trapezoidal. More complicated schemes would determine the shape of $w(z)$, but might require some constraints on $w(z)$. The correction parameters may depend on several factors, such as AMV type, location (i.e., latitude and pressure), and other predictors evaluated from the background. For example, the averaging layer might be smaller for window channel infrared (IR) CTW AMVs compared to that for hyperspectral clear‐sky water vapour (WV) AMVs.
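The boxcar case can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: the function name `ftc_boxcar`, the rule that a grid level belongs to the layer when its height is strictly within $\mathrm{\Delta}z/2$ of *h*, and the example profile are all assumptions made here.

```python
import numpy as np

def ftc_boxcar(profile, z, h, dz, gamma=1.0, delta_v=0.0):
    """Boxcar FTC estimate: gamma * (layer mean of the profile) + delta_v.

    profile : HLOSV on a regular vertical grid (m/s)
    z       : grid heights relative to the reported AMV height (km)
    h       : midpoint of the averaging layer relative to the AMV (km)
    dz      : thickness of the averaging layer (km)
    """
    profile = np.asarray(profile, dtype=float)
    z = np.asarray(z, dtype=float)
    # Assumed convention: a level is in the layer if it lies strictly
    # within dz/2 of the layer midpoint h.
    in_layer = np.abs(z - h) < 0.5 * dz
    if not in_layer.any():
        raise ValueError("averaging layer contains no grid levels")
    v_bar = profile[in_layer].mean()
    return gamma * v_bar + delta_v

# Hypothetical profile with constant shear on a 0.5 km grid.
z = np.arange(-5.0, 5.5, 0.5)
profile = 10.0 + 2.0 * z
v_interp = ftc_boxcar(profile, z, h=0.0, dz=0.5)  # single level at the AMV height
v_ftc = ftc_boxcar(profile, z, h=0.5, dz=4.5, gamma=1.1, delta_v=-0.5)
```

With $\mathrm{\Delta}z=0.5$ km and $h=0$ the layer collapses to the single level at the AMV height, so the operator reduces to the interpolated profile value.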

FTC could be implemented in a variational DA system in a manner similar to VarBC. In such an implementation, which we will call VarFTC, the FTC parameters would be optimized at the same time as the DA minimizes the misfit between $\widehat{\mathbf{V}}$ and the observations. In such a scheme, the normal control vector is augmented with the FTC parameters and the FTC observation operator replaces the normal AMV observation operator. An additional constraint cost function would add prior information about the FTC parameters – $\delta \mathbf{V}$ and either ${w}_{k}$ or ($h,\mathrm{\Delta}z,\gamma $). The VarFTC constraint cost function might parallel that for VarBC for radiances. In the VarBC, the constraint cost function is a sum of squares of the differences between the VarBC predictor coefficients and an *a priori* estimate of those coefficients, taken to be the solution from the previous DA cycle. However, note that VarBC is usually formulated as an additive correction to the radiance observations, whereas VarFTC reformulates the observation operator for AMVs.

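The VarFTC observation cost function for AMVs, under the assumption that there are no correlated observation errors, is

$${J}_{\mathrm{o}}=\sum _{i}\frac{{\left|{\mathbf{V}}_{i}^{\mathrm{o}}-{\widehat{\mathbf{V}}}_{i}\right|}^{2}}{{\left({\sigma}_{i}^{\mathrm{o}}\right)}^{2}},\qquad (5)$$

where ${\mathbf{V}}_{i}^{\mathrm{o}}$, ${\widehat{\mathbf{V}}}_{i}$, and ${\sigma}_{i}^{\mathrm{o}}$ are the observed wind, the FTC background estimate, and the observation error standard deviation for the *i*th AMV observation. In practice ${\sigma}^{\mathrm{o}}$ should include contributions from instrument and representativeness errors, as well as from errors of the observation operator. Within an operational DA system, VarFTC should use the existing estimates of AMV ${\sigma}^{\mathrm{o}}$ adjusted by a multiplicative factor to reflect the smaller variance of the observation minus background innovations due to using the FTC observation operator.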

When implementing the FTC observation operator in a variational data assimilation it should be noted that:

For VarFTC, a number of DA system implementation issues and mitigations must be considered. First, the initial proposed observation operator (Equation (2)) would be difficult to linearize, and therefore might not preserve the convergence properties of the DA optimization inner loop. However, we can separate the FTC observation operator within a variational DA as follows. First, the outer loop determines all the FTC free parameters as an independent optimization. This would be handled at the same level as any QC decisions. During the inner loop, the standard AMV cost function is replaced by Equation (5), the parameters defining the averaging layer – $\mathrm{\Delta}z$ and *h* – are held fixed, and the other parameters – $\gamma $ and $\delta \mathbf{V}$ in the case of boxcar weighting – are optimized in what is essentially just a linear least‐squares problem. In both the inner and outer loop it is possible to add quadratic constraints on the difference between the parameters and an *a priori* estimate of the parameters. Second, additional variables which provide a significant improvement in fit could be included in an implementation of the FTC observation operator. Some variables such as shear and wind speed that might be included as predictors could instead be used to subset the sample. If region or height are used as subsetting variables to divide up the domain, this might create discontinuities in the resulting analysis. In VarBC this is handled with interpolation of the fit coefficients. For example, the Tropics and Northern Hemisphere Extratropics solutions might be interpolated in the latitude band 25°–35°N. However, this cannot be done for the parameters defining the averaging layer – $\mathrm{\Delta}z$ and *h* – if these only take on discrete values. Instead the observation operator can be written in terms of $w(z)$ (Equation (1)), and $w(z)$ can be extended with zero values to all vertical levels outside any particular averaging layer. 
In this form the observation operator is linear in $w(z)$ and $\delta \mathbf{V}$ and these parameters can be interpolated or averaged as needed even as the averaging layer changes. Alternatively, one could calculate the observation operator for all possible regions and interpolate those results linearly in latitude or height as needed.
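The weight-vector form described above can be sketched as follows. The helper names (`boxcar_weights`, `ftc_linear`, `blend`), the two regional parameter sets, and the blending factor are hypothetical and chosen only to illustrate that the operator is linear in the zero-padded weights and in $\delta \mathbf{V}$.

```python
import numpy as np

def boxcar_weights(z, h, dz, gamma=1.0):
    """Weights on the full grid: gamma/n inside the layer, zero elsewhere."""
    z = np.asarray(z, dtype=float)
    in_layer = np.abs(z - h) < 0.5 * dz  # assumed layer-membership rule
    if not in_layer.any():
        raise ValueError("averaging layer contains no grid levels")
    w = np.zeros_like(z)
    w[in_layer] = gamma / in_layer.sum()
    return w

def ftc_linear(profile, w, delta_v=0.0):
    """Observation operator in weight form: sum_k w_k V_k + delta_v."""
    return float(np.dot(w, np.asarray(profile, dtype=float)) + delta_v)

def blend(w_a, dv_a, w_b, dv_b, alpha):
    """Linear blend of two solutions (alpha=0 gives A, alpha=1 gives B)."""
    return (1.0 - alpha) * w_a + alpha * w_b, (1.0 - alpha) * dv_a + alpha * dv_b

z = np.arange(-5.0, 5.5, 0.5)
profile = 10.0 + 2.0 * z  # hypothetical profile
# Two hypothetical regional solutions with different averaging layers.
w_a = boxcar_weights(z, h=0.5, dz=4.5, gamma=1.1)
w_b = boxcar_weights(z, h=0.0, dz=1.5, gamma=0.9)
w_mid, dv_mid = blend(w_a, -0.5, w_b, 0.2, alpha=0.5)
v_mid = ftc_linear(profile, w_mid, dv_mid)
```

Because the weights are defined on every level, blending works even when the two regions use different values of $\mathrm{\Delta}z$ and *h*, and the blended estimate equals the same blend of the two regional estimates.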

We apply the FTC formulation described in Section 2 (and in the Appendix) with the HLOS winds from Aeolus providing the background profiles and the AMV winds projected onto the collocated Aeolus HLOS as the observations. Results presented depend on the error characteristics of the Aeolus data and should not be considered representative of the potential performance of VarFTC in an NWP DA system because in the collocation study the FTC observation operator must account for errors in both the AMVs and the Aeolus observations. As we will see, this complicates the interpretation of the results because of the Aeolus observation error characteristics.

Ten days of data are used. The Aeolus profile is interpolated to a regular 0.5 km vertical grid relative to the AMV height. The method of solution is to minimize the mean squared observation minus background (OMB) difference with respect to the free parameters. The minimization is only constrained by the data fit. Since this mean squared difference (MSD) is equal to the sum of squared OMB differences divided by the sample size, minimizing the OMB MSD is equivalent to minimizing ${J}_{\mathrm{o}}$ in Equation (5) with ${\sigma}^{\mathrm{o}}$ set to one. We felt that the neglect of variations in ${\sigma}^{\mathrm{o}}$ was justified in the collocation study, especially where we optimize separately for a number of subsets within which the ${\sigma}^{\mathrm{o}}$ are fairly uniform. However, we did find that the variance of OMB increases with AMV wind speed and with the magnitude of Aeolus HLOS wind shear.

The preliminary steps of our analysis described in the following subsections are data acquisition, data collocation, quality control (QC), and data pre‐processing. Collocations passing QC (the QC sample in what follows) are divided into two samples – one for training and one for independent verification (the training sample and independent sample in what follows). Then solutions are compared for a range of linear and *ad hoc* models with varying numbers of free parameters as listed in Table 1. Finally, a number of different stratifications of the dataset are used to explore the dependence of FTC solutions on AMV type, height, geographic region, wind speed, and wind shear. Results for the training and independent samples are very similar, and (with the exception of the discussion of Figure 2 in Section 4.5) only the results for the independent sample are shown or discussed.

*Note*: The number of free parameters is given in the column labelled ‘DOF’ for degrees of freedom. The profile interpolated to the height of the observation is denoted ${\mathbf{V}}_{0}$. Mathematical formulations are given in the Appendix.

The key metrics reported below are the OMB RMSD and the coefficient of determination (CoD). Both are related to the goodness of fit and the OMB MSD objective function that is minimized. The CoD is the reduction in MSD with respect to some reference solution, which is typically the Null solution. For arbitrary solution *A* we will calculate the CoD (as a percent) with respect to reference solution *R* as

$$\mathrm{CoD}(R)=100\left(1-\frac{{\mathrm{MSD}}_{A}}{{\mathrm{MSD}}_{R}}\right).$$
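The CoD is the percent reduction in MSD relative to the reference, which the following sketch computes. The function name `cod_percent` is an assumption of this example. As a rough consistency check, it is applied to the RMSDs quoted in this article for the Null solution (7.35 m·s${}^{-1}$) and the *lm1* model (5.57 m·s${}^{-1}$), treating the squared RMSD as the MSD and assuming both values refer to the same sample.

```python
def cod_percent(msd_a, msd_r):
    """Percent reduction in MSD of solution A relative to reference R."""
    return 100.0 * (1.0 - msd_a / msd_r)

# A normalized MSD of 0.64 corresponds to a CoD of 36%.
cod_example = cod_percent(0.64, 1.0)

# Squared RMSDs stand in for the MSDs (an assumption of this check).
cod_lm1_vs_null = cod_percent(5.57 ** 2, 7.35 ** 2)
```

The second value rounds to 43%, in line with the reduction in misfit variance reported in the summary.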

Data were acquired for 10 days (40 six‐hour DA cycles). Each cycle has about 4,000 collocations. The first six‐hour period is centred at 0000 UTC on 21 April 2020 and the last is centred at 1800 UTC on 30 April 2020. The period chosen occurs after the large telescope mirror (referred to as M1) bias correction was applied operationally. The Aeolus wind observations were obtained from ESA as Level‐2B (L2B) Earth Explorer (EE) format files. These data are all “Baseline B10” products which means they are retrieved using the so‐called redundant flight model (FM‐B) laser. Aeolus observations sample the atmosphere horizontally over the whole globe except for small cut‐outs near the poles and vertically from the boundary layer to the lower stratosphere. The highest concentration of Aeolus observations is in the upper troposphere.

The AMV observations are those available for use operationally by NOAA's global DA system. After quality control and thinning (typically to 200 km resolution) only a small percentage of the available AMVs are actually assimilated in the DA. (Details are summarized at

BUFR = Binary Universal Form for the Representation of meteorological data; SATWND = Satellite Wind

. The dataset includes operational AMVs from both geostationary and polar‐orbiting satellites derived from cloudy scenes in the infrared, visible, and water vapour channels or clear scenes in the water vapour channel. The geostationary satellites are GOES‐16, GOES‐17, METEOSAT‐8, METEOSAT‐11, Himawari‐8, INSAT‐3D, and INSAT‐3DR. The polar‐orbiting satellites are NOAA‐15, ‐18, ‐19, and ‐20, Suomi National Polar‐orbiting Partnership (S‐NPP), MetOp‐A and ‐B, Aqua and Terra. All the AMVs are from imagers. For the geostationary satellites, these are the Advanced Baseline Imager (ABI) onboard the GOES satellites, the Spinning Enhanced Visible and Infrared Imager (SEVIRI) onboard the METEOSAT satellites, the Advanced Himawari Imager (AHI) onboard Himawari‐8, and the INSAT Imager onboard the INSAT satellites. The AMVs from the polar satellites are from the Advanced Very High Resolution Radiometer (AVHRR) instrument onboard the NOAA and EUMETSAT satellites, except that the Visible and Infrared Imaging Radiometer Suite (VIIRS) is onboard NOAA‐20 and S‐NPP. The Moderate Resolution Imaging Spectroradiometer (MODIS) is onboard the NASA satellites – Aqua and Terra. More details are provided by Lukens *et al*. (2021).

Aeolus estimates HLOSV from both Rayleigh (or molecular) scattering and Mie (or aerosol) scattering. As noted in the Introduction, only Aeolus Rayleigh clear‐sky winds were collected because (a) ESA does not recommend the use of Rayleigh cloudy winds at this time, and (b) Aeolus Mie winds are generally localized to heights in the boundary layer and near cloud tops and hence do not provide adequate profiles of winds, which are needed in the FTC observation operator. However, this choice comes with some limitations and requires some interpretation of the results. Rayleigh clear winds collocated with AMVs will either be above clouds or from nearby clear locations. In the first case, there might be missing HLOSV values below the AMV (Section 4.4).
In the second case, collocation errors are expected to be larger both because the collocation distance will tend to be larger and because the nearby clear column may not be representative of the air containing the cloud feature tracked.

For each AMV, the Aeolus dataset is searched for collocations. First, all Aeolus observations are found that satisfy the collocation criteria for that AMV. Following Santek *et al*. (2021a, 2021b), the collocation criteria are a time difference of 60 min or less, a ${\mathrm{log}}_{10}$ pressure difference of 0.04 or less, and a great circle distance of 100 km or less. In practice, AMV data in each 6 hr DA cycle were compared to Aeolus data in that cycle as well as neighbouring cycles so that all potential collocations, including those that cross cycle boundaries, are found. Second, the Aeolus observations closest in terms of great circle distance to the AMV observation are selected. Third, the Aeolus observation from that selection that is closest to the AMV in the vertical is chosen as the collocated Aeolus observation. All Aeolus observations with the same latitude, longitude and time as the collocated Aeolus observation are collected and sorted to form the collocated Aeolus profile. During the collocation process, the AMV HLOSV is obtained by projecting the AMV on the collocated Aeolus HLOS. A total of 162,055 collocations was found in the ten‐day period.
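The three collocation criteria can be expressed as a simple test on one AMV and one Aeolus observation. This is a sketch under stated assumptions: the dictionary keys (`time_min`, `p_hpa`, `lat`, `lon`), the haversine formula for the great circle distance, and the mean Earth radius of 6371 km are choices made for this example and are not taken from the collocation software used in the study.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great circle distance (km) between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))

def is_collocated(amv, aeolus, max_minutes=60.0, max_dlog10p=0.04, max_km=100.0):
    """True if the pair meets the time, log10-pressure and distance criteria."""
    dt = abs(amv["time_min"] - aeolus["time_min"])
    dlogp = abs(math.log10(amv["p_hpa"]) - math.log10(aeolus["p_hpa"]))
    dist = great_circle_km(amv["lat"], amv["lon"], aeolus["lat"], aeolus["lon"])
    return dt <= max_minutes and dlogp <= max_dlog10p and dist <= max_km

# Hypothetical observations: 45 min apart, 300 vs 320 hPa, 0.5 degrees of latitude apart.
amv = {"time_min": 0.0, "p_hpa": 300.0, "lat": 40.0, "lon": -100.0}
near = {"time_min": 45.0, "p_hpa": 320.0, "lat": 40.5, "lon": -100.0}
too_low = {"time_min": 45.0, "p_hpa": 350.0, "lat": 40.5, "lon": -100.0}
```

Here `near` passes all three tests, while `too_low` fails the pressure test because the ${\mathrm{log}}_{10}$ pressure difference between 300 and 350 hPa is about 0.067.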

As points of reference, Lukens *et al*. (2021) compile AMV statistics from studies comparing AMVs to radiosondes and find a range from roughly 4.5 to 9.0 m·s${}^{-1}$ for the vector RMS error. Assuming a middle value and converting to a component standard deviation yields 4.75 m·s${}^{-1}$. Meanwhile, Rennie *et al*. (2021) estimate the random error standard deviation for tropospheric Aeolus HLOS winds at 5 m·s${}^{-1}$ for the period of this study (their Figure 2). For the independent sample, the RMSD for the collocation closest in $\mathrm{ln}(p)$ is 7.96 m·s${}^{-1}$, and the Null solution, which only interpolates the Aeolus profile to the reported height of the AMV, has an RMSD of 7.35 m·s${}^{-1}$. Since collocation differences combine, in an RMS sense, the two observation errors as well as errors due to the collocation itself (i.e., due to differences in location and scale), these estimates are consistent with an RMS contribution from the collocation itself on the order of 4 m·s${}^{-1}$ or, if the height difference is accounted for by interpolating the Aeolus profile, 2.5 m·s${}^{-1}$.
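The arithmetic behind these estimates can be checked directly. The sketch below assumes that the conversion from vector RMS error to a component standard deviation is a division by $\sqrt{2}$ (equal, independent errors in the two components) and that the error sources add in quadrature. Both assumptions, and all variable names, are made here for illustration.

```python
import math

vector_rms = 0.5 * (4.5 + 9.0)          # middle of the quoted AMV range (m/s)
amv_sd = vector_rms / math.sqrt(2.0)    # per-component SD under the isotropy assumption
aeolus_sd = 5.0                         # quoted Aeolus HLOS random error SD (m/s)
obs_sd = math.hypot(amv_sd, aeolus_sd)  # combined observation error (m/s)

def collocation_residual(rmsd):
    """RMS difference left after removing the combined observation error."""
    return math.sqrt(rmsd ** 2 - obs_sd ** 2)

closest_residual = collocation_residual(7.96)  # closest-in-ln(p) collocation
null_residual = collocation_residual(7.35)     # Null solution
```

Under these assumptions the component standard deviation is close to the quoted 4.75 m·s${}^{-1}$, and the two residuals come out near 4 and 2.5 m·s${}^{-1}$.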

Several QC tests are applied to each collocation. Collocations passing all the tests are retained for further analysis with no additional QC done during the analysis. The QC tests are as follows:

*Missing value QC* eliminates collocations where there are missing values for any of the necessary AMV variables. A very small number ($n=42$) of collocations failed the missing value QC test.

*Gross check QC* trims the 5% of collocations with the largest absolute OMB differences between collocated HLOS winds.

*Aeolus QC* eliminates collocations with Aeolus observations that fail the ESA suggested QC for Rayleigh clear winds. Observations are rejected for mid‐layer heights less than 2 km; pressures greater than 800 hPa; estimated errors greater than 12 m·s${}^{-1}$; and accumulation lengths less than 60 km. These criteria are consistent with the advice of Rennie *et al*. (2021). In all, 24% of collocations failed the Aeolus QC test.

*AMV QC* eliminates collocations with AMVs having a quality indicator (QI, %) less than 60%. The QI is a forecast independent metric computed by the data providers (Santek *et al*., 2019a). This is not a stringent test. The cut‐off of 60% is considered a minimal QC; an 80% cut‐off is ideal (personal communication, Illiana Genkova (EMC), November 2020). Collocations with AMVs that do not have a QI assigned (3.4% of the sample) also fail this test. In all, 19.4% of collocations failed the AMV QC test.

There are 99,726 collocations passing all the QC tests for a yield of 61.5%.
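The four tests can be combined into a single pass/fail mask, as in the sketch below. The function and argument names are hypothetical. Applying the gross check as a 95th percentile threshold on the absolute OMB over the whole input sample is an assumption of this example, since the order in which the tests were applied is not restated here.

```python
import numpy as np

def qc_pass(omb, height_km, p_hpa, err_est, accum_km, qi, has_missing):
    """Boolean mask of collocations passing all four QC tests."""
    omb = np.asarray(omb, dtype=float)
    height_km = np.asarray(height_km, dtype=float)
    p_hpa = np.asarray(p_hpa, dtype=float)
    err_est = np.asarray(err_est, dtype=float)
    accum_km = np.asarray(accum_km, dtype=float)
    qi = np.asarray(qi, dtype=float)
    # Missing value QC.
    ok_missing = ~np.asarray(has_missing, dtype=bool)
    # Gross check QC: drop the 5% with the largest |OMB| (assumed threshold form).
    ok_gross = np.abs(omb) <= np.quantile(np.abs(omb), 0.95)
    # Aeolus QC for Rayleigh clear winds.
    ok_aeolus = (
        (height_km >= 2.0) & (p_hpa <= 800.0)
        & (err_est <= 12.0) & (accum_km >= 60.0)
    )
    # AMV QC: QI of at least 60%; an unassigned QI (NaN) fails.
    ok_amv = np.nan_to_num(qi, nan=-1.0) >= 60.0
    return ok_missing & ok_gross & ok_aeolus & ok_amv

# Yield implied by the counts quoted in the text.
qc_yield_percent = 100.0 * 99726 / 162055
```

The last line reproduces the quoted yield of about 61.5%.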

To apply the method in Section 2, we first interpolate each Aeolus profile to a regular grid in height referenced to the level of the collocated AMV. The Aeolus QC (described above) that was applied when selecting the collocations is also applied to each Aeolus wind in the collocated profiles. We will refer to these two QC steps as collocation QC and profile QC, respectively. The profile QC eliminates approximately 10.5% of all the Aeolus observations. This is a much smaller percentage than the 23.5% for the subset of Aeolus winds collocated with AMVs. In other words, a much greater percentage of Aeolus winds fail the QC in the generally cloudy conditions at and below the elevation of the AMVs. Experiments without the profile QC (not shown) produced larger RMSDs, but only by a few percent (5.66 versus 5.57 m·s${}^{-1}$ for the standard *lm1* model).

From the AMV–Aeolus collocation, we know the *p* and HLOSV for the AMV and the profiles of *p*, *Z*, and HLOSV for Aeolus, where *p* is pressure (hPa) and *Z* is geometric height above mean sea level (m). (Aeolus L1B data are relative to the WGS84 ellipsoid, but L2B processing implements the conversion to the EGM96 geoid (Tan *et al*., 2008).) Therefore, interpolation of the Aeolus data to the regular grid centred on the AMV begins by first finding ${Z}_{\text{AMV}}$, the height for the AMV, by interpolating Aeolus *Z* in $\mathrm{log}p$ to the AMV reported pressure. The regular grid is then defined by

$${Z}_{k}={Z}_{\text{AMV}}+k\,\delta Z,$$

where *k* is an integer level index and $\delta Z=0.5$ km is the grid spacing, and the Aeolus HLOSV is interpolated in *Z* to this grid. If a value of ${Z}_{k}$ is outside the vertical range of the Aeolus profile, then the corresponding Aeolus interpolated HLOSV value (${\mathbf{V}}_{k}$) is set by constant extrapolation.
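The interpolation steps can be sketched with `numpy.interp`, which holds the end values constant outside the data range and therefore gives constant extrapolation by default. The function name, the use of kilometres for height, the linear interpolation of HLOSV in height, and the grid extent of ±5 km are assumptions of this example.

```python
import numpy as np

def to_regular_grid(p_amv, p_prof, z_prof_km, v_prof, half_range_km=5.0, step_km=0.5):
    """Interpolate an Aeolus profile to a regular grid centred on the AMV.

    Returns the grid heights relative to the AMV (km), the HLOSV on the
    grid, and a flag marking levels filled by constant extrapolation.
    """
    p_prof = np.asarray(p_prof, dtype=float)
    z_prof_km = np.asarray(z_prof_km, dtype=float)
    v_prof = np.asarray(v_prof, dtype=float)
    # Sort by increasing height; -log(p) then increases too, as np.interp requires.
    order = np.argsort(z_prof_km)
    z_s, v_s = z_prof_km[order], v_prof[order]
    neg_logp = -np.log(p_prof[order])
    # Height of the AMV: interpolate Z in log p to the reported AMV pressure.
    z_amv = float(np.interp(-np.log(p_amv), neg_logp, z_s))
    n = int(round(half_range_km / step_km))
    z_rel = np.arange(-n, n + 1) * step_km
    z_abs = z_amv + z_rel
    # np.interp returns the end values outside [z_s[0], z_s[-1]].
    v_grid = np.interp(z_abs, z_s, v_s)
    extrapolated = (z_abs < z_s[0]) | (z_abs > z_s[-1])
    return z_rel, v_grid, extrapolated
```

Returning the extrapolation flag alongside the interpolated winds makes it straightforward to count missing levels or to exclude them from a layer average.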

Constant extrapolation is less than ideal. Specifically, the number of missing data (equivalently, the number of extrapolated data) is much greater below the AMV because the Aeolus Rayleigh mode is obstructed by cloud. We investigated the alternative of calculating the weighted average of the background wind (Equation (3)) by averaging over just the levels with non‐missing values. In experiments without profile QC, the results of the two methods are essentially the same because there are few missing/extrapolated data in this case. With profile QC, the method without extrapolation gives a slight improvement over constant extrapolation (RMSD of 5.56 versus 5.57 m·s${}^{-1}$). However, results are shown below for the constant extrapolation method because it permits solving the *lmn* model and assigning a shear value to each collocation.
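The two ways of handling missing levels can be contrasted on a toy profile. In this sketch missing levels are marked with NaN, and both helper names are hypothetical. Note that `fill_constant` also fills interior gaps by linear interpolation, which is an assumption of this example.

```python
import numpy as np

def fill_constant(v):
    """Fill NaNs: end gaps take the nearest valid value, interior gaps are interpolated."""
    v = np.asarray(v, dtype=float)
    idx = np.arange(v.size)
    ok = ~np.isnan(v)
    return np.interp(idx, idx[ok], v[ok])

def layer_mean_skip_missing(v, in_layer):
    """Layer mean over the levels that actually have data."""
    vals = np.asarray(v, dtype=float)[np.asarray(in_layer, dtype=bool)]
    vals = vals[~np.isnan(vals)]
    return float(vals.mean()) if vals.size else float("nan")

# Hypothetical profile with the two lowest levels missing (e.g., below cloud).
v = np.array([np.nan, np.nan, 4.0, 6.0, 8.0])
layer = np.ones(v.size, dtype=bool)
mean_filled = float(fill_constant(v)[layer].mean())
mean_valid = layer_mean_skip_missing(v, layer)
```

In this example constant extrapolation pulls the layer mean toward the lowest valid value, whereas averaging over non-missing levels uses only the observed winds.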

Figure 1 illustrates these findings. In the figure we plot the RMSD between the AMVs and the Aeolus interpolated profiles and the percent of missing/extrapolated data from 5 km below (thick lines) to 5 km above (thin lines) the AMV for the QC sample. First, and most surprisingly, the RMSD reaches a minimum 0.5 km above the AMV. Second, compared to differences above the AMV, differences below the AMV by a similar vertical distance are larger by 0.5–2.0 m·s${}^{-1}$ and increase with vertical distance. Third, constant extrapolation increases the RMSD by as much as 0.4 m·s${}^{-1}$, which occurs for a vertical distance of $-$1.5 km. Fourth, the percentage of data missing/extrapolated above the AMV is tiny, only reaching 1.5% at 5 km. The percentage of data missing/extrapolated below the AMV ranges from 10% to 50%. This percentage is substantially reduced, but still large without the profile QC. Given the RMSD profiles in Figure 1a, we can anticipate that the optimal values of *h* will be positive, both because the minimum RMSD occurs at +0.5 km and because RMSDs tend to be larger below the AMV. Further, the third point is consistent with the slight improvement of averaging over non‐missing values compared to constant extrapolation. Moreover, it is possible that larger RMSD values below the cloud level signify that the Rayleigh clear Aeolus winds below nearby AMVs are less clear or less representative of the wind at the AMV location than those higher in the atmosphere. If this is the case, then the fact that the RMSD reaches a minimum 0.5 km above the AMV might not be due to an error in height assignment, but might instead simply be due to the fact that the Aeolus profile interpolated to the AMV height is a combination of Aeolus observations just above and just below the AMV height, and the observation below the AMV height may be contaminated by cloud.

Figure 1a has implications for the optimal averaging layer, since roughly speaking, the optimal averaging layer will minimize $1/n$ times the average of the MSD between the AMV and the *n* individual Aeolus winds averaged over.

Consider a boxcar average (Equation (A2) with ${w}_{k}^{\prime}=1/n$). The MSD between the AMV, ${\mathbf{V}}^{\mathrm{o}}$, and this boxcar average is equal to the average of all the correlations of the differences between ${\mathbf{V}}^{\mathrm{o}}$ and ${\mathbf{V}}_{k}$. If we ignore off‐diagonal terms, this MSD is just $1/n$ times the average of the MSD between ${\mathbf{V}}^{\mathrm{o}}$ and ${\mathbf{V}}_{k}$. While this is useful heuristically, the off‐diagonal terms should be included in any calculation.
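The heuristic can be written out explicitly. In the sketch below, the level differences ${\mathbf{d}}_{k}={\mathbf{V}}^{\mathrm{o}}-{\mathbf{V}}_{k}$ and the angle brackets (an average over the collocation sample) are notation introduced here for illustration; they are not taken from the text above.

$$
{\mathbf{V}}^{\mathrm{o}}-\frac{1}{n}\sum_{k}{\mathbf{V}}_{k}=\frac{1}{n}\sum_{k}{\mathbf{d}}_{k}
\quad\Longrightarrow\quad
\mathrm{MSD}_{\mathrm{boxcar}}=\frac{1}{n^{2}}\sum_{j}\sum_{k}\left\langle {\mathbf{d}}_{j}\cdot {\mathbf{d}}_{k}\right\rangle
\approx \frac{1}{n}\left(\frac{1}{n}\sum_{k}\left\langle {|{\mathbf{d}}_{k}|}^{2}\right\rangle \right)
$$

The approximation drops the $j\ne k$ terms. It is most accurate when the differences at different levels are uncorrelated, as for vertically uncorrelated noise. Any error in the AMV itself is common to every ${\mathbf{d}}_{k}$ and so contributes to the off‐diagonal terms, which is one reason to retain them in an actual calculation.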

These MSDs are the squares of the values plotted in Figure 1a. Thus, averaging over more and more levels surrounding the AMV level is beneficial due to the $1/n$ effect, until the MSDs become too large. This is illuminating because it explains the balance between the following factors:

- the more Aeolus data are averaged the better, because this reduces random errors;
- averaging over a layer that is more representative of what is tracked in determining the AMV is beneficial; and
- averaging over a layer that is too thick reduces random error, but ultimately is no longer representative of the AMV.

The solution for $\mathrm{\Delta}z$ and *h*, and $\delta \mathbf{V}$ and either $\gamma $ or ${w}_{k}$ is obtained in a two‐step process. First, $\delta \mathbf{V}$ and either $\gamma $ or ${w}_{k}$ are determined by the standard linear model (i.e., regression analysis) for each cell in a grid of $\mathrm{\Delta}z$ and *h* described in the Appendix. Second, the grid is searched for the global minimum of MSD. For this search to be meaningful, the sample in each grid cell must be the same. This requires a procedure for handling missing data in the Aeolus profiles. Results presented here are based on constant extrapolation but, as indicated in Section 4.4, the alternative of averaging over non‐missing data in the Aeolus profiles yields similar results. Figure 2 shows the $\mathrm{\Delta}z$ and *h* grid for the case of the standard (*lm1*) model for the training sample ($n=66,715$). The minimum occurs for $\mathrm{\Delta}z=4.5$ km and $h=0.5$ km (circled). Here MSD is normalized by MSD for the OLS solution, i.e., for $\mathrm{\Delta}z=0.5$ km and $h=0$ km, which are indicated by the thick horizontal and vertical lines. The quantity plotted is 100 times the normalized MSD and is therefore equal to 100 − CoD(OLS). This is the percent of the total sum of squares of the OLS solution remaining after the *lm1* fit. At the minimum the plotted value of 64 therefore corresponds to CoD(OLS)=36%. (For the training sample, the *lm1* and OLS models remove the bias, so the MSD is equal to the variance of OMB.) Note that the cost function displayed in Figure 2 has a reasonably well‐defined minimum, but there are three additional grid cells near the minimum where the normalized MSD is also 64. As a result, similarly good solutions are available for values of $\mathrm{\Delta}z$ and *h* one or two grid cells away (i.e., for differences of 0.5 or 1 km).
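The two‐step process can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the results reported here: the function names `fit_lm1` and `search_layer`, the data layout (scalar HLOS winds, with each profile already interpolated to a regular grid and gap‐filled), and the index `k0` of the grid level at the AMV height are all hypothetical.

```python
def fit_lm1(x, y):
    """Inner step: least-squares fit y ~ gamma * x + dv with one predictor."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    gamma = sxy / sxx
    dv = y_mean - gamma * x_mean
    msd = sum((yi - gamma * xi - dv) ** 2 for xi, yi in zip(x, y)) / n
    return gamma, dv, msd


def search_layer(amv, profiles, k0, m_values, n_values):
    """Outer step: brute-force search over layers of n levels starting at m.

    amv[i] is the i-th AMV HLOS wind and profiles[i][k0 + k] is the
    collocated profile wind at grid level k relative to the AMV height
    (assumed layout). Every cell uses the same collocations, so the MSD
    values are directly comparable.
    """
    n_levels = len(profiles[0])
    best = None
    for m in m_values:
        for n in n_values:
            start, stop = k0 + m, k0 + m + n
            if start < 0 or stop > n_levels:
                continue  # layer extends beyond the stored profile
            layer_mean = [sum(p[start:stop]) / n for p in profiles]
            gamma, dv, msd = fit_lm1(layer_mean, amv)
            if best is None or msd < best["msd"]:
                best = {"m": m, "n": n, "gamma": gamma, "dv": dv, "msd": msd}
    return best
```

The brute‐force outer loop is affordable because the grid is small, and it cannot be trapped in a local minimum, which matters when several neighbouring cells have nearly the same MSD.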

The solution parameters and key statistics for this model and the other models listed in Table 1 are given in Table 2. The solution parameters are $\mathrm{\Delta}z$, *h*, $\delta \mathbf{V}$ and $\gamma $. The statistics are the RMSD, mean OMB, CoD(OLS), and CoD(Null), all for the independent sample ($n=33,011$).

*Note*: MOMB is mean OMB. Statistics are for the independent sample ($n=33,011$).

The different models in Table 1 are examined to investigate which free parameters are necessary. As mentioned earlier, winds could be increased by increasing $\gamma $, increasing $\delta \mathbf{V}$, or increasing the heights of the layers averaged over. First, note that the models are ordered from the best to the worst in terms of CoD or MSD. For the OLS and LSO models, which do not include averaging over a layer, including the multiplicative parameter $\gamma $ with a value of about 0.9 reduces the MSD by 10% relative to the Null solution (6.96 m·s${}^{-1}$ RMSE compared to 7.35 m·s${}^{-1}$ RMSE). Note that for all the cases in which $\gamma $ is a free parameter, both cases including and not including an averaging layer, the optimal value of $\gamma $ is approximately 0.9. Averaging over a layer (4.5 km in most cases) significantly improves the fit, even in the one‐parameter *ad hoc* averaging (AHA) model. In part this may be due to the AMV being representative of a layer, but averaging the random errors of the Aeolus profile winds is certainly playing a part in improving the comparison. For the series of models including layer averaging, Table 2 shows that increasing complexity does provide better data fits. For the *lmn* model the ${w}_{k}$ are optimized level by level (Figure 3). (The ${w}_{k}$ for the other models are simply equal to $\gamma $ divided by the number of levels averaged over ($n=\mathrm{\Delta}z/\delta z$), and are not reported.) Interestingly, the shape of the weights in Figure 3 is close to a triangle. However, this solution is not well constrained and for a wide range of $\mathrm{\Delta}z$ and *h*, the optimum values of ${w}_{k}$ and $\gamma $ give nearly the same MSD.

As noted in the Introduction, AMV errors may depend on how they were produced, on location and on local conditions (e.g., Velden *et al*., 1997). Posselt *et al*. (2019) recently showed that errors of water vapour AMVs depend on wind speed as well as water vapour content and gradient. Accounting for related variables in the FTC observation operator might improve the OMB statistics. To determine which variables might be helpful in this regard, solutions for different stratifications (subsetting) of the data are presented here. In these calculations each subset is independently optimized. Since the samples used in the calculation are smaller than before, the *lm1+0* model is used here. This model uses one less parameter than the standard model, but reduces MSD to about the same degree. Following the terminology used in statistical modelling, a variable used to create sample subsets is called a factor. A factor takes on a small number of values called factor levels. A factor may be a categorical variable or a continuous variable that has been divided into a small number of intervals.

The different factors and levels are defined as:

- **Method** describes the method used to generate the AMV and has levels IR, Vis, WV, and Clear, indicating that IR, visible, cloudy WV or clear WV imagery was used.
- **Height** has levels High, Mid, and Low and is defined by breakpoints 450 and 750 hPa applied to AMV pressure.
- **Region** has levels Southern Hemisphere Extratropics (SHX), Tropics, and Northern Hemisphere Extratropics (NHX) and is defined by breakpoints $-$30 and +30 degrees applied to AMV latitude.
- **Speed** has levels Slow, Medium, and Fast and is defined by breakpoints 10 and 20 m·s${}^{-1}$ applied to the AMV speed.
- **Shear** has levels corresponding to Very negative, Negative, Neutral, Positive, and Very positive shear and is defined by breakpoints $-$3, $-$1, +1, +3 m·s${}^{-1}$·km${}^{-1}$ applied to Aeolus HLOS wind shear. This is calculated as the difference between the Aeolus HLOS wind 2 km above and 2 km below the AMV height, divided by 4 km. The extrapolated Aeolus winds are used to calculate shear.
- **Sanity** is used to estimate the uncertainty of the solutions. The collocations are randomly assigned to one of the four levels (First, Second, Third, or Fourth).
- **All** is used for the global solution (i.e., the case of no subsets). All collocations are assigned to the single level (also named “All”). The All solution is thus the *lm1+0* solution of Section 4.5.
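As an illustration of how the continuous factors map to levels, the following Python sketch bins a value using the breakpoints listed above. The names `FACTORS`, `factor_level`, and `hlos_shear` are hypothetical, and the treatment of a value exactly equal to a breakpoint (assigned here to the upper level) is an assumption, since the definitions above do not specify it.

```python
from bisect import bisect_right

# Breakpoints and level names as listed in the factor definitions.
FACTORS = {
    "Height": ([450.0, 750.0], ["High", "Mid", "Low"]),      # AMV pressure, hPa
    "Region": ([-30.0, 30.0], ["SHX", "Tropics", "NHX"]),    # AMV latitude, degrees
    "Speed": ([10.0, 20.0], ["Slow", "Medium", "Fast"]),     # AMV speed, m/s
    "Shear": ([-3.0, -1.0, 1.0, 3.0],
              ["Very negative", "Negative", "Neutral",
               "Positive", "Very positive"]),                # m/s per km
}


def factor_level(factor, value):
    """Return the level name for a value of a continuous factor.

    A value equal to a breakpoint falls in the upper level (assumed).
    """
    breaks, levels = FACTORS[factor]
    return levels[bisect_right(breaks, value)]


def hlos_shear(hlos_above, hlos_below, half_depth_km=2.0):
    """HLOS wind difference across a layer centred on the AMV, per km."""
    return (hlos_above - hlos_below) / (2.0 * half_depth_km)
```

For example, HLOS winds of 20 and 8 m·s${}^{-1}$ at 2 km above and below the AMV give a shear of 3 m·s${}^{-1}$·km${}^{-1}$, which this sketch labels Very positive under the boundary assumption stated above.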

Optimized solutions were obtained for each factor level. Summary solutions and statistics for all factors are given in Table 3. For each factor, the summary parameters and statistics are the weighted averages over the factor levels of the solution parameters and statistics, except that the summary RMSD value is calculated from the weighted average of the MSD values. (The weights are the number of collocations for each factor level.) For more granularity, level solutions and statistics for all factors and levels are given in Table 4. Note that in both tables the first CoD is with respect to the LSO solution, which, like the *lm1+0* model, has no constant term.
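The distinction between averaging RMSD values and averaging MSD values can be made concrete with a short Python sketch. The function names are hypothetical and the numbers in the usage note are invented for illustration; they are not values from Table 3 or Table 4.

```python
from math import sqrt


def weighted_mean(counts, values):
    """Average of per-level values weighted by the number of collocations."""
    return sum(c * v for c, v in zip(counts, values)) / sum(counts)


def summary_rmsd(counts, rmsds):
    """Summary RMSD: square root of the count-weighted average of the MSDs."""
    return sqrt(weighted_mean(counts, [r * r for r in rmsds]))
```

With two hypothetical levels of 100 and 300 collocations and RMSDs of 3 and 5 m·s${}^{-1}$, `summary_rmsd` returns the square root of 21 (about 4.58), whereas a direct weighted average of the RMSDs would give 4.5. Averaging the MSDs reproduces the RMSD that would be computed from the pooled sample.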

*Note*: The summary is a weighted average, except that RMSD is calculated from the weighted average of MSD. MOMB is mean OMB. Statistics are for the independent sample ($n=33,011$).

*Note*: MOMB is mean OMB. Statistics are for the independent sample.

Considering first Table 3, the one factor that stands out is Speed. Factor Speed reduces the RMSD to 5.15 m·s${}^{-1}$ and increases the CoD to 51%. Factor Speed's solution is also different in having $\gamma =0.79$, whereas all the other values of $\gamma $ are in the range 0.9–0.93. Otherwise all average parameters are similar, with $\mathrm{\Delta}z$ ranging from 4 to 4.7 km and *h* from 0.5 to 0.63 km. From the point of view of minimizing MSD, except for Speed, none of the other factors seems worth the extra degrees of freedom. However, there are some interesting features when we turn to the factor level solutions and statistics, all of which are presented in Table 4.

In Table 4 we see that for different factor levels the solutions differ considerably for factors Method, Height, Speed, and Shear, but not for factors Region and Sanity. For Region and Sanity, it is true that one factor level (SHX and First, respectively) has a different value of $\mathrm{\Delta}z$ (3.5 km). However, this is not significant recalling that in the discussion of Figure 2 several grid cells (including the two in question here – $\mathrm{\Delta}z=4.5$ km with $h=0.5$ km and $\mathrm{\Delta}z=3.5$ km with $h=0.5$ km) had nearly identical MSD values.

The next three figures show the level solutions for factors Method, Speed, and Shear, factors for which there are interesting differences between factor levels. As mentioned, factor Speed has the best performance with a CoD(Null) of 51%, compared to 42.5% for the factor All solution. Factor Method with a CoD of 44.2% and factor Shear with a CoD of 43.2% perform only slightly better than factor All.

In Figure 4 for factor Method there are notable differences between IR and visible CTW AMVs and the WV clear and cloudy AMVs both in terms of solution and fit. The WV AMVs, especially in the Clear case, have a wider optimal averaging layer which is more nearly centred on the reported AMV height. Potential reasons for this, given in Section 2, include the fact that the WV channels average over a considerable depth of the atmosphere characterized by their weighting functions, while the IR and Visible channels are sensitive to the actual cloud tops. The WV AMVs have distinctly larger RMSD and smaller CoD values. Both of these statistics indicate that it is more difficult for the FTC observation operator to fit the WV AMVs.

For factor Speed, there is a very significant increase in RMSD from Slow to Medium to Fast AMV winds. In situations with higher wind speeds, we expect greater variation both spatially and temporally, resulting in larger collocation differences. In terms of the FTC parameters graphically presented in Figure 5, $\gamma $ ranges from near 0.5 to near 1.0 with increasing AMV wind speed, thus greatly reducing the Aeolus layer mean HLOS wind estimate when AMV wind speed is low. This finding, as well as the fact that factor Speed stands out in improving the overall fit, is expected since AMV speed is correlated with AMV HLOS wind. Except for $\mathrm{\Delta}z=3.5$ km for the Fast speed level, the averaging layers are the same for all three factor levels and the same as that for factor All (i.e., $\mathrm{\Delta}z=4.5$ km and $h=0.5$ km). Thus, it is the variation of $\gamma $ which provides the improved fit for factor Speed.

For factor Shear, there are several interesting features. First, the RMSD increases with shear magnitude and the CoD decreases with shear magnitude. Second, $\mathrm{\Delta}z$ is smallest (3.5 km) for the most extreme shears (Figure 6). Since averaging over the Aeolus profile tends to reduce the random observation errors, the optimal solutions favour larger averaging layers. To understand these two features, note that in a strong shear environment, winds far from the midpoint of the averaging layer are very different from those at the other levels, opposing the tendency for large averaging layers and increasing the RMSD. In contrast, in weak shear, adding an extra level has little impact on the layer mean wind but does decrease its random error.

In this study we propose a feature track correction (FTC) observation operator for atmospheric motion vectors (AMVs). We discuss the potential use of the FTC observation operator in a variational data assimilation (DA) system and point out that variational FTC (VarFTC) parallels in several ways the variational bias correction (VarBC) used for assimilating radiance observations. The objective of the FTC observation operator is to account for differences between AMVs and the true wind which might be due to AMV height assignment errors, AMVs being representative of some layer of the atmosphere and not a single level, and AMVs being too strong or too weak compared to the true wind.

The standard version of FTC has four degrees of freedom corresponding to wind speed multiplicative and additive corrections ($\gamma $ and $\delta \mathbf{V}$), an estimate of the depth of the layer that contributes to the AMV ($\mathrm{\Delta}z$), and a vertical height assignment correction (*h*). Since the effect of the FTC observation operator is to add a bias correction to a weighted average of the profile of background winds, the more general formulation is in terms of a profile of weights (${w}_{k}$) and $\delta \mathbf{V}$. These formulations may have more degrees of freedom than necessary and some restrictions may be warranted. For example, and as noted in Section 2, there may be multiple ways of correcting a slow bias. However, in the present case, experiments (described in Section 4.5) with a variety of formulations indicate that more degrees of freedom increasingly improve the fit for the independent sample.

The FTC observation operator is tested in the context of a collocation study between AMVs projected onto the collocated Aeolus horizontal line‐of‐sight (HLOS) and the Aeolus HLOS wind profiles. This is meant to be a prototype for an implementation in a variational data assimilation system, and here the Aeolus profiles act as the background in the FTC observation operator. However, in the collocation study the FTC observation operator must account for errors in both the AMVs and the Aeolus observations. This complicates the interpretation of the results because of the Aeolus observation error characteristics. First, the Aeolus data used have had the M1 mirror temperature bias correction applied, but still have large random errors (of order 5 m·s${}^{-1}$ RMSE). These errors are thought to be due to instrument noise and therefore are uncorrelated in the vertical. This noise can be reduced by averaging the Aeolus profile in the vertical and as a result the FTC solutions favour thick averaging layers (typically 4.5 km thick) to reduce the misfit between AMVs and FTC observation operator estimates. Second, except for the WV clear AMVs, the collocations are in cloudy areas, and in such cases it appears the Aeolus profiles below cloud top have larger errors. We even find that the RMSD between AMVs and Aeolus profiles is slightly smaller 0.5 km above the AMV height (Section 4.4). As a result the FTC solutions favour averaging layers centred above the AMV height (typically by 0.5 km). These factors are also consistent with results for subsets based on the Shear and Method factors (Section 4.6).

Collocation study results were obtained for ten days of data using modest (i.e., not stringent) quality control (QC). The solution for the standard model has $\mathrm{\Delta}z=4.5$ km, $h=0.5$ km, $\gamma =0.93$, and $\delta \mathbf{V}$ negligible. As explained above, the fact that the averaging layer is thick and displaced upwards can be explained at least in part in terms of the error characteristics of the Aeolus profiles. The AHA model was used to quantify this effect. We found in Section 4.5 that the AHA model provides a reduction of mean square difference (MSD) of 37.9% compared to the value of 42.5% for the *lm1* model, demonstrating the value of both vertically averaging and the FTC observation operator. The optimal reduction in Aeolus wind by a factor of approximately 0.9 in FTC solutions with and without layer averaging suggests that the AMVs are relatively slow compared to the Aeolus winds. The overall RMS collocation difference for the standard model for the independent sample is 5.57 m·s${}^{-1}$ with negligible mean. For comparison the RMSD of the corresponding closest in $\mathrm{ln}(p)$ collocation is 7.96 m·s${}^{-1}$, and the null solution, which only interpolates the Aeolus profile to the reported height of the AMV, has an RMSD of 7.35 m·s${}^{-1}$. These values correspond to reductions in MSD of 51% and 43% due to the FTC observation operator in comparison to the closest in $\mathrm{ln}(p)$ and interpolated in *Z* collocations, respectively.
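The quoted percentages follow from the RMSD values because MSD is the square of RMSD. The arithmetic is shown below as a small Python check; the function name `msd_reduction_percent` is hypothetical.

```python
def msd_reduction_percent(rmsd_new, rmsd_ref):
    """Percent reduction in MSD when the RMSD drops from rmsd_ref to rmsd_new."""
    return 100.0 * (1.0 - (rmsd_new / rmsd_ref) ** 2)


# RMSD values quoted in the text, in m/s.
vs_closest = msd_reduction_percent(5.57, 7.96)       # closest in ln(p) collocation
vs_interpolated = msd_reduction_percent(5.57, 7.35)  # interpolated in Z (null solution)
print(round(vs_closest), round(vs_interpolated))
```

A reduction of about one quarter in RMSD (7.35 to 5.57 m·s${}^{-1}$) therefore corresponds to a reduction of more than 40% in MSD.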

In the present case the findings relate more to the errors of the Aeolus profiles than the AMVs. However, these preliminary tests do demonstrate the potential for the FTC observation operator to

As the next step to implement VarFTC, our ongoing research uses DA backgrounds in place of Aeolus winds as input to the FTC observation operator. This approach has several advantages: it provides a much larger set of data; it eliminates issues related to the Aeolus Rayleigh‐clear wind error characteristics; and it compares wind vectors instead of HLOS winds. For example, in the work reported, other than the factor Speed, the stratifications described provided little or no benefit. However, in our ongoing research, thanks to much larger sample sizes, we can stratify first by individual types of AMVs (i.e., by sensor and method) and still have sufficient sample sizes to consider other factors. Since local biases are often detected by routine monitoring of observation‐minus‐background (O‐B) statistics as reported by the EUMETSAT Satellite Application Facility on Numerical Weather Prediction (NWP SAF) at

**Ross N. Hoffman:** conceptualization; formal analysis; investigation; methodology; validation; visualization; writing – original draft; writing – review and editing. **Katherine E. Lukens:** data curation; investigation; resources; writing – review and editing. **Kayo Ide:** conceptualization; methodology; project administration; resources; supervision; writing – review and editing. **Kevin Garrett:** conceptualization; funding acquisition; project administration; supervision; writing – review and editing.

The authors thank their colleagues who encouraged and challenged them in this work, particularly David Santek (CIMSS). Anonymous reviewer comments were thought‐provoking and helped to improve this article. ESA provided the Aeolus L2B EE data sets. NCEP provided the SATWND BUFR AMV data archive. The University of Wisconsin‐Madison provided the S4 supercomputing system (Boukabara *et al*., 2016). Jim Jung (CIMSS) produced a declassified version of the SATWND archive for our use on S4. The authors acknowledge support from NOAA [NA14NES4320003 and NA19NES4320002] through CICS and CISESS at the University of Maryland/ESSIC.

To discretize Equation (2) we first interpolate the background wind to a regular vertical grid, ${z}_{k}=k\delta z$, relative to the AMV reported height. That is, the interpolated wind profile is centred at each observation, but all profiles have the same grid increment and levels.

The analogue of Equation (1) in terms of the discretized background wind profile is

Here the sum is over a total of *n* levels, starting at level *m*, so $k=m$ to $m+n-1$. The correspondence to the original parameters is given by $\mathrm{\Delta}z=n\delta z$ and $h=\{m+(n-1)/2\}\delta z$.
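The correspondence between the grid indices and the layer parameters can be checked with a short Python sketch. The function names are hypothetical, and the default grid increment of 0.5 km is an inference from the OLS case in the main text (a single level with $\mathrm{\Delta}z=0.5$ km), not a value stated in this Appendix.

```python
def layer_from_indices(m, n, dz=0.5):
    """Return (thickness, height offset) in km for n levels starting at level m."""
    thickness = n * dz
    h = (m + (n - 1) / 2) * dz
    return thickness, h


def indices_from_layer(thickness, h, dz=0.5):
    """Inverse mapping: return (m, n) for a layer of given thickness and offset."""
    n = round(thickness / dz)
    m = round(h / dz - (n - 1) / 2)
    return m, n
```

Under this assumption the standard solution ($\mathrm{\Delta}z=4.5$ km, $h=0.5$ km) corresponds to $n=9$ levels starting at $m=-3$, and the single level at the AMV height corresponds to $m=0$, $n=1$.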

The regular vertical grid allows the use of the standard linear model (i.e., regression analysis) to optimize the ${w}_{k}$. That is, for a given $\mathrm{\Delta}z$ and *h* Equation (A1) is a linear model with predictors ${\mathbf{V}}_{k}$, coefficients ${w}_{k}$, and intercept $\delta \mathbf{V}$. The global optimum would then be taken over $\mathrm{\Delta}z$ and *h*. We find the global optimum by searching a small (*m*, *n*) grid with *m* in [$-$8,2] and *n* in [1,14]. We call Equation (1) the *lmn* model because it is a linear model with *n* predictors.

Equation (A1) can be further simplified by assuming a fixed shape for the ${w}_{k}$ that can be specified with a small number of (to be optimized) parameters, such as a trapezoid or truncated Gaussian hill. Now Equation (2) still holds, but

Equation (2) is a linear model with the weighted average background wind as the predictor, coefficient $\gamma $, and intercept $\delta \mathbf{V}$. If we assume the ${w}_{k}$ correspond to the boxcar shape, then ${w}_{k}^{\prime}=1/n$, ${w}_{k}=\gamma /n$ and $\overline{\mathbf{V}}$ is the layer average background wind. We call Equation (2) the *lm1* model since there is only one predictor. If the intercept is fixed to be zero, we call this the *lm1+0* model. (In the R programming language, linear models are calculated by the function *lm* and “+0” in the formula for the model indicates a zero constant term.)

A final simplification is to assume $\gamma =1$ so that

This is a linear model for the l.h.s. in terms of a constant intercept $\delta \mathbf{V}$. Since there are no predictors, we call this the *lm0* model.

In each of these cases, the global optimum is taken over $\mathrm{\Delta}z$ and *h*. The height adjustment can be turned off by restricting the search to $h=0$, and vertical averaging of the background can be turned off by restricting the search to a single level, i.e., $\mathrm{\Delta}z=\delta z$. In the case of no height adjustment and no vertical averaging ($h=0$ and $\mathrm{\Delta}z=\delta z$), the *lm1* model reduces to the ordinary least squares (OLS) model, the *lm1+0* model reduces to the least squares through the origin (LSO) model, and the *lm0* model reduces to the simple bias correction (SBC) model.
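For reference, the hierarchy of models described in this Appendix can be summarized compactly. The forms below are a sketch assembled from the verbal descriptions (predictors, coefficients, and intercepts); the use of ${\mathbf{V}}^{\mathrm{o}}$ on the left‐hand side as the quantity being fitted, and the relation ${w}_{k}=\gamma {w}_{k}^{\prime}$ generalized from the boxcar case, are assumptions made for this summary.

$$
\begin{aligned}
\textit{lmn}:&\quad {\mathbf{V}}^{\mathrm{o}}\approx \sum_{k=m}^{m+n-1}{w}_{k}{\mathbf{V}}_{k}+\delta \mathbf{V},\\
\textit{lm1}:&\quad {\mathbf{V}}^{\mathrm{o}}\approx \gamma \overline{\mathbf{V}}+\delta \mathbf{V},\qquad \overline{\mathbf{V}}=\sum_{k=m}^{m+n-1}{w}_{k}^{\prime}{\mathbf{V}}_{k},\quad {w}_{k}=\gamma {w}_{k}^{\prime},\\
\textit{lm1+0}:&\quad {\mathbf{V}}^{\mathrm{o}}\approx \gamma \overline{\mathbf{V}},\\
\textit{lm0}:&\quad {\mathbf{V}}^{\mathrm{o}}-\overline{\mathbf{V}}\approx \delta \mathbf{V}.
\end{aligned}
$$

Setting $n=1$ and $m=0$, so that $\overline{\mathbf{V}}$ is the background wind at the AMV height, gives the OLS, LSO, and SBC models from the *lm1*, *lm1+0*, and *lm0* forms, respectively.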

Note that the discretization of the integration in Equation (A1) does not actually represent a reduction in flexibility in fitting the data. This is true since any form of $w(z)$ would necessarily reduce to the form of Equation (A1) in which the values of the ${w}_{k}$ would be determined by the integration bounds, the weighting function shape and details of the finite difference integration method.