A 10-year reanalysis of the PacIOOS Hawaiian Island Ocean Forecast System was produced using incremental strong-constraint 4-D variational data assimilation with the Regional Ocean Modeling System (ROMS v3.6). Observations were assimilated from a range of sources: satellite-derived sea surface temperature (SST), salinity (SSS), and height anomalies (SSHAs); depth profiles of temperature and salinity from Argo floats, autonomous Seagliders, and shipboard conductivity–temperature–depth (CTD) casts; and surface velocity measurements from high-frequency radar (HFR). The performance of the state estimate is examined against a forecast, showing an improved representation of the observations, especially of HFR surface currents. Empirical orthogonal functions (EOFs) of the increments made during the assimilation to the initial conditions and atmospheric forcing components are computed, revealing the variables that are influential in producing the state-estimate solution and the spatial structure that the increments form.

The Pacific Integrated Ocean Observing System

The PacIOOS forecast system uses the time-dependent incremental strong-constraint
four-dimensional variational data assimilation (I4D-Var) scheme

Our model domain covers the Hawaiian Island Archipelago
(Fig.

Model domain and bathymetry, with mean currents labeled from

There are two main objectives to this study: to assess the skill and
performance of the state-estimation model and to analyze the increments made
to the initial, boundary, and atmospheric forcing terms. For the first
objective, we compare the state-estimate solution with a free-running
forecast over the decadal time period and examine how the performance changes
over time using observations derived from satellites and in situ
measurements. In addition, PacIOOS operates seven high-frequency radar
stations across the Hawaiian Islands. The first station was constructed
in 2010, with the remaining six becoming operational over the period from
2011 to 2015. These instruments produce high-resolution (both spatially and
temporally) surface current velocities in the vicinity of the islands of
O`ahu and Hawai`i. The use of HFR observations within a state-estimation
scheme has been shown to produce a significantly improved representation of
surface currents

Section

The Regional Ocean Modeling System (ROMS) version 3.6 is used to simulate the
physical ocean around the Hawaiian Islands. ROMS is a free-surface,
hydrostatic, primitive equation model using a stretched coordinate system in
the vertical to follow the underwater terrain. In order to allow varying time
steps for the barotropic and baroclinic components, ROMS utilizes a
split-explicit time stepping scheme (for more details on ROMS, see

The Hawaiian Island domain covers 164–153

Tidal forcing is produced using the OSU Tidal Prediction Software (OTPS)

Lateral boundary conditions are taken from the HYbrid Coordinate Ocean Model
(HYCOM)

From 2007 to 2009, atmospheric forcing fields (excluding the wind) are
provided by the National Centers for Environmental Prediction (NCEP)
reanalysis fields

To blend the two, we convert the MM5 winds to anomalies by subtracting a
30-day mean centered about the record of interest. We compute the mean for
the same period from the CORA/NCEP winds. The difference between the two
means provides a bias estimate. The bias is removed from the MM5 anomalies
and the CORA/NCEP mean is added. Within a 1

From July 2009, atmospheric forcing is provided locally by a high-resolution
Weather Research and Forecasting (WRF) model

Prior to the experiment, the model was run for 6 years without assimilation using the same initial state, boundary conditions, and atmospheric forcing. The variability of this run is used to estimate the background error covariances used within I4D-Var, and its mean sea surface height is used with the sea level anomaly observations.
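As a minimal sketch of this step, assume that the background error is characterized only by the per-grid-point standard deviation of the free run (the diagonal of the covariance) and that altimeter anomalies are converted to absolute heights by adding the free-run mean. The function names and the flat list-of-snapshots layout are hypothetical and are not taken from the ROMS configuration.

```python
from statistics import fmean, pstdev


def free_run_statistics(ssh_snapshots):
    """Return (mean, standard deviation) per grid point.

    ssh_snapshots is a list of model snapshots, each a flat list of sea
    surface height values with one entry per grid point (assumed layout).
    """
    per_point = list(zip(*ssh_snapshots))  # transpose to one series per point
    mean_ssh = [fmean(series) for series in per_point]
    std_ssh = [pstdev(series) for series in per_point]
    return mean_ssh, std_ssh


def sla_to_ssh(sla_obs, mean_ssh_at_obs):
    """Add the free-run mean SSH (sampled at the observation points) to
    sea level anomaly observations to obtain absolute heights."""
    return [anomaly + mean for anomaly, mean in zip(sla_obs, mean_ssh_at_obs)]
```

Here `std_ssh` stands in for the background error estimate at each point, and `sla_to_ssh` shows why a model-consistent mean is needed before anomalies can be compared with the model sea surface height.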

The cost function of the I4D-Var method penalizes the increments made to
the initial conditions, the boundary conditions, and the forcing, as well as the
deviations of the model state from the observations. A detailed derivation of
the cost function can be found in

The reanalysis covers a period of 10 years from July 2007 to July 2017. The
period of assimilation for the I4D-Var cycles is 4 days, which corresponds
to the limit of the linearity assumption within the domain

During each I4D-Var cycle, a minimization procedure is applied. The
nonlinear model is first integrated forward to estimate the background state
(the first

Several 4- and 8-day forecasts are performed from the end of each cycle using the assimilated state as initial conditions, and the short-range (1–4 days) and midrange (5–8 days) forecasts are evaluated for skill.

Observational data used within this study include satellite measurements of
sea surface temperature, height, and salinity; in situ depth
profiles of temperature and salinity; and surface velocities from high-frequency radar. Observations within one Rossby radius (

Sea surface temperature (SST) observations are available from two sources at
different time periods: initially we used the Global Ocean Data Assimilation
Experiment High Resolution Sea Surface Temperature (GHRSST) level 4 OSTIA
Global Foundation Sea Surface Temperature Analysis

Beginning in April 2008, we switched to using the GHRSST level 4 K10_SST
global 1 m sea surface temperature analysis dataset

Sea surface height (SSH) observations are derived using sea level anomaly
data from the Archiving, Validation and Interpretation of Satellite
Oceanographic data (AVISO) delayed-time along-track information. The data
come from multiple altimeter satellites measuring the anomaly with respect
to a 20-year mean SSH, homogenized against one of the missions to ensure
consistency. Each track has approximately

Sea surface salinity (SSS) data are taken from the Aquarius mission daily L3
gridded dataset

Depth profiles of temperature and salinity are obtained from three sources: the Hawai`i Ocean Time-Series (HOT) shipboard conductivity–temperature–depth (CTD) casts, the global network of Argo floats, and autonomous Seagliders operated by the University of Hawai`i.

The HOT project conducts monthly cruises to the deep water station ALOHA (A
Long-term Oligotrophic Habitat Assessment; located at
23

HOT also conducts regular Seaglider missions departing from station ALOHA. In addition, PacIOOS conducts occasional Seaglider surveys in areas close to the south coast of O`ahu. These buoyancy-driven autonomous underwater vehicles collect profiles and transects of temperature and salinity at depth.

Observations from the global Argo float network are available from the Argo
array network

Representational errors for HOT CTDs, Argo floats, and Seagliders are defined
by the variance of observational data from all available sources across our
domain sorted into depth bins. These profiles resemble a typical
temperature–salinity profile, with a peak temperature error of

Composite image of percentage coverage for all radar sites (situated at green dots) when all are operational. Where two sites overlap, the greater value is taken to indicate the level of coverage at each point.

Number of observations used within the data assimilation run. Note that there tend to be orders of magnitude more satellite or remotely sensed observations than in situ observations.

HFR measurements of surface currents are available from PacIOOS at seven
sites around the Hawaiian Islands: five around the southwest of O`ahu and
two on the east coast of Hawai`i. Data are available from the first site
from October 2010, with the other sites coming online at various times, the most
recent being October 2015. The range of the HFRs on O`ahu extends
approximately

The numbers of observations for each 4-day cycle from all sources are shown
in Fig.

Time series of percentage reduction in the I4D-Var cost function;
in the left column are pre-HFR observations and in the right column are post-HFR observations, with the mean value
given in parentheses. Dashed lines mark the limit of

In this section we examine the state estimate to quantify the performance during our time period.

I4D-Var minimizes the residuals between the model and observations over each 4-day cycle. We calculate the percentage reduction between the initial and final cost function for each cycle to assess how the assimilation performs over time. Additionally, the I4D-Var algorithm reports the individual contributions by the state variables considered by the data assimilation to the total cost function. Hence we can examine the cost function in detail for those observation types that are most critical for its reduction. However, it should be noted that for this decomposition we do not distinguish between observation sources.
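The percentage reduction described above can be sketched as follows. The helper names and the per-variable numbers are hypothetical illustrations, not values from the reanalysis.

```python
def percent_reduction(j_initial, j_final):
    """Percentage by which the cost function drops over one assimilation cycle."""
    if j_initial <= 0.0:
        raise ValueError("initial cost must be positive")
    return 100.0 * (j_initial - j_final) / j_initial


def reduction_by_variable(initial, final):
    """Apply percent_reduction to each state-variable contribution."""
    return {name: percent_reduction(initial[name], final[name]) for name in initial}


# Hypothetical contributions for a single 4-day cycle (illustrative only).
initial_cost = {"temperature": 800.0, "salinity": 50.0, "velocity": 1000.0}
final_cost = {"temperature": 400.0, "salinity": 40.0, "velocity": 400.0}
reductions = reduction_by_variable(initial_cost, final_cost)
```

Computing the reduction per state variable, rather than only for the total, is what allows the contributions of temperature, salinity, and velocity observations to be compared across cycles.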

Figure

The total cost function of all data (Fig.

Salinity measurements tend to contribute the least improvement, ranging from
34 % (pre-HFR) to 16 % (post-HFR). Salinity data are the least numerous
(Fig.

The cost function associated with HFR measurements is reduced by 60 % of the initial value, meaning the model is closer to the HFR observations after the assimilation.

Another measure of performance is the theoretical minimum value of the
cost function (

This optimality value provides a simple representation of how consistently
the error matrices (

Post-HFR, the optimality value increases, suggesting the errors in this period are underestimated. A large optimality value arises when the cost function is large (i.e., large differences between the model and observations). There were two anomalous cycles in 2011; the first coincides with the introduction of a second radar site. From 2012 onwards the optimality value is generally good, if highly variable. The increase in optimality given the available observations points to an underestimation of HFR errors or at the least a persistent difference between the model and HFR observations.
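As a hedged illustration of how such an optimality value behaves, the sketch below assumes the common convention in which the cost function carries a factor of one-half, so that its expected minimum equals half the number of observations when the error covariances are correctly specified. The normalization and the function name are assumptions and may differ from the definition used for the figures in this paper.

```python
def optimality(j_min, n_obs):
    """Ratio of the achieved minimum cost to its expected value n_obs / 2.

    Values near 1 indicate consistent error specifications; values above 1
    suggest the prescribed errors are too small, and values below 1 suggest
    they are too large.
    """
    if n_obs <= 0:
        raise ValueError("number of observations must be positive")
    return 2.0 * j_min / n_obs
```

Under this convention, a cycle with 1000 observations and a final cost of 750 gives a value of 1.5, the kind of increase that the text attributes to underestimated HFR errors.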

The consistency of the assimilation can be assessed by comparing the error
matrices

Time series of spatially averaged background (blue) and observation
(green) errors, with thick lines showing a priori values and thin lines the
posterior calculated using Eqs. (

Similarly, using the difference between the observation

For a detailed description of the above diagnostics the reader is referred to

Figure

Sea surface salinity observation errors (Fig.

This error consistency analysis supports the conclusions in
Sect.

Time series of root mean squared anomalies (RMSAs) between remotely
sensed observations and two model realizations: the state estimate (orange)
and the forecast (blue).

Because I4D-Var relies on the model physics to represent observations through time, it should provide better forecasts. Time-invariant methods (3D-Var, optimal interpolation) that perturb the state at single times may better reduce the time-fixed cost function, but can add nonphysical structures that generate noisy forecasts.

In this section, we examine the state-estimate solution by comparing the model to observations. For reference, the observations are also compared against the forecast starting from the same time as each state-estimate cycle. The initial and boundary conditions and the atmospheric and tidal forcings are initially identical for both runs; however, the initial conditions, boundary conditions, and atmospheric forcing are adjusted as part of the state-estimate solution.

For comparing fields we use the root mean squared anomaly (RMSA) and the
anomaly correlation coefficient (ACC), defined as
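A minimal sketch of both metrics follows, assuming the common definitions in which anomalies are taken relative to each series' own mean. The choice of reference mean is an assumption (a climatology could be used instead), and the function names are hypothetical.

```python
import math


def _anomalies(values):
    """Remove the series mean (assumed reference for the anomalies)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def rmsa(model, obs):
    """Root mean squared difference between model and observed anomalies."""
    if len(model) != len(obs):
        raise ValueError("model and obs must have the same length")
    m, o = _anomalies(model), _anomalies(obs)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(m, o)) / len(m))


def acc(model, obs):
    """Anomaly correlation coefficient between model and observations."""
    if len(model) != len(obs):
        raise ValueError("model and obs must have the same length")
    m, o = _anomalies(model), _anomalies(obs)
    denom = math.sqrt(sum(a * a for a in m) * sum(b * b for b in o))
    if denom == 0.0:
        raise ValueError("one of the series has no variability")
    return sum(a * b for a, b in zip(m, o)) / denom
```

The two metrics are complementary: RMSA measures the magnitude of the mismatch, whereas ACC measures whether the model and observations vary in phase regardless of amplitude.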

Time series of anomaly correlation coefficients (ACC) between
remotely sensed observations and two model realizations: the state estimate
(orange) and the forecast (blue).

Spatial maps of RMSA for SST observation sources for the
forecast

Figure

The ACC is also improved by the state estimate for all variables, as shown in
Fig.

Spatial maps of HFR statistics for south O`ahu for the
forecast

Figure

Both RMSA and ACC between the experiments and HFR observations are shown in
Fig.

As discussed in

RMSA (solid) and ACC (dashed) profiles of subsurface
temperature

The in situ observation sources (Argo floats, Seagliders, and HOT CTDs) also
show an improvement in the state estimate over the forecast. The subsurface
temperature RMSA values are reduced by an average of

Figure

For subsurface salinity (Fig.

Mean skill metric for remotely sensed observations as a function of
forecast length. Solid lines: skill (see Eqs.

In this section we quantify the model skill by using a skill score evaluated
as the improvement against a reference field

For this verification we wish to examine the effect of forecast length on the
skill. Starting with the same initial conditions as each state-estimate cycle
we produce an 8-day forecast, the length of two state-estimate cycles.
The RMSA is calculated every

EOF1 and PC1 of initial condition increments for temperature,
east–west velocity, and north–south velocity (all averaged 0–100 m) and of
forcing perturbations applied to surface heat flux. The EOFs were calculated
using the routines described in

Figure

During each I4D-Var 4-day window, the initial model field and
time-varying boundary and surface forcings are adjusted to minimize the
residuals. The initial condition increments form a single record for each
cycle, while the boundary and surface forcings are perturbed every time they
are applied to the model. The perturbations applied to the boundary exhibit
only a minor influence on the model (not shown) due to the mean advection
speed (

Because we are analyzing the increments (rather than the state) to the
initial conditions and forcing fields, the mean increment should be zero
(unless there is a bias in the model), and we are looking to examine the
variability. Over the entire reanalysis period, the mean biases between the
model and observations for the different types are temperature
(

Over the 10-year reanalysis, there are

For each cycle, the initial perturbations of the primary model prognostic
variables are examined: sea surface height, temperature, salinity, east–west
velocity, and north–south velocity. With the exception of sea surface height,
each variable is averaged over the upper 100 m to cover the mixed layer
depth in the domain

The assimilation was configured to optimize the surface forcing increments every 6 h (to avoid overadjustment). Because the time of day potentially affects the forcing variables, particularly surface heat flux, we calculate EOFs on the increments separately for each of the four times of day at which they occur (00:00, 06:00, 12:00, 18:00 UTC). Due to the size of the model grid, the number of records, and the computational resources available, the EOF calculation is limited to a 4-year period, with approximately 1500 records. Several different periods were examined, with no significant differences in the structure of the modes or their percentage of variance explained. The time of day does affect the percentage of variance explained by each mode, most notably for surface heat flux, which is subject to diurnal solar heating. However, the overall locations and magnitudes of the peaks and troughs, as well as the temporal evolution of the PCs, do not differ significantly between times of day, so we present one of the modes for each variable considered.
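The grouping by time of day and the extraction of a leading mode can be sketched as follows. This is an illustrative pure-Python power iteration on small flattened fields, not the EOF routines used for the reanalysis; the function names and the `(hour, field)` record layout are assumptions.

```python
import math
from collections import defaultdict


def group_by_hour(records):
    """Group (hour_utc, field) pairs so each time of day is analysed separately."""
    groups = defaultdict(list)
    for hour, field in records:
        groups[hour].append(list(field))
    return dict(groups)


def leading_eof(fields, n_iter=200):
    """Return (eof1, pc1, variance_fraction) for a list of flattened fields.

    Uses power iteration on A^T A, where A holds the records with the
    time mean removed at every grid point.
    """
    n_t, n_x = len(fields), len(fields[0])
    mean = [sum(f[j] for f in fields) / n_t for j in range(n_x)]
    anom = [[f[j] - mean[j] for j in range(n_x)] for f in fields]
    total_var = sum(a * a for row in anom for a in row)
    if total_var == 0.0:
        raise ValueError("fields have no variability")

    v = [1.0 / math.sqrt(n_x)] * n_x  # arbitrary non-zero starting vector
    for _ in range(n_iter):
        pc = [sum(row[j] * v[j] for j in range(n_x)) for row in anom]  # A v
        w = [sum(anom[i][j] * pc[i] for i in range(n_t)) for j in range(n_x)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("starting vector is orthogonal to the data")
        v = [x / norm for x in w]  # normalized A^T (A v)

    pc = [sum(row[j] * v[j] for j in range(n_x)) for row in anom]
    variance_fraction = sum(p * p for p in pc) / total_var
    return v, pc, variance_fraction
```

Calling `leading_eof(group_by_hour(records)[6])` would give the spatial pattern, principal component, and explained-variance fraction for the 06:00 UTC increments alone, which is how the modes for different times of day can be compared. For grids of realistic size, a singular value decomposition from a numerical library would be the practical choice.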

Spatial EOF patterns and principal components (PCs) of wind stress perturbations for the period prior to the assimilation of HFR measurements (June 2007–September 2010).

The four key surface forcing terms are surface heat flux, surface salinity flux, east–west wind stress, and north–south wind stress. Of these, increments in surface salinity flux are quite small compared to their initial value, while increments in surface heat flux (10 %–15 % of initial value) and the wind stresses (15 %–20 % of initial value) are significant.

For surface heat flux and near-surface temperature, we observe that the EOF1
modes represent 63 % and 20.8 % of the variability, respectively, with
a consistent sign over the region (Fig.

Spatial EOF patterns and principal components (PCs) of wind stress perturbations for the period including the assimilation of HFR measurements (January 2011–January 2014).

The EOF1 modes of the near-surface velocity increments explain 26.1 % and
20.8 % of the variance, respectively. Both modes exhibit a strong impact
south of the main Hawaiian Islands. The structure of the wind stress curl in
this region results in the spin-up of cyclonic and anticyclonic eddies to the
north and south of the lee side of each island, respectively

The EOFs of surface wind stress increments are confined to relatively small
regions of the model domain (Figs.

With the integration of the HFR measurements (October 2010), the dominant
wind stress increments occur across the shallow region close to the south
coast of O`ahu (Fig.

We have presented a 10-year reanalysis of the PacIOOS Hawaiian Island Ocean Forecast System and assessed the performance of the state-estimate solution and free-running forecasts. Using a time-dependent incremental strong-constraint four-dimensional variational data assimilation (I4D-Var) scheme, we show that the model represents the observational data well over the time period. The state-estimate solution reduces the RMSA compared to the forecast by 3 % (salinity) to 37 % (surface velocities). A limitation of the model–observation comparison is that, in the absence of a sufficient number of independent observations, only assimilated data could be used for validation.

The largest reduction of the cost function of the state-estimate solution occurs when minimizing the residuals to HFR data, with SST also accounting for a significant improvement. On average, the assimilation achieves the near-optimal solution; however, the variability is heavily influenced by the HFR observations. The analysis suggests that the observational errors associated with HFR are too low and results could be improved by redefining these errors. This is supported by the increase in variability and upward trend of optimality towards the end of the time period during which HFR observations are most numerous.

The increments made by the reanalysis reveal that sea surface height and salinity initial conditions are not significantly adjusted by the I4D-Var procedure, whereas temperature and velocity account for a significant change from the forecast field. For the atmospheric forcing, the adjustment to surface salinity flux is insignificant, but the adjustments made to surface heat flux and wind stresses alter the forcings by up to 20 %. This is consistent with the cost function statistics, which point to HFR and temperature as the two dominant observation sources.

The dominant EOF mode for adjustments of surface heat flux and near-surface temperature exhibits a monopole structure, indicating a slight bias correction between the ocean and atmospheric models. The leading modes of wind stress increments are concentrated in the region south of O`ahu. The wind stress heavily influences the surface currents, and adjustments are mostly made as a consequence of the HFR data. Additional analysis reveals that wind stress adjustments in the channels between the islands dominated the increments in the period prior to the radar-based measurements of surface currents.

The reanalysis has served as a test bed for improvements to the PacIOOS
operational forecast system. The data are being used to update the back
catalog available to the public at

The specific ROMS Fortran source for this package is under
the MIT license and is available from

Atmospheric surface forcing and HF radar observations are
distributed through the PacIOOS data portal at

DP and BSP designed and conducted the reanalysis simulations. All three authors contributed to the analysis and interpretation of the model results and to writing the paper.

The authors declare that they have no conflict of interest.

The authors would like to thank the GODAE for hosting the Argo observations
and the HOT project for CTD and Seaglider data. The authors would also like
to thank Yi-Leng Chen of the University of Hawai`i Department of Meteorology
for the atmospheric model data MM5 and WRF. The authors are grateful to two
anonymous reviewers and the editor for helping improve this paper. This work
was supported by PacIOOS, funded in part by
National Oceanic and Atmospheric Administration (NOAA) award
no. NA16NOS0120024. This is SOEST publication no. 10525.

Edited by: Steven Phipps
Reviewed by: two anonymous referees