Seafood-borne
Long-term monitoring has been established in the Great Bay Estuary (GBE) by the Northeast Center for
The goal of this study was to develop an integrated modeling approach to predict
The study area was the Great Bay estuary in New Hampshire. The two sampling locations (
Oyster samples were collected from the two oyster beds at NI and OR from June 2007 through December 2016, except during the period January–March. For each sampling date, 10–12 oysters were cleaned and aseptically shucked into a sterile beaker (liquor and meat), weighed, diluted 1:1 with alkaline peptone water (APW; pH 8.6, 1% NaCl), and homogenized. A 20 mL volume of homogenate was further diluted in 80 mL APW for a starting dilution of 1:10. A 1 mL volume of the 1:10 solution was added to three tubes and then serially diluted in 1 mL aliquots through three serial dilutions, each containing 9 mL APW (pH 8.6, 1% NaCl). Each tube was incubated at 37 °C overnight (18–20 h) following the U.S. Food and Drug Administration Bacteriological Analytical Manual (BAM) [
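The three-tube dilution series described above yields a most probable number (MPN) per gram. The following is a minimal Python sketch of the maximum-likelihood estimate that underlies standard MPN tables; it is not the authors' code (the study reports using R and the BAM procedure). The function name `mpn_estimate` and the sample masses per tube (0.1, 0.01, 0.001 g) are assumptions made for illustration.

```python
import math


def mpn_estimate(positives, tubes, grams_per_tube):
    """Maximum-likelihood MPN per gram for a serial-dilution tube assay.

    positives[i]      -- number of turbid (positive) tubes at dilution i
    tubes[i]          -- number of tubes inoculated at dilution i
    grams_per_tube[i] -- grams of sample per tube at dilution i (assumed values)
    """
    if sum(positives) == 0:
        return 0.0  # no growth in any tube
    if all(p == n for p, n in zip(positives, tubes)):
        return math.inf  # every tube positive: above the quantifiable range

    # Sample mass in the tubes that stayed negative.
    negative_mass = sum((n - p) * m
                        for p, n, m in zip(positives, tubes, grams_per_tube))

    def score(lam):
        # Derivative of the log-likelihood; it decreases monotonically in lam.
        total = 0.0
        for p, m in zip(positives, grams_per_tube):
            x = lam * m
            if p > 0 and x <= 700:  # beyond this the term is effectively zero
                total += p * m / math.expm1(x)
        return total - negative_mass

    # Bracket the root, then bisect.
    lo, hi = 1e-9, 1.0
    while score(hi) > 0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative pattern: one positive tube at the lowest dilution only.
example = mpn_estimate([1, 0, 0], [3, 3, 3], [0.1, 0.01, 0.001])  # about 3.6 MPN/g
```

A closed-form table lookup would work equally well for standard tube patterns; the likelihood form is shown because it handles any number of tubes and dilutions.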
Following incubation, turbid APW tubes were scored positive for growth. From 2007 to 2010, turbid tubes were streaked onto thiosulfate-citrate-bile salts-sucrose (TCBS) agar (Becton Dickinson (BD), Franklin Lakes, NJ, USA) and incubated at 37 °C for 18–20 h. From 2011 to 2016, turbid tubes were streaked onto Vibrio CHROMagar (CHROMagar, Paris, France) and incubated at 37 °C for 18–20 h. Sucrose-negative (green) colonies from TCBS or purple colonies from CHROMagar were streaked onto tryptic soy agar (TSA; BD) and incubated at room temperature for 18–20 h. TSA isolates were inoculated into Heart Infusion (HI) broth for 18–20 h. Then, 1 mL HI aliquots were pelleted for 5 min at 8000 rpm, resuspended in 1 mL molecular biology grade water (Phenix Research Products, Candler, NC, USA), and boiled at 100 °C for 10 min, after which debris was removed by centrifugation. Species identity of isolates was determined by polymerase chain reaction (PCR) performed using 2 μL cleared supernatant in 13 μL master mix (iQ Supermix; Bio-Rad, Hercules, CA, USA) on a Bio-Rad T100 thermocycler with published primers and conditions [
All statistical computations were performed in the R Statistical Program and Environment, version 3.5.1 [
All measurements were arranged in chronological order based on the date of measurement and multiple time series were compiled for the entire study period. The relationships between the time series for water quality variables, including water temperature, salinity, pH, DO, turbidity, CHL, TDN, rainfall and
To explore the seasonality and the general trend throughout the whole study period (2007–2016) in all variables—
In both models,
When estimates of
The models’ performance was determined by the deviance explained, residual variation, Akaike’s Information Criterion (AIC), and coefficient of determination (
In addition to a general trend and Mann–Kendall trend analysis, we explored potential trends in high values of
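For readers unfamiliar with the Mann–Kendall test mentioned here, the following is a minimal Python sketch of the standard statistic with a tie correction and a normal approximation. It is not the authors' implementation (the study reports using R); the function name `mann_kendall` and the example series are hypothetical.

```python
import math
from collections import Counter


def mann_kendall(values):
    """Return (S, Z, two-sided p) for a chronologically ordered series."""
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)  # sign of each later-minus-earlier pair

    # Variance of S, reduced for groups of tied values.
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in Counter(values).values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0

    # Continuity-corrected normal score.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided
    return s, z, p_value


# Hypothetical, strictly increasing series.
s_stat, z_stat, p_val = mann_kendall([1, 2, 3, 4, 5])
```

The normal approximation is rough for very short series; exact tables or a permutation approach would be a reasonable alternative for annual summaries with only ten points.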
To explore the relationship between the response variable,
Nonlinear relationships were initially assessed using nine default thin-plate splines
The relationships between the environmental conditions and
The environmental parameters determined to be significant in univariate models (Models 5 and 6) were incorporated into a multivariate general linear regression model using Gaussian (GLMG) and negative binomial (GLMNB) distributional assumptions. For GLMNB, the dispersion was determined by the index of dispersion: ∅ =
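As a hedged illustration of how overdispersion can motivate a negative binomial assumption, the sketch below computes an index of dispersion under the conventional definition (sample variance divided by sample mean). That definition is an assumption here and is not taken from the source's own expression; the function name `index_of_dispersion` and the example counts are hypothetical, and the code is Python rather than the R used in the study.

```python
import statistics


def index_of_dispersion(counts):
    """Variance-to-mean ratio; values well above 1 suggest overdispersion."""
    mean = statistics.fmean(counts)
    if mean == 0:
        raise ValueError("index of dispersion is undefined for a zero mean")
    return statistics.variance(counts) / mean  # sample variance (n - 1)


# Hypothetical count-like concentrations with a long right tail.
ratio = index_of_dispersion([0, 3, 12, 45, 220, 7, 1, 90])
overdispersed = ratio > 1.0
```

A ratio near 1 is what a Poisson model would imply; a much larger ratio is the usual informal argument for a negative binomial family.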
We then added variables to reflect the trend and seasonal oscillations and fine-tuned the model by using the photoperiod variable (Model 8), or harmonic terms (Model 9). In both models:
For these hybrid models, we employed sequential model building using both Gaussian and negative binomial distributional assumptions in parallel and explored the contribution of interaction terms to the model’s fit. Overall performance of GLMs was evaluated using Akaike’s Information Criterion (AIC) [
Using the parameters of the harmonic terms, e.g., the estimates of
The predictive skill or forecasting ability of the selected versions of Models 7, 8 and 9 was evaluated by splitting the whole dataset into two datasets representing two periods: a training dataset from 2007 to 2013, and a test dataset from 2014 to 2016. Correlations between environmental variables and
The peak timing of
Over the ten-year period of surveillance, there were significant increases in
Individual linear and nonlinear regression analyses conducted between
The form of the relationship between the environmental conditions and
The nonlinear regression between pH and
A multiple regression model was next developed to determine a set of environmental variables that predict
Spearman rank correlation analysis of the individual intervals indicates that photoperiod, water temperature, DO, pH and salinity were significantly correlated with
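The Spearman rank correlation used here can be sketched in a few lines of Python: rank both series (averaging ranks for ties) and take the Pearson correlation of the ranks. This is an illustrative sketch, not the authors' R code; the names `average_ranks` and `spearman_rho` and the example values are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sx * sy)


# Hypothetical paired observations: temperature (deg C) and a log concentration.
rho = spearman_rho([4.0, 9.5, 15.2, 21.0, 23.4], [0.1, 0.8, 0.7, 2.3, 2.9])
```

Because only ranks enter the calculation, any monotonic relationship, linear or not, gives a coefficient close to 1 or −1.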
The hybrid model (Model 9.1) provided the best overall fit for each dataset time interval, with consistently lower RMSE and higher
The environmental (Model 7.4), hybrid (Model 8.1), and harmonic regression (Model 9.1) models developed with the observations from the training dataset accurately predict the overall trend, seasonality, and dispersion of the test dataset (
The intrinsic link that
Photoperiod and harmonic regression models along with correlation analysis showed that
Salinity and water temperature are both seasonally variable parameters that together are the most commonly cited environmental drivers of
Though most studies find little to no correlation between pH and
In other studies [
Approximately half of the variability of
Peak timing was used to assess each environmental variable individually to detect how environmental variables may contribute to the development of ideal conditions for
A major characteristic of the
Model evaluation, estimation, and prediction illustrate how well each model fits and predicts the variability in
The increased incidence of illnesses caused by
This study suggests that transferable models can be developed for forecasting public health risks related to
The following are available online at
Conceptualization, M.A.H., E.A.U., C.A.W., V.S.C., E.N.N. and S.H.J.; methodology, M.A.H., E.A.U., C.A.W., V.S.C., E.N.N. and S.H.J.; software, M.A.H. and E.A.U.; validation, C.A.W., V.S.C., E.N.N. and S.H.J.; formal analysis, M.A.H. and E.A.U.; resources, C.A.W., V.S.C. and S.H.J.; data curation, M.A.H. and E.A.U.; writing (original draft), M.A.H. and E.A.U.; writing (review and editing), C.A.W., V.S.C., E.N.N. and S.H.J.; visualization, M.A.H. and E.A.U.; supervision, S.H.J.; funding acquisition, C.A.W., V.S.C. and S.H.J.
The authors gratefully acknowledge partial funding support from the National Science Foundation EPSCoR IIA-1330641, USDA National Institute of Food and Agriculture Hatch NH00574, NH00609 (accession 233555), and NH00625 (accession 1004199), and the National Oceanic and Atmospheric Administration College Sea Grant program and New Hampshire Sea Grant program grants R/CE137, R/SSS2, R/HCE3, and from NSF IRES Track I: Collaborative Research: U.S.-Indonesian Research Experience for Students on Sustainable Adaptation of Coastal Areas to Environmental Change (award #1826939).
The authors would like to thank Jennifer Mahoney, Meg Striplin, Brian Schuster, Crystal Ellis, Jong Yu, Eliot Jones, Michael Taylor, Ashley Marcinkiewicz, Feng Xu, Tom Gregory, Chris Peters, Jackie Lemaire, Audrey Berenson, Sarah Richards, Emily Schulz, and Elizabeth Deyett for their help with sampling, sample processing, detection analysis, and database management. Also, we thank Iago Hale, Alexandra Kulinkina and Tania M. Alarcon Falconi for support with implementing harmonic regression analysis, peak timing calculations in the R environment, and other statistical approaches.
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.
Study area and sites for oyster and water sampling in the Great Bay Estuary, New Hampshire, USA. OR = Oyster River; NI = Nannie Island.
Patterns in (
The number of observations per year above the 75th percentile for (
Loess smoothing applied to
Model estimations (filled circle) and observed
Spearman correlation analysis of
Estimates of
Trend and seasonality estimates detected by Model 1 and Model 2 for
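| Variable ^{a} | Model | Coefficient ^{b} (Trend) | Coefficient ^{b} (Seasonality) | Standard Error (Trend) | Standard Error (Seasonality) |  | Deviance | AIC | Peak Timing ^{c} |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | Model 1 | 0.0005 *** | 0.57 *** | 0.0001 | 0.11 | 0.19 | 0.21 | 673.4 |  |
|  | Model 2 | 0.0006 *** | −2.87 *** | 0.0001 | 0.34 | 0.50 | 0.51 | 597.4 | 222 ± 5 |
| Water Temperature (°C) | Model 1 | <0.001 | 2.01 *** | <0.001 | 0.15 | 0.53 | 0.54 | 774.1 |  |
| Water Temperature (°C) | Model 2 | 0.002 * | −5.81 *** | <0.001 | 0.24 | 0.93 | 0.93 | 497.9 | 213 ± 2 |
| Dissolved Oxygen (mg/L) | Model 1 | <0.001 | −0.31 *** | <0.001 | 0.05 | 0.22 | 0.23 | 441.5 |  |
| Dissolved Oxygen (mg/L) | Model 2 | <0.001 | 1.45 *** | <0.001 | 0.15 | 0.58 | 0.59 | 352.0 | 220 ± 6 |
| Salinity (ppt) | Model 1 | 0.001 *** | −0.19 | 0.0003 | 0.20 | 0.12 | 0.13 | 849.4 |  |
| Salinity (ppt) | Model 2 | 0.002 *** | −4.06 *** | 0.0003 | 0.76 | 0.26 | 0.28 | 825.5 | 251 ± 18 |
| pH | Model 1 | <0.001 *** | −0.02 * | <0.001 | 0.01 | 0.08 | 0.10 | 19.9 |  |
| pH | Model 2 | <0.001 *** | −0.06 | 0.006 | 0.05 | 0.09 | 0.11 | 20.9 | 298 ± 98 |
| Turbidity (NTU) | Model 1 | −0.02 *** | 3.93 | 0.007 | 4.10 | 0.06 | 0.09 | 1723.6 |  |
| Turbidity (NTU) | Model 2 | −0.02 *** | −6.34 | 0.007 | 16.77 | 0.06 | 0.08 | 1716.5 | 135 ± 111 |
| Chlorophyll | Model 1 | −0.0002 | 0.62 *** | 0.005 | 0.0002 | 0.09 | 0.10 | 775.3 |  |
| Chlorophyll | Model 2 | <0.001 | 0.11 | <0.001 | 0.65 | 0.09 | 0.10 | 778.2 | 180 ± 37 |
| Total Dissolved Nitrogen (mg/L) | Model 1 | <0.001 *** | −0.008 * | <0.001 | 0.005 | 0.15 | 0.16 | −229.0 |  |
| Total Dissolved Nitrogen (mg/L) | Model 2 | <0.001 *** | 0.02 | <0.001 | 0.02 | 0.15 | 0.17 | −228.2 | 206 ± 45 |
| Rainfall (mm) | Model 1 | <0.001 | 0.01 * | <0.001 | <0.001 | 0.01 | 0.02 | −76.7 |  |
| Rainfall (mm) | Model 2 | <0.001 | −0.03 | <0.001 | 0.001 | 0.01 | 0.04 | −74.6 | 209 ± 38 |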
^{a} Variables are shown for Model 1, top row and Model 2, two bottom rows for sine and cosine terms; ^{b} the significance of coefficients is indicated as *** 0.001, ** 0.01, and * 0.1; ^{c} peak timing estimates are represented by the mean and standard error values; for two parameters, dissolved oxygen (DO) and total dissolved nitrogen (TDN), the estimates reflect the seasonal nadir. AIC, Akaike’s Information Criterion.
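The peak timing column and the nadir mentioned in footnote c can be related to the sine and cosine coefficients of a harmonic term. The following Python sketch shows one common way to do this, assuming a seasonal term of the form β_sin·sin(2πt/T) + β_cos·cos(2πt/T) with t in days and T = 365.25. That parameterization, the period, and the function names `peak_timing` and `nadir_day` are assumptions; the authors' exact formulation in R may differ.

```python
import math


def peak_timing(beta_sin, beta_cos, period_days=365.25):
    """Amplitude and day of the seasonal maximum for
    beta_sin * sin(2*pi*t/T) + beta_cos * cos(2*pi*t/T)."""
    amplitude = math.hypot(beta_sin, beta_cos)
    # The term equals amplitude * cos(2*pi*t/T - phase), so it peaks
    # where 2*pi*t/T equals the phase.
    phase = math.atan2(beta_sin, beta_cos)
    peak_day = (phase / (2.0 * math.pi) * period_days) % period_days
    return amplitude, peak_day


def nadir_day(peak_day, period_days=365.25):
    """Day of the seasonal minimum, half a period away from the peak."""
    return (peak_day + period_days / 2.0) % period_days


# Hypothetical coefficients constructed to peak on day 213 with amplitude 10.
angle = 2.0 * math.pi * 213.0 / 365.25
amp, peak = peak_timing(10.0 * math.sin(angle), 10.0 * math.cos(angle))
```

Using `atan2` rather than a plain arctangent of the coefficient ratio keeps the peak in the correct half of the year without case-by-case sign handling.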
Trends of the frequency of days when
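| Year | (75th percentile: 220 MPN/g) | % | Salinity (75th percentile: 27 ppt) | % | TDN (75th percentile: 0.27 mg/L) | % | pH (25th and 75th percentile: 7.56–7.88) | % |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2007 | 2/17 | 11.8% | 196/488 | 40.2% | 6/17 | 35.3% | 215/488 | 44.1% |
| 2008 | 2/18 | 11.1% | 10/465 | 2.2% | 0/18 | 0.0% | 148/465 | 31.8% |
| 2009 | 1/11 | 9.1% | 18/463 | 3.9% | 1/11 | 9.0% | 173/449 | 38.5% |
| 2010 | 3/14 | 21.4% | 58/451 | 12.9% | 0/14 | 0.0% | 157/451 | 34.8% |
| 2011 | 0/9 | 0.0% | 46/377 | 12.2% | 0/9 | 0.0% | 102/430 | 23.7% |
| 2012 | 3/7 | 42.9% | 135/475 | 28.4% | 0/7 | 0.0% | 217/447 | 48.5% |
| 2013 | 1/6 | 16.7% | 65/438 | 14.8% | 3/6 | 50.0% | 231/438 | 52.7% |
| 2014 | 7/22 | 31.8% | 135/432 | 31.3% | 13/22 | 59.1% | 277/432 | 64.1% |
| 2015 | 8/24 | 33.3% | 205/443 | 46.3% | 10/22 | 45.5% | 230/408 | 56.3% |
| 2016 | 8/21 | 38.1% | 266/479 | 55.5% | 4/18 | 22.2% | 289/465 | 62.1% |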
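The yearly counts and percentages in this table have the form "observations above a threshold / total observations". A minimal Python sketch of that tabulation follows. It assumes a strict "greater than" comparison against a fixed threshold such as the 220 MPN/g value in the header; the function name `yearly_exceedance` and the example records are hypothetical and do not reproduce the study data.

```python
from collections import defaultdict


def yearly_exceedance(records, threshold):
    """Map year -> (count above threshold, total count, percent to one decimal).

    records is an iterable of (year, value) pairs.
    """
    totals = defaultdict(int)
    above = defaultdict(int)
    for year, value in records:
        totals[year] += 1
        if value > threshold:  # strict comparison is an assumption
            above[year] += 1
    return {
        year: (above[year], totals[year],
               round(100.0 * above[year] / totals[year], 1))
        for year in sorted(totals)
    }


# Hypothetical year with 17 observations, 2 of them above 220 MPN/g.
records = [(2007, 300.0)] * 2 + [(2007, 10.0)] * 15
summary = yearly_exceedance(records, threshold=220.0)
```

For a two-sided band such as the pH range, the same tabulation would apply with the comparison replaced by a test for values outside the interval.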
The relationship between
Variable  Model 5  Model 6  



Water Temperature (°C)  <0.001  <0.001  0.03  0.03  8.27 
Dissolved Oxygen (mg/L)  <0.001  <0.001  0.04  0.05  7.28 
Salinity (ppt)  <0.001  <0.001  −0.01  0.0  0.0 
pH  0.009  0.002  0.14  0.08  8.48 
Chlorophyll  0.05  0.09  0.01  0.29  0.11 
Rainfall (mm)  0.03  0.02  0.04  0.04  −6.31 
Turbidity (NTU)  0.27  0.48  0.01  0.25  0.43 
Total Dissolved Nitrogen (mg/L)  0.38  0.31  0.02  0.03  3.20 
The sequential building of multiple regression models for
| Model | Model Composition ^{a} | GLMG Coefficients | GLMG St. Error | GLMG Deviance | GLMG AIC | GLMNB Coefficients | GLMNB St. Error | GLMNB Deviance | GLMNB AIC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Model 7 | 1. Temperature | 0.34 *** | 0.03 | 0.54 | 586.9 | 0.34 *** | 0.03 | 0.48 | 1533.4 |
| Model 7 | 2. Temperature | 0.37 *** | 0.03 | 0.57 | 583.1 | 0.41 *** | 0.03 | 0.51 | 1521.6 |
| Model 7 | 3. Temperature | 0.35 *** | 0.03 | 0.59 | 572.5 | 0.34 *** | 0.02 | 0.53 | 1518.3 |
| Model 7 | 4. Temperature | 0.35 *** | 0.02 | 0.61 | 567.8 | 0.34 *** | 0.02 | 0.57 | 1507.1 |
| Model 8 | 1. Trend | 0.0003 ** | 0.0001 | 0.62 | 564.4 | 0.0003 *** | 0.0001 | 0.58 | 1501.9 |
| Model 8 | 2. Trend | 0.0002 ** | 0.001 | 0.62 | 565.9 | 0.0003 *** | 0.0001 | 0.58 | 1503.9 |
| Model 9 | 1. Trend | 0.0003 ** | 0.0001 | 0.62 | 566.3 | 0.0003 *** | 0.0001 | 0.58 | 1504.2 |
| Model 9 | 2. Trend | 0.0003 ** | 0.0001 0.69 | 0.62 | 567.7 | 0.0003 *** | 0.0001 | 0.58 | 1506.2 |
^{a} The significance of coefficients is indicated as *** 0.001, ** 0.01, and * 0.1; ^{b} CpH data were treated as reparametrized CpH variables.
The performance of three selected models: environmental model (Model 7.4), hybrid model (Model 8.1), and harmonic regression model (Model 9.1) for three time periods: full (P1), training (P2), and testing (P3) intervals.
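| Model | Variable ^{a} | Time Interval P1 | Time Interval P2 | Time Interval P3 |
| --- | --- | --- | --- | --- |
| Model 7.4 | Coefficient: Temperature | 0.34 *** | 0.37 *** | 0.31 *** |
| Model 7.4 | Salinity | 0.10 *** | 0.08 ** | 0.24 ** |
| Model 7.4 | CpH | 5.51 * | 5.12 | 266.01 *** |
| Model 7.4 | Salinity*CpH | −0.53 *** | −0.53 *** | −11.01 *** |
| Model 7.4 | r^{2} | 0.54 | 0.58 | 0.57 |
| Model 7.4 | Deviance | 0.57 | 0.58 | 0.54 |
| Model 7.4 | RMSE | 1.91 | 1.79 | 1.96 |
| Model 8.1 | Coefficient: Trend | 0.0003 *** | 0.0003 | 0.0007 |
| Model 8.1 | Photoperiod | −0.31 *** | −0.28 ** | −0.48 ** |
| Model 8.1 | Temperature | 0.43 *** | 0.45 *** | 0.44 *** |
| Model 8.1 | CpH | −4.51 *** | −4.32 *** | 5.10 |
| Model 8.1 | r^{2} | 0.61 | 0.57 | 0.61 |
| Model 8.1 | Deviance | 0.58 | 0.59 | 0.53 |
| Model 8.1 | RMSE | 1.85 | 1.81 | 1.92 |
| Model 9.1 | Coefficient: Trend | 0.0004 *** | 0.0004 * | 0.0008 |
| Model 9.1 | Sin(.) | −0.41 | −1.88 * | 1.72 * |
| Model 9.1 | Cos(.) | 0.63 | −1.54 | 4.66 ** |
| Model 9.1 | Temperature | 0.40 *** | 0.29 ** | 0.74 *** |
| Model 9.1 | CpH | −4.30 *** | −4.20 *** | 1.60 |
| Model 9.1 | r^{2} | 0.61 | 0.55 | 0.63 |
| Model 9.1 | Deviance | 0.58 | 0.60 | 0.54 |
| Model 9.1 | RMSE | 1.81 | 1.82 | 1.83 |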
^{a} The significance of coefficients is indicated as *** 0.001, ** 0.01, and * 0.1.
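The training (2007 to 2013) and testing (2014 to 2016) intervals and the RMSE values reported in this table can be outlined with a short Python sketch. It only illustrates the split-and-score step, not the model fitting, and it is not the authors' R code. The record layout `(year, observed, predicted)`, the function names, and the example numbers are hypothetical.

```python
import math


def split_by_year(records, last_training_year=2013):
    """Split (year, observed, predicted) records into training and test sets."""
    train = [r for r in records if r[0] <= last_training_year]
    test = [r for r in records if r[0] > last_training_year]
    return train, test


def rmse(observed, predicted):
    """Root mean squared error between paired observations and predictions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


# Hypothetical records: (year, observed value, model prediction).
records = [(2012, 1.0, 1.5), (2013, 2.0, 2.0), (2014, 3.0, 2.0), (2016, 4.0, 5.0)]
train, test = split_by_year(records)
test_rmse = rmse([r[1] for r in test], [r[2] for r in test])
```

Splitting by calendar year rather than at random keeps the test period strictly after the training period, which is what a forecasting evaluation requires.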