Assessing the Impact of Biased Target Variables on Machine Learning Models of Severe Hail
-
2025
-
Details
-
Journal Title: Weather and Forecasting
-
Description: This study examines the implications of using traditional local storm reports (LSRs) versus radar-derived Multi-Radar Multi-Sensor (MRMS) system maximum estimated size of hail (MESH) as classification target variables for training and evaluating machine learning (ML) models to predict severe hail events. Using input data from the NSSL Warn-on-Forecast System (WoFS), we explore how the LSR and MESH severe hail climatologies compare in WoFS and the variation in model performance with the choices of target variable for training and testing. Regardless of the training target variable, all ML models performed better when evaluated on MESH. The improved performance of the LSR-trained model on MESH was attributed to MESH better capturing nighttime events, which reduced spurious false alarms compared to evaluating LSRs only. However, the best model for a given target variable was the one trained on that target variable. For example, when evaluating LSRs, the LSR-trained model performed best. This has operational significance as MESH-trained models may underperform LSR-trained models if the target variable is LSRs. We attribute the better MESH scores to MESH being more spatially and temporally consistent with WoFS versus LSRs. Nevertheless, whether either approach better predicts severe hail occurrence is still to be determined. Last, combining MESH and LSRs did not significantly improve model performance, which may be attributed to the fact that both datasets have unique error sources that do not cancel out. Ultimately, the main goal of this study is to shed light on the broader implications of data choice in the training and verification of ML models.
-
Source: Wea. Forecasting, 40, 1015–1028
-
Rights Information: Other
-
Compliance: Submitted
-
Main Document Checksum:urn:sha-512:ea0e7be389efdbf7ab3081430def21b6961540ac0f0d09efd363d506407cb9d8a3057349cfd5cf38bcca0484aa695a6a6dae9c49058a8754ad405cbad4076a74
The NOAA IR serves as an archival repository of NOAA-published products including scientific findings, journal articles,
guidelines, recommendations, or other information authored or co-authored by NOAA or funded partners. As a repository, the
NOAA IR retains documents in their original published format to ensure public access to scientific information.