A New Paradigm for Medium-Range Severe Weather Forecasts: Probabilistic Random Forest–Based Predictions

  • Journal Title:
    Weather and Forecasting
  • Description:
    Historical observations of severe weather and simulated severe weather environments (i.e., features) from the Global Ensemble Forecast System v12 (GEFSv12) Reforecast Dataset (GEFS/R) are used together to train and test random forest (RF) machine learning (ML) models that probabilistically forecast severe weather out to days 4–8. RFs are trained with ∼9 years of the GEFS/R and severe weather reports to establish statistical relationships. Feature engineering is briefly explored to examine alternative methods for gathering features around observed events, including simplifying features using spatial averaging and increasing the GEFS/R ensemble size with time lagging. Validated RF models are tested with ∼1.5 years of real-time forecast output from the operational GEFSv12 ensemble and are evaluated alongside expert human-generated outlooks from the Storm Prediction Center (SPC). Both RF-based forecasts and SPC outlooks are skillful with respect to climatology at days 4 and 5, with diminishing skill thereafter. The RF-based forecasts tend to slightly underforecast severe weather events, but they tend to be well calibrated at lower probability thresholds. Spatially averaging predictors during RF training allows for prior-day thermodynamic and kinematic environments to generate skillful forecasts, while time lagging acts to expand the forecast areas, increasing resolution but decreasing overall skill. The results highlight the utility of ML-generated products to aid SPC forecast operations into the medium range.

    Significance Statement

    Medium-range severe weather forecasts generated from statistical models are explored here alongside operational forecasts from the Storm Prediction Center (SPC). Human forecasters at the SPC rely on traditional numerical weather prediction model output to make medium-range outlooks, and statistical products that mimic operational forecasts can be used as guidance tools for forecasters. The statistical models relate simulated severe weather environments from a global weather model to historical records of severe weather. They perform noticeably better than human-generated outlooks at shorter lead times (e.g., days 4 and 5) and are capable of capturing the general location of severe weather events 8 days in advance. The results highlight the value of these data-driven methods in supporting operational forecasting.
  • Source:
    Weather and Forecasting, 38(2), 251-272
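
The description above says the forecasts are "skillful with respect to climatology" but does not name the metric on this page. One common way to express that idea for probabilistic forecasts is the Brier skill score (BSS), which compares the forecast's Brier score with that of a constant climatological probability. The sketch below is illustrative only: the function names, the sample probabilities, and the climatological frequency of 0.25 are assumptions made for this example, not values or code from the paper.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("probs and outcomes must be non-empty and equal in length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(
    probs: Sequence[float], outcomes: Sequence[int], climatology: float
) -> float:
    """BSS = 1 - BS_forecast / BS_reference, with a constant climatological reference.

    Positive values mean the forecast beats climatology; zero means no improvement.
    """
    bs_forecast = brier_score(probs, outcomes)
    bs_reference = brier_score([climatology] * len(outcomes), outcomes)
    if bs_reference == 0.0:
        raise ValueError("reference Brier score is zero; BSS is undefined")
    return 1.0 - bs_forecast / bs_reference


# Hypothetical example: four grid points, one of which observed severe weather.
observed = [0, 0, 1, 0]
forecast_probs = [0.1, 0.1, 0.7, 0.1]  # made-up probabilistic forecast
climo = 0.25                           # assumed climatological event frequency

bss = brier_skill_score(forecast_probs, observed, climo)
print(f"BSS relative to climatology: {bss:.2f}")
```

A forecast that simply repeats the climatological probability everywhere scores a BSS of zero, which is why positive values are read as skill relative to climatology.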
