Evaluating the Forecast Evolution

Every year in the SFE, a fundamental problem arises: how to evaluate the long-range full-period forecasts. On the Innovation Desk, participants are given the chance to issue Day 3 forecasts in addition to their Day 1 forecast, if time and the severe weather potential warrant. Due to the weekly structure of the SFE (which runs M-F), at most two of these forecasts can be evaluated each week – those issued on Monday for Wednesday, and those issued on Tuesday for Thursday. Luckily, with a relatively active period of severe weather CONUS-wide, the three weeks of the experiment so far have yielded evaluations for four of the six possible days with Day 3 forecasts. Two of these cases show how the long-range forecasts can change as the day of the event draws nearer and more guidance becomes available: 10 May 2017 and 18 May 2017.
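The two-per-week limit can be checked with a short sketch. It assumes that a forecast is evaluated the day after its valid date, which the Monday/Wednesday and Tuesday/Thursday pairs imply but the text does not state outright. The function name and parameters are hypothetical, chosen only for illustration.

```python
# Sketch: which Day 3 forecasts can be evaluated within one M-F experiment week?
# ASSUMPTION (not stated in the post): a forecast is evaluated the day after
# its valid date, so the evaluation day must also fall within Monday-Friday.

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri"]


def evaluable_day3_forecasts(lead_days=2, eval_lag_days=1):
    """Return (issue_day, valid_day) pairs that can be evaluated in-week.

    lead_days=2 means a Day 3 forecast issued Monday is valid Wednesday.
    eval_lag_days=1 is the assumed next-day evaluation.
    """
    pairs = []
    for issue_idx, issue_day in enumerate(WEEKDAYS):
        valid_idx = issue_idx + lead_days
        eval_idx = valid_idx + eval_lag_days
        # Keep the forecast only if its evaluation day is still a weekday.
        if eval_idx < len(WEEKDAYS):
            pairs.append((issue_day, WEEKDAYS[valid_idx]))
    return pairs


pairs = evaluable_day3_forecasts()
print(pairs)  # [('Mon', 'Wed'), ('Tue', 'Thu')]

weeks_completed = 3  # three weeks of the experiment so far, per the post
print(len(pairs) * weeks_completed)  # 6 possible Day 3 evaluation days
```

Under that assumption, a Day 3 forecast issued on Wednesday is valid on Friday but could not be evaluated until Saturday, which is why only two per week count.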

10 May 2017 began with one MCS on the Texas/Oklahoma Panhandle border and another in southeastern Iowa. The forecast focus of the day was centered over Oklahoma, as multiple convection-allowing ensembles suggested that the MCS would rapidly weaken as it moved northeastward over Oklahoma. This movement left an opportunity for clearing and a boundary across the area of interest.

However, these mesoscale details are far from certain in the long-range forecast. MPAS showed a dying MCS that moved north, with redevelopment in central Oklahoma, and the GFS and MPAS showed a narrow axis of overlapping CAPE and shear. The GFS suggested very isolated storms, as the QPF signal was not strong. The SREF showed the highest potential into Kansas, rather than Oklahoma. This was a tough forecast, but eventually the desk issued the following Day 3 forecast:

During the Day 2 forecast, more convection-allowing ensembles were within range. The NAM 4 km indicated high reflectivity across Kansas, and the 1200 UTC NSSL-WRF showed cellular convection at 0000 UTC. Two main concerns were indicated by the CAMs: the evolution of an anticipated morning MCS, and the formation of a storm off of the triple point. If CAMs don’t correctly anticipate the strength and placement of morning convection, the environment portrayed by the CAMs may be unrealistically favorable for subsequent severe convection. The resultant Day 2 forecast extended the 15% along the I-70 corridor, to account for a potential MCS moving across Kansas. Overall, the forecast was broad, as uncertainty was large due to the morning convection.

This forecast captured the general area of the severe reports better than the Day 3 forecast (particularly in Kansas), but detrimentally trimmed the western edge of the forecast in the Texas Panhandle.

By Day 1, the morning convection was underway, and it guided our subsequent forecasts. If models didn’t handle the early convection well, they were taken with a grain of salt, as the extent of the afternoon instability was often not depicted well either. However, these members were taken in context with the other guidance – if CAMs have a consensus on an aspect of a solution, that indicates a signal whether or not a given member handled the morning convection well. Many CAMs still initiated a storm off the triple point, but few CAMs handled the morning convection well. What resulted was a huge envelope of possibility, and the participants found that issuing a forecast for Day 1 was extremely difficult. Eventually, the Innovation Desk issued a broad 15% probability, homing in on the storm that CAMs were initiating off the triple point. This incorrectly trimmed back the western extent of the 5% and 15% areas.

10 May 2017 was a tricky case, dependent on morning convection that was not well-depicted in the CAMs. But what about a higher-end event…say, 18 May 2017?

18 May 2017 was forecast well in advance by the SPC, with an outlook area being indicated starting in the Day 6 Forecast.

Large-scale models such as the GFS indicated strong dynamic forcing and high dewpoints over central Oklahoma. We wanted to edge our probabilities west of the ensembles, due to concerns about the guidance mixing the dryline too far east. The eventual Day 3 forecast highlighted an area of concern in western Oklahoma, but captured neither the eastern extent of the severe convection nor the reports extending over southeast Colorado.

At this point in the experiment, the Innovation Desk decided to use only CAMs to issue the Day 2 forecasts, to determine how moving from broad-scale model environments to fine-scale model environments would affect our forecasts. Thus, the desk considered fields such as dewpoints, soundings to measure the depth of the moisture, and storm-scale attributes such as reflectivity and updraft helicity. When issuing the forecast, participants were somewhat more confident in the dryline position than they had been for the Day 3 forecast. The 30% and 45% areas were also enlarged, as a consistent signal for significant severe weather emerged from the CAMs.

By Day 1, the instability across central Oklahoma was evident by the time the experiment began at 8 AM CDT. Participants could experience it as they walked into the building in the form of a warm, humid morning! All of the environmental parameters from the operational, coarse-scale models were lining up – a 700 mb weakness in the forecast flow had disappeared, mass convergence was focused along the front, and 850 mb flow was strong throughout the forecast period. UH values along the tracks from the CAM ensembles were extremely high over Kansas, and exceeded the colorbar scale on multiple ensembles. Many left-moving storms were evident in the CAMs as well, particularly when looking at cores of strong updrafts. There was also a stronger signal for linear structures, so the eastern bound of the 30% was extended (albeit slightly north of the largest corridor of reports). At this point, the participants also felt confident enough in the parameter space to issue a 60% significant contour, equivalent to an SPC high risk.

So how did the participants think their forecasts did?

18 May had better ratings overall than 10 May, but its Day 1 and Day 2 forecasts received the same scores. 10 May, however, showed a steady improvement in scores. While objective analysis of the forecasts will come later, the subjective evaluations paint a picture of increasing skill as forecast lead time decreases.

HWT EFP

The Experimental Forecast Program
