HIGH-RESOLUTION MODEL VERIFICATION EVALUATION
PI and organization: J. Maksymczuk & M. Mittermaier (UK Met Office)
Co-Is: R. Crocker, C. Pequignet, A. Ryan (UK Met Office)
Abstract: The proposal considers CMEMS forecast products applicable to regional domains, exploiting spatial methods and ensemble techniques to determine the levels of forecast accuracy and skill in high-resolution ocean models. It aims to provide a probabilistic framework addressing a priority cross-cutting R&D activity identified in the Service Evolution Strategy.
The project has two objectives:
A. Understand the accuracy of CMEMS products at specific observing locations using neighbourhood methods and ensemble techniques;
B. Understand the skill of CMEMS products in forecasting events or features of interest in space and time.
Objective A will be useful for model developers to understand the basic biases and skill of ocean forecasts at a given location.
Objective B will be useful to CMEMS customers and downstream users. It uses an object-based method called MODE (Method for Object-based Diagnostic Evaluation) and MODE-TD (time domain) to evaluate the evolution of events in both forecast and observation fields. From evaluations of the overlapping domains an improved methodology for assessing higher-resolution ocean models will be delivered, which will better inform users on the quality of CMEMS regional forecast products. The methodology will allow for intercomparison between deterministic and probabilistic forecasts in an equitable and consistent way.
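The core of an object-based method such as MODE is to turn a continuous field into discrete "objects" that can be matched between forecast and observations. A minimal sketch of that first step is shown below, assuming a simple mean-filter smoothing followed by thresholding and connected-component labelling; the `radius` and `threshold` parameters are illustrative placeholders, not values taken from the project's MODE configuration.

```python
import numpy as np
from scipy import ndimage


def identify_objects(field, radius, threshold):
    """MODE-style object identification (simplified sketch).

    Smooth the field with a square mean filter of half-width `radius`,
    threshold the smoothed field, and label connected regions.  Each
    labelled region is one candidate "object" for matching between
    forecast and observed fields.
    """
    smoothed = ndimage.uniform_filter(field, size=2 * radius + 1)
    mask = smoothed >= threshold
    labels, n_objects = ndimage.label(mask)
    return labels, n_objects


# Illustrative field: a single warm SST anomaly is found as one object.
sst_anom = np.zeros((20, 20))
sst_anom[3:9, 3:9] = 5.0
labels, n_objects = identify_objects(sst_anom, radius=1, threshold=1.0)
```

MODE proper then compares attributes of matched object pairs (location, size, intensity) rather than grid-point errors, which is what makes it informative for "events or features of interest".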
Project highlight at mid-term:
The High-resolution model Verification Evaluation (HiVE) project considers CMEMS forecast products applicable to regional domains and aims, for the first time, to exploit spatial verification methods to determine the levels of forecast accuracy and skill in high-resolution ocean models.
One part of the project involves evaluating the overlapping domains of the AMM7 (1/10°), IBI (1/36°) and AMM15 (1.5 km) models at observing locations using a single-observation-neighbourhood-forecast (SO-NF) spatial method known as the High-Resolution Assessment (HiRA) framework (Mittermaier 2014). The methodology utilises ensemble and probabilistic forecast verification metrics such as the Continuous Ranked Probability Score (CRPS, Hersbach 2000). It allows for an intercomparison of models with different grid resolutions as well as between deterministic and probabilistic forecasts in an equitable and consistent way. It was developed specifically for the evaluation of high-resolution sub-10 km models.
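The SO-NF idea can be sketched in a few lines: take the forecast grid values in a neighbourhood centred on the grid point nearest the observing site and treat them as a quasi-ensemble. The helper below is an illustrative assumption about the mechanics (in particular, clipping at the domain edge is a choice made here, not something specified by the HiRA papers).

```python
import numpy as np


def hira_neighbourhood(field, i_obs, j_obs, half_width):
    """Flat quasi-ensemble of forecast values around an observing site.

    Returns the values in a (2k+1) x (2k+1) box centred on grid point
    (i_obs, j_obs), where k = half_width, so k = 1 gives the 3x3 (9
    member) neighbourhood.  The box is clipped at the domain edge
    (an assumption of this sketch).
    """
    k = half_width
    i0, i1 = max(i_obs - k, 0), min(i_obs + k + 1, field.shape[0])
    j0, j1 = max(j_obs - k, 0), min(j_obs + k + 1, field.shape[1])
    return field[i0:i1, j0:j1].ravel()


# Illustrative use: 9 SST values around the grid point nearest a buoy.
sst = np.arange(25.0).reshape(5, 5)
members = hira_neighbourhood(sst, 2, 2, half_width=1)
```

Each quasi-ensemble is then scored against the point observation with a probabilistic metric such as the CRPS, which is what lets deterministic models of different resolutions be compared on equal terms.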
The CRPS is an error-based score where a perfect forecast has a score of 0, i.e. smaller scores are better. It measures the difference between two cumulative distributions, a forecast distribution formed by ranking the quasi-ensemble members in the neighbourhood, and a step function describing the observed state. HiRA works on the assumption that all grid points in a neighbourhood, centred on an observing location, are equi-probable outcomes at the observing location. For this assumption to remain true, neighbourhood sizes cannot become too large.
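For a finite quasi-ensemble and a single observation, the integral of the squared difference between the two cumulative distributions has a standard closed form: the mean absolute error of the members minus half the mean absolute difference between member pairs. A minimal sketch:

```python
import numpy as np


def crps_empirical(members, obs):
    """Empirical CRPS of an ensemble against one scalar observation.

    Closed form of the integrated squared difference between the
    ensemble's empirical CDF and the observation's step function:
        mean_i |x_i - y|  -  0.5 * mean_{i,j} |x_i - x_j|
    A perfect forecast scores 0; smaller is better.
    """
    members = np.asarray(members, dtype=float)
    mae_term = np.mean(np.abs(members - obs))
    spread_term = 0.5 * np.mean(np.abs(members[:, None] - members[None, :]))
    return mae_term - spread_term


# With a single member the CRPS reduces to the absolute error,
# which is why HiRA can compare deterministic and ensemble forecasts
# on the same footing.
```

Note that the second term rewards ensemble spread, so the score balances accuracy against sharpness rather than measuring error alone.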
To illustrate, the figure shows a comparison of the CRPS as a function of lead time for the AMM7 and IBI models for the daily mean Sea Surface Temperature (SST) and for a sequence of four neighbourhood sizes. It shows that AMM7 has better (lower) CRPS scores than IBI, and that for both models the scores improve progressively, with considerable reductions in error at each successive increase in neighbourhood size. The AMM7 behaviour is curious in two respects: its CRPS increases more steeply with lead time, and its scores are nearly invariant to neighbourhood size for the 3×3 (9 grid points) and 5×5 (25 grid points) neighbourhoods. This invariance has not been seen before and will be the subject of further investigation. AMM7 does show a substantial reduction in error when utilising a 3×3 neighbourhood compared to using the nearest grid point, suggesting that using a spatial verification method is justified. The fact that IBI is being outperformed by the coarser-resolution AMM7 also needs further understanding. The CRPS is sensitive to bias, for example, so more strongly biased IBI forecasts could inflate its CRPS.
Continuous Ranked Probability Scores (CRPS) for sea surface temperature at different forecast lead times for AMM7 (blue) and IBI (red) models for the period September 2018 to November 2018. Each model shows results for several different HiRA neighbourhood sizes (in number of grid squares).
Hersbach, H., 2000: Decomposition of the Continuous Ranked Probability Score for Ensemble Prediction Systems. Wea. Forecasting, 15, 559–570.
Mittermaier, M.P., 2014: A strategy for verifying near-convection-resolving forecasts at observing sites. Wea. Forecasting, 29(2), 185–204.