Predicting the anomaly of the water level with a neural network (black) and the contributions of the inputs (coloured bars below).

Extreme Water Level Forecasting of Highway 37 CA

Water level records are dominated by normal tidal behavior, but underlying environmental changes frequently produce extreme high or low water levels. Predicting when and why these significant changes occur is difficult. It requires an understanding of the physical water system as well as modeling techniques that can capture the extremes without sacrificing too much performance during the normal behavior of the waterway. We specifically look at predicting the water level of the Petaluma River in Northern California, where linear models currently perform reasonably well on the normal behavior but fall short on the more extreme values associated with floods. We use a hierarchy of time series forecasting models, including MLPs, LSTMs, and Transformers, to assess how well extreme water levels can be predicted. We find that MLPs and LSTMs generally predict these extreme water levels very well, while Transformers require more effort to reach the same performance. In addition to assessing these models individually, we also evaluate them in ensembles whose members are all of a single model type. This gives a more consistent assessment of how the models perform and enables us to validate the models' fits to the extreme water levels in the data. We perform this validation using Shapley values (SHAP), looking specifically at feature importance and whether it aligns with our expert knowledge. This validation technique generalizes to any time series forecasting task and is an effective way of explaining the latent space of nonlinear models such as the ones we test here.
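To make the validation step concrete, the following is a minimal sketch of how Shapley values attribute a single ensemble prediction to its inputs. It is not the code used in this work: the feature names, the toy ensemble of random nonlinear members, and the all-zero baseline are assumptions invented for illustration, and the attribution is computed by brute-force enumeration of feature subsets rather than with the SHAP library.

```python
import itertools
import math

import numpy as np

# Hypothetical input names; the real model inputs are not listed here.
FEATURES = ["tide", "river_discharge", "wind"]

# Toy stand-in for an ensemble: five small nonlinear members with random weights.
rng = np.random.default_rng(0)
weights = rng.normal(size=(5, len(FEATURES)))


def ensemble_predict(z):
    """Mean prediction over all ensemble members for one input vector."""
    pre = weights @ z
    return float(np.mean(np.tanh(pre) + 0.1 * pre**2))


def exact_shapley(predict, x, baseline):
    """Exact Shapley values for one sample.

    Features outside the coalition are set to their baseline value.
    """
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = (
                math.factorial(size)
                * math.factorial(n - size - 1)
                / math.factorial(n)
            )
            for subset in itertools.combinations(others, size):
                idx = np.array(subset, dtype=int)
                z = baseline.copy()
                z[idx] = x[idx]
                without_i = predict(z)
                z[i] = x[i]
                with_i = predict(z)
                phi[i] += weight * (with_i - without_i)
    return phi


x = np.array([1.5, 2.0, -0.5])  # one made-up sample
baseline = np.zeros(len(FEATURES))  # assumed reference input
phi = exact_shapley(ensemble_predict, x, baseline)

for name, value in zip(FEATURES, phi):
    print(f"{name}: {value:+.4f}")
```

The contributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is what allows them to be drawn as stacked bars beneath a predicted anomaly. Exact enumeration scales exponentially with the number of inputs, so realistic models need an approximation.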

Avery Wood, Maike Sonnewald, and John Largier

Code