Data assimilation

What is Data Assimilation?

Data assimilation is the process of aligning model forecasts with recent streamflow observations to improve forecast accuracy. Its goal is to ground forecasts in real-world conditions: assimilation adjusts the model’s starting point using observations from the past 72 hours, so that each forecast begins from conditions that closely match what was recently measured.

Benefits of Data Assimilation

Assimilation improves the accuracy and reliability of forecasts, with the greatest impacts in the near term. The key benefits include:

  • More responsive forecasts: Frequent updates allow hydrologic models like HydroForecast to respond quickly to new conditions such as storms or sudden changes in flow.
  • Closer alignment with ground conditions: Forecasts are corrected using observed gauge data, reducing drift from actual conditions. This can introduce visible adjustments or jumps, but these changes are meant to reflect the true state of the basin.
  • Rapid response to spikes: Sudden increases in streamflow are reflected in the forecast, making hydrographs more actionable for operations.

How HydroForecast's Assimilation Method Works

HydroForecast uses a data assimilation method that leverages machine learning to improve forecasts. The process is designed to:

  • Balance recent model forecasts with observed data.
  • Optimize for high-flow events so forecasts track peaks more closely.
  • Limit abrupt changes by using parameters that penalize large adjustments to forecasts.
  • Use only recent observations (recorded in the 72 hours before initialization time) by default.
  • Exclude erroneous or noisy observation data automatically.
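
The trade-off between matching observations and limiting abrupt changes can be illustrated with a small penalized least-squares sketch. This is not HydroForecast's machine learning method, which is not described here; it is a generic illustration under stated assumptions. The function name `penalized_offset`, the single additive offset, the median-based high-flow weighting, and the `penalty` and `high_flow_weight` parameters are all hypothetical.

```python
from typing import Sequence


def penalized_offset(
    model: Sequence[float],
    observed: Sequence[float],
    penalty: float = 1.0,
    high_flow_weight: float = 2.0,
) -> float:
    """Offset d minimizing sum_i w_i * (obs_i - model_i - d)**2 + penalty * d**2."""
    if len(model) != len(observed) or len(model) == 0:
        raise ValueError("model and observed must be non-empty and of equal length")
    if penalty < 0:
        raise ValueError("penalty must be non-negative")
    # Hypothetical weighting: observations above the (upper) median flow count more,
    # so the correction tracks peaks more closely.
    threshold = sorted(observed)[len(observed) // 2]
    weights = [high_flow_weight if q > threshold else 1.0 for q in observed]
    residuals = [q - m for q, m in zip(observed, model)]
    # Closed-form minimizer of the penalized, weighted least-squares objective.
    numerator = sum(w * r for w, r in zip(weights, residuals))
    return numerator / (sum(weights) + penalty)
```

With `penalty=0` the offset equals the weighted mean of the model-versus-observation residuals; raising `penalty` shrinks the offset toward zero, which is one simple way to discourage large adjustments.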

HydroForecast assimilates streamflow observations from in-situ gauges to improve the model’s initial condition at the start of each forecast issuance. Data assimilation occurs each time a new forecast is initialized, and the initialization frequency varies by model horizon:

  • HydroForecast Short-term – every two hours
  • HydroForecast Seasonal and Annual – once per day, for models released in September 2025 or later. For these longer-term horizons, the models assimilate gauge data up to the daily initialization time; observations received after initialization are not assimilated, which avoids incorporating unrepresentative partial-day data.
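
The windowing rules described above (a default 72-hour lookback and no observations after initialization) can be sketched as a simple timestamp filter. The function name `select_assimilation_window`, the `(timestamp, flow)` tuple layout, and the inclusive boundaries are assumptions made for illustration, not part of HydroForecast's interface.

```python
from datetime import datetime, timedelta


def select_assimilation_window(observations, init_time, window_hours=72):
    """Keep (timestamp, flow) pairs recorded in the window ending at init_time."""
    start = init_time - timedelta(hours=window_hours)
    # Assumed inclusive bounds; anything recorded after init_time is dropped.
    return [(t, q) for t, q in observations if start <= t <= init_time]
```

For example, with an initialization time of 12:00 on a given day, a reading from 13:00 that day would be left out and picked up at the next initialization instead.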

Figure 1 shows an example of how HydroForecast results are updated before and after data assimilation.

Figure 1 - HydroForecast results before (blue) and after (orange) data assimilation.

When is Assimilation Not Beneficial?

Assimilation relies on high-quality observations. If data are noisy or unreliable (e.g., back-calculated reservoir inflows), assimilation can degrade performance. To avoid this, HydroForecast:

  • Smooths out data that are noisy, unreliable, or unrealistic.
  • Allows assimilation to be turned off for specific forecast points if data quality is poor.
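
As a rough illustration of smoothing noisy observations, the sketch below applies a centered rolling median, a common generic technique for suppressing isolated spikes. The filter HydroForecast actually uses is not specified here; the function name `rolling_median` and the `window` parameter are hypothetical.

```python
from statistics import median


def rolling_median(values, window=3):
    """Centered rolling median; the window is truncated at the series edges."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(median(values[lo:hi]))
    return smoothed
```

A single-point spike such as the 50 in `[10, 10, 50, 10, 10]` is removed by a three-point window, while a sustained rise lasting several time steps would be preserved.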

Figure 2 shows a case where noisy observations (grey line) are filtered to avoid a negative impact on the forecast (colored lines).

Figure 2 - Noisy observations are filtered to avoid negative impacts from assimilation.

Assimilation Differences Between Model Types

The assimilation method and process are generally the same for HydroForecast Short-term, Seasonal, and Annual. Because the model inputs differ, however, the initialization point that results from assimilation is expected to differ slightly across these models. If users subscribe to both Short-term and Seasonal/Annual models, we recommend using the Short-term forecast for the first 10 days. Please reach out to team@hydroforecast.com if you have further questions.
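
For users who combine both products, the 10-day recommendation above could be applied with a simple splice like the one sketched here. The function name `splice_forecasts`, the timestamp-to-flow dictionaries, and the `handoff_days` parameter are assumptions for illustration; they do not correspond to a HydroForecast API.

```python
from datetime import datetime, timedelta


def splice_forecasts(short_term, long_term, init_time, handoff_days=10):
    """Use short_term values before the handoff and long_term values from it onward."""
    cutoff = init_time + timedelta(days=handoff_days)
    early = {t: q for t, q in short_term.items() if t < cutoff}
    late = {t: q for t, q in long_term.items() if t >= cutoff}
    # The two key sets are disjoint by construction; return them in time order.
    return dict(sorted({**early, **late}.items()))
```

Because the two models can start from slightly different initialization points, a small step may appear at the handoff time in the spliced series.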