The Mobius2 data formats

To run a Mobius2 model you need to provide it with site specific data for parameters, forcings (and sometimes more, like topological information).

A model combined with site-specific data is called a model application.

Each Mobius2 model application has a single main data file with a .dat suffix.

The main data file uses the common declaration format, but you don’t need to fully learn this format to use existing Mobius2 models. Instead we provide a guide to set up a new project from an existing example file.

Input time series (forcing and comparison data) are not put directly in the main data file, instead the main data file says what other files series data is loaded from.

Time series can be provided either on a .csv or .xlsx format.

Model inputs and comparison series

Model inputs are forcings that are used by the model for computations. These are declared as variables in the module files, and show up under “Input data series” in MobiView2.

You can put as many comparison time series you want into your data files using the same format as model series (name them anything that is not the name of a model series). These will be loaded and displayed as “Additional series”, but not used in the model computation. This is useful for instance when you want to plot your model results against observations, compute goodness-of-fit statistics, autocalibrate etc.

The sampling step

The sampling step is both the main time step of the model run and the step size for input data (these could maybe be decoupled in the future). The sampling step is by default 1 day, but can be changed with a time_step declaration in the main data file, e.g.

time_step([4, hr])  # 4 hour time step.

Each series value is held constant over a single sampling step even if the model uses ODE solvers that evaluate at fractional steps, so you should set the sampling step to the frequency you want to have in the input data.

If more than one series value is provided for the same series and sampling step, only the last value will be used. This could happen if the series data has a higher frequency than the sampling step. We may implement automatic aggregation in these instances later, but for now you have to do that yourself before you prepare the input file.

If some of the series have a lower frequency than the desired sampling step, you could use interpolation (see below).

Datetime format

Date(-time) values in data files (except some times in .xlsx files) are on one of the following formats:

YYYY-mm-dd
YYYY-mm-dd HH:MM:SS

Alone, YYYY-mm-dd is equivalent to YYYY-mm-dd 00:00:00.

The series interval

The series interval is the largest interval that fits all the series data that is provided in the data set. When loaded into a model application, all series will for technical reasons be expanded to fill this entire window.

The model can only be run if the “Start date” and “End date” parameters of the model run fit within this interval.

A given data set can cut this interval down by specifying a series_interval in the main data file, e.g.

series_interval(2000-01-01, 2020-01-01)

Series values outside this specified interval are then discarded. This can be useful if the loaded source data is is very large and you only want to run the model for shorter intervals.

Missing values

If a model input series is not provided with data, it is filled as 0. If it is provided with some values and you don’t interpolate (see below), missing values are also filled with 0. You should avoid this unless 0 is actually a legitimate value for this input.

Comparison series are filled with nan (non-numbers). Missing values in observation series are ignored when e.g. computing goodness-of-fit stats.

Series indexing

A given input series can either have a single instance, or can be distributed over a tuple of one or more index sets. This can be used e.g. to provide separate meteorological data for different sub-areas of the model domain.

What index sets you are allowed to distribute a model input series over depends on the components forming the location of the series and what index sets they are distributed over in the model. Comparison series can be distributed over any combination of index sets that exist in the model.

The distribution must be the same for all instances of the same input name, even if they come from separate files. This means that each series of the same name must have indexes from the same tuple of index sets (though the indexes themselves could of course be different).

How to distribute a series over index sets is described in the pages for the individual series data formats.

Series flags

Both series formats allow you to specify flags. A flag is either a unit declaration or one of the following:

  • Interpolation flags Interpolation flags tell Mobius2 to fill in missing values in the data. Values will be filled for every sampling step. The following interpolation flags are available
    • step_interpolate. Fills missing values with the last valid value before it.
    • linear_interpolate. Uses linear interpolation to fill missing values between valid values.
    • spline_interpolate. Uses cubic spline interpolation to fill missing values between valid values. This is the smoothest option.
    • inside. If this flag is provided, any interpolation will only fill in values between missing values. If the flag is not present, interpolation will also fill expand the first and last valid values to fill the entire series interval.
  • repeat_yearly. In this case the first year of the time series is repeated yearly to fill the entire series interval.

To put multiple flags they can be listed after one another separated by spaces.


Table of contents


© Norsk Institutt for Vannforskning (The Norwegian Institute for Water Research), NIVA 2024.