# Wiki

# D2D Software / Setting up models

# Setting up models

In the folder `/Examples`

we provide some showcase applications that can be used as a starting point for new modeling projects.

Section 1 describes how a model can be set up and Section 2 how experimental data can be linked to the model.

Please note that the software make heavy use of identifiers for all model and data quantities, for instance for dynamic variables, inputs, parameters, observables, ... It is very important to keep these identifiers consistent throughout one project.

**Table of Contents**

- Setting up models
- 1. Model Definition File
- 1.1 DESCRIPTION section
- 1.2 PREDICTOR section
- 1.3 Defining COMPARTMENTS
- 1.4 Defining the dynamic STATES of the model
- 1.5 Defining the INPUTS
- 1.6 Defining the dynamics
- 1.7 Defining DERIVED variables
- 1.8 Defining the OBSERVABLES
- 1.9 Defining the magnitude of the measurement noise in the ERRORS section
- 1.10 SUBSTITUTIONS section
- 1.11 CONDITIONS section
- 1.12 PARAMETERS section

- 2. Data Definition File

# 1. Model Definition File

A model definition file is a plain text file with the ending `.def`

located in the subfolder `/Models`

of the current MATLAB working directory. The illustration used in the following stem from the example applications in the `/examples`

subfolder of the main code folder: `Becker_2010`

and `Bachmann_2011`

. The model definition files are structured in various section:

DESCRIPTION ... PREDICTOR ... COMPARTMENTS ... STATES ... INPUTS ... REACTIONS (or ODES) ... DERIVED ... OBSERVABLES ... ERRORS ... SUBSTITUTIONS (optional) ... CONDITIONS ...

Each section will be explained in the following. Lines in the model definition file can be commented out using `//`

.

## 1.1 DESCRIPTION section

Can be used to store meta information about the model as quoted lines of text, e.g.

DESCRIPTION "Model of Epo receptor internalization" "Assembled by RG Timmer & RG KlingmΓΌller" "Date: 2010" "Version: 1.0"

## 1.2 PREDICTOR section

Is used to define the independent variable / the predictor. For dynamics model, this is usually time, e.g.

PREDICTOR t T "min" "time" 0 100

- The 1st argument specifies the unique identifier (here
`t`

) that can be used in the mathematical expressions later. - The 2nd-4th argument specify the unit type (here
`T`

= time), the unit itself (here`"min"`

), and a label used for plots (here`"time"`

). - The 5th and 6th argument specify the range of the independent variable (here 0-100 minutes).

## 1.3 Defining COMPARTMENTS

Can be used to define compartments in a model, for instance cytosol and nuclear compartment in the cell

COMPARTMENTS cyt V "pl" "vol." 0.4 nuc V "pl" "vol." 0.275 ...

The concentration dynamics of the model, in particular the transport reactions, are solved with respect to the size of the compartments.

- The 1st argument specifies the unique identifiers (here
`cyt`

and`nuc`

) that can be used in the mathematical expressions later. - The 2nd-4th argument specify the unit types (here
`V`

= volume), the units itself (here`"pl"`

), and labels used for plots (here`"vol."`

). These labels do not affect the calculations. - The 5th argument specifies the numerical size of the compartments. Omitting the 5th argument will expose a parameter called
`vol_#`

where # is the compartment name. This parameter can then be estimated or specified in a condition. Note that by default, the parameter is set to constant. If you wish to estimate it, change it to fitted using`arSetPars`

.

If no compartments are required, this section should be left empty. Note that in D2D, rates are always specified relative to the origin of the species. If an in- and out rate of a compartment should be equal and is estimated using a single parameter (e.g. to model passive diffusion), it is important to correct the kinetic rates by the appropriate volumes.

## 1.4 Defining the dynamic STATES of the model

This section describes the dynamic variables / states that evolve over time as will be specified in the REACTIONS / ODES section later.

STATES STAT5 C "nM" "conc." cyt 1 "STAT5" 1 pSTAT5 C "nM" "conc." cyt 1 "phospho STAT5" 1 npSTAT5 C "nM" "conc." nuc 1 "nuclear phospho STAT5" 1 ...

- The 1st argument specifies the unique identifiers (here
`STAT5`

,`pSTAT5`

and`npSTAT5`

) that can be used in the mathematical expressions (reactions / ODE system) later. Initial conditions for each dynamic variable will be generated as free parameters using the prefix`init_`

. - The 2nd-4th argument specify the unit types (here
`C`

= concentration), the unit itself (here`"nM"`

), and a plain text used for plots (here`"conc."`

). These labels do not affect the calculations. - The 5th argument specify the compartment where the state lives in (here either
`cyt`

or`nuc`

). This can be left empty if no compartments were specified above. If the latter arguments are to be used but no compartment is present it is possible to create an`default`

compartment in the section before in which all state live in. - The 6th argument is a flag if the state should be showed in plots (0=no, 1=yes), default is yes.
- The 7th argument a clear text label for plotting.
- The 8th argument specifies if the state is strictly positive (0=no, 1=yes), default is yes.

## 1.5 Defining the INPUTS

In this section input function, usually depending on the independent variable, can be specified. The input function can be used in the mathematical expression later, such as the reaction rate equations or the observables. Please note that the input do not belong to a specific compartment. The user has to ensure plausibility of the usage of the input function himself.

INPUTS Epo C "units/cell" "conc." "k1*exp(-k2*t)" SAv C "units/cell" "conc." "k3" ...

- The 1st argument specifies the unique identifiers (here
`Epo`

and`SAv`

) that can be used in the mathematical expressions later. - The 2nd-4th argument specify the unit types (here
`C`

= concentration), the unit itself (here`"units/cell"`

), and a plain text used for plots (here`"conc."`

). - The 5th argument specify the mathematical expression of the input function (here
`"k1*exp(-k2*t)"`

is an exponential decay and`"k3"`

is a constant with unknown value).

Sometimes the input function should account for a change in the value of a dynamic variable, see the bolus injection example.

### 1.5.1 Step functions, cubic splines and user defined functions

Step functions, splines and user defined functions can be used as mathematical expression of the input function.

**Step functions** can be defined by:

`"step1(t, level1, switch_time, level2)"`

giving a simple step function at the value`switch_time`

of the independent variable (usually time).`"step2(t, level1, switch_time1, level2, switch_time2, level3)"`

giving a double step function at the values`switch_time1/2`

of the independent variable`t`

.

**Smooth step functions** can be defined by:

`"smoothstep1(t, level1, switch_time, level2, smoothness)"`

giving a smooth step function at the value`switch_time`

of the independent variable (usually time).`"smoothstep2(t, level1, switch_time1, level2, switch_time2, level3, smoothness)"`

giving a smooth double step function at the values`switch_time1/2`

of the independent variable`t`

.

The smoothness parameter controls the smoothness of the step. For large values, the step becomes increasingly sigmoidal.

**Bolus injections** can be defined by:

`"bolus(t, amount, time_point, duration)"`

giving a bolus injection at value`time point`

of the independent variable`t`

. This corresponds to a Gaussian curve with area`amount`

and standard deviation`duration`

.

**Cubic splines** with 3, 4, 5 or 10 knots can be defined by:

`"spline3(t, t_knot1, p_knot1, t_knot2, p_knot2, t_knot3, p_knot3, q_initial_slope_constraint, initial_slope)"`

`"spline4(t, t_knot1, p_knot1, t_knot2, p_knot2, t_knot3, p_knot3, t_knot4, p_knot4, q_initial_slope_constraint, initial_slope)"`

`"spline5(t, t_knot1, p_knot1, t_knot2, p_knot2, t_knot3, p_knot3, t_knot4, p_knot4, t_knot5, p_knot5, q_initial_slope_constraint, initial_slope)"`

`"spline10(t, t_knot1, p_knot1, t_knot2, p_knot2, t_knot3, p_knot3, t_knot4, p_knot4, t_knot5, p_knot5, t_knot6, p_knot6, t_knot7, p_knot7, t_knot8, p_knot8, t_knot9, p_knot9, t_knot10, p_knot10, q_initial_slope_constraint, initial_slope)"`

Here, `t`

denotes the independent variable, `t_knot1-5`

indicate the locations of the spline knots (these should be fixed to a numeric value), `p_knot1-5`

denote the spline parameters (these can either be fixed or left as free parameters to be estimated) and `q_initial_slope_constraint`

is a flag (`0`

= no, `1`

= yes) that indicates if the slope of the spline at the first knot is to be constained to the value given in `initial_slope`

(both values should be fixed to a numeric value). The spline parameter p_knot is defined as the value u(t_knot) of the spline at t_knot. If the spline should be constrained to positive values use the functions `spline_pos3`

, `spline_pos4`

and `spline_pos5`

the same way as described above. An example using splines is given in `Examples/Swameye_PNAS2003`

.

**Monotonic splines** with 3, 4, 5 or 10 knots can be defined by:

`"monospline3(t, t_knot1, p_knot1, t_knot2, p_knot2, t_knot3, p_knot3)"`

`"monospline4(t, t_knot1, t_knot1, p_knot1, t_knot2, p_knot2, t_knot3, p_knot3, t_knot4, p_knot4)"`

`"monospline5(t, t_knot1, p_knot1, t_knot2, p_knot2, t_knot3, p_knot3, t_knot4, p_knot4, t_knot5, p_knot5)"`

`"monospline10(t, t_knot1, p_knot1, t_knot2, p_knot2, t_knot3, p_knot3, t_knot4, p_knot4, t_knot5, p_knot5, t_knot6, p_knot6, t_knot7, p_knot7, t_knot8, p_knot8, t_knot9, p_knot9, t_knot10, p_knot10)"`

Monotonic are splines which are monotonic between knots. Their syntax is similar to that of the regular splines. For an example comparing the spline types see `Examples/Splines`

.

**Custom input function** can be defined in the c-files `arInputFunctionsC.c`

and `arInputFunctionsC.h`

.

### 1.5.2 A general input function

Often, an input consists of transient and sustained parts. Such behaviour can be implemented by the following expression:

"gif_amp_trans*(1-exp(-t/gif_timescale_sust))*exp(-t/(gif_timescale_trans)) + gif_amp_sust*(1-exp(-t/gif_timescale_sust))"

The function has three parameters, two amplitudes (`gif_amp_trans`

and `gif_amp_sust`

) and two time scales (`gif_timescale_trans`

and `gif_timescale_sust`

), that encode the transient and sustained parts.

## 1.6 Defining the dynamics

This section describes the ODE system that determines the time evolution of the states. One can either specify `REACTIONS`

or `ODES`

directly.

### 1.6.1 Using REACTIONS

The ODE system can be specified as biochemical reaction network.

REACTIONS Epo + EpoR -> Epo_EpoR CUSTOM "k1 * Epo * EpoR" "label1" STAT5 -> pSTAT5 CUSTOM "k2 * STAT5" "label2" pSTAT5 -> npSTAT5 CUSTOM "k3 * pSTAT5" "label3" ...

- The 1st argument describes the reaction in a simple notation (here
`Epo + EpoR -> Epo_EpoR`

is a complex formation,`STAT5 -> pSTAT5`

a protein modification, and`pSTAT5 -> npSTAT5`

a transport reaction between compartments). - The 2nd argument is a keyword that specifies the reaction type,
`CUSTOM`

or`MASSACTION`

.`CUSTOM`

is the recommended usage. - In case of a
`CUSTOM`

reaction, the 3rd argument specifies the mathematical expression of the reaction rate equation (here`"k1 * Epo * EpoR"`

,`"k2 * STAT5"`

and`"k3 * pSTAT5"`

). In case of a`MASSACTION`

reaction, the mathematical expression in the 3rd argument will be just one parameter`name`

that will be the rate constant corresponding to this reaction. In case of a`MASSACTION`

reaction, one can use`<->`

in the 1st argument to specify a reversible reaction. In this case, there will be two parameters with suffix`name_1`

for the forward and`name_2`

for the back reaction. - The 4th argument (optional) specifies a label for the reaction used for plotting.

### 1.6.2 Using ODES

In contrast to the REACTIONS notation, the ordinary differential equations can be specified directly:

ODES "mathematical expression 1" "mathematical expression 2" "mathematical expression 3" ...

where these mathematical expressions define the right-hand sides of the ODEs. Take care that there are as many ODEs as dynamic variables in the model.

## 1.7 Defining DERIVED variables

In this section variable that are derived from the dynamic and input variables can be defined. Derived variables can be used in the OBSERVABLES and REACTIONS sections.

DERIVED Epo_ext C "pM" "conc." "Epo + dEpo_e" Epo_int C "pM" "conc." "Epo_EpoR_i + dEpo_i" ...

- The 1st argument specifies the unique identifiers (here
`Epo_ext`

and`Epo_int`

) that can be used in the mathematical expressions later. - The 2nd-4th argument specify the unit types (here
`C`

= concentration), the unit itself (here`"pM"`

), and a plain text used for plots (here`"conc."`

). - The 5th argument specify the mathematical expression of the input function (here
`"Epo + dEpo_e"`

and`"Epo_EpoR_i + dEpo_i"`

).

## 1.8 Defining the OBSERVABLES

At this point of the model definition files one can specify the default model observables and their corresponding error model (Section 1.9). While the section in the model definition file are optional the corresponding sections in the data definition file (see Section 2.3) are mandatory and override the default specifications given here.

OBSERVABLES tSTAT5_au C "au" "conc." 1 1 "scale_tSTAT5 * (STAT5 + pSTAT5)" pSTAT5_au C "au" "conc." 1 1 "offset_pSTAT5 + scale_pSTAT5 * pSTAT5" ...

- The 1st argument specifies the unique identifiers (here
`tSTAT5_au`

and`pSTAT5_au`

). Please note that this identifier should be the same as the corresponding column header in the data sheet. Also, Observable identifiers are not allowed to be the same as dynamic variable identifiers. - The 2nd-4th argument specify the unit types (here
`C`

= concentration), the unit itself (here`"au"`

), and a plain text used for plots (here`"conc."`

). - The 5th argument corresponds to a flag (0=no, 1=yes) that indicates if the maximal value in the data sheet of the respective observable should be rescaled to one.
- The 6th argument corresponds to a flag (0=no, 1=yes) that indicates if both the raw data form the spread sheet and the model observables should be compared on a log-10 scale. This is frequently used for concentration data that contains log-normal measurement noise. The transformation is applied after rescaling in case of argument 5 applies.
- The 7th argument specify the mathematical expression of the observable function (here
`"scale_tSTAT5 * (STAT5 + pSTAT5)"`

and`"offset_pSTAT5 + scale_pSTAT5 * pSTAT5"`

. It will be places inside the log-10 of argument in case of argument 6 applies.

## 1.9 Defining the magnitude of the measurement noise in the ERRORS section

In the likelihood function, the measurement noise is modeled as normal or log-normal distribution, see 2.2 argument 6. The magnitude of the measurement noise, i.e. the standard deviation of the normal distribution (or standard deviation of the normal distribution in the log-space for log-normally distributed noise), can be implemented as parameterized function

ERRORS tSTAT5_au "sd_STAT5_au" pSTAT5_au "sd_STAT5_au" ...

- The 1st argument corresponds to an unique identifiers as given in the observables section (here
`tSTAT5_au`

and`pSTAT5_au`

). It is mandatory to specify the measurement noise for each observable in the data definition file. - The 2nd argument specify the mathematical expression of the noise function (here
`"sd_STAT5_au"`

and`"sd_STAT5_au"`

indicates that both measurements have the same measurement noise). Relative measurement noise can be implemented by using the observable identifier, e.g.`"sqrt(sd_STAT5_abs^2 + sd_STAT5_rel^2 * tSTAT5_au^2)"`

.

### 1.9.1 Defining the measurement noise in the data sheet

Sometimes, instead of using a parametrized function for the measurement noise, one like to directly specify the amount of noise calculated from replicates in the data sheet. Put these value in the data sheet with column header same as the observable name but with the addition `_std`

(here e.g. `tSTAT5_au_std`

and `pSTAT5_au_std`

). Please note that these values will be interpreted as standard deviation of the data not as variance! An example for this usage can be found in the example application `/Example/Swameye_PNAS2003`

.

To activate this mode set the value `ar.config.fiterrors = -1`

. `ar.config.fiterrors = 1`

indicates the standard mode where the error model is used and estimated together with the dynamics. `ar.config.fiterrors = 0`

indicates that the error model is used but not estimated, i.e. its parameters are fixed.

For the visual output, an alternative mode is available that shows the measurement noise as error bars on the data (set `ar.config.ploterrors = 1`

) rather than noise model around the trajectories (default `ar.config.ploterrors = 0`

).

In order to use multiple modes, set `ar.config.useFitErrorMatrix = 1`

and specify the different modes in `ar.config.fiterrors_matrix(model index, data index)`

. Accordingly, the different plotting modes can be specified in `ar.config.ploterrors_matrix(model index, data index)`

.

## 1.10 SUBSTITUTIONS section

In this section identifiers can be specified which can subsequently be used in model conditions. They will be iteratively substituted in the expressions in the CONDITIONS sections.

SUBSTITUTIONS mySteadyState "kpro/myDegradation" myDegradation "2*kdeg" ...

- The 1st argument specifies a unique identifier, which may not have the same name as any model parameter
- The 2nd argument specifies a mathematical expression that will replace located instances of the identifier. Note that these will be evaluated repeatedly until there are no further substitutions. Hence, in this example
`mySteadyState`

will equate to`0.5 * kpro / kdeg`

. Cyclic substitutions are*not*allowed.

## 1.11 CONDITIONS section

In this section conditions in term of the model parameters can be specified. All expression that are not uniquely defined above as either compartments, states or inputs will be treated as free parameters.

CONDITIONS kD "koff/kon" init_Epo_EpoR "0" STAT5ActJAK2 "STAT5ActJAK2/init_EpoRJAK2" ...

- The 1st argument specifies a target parameter that occurred in the model definition above.
- The 2nd argument specifies a mathematical expression that will replace the target parameter in the model (here
`"koff/kon"`

,`"0"`

and`"STAT5ActJAK2/init_EpoRJAK2"`

). This can be any mathematical expression of composed of new and old parameters. Take care not to implement recursive conditions. All replacements are executed once and sequentially, so order matters.

## 1.12 PARAMETERS section

In this optional section the default settings for parameter can be specified. Please note that the intension of this section is to save final parameter values for documentation purpose. During work in progress parameter values should be manipulate "soft" in the MATLAB workspace variables.

PARAMETERS #ar.pExternLabels ar.pExtern ar.qFitExtern ar.qLog10Extern ar.lbExtern ar.ubExtern init_PX 0 1 0 0 1000 ...

# 2. Data Definition File

Similar to the model definition file experiments have data definition files that contain experiment specific information. A data definition file is a plain text file with the ending `.def`

located in the subfolder `/Data`

of the current MATLAB working directory. Each data set, stored in the subfolder `/Data`

as `.xls`

or `.csv`

file, must have its own data definition file with the same name but the suffix `.def`

. The data definition files are structured in various section:

DESCRIPTION ... PREDICTOR (or PREDICTOR-DOSERESPONSE) ... INPUTS ... OBSERVABLES ... ERRORS ... SUBSTITUTIONS (optional) ... CONDITIONS ...

Sections `DESCRIPTION`

and `INVARIANTS`

are same as in the model definition file, see 1.1 and 1.7. The remaining sections will be explained in the following. As in the model definition file lines in the data definition file can be commented out using `//`

.

## 2.1 PREDICTOR and PREDICTOR-DOSERESPONSE sections

Is used to define the independent variable / the predictor. For the data definition file there are two modes available, `PREDICTOR`

and `PREDICTOR-DOSERESPONSE`

. For using the `PREDICTOR`

mode see description in Section 1.2, the independent variable in this case is usually time. In the `PREDICTOR-DOSERESPONSE`

mode one can use an existing `parameter`

as additional independent variable

PREDICTOR-DOSERESPONSE parameter t T "min" "time" 0 100

In this mode, there has to be a column named `parameter`

in the data sheet, besides the first column that encodes the standard independent variable (here time `t`

). Please note that formally this alternative mode does not change the results, however the plot of the model fit to the data will be modified. For instance if `parameter`

corresponds to the concentration of a certain input, the result will the a dose response plot.

## 2.2 Defining the INPUTS

Each experiment can have a different input function, e.g. instance implementing a different treatment scheme. Therefore, in the data definition file the input can be redefined for this specific experiment.

INPUTS Epo "10 + sin(k1*t)" SAv "1 + 0.1*t" ...

- The 1st argument corresponds to an unique identifiers as given in the model definition file (here
`Epo`

and`SAv`

). - The 2nd argument specify the mathematical expression of the input function (here
`"10 + sin(k1*t)"`

and`"1 + 0.1*t"`

), see 1.5 for more detailed explanation.

## 2.3 OBSERVABLES and ERRORS sections

At this point of the model definition files one can specify observables and their corresponding error model that override the default specifications from the model definition file (see Sections 1.8 and 1.9).

## 2.4 Defining the SUBSTITUTIONS for the specific experiment

Like in the model definition file, see 1.10 for detailed explanation, experiment specific substitutions can be specified.

## 2.5 Defining the CONDITIONS for the specific experiment

Like in the model definition file, see 1.11 for detailed explanation, experiment specific conditions can be specified.

CONDITIONS init_Epo_EpoR "10" ...

Please note that these "data" conditions are applied to the model after the "model" conditions have been applied.

## 2.6 PARAMETERS section

Like in the model definition file, see 1.12 for detailed explanation, the default settings for parameter can be specified also on the level of the data definition file, i.e. for the additional parameter that might be introduced there.

Updated