# Case study: bioreactor

Authors: JuanPi Carbajal

## Introduction

The bioreactor is a differential-algebraic equation (DAE) system [Masic2017] [FloresAlsina2015].

An emulator is needed to perform system identification.

## Emulating algebraic part

Source code committed on: b6f2d8d and 43d8622

The first emulation task focuses on the algebraic constraints of the DAE system.

Here we consider an algebraic part that takes a 2D input. In the next report we explore the case of a 4D input.

### Existence of emulator

The algebraic constraints are rational functions of the inputs. Hence, at points where the Jacobian matrix of the constraints is non-singular, they define an implicit function, i.e.

\begin{align*} \boldsymbol{g}(\boldsymbol{x}, \operatorname{pH}, \boldsymbol{\theta}_e) &= \boldsymbol{0} \\ \boldsymbol{g}(\boldsymbol{x}_0, \ldots) = \boldsymbol{0} \; \wedge \; \det(J\boldsymbol{g}(\boldsymbol{x}_0, \ldots)) &\neq 0 \\ &\Rightarrow \operatorname{pH} = f(\boldsymbol{x}, \boldsymbol{\theta}_e), \; \boldsymbol{x} \in \operatorname{\mathcal{B}}(\boldsymbol{x}_0) \end{align*}

This means that, at least on some open domains of the inputs, a function from the inputs to the pH value exists and can be approximated, i.e. we can emulate it.

The general strategy to compute the manifold is to use a zero-finding routine like fzero. This is very general, but for our particular case it should be possible to find a specific approach that is faster than fzero and comparable in accuracy.
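As a rough illustration of the zero-finding strategy, here is a minimal Python sketch that uses plain bisection as a stand-in for fzero. The function `toy_constraint` is a hypothetical rational constraint chosen for illustration only, not the bioreactor constraint, and the names `solve_bisect`, `x1`, `x2` and `p` (playing the role of pH) are assumptions.

```python
def toy_constraint(x1, x2, p):
    """Hypothetical rational constraint g(x, p) = 0 (not the bioreactor model)."""
    return p + x1 * p / (1.0 + p) - x2


def solve_bisect(x1, x2, tol=1e-12, max_iter=200):
    """Find p >= 0 with toy_constraint(x1, x2, p) = 0 by bisection.

    Assumes x1 >= 0 and x2 >= 0, so that g is increasing in p and
    g(0) <= 0 <= g(x2), i.e. the root is bracketed by [0, x2].
    """
    lo, hi = 0.0, x2
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if toy_constraint(x1, x2, mid) > 0.0:
            hi = mid  # root lies in the lower half
        else:
            lo = mid  # root lies in the upper half
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# One point of the manifold: solve for p at a given input (x1, x2).
p = solve_bisect(1.0, 2.0)
residual = toy_constraint(1.0, 2.0, p)
```

For `x1 = 1` and `x2 = 2` the toy constraint reduces to `p**2 = 2`, so the computed root can be checked against `2 ** 0.5`. Every evaluation of the manifold repeats the full iteration, which is the cost an emulator tries to avoid.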

### Toy model

We first considered the simulator in Masic et al. [Masic2017]. In this case the constraints take a 2-dimensional input and the pH value. Sampling the constraints on a grid (see sample_constraint.m) produces the black circles in the following plot

Applying a fully data-driven approach, we only need to interpolate these samples. To do this we use the GNU Octave function interp2.
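The idea of interpolating gridded samples can be sketched in Python with a hand-written bilinear interpolant. This is only an analogue of linear interpolation on a regular grid, not the code in s_interp.m. The function `toy_root` is the closed-form root of a hypothetical rational constraint, `p + x1 * p / (1 + p) - x2 = 0`, and stands in for the sampled pH values; the grid ranges and the 30 x 30 size are assumptions.

```python
import bisect
import math


def toy_root(x1, x2):
    """Closed-form root p of the hypothetical constraint p + x1*p/(1+p) - x2 = 0."""
    b = 1.0 + x1 - x2
    return 0.5 * (-b + math.sqrt(b * b + 4.0 * x2))


def make_grid(lo, hi, n):
    """Return n equally spaced points from lo to hi (inclusive)."""
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]


def cell_index(grid, value):
    """Index i of the grid cell [grid[i], grid[i+1]] containing value (clamped)."""
    i = bisect.bisect_right(grid, value) - 1
    return min(max(i, 0), len(grid) - 2)


def bilinear(xs, ys, table, x, y):
    """Bilinear interpolation of table[i][j] = f(xs[i], ys[j]) at (x, y)."""
    i, j = cell_index(xs, x), cell_index(ys, y)
    tx = (x - xs[i]) / (xs[i + 1] - xs[i])
    ty = (y - ys[j]) / (ys[j + 1] - ys[j])
    return ((1 - tx) * (1 - ty) * table[i][j]
            + tx * (1 - ty) * table[i + 1][j]
            + (1 - tx) * ty * table[i][j + 1]
            + tx * ty * table[i + 1][j + 1])


# "Training": sample the solution on a 30 x 30 grid (900 samples).
xs = make_grid(0.0, 2.0, 30)
ys = make_grid(0.5, 3.0, 30)
table = [[toy_root(a, b) for b in ys] for a in xs]

# "Testing": evaluate the interpolant off-grid and compare with the exact root.
approx = bilinear(xs, ys, table, 1.0, 2.0)
exact = toy_root(1.0, 2.0)
```

Once the table is built, each evaluation costs one cell lookup and a handful of multiplications, independent of how expensive the original solve was.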

#### Time costs

The timing results for a single run of s_interp.m look like the following:

    Training ...
    Trainig time per sample (N=900): 0.00626

    Testing ...
    ** Time per sample (N=100) **
    fzero: 0.00634
    interp: 1.82e-05
    Avg. rel. difference: 0.82%
    ** Avg. constraint error (N=100) **
    fzero: 8.83e-18
    interp: -0.000354

The cost of evaluating the interpolant is very low, and the per-sample cost decreases with the number of samples because the setup overhead is spread over more evaluations. However, the quality of the interpolated solution is much worse than that of fzero: the average constraint error is about -3.5e-4, compared with 8.8e-18 for fzero. This could be improved with adaptive sampling, at the price of a slight increase in training time.

Most of these values are not stable from run to run; for a final report, averages over large samples should be considered.
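One possible way to obtain more stable timings is to repeat the measurement and report the mean and spread of the per-call time. The following Python sketch shows the idea; the helper name `time_per_call` and the number of repeats are assumptions, and `math.hypot` is only a placeholder for the function being timed.

```python
import math
import statistics
import time


def time_per_call(func, args_list, repeats=20):
    """Return (mean, standard deviation) of the time per call over several repeats.

    Each repeat times one pass over args_list and divides by its length.
    """
    per_call = []
    for _ in range(repeats):
        start = time.perf_counter()
        for args in args_list:
            func(*args)
        elapsed = time.perf_counter() - start
        per_call.append(elapsed / len(args_list))
    return statistics.mean(per_call), statistics.stdev(per_call)


# Placeholder workload: 100 calls per repeat, 5 repeats.
mean_time, spread = time_per_call(math.hypot, [(3.0, 4.0)] * 100, repeats=5)
```

Reporting the spread next to the mean makes it easier to judge whether a difference between two methods is larger than the run-to-run variation.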

#### Observations

• The current use of fzero is naive. Its performance is expected to improve by tuning the optimization options, e.g. providing the Jacobian of the constraints.
• Due to the smoothness of the constraints, it is also possible to use simpler (gradient-free) zero-finding algorithms, e.g. Picard iterations.
• By calculating the inverse of the Jacobian of the constraints, the manifold approximation problem becomes that of solving a system of differential equations, which could be solved efficiently using model order reduction or Bayesian filters, e.g. Kalman filters.
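The last two observations can be illustrated on a hypothetical rational constraint, `g(x1, x2, p) = p + x1 * p / (1 + p) - x2`, which is not the bioreactor model. The Python sketch below first traces the root along `x1` by integrating `dp/dx1 = -(dg/dx1) / (dg/dp)` with a classical Runge-Kutta scheme, and then solves the same problem with a fixed-point (Picard-type) iteration. All function names, the step count and the iteration count are assumptions, and the sketch does not include model order reduction or filtering.

```python
def dp_dx1(x1, p):
    """Slope of the root along x1 for the toy constraint: -(dg/dx1) / (dg/dp)."""
    dg_dx1 = p / (1.0 + p)
    dg_dp = 1.0 + x1 / (1.0 + p) ** 2
    return -dg_dx1 / dg_dp


def trace_root(x2, x1_end, steps=50):
    """Integrate dp/dx1 with classical RK4 from x1 = 0 to x1_end.

    At x1 = 0 the toy constraint reduces to p - x2 = 0, so p = x2 is the
    known starting point on the manifold.
    """
    h = x1_end / steps
    x1, p = 0.0, x2
    for _ in range(steps):
        k1 = dp_dx1(x1, p)
        k2 = dp_dx1(x1 + 0.5 * h, p + 0.5 * h * k1)
        k3 = dp_dx1(x1 + 0.5 * h, p + 0.5 * h * k2)
        k4 = dp_dx1(x1 + h, p + h * k3)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x1 += h
    return p


def fixed_point_root(x1, x2, iters=60):
    """Gradient-free iteration p <- x2 - x1*p/(1+p), a rearrangement of g = 0."""
    p = x2
    for _ in range(iters):
        p = x2 - x1 * p / (1.0 + p)
    return p


p_ode = trace_root(2.0, 1.0)       # root at (x1, x2) = (1, 2) via the ODE
p_fp = fixed_point_root(1.0, 2.0)  # same root via fixed-point iteration
```

For `x1 = 1` and `x2 = 2` the toy constraint reduces to `p**2 = 2`, so both results can be compared with `2 ** 0.5`. The fixed-point iteration converges here because the map is a contraction near the root; that property would have to be checked for the actual constraints.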

## References

[Masic2017] Masic, A., Srinivasan, S., Billeter, J., Bonvin, D., & Villez, K. (2017). Identification of Biokinetic Models using the Concept of Extents. Environmental Science & Technology. http://doi.org/10.1021/acs.est.7b00250
[FloresAlsina2015] Flores-Alsina, X., Kazadi Mbamba, C., Solon, K., Vrecko, D., Tait, S., Batstone, D. J., Jeppsson, U., & Gernaey, K. V. (2015). A plant-wide aqueous phase chemistry module describing pH variations and ion speciation/pairing in wastewater treatment process models. Water Research, 85, 255–265. http://doi.org/10.1016/j.watres.2015.07.014
