# Dynamical tree output

Authors: JuanPi Carbajal

## 1   Introduction

### 1.1   The urban water network setting

A recurrent problem in operational urban water management, is that of simulating the flow at a given outlet connected to an upstream pipe network.

The upstream network usually has a tree topology. The edges of the tree represent conduits. The nodes of the tree represent points on which two or more conduits merge. There can also be an inlet at the node in which runoff flow into the network. The root node is the outlet at which the flow is simulated.

We can convert any tree in which inner nodes are inlets into a tree where all inlets are leaf nodes by adding virtual conduits.

Series of conduits without inlets or merges can be replaced by a single equivalent conduit

The previous observations allows us to define a rule to simplify the network:

1. Make all inlets leaf nodes.
2. Replace all conduits incident to nodes of degree 2 with its equivalent.
3. Repeat the last step until there are no more nodes of degree 2.

The problems are then: 1) to find the transformation applied to the inflow along the path from inlet to outlet, 2) define the conduit replacement rule.

#### 1.1.1   Edge reduction

Note

Coming soon!

Hypotheses:

1. All edges apply the same family of parametrized transformations, each edge is an instance with a given parameter vector $$\mathbf{\theta}_k$$.
2. Combining two edges produces the same type of operation with different parameters, i.e. the family is closed under concatenation.
\begin{align*} \mathcal{L}_i = \mathcal{L}_{\mathbf{\theta}_i}, \forall i\\ \mathcal{L}_{\mathbf{\theta}_i} \circ \mathcal{L}_{\mathbf{\theta}_j} = \mathcal{L}_{\mathbf{\theta}}\\ \theta = m(\theta_i,\theta_j) \end{align*}

### 1.2   Tree representation of a dynamical system

Here we deal with the output of a tree whose edges are linear operators and nodes are inputs to these operators.

A tree is fully specified by giving the predecessors list, i.e. for each node indicate the predecessor node:

Node (label and index) Predecessor
1 0
2 1
3 < 3
$$\ldots$$ $$\ldots$$
n < n

For example, consider the tree (0, 1, 1):

a -- c
b -- c


where nodes a and b are connected to node c. The nodes a and b could be functions of time and the node c is the result of doing

\begin{equation*} a(t-\delta_a) + b(t-\delta_b) = c(t) \end{equation*}

In this example the tree takes two functions as inputs ($$a(t)$$ and $$b(t)$$) on its leaf nodes and the edges delay those functions by $$delta_a$$ and $$\delta_b$$, respectively. The root node c is the the output of the tree $$c(t)$$.

In general the edges can apply any linear operator to the input functions, for example they could take the function on the node as the input to a linear ODE

\begin{align*} \dot{\mathbf{x}} = A \mathbf{x} + B f(t)\\ \mathbf{y}(t) = C \mathbf{x}(t) \end{align*}

where $$f(t)$$ is the input and $$\mathbf{y}(t)$$ is the output.

In general we will write $$\mathcal{L}_i(f)$$ to the effect of edge $$i$$ on the input $$f$$, i.e. $$\mathcal{L}_i$$ is the input-output relation of the edge.

The output at node $$k$$ in a tree with $$N_l$$ leaf nodes is

\begin{equation*} y_k = \sum_{i=1}^{N_l} \left(\bigodot_{n \in \operatorname{path}(i,k)} \mathcal{L}_n\right)(x_i) = \sum_{i=1}^{N_l} \bar{\mathcal{L}}_i(x_i) \end{equation*}

where $$\bigodot$$ stands for composition and $$\operatorname{path}(i,k)$$ is the path from leaf node $$i$$ to the node in question $$k$$. In words, the input at each leaf is transformed down the path to node $$k$$ , and the output is the sum of all transformed inputs.

## 2   Edge types

### 2.1   Delay edges

A simple input-output relation of an edge is a delay, i.e. $$\mathcal{L}_i(f(t)) = f(t - \tau_i)$$. In this situation the output function is then given by

\begin{equation*} y(t) = \sum_{i=1}^{N_l} x_i(t - \sum_{n \in \operatorname{path}(i)} \tau_n) = \sum_{i=1}^{N_l} x_i(t - \tau_i) \end{equation*}

where $$\tau_i$$ is the delay generated along the path from the root node to the leaf node $$i$$ (i.e. $$\operatorname{path}(i)$$). There are $$N_l$$ leaf nodes and correspoding paths (i.e. a tree).

The delay is always non-negative and zero only for virtual edges.

This results leads to the following questions:

1. Given $$N_l$$ functions $$x_i(t)$$ and an delay array $$\lbrace \tau_i \rbrace$$. What is the result of the delayed superposition?
2. What is the result for a given distribution of the delay array?
3. What is the result for the mean over many delay arrays?
4. Can we estimate the delay array from data?
1. What is the limiting distribution ($$N_l \to \infty$$) of the delay array for random trees?

We answer the first questions 1. and 2. in section Delayed superposition. We deal with question 4. in Estimating delays from data. Question 5. is studied in section Delay array distribution in large random trees.

#### 2.1.1   Delays in urban water networks

In the case of urban water networks, the value of the delay is related to the physical properties of the conduit, for example, it could be estimated using the Gauckler–Manning–Strickler formula for the cross-sectional average speed of flow

\begin{equation*} \tau \propto L <v>^{-1} = L n \left(\frac{L_w}{A}\right)^{2/3}S^{-1/2} \end{equation*}

where $$L$$ is the length of the conduit, $$n$$ is the Gauckler–Manning coefficient, $$L_w$$ is the wetted perimeter of the conduit, $$A$$ is the conduit's flow cross-sectional area, and $$S$$ the conduit's slope (or linear hydraulic head loss when the water depth is not constant along the conduit).

#### 2.1.2   Delayed superposition

Herein we explore the result of the superposition of a delayed waveform $$\varphi(t)$$, i.e.

\begin{equation*} f(t,\vec{\tau}) = \sum_{i=1}^N \varphi(t - \tau_i) \end{equation*}

where the delays $$\lbrace \tau_i \rbrace \geq 0$$ form the delay array $$\vec{\tau}$$.

Assume that each delay is independently drawn from a given distribution $$\tau_i \sim p(\tau)$$, then the delay array is a random variable (i.e. i.i.d. random variable)

\begin{equation*} \vec{\tau} \in \mathbb{R}_+^N \sim P(\vec{\tau}) = \prod_{k=1}^N p(\tau_k) = p(\tau)^N \end{equation*}

Note

I expect that the delays will be correlated depending on the distance between the underlying pipes. Correlation in the terrain/pipe will be reflected in the delays, via slope and diameters.

We now calculate the mean delayed superposition (i.e. the average superposition over trees with the same number of leaf nodes but different delays drawn from $$P(\vec{\tau})$$)

\begin{align*} <f(t)>_{\vec{\tau}} =& \int_{\mathbb{R}_+^N} f(t,\vec{\tau}) P(\vec{\tau}) d\vec{\tau} = \int_{\tau_N \cdots \tau_1} f(t,\vec{\tau}) \prod_{k=1}^N p(\tau_k) d\tau_k = \\ &\sum_{i=1}^N \int_{\tau_N \cdots \tau_1} \varphi(t - \tau_i) \prod_{k=1}^N p(\tau_k) d\tau_k = \\ = & \sum_{i=1}^{N-1} \int_{\tau_N \cdots \tau_1} \varphi(t - \tau_i) \prod_{k=1}^N p(\tau_k) d\tau_k + \int_{\tau_N \cdots \tau_1} \varphi(t - \tau_N) \prod_{k=1}^N p(\tau_k) d\tau_k \end{align*}

The second term in the last expression can be simplified to

\begin{align*} \int_{\tau_N \cdots \tau_1} \varphi(t - \tau_N) \prod_{k=1}^N p(\tau_k) d\tau_k =& \int_{\tau_{N-1} \cdots \tau_1} \prod_{k=1}^{N-1} p(\tau_k) d\tau_k \int_{\tau_N}\varphi(t - \tau_N) p(\tau_N) d\tau_N = \\ = & \int_{\tau_N}\varphi(t - \tau_N) p(\tau_N) d\tau_N \end{align*}

due to the normalization of the distribution $$p(\tau)$$. This is true for all terms, so applying this simplification to each term we obtain

\begin{equation*} \bar{f}(t) = <f(t)>_{\vec{\tau}} = \sum_{i=1}^{N} \int_{\tau_i}\varphi(t - \tau_i) p(\tau_i) d\tau_i = N \int_0^{\infty} \varphi(t - s) p(s) ds = N \hat{\varphi}(t) \end{equation*}

the last expression is obtained form the fact that all the elements of the delay array are independently drawn from the same distribution (i.i.d. assumption).

##### 2.1.2.1   Weighted superposition

The results above can be easily generalized to weighted superpositions (i.i.d. weights),

Note

As before expect that the weights are correlated, and also correlated with the delays.

\begin{align*} f(t,\vec{\tau},\vec{w}) =& \Phi(t, \vec{\tau}^\top) \vec{w}\\ \Phi(t,\vec{\tau}^\top) =& \begin{bmatrix} \varphi(t-d_1) & \ldots & \varphi(t-d_N)\end{bmatrix} \in \mathbb{R}^{1 \times N}([0,T]) \end{align*}

where $$\vec{w} \in \mathbb{R}^{N \times 1}$$ is a vector of weights. Note that $$\Phi(t,\vec{\tau})^\top = \Phi(t,\vec{\tau}^\top)$$

This gives

\begin{equation*} \bar{f}(t) = <f(t)>_{\vec{\tau},\vec{w}} = N <w> \hat{\varphi}(t) \end{equation*}

where $$<w>$$ is the mean of any of the components of the weight vector (i.i.d.)

Examples of these results are provided in the demo script s_delayed_superposition.m or its html report.

#### 2.1.3   Estimating delays from data

We now want to estimate the delays given observed data of the superposition $$y(r)$$ (this is a vector of data with components $$y(r_i)$$) and some assumptions about the delay array.

##### 2.1.3.1   Effective number of delays: least square error

Recall the mean function over delay array and weight vector realizations

\begin{equation*} \bar{f}(t) = <f(t)>_{\vec{\tau},\vec{w}} = N <w> \hat{\varphi}(t) \end{equation*}

This function can be used to estimate a least square estimate of the effective number of terms in the superposition (i.e. effective delay array length or effective mean weight),

\begin{equation*} \tilde{N} = N < w > \end{equation*}

this means that we search for $$\tilde{N}$$ that minimizes the square error

\begin{align*} \tilde{N}_{LS} =& \operatorname{argmin}_{x} \sum_{i=1}^n \left(y(r_i) - x \hat{\varphi}(r_i)\right)^2\\ \tilde{N}_{LS} =& y(r) \frac{\hat{\varphi}(r)}{\Vert \hat{\varphi}(r) \Vert^2} \end{align*}
##### 2.1.3.2   Effective number of delays: maximum marginal likelihood

Now we derive an expression for the deviation from the mean:

\begin{equation*} \Delta f(t, \vec{\tau}, \vec{w}) = f(t, \vec{\tau}, \vec{w}) - \bar{f}(t) = \Phi(t, \vec{\tau}^\top) \vec{w} - \tilde{N} \hat{\varphi}(t) \end{equation*}

The covariance function is obtained by doing

\begin{align*} K(t,r) =& <\Delta f(t, \vec{\tau}, \vec{w}) \Delta f(r, \vec{\tau}, \vec{w})>_{\vec{\tau}, \vec{w}} = <\left(\Phi(t, \vec{\tau}^\top) \vec{w} - \tilde{N} \hat{\varphi}(t)\right)\left(\Phi(r, \vec{\tau}^\top) \vec{w} - \tilde{N} \hat{\varphi}(r)\right)>_{\vec{\tau}, \vec{w}} = \\ = & < \left(\Phi(t, \vec{\tau}^\top) \vec{w} \right)\left(\Phi(r, \vec{\tau}^\top) \vec{w} \right)>_{\vec{\tau},\vec{w}} - \tilde{N}^2 \hat{\varphi}(t)\hat{\varphi}(r) \end{align*}

expectation of the first term:

\begin{equation*} < \Phi(t, \vec{\tau}^\top) \vec{w} \vec{w}^\top \Phi(r, \vec{\tau})>_{\vec{\tau},\vec{w}} = < \Phi(t, \vec{\tau}^\top) <\vec{w} \vec{w}^\top>_{\vec{w}} \Phi(r, \vec{\tau})>_{\vec{\tau}} = < \Phi(t, \vec{\tau}^\top) \left(\Sigma_{\vec{w}} + <\vec{w}><\vec{w}>^\top\right)\Phi(r, \vec{\tau})>_{\vec{\tau}} \end{equation*}

using independence and that the delay array is identically distributed we obtain

\begin{equation*} < \Phi(t, \vec{\tau}^\top) \left(\Sigma_{\vec{w}} + <\vec{w}><\vec{w}>^\top\right) \Phi(r, \vec{\tau})>_{\vec{\tau}} = S_w \hat{\varphi}(t)\hat{\varphi}(r) + \operatorname{tr}(\left(\Sigma_{\vec{w}} + <\vec{w}><\vec{w}>^\top\right)) \hat{\varphi}(t,r) \end{equation*}

where we defined

\begin{align*} S_w &= \sum_k \sum_{i \neq k} \left(\Sigma_{\vec{w}} + <\vec{w}><\vec{w}>^\top\right)_{ik} = 2 \sum_k \sum_{i < k} \left(\Sigma_{\vec{w}} + <\vec{w}><\vec{w}>^\top\right)_{ik}\\ \hat{\varphi}(t,r) &= \int_0^{\infty} \varphi(t - s) \varphi(r - s) p(s) ds \end{align*}

giving

\begin{equation*} K(t,r) = \left( S_w - \tilde{N}^2\right) \hat{\varphi}(t)\hat{\varphi}(r) + \operatorname{tr}\left(\Sigma_{\vec{w}} + <\vec{w}><\vec{w}>^\top\right) \hat{\varphi}(t,r) \end{equation*}

Lets consider a limiting case: all weights are 1 with no variance.

\begin{align*} \Sigma_{\vec{w}} + <\vec{w}><\vec{w}>^\top &= 1 \\ S_w &= N( N - 1)\\ \tilde{N} &= N\\ \operatorname{tr}\left(\Sigma_{\vec{w}} + <\vec{w}><\vec{w}>^\top\right) &= N \\ K(t,r) &= N \left[\hat{\varphi}(t,r) - \hat{\varphi(t)} \hat{\varphi(r)}\right] = N \hat{K}(t,r) \end{align*}

With mean and covariance we can define a Gaussian process that approximates the data. The predictive mean of this GP is

\begin{equation*} \bar{f}(t) = N \hat{K}(t,r) \left[ N \hat{K}(r,r) + \sigma^2 I\right]^{-1} \, \left[y(r) - N\hat{\varphi}(r)\right] + N\hat{\varphi}(t) \end{equation*}

where $$r$$ are the times where we have data.

Equating to zero the derivative of the log marginal likelihood with respect to $$N$$ gives the MAP estimate of that parameter (here assuming $$\sigma = 0$$)

\begin{align*} N_{GP} =& \frac{\sqrt{n^2 + 4 \hat{\alpha} \cdot \hat{\varphi}(r) \; y(r) \cdot \alpha} - n}{2 \hat{\alpha} \cdot \hat{\varphi}(r)} \\ \hat{\alpha} =& \hat{K}^{-1}(r,r) \hat{\varphi}(r) \\ \alpha =& \hat{K}^{-1}(r,r) y(r) \end{align*}

where $$n$$ is the number of observations

For an example (see script s_delayed_decomp.m or its html report).

##### 2.1.3.3   More general parametrization of the waveforms

We would like to consider other kinds of parametrization of the waverforms, in particular

\begin{align*} \varphi (t, \vec{\lambda}) &= \varphi (\begin{bmatrix}t & 1\end{bmatrix}\vec{\lambda})\\ \Phi(t,\Lambda) =& \begin{bmatrix} \varphi(t,\Lambda_{:1}) & \ldots & \varphi(t,\Lambda_{:N})\end{bmatrix} \in \mathbb{R}^{1 \times N}([0,T]) \end{align*}

which can be used to represent scalings and translations of the waveform

\begin{equation*} \varphi (t, \begin{bmatrix}\alpha & \tau\end{bmatrix}^\top) = \varphi \left(\frac{t -\tau}{\alpha}\right) \end{equation*}

The structure of the mean and covariance remain the same but the hatted functions change (assuming that the parameter vectors are i.i.d.)

\begin{align*} \hat{\varphi}(t) &= \int_{\vec{\lambda}} \varphi(t, \vec{\lambda}) p(\vec{\lambda}) d\vec{\lambda}\\ \hat{\varphi}(t,r) &= \int_{\vec{\lambda}} \varphi(t, \vec{\lambda}) \varphi(r, \vec{\lambda}) p(\vec{\lambda}) d\vec{\lambda} \end{align*}

Warning

From here on is not updated

##### 2.1.3.4   Effective upstream network

Herein we discuss the case in which the upstream network topology is not know, but we have a sample of the tree output and we are given the waveform, i.e. we know the result $$y(r)$$ in

\begin{equation*} y(r) = \sum_{\tau} w(\tau) \varphi (t - \tau) \end{equation*}

but we do not know $$N$$, nor the weights, nor the elements in the delay array.

Approximate solutions to this problem have been widely studied in the literature using greedy algorithms and the many guises of the matching pursuit algorithm. In 2011 Ekanadham et al. [Ekanadham11] provided an algorithm to approximately solve this problem without resourcing to arbitrary discretization: the continuous basis pursuit.

The solution is a sparse delay array. This delay array can be interpreted as the weighted path length from inlets to the root node of an effective upstream network. That is, it provides us with the rooted path length distribution of a family of trees. Using a generative algorithm we can generate the predecessors list of these trees.

Note

Coming soon!

#### 2.1.5   Rooted path length distribution

Note

Coming soon!

All delays equal to 1

Note

Coming soon!

Log-normal delays

### 2.2   Transfer function edges

Delay edges are a particular case.

Note

Coming soon!

## 3   References

 [Ekanadham11] Ekanadham, C., Tranchina, D., & Simoncelli, E. P. (2011). Recovery of Sparse Translation-Invariant Signals With Continuous Basis Pursuit. IEEE Transactions on Signal Processing, 59(10), 4735–4744. https://doi.org/10.1109/TSP.2011.2160058

Updated