11 Symbolic Expressions
Introduction
This file formalizes the expression tree abstract data type (ADT) that enables deferred computation in phasic. When a phase-type graph has parameterized edges (Definition 2.15), the edge weights are functions of a parameter vector \boldsymbol{\theta}. Rather than recomputing the entire graph elimination for each new \boldsymbol{\theta}, phasic records the elimination operations as symbolic expression trees that can be evaluated efficiently with any concrete parameter vector. This file defines the expression tree nodes, their evaluation semantics, common subexpression elimination (CSE) via interning, and symbolic differentiation for gradient computation.
The expression system is the foundation for symbolic graph elimination ([10]), the trace system ([11]), and JAX integration ([25]). It is a purely syntactic layer: this file defines expressions over parameters without reference to graphs. The connection to graphs is established in [10].
Prerequisites: [01]
Source files: - api/c/phasic.h (types: ptd_expression, enum ptd_expr_type, ptd_expr_intern_table) - src/c/phasic_symbolic.c (functions: ptd_expr_evaluate_iterative, ptd_expr_derivative, ptd_expr_copy_iterative, ptd_expr_destroy_iterative, edge_weight_to_expr)
Definitions
Definition 11.1 (Expression tree) An expression tree is a rooted tree \mathcal{E} where each node has a type drawn from the set \{\texttt{Const}, \texttt{Param}, \texttt{Dot}, \texttt{Add}, \texttt{Sub}, \texttt{Mul}, \texttt{Div}, \texttt{Inv}\} and carries data according to its type:
| Node type | Data | Children | Semantics |
|---|---|---|---|
| \texttt{Const}(c) | c \in \mathbb{R} | none (leaf) | Constant value c |
| \texttt{Param}(j) | j \in \{1, \ldots, k\} | none (leaf) | Parameter reference \theta_j |
| \texttt{Dot}(\mathbf{c}, \mathbf{j}) | \mathbf{c} \in \mathbb{R}^m, \mathbf{j} \in \{1,\ldots,k\}^m | none (leaf) | Sparse dot product \sum_{i=1}^{m} c_i \, \theta_{j_i} |
| \texttt{Add} | none | left \mathcal{E}_L, right \mathcal{E}_R | \mathcal{E}_L + \mathcal{E}_R |
| \texttt{Sub} | none | left \mathcal{E}_L, right \mathcal{E}_R | \mathcal{E}_L - \mathcal{E}_R |
| \texttt{Mul} | none | left \mathcal{E}_L, right \mathcal{E}_R | \mathcal{E}_L \cdot \mathcal{E}_R |
| \texttt{Div} | none | left \mathcal{E}_L, right \mathcal{E}_R | \mathcal{E}_L / \mathcal{E}_R |
| \texttt{Inv} | none | left \mathcal{E}_L | 1 / \mathcal{E}_L |
Intuition
An expression tree is a standard abstract syntax tree for arithmetic expressions, extended with the \texttt{Dot} node as an optimized representation for linear combinations of parameters (the most common expression in phase-type models). The \texttt{Inv} node represents unary inversion, distinct from \texttt{Div} which is binary.Example
The expression 2\theta_1 + 3\theta_2 is represented as a single \texttt{Dot} node: \texttt{Dot}((2, 3), (1, 2)). The expression \theta_1 / (\theta_1 + \theta_2) is represented as \texttt{Div}(\texttt{Param}(1), \texttt{Add}(\texttt{Param}(1), \texttt{Param}(2))).Definition 11.2 (Expression evaluation) Let \mathcal{E} be an expression tree and let \boldsymbol{\theta} = (\theta_1, \ldots, \theta_k) be a parameter vector. The evaluation function \operatorname{eval}(\mathcal{E}, \boldsymbol{\theta}) is defined recursively:
\operatorname{eval}(\mathcal{E}, \boldsymbol{\theta}) = \begin{cases} c & \text{if } \mathcal{E} = \texttt{Const}(c) \\ \theta_j & \text{if } \mathcal{E} = \texttt{Param}(j) \\ \sum_{i=1}^{m} c_i \, \theta_{j_i} & \text{if } \mathcal{E} = \texttt{Dot}(\mathbf{c}, \mathbf{j}) \\ \operatorname{eval}(\mathcal{E}_L, \boldsymbol{\theta}) + \operatorname{eval}(\mathcal{E}_R, \boldsymbol{\theta}) & \text{if } \mathcal{E} = \texttt{Add}(\mathcal{E}_L, \mathcal{E}_R) \\ \operatorname{eval}(\mathcal{E}_L, \boldsymbol{\theta}) - \operatorname{eval}(\mathcal{E}_R, \boldsymbol{\theta}) & \text{if } \mathcal{E} = \texttt{Sub}(\mathcal{E}_L, \mathcal{E}_R) \\ \operatorname{eval}(\mathcal{E}_L, \boldsymbol{\theta}) \cdot \operatorname{eval}(\mathcal{E}_R, \boldsymbol{\theta}) & \text{if } \mathcal{E} = \texttt{Mul}(\mathcal{E}_L, \mathcal{E}_R) \\ \operatorname{eval}(\mathcal{E}_L, \boldsymbol{\theta}) / \operatorname{eval}(\mathcal{E}_R, \boldsymbol{\theta}) & \text{if } \mathcal{E} = \texttt{Div}(\mathcal{E}_L, \mathcal{E}_R) \\ 1 / \operatorname{eval}(\mathcal{E}_L, \boldsymbol{\theta}) & \text{if } \mathcal{E} = \texttt{Inv}(\mathcal{E}_L) \end{cases}
Intuition
Evaluation is a standard post-order tree traversal: evaluate children first, then combine. The implementation uses an iterative (stack-based) traversal to avoid stack overflow on deep expression trees.Definition 11.3 (Common Subexpression Elimination via interning) An intern table \mathcal{I} is a hash table mapping expression structure to canonical expression nodes. For each expression \mathcal{E}, the interning function \operatorname{intern}(\mathcal{I}, \mathcal{E}) returns a canonical representative: if an expression structurally equal to \mathcal{E} already exists in \mathcal{I}, its existing node is returned; otherwise, \mathcal{E} is inserted and returned as the new canonical representative.
Two expressions \mathcal{E}_1 and \mathcal{E}_2 are structurally equal if they have the same type, the same data (constants, parameter indices, coefficient vectors), and recursively structurally equal children.
The interned constructors \texttt{Add}_\mathcal{I}, \texttt{Mul}_\mathcal{I}, \texttt{Div}_\mathcal{I}, \texttt{Sub}_\mathcal{I}, \texttt{Inv}_\mathcal{I} construct the expression node and immediately intern it, ensuring that identical subexpressions share a single node in memory.
Intuition
During graph elimination, the same subexpression (e.g., the sum of edge weights at a vertex) appears in multiple contexts. Without CSE, the expression tree grows exponentially with the number of elimination steps. Interning ensures that each unique subexpression is stored once, converting the tree into a DAG (directed acyclic graph of expression nodes) and keeping memory bounded.Example
If vertex v has edges with weights \mathcal{E}_1 and \mathcal{E}_2, the total rate is \mathcal{E}_\lambda = \texttt{Add}_\mathcal{I}(\mathcal{E}_1, \mathcal{E}_2). The normalized probability for the first edge is \texttt{Mul}_\mathcal{I}(\mathcal{E}_1, \texttt{Inv}_\mathcal{I}(\mathcal{E}_\lambda)). If another vertex also needs \texttt{Inv}_\mathcal{I}(\mathcal{E}_\lambda), the intern table returns the same node rather than creating a duplicate.Definition 11.4 (Expression differentiation) Let \mathcal{E} be an expression tree. The symbolic derivative of \mathcal{E} with respect to parameter \theta_j is an expression tree \partial_j \mathcal{E} defined recursively:
\partial_j \mathcal{E} = \begin{cases} \texttt{Const}(0) & \text{if } \mathcal{E} = \texttt{Const}(c) \\ \texttt{Const}(\delta_{j,j'}) & \text{if } \mathcal{E} = \texttt{Param}(j') \\ \texttt{Const}(c_i) & \text{if } \mathcal{E} = \texttt{Dot}(\mathbf{c}, \mathbf{j}) \text{ and } j = j_i \text{ for some } i \\ \texttt{Const}(0) & \text{if } \mathcal{E} = \texttt{Dot}(\mathbf{c}, \mathbf{j}) \text{ and } j \notin \{j_1, \ldots, j_m\} \\ \texttt{Add}(\partial_j \mathcal{E}_L, \, \partial_j \mathcal{E}_R) & \text{if } \mathcal{E} = \texttt{Add}(\mathcal{E}_L, \mathcal{E}_R) \\ \texttt{Sub}(\partial_j \mathcal{E}_L, \, \partial_j \mathcal{E}_R) & \text{if } \mathcal{E} = \texttt{Sub}(\mathcal{E}_L, \mathcal{E}_R) \\ \texttt{Add}(\texttt{Mul}(\mathcal{E}_L, \partial_j \mathcal{E}_R), \, \texttt{Mul}(\mathcal{E}_R, \partial_j \mathcal{E}_L)) & \text{if } \mathcal{E} = \texttt{Mul}(\mathcal{E}_L, \mathcal{E}_R) \\ \texttt{Div}(\texttt{Sub}(\texttt{Mul}(\mathcal{E}_R, \partial_j \mathcal{E}_L), \, \texttt{Mul}(\mathcal{E}_L, \partial_j \mathcal{E}_R)), \, \texttt{Mul}(\mathcal{E}_R, \mathcal{E}_R)) & \text{if } \mathcal{E} = \texttt{Div}(\mathcal{E}_L, \mathcal{E}_R) \\ \texttt{Div}(\texttt{Sub}(\texttt{Const}(0), \partial_j \mathcal{E}_L), \, \texttt{Mul}(\mathcal{E}_L, \mathcal{E}_L)) & \text{if } \mathcal{E} = \texttt{Inv}(\mathcal{E}_L) \end{cases}
where \delta_{j,j'} is the Kronecker delta.
Intuition
This is standard symbolic differentiation (sum rule, product rule, quotient rule, chain rule) applied to the expression tree ADT. The \texttt{Dot} case is a specialization of the linearity of differentiation. The result is a new expression tree that, when evaluated, yields the partial derivative \partial f / \partial \theta_j where f = \operatorname{eval}(\mathcal{E}, \boldsymbol{\theta}).Example
For \mathcal{E} = \texttt{Mul}(\texttt{Param}(1), \texttt{Param}(2)) representing \theta_1 \cdot \theta_2, we have \partial_1 \mathcal{E} = \texttt{Add}(\texttt{Mul}(\texttt{Param}(1), \texttt{Const}(0)), \texttt{Mul}(\texttt{Param}(2), \texttt{Const}(1))) = \texttt{Add}(\texttt{Const}(0), \texttt{Param}(2)). Evaluating at any \boldsymbol{\theta} gives \theta_2, which is the correct derivative.Theorems and Proofs
Theorem 11.1 (Evaluation correctness) Let \mathcal{E} be an expression tree and \boldsymbol{\theta} \in \mathbb{R}^k be a parameter vector such that no division by zero occurs during evaluation. Then \operatorname{eval}(\mathcal{E}, \boldsymbol{\theta}) is well-defined and equals the real number obtained by the standard arithmetic interpretation of \mathcal{E}.
Proof. By structural induction on \mathcal{E}.
Base cases. For \texttt{Const}(c): \operatorname{eval}(\texttt{Const}(c), \boldsymbol{\theta}) = c, which is the standard interpretation. For \texttt{Param}(j): \operatorname{eval}(\texttt{Param}(j), \boldsymbol{\theta}) = \theta_j, which is the standard interpretation. For \texttt{Dot}(\mathbf{c}, \mathbf{j}): \operatorname{eval}(\texttt{Dot}(\mathbf{c}, \mathbf{j}), \boldsymbol{\theta}) = \sum_{i=1}^m c_i \theta_{j_i}, which is the standard dot product.
Inductive cases. For each binary operation \texttt{Op} \in \{\texttt{Add}, \texttt{Sub}, \texttt{Mul}, \texttt{Div}\}: by the induction hypothesis, \operatorname{eval}(\mathcal{E}_L, \boldsymbol{\theta}) and \operatorname{eval}(\mathcal{E}_R, \boldsymbol{\theta}) equal the standard interpretations of the subtrees. The definition of \operatorname{eval} applies the corresponding arithmetic operation, producing the standard interpretation of the full tree. For \texttt{Inv}: by the induction hypothesis, \operatorname{eval}(\mathcal{E}_L, \boldsymbol{\theta}) is well-defined and non-zero (by the no-division-by-zero hypothesis), so 1 / \operatorname{eval}(\mathcal{E}_L, \boldsymbol{\theta}) is well-defined and equals the standard interpretation. \square
Theorem 11.2 (Differentiation correctness) Let \mathcal{E} be an expression tree and \boldsymbol{\theta} \in \mathbb{R}^k be a parameter vector such that no division by zero occurs. Then for each j \in \{1, \ldots, k\}:
\operatorname{eval}(\partial_j \mathcal{E}, \boldsymbol{\theta}) = \frac{\partial}{\partial \theta_j} \operatorname{eval}(\mathcal{E}, \boldsymbol{\theta}).
Proof. By structural induction on \mathcal{E}.
Base cases. For \texttt{Const}(c): \operatorname{eval}(\partial_j \texttt{Const}(c), \boldsymbol{\theta}) = \operatorname{eval}(\texttt{Const}(0), \boldsymbol{\theta}) = 0 = \frac{\partial c}{\partial \theta_j}. For \texttt{Param}(j'): \operatorname{eval}(\partial_j \texttt{Param}(j'), \boldsymbol{\theta}) = \delta_{j,j'} = \frac{\partial \theta_{j'}}{\partial \theta_j}. For \texttt{Dot}(\mathbf{c}, \mathbf{j}): \operatorname{eval}(\partial_j \texttt{Dot}(\mathbf{c}, \mathbf{j}), \boldsymbol{\theta}) = c_i if j = j_i, or 0 otherwise. This equals \frac{\partial}{\partial \theta_j} \sum_{i} c_i \theta_{j_i} by linearity of differentiation.
Inductive cases. For \texttt{Add}: by the induction hypothesis and the sum rule, \operatorname{eval}(\partial_j \texttt{Add}(\mathcal{E}_L, \mathcal{E}_R), \boldsymbol{\theta}) = \operatorname{eval}(\partial_j \mathcal{E}_L, \boldsymbol{\theta}) + \operatorname{eval}(\partial_j \mathcal{E}_R, \boldsymbol{\theta}) = \frac{\partial}{\partial \theta_j}(\operatorname{eval}(\mathcal{E}_L, \boldsymbol{\theta}) + \operatorname{eval}(\mathcal{E}_R, \boldsymbol{\theta})). The \texttt{Sub} case is identical with subtraction.
For \texttt{Mul}: let f = \operatorname{eval}(\mathcal{E}_L, \boldsymbol{\theta}) and g = \operatorname{eval}(\mathcal{E}_R, \boldsymbol{\theta}). By the induction hypothesis, \operatorname{eval}(\partial_j \mathcal{E}_L, \boldsymbol{\theta}) = f' and \operatorname{eval}(\partial_j \mathcal{E}_R, \boldsymbol{\theta}) = g'. Then \operatorname{eval}(\partial_j \texttt{Mul}(\mathcal{E}_L, \mathcal{E}_R), \boldsymbol{\theta}) = f \cdot g' + g \cdot f' = \frac{\partial}{\partial \theta_j}(f \cdot g) by the product rule.
For \texttt{Div}: by the quotient rule, \frac{\partial}{\partial \theta_j}(f/g) = (g \cdot f' - f \cdot g') / g^2, which matches the definition of \partial_j \texttt{Div}.
For \texttt{Inv}: \frac{\partial}{\partial \theta_j}(1/f) = -f'/f^2 = (0 - f')/f^2, which matches the definition of \partial_j \texttt{Inv}. \square
Lemma 11.1 (CSE preserves evaluation) Let \mathcal{I} be an intern table. For any expression \mathcal{E} and parameter vector \boldsymbol{\theta}:
\operatorname{eval}(\operatorname{intern}(\mathcal{I}, \mathcal{E}), \boldsymbol{\theta}) = \operatorname{eval}(\mathcal{E}, \boldsymbol{\theta}).
Proof. The intern function \operatorname{intern}(\mathcal{I}, \mathcal{E}) returns either \mathcal{E} itself (if no structurally equal expression exists in \mathcal{I}) or an existing expression \mathcal{E}' that is structurally equal to \mathcal{E}. Since structural equality implies identical type, identical data, and recursively structurally equal children, a straightforward structural induction shows that \operatorname{eval}(\mathcal{E}', \boldsymbol{\theta}) = \operatorname{eval}(\mathcal{E}, \boldsymbol{\theta}) for all \boldsymbol{\theta}. \square
Algorithms
11.0.0.1 Expression Evaluation
Description. This algorithm evaluates an expression tree with a concrete parameter vector using iterative post-order traversal. The iterative approach avoids stack overflow on deep trees (which can arise from long chains of eliminations). The algorithm uses an explicit stack of (node, visited) pairs.
Expression Evaluation (Iterative)
1: Let E be an expression tree (Definition 9.1)
2: Let θ = (θ₁, ..., θₖ) be a parameter vector
3: function EvaluateExpression(E, θ)
4: S ← empty stack ▷ Stack of (node, visited) pairs
5: R ← empty stack ▷ Result stack
6: Push(S, (E, false))
7:
8: while S ≠ ∅ do
9: (node, visited) ← Pop(S)
10: if node.type ∈ {CONST, PARAM, DOT} then ▷ Leaf nodes
11: if node.type = CONST then
12: Push(R, node.const_value)
13: else if node.type = PARAM then
14: Push(R, θ[node.param_index])
15: else ▷ DOT
16: s ← 0
17: for i ← 1 to node.n_terms do
18: s ← s + node.coefficients[i] · θ[node.param_indices[i]]
19: end for
20: Push(R, s)
21: end if
22: else if visited = false then ▷ First visit: push children
23: Push(S, (node, true)) ▷ Re-push with visited = true
24: if node.right ≠ null then
25: Push(S, (node.right, false))
26: end if
27: Push(S, (node.left, false))
28: else ▷ Second visit: combine results
29: if node.type = INV then
30: a ← Pop(R)
31: Push(R, 1/a)
32: else
33: b ← Pop(R) ▷ Right operand (pushed first, popped second)
34: a ← Pop(R) ▷ Left operand
35: if node.type = ADD then Push(R, a + b)
36: else if node.type = SUB then Push(R, a - b)
37: else if node.type = MUL then Push(R, a · b)
38: else if node.type = DIV then Push(R, a / b)
39: end if
40: end if
41: end if
42: end while
43:
44: return Pop(R)
45: end function
Correspondence table:
| Pseudocode variable | Math symbol | Code variable (file:function) |
|---|---|---|
| \mathcal{E} | expression tree \mathcal{E} | expr (phasic_symbolic.c:ptd_expr_evaluate_iterative) |
| \boldsymbol{\theta} | parameter vector \boldsymbol{\theta} | params (phasic_symbolic.c:ptd_expr_evaluate_iterative) |
| S | traversal stack | stack (local array in implementation) |
| R | result stack | value_stack (local array in implementation) |
node.type |
\mathcal{E}.\mathrm{type} | expr->type (phasic.h:enum ptd_expr_type) |
CONST, PARAM, … |
\texttt{Const}, \texttt{Param}, … | PTD_EXPR_CONST, PTD_EXPR_PARAM, … |
Complexity. Time: O(|\mathcal{E}|) where |\mathcal{E}| is the number of nodes in the expression tree. Each node is pushed and popped at most twice. Space: O(h) where h is the height of the expression tree (stack depth).
Implementation Notes
Source code mapping:
| Algorithm | File | Function | Notes |
|---|---|---|---|
| Algorithm 11.1 | src/c/phasic_symbolic.c |
ptd_expr_evaluate_iterative |
Iterative evaluation |
| Expression creation | api/c/phasic.h, src/c/phasic_symbolic.c |
ptd_expr_const, ptd_expr_param, ptd_expr_dot, etc. |
Node constructors |
| CSE (Definition 11.3) | src/c/phasic_symbolic.c |
ptd_expr_intern, ptd_expr_add_interned, etc. |
Hash-based intern table |
| Differentiation (Definition 11.4) | src/c/phasic_symbolic.c |
ptd_expr_derivative |
Recursive symbolic differentiation |
| Edge-to-expression | src/c/phasic_symbolic.c |
edge_weight_to_expr (static) |
Converts parameterized edge to DOT expression |
Deviations from pseudocode:
Recursive vs. iterative evaluation. Both
ptd_expr_evaluate(recursive) andptd_expr_evaluate_iterative(iterative, matching Algorithm 11.1) exist. The iterative version is preferred for deep trees. The pseudocode presents only the iterative version.Differentiation is recursive. The
ptd_expr_derivativefunction uses recursive descent (matching Definition 11.4 directly) rather than iterative traversal, since derivative trees are typically much shallower than the original expression trees and are constructed once rather than evaluated repeatedly.Intern table sizing. The intern table is created with 4096 hash buckets initially. This is a tunable parameter; the table resizes if the load factor exceeds a threshold. The pseudocode does not specify hash table internals.
Edge coefficient optimization. The
edge_weight_to_exprfunction optimizes common cases: if all coefficients are zero, it returns \texttt{Const}(w) (the current evaluated weight). If there is a single coefficient equal to 1.0, it returns \texttt{Param}(j) instead of \texttt{Dot}. Otherwise, it constructs a sparse \texttt{Dot} node containing only non-zero coefficients.
Symbol Index
| Symbol | Name | First appearance |
|---|---|---|
| \mathbf{c} | Coefficient vector in \texttt{Dot} node | Definition 11.1 |
| \delta_{j,j'} | Kronecker delta | Definition 11.4 |
| \mathcal{E} | Expression tree | Definition 11.1 |
| \mathcal{E}_L, \mathcal{E}_R | Left and right children of expression node | Definition 11.1 |
| \operatorname{eval} | Expression evaluation function | Definition 11.2 |
| \mathcal{I} | Intern table for CSE | Definition 11.3 |
| \mathbf{j} | Parameter index vector in \texttt{Dot} node | Definition 11.1 |
| \partial_j \mathcal{E} | Symbolic derivative w.r.t. \theta_j | Definition 11.4 |