25 Mixed-Radix State Indexing
Introduction
Many phase-type models arising in population genetics encode structured states: a lineage may carry several discrete properties (population label, allele count, number of descendants) that together define the Markov chain state. Rather than working with arbitrary state vectors \sigma \in \mathbb{Z}^d, it is advantageous to define each property as a bounded integer with known range, and to establish a bijection between the combinatorial product of these ranges and a contiguous linear index. This bijection enables efficient graph construction (Algorithm 10.1) while allowing the user to think in terms of named properties rather than raw indices.
This file formalizes the mixed-radix numbering system (Cormen et al. 2009) used by the StateIndexer, Property, and PropertySet classes. The system assigns each property a “digit” in a positional number system where different digits may have different bases (radices). Conversion between the linear index and the property tuple is O(d) in the number of properties, with no table lookups or hash maps required.
Prerequisites: [01], [02]
Source files:
src/phasic/state_indexing.py(classes:Property,PropertySet,StateIndexer,Slot)
Definitions
Definition 25.1 (Property) A property is a triple P = (\mathrm{name}, a, b) where \mathrm{name} is a string identifier, a \in \mathbb{Z} is the minimum value, and b \in \mathbb{Z} with b \geq a is the maximum value. The base (radix) of the property is \beta(P) = b - a + 1, and the set of valid values is \mathcal{V}(P) = \{a, a+1, \ldots, b\}.
Intuition
A property represents one dimension of a structured state. For instance, a “population” property with a = 1, b = 3 models three populations, while a “descendants” property with a = 0, b = 10 tracks lineage size up to 10. The base \beta(P) is the number of distinct values.Example
Let P_1 = (\texttt{descendants}, 0, 10) and P_2 = (\texttt{population}, 1, 3). Then \beta(P_1) = 11 and \beta(P_2) = 3, so the total number of combinations is 11 \times 3 = 33.Definition 25.2 (PropertySet) A property set is a pair \mathcal{P} = (\mathrm{name}, (P_0, P_1, \ldots, P_{d-1})) where \mathrm{name} is a string identifier and (P_0, \ldots, P_{d-1}) is an ordered tuple of d properties with pairwise distinct names. The state length of \mathcal{P} is
N(\mathcal{P}) = \prod_{j=0}^{d-1} \beta(P_j).
The state space of \mathcal{P} is the Cartesian product \mathcal{V}(P_0) \times \mathcal{V}(P_1) \times \cdots \times \mathcal{V}(P_{d-1}), which has cardinality N(\mathcal{P}).
Intuition
A PropertySet groups properties into a single combinatorial space. The ordering of properties matters: it determines which property is the “least significant digit” in the mixed-radix representation. Property P_0 varies fastest as the linear index increases.Example
With P_0 = (\texttt{descendants}, 0, 2) (\beta = 3) and P_1 = (\texttt{population}, 1, 2) (\beta = 2), the state length is N = 3 \times 2 = 6. The six states are: (0,1), (1,1), (2,1), (0,2), (1,2), (2,2) at indices 0, 1, 2, 3, 4, 5 respectively.Definition 25.3 (Mixed-Radix Index) Let \mathcal{P} = (\mathrm{name}, (P_0, \ldots, P_{d-1})) be a property set. The mixed-radix encoding is the bijection
\phi \colon \mathcal{V}(P_0) \times \cdots \times \mathcal{V}(P_{d-1}) \to \{0, 1, \ldots, N(\mathcal{P}) - 1\}
defined by
\phi(v_0, v_1, \ldots, v_{d-1}) = \sum_{j=0}^{d-1} (v_j - a_j) \cdot \pi_j \tag{25.1}
where a_j is the minimum value of P_j and the radix powers are
\pi_0 = 1, \qquad \pi_j = \prod_{i=0}^{j-1} \beta(P_i) \quad \text{for } j \geq 1. \tag{25.2}
The inverse mapping \phi^{-1} recovers the property values from a linear index n:
v_j = \left\lfloor \frac{n}{\pi_j} \right\rfloor \bmod \beta(P_j) + a_j, \qquad j = 0, \ldots, d-1. \tag{25.3}
Intuition
This is the generalization of standard positional notation to bases that vary by digit position. In decimal (base 10 everywhere), the number 542 means 2 \cdot 1 + 4 \cdot 10 + 5 \cdot 100. Here, each property acts as a digit but may have a different base. Property P_0 is the “ones digit” (varies fastest), P_1 is the next digit, and so on.Example
Continuing the example from Definition 25.2: \pi_0 = 1, \pi_1 = 3. For state (\texttt{descendants}=2, \texttt{population}=2): \phi = (2-0) \cdot 1 + (2-1) \cdot 3 = 5. Inverse: given n = 5, we get v_0 = (5 \mathbin{/} 1) \bmod 3 + 0 = 2 and v_1 = (5 \mathbin{/} 3) \bmod 2 + 1 = 2.Definition 25.4 (StateIndexer) A StateIndexer is a tuple \mathcal{I} = ((\mathcal{P}_1, \ldots, \mathcal{P}_k), (\mathrm{slot}_1, \ldots, \mathrm{slot}_m)) consisting of k \geq 0 property sets and m \geq 0 slots (with k + m \geq 1), where each slot occupies exactly one index position. The total state length is
N(\mathcal{I}) = \sum_{i=1}^{k} N(\mathcal{P}_i) + m.
The index space is partitioned into contiguous blocks: property set \mathcal{P}_i occupies indices [o_i, o_i + N(\mathcal{P}_i)) where o_1 = 0 and o_{i+1} = o_i + N(\mathcal{P}_i), followed by slot indices o_{k+1}, o_{k+1}+1, \ldots, o_{k+1}+m-1.
Intuition
A StateIndexer combines multiple independent state spaces and metadata slots into a single flat index space. This is useful when a model has logically distinct groups of properties – for example, lineage properties and timing metadata – that should be managed independently but stored in a single state vector.Example
Let \mathcal{P}_1 have state length 11 (descendants 0–10) and \mathcal{P}_2 have state length 101 (time bins 0–100), with one slot “epoch”. Then N(\mathcal{I}) = 11 + 101 + 1 = 113. The lineage indices are 0–10, metadata indices are 11–111, and the epoch slot is index 112.Theorems and Proofs
Theorem 25.1 (Mixed-Radix Bijection Correctness) Let \mathcal{P} = (\mathrm{name}, (P_0, \ldots, P_{d-1})) be a property set. The mixed-radix encoding \phi (Definition 25.3) is a bijection from \mathcal{V}(P_0) \times \cdots \times \mathcal{V}(P_{d-1}) to \{0, 1, \ldots, N(\mathcal{P})-1\}, and both \phi and \phi^{-1} can be computed in O(d) time.
Proof. We proceed in three parts.
Well-definedness and range. For any (v_0, \ldots, v_{d-1}) with v_j \in \mathcal{V}(P_j), each encoded digit e_j = v_j - a_j satisfies 0 \leq e_j \leq \beta(P_j) - 1. The value \phi = \sum_{j=0}^{d-1} e_j \pi_j satisfies:
0 \leq \phi \leq \sum_{j=0}^{d-1} (\beta(P_j) - 1) \pi_j = \prod_{j=0}^{d-1} \beta(P_j) - 1 = N(\mathcal{P}) - 1
where the equality \sum_{j=0}^{d-1} (\beta(P_j) - 1) \pi_j = \prod_{j=0}^{d-1} \beta(P_j) - 1 follows by the standard telescoping identity for mixed-radix systems (induction on d: the base case d=1 gives (\beta(P_0)-1) \cdot 1 = \beta(P_0) - 1; for the inductive step, \sum_{j=0}^{d-1} (\beta(P_j)-1)\pi_j = (\beta(P_{d-1})-1)\pi_{d-1} + \sum_{j=0}^{d-2}(\beta(P_j)-1)\pi_j = (\beta(P_{d-1})-1)\pi_{d-1} + \pi_{d-1} - 1 = \beta(P_{d-1})\pi_{d-1} - 1 = N(\mathcal{P}) - 1).
Injectivity. Suppose \phi(v_0, \ldots, v_{d-1}) = \phi(v_0', \ldots, v_{d-1}'). Then \sum_j (v_j - a_j)\pi_j = \sum_j (v_j' - a_j)\pi_j. Since 0 \leq v_j - a_j < \beta(P_j), the representation of any integer in the range [0, N-1) in the mixed-radix system with bases (\beta(P_0), \ldots, \beta(P_{d-1})) is unique (by the division algorithm applied iteratively). Hence v_j = v_j' for all j.
Inverse correctness. Given n = \sum_j e_j \pi_j with 0 \leq e_j < \beta(P_j), we recover e_j = \lfloor n / \pi_j \rfloor \bmod \beta(P_j) by the standard mixed-radix digit extraction. Adding back a_j gives v_j.
Complexity. Computing \phi requires d multiplications and additions. Computing \phi^{-1} requires d integer divisions and modular reductions. The radix powers \pi_j are precomputed once in O(d) time and reused. \square
Algorithms
25.0.0.1 IndexToProperties
Description. Given a linear index n and a property set \mathcal{P}, this algorithm computes the inverse mixed-radix mapping \phi^{-1}(n), returning the property values (v_0, \ldots, v_{d-1}). It uses the precomputed radix powers \pi_j to extract each digit via integer division and modular reduction.
IndexToProperties
1: Let P = (P₀, ..., P_{d-1}) be properties with bases β(P_j) and min values a_j
2: Let π₀ = 1, π_j = ∏_{i=0}^{j-1} β(Pᵢ) be precomputed radix powers
3: Let n ∈ {0, ..., N(P)-1} be the linear index
4: function IndexToProperties(n, P, π)
5: for j ← 0 to d-1 do
6: e_j ← ⌊n / π_j⌋ mod β(P_j) ▷ Extract encoded digit
7: v_j ← e_j + a_j ▷ Decode to property value
8: end for
9: return (v₀, v₁, ..., v_{d-1})
10: end function
Correspondence table:
| Pseudocode variable | Math symbol | Code variable (file:function) |
|---|---|---|
| n | index n \in \{0, \ldots, N-1\} | index (state_indexing.py:PropertySet.index_to_props) |
| P_j | property P_j | self.properties[j] (state_indexing.py:PropertySet) |
| \beta(P_j) | base of property j | prop.base (state_indexing.py:Property.base) |
| a_j | minimum value of P_j | prop.min_value (state_indexing.py:Property) |
| \pi_j | radix power | self._radix_powers[j] (state_indexing.py:PropertySet) |
| e_j | encoded digit | encoded[i] (state_indexing.py:PropertySet.index_to_props) |
| v_j | decoded property value | decoded_values[prop.name] (state_indexing.py:PropertySet.index_to_props) |
Complexity. Time: O(d) where d is the number of properties. Space: O(d) for the output vector.
25.0.0.2 PropertiesToIndex
Description. Given a tuple of property values (v_0, \ldots, v_{d-1}) and a property set \mathcal{P}, this algorithm computes the forward mixed-radix mapping \phi(v_0, \ldots, v_{d-1}). Each value is validated against its property bounds, encoded by subtracting the minimum value, and combined via a dot product with the precomputed radix powers.
PropertiesToIndex
1: Let P = (P₀, ..., P_{d-1}) be properties with bases β(P_j) and min values a_j
2: Let π₀ = 1, π_j = ∏_{i=0}^{j-1} β(Pᵢ) be precomputed radix powers
3: Let (v₀, ..., v_{d-1}) be property values with v_j ∈ V(P_j)
4: function PropertiesToIndex((v₀, ..., v_{d-1}), P, π)
5: n ← 0
6: for j ← 0 to d-1 do
7: if v_j < a_j or v_j > b_j then ▷ Validate bounds
8: error "value out of range"
9: end if
10: e_j ← v_j - a_j ▷ Encode to 0-based digit
11: n ← n + e_j · π_j ▷ Accumulate weighted digit
12: end for
13: return n
14: end function
Correspondence table:
| Pseudocode variable | Math symbol | Code variable (file:function) |
|---|---|---|
| (v_0, \ldots, v_{d-1}) | property value tuple | props (dict or array) (state_indexing.py:PropertySet.props_to_index) |
| a_j, b_j | min/max value of P_j | prop.min_value, prop.max_value (state_indexing.py:Property) |
| e_j | encoded digit | prop.encode_value(props[prop.name]) (state_indexing.py:Property.encode_value) |
| \pi_j | radix power | self._radix_powers (state_indexing.py:PropertySet) |
| n | linear index | return value of np.dot(encoded, self._radix_powers) (state_indexing.py:PropertySet.props_to_index) |
Complexity. Time: O(d) where d is the number of properties. Space: O(d) for the encoded digit array.
Implementation Notes
Source code mapping:
| Algorithm | File | Function | Notes |
|---|---|---|---|
| Algorithm 25.1 | src/phasic/state_indexing.py |
PropertySet.index_to_props |
L300–L422 |
| Algorithm 25.2 | src/phasic/state_indexing.py |
PropertySet.props_to_index |
L428–L577 |
| Radix power precomputation | src/phasic/state_indexing.py |
PropertySet.__init__ |
L288–L289 |
| Property validation | src/phasic/state_indexing.py |
Property.validate_value |
L113–L131 |
Deviations from pseudocode:
Vectorized array input. The implementation of
index_to_propsaccepts NumPy arrays of indices and processes them in a vectorized loop, extracting all digits for all indices simultaneously using broadcasting. The pseudocode presents the scalar case for clarity.Partial property specification. The
props_to_indexmethod supports partial specifications (only some properties given), returning an array of all matching indices. This is implemented viaitertools.productover unspecified property ranges. The pseudocode presents only the complete specification case; the partial case is a straightforward enumeration over the Cartesian product of unspecified properties.Multiple output formats.
index_to_propssupports three output formats: a dynamically-created frozen dataclass (default), a dictionary (as_dict=True), and a raw NumPy array (as_values=True). The pseudocode returns a plain tuple. These formats are presentation-level differences and do not affect the underlying algorithm.StateIndexer composition. The
StateIndexerclass manages multiplePropertySetinstances andSlotobjects with offset-based addressing. This is a simple concatenation of index ranges (Definition 25.4) and does not involve algorithmic complexity beyond tracking cumulative offsets.
Symbol Index
| Symbol | Name | First appearance |
|---|---|---|
| a_j | Minimum value of property P_j | Definition 25.1 |
| b_j | Maximum value of property P_j | Definition 25.1 |
| \beta(P) | Base (radix) of a property | Definition 25.1 |
| d | Number of properties in a PropertySet | Definition 25.2 |
| e_j | Encoded digit (0-based offset from a_j) | Definition 25.3 |
| \mathcal{I} | StateIndexer | Definition 25.4 |
| N(\mathcal{P}) | State length of a PropertySet | Definition 25.2 |
| \mathcal{P} | PropertySet | Definition 25.2 |
| P_j | Property (named bounded integer dimension) | Definition 25.1 |
| \phi | Mixed-radix encoding bijection | Definition 25.3 |
| \phi^{-1} | Mixed-radix decoding (inverse bijection) | Definition 25.3 |
| \pi_j | Radix power (cumulative product of bases) | Definition 25.3 |
| \mathcal{V}(P) | Set of valid values for property P | Definition 25.1 |