petri_net_nn.compiler¶
compiler ¶
Compile a PetriNet into a differentiable nn.Module.
Implements §4 of the architecture spec. The compiled module instantiates the continuous relaxation from §4.2 directly over the net's flow relation:
activation(t) = sigmoid( sharpness * (sum_p w(p,t)*a(p) - theta(t)) )
a(p) = sum_{t: (t,p) in F} activation(t) * w(t,p)
The structural constraint from §4.3 — "weights outside this structure are zero by construction and cannot be learned away from zero" — holds because the module allocates exactly one learnable scalar per arc in F and one threshold per transition in T. There is no global weight matrix with a mask; the parameters that don't exist literally don't exist.
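As a rough illustration of the two equations and the one-scalar-per-arc layout, here is a minimal pure-Python sketch. The toy net (`p1`, `p2`, `t1`, `p3`), the dictionary layout, and the helper names are assumptions made for this example; the real module stores its parameters as `nn.Parameter` objects and is not shown here.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical toy net: places p1 and p2 feed transition t1, which feeds p3.
# One scalar per arc in F and one threshold per transition in T; no entry
# exists for any pair that is not an arc.
arc_weight = {("p1", "t1"): 1.0, ("p2", "t1"): 1.0, ("t1", "p3"): 1.0}
theta = {"t1": 1.5}

def transition_activation(t, place_act, sharpness=1.0):
    # sigmoid(sharpness * (sum_p w(p,t) * a(p) - theta(t)))
    total = sum(w * place_act[src]
                for (src, dst), w in arc_weight.items() if dst == t)
    return sigmoid(sharpness * (total - theta[t]))

def place_activation(p, trans_act):
    # a(p) = sum over arcs (t, p) in F of activation(t) * w(t,p)
    return sum(w * trans_act[src]
               for (src, dst), w in arc_weight.items() if dst == p)

both = transition_activation("t1", {"p1": 1.0, "p2": 1.0})  # both inputs marked
one = transition_activation("t1", {"p1": 1.0, "p2": 0.0})   # only one input marked
a_p3 = place_activation("p3", {"t1": both})
```

With a threshold of 1.5, the transition's activation is above 0.5 only when both input places are active. A pair such as `("p1", "p3")` has no entry at all, so there is nothing for training to move away from zero.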
Two forward-pass modes:
- num_steps == 0 (default): acyclic mode. The constructor topologically sorts (P ∪ T, F) and forward does a single propagation pass in that order. The §4.2 equations are evaluated exactly once per node. Cyclic nets are rejected at construction.
- num_steps > 0: time-unrolled mode. The constructor skips the topological sort, so cyclic nets are accepted. Forward initialises place activations from the input marking / M_0 and then performs num_steps synchronous updates (each step: refresh every transition's activation from current place activations, then refresh every non-source place's activation from the new transition activations). Source places (those with an empty preset) are clamped to their input value at every step, so they behave as a persistent input layer.
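The synchronous update used in time-unrolled mode can be sketched in plain Python as follows. The cyclic toy net, the function name `unrolled_forward`, and the choice to initialise unmarked places at 0.0 are assumptions for illustration only; the actual module operates on batched tensors.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical cyclic net: src -> t_in -> p_loop -> t_loop -> p_loop
arcs = {("src", "t_in"): 1.0, ("t_in", "p_loop"): 1.0,
        ("p_loop", "t_loop"): 1.0, ("t_loop", "p_loop"): 0.5}
theta = {"t_in": 0.5, "t_loop": 0.5}
places = ["src", "p_loop"]
transitions = ["t_in", "t_loop"]

def unrolled_forward(input_marking, num_steps, sharpness=1.0):
    # Source places are those with an empty preset (no incoming arc).
    sources = {p for p in places if not any(dst == p for (_, dst) in arcs)}
    # Assumption: places missing from the marking start at 0.0.
    act = {p: input_marking.get(p, 0.0) for p in places}
    for _ in range(num_steps):
        # 1) refresh every transition from the current place activations
        fire = {t: sigmoid(sharpness * (sum(w * act[s]
                                            for (s, d), w in arcs.items() if d == t)
                                        - theta[t]))
                for t in transitions}
        # 2) refresh non-source places; clamp sources to their input value
        act = {p: input_marking.get(p, 0.0) if p in sources
               else sum(w * fire[s] for (s, d), w in arcs.items() if d == p)
               for p in places}
    return act

out = unrolled_forward({"src": 1.0}, num_steps=3)
```

Because `src` is re-clamped on every step, it keeps driving the loop for the whole unrolled horizon instead of decaying after the first update.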
Coloured-Petri-net layer¶
When a transition has a structural guard (declarative
{place, op, value} form), the compiler builds a learnable
soft-guard alongside the standard firing equation: an
nn.Parameter threshold initialised at value and a sigmoid
gate that multiplies the transition's firing strength by
sigmoid(s * sign * (value(place) - threshold)) (sign = +1 for
>/>=, −1 for </<=). The threshold trains end-to-end
with the rest of the network, so the model can refine the
declared boundary from execution traces. Guards declared as opaque
callables are skipped by the compiler: the token-game still
uses them, but they take no part in training.
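A minimal sketch of the soft-guard gate described above, in scalar Python. The names `soft_guard` and `gated_firing` and the example numbers are hypothetical; in the compiled module the threshold would be an `nn.Parameter` and the inputs batched tensors.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# sign = +1 for > and >=, -1 for < and <=
SIGN = {">": 1.0, ">=": 1.0, "<": -1.0, "<=": -1.0}

def soft_guard(value, threshold, op, s=1.0):
    # sigmoid(s * sign * (value(place) - threshold))
    return sigmoid(s * SIGN[op] * (value - threshold))

def gated_firing(firing_strength, value, threshold, op, s=1.0):
    # The gate multiplies the transition's ordinary firing strength.
    return firing_strength * soft_guard(value, threshold, op, s)

at_boundary = soft_guard(10.0, 10.0, ">")  # value exactly on the threshold
above = soft_guard(12.0, 10.0, ">")        # guard mostly satisfied
flipped = soft_guard(12.0, 10.0, "<")      # same value, opposite comparison
```

At the declared boundary the gate is exactly 0.5, and flipping the comparison operator mirrors the gate around that point.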
To make value-conditioned routing trainable, the forward pass
carries a per-place value channel alongside the activation
channel. Source-place values come from the optional input_values
argument (default 1.0). Non-source places combine the values
arriving on their incoming arcs into an activation-weighted average
— the natural soft-token analogue of "what value would this place
hold right now if a token were here." Output-arc values may be
declared on the net (arc_output_values); constant scalars are
honoured by the compiler, callable transforms are evaluated only
in the discrete coloured token-game and treated as the default
value 1.0 in the differentiable forward pass.
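The activation-weighted average for a non-source place might look like the sketch below. It assumes each incoming arc contributes a pair (transition activation, value arriving on that arc) and that a small epsilon guards against division by zero; whether the real implementation also folds in the arc weight, and which epsilon it uses, is not stated in this documentation.

```python
def place_value(incoming, eps=1e-8):
    """Soft-token value of one place.

    incoming: list of (transition_activation, arc_value) pairs,
    one per incoming arc (hypothetical layout for this sketch).
    """
    total = sum(a for a, _ in incoming)
    weighted = sum(a * v for a, v in incoming)
    return weighted / (total + eps)  # eps is an assumption, not from the source

# A strongly firing transition delivering 100.0 dominates a weak one delivering 200.0.
mixed = place_value([(0.9, 100.0), (0.1, 200.0)])
# If every arc carries the default value 1.0, the place value stays close to 1.0.
plain = place_value([(0.5, 1.0), (0.25, 1.0)])
```

The result leans toward the value carried by whichever transition is currently firing most strongly, which is what makes downstream guards on that place sensitive to routing decisions upstream.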
The guard sigmoid scales sharpness by 1 / max(|theta_init|, 1.0)
so the initial gradient at the boundary is O(1) regardless of
the units the modeller used. The same SharpnessScheduler from
Phase 6 sharpens guards alongside firing transitions during
training.
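To see what the 1 / max(|theta_init|, 1.0) factor buys, the sketch below evaluates the same guard expressed in two hypothetical unit systems (thousands versus single units). The function name `guard_gate` and the numbers are illustrative assumptions.

```python
import math

def sigmoid(x):
    # Numerically stable for large |x|
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def guard_gate(value, threshold, theta_init, s=1.0, sign=1.0):
    # Sharpness is scaled by 1 / max(|theta_init|, 1.0)
    scale = 1.0 / max(abs(theta_init), 1.0)
    return sigmoid(s * scale * sign * (value - threshold))

# Same guard, value 10% above the threshold, in two unit systems.
g_small = guard_gate(55.0, 50.0, theta_init=50.0)
g_large = guard_gate(55_000.0, 50_000.0, theta_init=50_000.0)

# Without the scaling, the large-unit gate saturates immediately.
g_unscaled = sigmoid(55_000.0 - 50_000.0)
```

Both scaled gates see the same sigmoid argument (0.1 times the sharpness), so a given relative deviation from the threshold produces the same response in either unit system, while the unscaled gate is pinned at 1.0 and would pass almost no gradient.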
PetriNetModule ¶
Bases: Module
Differentiable neural network whose topology is exactly a PetriNet.
Parameters¶
net :
A well-formed PetriNet. Nets that fail validation are rejected at
construction time; cyclic nets are rejected only when
num_steps == 0.
sharpness :
Multiplier inside the sigmoid (§4.2 has no such factor; this is a
training aid for AND-join–shaped transitions where a near-step
activation is needed — see §5 Subnet 4). Default 1.0 keeps the
forward pass faithful to §4.2 verbatim.
num_steps :
0 (default) selects acyclic single-pass mode; any positive
integer selects time-unrolled mode and accepts cyclic nets.
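The interaction between num_steps and cycle handling can be sketched with the standard library's `graphlib`. The function `propagation_order` and the use of `ValueError` are assumptions for illustration; the exception type raised by the real constructor is not documented here.

```python
from graphlib import TopologicalSorter, CycleError

def propagation_order(arcs, num_steps=0):
    """Return a topological node order, or None in time-unrolled mode."""
    if num_steps > 0:
        return None  # time-unrolled mode: sort skipped, cycles accepted
    # Map every node to the set of its predecessors.
    preds = {}
    for src, dst in arcs:
        preds.setdefault(dst, set()).add(src)
        preds.setdefault(src, set())
    try:
        return list(TopologicalSorter(preds).static_order())
    except CycleError as exc:
        # Hypothetical error type for this sketch.
        raise ValueError("cyclic net requires num_steps > 0") from exc

acyclic = [("p1", "t1"), ("t1", "p2")]
cyclic = acyclic + [("p2", "t2"), ("t2", "p1")]

order = propagation_order(acyclic)
try:
    propagation_order(cyclic)
    rejected = False
except ValueError:
    rejected = True
unrolled = propagation_order(cyclic, num_steps=3)
```

The same cyclic net that fails in single-pass mode is accepted as soon as a positive num_steps is given, because no ordering is needed when every node is refreshed on every step.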
Source code in petri_net_nn/compiler.py
forward ¶
Run a forward pass.
In acyclic mode (num_steps == 0), performs a single
propagation pass in topological order. In time-unrolled mode, returns the
activations after self.num_steps synchronous updates.
input_marking overrides any place's activation. In
time-unrolled mode the override is re-applied at every step,
which is how you clamp a "persistent input" through the
unrolled dynamics — equivalent to the §7.1 "predict next
activations from a partial execution" use case.
input_values feeds the per-place value channel that the
coloured-Petri-net layer reads. Each entry is a 1D tensor of
shape (batch_size,) giving the scalar value carried by
the token at that source place — loan amount, signal
strength, sensor reading, whatever the modeller chose. Any
source place absent from this dict defaults to value 1.0
(the value-carrying-no-information case, equivalent to a
plain unannotated token).