
petri_net_nn.traces

traces

Utilities for training PetriNetModule on XES execution logs.

This module is the bridge from §10 Step 3 (XES logs as training data) to §7.1 (process execution prediction) and §7.2 (anomaly detection).

The training signal is straightforward: each trace tells us which task transitions actually fired during that process instance. We turn that into a per-trace target vector — 1.0 for transitions whose label matches an event in the trace, 0.0 for the rest — and fit the compiled network so its forward-pass transition activations approach that vector.

Auto-generated gateway transitions (those whose label contains "->", which parse_bpmn emits for XOR-split / XOR-join branches) are excluded from the supervision by default, since XES logs don't record gateway firings.
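As an illustration of that default selection, here is a minimal sketch. It assumes the transitions are available as a mapping from transition id to label; the helper name `scored_transition_ids` and the sample ids are hypothetical, and the library's own selection lives in a private helper that is not shown on this page.

```python
def scored_transition_ids(labels: dict[str, str]) -> list[str]:
    """Keep transitions whose label does not look auto-generated.

    Assumption: a label containing "->" marks a gateway transition
    emitted for an XOR-split / XOR-join branch.
    """
    return [tid for tid, label in labels.items() if "->" not in label]


# Hypothetical labels for a three-transition net.
labels = {
    "t_review": "Review claim",
    "t_gw_1": "gw_split->Approve",  # gateway branch, never logged in XES
    "t_approve": "Approve",
}
print(scored_transition_ids(labels))  # ['t_review', 't_approve']
```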

Anomaly detection (§7.2): once trained, anomaly_score reports the per-transition absolute deviation between the network's prediction and the trace's observed occurrence vector. An out-of-distribution trace, one whose path through the process doesn't match what the learned weights predict, produces large residuals on the transitions where it diverges. §7.2 calls this "interpretable" at the granularity of the BPMN element.

SharpnessScheduler

SharpnessScheduler(module, *, start=1.0, end=8.0, num_steps, kind='linear')

Anneal PetriNetModule.sharpness over training.

The continuous relaxation in §4.2 trades faithfulness for gradient flow: a low sharpness gives smooth sigmoids that train well but don't really enforce step-like firing; a high sharpness gives near-step firing but with small gradients far from the threshold. Annealing — start low, finish high — gets both benefits in one training run.

Use it like a learning-rate scheduler: call .step() once per optimizer step and the module's sharpness attribute is updated in place. The forward pass picks up the new value on its next call.
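The trade-off can be made concrete with a small sketch. The helper `sharpness_at` below is hypothetical: the body of `.step()` is not shown on this page, so the interpolation formulas and the clamping past `num_steps` are assumptions. `sigmoid_slope` likewise assumes the relaxed firing function has the form sigmoid(sharpness * x), with x the distance from the threshold.

```python
import math


def sharpness_at(step: int, *, start: float = 1.0, end: float = 8.0,
                 num_steps: int, kind: str = "linear") -> float:
    """Sharpness after `step` scheduler steps (hypothetical helper)."""
    frac = min(max(step / num_steps, 0.0), 1.0)  # clamp progress to [0, 1]
    if kind == "linear":
        return start + (end - start) * frac
    if kind == "exponential":
        # Geometric interpolation; needs start > 0 and end > 0.
        return start * (end / start) ** frac
    raise ValueError(f"kind must be 'linear' or 'exponential', got {kind!r}")


def sigmoid_slope(x: float, sharpness: float) -> float:
    """Derivative of sigmoid(sharpness * x) with respect to x."""
    s = 1.0 / (1.0 + math.exp(-sharpness * x))
    return sharpness * s * (1.0 - s)


for step in (0, 50, 100):
    k = sharpness_at(step, num_steps=100)
    # Slope at the threshold (x = 0) and one unit away from it (x = 1).
    print(step, k, sigmoid_slope(0.0, k), sigmoid_slope(1.0, k))
```

Under these assumptions the slope at the threshold grows with sharpness while the slope one unit away shrinks, which is why starting low and finishing high is attractive.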

Source code in petri_net_nn/traces.py
def __init__(
    self,
    module,
    *,
    start: float = 1.0,
    end: float = 8.0,
    num_steps: int,
    kind: str = "linear",
) -> None:
    if kind not in ("linear", "exponential"):
        raise ValueError(f"kind must be 'linear' or 'exponential', got {kind!r}")
    if num_steps <= 0:
        raise ValueError(f"num_steps must be positive, got {num_steps}")
    if kind == "exponential" and (start <= 0 or end <= 0):
        raise ValueError(
            "exponential schedule requires positive start and end"
        )
    self.module = module
    self.start = float(start)
    self.end = float(end)
    self.num_steps = num_steps
    self.kind = kind
    self._step = 0
    module.sharpness = self.start

trace_occurrence_vector

trace_occurrence_vector(net, trace, transitions)

Return a 0/1 tensor indicating which of transitions fired in trace — matched by transition label against each event's concept:name. Unknown event names are silently ignored.
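The matching rule can be sketched without the library, using plain lists in place of tensors. The function name `occurrence_vector`, the label map, and the event names are illustrative assumptions; the real function derives the label map from the net.

```python
def occurrence_vector(event_names: list[str], transitions: list[str],
                      label_to_transition: dict[str, str]) -> list[float]:
    """1.0 for each transition whose label appears in the trace, else 0.0."""
    # Event names with no matching label are skipped, not treated as errors.
    fired = {label_to_transition[n] for n in event_names
             if n in label_to_transition}
    return [1.0 if t in fired else 0.0 for t in transitions]


# Hypothetical label map and trace.
label_to_transition = {
    "Register": "t_register",
    "Approve": "t_approve",
    "Reject": "t_reject",
}
transitions = ["t_register", "t_approve", "t_reject"]
events = ["Register", "Approve", "Approve", "Archive"]  # "Archive" is unknown

print(occurrence_vector(events, transitions, label_to_transition))
# [1.0, 1.0, 0.0]
```

Because fired transitions are collected in a set, a repeated event still yields 1.0: the vector records occurrence, not frequency.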

Source code in petri_net_nn/traces.py
def trace_occurrence_vector(
    net: PetriNet,
    trace: XESTrace,
    transitions: list[str],
) -> torch.Tensor:
    """Return a 0/1 tensor indicating which of ``transitions`` fired in
    ``trace`` — matched by transition label against each event's
    ``concept:name``. Unknown event names are silently ignored."""
    label_map = _label_to_transition(net, transitions)
    fired = {label_map[e.name] for e in trace.events if e.name in label_map}
    return torch.tensor([1.0 if t in fired else 0.0 for t in transitions])

train_on_traces

train_on_traces(module, traces, *, attribute_to_marking, attribute_to_values=None, steps=500, lr=0.1, transitions=None)

Fit module so its transition activations match each trace's observed occurrence vector under that trace's derived input marking.

The default supervision target is every transition whose label doesn't look auto-generated (i.e. doesn't contain ->); pass transitions to override that selection. Returns the per-step loss trajectory.

attribute_to_values is the coloured-Petri-net analogue of attribute_to_marking: it returns the scalar value carried by the token at each source place for a given trace. Optional — omit it for plain (uncoloured) training. When present, the compiled module's learnable structural-guard thresholds train against these values, so the model can refine the declared threshold from data.
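The objective in the source listing is a mean binary cross-entropy between clamped activations and the 0/1 targets. A pure-Python sketch of that quantity is given here for intuition only; the name `clamped_bce` is hypothetical and nested lists stand in for the (traces × transitions) tensors.

```python
import math


def clamped_bce(predicted: list[list[float]], targets: list[list[float]],
                eps: float = 1e-6) -> float:
    """Mean binary cross-entropy with predictions clamped to [eps, 1 - eps]."""
    total, count = 0.0, 0
    for p_row, y_row in zip(predicted, targets):
        for p, y in zip(p_row, y_row):
            p = min(max(p, eps), 1.0 - eps)  # keep both log() calls finite
            total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
            count += 1
    return total / count


# Two traces, two scored transitions each (made-up activations).
targets = [[1.0, 0.0], [1.0, 1.0]]
good = [[0.9, 0.1], [0.8, 0.9]]
bad = [[0.2, 0.7], [0.5, 0.1]]
print(clamped_bce(good, targets), clamped_bce(bad, targets))
```

The clamp matters when an activation saturates at exactly 0 or 1: without it, a confident wrong prediction would produce an infinite loss.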

Source code in petri_net_nn/traces.py
def train_on_traces(
    module: PetriNetModule,
    traces: list[XESTrace],
    *,
    attribute_to_marking: AttributeToMarking,
    attribute_to_values: AttributeToValues | None = None,
    steps: int = 500,
    lr: float = 0.1,
    transitions: list[str] | None = None,
) -> list[float]:
    """Fit ``module`` so its transition activations match each trace's
    observed occurrence vector under that trace's derived input marking.

    The default supervision target is every transition whose label
    doesn't look auto-generated (i.e. doesn't contain ``->``); pass
    ``transitions`` to override that selection. Returns the per-step
    loss trajectory.

    ``attribute_to_values`` is the coloured-Petri-net analogue of
    ``attribute_to_marking``: it returns the scalar value carried by
    the token at each source place for a given trace. Optional —
    omit it for plain (uncoloured) training. When present, the
    compiled module's learnable structural-guard thresholds train
    against these values, so the model can refine the declared
    threshold from data."""
    net = module.net
    scored = transitions if transitions is not None else _scored_transitions(net)
    if not scored:
        raise ValueError("no transitions to score against")

    device = next(module.parameters()).device
    input_marking = _stack_input_markings(
        [attribute_to_marking(t) for t in traces], device
    )
    input_values = (
        _stack_input_markings(
            [attribute_to_values(t) for t in traces], device
        )
        if attribute_to_values is not None
        else None
    )
    targets = torch.stack(
        [trace_occurrence_vector(net, t, scored) for t in traces]
    ).to(device)

    opt = torch.optim.Adam(module.parameters(), lr=lr)
    losses: list[float] = []
    for _ in range(steps):
        opt.zero_grad()
        out = module(
            input_marking=input_marking,
            input_values=input_values,
            batch_size=len(traces),
        )
        predicted = torch.stack([out[t] for t in scored], dim=1)
        loss = F.binary_cross_entropy(predicted.clamp(1e-6, 1 - 1e-6), targets)
        loss.backward()
        opt.step()
        losses.append(loss.item())
    return losses

sweep_trace_count

sweep_trace_count(module_factory, traces, *, attribute_to_marking, sample_sizes, steps=500, lr=0.1)

Train a fresh module on the first N traces for each N in sample_sizes and return the final loss for each. Used to answer §8's "training data requirements" question — plotting the returned dict shows how quickly each subnet shape converges as more traces become available.

module_factory must return a fresh module on each call so that the runs don't share parameters.
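The prefix-based sweep can be sketched with a toy stand-in for training. Everything here is illustrative: `sweep_prefixes` and `toy_final_loss` are hypothetical names, and the toy loss only mimics "more data, lower loss" so the loop can run without the library.

```python
from typing import Callable, Sequence


def sweep_prefixes(items: Sequence, sample_sizes: list[int],
                   fit: Callable[[Sequence], float]) -> dict[int, float]:
    """Call `fit` on the first n items for each n; NaN for an empty prefix."""
    results: dict[int, float] = {}
    for n in sample_sizes:
        subset = items[:n]  # slicing caps at len(items)
        results[n] = fit(subset) if subset else float("nan")
    return results


def toy_final_loss(subset: Sequence) -> float:
    """Stand-in for train_on_traces(...)[-1]; not a real training run."""
    return 1.0 / len(subset)


results = sweep_prefixes(list(range(10)), [0, 5, 10, 20], toy_final_loss)
print(results)
```

Two consequences of slicing are worth keeping in mind when plotting: a sample size of 0 yields `nan`, and a sample size larger than the log simply repeats the full-log run.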

Source code in petri_net_nn/traces.py
def sweep_trace_count(
    module_factory,
    traces: list[XESTrace],
    *,
    attribute_to_marking: AttributeToMarking,
    sample_sizes: list[int],
    steps: int = 500,
    lr: float = 0.1,
) -> dict[int, float]:
    """Train a fresh module on the first N traces for each N in
    ``sample_sizes`` and return the final loss for each. Used to
    answer §8's "training data requirements" question — plotting
    the returned dict shows how quickly each subnet shape converges
    as more traces become available.

    ``module_factory`` must return a fresh module on each call so that
    the runs don't share parameters."""
    results: dict[int, float] = {}
    for n in sample_sizes:
        subset = traces[:n]
        if not subset:
            results[n] = float("nan")
            continue
        module = module_factory()
        losses = train_on_traces(
            module,
            subset,
            attribute_to_marking=attribute_to_marking,
            steps=steps,
            lr=lr,
        )
        results[n] = losses[-1]
    return results

trace_anomaly_score

trace_anomaly_score(module, trace, *, attribute_to_marking, attribute_to_values=None, transitions=None)

Trace-level scalar anomaly score: the sum of the per-transition absolute residuals returned by anomaly_score. Higher means more anomalous; useful for ranking traces and computing AUC.

Source code in petri_net_nn/traces.py
def trace_anomaly_score(
    module: PetriNetModule,
    trace: XESTrace,
    *,
    attribute_to_marking: AttributeToMarking,
    attribute_to_values: AttributeToValues | None = None,
    transitions: list[str] | None = None,
) -> float:
    """Trace-level scalar anomaly score: the sum of per-transition
    absolute residuals returned by :func:`anomaly_score`. Higher means
    more anomalous; useful for ranking traces and computing AUC."""
    per_transition = anomaly_score(
        module, trace,
        attribute_to_marking=attribute_to_marking,
        attribute_to_values=attribute_to_values,
        transitions=transitions,
    )
    return sum(per_transition.values())

expected_cost

expected_cost(module, transition_costs, *, input_marking=None, batch_size=None)

Expected cost-to-completion under a trained module.

For each transition with a configured cost weight, multiply its forward-pass activation by the weight and sum. This is the realised-execution cost promised by the framing bullet in §6: a number you can compute for two bisimilar variants under the same trace distribution and use to rank them by total cost.

Transitions absent from transition_costs contribute zero; transitions in the dict that aren't in the net are ignored (caller's responsibility to keep them aligned).
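A scalar sketch of that weighted sum, with plain floats standing in for the per-batch activation tensors the real function works on. The name `weighted_cost`, the transition ids, and the numbers are made up for illustration.

```python
def weighted_cost(activations: dict[str, float],
                  transition_costs: dict[str, float]) -> float:
    """Sum of activation * cost over transitions present in both dicts."""
    return sum(
        activations[t] * float(cost)
        for t, cost in transition_costs.items()
        if t in activations  # costs for unknown transitions are ignored
    )


# Hypothetical forward-pass activations and cost weights.
activations = {"t_review": 1.0, "t_approve": 0.5, "t_reject": 0.5}
transition_costs = {"t_review": 10.0, "t_approve": 2.0, "t_archive": 99.0}

print(weighted_cost(activations, transition_costs))  # 11.0
```

Here `t_reject` contributes nothing because it has no cost weight, and `t_archive` is skipped because the net has no such transition, so a typo in a cost key silently drops that cost.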

Source code in petri_net_nn/traces.py
def expected_cost(
    module,
    transition_costs: dict[str, float],
    *,
    input_marking: dict[str, "torch.Tensor"] | None = None,
    batch_size: int | None = None,
) -> "torch.Tensor":
    """Expected cost-to-completion under a trained module.

    For each transition with a configured cost weight, multiply its
    forward-pass activation by the weight and sum. This is the
    realised-execution cost the §6-framing-bullet promises: a number
    you can read off two bisimilar variants under the same trace
    distribution and use to rank them by total cost.

    Transitions absent from ``transition_costs`` contribute zero;
    transitions in the dict that aren't in the net are ignored
    (caller's responsibility to keep them aligned)."""
    out = module(input_marking=input_marking, batch_size=batch_size)
    total = None
    for t, cost in transition_costs.items():
        if t not in out:
            continue
        contribution = out[t] * float(cost)
        total = contribution if total is None else total + contribution
    if total is None:
        return torch.zeros(batch_size or 1, device=module._device())
    return total

auc

auc(positive_scores, negative_scores)

Mann-Whitney U / ROC AUC: the probability that a randomly drawn positive (anomalous) trace scores higher than a randomly drawn negative (normal) trace. Ties count as half. Returns nan if either list is empty.
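The same statistic written as an explicit pairwise loop, which may be easier to check by hand than the broadcast version in the source listing. `pairwise_auc` is a hypothetical name and the scores are made up.

```python
def pairwise_auc(positive_scores: list[float],
                 negative_scores: list[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    if not positive_scores or not negative_scores:
        return float("nan")
    wins = sum(1 for p in positive_scores for n in negative_scores if p > n)
    ties = sum(1 for p in positive_scores for n in negative_scores if p == n)
    return (wins + 0.5 * ties) / (len(positive_scores) * len(negative_scores))


anomalous = [0.9, 0.4]
normal = [0.1, 0.4, 0.2]
# Six pairs: 0.9 beats all three normals; 0.4 beats 0.1 and 0.2 and ties 0.4.
print(pairwise_auc(anomalous, normal))  # (5 + 0.5) / 6
```

A value of 1.0 means every anomalous trace outscores every normal one; 0.5 is what random scoring would give on average.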

Source code in petri_net_nn/traces.py
def auc(positive_scores: list[float], negative_scores: list[float]) -> float:
    """Mann-Whitney U / ROC AUC: the probability that a randomly drawn
    positive (anomalous) trace scores higher than a randomly drawn
    negative (normal) trace. Ties count as half. Returns ``nan`` if
    either list is empty."""
    if not positive_scores or not negative_scores:
        return float("nan")
    pos = torch.as_tensor(positive_scores, dtype=torch.float64)
    neg = torch.as_tensor(negative_scores, dtype=torch.float64)
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum().item()
    ties = (diff == 0).sum().item()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

anomaly_score

anomaly_score(module, trace, *, attribute_to_marking, attribute_to_values=None, transitions=None)

Score one trace under a trained module. Returns a dict mapping each scored transition to the absolute residual between the network's predicted activation and the trace's observed firing (0 or 1). Summing the values gives a single trace-level anomaly score; inspecting individual entries identifies which parts of the process structure diverge — the §7.2 interpretability claim.

attribute_to_values mirrors the same kwarg on train_on_traces: pass it when scoring coloured-Petri-net traces so the trained guard thresholds get the value channel they need to gate firings correctly.
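To show how the returned dict is typically read, here is a small sketch that computes residuals from made-up predictions. The name `residuals` and all values are hypothetical; plain dicts stand in for the module's output.

```python
def residuals(predicted: dict[str, float],
              observed: dict[str, float]) -> dict[str, float]:
    """Absolute gap between predicted activation and observed firing (0 or 1)."""
    return {t: abs(predicted[t] - observed[t]) for t in predicted}


# The model expects an approval, but the trace shows a rejection.
predicted = {"t_register": 0.75, "t_approve": 0.875, "t_reject": 0.25}
observed = {"t_register": 1.0, "t_approve": 0.0, "t_reject": 1.0}

per_transition = residuals(predicted, observed)
trace_score = sum(per_transition.values())            # trace-level score
worst = max(per_transition, key=per_transition.get)   # most divergent transition

print(per_transition)
print(trace_score, worst)  # 1.875 t_approve
```

The sum ranks traces against each other, while the largest individual entries point to the process elements where the trace departs from what the model learned.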

Source code in petri_net_nn/traces.py
def anomaly_score(
    module: PetriNetModule,
    trace: XESTrace,
    *,
    attribute_to_marking: AttributeToMarking,
    attribute_to_values: AttributeToValues | None = None,
    transitions: list[str] | None = None,
) -> dict[str, float]:
    """Score one trace under a trained module. Returns a dict mapping
    each scored transition to the absolute residual between the
    network's predicted activation and the trace's observed firing
    (0 or 1). Summing the values gives a single trace-level anomaly
    score; inspecting individual entries identifies *which* parts of
    the process structure diverge — the §7.2 interpretability claim.

    ``attribute_to_values`` mirrors the same kwarg on
    ``train_on_traces``: pass it when scoring coloured-Petri-net
    traces so the trained guard thresholds get the value channel
    they need to gate firings correctly."""
    net = module.net
    scored = transitions if transitions is not None else _scored_transitions(net)
    if not scored:
        # Mirror train_on_traces: fail early rather than inside torch.stack.
        raise ValueError("no transitions to score against")

    device = next(module.parameters()).device
    input_marking = _stack_input_markings([attribute_to_marking(trace)], device)
    input_values = (
        _stack_input_markings([attribute_to_values(trace)], device)
        if attribute_to_values is not None
        else None
    )
    target = trace_occurrence_vector(net, trace, scored).to(device)

    # Score in eval mode, then restore the caller's train/eval mode so that
    # scoring in the middle of training has no lasting side effect.
    was_training = module.training
    module.eval()
    try:
        with torch.no_grad():
            out = module(
                input_marking=input_marking,
                input_values=input_values,
                batch_size=1,
            )
            predicted = torch.stack([out[t] for t in scored], dim=1).squeeze(0)
            residuals = (predicted - target).abs()
    finally:
        module.train(was_training)
    return {t: residuals[i].item() for i, t in enumerate(scored)}