Skip to content

petri_net_nn.discovery

discovery

Inductive process mining — discover a sound Petri net from event logs.

Implementation of the basic Inductive Miner (Leemans, Fahland, van der Aalst, 2013, "Discovering Block-Structured Process Models from Event Logs - A Constructive Approach"). Given a list of traces (each a sequence of activity names from the event.name field), recursively find structural cuts in the directly-follows graph until a process tree emerges, then translate the process tree into a Petri net.

The output net is sound by construction — option to complete, proper completion, and no dead transitions all hold for any output of this algorithm. That guarantee composes cleanly with the rest of PETRA's pipeline: :func:discover_and_train chains discovery, a soundness check, compilation, and training in one call.

Algorithm sketch

  1. From the trace set, build the directly-follows graph (DFG) — nodes are activities, an edge (a, b) means a was directly followed by b in some trace.

  2. Try the four cut shapes in priority order:

  3. Exclusive choice (×): activities partition into disjoint groups that never co-occur in a single trace. Detected as connected components of the undirected DFG.

  4. Sequence (): activities partition into ordered groups Σ₁, …, Σₙ such that for each pair i<j, every activity in Σᵢ reaches every activity in Σⱼ via DFG paths and none in Σⱼ reaches any in Σᵢ. Detected via strongly-connected components and a total topological ordering.

  5. Parallel (): activities partition into groups such that every cross-group pair (a, b) has both a→b and b→a in the DFG, and every group contains at least one trace-start activity and one trace-end activity.

  6. Loop (): designate one group as the body and the rest as redo parts. The body alone contains all start and end activities of the log; the redos can only be entered after the body ends and exited by re-entering the body.

  7. If a cut is found, split the log accordingly and recurse on each sub-log.

  8. Base cases:

  9. Empty log → Tau (silent transition).

  10. Single activity log → Activity leaf.

  11. Fallback: when no cut applies, return a flower model — an XOR of all activities inside a self-loop. This is a degenerate output that always accepts the log; it appears when the log is too unstructured for the cut detectors to handle.

Limitations of this first delivery

  • Basic IM only. Not IMf (infrequent-noise filtering) or IMi (incremental). Noisy logs may collapse into the flower-model fallback.
  • Loop cut handles body+redo pairs. Nested loops work via recursion, but esoteric loop topologies may degrade.
  • Timestamps are ignored — only activity sequences matter for the structural discovery. PETRA's later training step can read attributes via the standard train_on_traces path.
  • Event attributes are not used for discovery.

Tau dataclass

Tau()

Silent / τ transition — produced for empty sub-logs and for the τ branches of loop and exclusive cuts that allow skipping.

Activity dataclass

Activity(name)

Leaf node carrying a single activity name from the log. Becomes a single labelled transition in the emitted net.

Sequence dataclass

Sequence(children)

— children execute one after the other, in order.

ExclusiveChoice dataclass

ExclusiveChoice(children)

× — exactly one of the children executes.

Parallel dataclass

Parallel(children)

— all children execute concurrently and synchronise at the join.

Loop dataclass

Loop(children)

children[0] is the body; children[1:] are redo parts. The body executes; from its end, either continue (loop exits) or do one of the redos and re-enter the body.

A loop with no explicit redo is equivalent to Loop(body, Tau()) — every body iteration can be followed by either another body execution or termination.

discover_inductive

discover_inductive(traces)

Discover a Petri net from a trace set via the basic Inductive Miner.

Parameters

traces Either a list of :class:XESTrace objects (the natural shape after :func:parse_xes / parse_csv / parse_json) or a raw list of activity-name tuples. XESTrace inputs are projected to their event names; everything else about the traces (attributes, timestamps) is deliberately ignored — discovery is purely structural.

Returns

PetriNet Sound by construction (every Petri net the basic IM emits is option-to-complete, properly-terminating, and free of dead transitions; verify with :func:check_soundness if you want explicit assurance). The initial marking places one token at the source. Transition labels are the activity names from the log; silent (τ) transitions are flagged via :attr:PetriNet.silent_transitions so weak bisimulation treats them correctly.

Notes

Activity names that are pure Python identifiers map cleanly onto place / transition IDs. Names with spaces or special characters are stored verbatim on the transition labels but never appear in IDs (the translator mints synthetic t_N / p_N IDs).

Source code in petri_net_nn/discovery.py
def discover_inductive(
    traces: "list[XESTrace] | list[Trace]",
) -> PetriNet:
    """Discover a Petri net from a trace set via the basic
    Inductive Miner.

    Parameters
    ----------
    traces
        Either a list of :class:`XESTrace` objects (the
        natural shape after :func:`parse_xes` / ``parse_csv``
        / ``parse_json``) or a raw list of activity-name
        tuples. XESTrace inputs are projected to their event
        names; everything else about the traces (attributes,
        timestamps) is deliberately ignored — discovery is
        purely structural.

    Returns
    -------
    PetriNet
        Sound by construction (every Petri net the basic IM
        emits is option-to-complete, properly-terminating, and
        free of dead transitions; verify with
        :func:`check_soundness` if you want explicit assurance).
        The initial marking places one token at the source.
        Transition labels are the activity names from the log;
        silent (τ) transitions are flagged via
        :attr:`PetriNet.silent_transitions` so weak bisimulation
        treats them correctly.

    Notes
    -----
    Activity names that are pure Python identifiers map cleanly
    onto place / transition IDs. Names with spaces or special
    characters are stored verbatim on the transition labels
    but never appear in IDs (the translator mints synthetic
    ``t_N`` / ``p_N`` IDs).
    """
    log: Log = []
    for trace in traces:
        if isinstance(trace, XESTrace):
            log.append(tuple(event.name for event in trace.events))
        else:
            log.append(tuple(trace))
    tree = _mine(log)
    return _process_tree_to_petri_net(tree)

discover_and_train

discover_and_train(traces, *, attribute_to_marking, steps=500, lr=0.1, sharpness=1.0, seed=None)

End-to-end discover → verify → compile → train pipeline.

  1. Mine a Petri net from traces via :func:discover_inductive.
  2. Verify the mined net is sound via :func:petri_net_nn.soundness.check_soundness. By construction the basic IM always yields sound nets; this check guards against future changes to the miner or to the soundness definition.
  3. Compile the net into a :class:PetriNetModule.
  4. Train on the same traces via :func:petri_net_nn.traces.train_on_traces.

Returns (net, module, losses) — the discovered net, the trained module, and the per-step loss trajectory. The caller can then extract rules, score anomalies, run bisimulation against another variant, etc., using the standard PETRA pipeline.

Source code in petri_net_nn/discovery.py
def discover_and_train(
    traces: list[XESTrace],
    *,
    attribute_to_marking,
    steps: int = 500,
    lr: float = 0.1,
    sharpness: float = 1.0,
    seed: int | None = None,
) -> "tuple[PetriNet, object, list[float]]":
    """End-to-end discover → verify → compile → train pipeline.

    1. Mine a Petri net from ``traces`` via
       :func:`discover_inductive`.
    2. Verify the mined net is sound via
       :func:`petri_net_nn.soundness.check_soundness`. By
       construction the basic IM always yields sound nets;
       this check guards against future changes to the
       miner or to the soundness definition.
    3. Compile the net into a :class:`PetriNetModule`.
    4. Train on the same ``traces`` via
       :func:`petri_net_nn.traces.train_on_traces`.

    Returns ``(net, module, losses)`` — the discovered net,
    the trained module, and the per-step loss trajectory.
    The caller can then extract rules, score anomalies, run
    bisimulation against another variant, etc., using the
    standard PETRA pipeline.
    """
    # Local imports keep the module's load-time dependency
    # footprint small — discovery doesn't itself need torch.
    import torch

    from petri_net_nn.compiler import PetriNetModule
    from petri_net_nn.soundness import check_soundness
    from petri_net_nn.traces import train_on_traces

    net = discover_inductive(traces)
    report = check_soundness(net)
    if not report.is_sound:
        # Should be impossible for the basic IM, but if it
        # ever happens it's a bug we want surfaced immediately
        # rather than masked.
        raise RuntimeError(
            f"discover_and_train: the mined net failed soundness — "
            f"{report.summary()}. This is a bug in the miner; "
            f"please report it with the input trace set."
        )
    if seed is not None:
        torch.manual_seed(seed)
    module = PetriNetModule(net, sharpness=sharpness)
    losses = train_on_traces(
        module,
        traces,
        attribute_to_marking=attribute_to_marking,
        steps=steps,
        lr=lr,
    )
    return net, module, losses