petri_net_nn.discovery¶
discovery ¶
Inductive process mining — discover a sound Petri net from event logs.
Implementation of the basic Inductive Miner (Leemans, Fahland,
van der Aalst, 2013, "Discovering Block-Structured Process
Models from Event Logs - A Constructive Approach"). Given a
list of traces (each a sequence of activity names from the
event.name field), recursively find structural cuts in the
directly-follows graph until a process tree emerges, then
translate the process tree into a Petri net.
The output net is sound by construction — option to
complete, proper completion, and no dead transitions all hold
for any output of this algorithm. That guarantee composes
cleanly with the rest of PETRA's pipeline:
:func:discover_and_train chains discovery, a soundness check,
compilation, and training in one call.
Algorithm sketch¶
-
From the trace set, build the directly-follows graph (DFG) — nodes are activities, an edge
(a, b)meansawas directly followed bybin some trace. -
Try the four cut shapes in priority order:
-
Exclusive choice (
×): activities partition into disjoint groups that never co-occur in a single trace. Detected as connected components of the undirected DFG. -
Sequence (
→): activities partition into ordered groupsΣ₁, …, Σₙsuch that for each pairi<j, every activity inΣᵢreaches every activity inΣⱼvia DFG paths and none inΣⱼreaches any inΣᵢ. Detected via strongly-connected components and a total topological ordering. -
Parallel (
∧): activities partition into groups such that every cross-group pair(a, b)has botha→bandb→ain the DFG, and every group contains at least one trace-start activity and one trace-end activity. -
Loop (
↻): designate one group as the body and the rest as redo parts. The body alone contains all start and end activities of the log; the redos can only be entered after the body ends and exited by re-entering the body. -
If a cut is found, split the log accordingly and recurse on each sub-log.
-
Base cases:
-
Empty log →
Tau(silent transition). -
Single activity log →
Activityleaf. -
Fallback: when no cut applies, return a flower model — an XOR of all activities inside a self-loop. This is a degenerate output that always accepts the log; it appears when the log is too unstructured for the cut detectors to handle.
Limitations of this first delivery¶
- Basic IM only. Not IMf (infrequent-noise filtering) or IMi (incremental). Noisy logs may collapse into the flower-model fallback.
- Loop cut handles body+redo pairs. Nested loops work via recursion, but esoteric loop topologies may degrade.
- Timestamps are ignored — only activity sequences matter
for the structural discovery. PETRA's later training step
can read attributes via the standard
train_on_tracespath. - Event attributes are not used for discovery.
Tau
dataclass
¶
Silent / τ transition — produced for empty sub-logs and for the τ branches of loop and exclusive cuts that allow skipping.
Activity
dataclass
¶
Leaf node carrying a single activity name from the log. Becomes a single labelled transition in the emitted net.
Parallel
dataclass
¶
∧ — all children execute concurrently and synchronise
at the join.
Loop
dataclass
¶
↻ — children[0] is the body; children[1:] are
redo parts. The body executes; from its end, either continue
(loop exits) or do one of the redos and re-enter the body.
A loop with no explicit redo is equivalent to Loop(body,
Tau()) — every body iteration can be followed by either
another body execution or termination.
discover_inductive ¶
Discover a Petri net from a trace set via the basic Inductive Miner.
Parameters¶
traces
Either a list of :class:XESTrace objects (the
natural shape after :func:parse_xes / parse_csv
/ parse_json) or a raw list of activity-name
tuples. XESTrace inputs are projected to their event
names; everything else about the traces (attributes,
timestamps) is deliberately ignored — discovery is
purely structural.
Returns¶
PetriNet
Sound by construction (every Petri net the basic IM
emits is option-to-complete, properly-terminating, and
free of dead transitions; verify with
:func:check_soundness if you want explicit assurance).
The initial marking places one token at the source.
Transition labels are the activity names from the log;
silent (τ) transitions are flagged via
:attr:PetriNet.silent_transitions so weak bisimulation
treats them correctly.
Notes¶
Activity names that are pure Python identifiers map cleanly
onto place / transition IDs. Names with spaces or special
characters are stored verbatim on the transition labels
but never appear in IDs (the translator mints synthetic
t_N / p_N IDs).
Source code in petri_net_nn/discovery.py
discover_and_train ¶
End-to-end discover → verify → compile → train pipeline.
- Mine a Petri net from
tracesvia :func:discover_inductive. - Verify the mined net is sound via
:func:
petri_net_nn.soundness.check_soundness. By construction the basic IM always yields sound nets; this check guards against future changes to the miner or to the soundness definition. - Compile the net into a :class:
PetriNetModule. - Train on the same
tracesvia :func:petri_net_nn.traces.train_on_traces.
Returns (net, module, losses) — the discovered net,
the trained module, and the per-step loss trajectory.
The caller can then extract rules, score anomalies, run
bisimulation against another variant, etc., using the
standard PETRA pipeline.