petri_net_nn.sif
sif
Simple Interaction Format (SIF) import.
SIF is the de facto interchange format used by Pathway Commons, the public hub that aggregates Reactome, BioCyc, PID, NCI Nature, Panther, HumanCyc, KEGG and others, and the canonical source of curated biological-pathway content. SIF support gives PETRA a direct on-ramp to that body of data.
Each SIF line is a tab-separated triple:
    ENTITY_A<TAB>interaction_type<TAB>ENTITY_B
The parser maps every unique entity (gene symbol, small molecule,
complex name) to a place and every interaction triple to a
transition with one input arc from ENTITY_A and one output
arc to ENTITY_B. Duplicate triples collapse into a single
transition. The flow is directional even for nominally symmetric
interaction types (in-complex-with, interacts-with, neighbor-of);
a modeller who wants symmetric handling can add a second triple
in the opposite direction.
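The mapping described above can be sketched with plain Python containers. This is an illustrative sketch only, not PETRA's implementation: `build_net` and the returned dict layout are hypothetical, and the real parser produces a PetriNet object whose API is not shown here.

```python
def build_net(triples):
    """Map SIF triples to places, transitions and arcs (hypothetical layout)."""
    places = []         # unique entities, in first-seen order
    transitions = []    # unique triples, in first-seen order
    arcs = []           # (source, target) pairs
    seen_places = set()
    seen_triples = set()
    for src, interaction, dst in triples:
        triple = (src, interaction, dst)
        if triple in seen_triples:
            continue  # duplicate triple: keep a single transition
        seen_triples.add(triple)
        for entity in (src, dst):
            if entity not in seen_places:
                seen_places.add(entity)
                places.append(entity)
        transitions.append(triple)
        arcs.append((src, triple))  # input arc: ENTITY_A -> transition
        arcs.append((triple, dst))  # output arc: transition -> ENTITY_B
    return {"places": places, "transitions": transitions, "arcs": arcs}


net = build_net([
    ("TP53", "controls-expression-of", "MDM2"),
    ("MDM2", "in-complex-with", "TP53"),
    ("TP53", "controls-expression-of", "MDM2"),  # duplicate, ignored
])
```

In this sketch the three input rows yield two places, two transitions and four arcs, because the repeated triple adds nothing new.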
Standard Pathway Commons interaction types. The parser is not opinionated about which strings are valid, but for reference these are the ones the PC v14 schema documents:
- controls-state-change-of
- controls-phosphorylation-of
- controls-expression-of
- controls-transport-of
- catalysis-precedes
- in-complex-with
- interacts-with
- neighbor-of
- chemical-affects
- consumption-controlled-by
- controls-production-of
- controls-transport-of-chemical
- reacts-with
- used-to-produce
Comment lines starting with # and blank lines are skipped.
Lines with extra columns beyond the standard three (e.g.
EXTENDED_BINARY_SIF, which adds mediator IDs and data-source
columns) are accepted — only the first three columns are read.
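The line-level rules above (skip comments and blank lines, read only the first three columns) might look roughly like the following. `iter_sif_rows` is a hypothetical helper, not part of the module. The sketch assumes a comment marker must be the first character of the line, and it omits the validation that the real parser performs.

```python
def iter_sif_rows(lines):
    """Yield the first three tab-separated columns of each data line."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # blank line
        if line.startswith("#"):
            continue  # comment line (assumes '#' is the first character)
        columns = line.split("\t")
        # Extra columns, as in EXTENDED_BINARY_SIF, are ignored.
        yield tuple(columns[:3])


rows = list(iter_sif_rows([
    "# exported from a pathway database\n",
    "\n",
    "A\tinteracts-with\tB\textra1\textra2\n",
    "B\tcontrols-expression-of\tC\n",
]))
```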
What this importer deliberately does not do (yet):
- Inflect interaction direction by type. Every triple becomes a
  directed transition src → dst, even for symmetric interactions
  like in-complex-with. The Petri-net flow is directional by construction.
- Type entities (protein, gene, small molecule, complex). PETRA treats
  every place as a single colourless slot. BioPAX support would carry
  that information through; SIF has already lost it by the time PETRA
  sees the file.
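Because the importer never mirrors symmetric interactions itself, a modeller who wants both directions can add the reverse triples before import. The following is a minimal sketch of such a preprocessing step. `add_reverse_triples` and `SYMMETRIC_TYPES` are hypothetical names introduced for illustration, and the choice of which types count as symmetric is an assumption taken from the three examples named in this page.

```python
# Assumed set of symmetric interaction types (from the examples above).
SYMMETRIC_TYPES = {"in-complex-with", "interacts-with", "neighbor-of"}


def add_reverse_triples(triples, symmetric_types=SYMMETRIC_TYPES):
    """Return triples plus a reversed copy of each symmetric one."""
    out = []
    seen = set()
    for src, interaction, dst in triples:
        candidates = [(src, interaction, dst)]
        if interaction in symmetric_types:
            candidates.append((dst, interaction, src))
        for triple in candidates:
            if triple not in seen:  # avoid emitting duplicates
                seen.add(triple)
                out.append(triple)
    return out


expanded = add_reverse_triples([
    ("A", "in-complex-with", "B"),
    ("A", "controls-expression-of", "C"),
])
```

Here only the in-complex-with row gains a mirrored B → A triple; the directed controls-expression-of row is left as is.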
parse_sif
Parse a SIF file into a PetriNet.
Each unique entity becomes a place; each unique interaction triple becomes a transition with one input arc (entity_a → transition) and one output arc (transition → entity_b). Duplicate triples are silently deduplicated, so re-importing the same file is idempotent.
The transition id is <src>__<interaction>__<dst>; the
label is the natural-language form "src interaction dst"
so distilled rules and anomaly explanations refer to the
triple in its original SIF vocabulary.
Raises ValueError if any non-comment, non-blank line cannot
be parsed as a 3-or-more-column tab-separated row with all
three core fields non-empty.
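The id and label conventions and the validation rule can be illustrated per row. This is a sketch under stated assumptions, not the body of parse_sif: `parse_row` is a hypothetical helper, the error message text is invented, and stripping surrounding whitespace from each field is an assumption.

```python
def parse_row(line, line_number):
    """Return (transition_id, label) for one SIF data line, or raise ValueError."""
    columns = line.rstrip("\r\n").split("\t")
    core = [field.strip() for field in columns[:3]]  # assumed: fields are stripped
    if len(core) < 3 or not all(core):
        raise ValueError(
            f"line {line_number}: expected at least three non-empty "
            f"tab-separated columns, got {line!r}"
        )
    src, interaction, dst = core
    transition_id = f"{src}__{interaction}__{dst}"  # <src>__<interaction>__<dst>
    label = f"{src} {interaction} {dst}"            # "src interaction dst"
    return transition_id, label


transition_id, label = parse_row("TP53\tcontrols-expression-of\tMDM2\n", 1)

try:
    parse_row("TP53\tcontrols-expression-of\n", 2)  # only two columns
    rejected = False
except ValueError:
    rejected = True
```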