Consistency with original rdchiral

This page describes how we measure consistency with the original rdchiral library, and summarizes the results.

Methodology

What is compared

We compare five operations across environments:

rdchiralExtract: template extraction via rdchiral.template_extractor.extract_from_reaction.
rdchiralRun: template application via rdchiral.main.rdchiralRun on pre-initialized rdchiralReaction / rdchiralReactants objects.
rdchiralRun_return_mapped_keep_mapnums: template application via rdchiral.main.rdchiralRun on pre-initialized rdchiralReaction / rdchiralReactants objects, with return_mapped=True and keep_mapnums=True.
rdchiralRun_return_mapped: template application via rdchiral.main.rdchiralRun on pre-initialized rdchiralReaction / rdchiralReactants objects, with return_mapped=True.
rdchiralRunText: template application via rdchiral.main.rdchiralRunText operating on (reaction_smarts, reactant_smiles) strings.

Each operation writes an “outcomes” CSV with a single outcome column, under scripts/generated_csvs/.

How outcomes are generated

Outcomes are generated by scripts/speed_benchmark_script.py (which is also used for speed benchmarking) and are written as follows:

For _rdchiralRun.csv, _rdchiralRunText.csv, _rdchiralRun_return_mapped_keep_mapnums.csv, and _rdchiralRun_return_mapped.csv:
If rdchiralRun* returns a list of product SMILES, the list is sorted and joined with "|".
If an exception is raised or no products are returned, the outcome is written as an empty string.
For _rdchiralExtract.csv:
If extraction returns a dict with reaction_smarts, that string is written.
If extraction fails, the outcome is written as an empty string.
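The serialization rules above can be sketched as follows. This is a minimal illustration, not the benchmark script's actual code: the helper names serialize_run_outcome and serialize_extract_outcome are hypothetical, and the rdchiral call is passed in as a zero-argument callable so the sketch runs without rdchiral installed.

```python
from typing import Callable, Iterable, Optional


def serialize_run_outcome(run: Callable[[], Optional[Iterable[str]]]) -> str:
    """Turn one template application into a single outcome string.

    `run` is a hypothetical zero-argument callable wrapping the rdchiralRun*
    call; it is expected to return an iterable of product SMILES.
    """
    try:
        products = run()
    except Exception:
        # Any exception is recorded as an empty outcome.
        return ""
    if not products:
        # No products (None or an empty list) is also an empty outcome.
        return ""
    # Sorting makes the outcome independent of the order products are returned in.
    return "|".join(sorted(products))


def serialize_extract_outcome(extract: Callable[[], Optional[dict]]) -> str:
    """Turn one template extraction into a single outcome string."""
    try:
        template = extract()
    except Exception:
        return ""
    if not isinstance(template, dict):
        # Treat a non-dict result (for example None) as a failed extraction.
        return ""
    return template.get("reaction_smarts") or ""
```

Because the product list is sorted before joining, two environments that return the same products in a different order still produce identical outcome strings.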

Input ordering is randomized but deterministic: the benchmark script uses RANDOM_SEED = 42 and shuffles the input lists before constructing the benchmark workload.
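A seeded shuffle of this kind can be sketched as below. Only the value RANDOM_SEED = 42 comes from the description above; the helper name shuffled_copy and the use of a private random.Random instance are assumptions, and the real script may seed the global generator instead.

```python
import random

RANDOM_SEED = 42  # value stated in the text above


def shuffled_copy(items, seed=RANDOM_SEED):
    """Return a shuffled copy of `items` that is identical on every run.

    A dedicated Random instance is used (an assumption of this sketch) so the
    result does not depend on other code that draws from the global generator.
    """
    rng = random.Random(seed)
    out = list(items)
    rng.shuffle(out)
    return out
```

Since every environment shuffles the same inputs with the same seed, row i of each outcomes CSV refers to the same input, which is what makes a row-wise comparison meaningful.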

Environments compared

The helper script scripts/run_speed_benchmark_envs.py builds and runs multiple environments and forwards a per-environment --save-file-prefix so each run produces distinct files:

original: upstream rdchiral installed from git+https://github.com/connorcoley/rdchiral.git.
cpp: the rdchiral_cpp conda-forge package (run with --cpp).
rdchiral_plus: this fork installed normally (pure-Python mode).
rdchiral_plus_mypyc: this fork installed with RDCHIRAL_USE_MYPYC=1.

How consistency is computed

scripts/analyze_consistency.py:

Loads all CSVs matching each suffix (_rdchiralExtract, _rdchiralRun, _rdchiralRun_return_mapped, _rdchiralRun_return_mapped_keep_mapnums, _rdchiralRunText).
Converts each outcome column to str and places each environment in a separate column.
Computes row-wise exact string equality against original_outcome and prints the identical count and percentage.
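The comparison step can be sketched in plain Python as follows. This is an illustration of the row-wise rule described above, not the contents of scripts/analyze_consistency.py; the function names count_identical and format_row are hypothetical.

```python
def count_identical(original, other):
    """Count rows where `other` matches `original` exactly, as strings.

    Returns (identical, total, percentage).
    """
    if len(original) != len(other):
        raise ValueError("outcome columns must have the same number of rows")
    total = len(original)
    # Exact string equality: no canonicalization or fuzzy matching is applied.
    identical = sum(str(a) == str(b) for a, b in zip(original, other))
    percentage = 100.0 * identical / total if total else 0.0
    return identical, total, percentage


def format_row(library, identical, total, percentage):
    """Format one result row in the tab-separated layout used on this page."""
    return f"{library}\t{identical} / {total}\t{percentage:.2f}%"
```

For example, format_row("cpp", *count_identical(original_outcomes, cpp_outcomes)) would yield one line of a results table, where original_outcomes and cpp_outcomes stand for the two outcome columns being compared.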

Results

Summary (full benchmark runs)

The tables below summarize agreement with upstream rdchiral.

`_rdchiralExtract` (extracting templates from 50016 mapped reactions)

library	identical / total	identical %
cpp	33496 / 50016	66.97%
rdchiral_plus	49000 / 50016	97.97%
rdchiral_plus_mypyc	49000 / 50016	97.97%

`_rdchiralRun` (applying 1000 templates to 1000 SMILES)

library	identical / total	identical %
cpp	995590 / 1000000	99.56%
rdchiral_plus	999779 / 1000000	99.98%
rdchiral_plus_mypyc	999779 / 1000000	99.98%

`_rdchiralRun_return_mapped` (applying 1000 templates to 1000 SMILES with return_mapped=True)

library	identical / total	identical %
cpp	991897 / 1000000	99.19%
rdchiral_plus	999772 / 1000000	99.98%
rdchiral_plus_mypyc	999779 / 1000000	99.98%

`_rdchiralRun_return_mapped_keep_mapnums` (applying 1000 templates to 1000 SMILES with return_mapped=True and keep_mapnums=True)

library	identical / total	identical %
cpp	0 / 1000000	0.00%
rdchiral_plus	999764 / 1000000	99.98%
rdchiral_plus_mypyc	999779 / 1000000	99.98%

`_rdchiralRunText` (applying 1000 templates to 100 SMILES)

library	identical / total	identical %
cpp	99576 / 100000	99.58%
rdchiral_plus	99967 / 100000	99.97%
rdchiral_plus_mypyc	99967 / 100000	99.97%

Reproducing the analysis

From the repository root:

Generate per-environment CSVs (writes to scripts/generated_csvs/):

python scripts/run_speed_benchmark_envs.py --save-file-prefix true

If you need a clean rebuild of environments:

python scripts/run_speed_benchmark_envs.py --save-file-prefix true --reinstall

Compute identical counts vs upstream rdchiral:

python scripts/analyze_consistency.py

Behavioral differences

Relevant upstream changes and discussion:

https://github.com/connorcoley/rdchiral/pull/40: Fixes incorrect cis/trans outcomes for conjugated systems that could previously depend on atom numbering. In particular, when a template only specifies part of a conjugated system, the copied double-bond stereo directions may need to be reversed consistently.
https://github.com/connorcoley/rdchiral/pull/31: Template extraction corner cases could return None instead of a dict, leading to inconsistent downstream behavior (and possible AttributeErrors for callers expecting a mapping). This change makes the return type consistent.
https://github.com/connorcoley/rdchiral/commit/78bbafaba040678b957497e7f2638e935104e3d7: Extends template extraction to support a configurable fragment radius and an option to disable matching/including “special groups” (no_special_groups), which can change which atoms are included in extracted fragments.
Deterministic template extraction: Replaced random shuffle-based tetrahedral center correction loops with deterministic permutation parity calculation. The old behavior could lead to inconsistent results between runs or hang in rare instances with multiple stereocenters.
Broader stereochemistry handling: Stereochemistry for tetrahedral centers with lone pairs (e.g., sulfur in sulfoxides) is now properly accounted for during template extraction and application.
Stereochemistry tracking: Inversions of tetrahedral centers are now counted as changed atoms and included in extracted templates, improving accuracy for reactions where stereochemistry changes.
Spectator tracking: Spectator molecules that participate in the reaction mechanism but don't change are now included in extracted template dictionaries.
One-pot reactions: Templates defining multiple reactions on the same product are now properly handled by initializing templates with parentheses where needed.
Recursive template application: Templates can be recursively applied with a configurable max_depth parameter, useful for symmetric reactions or reactions that occur at multiple sites in a molecule.
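To illustrate the deterministic permutation parity calculation mentioned in the list above, here is a minimal sketch that computes the parity of one neighbor ordering relative to another by counting cycles. It is not the fork's implementation; the function name permutation_parity is hypothetical. For a tetrahedral center, an odd permutation of the neighbor order corresponds to the opposite chirality tag, so the parity can be computed directly instead of shuffling until the orderings happen to agree.

```python
def permutation_parity(reference, permuted):
    """Return 0 if `permuted` is an even permutation of `reference`, else 1.

    Both arguments are sequences of the same distinct, hashable items
    (for example, neighbor atom map numbers around a stereocenter).
    """
    index_of = {item: i for i, item in enumerate(reference)}
    if len(index_of) != len(reference):
        raise ValueError("reference must not contain duplicates")
    if len(permuted) != len(reference) or set(permuted) != set(index_of):
        raise ValueError("permuted must contain exactly the items of reference")

    # perm[i] is the reference position of the item found at position i.
    perm = [index_of[item] for item in permuted]
    visited = [False] * len(perm)
    parity = 0
    for start in range(len(perm)):
        if visited[start]:
            continue
        # Walk one cycle; a cycle of length k is equivalent to k - 1 swaps.
        length = 0
        j = start
        while not visited[j]:
            visited[j] = True
            j = perm[j]
            length += 1
        parity ^= (length - 1) & 1
    return parity
```

The calculation visits each item once, so it always terminates and gives the same answer on every run, unlike a loop that reshuffles until a match is found.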