AgaveChem

AgaveChem is an open-source Python library for atom-to-atom mapping (AAM) of chemical reactions. It provides four composable mappers, ranging from deterministic graph-based methods to a supervised neural mapper, that can be used individually or combined into a pipeline.

The primary contribution is a supervised ALBERT-based neural mapper trained without any per-reaction manual annotation. Ground-truth atom maps are generated automatically by composing an expert template mapper and an MCS (maximum common substructure) mapper over a filtered subset of the Lowe USPTO dataset (~0.97M reactions). This yields a labeled training corpus orders of magnitude larger than direct annotation could provide.
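
The idea of composing two mappers to label a corpus automatically can be sketched as below. This is only an illustration under stated assumptions, not AgaveChem's actual labeling code: the `Mapper` interface, `compose_mappers`, and both stub mappers are hypothetical names, and the "first mapper that succeeds wins" rule is an assumed composition strategy that the text above does not specify.

```python
from typing import Callable, Dict, Optional, Sequence

# Hypothetical interface: a mapper takes an unmapped reaction SMILES and
# returns a mapped reaction SMILES, or None when it cannot map the reaction.
Mapper = Callable[[str], Optional[str]]


def compose_mappers(mappers: Sequence[Mapper]) -> Mapper:
    """Build a mapper that tries each mapper in order and keeps the first result."""

    def composed(rxn_smiles: str) -> Optional[str]:
        for mapper in mappers:
            mapped = mapper(rxn_smiles)
            if mapped is not None:
                return mapped
        return None  # no mapper succeeded; the reaction stays unlabeled

    return composed


# Stand-in mappers backed by lookup tables, for illustration only.
def stub_template_mapper(rxn_smiles: str) -> Optional[str]:
    known = {"CO>>CCl": "[CH3:1]O>>[CH3:1]Cl"}
    return known.get(rxn_smiles)


def stub_mcs_mapper(rxn_smiles: str) -> Optional[str]:
    known = {"CC>>CC": "[CH3:1][CH3:2]>>[CH3:1][CH3:2]"}
    return known.get(rxn_smiles)


label = compose_mappers([stub_template_mapper, stub_mcs_mapper])

corpus = ["CO>>CCl", "CC>>CC", "N>>N"]
# Keep only the reactions that at least one mapper could label.
labeled: Dict[str, str] = {}
for rxn in corpus:
    mapped = label(rxn)
    if mapped is not None:
        labeled[rxn] = mapped
```

In this sketch, reactions that no mapper can handle are simply dropped from the labeled set, which is one plausible way a filtered training corpus could arise.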

Requirements

  • Python (version >= 3.10)
  • RDKit
  • rdchiral-plus
  • PyTorch
  • Transformers (Hugging Face)
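
Before installing, it can be handy to check which of these requirements are already present. The snippet below is a minimal sketch using only the standard library. The import names are assumptions: `rdkit`, `torch`, and `transformers` are the usual module names for those packages, while `rdchiral` as the import name for rdchiral-plus is a guess that should be verified. `check_environment` is a hypothetical helper, not part of AgaveChem.

```python
import sys
from importlib.util import find_spec

# Display name -> assumed top-level import name.
REQUIRED_MODULES = {
    "RDKit": "rdkit",
    "rdchiral-plus": "rdchiral",  # assumption: verify the real import name
    "PyTorch": "torch",
    "Transformers": "transformers",
}


def check_environment(min_python=(3, 10)):
    """Return a dict mapping each requirement to True if it appears satisfied."""
    report = {"Python": sys.version_info >= min_python}
    for label, module in REQUIRED_MODULES.items():
        # find_spec returns None when a top-level module cannot be located.
        report[label] = find_spec(module) is not None
    return report


report = check_environment()
for name, ok in report.items():
    print(f"{name}: {'found' if ok else 'missing'}")
```

This only checks that each module can be located, not that its version is compatible.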

Installation

Install AgaveChem from PyPI:

pip install agave_chem

Or install AgaveChem with pip directly from this repo:

pip install git+https://github.com/denovochem/agave_chem.git

Or clone and install locally:

git clone https://github.com/denovochem/agave_chem.git
cd agave_chem
pip install .

Basic usage

from agave_chem import NeuralReactionMapper

mapper = NeuralReactionMapper("my_mapper")
result = mapper.map_reaction("CC(Cl)(Cl)OC(C)(Cl)Cl.CC(=O)C(=O)O>>CC(=O)C(=O)Cl")
print(result["selected_mapping"])

MCS mapper (fast, deterministic, partial mapping)

from agave_chem import MCSReactionMapper

mapper = MCSReactionMapper("my_mcs_mapper")
result = mapper.map_reaction("CC(Cl)(Cl)OC(C)(Cl)Cl.CC(=O)C(=O)O>>CC(=O)C(=O)Cl")
print(result["selected_mapping"])

Expert template mapper (interpretable, mechanistically grounded)

from agave_chem import TemplateReactionMapper

mapper = TemplateReactionMapper("my_template_mapper")
result = mapper.map_reaction("CC(Cl)(Cl)OC(C)(Cl)Cl.CC(=O)C(=O)O>>CC(=O)C(=O)Cl")
print(result["selected_mapping"])

Mapping a batch of reactions through the full pipeline

from agave_chem import map_reactions

reactions = [
    "CC(Cl)(Cl)OC(C)(Cl)Cl.CC(=O)C(=O)O>>CC(=O)C(=O)Cl",
    "OCC(=O)OCCCO.Cl>>ClCC(=O)OCCCO",
]
results = map_reactions(reactions)
for r in results:
    print(r["final_mapping"])
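
After mapping a batch, a quick sanity check on the output can catch obviously inconsistent maps. The sketch below assumes each mapping is a mapped reaction SMILES string (for example, the value printed above), which the examples suggest but do not state. `atom_map_numbers` and `check_mapping` are hypothetical helpers, and the regular expression is a lightweight stand-in for a real SMILES parser such as RDKit.

```python
import re
from collections import Counter

# Matches a bracket atom that ends in an atom-map number, e.g. "[CH3:1]".
_MAP_NUM = re.compile(r"\[[^\]]*:(\d+)\]")


def atom_map_numbers(smiles):
    """Return the atom-map numbers found in a SMILES fragment, in order."""
    return [int(n) for n in _MAP_NUM.findall(smiles)]


def check_mapping(mapped_rxn):
    """Report duplicated or unmatched atom-map numbers in a mapped reaction."""
    parts = mapped_rxn.split(">")
    if len(parts) != 3:
        raise ValueError("expected 'reactants>agents>products'")
    reactant_maps = Counter(atom_map_numbers(parts[0]))
    product_maps = Counter(atom_map_numbers(parts[2]))
    return {
        "duplicated_in_reactants": sorted(
            n for n, count in reactant_maps.items() if count > 1
        ),
        "duplicated_in_products": sorted(
            n for n, count in product_maps.items() if count > 1
        ),
        # Product atoms whose map number has no counterpart in the reactants.
        "missing_from_reactants": sorted(set(product_maps) - set(reactant_maps)),
    }


good = check_mapping("[CH3:1][C:2](=[O:3])[OH:4].Cl>>[CH3:1][C:2](=[O:3])Cl")
bad = check_mapping("[CH3:1][OH:2]>>[CH3:1][Cl:3]")
print(good)
print(bad)
```

Unmapped atoms are ignored here, so a partial mapping passes as long as the numbers that are present are consistent.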

Documentation

Full documentation is available at the AgaveChem documentation site.

Contributing

  • Feature ideas and bug reports are welcome on the Issue Tracker.
  • Fork the source code on GitHub, make your changes, and open a pull request.

License

AgaveChem is licensed under the MIT license.

References