# rxnfp

**Repository Path**: dqhe/rxnfp

## Basic Information

- **Project Name**: rxnfp
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-09-08
- **Last Updated**: 2020-12-19

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# RXNFP - chemical reaction fingerprints

> This library generates chemical reaction fingerprints from reaction SMILES

## Install

For all installations, we recommend using `conda` to get the necessary `rdkit` and `tmap` dependencies:

### From PyPI

```console
conda create -n rxnfp python=3.6 -y
conda activate rxnfp
conda install -c rdkit rdkit
conda install -c tmap tmap
pip install rxnfp
```

### From GitHub

```console
conda create -n rxnfp python=3.6 -y
conda activate rxnfp
conda install -c rdkit rdkit
conda install -c tmap tmap
git clone git@github.com:rxn4chemistry/rxnfp.git
cd rxnfp
pip install -e .
```

## How to use

Compute a fingerprint from a reaction SMILES:

```python
from rxnfp.transformer_fingerprints import (
    RXNBERTFingerprintGenerator, get_default_model_and_tokenizer, generate_fingerprints
)

model, tokenizer = get_default_model_and_tokenizer()

rxnfp_generator = RXNBERTFingerprintGenerator(model, tokenizer)

example_rxn = "Nc1cccc2cnccc12.O=C(O)c1cc([N+](=O)[O-])c(Sc2c(Cl)cncc2Cl)s1>>O=C(Nc1cccc2cnccc12)c1cc([N+](=O)[O-])c(Sc2c(Cl)cncc2Cl)s1"

fp = rxnfp_generator.convert(example_rxn)
print(len(fp))
print(fp[:5])
```

    256
    [-2.0174953937530518, 1.7602033615112305, -1.3323537111282349, -1.1095019578933716, 1.2254549264907837]

Or for a list of reactions:

```python
rxns = [example_rxn, example_rxn]
fps = rxnfp_generator.convert_batch(rxns)
print(len(fps), len(fps[0]))
```

    2 256

## Reaction Atlas

### Pistachio

The fingerprints can be used to map the space of chemical reactions:
Figure: Annotated Atlas of the Pistachio test set generated with TMAP.
Figure: Reaction atlas of 50k data set with different properties highlighted.
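
Mapping reaction space like this depends on comparing fingerprints with one another. As a minimal sketch of that idea, the snippet below scores fingerprints by cosine similarity and picks the closest match from a small list. The helper names `cosine_similarity` and `nearest_neighbor` are hypothetical and not part of `rxnfp`, the short vectors are toy stand-ins for real 256-dimensional fingerprints, and cosine similarity is an assumption made for illustration rather than the metric used to build the atlases.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors of floats."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def nearest_neighbor(query, candidates):
    """Return (index, similarity) of the candidate most similar to query."""
    scores = [cosine_similarity(query, c) for c in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Toy 4-dimensional stand-ins (assumption); real fingerprints have 256 entries.
fp_query = [1.0, 0.0, 2.0, -1.0]
fp_library = [
    [-1.0, 0.0, -2.0, 1.0],  # points in the opposite direction
    [2.0, 0.0, 4.0, -2.0],   # same direction, different magnitude
    [0.0, 1.0, 0.0, 0.0],    # orthogonal to the query
]

index, score = nearest_neighbor(fp_query, fp_library)
print(index, round(score, 6))
```

With real data, `fp_query` and `fp_library` would be replaced by the lists returned from `rxnfp_generator.convert` and `rxnfp_generator.convert_batch`. For large collections a vectorized or approximate nearest-neighbor approach would be more practical than this pure-Python loop.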