molll: Data Driven Estimation of Molecular Log-Likelihood using Fingerprint Key Counting

This software provides models for estimating the likelihood that a molecule belongs to a specific dataset, based on simple fingerprint key counting. The models, AtomLL and MolLL, are designed for outlier detection and class membership assignment, and they have potential applications in molecular generation and optimization. PropLL is also included; it uses scikit-learn kernel density estimates on RDKit-derived and user-selectable properties.
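To illustrate the general idea of scoring by key counting, here is a minimal sketch in plain Python. It is not the molll implementation: the function names, the add-one smoothing, and the toy integer keys are all assumptions made for illustration, and the real models may count and normalize keys differently.

```python
import math
from collections import Counter


def fit_key_counts(dataset):
    """Count in how many items of the dataset each key occurs (hypothetical helper)."""
    counts = Counter()
    for keys in dataset:
        counts.update(set(keys))
    return counts, len(dataset)


def log_likelihood(keys, counts, n_items):
    """Sum the log of the smoothed frequency of each key (assumed scoring rule)."""
    total = 0.0
    for key in set(keys):
        # Add-one smoothing keeps keys never seen in the dataset finite.
        freq = (counts.get(key, 0) + 1) / (n_items + 2)
        total += math.log(freq)
    return total


# Toy data: each "molecule" is a set of made-up fingerprint keys.
reference = [{1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {1, 2, 3}]
counts, n_items = fit_key_counts(reference)

typical = log_likelihood({1, 2, 3}, counts, n_items)
unusual = log_likelihood({1, 7, 9}, counts, n_items)
print(typical > unusual)
```

A molecule whose keys are common in the reference data scores higher than one carrying keys the reference data never contained, which is what makes such a score usable for outlier detection.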

Installation

Install a tagged release from PyPI:

pip install molecule_ll

or directly from the repository without cloning:

pip install git+https://github.com/EBjerrum/molll.git

If you want to tinker and contribute, then clone and install in editable mode:

git clone git@github.com:EBjerrum/molll.git  # or your own fork on GitHub
cd molll
pip install -e .

Usage

The code works on lists of RDKit Mol objects:

from molll import MolLL

molll = MolLL()
molll.analyze_dataset(mols_list)  # analyze the reference molecules
molll.calculate_lls(other_or_same_mols)  # score a list of molecules
# Or a single Mol object
molll.calculate_ll(single_mol)
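For outlier detection, one simple option is to compare new scores against a low percentile of the scores of the reference dataset. The sketch below uses only the standard library and assumes that `calculate_lls` yields one numeric score per molecule, which this README does not state explicitly. `flag_outliers`, the 5th-percentile cutoff, and the example scores are hypothetical.

```python
import statistics


def flag_outliers(reference_lls, query_lls, percentile=5):
    """Flag query scores below the given percentile of the reference scores."""
    # With n=100, statistics.quantiles returns the 99 percentile cut points.
    cut_points = statistics.quantiles(reference_lls, n=100)
    threshold = cut_points[percentile - 1]
    return [ll < threshold for ll in query_lls], threshold


# Made-up scores standing in for the output of calculate_lls.
reference_lls = [-12.0, -11.5, -13.2, -12.8, -11.9, -12.4, -13.0, -12.1]
query_lls = [-12.3, -25.7]

flags, threshold = flag_outliers(reference_lls, query_lls)
print(flags)
```

The cutoff is a judgment call: a lower percentile flags fewer molecules, and a suitable value depends on the dataset and the application.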

Saving and loading from a text-based format:

molll.save("MySaveFile.json")

molll_clone = MolLL()
molll_clone.load("MySaveFile.json")

For convenience, some classes with precomputed data are available, currently based on the LibInvent training data:

from molll import LibInventMolLLr1
molll = LibInventMolLLr1()
molll.calculate_lls(mols_list)

Additional Reading

There's a preprint on ChemRxiv with some example usages: https://doi.org/10.26434/chemrxiv-2024-hzddj
