Stars
[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
A lightweight platform-accelerated library for biological motif scanning using position weight matrices.
Kun-peng: an ultra-fast, low-memory footprint and accurate taxonomy classifier for all
Deep functional residue identification
Official repository for the Boltz-1 biomolecular interaction model
A trainable PyTorch reproduction of AlphaFold 3.
A bioinformatics workflow engine built on top of the Workflow Description Language (WDL).
Official Implementation of DPLM (ICML'24) - Diffusion Language Models Are Versatile Protein Learners
User-friendly and accurate binder design pipeline
Deep networks for protein functional inference
A runtime for writing reliable asynchronous applications with Rust. Provides I/O, networking, scheduling, timers, ...
Chai-1, SOTA model for biomolecular structure prediction
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
Efficient Triton Kernels for LLM Training
PLM-based active learning model for protein engineering
Metabuli: specific and sensitive metagenomic classification via joint analysis of DNA and amino acid.
Protein Sequence Annotation with Language Models
Joint embedding of protein sequence and structure with discrete and continuous compressions of protein folding model latent spaces. https://www.biorxiv.org/content/10.1101/2024.08.06.606920v1
Rank homologous protein sequences based on a regressor trained on experimental measures
Cython bindings and Python interface to FAMSA, an algorithm for ultra-scale multiple sequence alignments.