sailfish009 /
Forked from OpenSourceMolecularModeling/OpenSourceMolecularModeling.github.io
Catalog of Open Source Molecular Modeling Projects
OCR, layout analysis, reading order, table recognition in 90+ languages
A high-throughput and memory-efficient inference and serving engine for LLMs
Making data higher-quality, juicier, and more digestible for foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
ChemDataExtractor Version 2.0
This repo contains ReactionDataExtractor v.2, a software toolkit for extracting information from chemical reaction schemes
Extraction of action sequences from experimental procedures
Toolkit for Chemical Reaction Extraction from Scientific Literature (JCIM 2021)
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
LLM-based text extraction from unstructured data such as PDF, Word, and HTML files. Transform and cluster the text into your desired format. Less information loss, more interpretation, and faster R&D!
The Universe of Data. All about data, data science, and data engineering
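Llama3 and Llama3.1 Chinese repository (companion book in progress... interesting weights from community and vendor fine-tuned and modified versions, plus tutorial videos and documentation on training, inference, evaluation, and deployment)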
Overview of datasets for ML in chemistry
Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON)
S2ORC: The Semantic Scholar Open Research Corpus:
Python PDF parser for scientific publications: content and figures
Community maintained fork of pdfminer - we fathom PDF
ChatGLM-6B: An Open Bilingual Dialogue Language Model
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
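A practical interactive interface for GPT, GLM, and other large language models, specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins; project analysis and self-explanation for Python, C++, and other codebases; PDF/LaTeX paper translation and summarization; parallel querying of multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, m…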
A PyTorch-based knowledge distillation toolkit for natural language processing
"What Descartes did was a good step. You have added much several ways, and especially in taking the colours of thin plates into philosophical consideration. If I have seen a little further it is by…
Firefly: a training tool for large models, supporting Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment