
Catalog of Open Source Molecular Modeling Projects

CSS 1 Updated Apr 4, 2019

OCR, layout analysis, reading order, table recognition in 90+ languages

Python 14,606 922 Updated Dec 13, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 31,871 4,844 Updated Dec 14, 2024

A Repo For Document AI

Python 2,623 143 Updated Dec 11, 2024

Making data higher-quality, juicier, and more digestible for foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Python 3,101 186 Updated Dec 13, 2024

ChemDataExtractor Version 2.0

HTML 138 31 Updated Jul 4, 2024

This repo contains ReactionDataExtractor v.2 - software toolkit for extraction of information from chemical reaction schemes

Python 22 4 Updated Oct 17, 2023

Extraction of action sequences from experimental procedures

Python 39 11 Updated Oct 13, 2023

Toolkit for Chemical Reaction Extraction from Scientific Literature (JCIM 2021)

Python 74 20 Updated Mar 26, 2022

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

Python 25,331 2,441 Updated Dec 13, 2024

Parse files for optimal RAG

Python 3,351 322 Updated Dec 13, 2024

LLM-based text extraction from unstructured data such as PDFs, Word documents, and HTML pages. Transform and cluster the text into your desired format. Less information loss, more interpretation, and faster R&D!

Python 191 56 Updated May 26, 2024

The Universe of Data. All about data, data science, and data engineering

Python 521 52 Updated Jul 18, 2024
Python 275 31 Updated Jun 13, 2024

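Llama3 and Llama3.1 Chinese repository (companion book in progress... interesting fine-tuned and heavily modified weights from community users and vendors, plus tutorial videos and docs on training, inference, evaluation, and deployment)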

Python 4,074 337 Updated Sep 16, 2024

overview of datasets for ML in chemistry

272 28 Updated Jul 24, 2024
Python 12 3 Updated Oct 25, 2022

Parsers for scientific papers (PDF2JSON, TEX2JSON, JATS2JSON)

Python 350 65 Updated Apr 11, 2024

S2ORC: The Semantic Scholar Open Research Corpus: https://www.aclweb.org/anthology/2020.acl-main.447/

Python 844 65 Updated Apr 26, 2024

Python PDF parser for scientific publications: content and figures

Python 367 55 Updated Mar 21, 2024

Community maintained fork of pdfminer - we fathom PDF

Python 6,017 936 Updated Aug 2, 2024

ChatGLM-6B: An Open Bilingual Dialogue Language Model

Python 40,818 5,233 Updated Jun 27, 2024

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 8,013 474 Updated May 3, 2024

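A practical interactive interface for LLMs such as GPT and GLM, specially optimized for reading, polishing, and writing papers. Modular design; supports custom shortcut buttons and function plugins; project analysis and self-explanation for Python, C++, and other languages; translation and summarization of PDF/LaTeX papers; parallel queries to multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, m…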

Python 66,319 8,135 Updated Dec 9, 2024

A PyTorch-based knowledge distillation toolkit for natural language processing

Python 1,609 239 Updated May 8, 2023

"What Descartes did was a good step. You have added much several ways, and especially in taking the colours of thin plates into philosophical consideration. If I have seen a little further it is by…

8 Updated Nov 30, 2020
Python 215 37 Updated Sep 2, 2024

Firefly: a training tool for large models, supporting Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models

Python 5,945 532 Updated Oct 24, 2024

Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment

Python 18,507 1,875 Updated Apr 30, 2024

Fine-tuning large models with instruction tuning.

Python 8 2 Updated Jun 28, 2023