CarolineTong

2 followers · 47 following

Starred repositories

325 results for source starred repositories

opendatalab / MinerU

A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON formats.

Python 23,137 1,677 Updated Jan 3, 2025

lukas-blecher / LaTeX-OCR

pix2tex: Using a ViT to convert images of equations into LaTeX code.

Python 13,202 1,055 Updated Dec 5, 2024

deepseek-ai / DeepSeek-V3

Python 15,461 1,131 Updated Jan 5, 2025

openai / prm800k

800,000 step-level correctness labels on LLM solutions to MATH problems

Python 1,782 104 Updated Jun 1, 2023

Genesis-Embodied-AI / Genesis

A generative world for general-purpose robotics & embodied AI learning.

Python 21,632 1,716 Updated Jan 5, 2025

modelscope / ms-swift

Use PEFT or Full-parameter to finetune 400+ LLMs (Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, ...) or 150+ MLLMs (Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, Inter…

Python 4,894 429 Updated Jan 4, 2025

QwenLM / ProcessBench

Python 104 3 Updated Dec 17, 2024

KwaiVGI / Koala-36M

Official implementation of the paper "Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Video Content".

Python 85 3 Updated Nov 8, 2024

Ucas-HaoranWei / GOT-OCR2.0

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 6,480 566 Updated Dec 31, 2024

QwenLM / Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 5,271 401 Updated Aug 7, 2024

karpathy / nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 38,301 6,170 Updated Dec 9, 2024

hiyouga / LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Python 37,288 4,596 Updated Jan 4, 2025

QwenLM / Qwen2.5

Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 11,498 695 Updated Dec 24, 2024

microsoft / LoRA

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Python 11,045 698 Updated Dec 17, 2024

ociubotaru / transcripts

421 186 Updated Sep 11, 2024

deepseek-ai / DeepSeek-V2

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

3,997 196 Updated Sep 25, 2024

mlfoundations / dclm

DataComp for Language Models

HTML 1,194 108 Updated Dec 11, 2024

salesforce / CodeGen

CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.

Python 4,972 382 Updated Mar 17, 2024

deepseek-ai / DeepSeek-Coder

DeepSeek Coder: Let the Code Write Itself

Python 9,618 629 Updated May 21, 2024

wenbochang888 / house

The full version is available as a PDF download.

Java 3,222 403 Updated Nov 28, 2024

minyoungg / platonic-rep

Python 485 32 Updated Jul 29, 2024

jondurbin / airoboros

Customizable implementation of the self-instruct paper.

Python 1,034 71 Updated Mar 7, 2024

ur-whitelab / chemcrow-public

Chemcrow

Python 663 103 Updated Dec 19, 2024

e2b-dev / awesome-ai-agents

A list of AI autonomous agents

12,659 941 Updated Jan 2, 2025

microsoft / JARVIS

JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf

Python 23,853 1,982 Updated Sep 26, 2024

AntonOsika / gpt-engineer

Platform to experiment with the AI Software Engineer. Terminal based. NOTE: Very different from https://gptengineer.app

Python 52,789 6,872 Updated Nov 17, 2024

apache / camel

Apache Camel is an open source integration framework that empowers you to quickly and easily integrate various systems consuming or producing data.

Java 5,665 4,974 Updated Jan 4, 2025

THUDM / GLM-4

GLM-4 series: Open Multilingual Multimodal Chat LMs

Python 5,669 474 Updated Dec 31, 2024

THUDM / CodeGeeX

CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)

Python 8,325 610 Updated Aug 13, 2024

OpenBMB / ChatDev

Create Customized Software using Natural Language Idea (through LLM-powered Multi-Agent Collaboration)

Python 26,086 3,266 Updated Dec 30, 2024

Starred topics

Natural language processing

named-entity-recognition