Stars
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
🧑‍🚀 Summary of the world's best LLM resources (data processing, model training, model deployment, o1 models, small language models, vision-language models).
Official electron build of draw.io
Hosts a number of bilingual Mayan-Spanish corpora
ChroniclingAmericaQA: A Large-scale Question Answering Dataset based on Historical American Newspaper Pages
PyTorch implementation of "UNIT: Unifying Image and Text Recognition in One Vision Encoder", NeurIPS 2024.
[ACL 2024] Code and data for the paper: LogogramNLP
The official repo for the paper "LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods".
Collection of training data management explorations for large language models
[Survey] Awesome List of Mixup Augmentation and Beyond (https://arxiv.org/abs/2409.05202)
A zero-shot faithfulness evaluation metric for text summarization
Code and Dataset for EMNLP 2024 Findings Paper
Tesseract Open Source OCR Engine (main repository)
[NeurIPS 2024] Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning
[ACL 2024] ProtLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-Training
Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation
Code for the paper: Towards Semantic Equivalence of Tokenization in Multimodal LLM
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
A high-throughput and memory-efficient inference and serving engine for LLMs
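The original of this article was written by the well-known hacker Eric S. Raymond; it teaches you how to ask technical questions properly and get answers that satisfy you.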
List of useful data augmentation resources. Here you will find some uncommon techniques, libraries, links to GitHub repos, papers, and more.