- California Digital Library
- Los Angeles, CA
- https://orcid.org/0000-0003-1507-1031
Stars
The Citation File Format lets you provide citation metadata for software or datasets in plaintext files that are easy to read by both humans and machines.
Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
Dockerized FastAPI wrapper for the Kokoro-82M text-to-speech model with CPU ONNX and NVIDIA GPU PyTorch support, handling, and auto-stitching
A FIG Driver written in JavaScript which aims to fully implement the FIGfont spec.
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
Open-source project for data preparation, aimed at LLM application builders
OpenOCR: A general OCR system built for accuracy and efficiency. Supports 24 scene text recognition methods trained from scratch on large-scale real datasets, with the latest methods to be added over time.
(CVPR 2024) Bridging the Gap Between End-to-End and Two-Step Text Spotting.
The repository for team Grasshoppers of the Open Science course, academic year 2020/2021
University Domains and Names Data List & API
Unlock custom brushes, natural fill effects and intuitive hatching in p5.js
Generalist and Lightweight Model for Named Entity Recognition (extracts any entity type from text) @ NAACL 2024
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis
MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.
The Data Change Processing platform
ETL, Analytics, Versioning for Unstructured Data
Entropy Based Sampling and Parallel CoT Decoding
GuwenModels: A collection of Classical Chinese natural language processing models, covering Classical Chinese models and related resources available on the Internet.
Finding mentions and citations of named and implicit research datasets within the academic literature
Ideas and suggestions from the DataCite staff and community about product features, metadata schema, and more
Use late-interaction multi-modal models such as ColPali in just a few lines of code.