Low-code framework for building custom LLMs, neural networks, and other AI models
Updated Dec 2, 2024 - Python
Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.
A curated, but incomplete, list of data-centric AI resources.
Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]
The toolkit to test, validate, and evaluate your models and surface, curate, and prioritize the most valuable data for labeling.
DataCLUE: A data-centric NLP benchmark and toolkit
Rust implementation of the Data Distribution Service (DDS)
[ICLR'23] Implementation of "Empowering Graph Representation Learning with Test-Time Graph Transformation"
Simulator framework for analysis of performance, energy consumption, area and cost of multi-node multi-chiplet tile-based manycore designs
A data-centric NER annotation tool for your Named Entity Recognition projects
Vue form with Laravel-inspired validation and a simply enjoyable error messages API. (Form API, Validator API, Rules API, Error Messages API)
An observer is a wrapper over JSON data that provides an interface to know when data has changed, with a focus on performance and memory efficiency.
Code for a top 5% finish in the Data-Centric AI Competition organized by Andrew Ng and DeepLearning.AI
From local functions to cloud deployed pipelines
Jaehyung Kim et al.'s ACL 2023 paper, "infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information"
Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data (NeurIPS 2022)
Data-SUITE: Data-centric identification of in-distribution incongruous examples (ICML 2022)
Quickly set up an image labelling web application for manually tagging images for machine learning tasks.
Open-source Data Backend written in Java and based on PostgreSQL & GraphQL.
The official Python library for Openlayer, the Continuous Model Improvement Platform for AI. 📈