Organizations

@groonga @mroonga @ipnexus @cleanhearing @patentfield


Training LLMs with QLoRA + FSDP

Jupyter Notebook 1,457 189 Updated Nov 9, 2024

Build LLM-powered applications in Ruby

Ruby 1,659 223 Updated Feb 19, 2025

Language-Agnostic SEntence Representations

Jupyter Notebook 3,619 462 Updated May 2, 2024

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 20,858 2,602 Updated Mar 4, 2025

Incremental Skip-gram Model with Negative Sampling

Shell 69 8 Updated Jun 30, 2019

Word2Vec naïve version from scratch vs Word2Vec parallelized version.

Jupyter Notebook 1 Updated Aug 4, 2022

Package for evaluating word embeddings

Python 437 111 Updated Jan 4, 2021

RiverText is a framework that standardizes the Incremental Word Embeddings proposed in the state of the art. Please feel welcome to open an issue in case you have any questions or a pull request if you…

Python 22 1 Updated Feb 26, 2025

Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

C++ 13,552 1,185 Updated Jul 29, 2024

🍇 GRAPE is a Rust/Python Graph Representation Learning library for Predictions and Evaluations

Jupyter Notebook 566 39 Updated Feb 24, 2024

A collection of ORM-style clients to public patent data

Python 101 39 Updated Feb 19, 2025

Painterro - JavaScript painting plugin

JavaScript 648 88 Updated Sep 18, 2024

🔥 Use pre-trained models in PyTorch to extract vector embeddings for any image

Python 598 96 Updated Dec 23, 2023

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…

Python 33,407 4,861 Updated Feb 23, 2025

Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.

Python 3,318 258 Updated Oct 18, 2024

Header-only C++/Python library for fast approximate nearest neighbors

C++ 4,572 680 Updated Aug 11, 2024

Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.

C++ 3,450 457 Updated Sep 21, 2024

FAst Lookups of Cosine and Other Nearest Neighbors (based on fast locality-sensitive hashing)

C 1,150 194 Updated Jun 1, 2024

Hash function quality and speed tests

C++ 1,946 183 Updated Jan 18, 2025

SIMD (SSE) population count --- http://0x80.pl/articles/sse-popcount.html

C++ 335 50 Updated Apr 1, 2024

JavaScript Canvas Library, SVG-to-Canvas (& canvas-to-SVG) Parser

JavaScript 29,720 3,551 Updated Feb 16, 2025

Zest is a compression-based text classifier using Meta's Zstandard compression algorithm. Zest is language-agnostic and this approach simplifies configuration, avoids careful feature extraction and…

Python 5 Updated Jan 15, 2022

Datasets and SOTA results for every field of Chinese NLP

HTML 1,803 271 Updated Apr 7, 2022

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821

Python 3,510 520 Updated Oct 16, 2024

PyTorch version of BERT-whitening

Python 308 44 Updated Oct 9, 2021

PISA: Performant Indexes and Search for Academia

C++ 972 66 Updated Feb 24, 2025

BERT models for Japanese text.

Python 530 55 Updated Mar 23, 2024

PyTorch code for SpERT: Span-based Entity and Relation Transformer

Python 694 146 Updated Feb 1, 2024