taeminlee

Taemin Lee taeminlee

60 followers · 59 following

Korea University, Human-Inspired AI Research
Seoul, South Korea
https://tmkor.com/

Stars

nlpai-lab / KURE

KURE: An embedding model developed at Korea University, specialized for Korean retrieval

Python 119 6 Updated Jan 17, 2025

microsoft / graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system

Python 21,836 2,156 Updated Jan 22, 2025

isaacus-dev / semchunk

A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks.

Python 229 11 Updated Jan 21, 2025

KindXiaoming / pykan

Kolmogorov Arnold Networks

Jupyter Notebook 15,327 1,437 Updated Jan 19, 2025

segment-any-text / wtpsplit

Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.

Python 825 46 Updated Jan 18, 2025

MinishLab / model2vec

The Fastest State-of-the-Art Static Embeddings in the World

Python 611 24 Updated Jan 22, 2025

ictnlp / LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 2,762 186 Updated Nov 14, 2024

m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 13,463 1,461 Updated Jan 20, 2025

matheusbach / legen

Uses AI to locally transcribe speech from media files, generate subtitle files, translate the generated subtitles, insert them into the mp4 container, and burn them directly into the video

Python 190 35 Updated Oct 23, 2024

MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Jupyter Notebook 4,022 363 Updated Dec 18, 2024

JuergenFleiss / aTrain

A GUI tool for offline transcription of speech recordings, including speaker diarization, utilizing state-of-the-art machine learning models.

CSS 409 28 Updated Dec 17, 2024

g8a9 / ferret

A Python package for benchmarking interpretability techniques on Transformers.

Python 212 15 Updated Sep 29, 2024

superheavytail / pklue

Converts standard Korean datasets into a format usable for instruction tuning.

Python 3 Updated Aug 28, 2024

Marker-Inc-Korea / AutoRAG

AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation

Python 3,433 266 Updated Jan 22, 2025

pymupdf / RAG

RAG (Retrieval-Augmented Generation) Chatbot Examples Using PyMuPDF

Python 722 104 Updated Nov 1, 2024

unslothai / unsloth

Finetune Llama 3.3, Mistral, Phi-4, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory

Python 21,050 1,480 Updated Jan 21, 2025

danny-avila / LibreChat

Enhanced ChatGPT Clone: Features Agents, Anthropic, AWS, OpenAI, Assistants API, Azure, Groq, o1, GPT-4o, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message search, Code…

TypeScript 20,683 3,492 Updated Jan 22, 2025

rladmstn1714 / CLIcK

CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean

42 1 Updated Dec 23, 2024

superheavytail / lm-evaluation-by-openai

A framework for benchmarking a model's instruction-following ability

Python 9 Updated Aug 31, 2024

argilla-io / distilabel

Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

Python 1,901 148 Updated Jan 22, 2025

anuraghazra / github-readme-stats

⚡ Dynamically generated stats for your github readmes

JavaScript 70,980 23,634 Updated Jan 22, 2025

Spico197 / Humback

🐋 An unofficial implementation of Self-Alignment with Instruction Backtranslation.

Python 136 9 Updated Jun 23, 2024

meta-llama / llama3

The official Meta Llama 3 GitHub site

Python 28,019 3,214 Updated Aug 12, 2024

aphrodite-engine / aphrodite-engine

Large-scale LLM inference engine

Python 1,245 140 Updated Jan 19, 2025

UpstageAI / evalverse

The Universe of Evaluation. All about the evaluation for LLMs.

Python 221 25 Updated Jul 9, 2024

Vaibhavs10 / insanely-fast-whisper

Jupyter Notebook 7,967 563 Updated Jun 16, 2024

kamalkraj / e5-mistral-7b-instruct

Finetune mistral-7b-instruct for sentence embeddings

Python 75 17 Updated May 2, 2024

michaelfeil / infinity

Infinity is a high-throughput, low-latency serving engine for text-embeddings, reranking models, clip, clap and colpali

Python 1,730 120 Updated Jan 22, 2025

kyegomez / BitNet

Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch

Python 1,744 158 Updated Jan 20, 2025

HeegyuKim / ko-rm-judge

Evaluating language model responses using a reward model

Python 27 2 Updated Feb 23, 2024
