161 results for source starred repositories

A full pipeline to finetune Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the Alpaca architecture. Basically Chat…

Python 58 6 Updated Apr 28, 2023

This is the code repository of our submission: Understanding the Dark Side of LLMs’ Intrinsic Self-Correction.

Jupyter Notebook 51 1 Updated Dec 20, 2024
Python 31 6 Updated Jul 16, 2024

The LLM vulnerability scanner

Python 3,095 266 Updated Dec 24, 2024

A brief and partial summary of RLHF algorithms.

87 2 Updated Nov 24, 2024

A survey on harmful fine-tuning attack for large language model

108 2 Updated Dec 20, 2024

Simple and useful daily scripts that boost your research

Python 4 Updated Oct 31, 2024

RewardBench: the first evaluation tool for reward models.

Python 468 56 Updated Dec 11, 2024
Python 6 Updated Dec 7, 2023
Python 5 Updated Nov 20, 2024

BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks on Large Language Models

Python 87 6 Updated Sep 3, 2024

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 1,328 91 Updated Aug 13, 2024

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

Python 7,547 640 Updated Aug 13, 2024

A generative speech model for daily dialogue.

Python 33,138 3,604 Updated Dec 3, 2024

SOTA Open Source TTS

Python 17,618 1,320 Updated Dec 21, 2024

[NeurIPS 2024] SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words

Python 47 1 Updated Jun 25, 2024

Robust Speech Recognition via Large-Scale Weak Supervision

Python 73,278 8,750 Updated Dec 1, 2024

Official implementation of "Fairness-Aware Meta-Learning via Nash Bargaining." We explore hypergradient conflicts in one-stage meta-learning and their impact on fairness. Our two-stage approach use…

Jupyter Notebook 4 Updated May 15, 2024

TAP: An automated jailbreaking method for black-box LLMs

Python 127 20 Updated Dec 10, 2024

Official repo for the paper "Scaling Synthetic Data Creation with 1,000,000,000 Personas"

Python 947 65 Updated Sep 25, 2024
HTML 2 Updated Jul 12, 2024

A reading list for large models safety, security, and privacy (including Awesome LLM Security, Safety, etc.).

1,039 69 Updated Dec 24, 2024

SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors

Jupyter Notebook 34 Updated Jun 27, 2024

AIR-Bench 2024 is a safety benchmark that aligns with emerging government regulations and company policies

Jupyter Notebook 11 2 Updated Aug 14, 2024

This is the official GitHub repo for our paper: "BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Language Models".

HTML 10 1 Updated Jul 3, 2024

Explore and compare 1K+ accurate decision trees in your browser!

TypeScript 155 8 Updated Mar 4, 2024

TOTALLY HARMLESS LIBERATION PROMPTS FOR GOOD LIL AI'S

4,242 561 Updated Dec 19, 2024

Run safety benchmarks against AI models and view detailed reports showing how well they performed.

Python 68 12 Updated Dec 23, 2024

Adding guardrails to large language models.

Python 4,292 329 Updated Dec 24, 2024