MTTM: Metamorphic Testing for Textual Content Moderation Software

Python 32 1 Updated Feb 10, 2023
Jupyter Notebook 8 1 Updated Feb 19, 2025

basically all the things I used for this article

Python 24 Updated Jan 8, 2025
HTML 29 Updated Feb 20, 2025

Narrative movie understanding benchmark

Python 66 Updated May 9, 2024
Python 28 2 Updated Feb 19, 2025

Code and data for our paper "On the Resilience of Multi-Agent Systems with Malicious Agents"

Python 15 Updated Jan 28, 2025

A novel approach to improving the safety of large language models, enabling them to transition effectively from an unsafe state to a safe one.

Python 58 Updated Jan 25, 2025

Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks

Python 1,874 273 Updated Feb 21, 2025

Benchmarking LLMs' Gaming Ability in Multi-Agent Environments

Jupyter Notebook 67 Updated Feb 9, 2025

TransferAttack is a pytorch framework to boost the adversarial transferability for image classification.

Python 327 45 Updated Dec 26, 2024

[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception

Python 534 37 Updated May 8, 2024

SpyGame: An interactive multi-agent framework to evaluate intelligence with large language models :D

Python 14 Updated Nov 9, 2023
Python 35 4 Updated Jan 9, 2025

Multilingual safety benchmark for Large Language Models

48 2 Updated Sep 1, 2024

Benchmarking LLMs' Psychological Portrayal

Python 105 3 Updated Dec 31, 2024

✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models. The first work to correct hallucinations in MLLMs

Python 630 31 Updated Dec 23, 2024

👨‍💻 An awesome and curated list of best code-LLM for research.

1,138 64 Updated Dec 10, 2024

A framework to evaluate the generalization capability of safety alignment for LLMs

Python 585 64 Updated Dec 31, 2024

Benchmarking LLMs' Emotional Alignment with Humans

Python 94 5 Updated Feb 9, 2025

Recent papers on (1) Psychology of LLMs; (2) Biases in LLMs.

46 Updated Nov 3, 2023

[TACL 2024] MAPS enables LLMs🤖 to mimic the human😁 translation process.

Python 141 5 Updated Jun 7, 2024

Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration

Python 1,544 120 Updated Jan 1, 2025

MAD: The first work to explore Multi-Agent Debate with Large Language Models :D

Python 331 35 Updated Jan 14, 2025

The ParroT framework enhances and regulates translation abilities during chat, based on open-source LLMs (e.g., LLaMA-7b, Bloomz-7b1-mt) and human-written translation and evaluation data.

Python 174 23 Updated Dec 31, 2024

A preliminary evaluation of ChatGPT/GPT-4 for machine translation.

Python 244 16 Updated Dec 31, 2024

This is the tool released with the ICSE 2022 paper "Static Inference Meets Deep Learning: A Hybrid Type Inference Approach for Python"

Python 41 6 Updated Oct 19, 2023

Must-read papers on graph neural networks (GNN)

16,235 3,008 Updated Dec 20, 2023

BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)

Python 7,147 798 Updated Aug 24, 2023