Jarviswang94

Wenxuan Wang Jarviswang94

PostDoc@HKUST PhD@CUHK B.Eng@HUST

27 followers · 0 following

Stars

Jarviswang94 / MTTM

MTTM: Metamorphic Testing for Textual Content Moderation Software

Python 32 1 Updated Feb 10, 2023

WebPAI / DCGen

Jupyter Notebook 8 1 Updated Feb 19, 2025

Amaodemao / BiasPainter

basically all the things I used for this article

Python 24 Updated Jan 8, 2025

WebPAI / MRWeb

HTML 29 Updated Feb 20, 2025

yuezih / Movie101

Narrative movie understanding benchmark

Python 66 Updated May 9, 2024

yxwan123 / LogicAsker

Python 28 2 Updated Feb 19, 2025

CUHK-ARISE / MAS-Resilience

Code and data for our paper "On the Resilience of Multi-Agent Systems with Malicious Agents"

Python 15 Updated Jan 28, 2025

RobustNLP / DeRTa

A novel approach to improve the safety of large language models, enabling them to transition effectively from an unsafe to a safe state.

Python 58 Updated Jan 25, 2025

open-compass / VLMEvalKit

Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks

Python 1,874 273 Updated Feb 21, 2025

CUHK-ARISE / GAMABench

Benchmarking LLMs' Gaming Ability in Multi-Agent Environments

Jupyter Notebook 67 Updated Feb 9, 2025

Trustworthy-AI-Group / TransferAttack

TransferAttack is a PyTorch framework to boost adversarial transferability for image classification.

Python 327 45 Updated Dec 26, 2024

shenyunhang / APE

[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception

Python 534 37 Updated May 8, 2024

Skytliang / SpyGame

SpyGame: An interactive multi-agent framework to evaluate intelligence with large language models :D

Python 14 Updated Nov 9, 2023

yxwan123 / BiasAsker

Python 35 4 Updated Jan 9, 2025

Jarviswang94 / Multilingual_safety_benchmark

Multilingual safety benchmark for Large Language Models

48 2 Updated Sep 1, 2024

CUHK-ARISE / PsychoBench

Benchmarking LLMs' Psychological Portrayal

Python 105 3 Updated Dec 31, 2024

BradyFU / Woodpecker

✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models. The first work to correct hallucinations in MLLMs

Python 630 31 Updated Dec 23, 2024

huybery / Awesome-Code-LLM

👨‍💻 An awesome, curated list of the best code LLMs for research.

1,138 64 Updated Dec 10, 2024

penguinnnnn / awesome-vlm-and-society

4 Updated Apr 7, 2024

RobustNLP / CipherChat

A framework to evaluate the generalization capability of safety alignment for LLMs

Python 585 64 Updated Dec 31, 2024

CUHK-ARISE / EmotionBench

Benchmarking LLMs' Emotional Alignment with Humans

Python 94 5 Updated Feb 9, 2025

penguinnnnn / awesome-llm-and-society

Recent papers on (1) Psychology of LLMs; (2) Biases in LLMs.

46 Updated Nov 3, 2023

zwhe99 / MAPS-mt

[TACL 2024] MAPS enables LLMs🤖 to mimic the human😁 translation process.

Python 141 5 Updated Jun 7, 2024

lyuchenyang / Macaw-LLM

Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration

Python 1,544 120 Updated Jan 1, 2025

Skytliang / Multi-Agents-Debate

MAD: The first work to explore Multi-Agent Debate with Large Language Models :D

Python 331 35 Updated Jan 14, 2025

wxjiao / ParroT

The ParroT framework to enhance and regulate translation abilities during chat, based on open-source LLMs (e.g., LLaMA-7b, Bloomz-7b1-mt) and human-written translation and evaluation data.

Python 174 23 Updated Dec 31, 2024

wxjiao / Is-ChatGPT-A-Good-Translator

A preliminary evaluation of ChatGPT/GPT-4 for machine translation.

Python 244 16 Updated Dec 31, 2024

JohnnyPeng18 / HiTyper

This is the tool released with the ICSE 2022 paper "Static Inference Meets Deep Learning: A Hybrid Type Inference Approach for Python"

Python 41 6 Updated Oct 19, 2023

thunlp / GNNPapers

Must-read papers on graph neural networks (GNN)

16,235 3,008 Updated Dec 20, 2023

jessevig / bertviz

BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)

Python 7,147 798 Updated Aug 24, 2023