ForgerySleuth: Empowering Multimodal Large Language Models for Image Manipulation Detection and Analyzing

Abstract

In this work, we explore the potential of multimodal large language models for the image manipulation detection task. We construct ForgeryAnalysis, a dataset of forgery analysis text annotations. Each entry was initially generated by GPT-4o and then reviewed by experts. The proposed data engine, ForgeryAnalyst, enables the creation of a larger-scale dataset, ForgeryAnalysis-PT, for pre-training. We also propose ForgerySleuth, which leverages a multimodal large language model to perform comprehensive clue fusion and to generate segmentation outputs that indicate the tampered regions. More details about our work can be found in the paper.

Install

conda create --name <env> --file requirements.txt

ForgeryAnalyst Data Engine

Automatic Annotation

You can use the data engine ForgeryAnalyst-llava-13B to automatically annotate forgery analysis text for images that already have tampered region masks:

python run_engine.py --model-path Zhihao18/ForgeryAnalyst-llava-13B --image-path <path_to_image> --mask-path <path_to_mask> --manipulation-type <manipulation_type> --output-path <path_to_save_output>
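To annotate a whole folder rather than a single image, the command above can be wrapped in a small driver script. The following is a minimal sketch, not part of the repository: the helper names `build_command` and `annotate_folder` are hypothetical, it assumes each mask shares its file stem with the image, it assumes `--output-path` accepts a per-image file path, and the manipulation type string you pass must be one that `run_engine.py` actually accepts.

```python
import subprocess
import sys
from pathlib import Path

MODEL_PATH = "Zhihao18/ForgeryAnalyst-llava-13B"


def build_command(image, mask, manipulation_type, output, model_path=MODEL_PATH):
    """Return the run_engine.py argument list for one image/mask pair."""
    return [
        sys.executable, "run_engine.py",
        "--model-path", model_path,
        "--image-path", str(image),
        "--mask-path", str(mask),
        "--manipulation-type", manipulation_type,
        "--output-path", str(output),
    ]


def annotate_folder(image_dir, mask_dir, output_dir, manipulation_type, dry_run=False):
    """Build one command per image that has a same-stem mask; run it unless dry_run."""
    # Assumption: masks are matched to images by file stem (e.g. a.jpg <-> a.png).
    masks = {p.stem: p for p in Path(mask_dir).iterdir() if p.is_file()}
    commands = []
    for image in sorted(Path(image_dir).iterdir()):
        mask = masks.get(image.stem)
        if not image.is_file() or mask is None:
            continue  # skip entries that are not files or have no matching mask
        # Assumption: one output file per image, named after the image stem.
        output = Path(output_dir) / f"{image.stem}.txt"
        cmd = build_command(image, mask, manipulation_type, output)
        commands.append(cmd)
        if not dry_run:
            subprocess.run(cmd, check=True)
    return commands
```

Calling `annotate_folder(..., dry_run=True)` first returns the commands without executing them, which makes it easy to check the pairing before loading the 13B model.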

Authentic Image Analysis Generation

To ensure consistency in the training data, for authentic images, you can use ShareCaptioner to generate detailed image captions and then organize them in the Chain-of-Clues format.

python run_sharecaptioner.py --model-path Lin-Chen/ShareCaptioner --image-path <path_to_image> --output-path <path_to_save_output>

Tip: You can download the ShareCaptioner weights in advance and pass local_files_only=True when loading the model to force the use of local weights, which avoids potential network issues.
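One way to follow this tip is sketched below. It is an illustration under assumptions, not the code in `run_sharecaptioner.py`: it assumes the `huggingface_hub` and `transformers` packages are installed, that the model loads through `AutoModelForCausalLM` with `trust_remote_code=True`, and the helper names `download_weights` and `load_local` are hypothetical.

```python
# Keyword arguments that keep from_pretrained() from contacting the Hub.
LOCAL_LOAD_KWARGS = {"local_files_only": True, "trust_remote_code": True}


def download_weights(repo_id="Lin-Chen/ShareCaptioner"):
    """Download the model repository once and return the local snapshot path."""
    from huggingface_hub import snapshot_download  # imported lazily

    return snapshot_download(repo_id=repo_id)


def load_local(model_dir):
    """Load tokenizer and model strictly from files already on disk."""
    from transformers import AutoModelForCausalLM, AutoTokenizer  # imported lazily

    tokenizer = AutoTokenizer.from_pretrained(model_dir, **LOCAL_LOAD_KWARGS)
    model = AutoModelForCausalLM.from_pretrained(model_dir, **LOCAL_LOAD_KWARGS)
    return tokenizer, model.eval()
```

Run `download_weights()` once while the network is available, then use the returned directory for later loads. Whether `--model-path` in `run_sharecaptioner.py` accepts such a local directory should be checked against the script itself.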

ForgeryAnalysis Dataset

ForgeryAnalysis-PT

Overview

The ForgeryAnalysis-PT dataset consists of forgery analysis texts automatically generated by our data engine, ForgeryAnalyst. The dataset corresponds to two publicly available image manipulation detection datasets: CASIA2 and MIML. Each entry in the dataset provides forgery analysis for a corresponding tampered image, including clues and explanations structured in a Chain-of-Clues format.

Usage

Before using this dataset, download the original CASIA2 and MIML datasets from the respective public repositories, as ForgeryAnalysis-PT relies on these datasets for the corresponding tampered images.

The tampering analysis for each image is saved as a .txt file with the same name as the tampered image in the original CASIA2 and MIML datasets. You can download this dataset from the following link: Google Drive.
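Because each analysis file shares its name with the tampered image, the two can be joined by file stem after both are downloaded. The sketch below assumes that "same name" means the same stem with a `.txt` extension, that the analysis files sit in a single directory, and that they are UTF-8 encoded; the function name `pair_images_with_analysis` and the list of image extensions are illustrative choices.

```python
from pathlib import Path

# Assumption: image extensions that may appear in CASIA2 and MIML.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff", ".bmp"}


def pair_images_with_analysis(image_dir, analysis_dir):
    """Return (image_path, analysis_text) for every image with a same-stem .txt file."""
    analyses = {p.stem: p for p in Path(analysis_dir).glob("*.txt")}
    pairs = []
    for image in sorted(Path(image_dir).rglob("*")):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # ignore non-image files and directories
        txt = analyses.get(image.stem)
        if txt is not None:
            pairs.append((image, txt.read_text(encoding="utf-8")))
    return pairs
```

Images without a matching analysis file are skipped silently; comparing `len(pairs)` against the number of tampered images is a quick way to spot a wrong directory or naming mismatch.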

License

The ForgeryAnalysis-PT dataset is freely available for academic research and development. However, you must respect the terms and conditions of the original datasets, CASIA2 and MIML.

ForgerySleuth Assistant (TODO)

Evaluation Dataset

We used several publicly available and widely used datasets to evaluate the performance of image manipulation detection (IMD) methods. You can access the original repositories and download the data through the following links:

Dataset	Paper	Download URL
Columbia	Detecting Image Splicing Using Geometry Invariants And Camera Characteristics Consistency	https://www.ee.columbia.edu/ln/dvmm/downloads/authsplcuncmp
CASIA	CASIA Image Tampering Detection Evaluation Database	[Unofficial] https://github.com/namtpham/casia1groundtruth
CASIA	CASIA Image Tampering Detection Evaluation Database	[Unofficial] https://github.com/namtpham/casia2groundtruth
Coverage	COVERAGE - A Novel Database for Copy-move Forgery Detection	https://github.com/wenbihan/coverage
NIST16	MFC Datasets: Large-Scale Benchmark Datasets for Media Forensic Challenge Evaluation	https://mfc.nist.gov/users/sign_in
IMD20	IMD2020: A Large-Scale Annotated Dataset Tailored for Detecting Manipulated Images	https://staff.utia.cas.cz/novozada/db
COCOGlide	TruFor: Leveraging all-round clues for trustworthy image forgery detection and localization	https://github.com/grip-unina/TruFor?tab=readme-ov-file#cocoglide-dataset

Citation

If you find this project useful for your research and applications, please cite using this BibTeX:

@misc{sun2024forgerysleuth,
      title={ForgerySleuth: Empowering Multimodal Large Language Models for Image Manipulation Detection}, 
      author={Sun, Zhihao and Jiang, Haoran and Chen, Haoran and Cao, Yixin and Qiu, Xipeng and Wu, Zuxuan and Jiang, Yu-Gang},
      publisher={arXiv:2411.19466},
      year={2024},
      url={https://arxiv.org/abs/2411.19466}, 
}

Acknowledgment

This work builds upon LLaVA, LISA, and SAM.
We used ChatGPT and ShareCaptioner during dataset creation and model evaluation.

