sunzhihao18/ForgerySleuth

ForgerySleuth: Empowering Multimodal Large Language Models for Image Manipulation Detection and Analyzing

Abstract

In this work, we explore the potential of multimodal large language models for the image manipulation detection task. We construct ForgeryAnalysis, a dataset of forgery analysis text annotations in which each entry was first generated by GPT-4o and then reviewed by experts. The proposed data engine, ForgeryAnalyst, enables the creation of a larger-scale ForgeryAnalysis-PT dataset for pre-training. We also propose ForgerySleuth, which leverages a multimodal large language model to perform comprehensive clue fusion and generate segmentation outputs that indicate the specific tampered regions. More details about our work can be found in the paper.

Install

conda create --name <env> --file requirements.txt

ForgeryAnalyst Data Engine

Automatic Annotation

You can use the data engine ForgeryAnalyst-llava-13B to automatically annotate forgery analysis text for images that already have tampered region masks:

python run_engine.py --model-path Zhihao18/ForgeryAnalyst-llava-13B --image-path <path_to_image> --mask-path <path_to_mask> --manipulation-type <manipulation_type> --output-path <path_to_save_output>
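To annotate a whole folder rather than a single image, the command above can be generated once per image. The following is a minimal sketch, not part of the repository: it assumes that each mask shares its image's file stem, that `--output-path` accepts a file path, and that one manipulation type applies to the whole folder. The helper name `build_engine_commands` and the directory layout are hypothetical.

```python
from pathlib import Path

# Image extensions to pick up; adjust to match your dataset (assumption).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff", ".bmp"}


def build_engine_commands(image_dir, mask_dir, output_dir, manipulation_type,
                          model_path="Zhihao18/ForgeryAnalyst-llava-13B"):
    """Build one run_engine.py argument list per image that has a mask.

    Assumes a mask has the same file stem as its image (hypothetical layout).
    Images without a matching mask are skipped.
    """
    masks = {p.stem: p for p in Path(mask_dir).iterdir() if p.is_file()}
    commands = []
    for image in sorted(Path(image_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        mask = masks.get(image.stem)
        if mask is None:
            continue
        # Assumption: the engine writes its analysis to a single file.
        output = Path(output_dir) / f"{image.stem}.txt"
        commands.append([
            "python", "run_engine.py",
            "--model-path", model_path,
            "--image-path", str(image),
            "--mask-path", str(mask),
            "--manipulation-type", manipulation_type,
            "--output-path", str(output),
        ])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root. Note that every invocation reloads the model, so this is convenient rather than fast.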

Authentic Image Analysis Generation

To keep the training data consistent, you can use ShareCaptioner to generate detailed captions for authentic images and then organize them in the Chain-of-Clues format:

python run_sharecaptioner.py --model-path Lin-Chen/ShareCaptioner --image-path <path_to_image> --output-path <path_to_save_output>

Tip: You can download ShareCaptioner in advance and set local_files_only=True to force the use of local weights, which avoids potential network issues.
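As an illustration of the local_files_only option, the sketch below loads weights from a local folder with Hugging Face Transformers. It is an assumption about how the loading might look, not the code in run_sharecaptioner.py: the helper names are hypothetical, and `trust_remote_code=True` is only needed if the checkpoint ships its own model code.

```python
from pathlib import Path


def offline_load_options(weights_dir):
    """Check that a local weights folder exists and return offline options."""
    path = Path(weights_dir)
    if not path.is_dir():
        raise FileNotFoundError(f"No local weights directory at {path}")
    # local_files_only=True tells Transformers not to contact the Hub.
    # trust_remote_code=True is an assumption (custom model code in the checkpoint).
    return str(path), {"local_files_only": True, "trust_remote_code": True}


def load_sharecaptioner_offline(weights_dir):
    """Load tokenizer and model from a folder downloaded in advance."""
    # Imported lazily so the options helper works without Transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    path, options = offline_load_options(weights_dir)
    tokenizer = AutoTokenizer.from_pretrained(path, **options)
    model = AutoModelForCausalLM.from_pretrained(path, **options)
    return tokenizer, model
```

With local_files_only=True, a missing or incomplete folder fails immediately with an error instead of triggering a download.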

ForgeryAnalysis Dataset

ForgeryAnalysis-PT

Overview

The ForgeryAnalysis-PT dataset consists of forgery analysis texts automatically generated by our data engine, ForgeryAnalyst. The dataset corresponds to two publicly available image manipulation detection datasets: CASIA2 and MIML. Each entry in the dataset provides forgery analysis for a corresponding tampered image, including clues and explanations structured in a Chain-of-Clues format.

Usage

Before using this dataset, download the original CASIA2 and MIML datasets from the respective public repositories, as ForgeryAnalysis-PT relies on these datasets for the corresponding tampered images.

The tampering analysis for each image is saved as a .txt file with the same name as the tampered image in the original CASIA2 and MIML datasets. You can download this dataset from the following link: Google Drive.
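Because the analysis files are matched to images by name, pairing them is a small lookup. The sketch below is one way to do it, assuming "same name" means the same file stem with a .txt extension and that stems are unique within each dataset. The function name and the two directory arguments are hypothetical.

```python
from pathlib import Path

# Image extensions to look for; adjust to match your copy of the data (assumption).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff", ".bmp"}


def pair_images_with_analysis(image_dir, analysis_dir):
    """Match each tampered image to the analysis text with the same stem.

    Returns (pairs, missing): pairs is a list of (image_path, analysis_text),
    and missing lists the images that have no matching .txt file.
    """
    analyses = {p.stem: p for p in Path(analysis_dir).rglob("*.txt")}
    pairs, missing = [], []
    for image in sorted(Path(image_dir).rglob("*")):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        text_file = analyses.get(image.stem)
        if text_file is None:
            missing.append(image)
        else:
            pairs.append((image, text_file.read_text(encoding="utf-8")))
    return pairs, missing
```

Checking the `missing` list before training is a quick way to confirm that the downloaded CASIA2 or MIML copy lines up with the analysis files.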

License

The ForgeryAnalysis-PT dataset is freely available for academic research and development. However, you must respect the terms and conditions of the original datasets, CASIA2 and MIML.

ForgerySleuth Assistant (TODO)

Evaluation Dataset

We evaluate image manipulation detection (IMD) methods on several widely used, publicly available datasets. You can access the original repositories and download the data through the following links:

| Dataset | Paper | Download URL |
| --- | --- | --- |
| Columbia | Detecting Image Splicing Using Geometry Invariants And Camera Characteristics Consistency | https://www.ee.columbia.edu/ln/dvmm/downloads/authsplcuncmp |
| CASIA | Casia image tampering detection evaluation database | [Unofficial] https://github.com/namtpham/casia1groundtruth |
| | | [Unofficial] https://github.com/namtpham/casia2groundtruth |
| Coverage | COVERAGE - A Novel Database for Copy-move Forgery Detection | https://github.com/wenbihan/coverage |
| NIST16 | MFC Datasets: Large-Scale Benchmark Datasets for Media Forensic Challenge Evaluation | https://mfc.nist.gov/users/sign_in |
| IMD20 | IMD2020: A Large-Scale Annotated Dataset Tailored for Detecting Manipulated Images | https://staff.utia.cas.cz/novozada/db |
| COCOGlide | TruFor: Leveraging all-round clues for trustworthy image forgery detection and localization | https://github.com/grip-unina/TruFor?tab=readme-ov-file#cocoglide-dataset |

Citation

If you find this project useful for your research and applications, please cite using this BibTeX:

@misc{sun2024forgerysleuth,
      title={ForgerySleuth: Empowering Multimodal Large Language Models for Image Manipulation Detection}, 
      author={Sun, Zhihao and Jiang, Haoran and Chen, Haoran and Cao, Yixin and Qiu, Xipeng and Wu, Zuxuan and Jiang, Yu-Gang},
      publisher={arXiv:2411.19466},
      year={2024},
      url={https://arxiv.org/abs/2411.19466}, 
}

Acknowledgment
