ChatGPT for Digital Forensic Investigation: The Good, The Bad, and The Unknown

Accompanying repository of ChatGPT (GPT-4) interactions associated with the above paper to be published at the Digital Forensics Research Conference (DFRWS APAC), held in Singapore on 17-20 October 2023.

Authors: Mark Scanlon¹, Frank Breitinger², Christopher Hargreaves³, Jan-Niclas Hilgert⁴, and John Sheppard⁵

¹ Forensics and Security Research Group, School of Computer Science, University College Dublin, Ireland
² School of Criminal Justice, University of Lausanne, Lausanne, Switzerland
³ Department of Computer Science, University of Oxford, United Kingdom
⁴ Fraunhofer FKIE, Bonn, Germany
⁵ Department of Computing and Mathematics, South East Technological University, Waterford, Ireland

Introduction

This repository provides a set of examples that demonstrate the potential risks and benefits of ChatGPT in the field of digital forensics.

The examples were created for a paper accepted at DFRWS APAC 2023, which will be published in the journal Forensic Science International: Digital Investigation. You can access a preprint of the paper here. See below for information on how to cite.

Paper Abstract

The disruptive application of ChatGPT (GPT-3.5, GPT-4) to a variety of domains has become a topic of much discussion in the scientific community and society at large. Large Language Models (LLMs), e.g., BERT, Bard, Generative Pre-trained Transformers (GPTs), LLaMA, etc., have the ability to take instructions, or prompts, from users and generate answers and solutions based on very large volumes of text-based training data. This paper assesses the impact and potential impact of ChatGPT on the field of digital forensics, specifically looking at its latest pre-trained LLM, GPT-4. A series of experiments are conducted to assess its capability across several digital forensic use cases including artefact understanding, evidence searching, code generation, anomaly detection, incident response, and education. Across these topics, its strengths and risks are outlined and a number of general conclusions are drawn. Overall this paper concludes that while there are some potential low-risk applications of ChatGPT within digital forensics, many are either unsuitable at present, since the evidence would need to be uploaded to the service, or they require sufficient knowledge of the topic being asked of the tool to identify incorrect assumptions, inaccuracies, and mistakes. However, to an appropriately knowledgeable user, it could act as a useful supporting tool in some circumstances.

Topics

The paper covers the following topics on the use of ChatGPT in digital forensics:

  • Artefact Identification
  • Self-Directed Learning of Digital Forensics
  • Keyword Searching
  • Programming in Digital Forensics
  • Detection
  • Generating Teaching Scenarios

The corresponding experiments conducted for each topic are in their associated folders in this repo.

How to cite this work

Scanlon, M., Breitinger, F., Hargreaves, C., Hilgert, J-N., Sheppard, J., ChatGPT for Digital Forensic Investigation: The Good, The Bad, and The Unknown, Forensic Science International: Digital Investigation (46):301609, ISSN 2666-2817, 2023.

BibTeX:

@article{scanlon2023ChatGPTForDigitalForensics,
title = "{ChatGPT for digital forensic investigation: The good, the bad, and the unknown}",
journal = {Forensic Science International: Digital Investigation},
volume = {46},
pages = {301609},
year = {2023},
issn = {2666-2817},
doi = {10.1016/j.fsidi.2023.301609},
url = {https://www.sciencedirect.com/science/article/pii/S266628172300121X},
author = {Mark Scanlon and Frank Breitinger and Christopher Hargreaves and Jan-Niclas Hilgert and John Sheppard},
keywords = {ChatGPT, Digital forensics, Artificial intelligence, Generative pre-trained transformers (GPT), Large language models (LLM)},
abstract = {The disruptive application of ChatGPT (GPT-3.5, GPT-4) to a variety of domains has become a topic of much discussion in the scientific community and society at large. Large Language Models (LLMs), e.g., BERT, Bard, Generative Pre-trained Transformers (GPTs), LLaMA, etc., have the ability to take instructions, or prompts, from users and generate answers and solutions based on very large volumes of text-based training data. This paper assesses the impact and potential impact of ChatGPT on the field of digital forensics, specifically looking at its latest pre-trained LLM, GPT-4. A series of experiments are conducted to assess its capability across several digital forensic use cases including artefact understanding, evidence searching, code generation, anomaly detection, incident response, and education. Across these topics, its strengths and risks are outlined and a number of general conclusions are drawn. Overall this paper concludes that while there are some potential low-risk applications of ChatGPT within digital forensics, many are either unsuitable at present, since the evidence would need to be uploaded to the service, or they require sufficient knowledge of the topic being asked of the tool to identify incorrect assumptions, inaccuracies, and mistakes. However, to an appropriately knowledgeable user, it could act as a useful supporting tool in some circumstances.}
}
