Our goal:

open ai api key required

https://platform.openai.com/docs/overview
In .env add:
OPENAI_API_KEY=''
If you get authentication errors, or main doesn't run, hardcode the key into common/chat_app.py: -openai.api_key = ''

Setting up the environment:

Launch anaconda navigator
Launch VSCode through the navigator
In VSCode terminal create a conda environment with python version 3.11.5 and pip:
conda create -n <env_name> python=3.11.5 pip

Activate environment:

conda activate <env_name>
pip install -r requirements.txt python-dotenv
pip install -r requirements.txt
pip install python-dotenv
Create .env file

To run the code:

python main.py

Our goal:

In the original codebase, main tests scenarios that 'pass' when the model initialized as a digital assistant is polluted by a prompt injection.
We wanted to introduce a way that the scenarios would 'fail' without hampering the functionality of the assistant.
The most effective method that we discovered was to introduce a 'vigilant prompt' during the assistant's initialization.
The vigilant prompt further specifies the behavior that the assistant is meant to exhibit.
The demo shows how an LLM-integrated application can be protected with a simple implementation of a vigilant prompt.
Depending on the application's purpose, the vigilant prompt can be tailored for specific use cases.

Experiment: Remote control #Added prompt to user

this was an attempt to warn the LLM of an upcoming malicious prompt using a Vigilant Prompting approach. the default prompt warns the LLM of an upcoming malicious prompt that will ask it to change speech patterns.

Experiment: Authentication added when system tells GPT to send an email Attempt at Defense by User authentication: Adding a feature where, when the email keyword is used, the assistant will print the message contents to the user, and allow them to reply with Yes or No

No will stop the assistant from sending the email with malicious instructions Yes will allow the email to be sent. Block: this caused the program to exit as intended with the no keyword however, the yes keyword also caused the scenerio to fail due to unforseen complexities.

Name		Name	Last commit message	Last commit date
Latest commit History 101 Commits
Doc		Doc
diagrams		diagrams
fuzzer		fuzzer
img		img
scenarios		scenarios
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
forkedREADME.md		forkedREADME.md
main.py		main.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

open ai api key required

Setting up the environment:

Activate environment:

To run the code:

Our goal:

About

Releases

Packages

Languages

License

jbutler512/ferengi-llm-security

Folders and files

Latest commit

History

Repository files navigation

open ai api key required

Setting up the environment:

Activate environment:

To run the code:

Our goal:

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages