Phishpedia: A Hybrid Deep Learning Based Approach to Visually Identify Phishing Webpages

This is the official implementation of "Phishpedia: A Hybrid Deep Learning Based Approach to Visually Identify Phishing Webpages" (USENIX Security '21).
The contributions of our paper:
- We propose Phishpedia, a phishing identification system with high identification accuracy and low runtime overhead, which outperforms the relevant state-of-the-art identification approaches.
- Our system provides explainable annotations, which increase users' confidence in the model's predictions.
- We conducted a phishing discovery experiment on emerging domains fed from CertStream and discovered 1,704 real phishing webpages, of which 1,133 are zero-days.

Framework

Input: A URL and its screenshot
Output: Phish/Benign, Phishing target

Step 1: Run the deep object detection model to get the predicted logos and inputs (the inputs are not used for later prediction, only for explanation).
Step 2: Run the deep Siamese model on the predicted logos.
- If the Siamese model reports no target, return Benign, None.
- Otherwise, the Siamese model reports a target; return Phish, Phishing target.
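
The decision logic of Step 2 can be sketched in a few lines of Python. This is a minimal illustration, not the code in this repository: the names `cosine_similarity`, `match_brand`, and `classify` are hypothetical, the logo embeddings are assumed to be plain lists of floats already produced by Step 1, and cosine similarity against a threshold is assumed as the matching rule.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_brand(logo: Sequence[float],
                references: Dict[str, Sequence[float]],
                threshold: float) -> Optional[str]:
    """Return the best-matching brand whose similarity reaches the threshold, else None."""
    best_brand, best_score = None, threshold
    for brand, reference in references.items():
        score = cosine_similarity(logo, reference)
        if score >= best_score:
            best_brand, best_score = brand, score
    return best_brand


def classify(logos: Sequence[Sequence[float]],
             references: Dict[str, Sequence[float]],
             threshold: float) -> Tuple[str, Optional[str]]:
    """Hypothetical stand-in for Step 2: report Phish as soon as any logo matches a brand."""
    for logo in logos:
        target = match_brand(logo, references, threshold)
        if target is not None:
            return "Phish", target
    return "Benign", None
```

A page with no detected logo, or with logos that match no reference brand closely enough, falls through to `("Benign", None)`, which mirrors the first branch of Step 2.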

Project structure

- src
    - adv_attack: adversarial attacking scripts
    - detectron2_pedia: training script for object detector
     |_ output
      |_ rcnn_2
        |_ rcnn_bet365.pth 
    - siamese_pedia: inference script for siamese
     |_ siamese_retrain: training script for siamese
     |_ expand_targetlist
         |_ 1&1 Ionos
         |_ ...
     |_ domain_map.pkl
     |_ resnetv2_rgb_new.pth.tar
    - siamese.py: main script for siamese
    - pipeline_eval.py: evaluation script for general experiment

- tele: telegram scripts to vote for phishing 
- phishpedia_config.py: config script for phish-discovery experiment 
- phishpedia_main.py: main script for phish-discovery experiment

Requirements

The following packages may need to be installed manually.

Windows/Linux/Mac machine
python=3.7
torch=1.6.0 # Make sure that the PyTorch version is compatible with your CUDA version.
torchvision
Install a compatible version of Detectron2 manually; see the official installation guide. If you are using Windows, try this guide instead.
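
Before installing the package, it can help to confirm that the manually installed dependencies are visible to your interpreter. The snippet below is a small sketch using only the standard library; the helper name `missing_packages` is hypothetical, and the import names `torch`, `torchvision`, and `detectron2` are assumed to match the packages listed above. It only checks that the packages can be found, not that their versions are compatible with each other or with CUDA.

```python
import importlib.util
import sys

# Import names assumed to correspond to the requirements listed above.
REQUIRED_PACKAGES = ("torch", "torchvision", "detectron2")


def missing_packages(packages=REQUIRED_PACKAGES):
    """Return the names from `packages` that cannot be located by the import system."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python version: %d.%d" % sys.version_info[:2])
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```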

Use it as a package

First install the requirements, then run

 pip install git+https://github.com/lindsey98/Phishpedia.git

Run the following in Python to test a single site:

import matplotlib.pyplot as plt
from phishpedia.phishpedia_config import load_config
from phishpedia.phishpedia_main import test

# Read the URL of the site under test and point to its screenshot
with open("phishpedia/datasets/test_sites/accounts.g.cdcde.com/info.txt") as f:
    url = f.read().strip()
screenshot_path = "phishpedia/datasets/test_sites/accounts.g.cdcde.com/shot.png"

# Load the models and reference data
cfg_path = None # None means use default config.yaml
ELE_MODEL, SIAMESE_THRE, SIAMESE_MODEL, LOGO_FEATS, LOGO_FILES, DOMAIN_MAP_PATH = load_config(cfg_path)

# Run the pipeline on the single site
phish_category, pred_target, plotvis, siamese_conf, pred_boxes = test(url, screenshot_path,
                                                                      ELE_MODEL, SIAMESE_THRE, SIAMESE_MODEL, LOGO_FEATS, LOGO_FILES, DOMAIN_MAP_PATH)

print('Phishing (1) or Benign (0)?', phish_category)
print('What is its targeted brand if it is phishing?', pred_target)
print('What is the siamese matching confidence?', siamese_conf)
print('Where is the predicted logo (in [x_min, y_min, x_max, y_max])?', pred_boxes)

# Reverse the channel order of the annotated image before displaying it
plt.imshow(plotvis[:, :, ::-1])
plt.title("Predicted screenshot with annotations")
plt.show()

To test a list of sites from the terminal instead, copy run.py to your local machine and run:

python run.py --folder <folder you want to test e.g. phishpedia/datasets/test_sites> --results <where you want to save the results e.g. test.txt> --no_repeat
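
If you want to prepare or inspect such a folder yourself, the sketch below walks a directory and yields one (URL, screenshot path) pair per site. It is an illustration only and not the implementation of run.py: the helper name `iter_sites` is hypothetical, and the layout (one subfolder per site containing `info.txt` and `shot.png`) is assumed from the single-site example above.

```python
from pathlib import Path
from typing import Iterator, Tuple


def iter_sites(folder: str) -> Iterator[Tuple[str, str]]:
    """Yield (url, screenshot_path) for each site subfolder that has both files.

    Assumed layout: <folder>/<site>/info.txt holds the URL and
    <folder>/<site>/shot.png holds the screenshot.
    """
    for site_dir in sorted(Path(folder).iterdir()):
        info = site_dir / "info.txt"
        shot = site_dir / "shot.png"
        # Skip stray files and incomplete site folders
        if site_dir.is_dir() and info.is_file() and shot.is_file():
            yield info.read_text(encoding="utf-8").strip(), str(shot)
```

Each pair can then be passed to the single-site call shown earlier, for example `for url, screenshot_path in iter_sites("phishpedia/datasets/test_sites"): ...`.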

Reference

If you find our work useful in your research, please consider citing our paper:

@inproceedings{lin2021phishpedia,
  title={Phishpedia: A Hybrid Deep Learning Based Approach to Visually Identify Phishing Webpages},
  author={Lin, Yun and Liu, Ruofan and Divakaran, Dinil Mon and Ng, Jun Yang and Chan, Qing Zhou and Lu, Yiwen and Si, Yuxuan and Zhang, Fan and Dong, Jin Song},
  booktitle={30th {USENIX} Security Symposium ({USENIX} Security 21)},
  year={2021}
}

Contacts

If you have any issue running our code, you can raise an issue or send an email to [email protected], [email protected], and [email protected]
