tonylm00/Predicting-Vulnerable-Code

 
 


Perseverance


Perseverance is a Python tool that analyzes repositories using PyDriller. It works with a dataset of open-source projects labeled as vulnerable or non-vulnerable. The tool extracts three distinct data models from the Java code that precedes vulnerability-fixing commits:

  • Text Mining: Creates a dictionary containing keywords found in the file and their frequency within it.
  • Software Metrics: Uses the Understand tool to calculate nine different metrics assessing the complexity of the file.
  • Static Analysis: Uses SonarQube to identify vulnerabilities in the file, referring to applicable Java rules.

These models are used to train and evaluate various machine learning classifiers, including Logistic Regression, Naive Bayes, Support Vector Machine, and Random Forest, which are not included in the original version of the system.
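To make the Text Mining model more concrete, the following is a minimal sketch of how a keyword-frequency dictionary could be built for a single Java file. It is not the tool's actual implementation: the function names (`strip_comments`, `keyword_frequencies`), the token pattern, and the choice to drop comments before counting are all assumptions made for illustration.

```python
import re
from collections import Counter
from typing import Dict

# Assumption: a "keyword" is any identifier-like token. The tokenization
# rules used by Perseverance itself may differ.
TOKEN_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def strip_comments(java_source: str) -> str:
    """Remove /* ... */ and // ... comments (naive: string literals are not handled)."""
    without_block = re.sub(r"/\*.*?\*/", " ", java_source, flags=re.DOTALL)
    return re.sub(r"//[^\n]*", " ", without_block)


def keyword_frequencies(java_source: str) -> Dict[str, int]:
    """Map each token found in the source to the number of times it occurs."""
    tokens = TOKEN_PATTERN.findall(strip_comments(java_source))
    return dict(Counter(tokens))


# Hypothetical input, used only to exercise the sketch.
sample = """
// returns the larger value
public int max(int a, int b) {
    return a > b ? a : b;
}
"""

frequencies = keyword_frequencies(sample)
print(frequencies)
```

Each file then becomes one row of token counts, which is the kind of input a bag-of-words classifier expects.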

Pre-Requirements

  1. Set up the Python environment: Ensure you have Python installed on your system. It's recommended to use a virtual environment to manage dependencies.

    python -m venv venv
    source venv/bin/activate  
    # On Windows use `venv\Scripts\activate`
  2. Install dependencies: Use requirements.txt to install all necessary packages.

    pip install -r requirements.txt
  3. Install SonarScanner and SonarQube: Both are required for static analysis. Follow the official installation instructions for each tool, then ensure that SonarQube is running locally or is otherwise accessible from your environment.

  4. Generate a SonarQube User Token: A user token is required to authenticate with the SonarQube server. To generate one, log into SonarQube, navigate to your user account settings, and select Security > Generate Token. Save this token, as you will need it to configure the analysis.
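Before running an analysis, it can help to confirm that the server is reachable and that the token is accepted. The sketch below is one way to do that from Python using only the standard library. It assumes the SonarQube Web API endpoint `api/authentication/validate` and the convention of sending the token as the basic-auth username with an empty password; the helper names `build_auth_header` and `token_is_valid` are hypothetical and are not part of Perseverance.

```python
import base64
import json
import urllib.request


def build_auth_header(token: str) -> str:
    """Build a basic-auth header with the token as username and an empty password."""
    credentials = f"{token}:".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def token_is_valid(base_url: str, token: str, timeout: float = 10.0) -> bool:
    """Ask the SonarQube server whether the supplied token authenticates.

    Assumption: the server exposes api/authentication/validate and answers
    with a JSON object containing a boolean "valid" field.
    """
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/authentication/validate",
        headers={"Authorization": build_auth_header(token)},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return bool(payload.get("valid", False))
```

For a local installation, a call might look like `token_is_valid("http://localhost:9000", "<your-token>")`, where the URL is an assumed default and should be replaced with the address of your own server. A connection error here usually means SonarQube is not running or is not reachable from your environment.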

How to use Perseverance

[Image: tutorial]

About

Predicting Vulnerable Code: How far are we?
