Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: resolve duplicate port & improve first endpoint detection #273

Merged
merged 15 commits into from
Feb 19, 2025

Conversation

psyray
Copy link
Contributor

@psyray psyray commented Feb 16, 2025

fix #268

Summary by Sourcery

This pull request addresses several issues related to port scanning, endpoint detection, and data aggregation for the dashboard. It also includes performance improvements, bug fixes, and enhancements to the overall scanning process.

Bug Fixes:

  • Fixes a bug where duplicate ports were being added to the database.
  • Fixes an issue where the first endpoint was not being correctly detected during Nmap scans.

Enhancements:

  • Improves the dashboard's performance by optimizing database queries for retrieving most used ports, IPs, technologies, CVEs, CWEs, and vulnerability tags.
  • Updates the Nmap integration to process service information more reliably and efficiently.
  • Refactors the port scanning task to avoid adding duplicate ports and improve endpoint detection.
  • Improves the HTTP crawl task by handling URLs more robustly and preventing errors when no URLs are provided.
  • Enhances the subdomain import process to create the first endpoint from Nmap data, improving initial scan results.
  • Updates the engine task loading to use the engine ID instead of the engine name.
  • Improves the performance of the dashboard charts by reversing the data before rendering.

Build:

  • Adds a database migration to migrate the M2M relationship between IPAddress and Port to a ForeignKey relationship.

Tests:

  • Adds signal handlers to clean up orphaned IP addresses after subdomain deletion.

Chores:

  • Updates dependencies and configurations for improved performance and security.
  • Updates the default auto field for the startScan app to BigAutoField.

This change improves the port scanning functionality by preventing duplicate port entries and adds cascading deletes for Subdomains, IP addresses, and Ports. It also includes minor updates to logging, datatable display, and app initialization.
This change improves the performance of deleting orphaned ports and adds a database index to the ports ManyToMany field on the IpAddress model. The orphaned port deletion process is now more efficient by using bulk deletion and fetching related objects in bulk. The database index improves query performance for lookups involving IP addresses and their associated ports. A database migration is included to apply the index change.
Improved port management by changing the relationship between IPs and Ports from a ManyToMany field to a ForeignKey on the Port model, directly linking a port to a specific IP. This simplifies database interactions and eliminates redundant data. The logic for handling port creation, updates, and service information association has been centralized and streamlined. Orphaned port and IP cleanup logic, previously handled by signals, has been removed as it's now implicitly managed by the database through cascading deletes.
This change improves port scanning by including all ports for HTTP crawling and removes redundant logic that excluded ports 80 and 443. It also refines the handling of subdomain deletions to ensure orphaned IP addresses are correctly cleaned up after a subdomain is removed, preventing dangling database entries. Additionally, minor logging improvements and code style adjustments are included.
This change improves the handling of imported subdomains by adding automatic Nmap scanning and endpoint creation for each imported subdomain. It also refactors the subdomain deduplication logic to use set comprehension for improved efficiency and readability.
This change refactors the engine task loading mechanism in the subscan modal. It now retrieves tasks based on the selected engine ID instead of the engine name, improving accuracy and efficiency. Additionally, it enhances error handling and user experience by displaying informative messages when no tasks are available or an error occurs during loading. The change also simplifies the task display logic and removes unnecessary code related to port scanning visibility. Finally, it updates the engine ordering to be case-insensitive.
Copy link
Contributor

sourcery-ai bot commented Feb 16, 2025

Reviewer's Guide by Sourcery

This pull request addresses duplicate port handling, improves first endpoint detection, enhances Nmap result processing, and refactors the UI for better engine task loading. It also includes bug fixes for HTTP crawl and adds signals for cleaning up orphaned IP addresses.

Sequence diagram for Port Scanning with get_or_create_port

sequenceDiagram
    participant PS as Port Scan Task
    participant IP as IpAddress
    participant P as Port

    PS->>IP: Get or create IP Address
    IP-->>PS: Returns IP Address
    PS->>P: get_or_create_port(ip, port_number)
    alt Port exists
        P-->>PS: Returns existing Port
    else Port does not exist
        P->>P: Creates new Port with default values
        P-->>PS: Returns new Port
    end
    PS->>P: Updates Port service information (if available)
    P-->>PS: Port updated
Loading

Sequence diagram for HTTP Crawl task

sequenceDiagram
    participant HT as HTTP Crawl Task
    participant S as Subdomain
    participant E as Endpoint

    HT->>HT: Checks if URLs are provided directly
    alt URLs are provided
        HT->>HT: Filters URLs based on URL filter
        HT->>HT: Writes URLs to input file
    else URLs are not provided
        HT->>S: Get subdomain endpoints
        S-->>HT: Returns HTTP URLs
        HT->>HT: Extends URLs with subdomain endpoints
    end
    HT->>HT: Executes httpx command
    loop For each httpx result
        HT->>E: Saves/Updates Endpoint
        E-->>HT: Returns Endpoint
    end
    HT->>HT: Removes input file
Loading

Sequence diagram for create_first_endpoint_from_nmap_data

sequenceDiagram
    participant SF as save_imported_subdomains
    participant N as Nmap
    participant CF as create_first_endpoint_from_nmap_data
    participant E as Endpoint

    SF->>N: get_nmap_http_datas(subdomain_name, ctx)
    N-->>SF: hosts_data
    SF->>CF: create_first_endpoint_from_nmap_data(hosts_data, domain, subdomain, ctx)
    CF->>E: save_endpoint(http_url, ctx, is_default, subdomain)
    E-->>CF: endpoint
    CF-->>SF: endpoint
Loading

Updated class diagram for Port and IpAddress models

classDiagram
    class IpAddress {
        +id: int
        +address: str
        +is_cdn: bool
        +geo_iso: ForeignKey
        +version: int
    }
    class Port {
        +id: int
        +number: int
        +is_uncommon: bool
        +service_name: str
        +description: str
        +ip_address: ForeignKey
    }

    Port --|> IpAddress : ForeignKey
    note for Port "Unique together constraint on (ip_address, number)"
Loading

File-Level Changes

Change Details Files
Refactor dashboard charts to use template tags for improved readability and maintainability.
  • Replaced hardcoded date and data lists with template loops using {% with %} and {% for %} tags.
  • Reversed the order of data in the lists to match the expected chart format.
  • Used `
slice:":7"to limit the data to the last 7 entries.</li><li>Used
Fix duplicate port handling and improve port management during scans.
  • Modified the port_scan task to accept None as a default value for mutable arguments.
  • Used get_or_create_port to either retrieve an existing port or create a new one, ensuring no duplicates.
  • Removed the conditional check for ports 80 and 443 before adding endpoints.
  • Updated Nmap to run on the host instead of common ports.
  • Added a new migration to add a foreign key from the Port model to the IpAddress model, and to add a unique constraint on the ip_address and number fields.
web/reNgine/tasks.py
web/startScan/models.py
web/startScan/migrations/0059_auto_20250216_1450.py
Improve HTTP crawl task to handle null URLs and prevent errors.
  • Initialized urls as an empty list if None is passed to the http_crawl task.
  • Added a check to filter out null URLs from the list of URLs to crawl.
  • Added a try-except block to handle potential exceptions when deleting the input file.
  • Added a check to ensure the input file exists before attempting to delete it.
web/reNgine/tasks.py
Enhance Nmap result processing and port service information updates.
  • Modified parse_nmap_results to handle multiple IP addresses and extract address types.
  • Updated process_nmap_service_results to correctly process Nmap service detection results.
  • Implemented get_or_create_port to centralize port handling and service info management.
  • Implemented update_port_service_info to update port service information consistently.
web/reNgine/tasks.py
web/reNgine/common_func.py
Improve UI for engine task loading and subdomain management.
  • Updated the engine task loading in the UI to use the engine ID instead of the engine name.
  • Added signals to clean up orphaned IP addresses when subdomains are deleted or IPs are removed from subdomains.
web/startScan/static/startScan/js/detail_scan.js
web/startScan/signals.py
Enhance API serializers and views for engine and port management.
  • Updated the EngineSerializer to include a get_tasks method that retrieves tasks from the engine's YAML configuration.
  • Modified the ListEngines API view to filter engines by ID if provided.
  • Updated the port query in the API view to filter based on the new IP address relationship.
web/api/serializers.py
web/api/views.py
Miscellaneous changes.
  • Updated the default auto field in startScan/apps.py to django.db.models.BigAutoField.
  • Added ENGINE_NAMES and ENGINE_DISPLAY_NAMES to reNgine/definitions.py.
  • Removed unused code and comments.
  • Fixed minor UI issues and typos.
web/reNgine/definitions.py
web/startScan/apps.py
web/startScan/templates/startScan/detail_scan.html
web/static/plugins/datatable/global.css

Possibly linked issues

  • #987: The PR directly addresses the port recognition and duplication bugs described in the issue by improving port scanning logic and handling.
  • fix(scan): first endpoint detection for IP #251: The PR includes a fix for the first endpoint detection bug mentioned in the issue.

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!
  • Generate a plan of action for an issue: Comment @sourcery-ai plan on
    an issue to generate a plan of action for it.

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Fixes a bug where None values could be written to the HTTP URLs file.
Data migration to correctly update existing port data to use the new ForeignKey field.
Some service_name datas will be lost as old system do not track service_name by ip and by port, but only by port
This commit refactors the dashboard charts to enhance data handling and consistency. Key improvements include:

More accurate and efficient database queries for chart data.
Standardized data retrieval and processing for charts.
Improved handling of optional chart data.
Corrected a minor typo in a chart title.
Added a check for file existence before deletion in http_crawl task.
Corrected a redirect issue in admin_interface_update.
@psyray psyray marked this pull request as ready for review February 18, 2025 19:05
@psyray psyray requested a review from 0b3ud February 18, 2025 19:05
@psyray psyray self-assigned this Feb 18, 2025
@psyray psyray added the bug Something isn't working label Feb 18, 2025
@psyray psyray linked an issue Feb 18, 2025 that may be closed by this pull request
3 tasks
Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @psyray - I've reviewed your changes - here's some feedback:

Overall Comments:

  • The conversion of the javascript charts to use django's templating language is a good change.
  • The migration to move the port relationship from a many-to-many to a foreign key looks correct.
Here's what I looked at during the review
  • 🟡 General issues: 5 issues found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟡 Complexity: 1 issue found
  • 🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@AnonymousWP
Copy link
Member

I'll review this once suggestions have been implemented and conversations have been resolved.

@0b3ud
Copy link
Contributor

0b3ud commented Feb 19, 2025

@psyray I have tested this PR everything works no more port duplication

This commit introduces significant improvements to the dashboard and backend, focusing on efficient data retrieval and presentation. Key changes include optimized database queries for scan statistics, vulnerability data, and asset information, as well as enhanced timeline visualizations and activity feeds. Additionally, cleanup processes for orphaned data and temporary files have been refined. These enhancements improve performance and provide a more comprehensive overview of project security posture.
Error handling for YAML parsing and subdomain deletion has been enhanced to provide more informative logging and prevent unexpected behavior.
Copy link
Contributor

@0b3ud 0b3ud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re tested everything works code is clean

@0b3ud 0b3ud merged commit bbb5a25 into release/2.1.1 Feb 19, 2025
5 checks passed
@0b3ud 0b3ud deleted the fix/268-portscan branch February 19, 2025 20:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug(portscan): ports are duplicated by Naabu when launching a port scan
3 participants