HALIT
ALPTEKIN
PROJECT
/2016/ARCHIVED/

Leiurus

AI-powered vulnerability scanner that fingerprints pages and prioritizes attack vectors.

KEYWORDS:
#dast #fuzzing #machine-learning #penetration-testing #python #vulnerability-assessment
Leiurus is a genus of scorpions that includes the deathstalker, known for fast, precise venom. I picked the name because the tool works the same way: it studies a target, finds the weak spot, and strikes there instead of spraying every test at every page.
Leiurus is a #vulnerability-assessment tool that goes after the main waste in traditional #DAST (Dynamic Application Security Testing): running every test on every page. Instead, it uses #machine-learning to fingerprint a page and predict which vulnerabilities it is most likely to have. It is the working implementation of our paper Towards Prioritizing Vulnerability Testing, and it tests that paper's idea: pages with a similar structure tend to share the same kinds of vulnerabilities.
It is written in #python as a modular CLI. It covers the whole assessment, from scraping and vectorization to model training and reporting.

System Architecture

The system is a linear pipeline that turns raw URLs into confirmed findings. Input flows from the user through a feature extraction engine, into a #machine-learning model, and finally into an attack engine that checks the predictions.

Logical Workflow

The core logic sits in the ApplicationIdentifier. It turns a raw URL into a feature vector, predicts the vulnerability class, and hands the result to the Attack Engine to verify.

Feature Extraction Engine

The most complex part is the Feature Extractor. I used a Factory Pattern here so it is easy to extend. The FeatureManager does not need to know how a feature is extracted; it reads a config file and instantiates the right class at runtime.

Configuration Layer

Features are defined in simple INI files, not hardcoded. This allows users to add new detection signatures without touching the code.
ini
# Example Feature Configuration (features.ini)
[identifier_header_pingback_xmlrpc]
feature_name = header_contains
description = X-Pingback contains xmlrpc.php
key = X-Pingback
value = xmlrpc.php
path = /
active = 1
order = 1

Execution Layer

The factory reads feature_name (header_contains) and instantiates the corresponding class (HeaderContains), passing the key and value as arguments.
python
class HeaderContains(FeatureExtractor):
    def perform(self):
        # Substring match on the header value, used to fingerprint the backend
        if self.key in self.headers:
            if self.value in self.headers[self.key]:
                return 1
        return 0
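
The factory step described above can be sketched as follows. This is a minimal illustration, not the actual Leiurus code: the `FeatureExtractor` constructor signature, the `class_name_for` and `build_extractors` helpers, and the `EXAMPLE_INI` string are all assumptions made for the example. It shows one way to map `feature_name` to a class name and instantiate active features in `order`.

```python
import configparser

# Hypothetical copy of the feature definition shown earlier.
EXAMPLE_INI = """
[identifier_header_pingback_xmlrpc]
feature_name = header_contains
description = X-Pingback contains xmlrpc.php
key = X-Pingback
value = xmlrpc.php
path = /
active = 1
order = 1
"""


class FeatureExtractor:
    """Assumed base class: holds the response headers and the config values."""

    def __init__(self, headers, key, value):
        self.headers = headers
        self.key = key
        self.value = value

    def perform(self):
        raise NotImplementedError


class HeaderContains(FeatureExtractor):
    def perform(self):
        # 1 if the header exists and its value contains the configured string
        if self.key in self.headers and self.value in self.headers[self.key]:
            return 1
        return 0


def class_name_for(feature_name):
    """Convert snake_case (header_contains) to CamelCase (HeaderContains)."""
    return "".join(part.capitalize() for part in feature_name.split("_"))


def build_extractors(ini_text, headers):
    """Instantiate one extractor per active section, sorted by 'order'."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    known = {cls.__name__: cls for cls in FeatureExtractor.__subclasses__()}
    active = [
        parser[name]
        for name in parser.sections()
        if parser[name].getboolean("active", fallback=False)
    ]
    active.sort(key=lambda section: section.getint("order", fallback=0))
    return [
        known[class_name_for(s["feature_name"])](headers, s["key"], s["value"])
        for s in active
    ]


extractors = build_extractors(
    EXAMPLE_INI, {"X-Pingback": "https://example.com/xmlrpc.php"}
)
print([e.perform() for e in extractors])  # [1]
```

Looking classes up by name keeps the calling code unaware of individual extractors, so a new feature type only needs a new subclass and a new INI section.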

AI Core

The AI module isn't a "black box" but a tunable component. The ModelManager currently defaults to a Multinomial Naive Bayes algorithm (MultinomialNB). It operates in two modes: Training Mode, where it learns from labeled datasets, and Prediction Mode, where it classifies new targets to prioritize attack vectors using #machine-learning.
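
To make the two modes concrete, here is a small pure-Python stand-in for a multinomial Naive Bayes classifier with Laplace smoothing. It is not the `ModelManager` and not scikit-learn's `MultinomialNB`; the class name `TinyMultinomialNB`, the feature vectors, and the labels `sqli` and `xss` are invented for illustration. `fit` plays the role of Training Mode and `ranked` the role of Prediction Mode.

```python
import math
from collections import defaultdict


class TinyMultinomialNB:
    """Illustrative multinomial Naive Bayes over non-negative count vectors."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                  # Laplace smoothing constant
        self.class_counts = defaultdict(int)
        self.feature_totals = {}            # label -> summed feature counts
        self.n_features = 0

    def fit(self, X, y):
        """Training mode: accumulate per-class feature counts."""
        self.n_features = len(X[0])
        for row, label in zip(X, y):
            self.class_counts[label] += 1
            totals = self.feature_totals.setdefault(label, [0] * self.n_features)
            for i, count in enumerate(row):
                totals[i] += count
        return self

    def predict_proba(self, row):
        """Prediction mode: posterior probability of each class for one vector."""
        n_samples = sum(self.class_counts.values())
        log_scores = {}
        for label, seen in self.class_counts.items():
            totals = self.feature_totals[label]
            denom = sum(totals) + self.alpha * self.n_features
            score = math.log(seen / n_samples)  # log prior
            for i, count in enumerate(row):
                score += count * math.log((totals[i] + self.alpha) / denom)
            log_scores[label] = score
        # Normalize in a numerically stable way (log-sum-exp).
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        total = sum(weights.values())
        return {k: w / total for k, w in weights.items()}

    def ranked(self, row):
        """Labels ordered from most to least likely."""
        proba = self.predict_proba(row)
        return sorted(proba, key=proba.get, reverse=True)


# Hypothetical training data: three binary features per page.
X = [[1, 0, 1], [1, 0, 0], [0, 1, 0], [0, 1, 1]]
y = ["sqli", "sqli", "xss", "xss"]
model = TinyMultinomialNB().fit(X, y)
print(model.ranked([1, 0, 0]))  # ['sqli', 'xss']
```

The ranked list, rather than a single label, is what makes prioritization possible: lower-ranked classes can still be tested later instead of being dropped.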

Attack Engine

Once the AI classifies the target, the Attack Engine takes over. Unlike legacy scanners that rely on blind #fuzzing, Leiurus selects specific "Warheads" (payload lists). It doesn't just predict the type of vulnerability (e.g., SQLi), but also the most likely location (e.g., GET parameter id vs POST body vs User-Agent header).

The Attack Phase

The goal of Leiurus is not just classification but confirmation. The Attack Phase turns a prediction into actual #penetration-testing.

Vulnerability Verification Logic

Once the ModelManager returns a probability score, the system constructs a targeted Attack Strategy. This strategy dictates not just what to send, but where to send it.
  • Payload Selection: Filters payloads based on the predicted backend (e.g., only PostgreSQL payloads if the fingerprint suggests Postgres).
  • Target Selection: The AI identifies the most probable injection vectors. It might skip the URL parameters entirely and focus payload injection attempts solely on the X-Forwarded-For header if the model predicts an internal logging vulnerability.
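
The two selection steps above can be sketched like this. The `Payload` type, the `build_strategy` function, the payload strings, and the location scores are hypothetical and only illustrate the idea; the real Warhead format is not shown in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Payload:
    text: str
    vuln_class: str
    backend: str  # "any" marks a backend-agnostic payload


def build_strategy(payloads, vuln_class, backend, location_scores, top_n=1):
    """Pair the most likely injection points with backend-appropriate payloads."""
    # Payload selection: keep the predicted class and a matching backend.
    selected = [
        p for p in payloads
        if p.vuln_class == vuln_class and p.backend in (backend, "any")
    ]
    # Target selection: keep only the top-scoring injection locations.
    locations = sorted(location_scores, key=location_scores.get, reverse=True)
    return [(loc, p.text) for loc in locations[:top_n] for p in selected]


# Hypothetical inputs for illustration only.
PAYLOADS = [
    Payload("' OR 1=1--", "sqli", "any"),
    Payload("'; SELECT pg_sleep(5)--", "sqli", "postgresql"),
    Payload("' AND SLEEP(5)-- ", "sqli", "mysql"),
    Payload("<script>alert(1)</script>", "xss", "any"),
]
SCORES = {"header:X-Forwarded-For": 0.7, "get:id": 0.2, "post:body": 0.1}

for location, text in build_strategy(PAYLOADS, "sqli", "postgresql", SCORES):
    print(location, "<-", text)
```

With `top_n=1` the URL parameters are skipped entirely, which mirrors the X-Forwarded-For example; raising `top_n` widens the attack at the cost of more requests.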

Key Capabilities

The tool leans on automation instead of manual guess-and-check.

Fingerprinting

The system converts a webpage into a numerical fingerprint (vector) based on structural features (URL patterns), response features (HTTP status codes), and content features (keywords in the HTML body). This creates a unique signature for every page type in an application.
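
A rough sketch of what such a vector could look like, assuming a fixed keyword list and a fixed set of status codes. The `fingerprint` function, `KEYWORDS`, and `STATUS_CODES` are illustrative choices, not the features Leiurus actually ships with.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical vocabulary; a real deployment would load these from config.
KEYWORDS = ("search", "login", "contact")
STATUS_CODES = (200, 302, 403, 404, 500)


def fingerprint(url, status_code, body):
    """Turn one page into a fixed-length vector of non-negative integers."""
    parsed = urlparse(url)

    # Structural features: path depth and number of distinct query parameters.
    path_depth = len([seg for seg in parsed.path.split("/") if seg])
    n_params = len(parse_qs(parsed.query))
    structural = [path_depth, n_params]

    # Response features: one-hot encoding of the HTTP status code.
    response = [1 if status_code == code else 0 for code in STATUS_CODES]

    # Content features: keyword occurrence counts in the lowercased body.
    lowered = body.lower()
    content = [lowered.count(word) for word in KEYWORDS]

    return structural + response + content


vector = fingerprint(
    "https://example.com/shop/search?q=shoes&page=2",
    200,
    "<form>Search results</form>",
)
print(vector)  # [2, 2, 1, 0, 0, 0, 0, 1, 0, 0]
```

Keeping every component a non-negative count means the vector can be fed directly to a multinomial Naive Bayes model.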

Attack Prioritization

This is the main engineering goal. Leiurus prioritizes testing based on context. If the fingerprint resembles a "Search Page," the system prioritizes Reflected XSS and SQL Injection on the query parameters. If it resembles a "Contact Form," it shifts focus to Stored XSS and SMTP Injection in the POST body.
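
As a sketch of that context-based ordering, the mapping below encodes the two examples from the paragraph above. `TEST_PLAN`, the page-type labels, and `prioritize` are hypothetical names, and the probabilities are made up.

```python
# Hypothetical mapping from predicted page type to (test, location) pairs.
TEST_PLAN = {
    "search_page": [
        ("Reflected XSS", "query parameters"),
        ("SQL Injection", "query parameters"),
    ],
    "contact_form": [
        ("Stored XSS", "POST body"),
        ("SMTP Injection", "POST body"),
    ],
}


def prioritize(class_probabilities, plan=TEST_PLAN):
    """Order tests so the most likely page type is attacked first."""
    ordered = []
    page_types = sorted(
        class_probabilities, key=class_probabilities.get, reverse=True
    )
    for page_type in page_types:
        for test in plan.get(page_type, []):
            if test not in ordered:  # avoid queuing the same test twice
                ordered.append(test)
    return ordered


queue = prioritize({"contact_form": 0.8, "search_page": 0.2})
for test, location in queue:
    print(f"{test} @ {location}")
```

Nothing is discarded here: less likely tests simply move to the back of the queue, so a scan that is cut short has still covered the most promising vectors.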

Human-in-the-Loop Training

The model improves as operators feed it labeled data. It supports an interactive training mode in which users can correct the AI's predictions or load new datasets via the CLI. With the train command, operators associate specific feature vectors with confirmed vulnerabilities, which teaches the system to recognize new or custom attack surfaces.

Related Nodes

research
2020
Towards prioritizing vulnerability testing

A machine learning approach to accelerate vulnerability scanning by prioritizing security tests based on web page features.

#automated-testing #cwe #machine-learning #neural-network #test-prioritization #vulnerability #vulnerability-assessment