BUseclab/cve-genie

CVE-Genie 🧞‍♂️

An LLM-based multi-agent framework for end-to-end reproduction of CVEs.

📜 Results

RESULTS.md provides details on accessing the results.

🏃‍♂️ How to Run

A) Extract and Prepare Data

To reproduce a CVE, follow the steps in (A-❶) to extract the CVE data. To reproduce a vulnerability that is not a CVE, follow the step in (A-❷).

❶ CVE Data Extraction

‼️ CVE data extraction currently relies on cvelist, which will be deprecated in the future, so we are planning to move to cvelistV5 soon!

  1. Install necessary packages

    python3 -m venv env
    source env/bin/activate
    pip install -r src/data/requirements.txt
    playwright install
  2. Clone the cvelistV5 repository

    cd src/
    git clone https://github.com/CVEProject/cvelistV5.git data/cvelistV5/
  3. Create a .env file in src/ and make sure it contains your GITHUB_TOKEN

  4. Run the following script to extract the data for the given CVE

    python3 ./data/scripts/cve_data.py --cve_id CVE-2024-4340 --output_path ./data/example/test.json
  5. If the script prints ✅ Ready to reproduce!!, move on to the next step. Otherwise, see PROCESSING.md: you may have to add some missing CVE context manually (mainly the source code URL and its version). This can happen because (1) the CVE record in cvelistV5 does not contain source code information (possibly because the CVE belongs to a commercial product), (2) the record was modified, or (3) our parsing script could not automatically extract some of the content.
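When extracting data for several CVEs, the command from step 4 can be generated per ID. The following is a minimal sketch, not part of CVE-Genie: it only uses the `--cve_id` and `--output_path` flags shown above, and the helper name `build_extract_command` and the one-JSON-file-per-CVE naming scheme are assumptions made for illustration.

```python
import re
import shlex

# CVE IDs have the form CVE-<4-digit year>-<sequence of 4 or more digits>.
CVE_ID_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}")


def build_extract_command(cve_id, output_dir="./data/example"):
    """Return the argv list for one cve_data.py run (meant to be run from src/)."""
    if not CVE_ID_PATTERN.fullmatch(cve_id):
        raise ValueError(f"not a valid CVE ID: {cve_id!r}")
    # Hypothetical naming scheme: one JSON file per CVE, named after its ID.
    output_path = f"{output_dir.rstrip('/')}/{cve_id}.json"
    return [
        "python3", "./data/scripts/cve_data.py",
        "--cve_id", cve_id,
        "--output_path", output_path,
    ]


if __name__ == "__main__":
    for cve_id in ["CVE-2024-4340"]:
        # Only print the command here; to execute it, pass the list to
        # subprocess.run(..., check=True) from inside src/.
        print(shlex.join(build_extract_command(cve_id)))
```

Rejecting malformed IDs up front avoids launching the extraction script with a typo in the ID.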

❷ Extract Non-CVE Vulnerability Data

Automated extraction of non-CVE data is not currently supported. Refer to PROCESSING.md to add the vulnerability context manually before reproducing it.

B) Run CVE-Genie on the Extracted CVE Data

We provide the following two options to run CVE-Genie:

❶ In DevContainer

‼️ Easy to set up, but it may not work for CVEs that require running multiple services, as these can crash the DevContainer.

  1. Start the devcontainer in VS Code

  2. cd into the src directory

  3. Create a .env file in src/ and add the OPENAI_API_KEY you want to use

  4. Run the following command to reproduce the given CVE

    ENV_PATH=.env MODEL=example_run python3 main.py --cve CVE-2024-4340 --json data/example/test.json --run-type build,exploit,verify
  5. The final results will be stored in /shared/<cve_id>/
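Both the extraction script and the main run depend on keys in src/.env (GITHUB_TOKEN and OPENAI_API_KEY, respectively). As a rough sketch, a small pre-flight check can report which keys are missing before a long run starts. The functions `parse_env` and `missing_keys` are hypothetical helpers, and the parser handles only plain `KEY=VALUE` lines, which is a simplification of the full dotenv syntax.

```python
def parse_env(text):
    """Parse simple KEY=VALUE lines; a simplification of full dotenv syntax."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(text, required):
    """Return the required keys that are absent or empty, in the given order."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    # Placeholder content; in practice read it with open(".env").read().
    sample = "# src/.env\nGITHUB_TOKEN=\nOPENAI_API_KEY='sk-placeholder'\n"
    print(missing_keys(sample, ["GITHUB_TOKEN", "OPENAI_API_KEY"]))  # prints ['GITHUB_TOKEN']
```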

❷ In a Virtual Machine

See the VM Library Documentation for how to run CVE-Genie in a VM.

C) Visualize the Reproduction Run of CVEs

  1. Make sure the CVE reproduction run that you are trying to visualize is in results/reproduced_cve/

  2. Run the visualizer

    cd visualizer/
    python3 serve.py
  3. Open the URL printed by the script above to go to the web application.

  4. Enter the CVE ID in the given field and click Load CVE. You can then navigate through the agent conversations, tool calls, and intermediate artifacts for every component of CVE-Genie in that CVE's reproduction run.
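Step 1 can be checked before starting the server. The sketch below assumes that each run is stored as a directory named after its CVE ID inside results/reproduced_cve/; the README does not state that layout, and `find_run` is a hypothetical helper rather than part of the visualizer.

```python
from pathlib import Path


def find_run(cve_id, results_root=Path("results/reproduced_cve")):
    """Return the run's path if an entry for cve_id exists under results_root, else None."""
    # Assumption: each run is a directory named exactly after its CVE ID.
    candidate = Path(results_root) / cve_id
    return candidate if candidate.is_dir() else None


if __name__ == "__main__":
    run = find_run("CVE-2024-4340")
    if run is None:
        print("Run not found; place it in results/reproduced_cve/ first.")
    else:
        print(f"Found {run}; start the visualizer with: cd visualizer/ && python3 serve.py")
```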