# FirmAgent **Repository Path**: eydelu/FirmAgent ## Basic Information - **Project Name**: FirmAgent - **Description**: No description available - **Primary Language**: Unknown - **License**: Not specified - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2026-02-23 - **Last Updated**: 2026-02-23 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # FirmAgent **⚠️ Legal/Ethical Notice** Only test targets you own or have explicit written permission to assess. You are solely responsible for compliance with local laws, licenses, and organizational policies. ------ ## Table of Contents - [Overview](#overview) - [Prerequisites](#prerequisites) - [Quick Start](#quick-start) - [Workflow](#workflow) - [1) Pre-fuzzing Analysis](#1-pre-fuzzing-analysis) - [2) Runtime Monitoring](#2-runtime-monitoring) - [3) Fuzzing](#3-fuzzing) - [4) Taint-to-PoC Agent](#4-taint-to-poc-agent) - [Input/Output Schemas](#inputoutput-schemas) - [Pre_fuzzing.json](#pre_fuzzingjson-api-definitions) - [Fuzzing output files](#fuzzing-output-files) - [Configuration Reference](#configuration-reference) - [Tips & Best Practices](#tips--best-practices) - [Troubleshooting](#troubleshooting) - [FAQ](#faq) ------ ## Overview ![alt text](Pre_Fuzzing/image.png) This toolkit automates an end-to-end workflow for discovering and validating vulnerabilities in re-hosted IoT firmware: 1. **Fuzzing-Driven Information Collection** - Extract all target URIs and input parameters. - Locate sink functions and compute distances from basic blocks to sinks. - Produce mutation dictionaries for the fuzzer. - Instrumented QEMU binaries collect control-flow data during fuzzing. - Optional: Build emulation images directly with a helper script. - Generic, device-aware API fuzzer with configurable `Host` header. - Uses the pre-fuzzing artifacts to generate requests and mutations. 2. **Taint-to-PoC Agent** - LLM-assisted taint analysis to findings vulnerabilities and produce PoCs. ------ ## Prerequisites - **Python**: 3.8+ Install Python dependencies: ``` pip install -r requirements.txt ``` - **Emulation**: Greenhouse (Details can be found in https://github.com/sefcom/greenhouse.git) - We ship instrumented binaries under `FirmAgent/FuzzingRecord/gh3fuzz/fuzz_bins/qemu/`. - Control-flow collector: `libibresolver.so`. - **Target firmware**: Obtain and re-host the firmware image - **Setting Environment Variables:** ​ export idat64 = path/to/idat64 ​ export OPENAI_API_KEY = {OPENAI_API_KEY} or export Deepseek_API_KEY = {Deepseek_API_KEY} ------ ## Quick Start 1. **Run pre-fuzzing** **in ida** (extract inputs & sinks): These generate: - `Pre_fuzzing.json` (API endpoints + request templates) 2. **Build an emulated image (optional, automated wiring)**: ``` python build_fuzz_img.py ``` This integrates the instrumentation and supporting binaries into the emulation image. 3. **Start fuzzing (from the host)**: ``` python fuzzer.py \ --json-file Pre_fuzzing.json \ --delay 0.5 \ --host {target_ip_or_domain} ``` 4. **Run taint analysis to generate PoCs**: ``` python LLMATaint.py \ -b {path_to_binary} \ -p {True|False} \ -t {vuln_type} \ -o {path_to_resultfolder} \ -m {model} ``` > During fuzzing, a **Source** address list and an **indirect call** JSON are produced—place these files in the **same folder as the target binary** before running `LLMATaint.py`. ------ ## Workflow ### 1) Pre-fuzzing Analysis **Goal:** Collect everything needed to drive effective mutations and triage. - **URIs & Parameters** — `get_args.py` Extracts all API endpoints and “best-effort” parameter templates. - **Sink Functions** — `Get_sinkFunc.py` Identifies sink function address ranges (e.g., dangerous APIs, command execs). - **Distances** — `Distance.py` Computes the distances between each basic block and sink locations. Distances are helpful to prioritize inputs that are “closer” to sensitive sinks. **Outputs:** - `Pre_fuzzing.json` (API and parameter set for fuzzing) ### 2) Runtime Monitoring We provide an instrumented QEMU stack and a control-flow recording library: - **Location:** `FirmAgent/FuzzingRecord/gh3fuzz/fuzz_bins/qemu/` - `libibresolver.so` collects control-flow events. - **Recommended path:** - Use `build_fuzz_img.py` to assemble the emulation image. - Instrumentation and collectors are automatically integrated. - **Customization:** You can extend `qemuafl` to log additional runtime signals. We provide our **Fuzzing-SA** source for reference (see repo). ### 3) Fuzzing Run the generic fuzzer against your re-hosted device: ``` python fuzzer.py \ --json-file Pre_fuzzing.json \ --delay 0.5 \ --host {target_ip_or_domain} ``` - `--json-file` — Pre-fuzzing output containing API definitions. - `--delay` — Inter-request sleep to avoid rate-limits / WAFs. - `--host` — Value for the `Host` header (supports SNI / vhosts). **What happens:** - The fuzzer parses `Pre_fuzzing.json` and parameter templates. - For each endpoint & parameter, it generates mutations (e.g., command-injection, XSS, traversal). - Requests and responses are logged to `fuzzing_results.log`; structured details to `detailed_results.json`. - Potential vulnerabilities are flagged based on error codes, timing, and content indicators. ### 4) Taint-to-PoC Agent Once fuzzing surfaces suspicious behavior, validate and transform it into concrete PoCs: ``` python LLMATaint.py \ -b {path_to_binary} \ -p {True|False} \ -t {vuln_type} \ -o {path_to_resultfolder} \ -m {model} ``` **Important:** Place the **Source addresses** file and the **indirect call JSON** produced during fuzzing into the **same directory as the target binary** before running `LLMATaint.py`. This lets the agent align runtime signals with code structures for precise taint propagation and PoC crafting. ------ ## Input/Output Schemas ### Pre_fuzzing.json (API definitions) An array of API objects. Minimal fields required by `fuzzer.py`: ``` [ { "api_url": "/nitro/v1/config/example_endpoint", "http_method": "POST", "request_payload": "{ \"username\": , \"cmd\": , \"enabled\": true }" } ] ``` - `api_url`: Relative path joined with `--base-url`. - `http_method`: `GET`, `POST`, `PUT`, `DELETE`, … - `request_payload`: A JSON string containing placeholders: - `` indicates a fuzzable string parameter. - Boolean literals `true/false` are supported and extracted but (by default) only string-type placeholders are fuzzed. ### Fuzzing output files - **`fuzzing_results.log`** — Human-readable progress and warnings. - **`detailed_results.json`** — Line-delimited JSON of each attempt: - **`Source.json`** — source address - **`Indirect_call.json`** — (caller, callee) ------ ## Configuration Reference `fuzzer.py` CLI: ``` --json-file (str, required) Path to Pre_fuzzing.json --delay (float, default 0.5) Seconds between requests --host (str, required) Host header value (IP or DNS name) ``` **Headers** (inside `send_request`): - `Host` is set from `--host`. - `Content-Type: application/json` by default. - Adjust or extend headers as needed for auth, cookies, etc. **Payload dictionary** (default built-ins): - `buffer_overflow` – long `'A' * 60` - `command_injection_*` – `|`, `;`, `&`, backticks, `$()` forms - `xss_payload` – basic `` - `path_traversal` – `../../../etc/passwd` > Extend `self.payloads` in `APIFuzzer.__init__` to add more patterns. ------ ## Tips & Best Practices - **Rate limiting / IDS**: Tune `--delay`, random jitter is already built-in between endpoints. - **HTTPS**: If you hit certificate issues, you can allow `verify=False` (already used for non-GET). Prefer installing proper CA bundles in production assessments. - **Virtual hosts / SNI**: Always pass `--host` when the server expects a specific `Host` header different from the IP. - **Schema hygiene**: Keep `Pre_fuzzing.json` valid. The fuzzer does some lightweight repair, but broken JSON templates will reduce coverage. - **Narrow scope rapidly**: Use sink distance data to prioritize endpoints closer to danger zones. - **Artifact hygiene**: Keep fuzzing artifacts (source addresses, indirect call JSON) side-by-side with binaries before launching `LLMATaint.py`. ------ ## Troubleshooting - **`No API definitions found`**: Ensure `Pre_fuzzing.json` exists and is valid JSON. Run pre-fuzzing scripts again. - **`No parameters found` for an endpoint**: Make sure your `request_payload` string includes `` placeholders for fuzzable fields. - **`Request error: ...` / timeouts**: - Verify network reachability from the host to the emulated firmware. - Increase timeouts in `send_request` if needed. - Check target service actually listens on the base URL. - **Target requires auth**: Add cookies/headers/tokens in `send_request()` or pre-set in `para_results.json`. You can also add an `Authorization` header. - **Instrumentation not recording**: Prefer using `build_fuzz_img.py`. If doing it manually, confirm your loader paths and (if applicable) `LD_PRELOAD`/QEMU plugins are active. Ensure `libibresolver.so` is discoverable by the emulated environment. - **LLMATaint cannot find source/indirect call files**: Place both artifacts in the **same directory** as the target binary passed via `-b`. ------ ## FAQ **Q:** How do I add new payloads (e.g., SQLi)? **A:** Extend `self.payloads` in `APIFuzzer.__init__`. Add detection logic (error signatures, timing, reflections) in `analyze_response()`. **Q:** Can I parallelize fuzzing? **A:** The provided script is single-process to preserve ordering and throttling. You can shard `Pre_fuzzing.json` by endpoints across multiple processes/containers. **Q:** Where do logs go? **A:** Human logs → `fuzzing_results.log`; structured artifacts → `detailed_results.json`. ### Example Commands Recap **Build emulation image (optional):** ``` bash python build_fuzz_img.py ``` **Run fuzzing (host):** ``` python fuzzer.py \ --json-file Pre_fuzzing.json \ --delay 0.5 \ --host 192.168.0.1 ``` **Run taint-to-PoC agent:** ``` python LLMATaint.py \ -b ./bin/httpd \ -p True \ -t ci \ -o ./results/ASUS \ -m R1 ```