# dpdk_test
**Repository Path**: wudi202/dpdk_test
## Basic Information
- **Project Name**: dpdk_test
- **Description**: A simple DPDK-based packet receiving and processing program that can output ClickHouse ("ck") logs
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-02-10
- **Last Updated**: 2026-02-24
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# DPDK Packet Processing System
High-performance network traffic analysis system based on DPDK (Data Plane Development Kit). Designed for efficient protocol identification, flow tracking, and structured logging to ClickHouse.
## Features
- **High-Performance Architecture**:
  - Built on DPDK for line-rate packet processing.
  - Scalable multi-core design using RSS (Receive Side Scaling).
  - **Lockless Design**: Per-thread Flow Tables and Aging Lists to minimize contention.
  - **Optimized Memory**: Dedicated `rte_mempool` for log messages to avoid expensive allocations in the fast path.
- **Protocol Detection Engine**:
  - **HTTP**: Identifies GET requests; extracts Host, URL, User-Agent, and custom headers.
  - **TLS**: Strict ClientHello validation; extracts SNI, JA3 (MD5), and JA4 (SHA256) fingerprints.
  - **STUN**: Detects STUN traffic via the standard Magic Cookie.
  - **Port-based**: Fast lookup for well-known ports (thread-local tables).
  - **Modular**: Registration-based framework (`proto_register`) makes adding new protocols easy.
- **Traffic & Flow Analysis**:
  - **Bi-directional Flow Tracking**: Tracks 5-tuple flows (Src/Dst IP, Port, Proto).
  - **Smart Aging**: 3-tier aging strategies (TCP active, UDP, TCP FIN/RST) to manage state efficiently.
  - **IP Watchlist**: High-performance CIDR matching using DPDK LPM (Longest Prefix Match).
- **Logging & Analytics**:
  - **ClickHouse Backend**: High-throughput insertion for massive-scale logging.
  - **Batch Processing**: Logs are batched by type (HTTP/TLS/Flow) for efficient database writes.
  - **Rich Metadata**: Logs include timestamps, IP/Ports, and protocol-specific fields (headers, fingerprints).
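
To make the IP Watchlist item concrete, here is a minimal sketch of what longest-prefix matching means for CIDR rules. It uses a plain linear scan purely for illustration; DPDK's LPM library answers the same question with table lookups. The names `cidr_rule_t` and `watchlist_lookup` are hypothetical and do not come from this repository.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rule layout: IPv4 prefix in host byte order, prefix length, user tag. */
typedef struct {
    uint32_t prefix;
    uint8_t  depth; /* 0..32 */
    uint32_t tag;
} cidr_rule_t;

/* Network mask for a prefix length; depth 0 is handled separately because
 * shifting a 32-bit value by 32 is undefined. */
static uint32_t mask_of(uint8_t depth)
{
    return depth == 0 ? 0u : 0xFFFFFFFFu << (32 - depth);
}

/* Returns 0 and writes the tag of the most specific matching rule,
 * or -1 if no rule covers the address. */
int watchlist_lookup(const cidr_rule_t *rules, size_t n, uint32_t ip,
                     uint32_t *tag_out)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (rules[i].depth > 32)
            continue; /* ignore malformed rules */
        uint32_t m = mask_of(rules[i].depth);
        if ((ip & m) == (rules[i].prefix & m) && (int)rules[i].depth > best) {
            best = rules[i].depth;
            *tag_out = rules[i].tag;
        }
    }
    return best >= 0 ? 0 : -1;
}
```

With rules for `10.0.0.0/8` and `10.1.0.0/16`, the address `10.1.2.3` matches both, and the `/16` rule wins because it is more specific.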
## System Architecture
### 1. Data Flow Pipeline
```mermaid
graph TB
subgraph NIC["Network Interface (DPDK Port)"]
Q0[RX Queue 0]
Q1[RX Queue 1]
end
subgraph Workers["Worker Threads (lcore pinned)"]
W0["Worker 0"]
W1["Worker 1"]
end
Q0 --> W0
Q1 --> W1
subgraph Processing["Packet Processing Pipeline"]
PP["Packet Parse<br/>Eth/IP/TCP/UDP"]
FK["Flow Key<br/>5-Tuple + Direction"]
FT["Flow Lookup<br/>rte_hash table"]
FA["Flow Aging<br/>Active/UDP/FIN Chains"]
PD["Proto Detect<br/>Dispatch Engine"]
end
W0 --> PP --> FK --> FT --> FA --> PD
W1 --> PP
subgraph Detectors["Protocol Detectors"]
HTTP["HTTP<br/>Method/Header Analysis"]
TLS["TLS<br/>ClientHello Parsing"]
STUN["STUN<br/>Magic Cookie"]
PORT["Port<br/>Well-known Lookup"]
end
PD --> HTTP
PD --> TLS
PD --> STUN
PD --> PORT
subgraph Logging["Logging System"]
Ring["DPDK Ring<br/>(MP/SC mode)"]
LT["Log Thread<br/>(Dedicated lcore)"]
CK[(ClickHouse)]
end
HTTP -->|Log Msg| Ring
TLS -->|Log Msg| Ring
FA -->|Log Msg| Ring
Ring --> LT -->|Batch Insert| CK
```
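
The "Flow Key (5-Tuple + Direction)" stage can be sketched as follows. This is one common way to make both directions of a connection map to the same table entry, not necessarily the method used in this repository: the two endpoints are stored in a canonical order, and the function reports whether the packet travelled with or against that order. The `flow_key_t` layout and `flow_key_make` name are assumptions for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical key layout; the project's real structure is not shown in this README. */
typedef struct {
    uint32_t ip_lo, ip_hi;     /* endpoints in canonical (sorted) order */
    uint16_t port_lo, port_hi;
    uint8_t  proto;            /* IP protocol number, e.g. 6 = TCP, 17 = UDP */
} flow_key_t;

typedef enum { DIR_FORWARD = 0, DIR_REVERSE = 1 } flow_dir_t;

/* Build a direction-independent key. The returned direction is relative to
 * the canonical ordering, not to which side opened the connection. */
flow_dir_t flow_key_make(flow_key_t *key,
                         uint32_t src_ip, uint16_t src_port,
                         uint32_t dst_ip, uint16_t dst_port, uint8_t proto)
{
    /* Zero everything, including padding, so byte-wise hashing and
     * comparison of the key are deterministic. */
    memset(key, 0, sizeof(*key));
    key->proto = proto;

    int swap = (src_ip > dst_ip) || (src_ip == dst_ip && src_port > dst_port);
    if (!swap) {
        key->ip_lo = src_ip;  key->port_lo = src_port;
        key->ip_hi = dst_ip;  key->port_hi = dst_port;
        return DIR_FORWARD;
    }
    key->ip_lo = dst_ip;  key->port_lo = dst_port;
    key->ip_hi = src_ip;  key->port_hi = src_port;
    return DIR_REVERSE;
}
```

A request from `10.0.0.1:40000` to `10.0.0.2:443` and the matching reply produce byte-identical keys, so a single hash lookup serves both directions.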
### 2. Core Components
#### Threading Model
- **Worker Threads**:
  - Bound to specific CPU cores (`lcore`).
  - Each processes a dedicated RX queue (run-to-completion model).
  - Use per-thread (thread-local) Flow Tables and Aging Lists to avoid lock contention.
- **Log Thread**:
  - Runs on a dedicated core.
  - Consumes messages from a shared `rte_ring`.
  - Batches writes to ClickHouse to maximize throughput.
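
The log thread's batching step might look roughly like the sketch below: messages are grouped per type and handed to a sink once a batch fills. All names (`batcher_t`, `batcher_add`, `tally_sink`) and the tiny `BATCH_SIZE` are hypothetical, and the sink only counts rows where the real program would issue a ClickHouse insert.

```c
#include <stddef.h>

#define BATCH_SIZE 4 /* illustrative; a real batch would be far larger */

typedef enum { LOG_HTTP = 0, LOG_TLS, LOG_FLOW, LOG_TYPE_COUNT } log_type_t;

typedef struct {
    log_type_t type; /* protocol-specific fields omitted */
} log_msg_t;

typedef void (*flush_fn)(log_type_t type, const log_msg_t *const *msgs,
                         size_t n, void *ctx);

typedef struct {
    const log_msg_t *items[LOG_TYPE_COUNT][BATCH_SIZE];
    size_t count[LOG_TYPE_COUNT];
    flush_fn flush;
    void *ctx;
} batcher_t;

void batcher_init(batcher_t *b, flush_fn flush, void *ctx)
{
    for (int t = 0; t < LOG_TYPE_COUNT; t++)
        b->count[t] = 0;
    b->flush = flush;
    b->ctx = ctx;
}

/* Hand the pending messages of one type to the sink, if there are any. */
void batcher_flush(batcher_t *b, log_type_t t)
{
    if (b->count[t] == 0)
        return;
    b->flush(t, b->items[t], b->count[t], b->ctx);
    b->count[t] = 0;
}

/* Queue one message; flush that type's batch as soon as it is full. */
void batcher_add(batcher_t *b, const log_msg_t *m)
{
    b->items[m->type][b->count[m->type]++] = m;
    if (b->count[m->type] == BATCH_SIZE)
        batcher_flush(b, m->type);
}

/* Flush partial batches, e.g. on a timer or at shutdown. */
void batcher_flush_all(batcher_t *b)
{
    for (int t = 0; t < LOG_TYPE_COUNT; t++)
        batcher_flush(b, (log_type_t)t);
}

/* Stand-in sink: tallies rows per type instead of writing to a database. */
void tally_sink(log_type_t type, const log_msg_t *const *msgs, size_t n,
                void *ctx)
{
    (void)msgs;
    size_t *rows = ctx;
    rows[type] += n;
}
```

Keeping one buffer per type means each flush targets a single table, which is what makes the batched insert cheap. A periodic `batcher_flush_all` keeps low-volume types from waiting indefinitely.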
#### Memory Management
- **Mempools**: Extensive use of `rte_mempool` to avoid heap fragmentation and per-allocation overhead.
  - **Packet Pool**: Standard DPDK mbuf pool.
  - **Flow Pool**: Fixed-size pool for `flow_entry_t` structures per thread.
  - **Log Message Pool**: Dedicated pool for passing analytics data from Worker to Log thread.
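
The idea behind a fixed-size pool can be illustrated without DPDK. The sketch below is a single-threaded stand-in, not `rte_mempool` itself: all objects are carved out of one up-front allocation and recycled through a free list, so the fast path never calls `malloc`. The `obj_pool_t` type and `pool_*` functions are hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct pool_node { struct pool_node *next; } pool_node_t;

typedef struct {
    unsigned char *storage;   /* one contiguous block holding every object */
    pool_node_t   *free_list; /* singly linked list of unused slots */
    size_t         free_count;
} obj_pool_t;

/* Allocate room for n objects of obj_size bytes. Returns 0 on success. */
int pool_init(obj_pool_t *p, size_t obj_size, size_t n)
{
    size_t align = _Alignof(max_align_t);

    if (n == 0)
        return -1;
    /* A free slot stores the list link, so it must be large enough for one,
     * and every slot must start on a suitably aligned address. */
    if (obj_size < sizeof(pool_node_t))
        obj_size = sizeof(pool_node_t);
    obj_size = (obj_size + align - 1) / align * align;

    p->storage = malloc(obj_size * n);
    if (p->storage == NULL)
        return -1;

    p->free_list = NULL;
    for (size_t i = 0; i < n; i++) {
        pool_node_t *node = (pool_node_t *)(p->storage + i * obj_size);
        node->next = p->free_list;
        p->free_list = node;
    }
    p->free_count = n;
    return 0;
}

/* O(1) allocation: pop a slot, or return NULL when the pool is exhausted. */
void *pool_get(obj_pool_t *p)
{
    pool_node_t *node = p->free_list;
    if (node == NULL)
        return NULL;
    p->free_list = node->next;
    p->free_count--;
    return node;
}

/* O(1) release: push the slot back onto the free list. */
void pool_put(obj_pool_t *p, void *obj)
{
    pool_node_t *node = obj;
    node->next = p->free_list;
    p->free_list = node;
    p->free_count++;
}

void pool_destroy(obj_pool_t *p)
{
    free(p->storage);
    p->storage = NULL;
    p->free_list = NULL;
    p->free_count = 0;
}
```

An exhausted pool returns `NULL` instead of growing, which bounds memory use and forces the caller to decide what to drop under load.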
#### Flow Table Design
- **Key**: Standard 5-tuple (Src/Dst IP, Src/Dst Port, Proto).
- **Structure**: `rte_hash` (Cuckoo hash) for fast lookups.
- **Aging**:
  - Three linked lists per thread: `TCP_Active`, `UDP`, `TCP_FIN`.
  - **LRU-like behavior**: Each packet moves its flow to the tail of its list; the aging scanner checks from the head.
  - **Efficient**: The scan stops at the first non-expired flow, since each list is ordered by last-seen time.
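
The aging scheme described above can be sketched as a doubly linked list kept in last-seen order. The types and function names here (`flow_node_t`, `aging_touch`, `aging_expire`) are hypothetical, and the clock is assumed to be monotonic; one such list with its own timeout would exist for each of the three tiers.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct flow_node {
    struct flow_node *prev, *next;
    uint64_t last_seen; /* timestamp of the most recent packet */
} flow_node_t;

typedef struct {
    flow_node_t *head, *tail; /* head = oldest, tail = most recently seen */
    uint64_t timeout;         /* idle time after which a flow expires */
} aging_list_t;

static void list_unlink(aging_list_t *l, flow_node_t *f)
{
    if (f->prev) f->prev->next = f->next; else l->head = f->next;
    if (f->next) f->next->prev = f->prev; else l->tail = f->prev;
    f->prev = f->next = NULL;
}

static void list_append(aging_list_t *l, flow_node_t *f)
{
    f->prev = l->tail;
    f->next = NULL;
    if (l->tail) l->tail->next = f; else l->head = f;
    l->tail = f;
}

/* New flow: it is the most recently seen, so it goes to the tail. */
void aging_insert(aging_list_t *l, flow_node_t *f, uint64_t now)
{
    f->last_seen = now;
    list_append(l, f);
}

/* Packet on an existing flow: refresh its timestamp and move it to the tail. */
void aging_touch(aging_list_t *l, flow_node_t *f, uint64_t now)
{
    list_unlink(l, f);
    f->last_seen = now;
    list_append(l, f);
}

/* Expire from the head and stop at the first flow that is still fresh:
 * everything behind it was seen even more recently. Returns the count. */
size_t aging_expire(aging_list_t *l, uint64_t now,
                    void (*on_expire)(flow_node_t *))
{
    size_t n = 0;
    while (l->head && now - l->head->last_seen >= l->timeout) {
        flow_node_t *f = l->head;
        list_unlink(l, f);
        if (on_expire)
            on_expire(f);
        n++;
    }
    return n;
}
```

Because every touch is O(1) and the scan only visits flows that actually expire, the cost of aging is proportional to the number of expirations rather than to the table size.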
#### Protocol Detection
- **Mechanism**: Pluggable framework.
- **Registration**: Detectors register via `proto_register()` at startup.
- **Priority**:
  1. **Strict Detectors** (e.g., HTTP method match, TLS ClientHello).
  2. **Port-based** (Fallback if payload analysis fails).
- **Extensibility**:
  - Add `src/proto/proto_new.c`.
  - Implement `detect_fn` and `process_fn`.
  - Register in `proto_detect_init()`.
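
A registration-based detector chain with a port fallback could be structured like the sketch below. The README names `proto_register()`, `detect_fn`, and `proto_detect_init()` but does not show their signatures, so the ones used here are assumptions, as are `proto_detect()` and the tiny port table; `process_fn` is left out for brevity. The STUN check follows RFC 5389: a 20-byte header, the two most significant bits of the first byte set to zero, and the magic cookie `0x2112A442` at bytes 4 to 7.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PROTOS 16

/* Assumed signature: returns non-zero when the payload matches. */
typedef int (*detect_fn)(const uint8_t *payload, size_t len);

typedef struct {
    const char *name;
    detect_fn   detect;
} proto_ops_t;

static proto_ops_t g_protos[MAX_PROTOS];
static int g_proto_count;

/* Append a detector; detectors are tried in registration order. */
int proto_register(const char *name, detect_fn detect)
{
    if (g_proto_count >= MAX_PROTOS)
        return -1;
    g_protos[g_proto_count].name = name;
    g_protos[g_proto_count].detect = detect;
    return g_proto_count++;
}

/* Strict detector: payload starts with an HTTP GET request line. */
static int detect_http_get(const uint8_t *p, size_t len)
{
    return len >= 4 && memcmp(p, "GET ", 4) == 0;
}

/* Strict detector: STUN header with the RFC 5389 magic cookie. */
static int detect_stun(const uint8_t *p, size_t len)
{
    static const uint8_t cookie[4] = { 0x21, 0x12, 0xA4, 0x42 };
    return len >= 20 && (p[0] & 0xC0) == 0 && memcmp(p + 4, cookie, 4) == 0;
}

/* Fallback: hypothetical well-known port table. */
static const char *port_lookup(uint16_t port)
{
    switch (port) {
    case 53:  return "dns";
    case 80:  return "http";
    case 443: return "tls";
    default:  return NULL;
    }
}

/* Strict payload detectors first, port-based lookup only if none matched. */
const char *proto_detect(const uint8_t *payload, size_t len, uint16_t dst_port)
{
    for (int i = 0; i < g_proto_count; i++)
        if (g_protos[i].detect(payload, len))
            return g_protos[i].name;
    return port_lookup(dst_port);
}

void proto_detect_init(void)
{
    g_proto_count = 0;
    proto_register("http", detect_http_get);
    proto_register("stun", detect_stun);
}
```

Running payload checks before the port table means HTTP on a non-standard port such as 8080 is still classified correctly, while opaque traffic on port 53 falls back to the port label.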
## Directory Structure
```
dpdk_test/
├── src/
│ ├── config/ # JSON configuration loading
│ ├── flow/ # Flow table and aging management
│ ├── iplist/ # IP watchlist (LPM)
│ ├── log/ # Logging thread and ClickHouse writer
│ ├── packet/ # Packet parsing and processing loop
│ ├── proto/ # Protocol detection modules (HTTP, TLS, etc.)
│ ├── utils/ # Hashing (MD5, SHA256) and helpers
│ ├── dpdk_init.c # EAL and resource initialization
│ └── main.c # Entry point and thread launching
├── conf/ # Configuration files
├── sql/ # ClickHouse DDL scripts
└── Makefile # Build script
```
## Requirements
- **OS**: Linux with the usual DPDK prerequisites in place (e.g., hugepages configured).
- **DPDK**: Version 21.11 LTS or compatible.
- **Libraries**:
  - `libcjson`
  - `clickhouse-cpp` (and its dependencies: `lz4`, `zstd`, `cityhash`, `openssl`)
## Build Instructions
1. **Setup Environment**: Ensure `PKG_CONFIG_PATH` includes DPDK and clickhouse-cpp.
2. **Compile**:
```bash
make
# or for verbose output
make V=1
```
3. **Database Setup**: Apply the schema to ClickHouse.
```bash
clickhouse-client --multiline < sql/create_tables.sql
```
## Configuration
Edit `conf/config.json` to match your environment. Key settings:
- **dpdk**: `mbuf_pool_size`, `rx_desc` size.
- **ports**: Map physical DPDK ports to RX queues.
- **workers**: Bind logical cores (`lcore_id`) to specific Port/Queue pairs.
- **clickhouse**: Database host, port, credentials, and table names.
- **modules**: Toggle protocol parsers (`http`, `tls`, `stun`, etc.).
## Usage
Run the generated binary with the EAL arguments (consumed by DPDK) first, then the `--` separator, then the application arguments.
```bash
# Syntax
sudo ./build/dpdk_pkt_proc [EAL Options] -- -c <config_file>
# Example
# -l 1-4 : Use cores 1, 2, 3, 4 (e.g., 3 workers + 1 log thread)
# -n 4 : 4 Memory channels
# -- : Separator
# -c ... : Config file path
sudo ./build/dpdk_pkt_proc -l 1-4 -n 4 -- -c conf/config.json
```
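
The `--` separator matters because EAL has its own `-c` option (a core mask), so the application's `-c` must only be looked for after the separator. The helper below is a hypothetical, DPDK-free sketch of that split; in a real DPDK program `rte_eal_init()` consumes the EAL part and the application parses what remains. The name `app_config_path` is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Return the value following "-c" in the application part of the command
 * line (everything after "--"), or NULL if it is absent. */
const char *app_config_path(int argc, char **argv)
{
    int i = 1;

    /* Skip the EAL options up to the separator. */
    while (i < argc && strcmp(argv[i], "--") != 0)
        i++;

    /* Scan only the arguments after "--". */
    for (i++; i + 1 < argc; i++)
        if (strcmp(argv[i], "-c") == 0)
            return argv[i + 1];
    return NULL;
}
```

For the example command above this yields `conf/config.json`, while a `-c` that appears before the separator is ignored.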
## Architecture Notes
- **Threading**:
  - **Worker Threads**: Pinned to cores. Each polls a specific RX queue, processes packets (Parse -> Flow -> Detect), and enqueues logs to a ring.
  - **Log Thread**: Pinned to a separate core. Dequeues batches from the ring and writes to ClickHouse.
- **Memory Management**: Uses `rte_mempool` for all high-frequency allocations (Packets, Flow Entries, Log Messages) to ensure stable latency and prevent fragmentation.