# pipa

**Repository Path**: bernard5/pipa

## Basic Information

- **Project Name**: pipa
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2024-06-05
- **Last Updated**: 2025-12-30

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# PIPA

PIPA (Progressive & Intelligent Performance Analytics) is a platform that aggregates a complete toolchain of performance data collection, processing, and analysis with advanced algorithms, enabling users to effortlessly obtain in-depth insights into the performance of their systems and applications. It bridges the gap between raw performance data and actionable information, allowing for quick identification of bottlenecks and optimization opportunities.

PIPA (ๆž‡ๆท, loquat) is a local fruit of Zhejiang, China. PIPA consists of three parts: loquat tree, flower and fruit, which represent the collecting & processing, analysis and conclusion of performance data respectively. PIPA is still in the active development process, and the current development focus is on the loquat tree. ![GitHub License](https://img.shields.io/github/license/ZJU-SPAIL/pipa) ![GitHub Actions Workflow Status](https://img.shields.io/github/actions/workflow/status/ZJU-SPAIL/pipa/main.yml) ![GitHub top language](https://img.shields.io/github/languages/top/ZJU-SPAIL/pipa) [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black) [![Coverage Status](https://coveralls.io/repos/github/ZJU-SPAIL/pipa/badge.svg?branch=main)](https://coveralls.io/github/ZJU-SPAIL/pipa?branch=main) ## Features - **Data Collecting**: PIPA can collect data from a variety of sources, using tools like perf, sar, and more. It supports multiple platforms including x86_64, ARM, and RISC-V, making it versatile and adaptable. Currently PIPA is capable of collecting and parsing perf and sar data, providing detailed performance metrics. - **pipa-tree Collector**: A specialized two-phase performance data collection tool: - **Counting Phase**: Uses `perf stat` and `sar` to gather system-wide metrics at configurable intervals - **Profiling Phase**: Uses `perf record` with DWARF call-graph for detailed function-level profiling - **Cross-Platform**: Automatically adapts event groups based on CPU architecture (x86_64/ARM/RISC-V) - **Automated Packaging**: Generates standard `.tar.gz` archives with metadata and machine identifiers - **MCP Service**: Provides Model Context Protocol (MCP) server for flamegraph analysis tools, enabling programmatic stack trace inspection, symbol overhead calculation, and subset analysis via HTTP transport. 
- **Script Generation**: To reduce the noise generated by the Python runtime, PIPA can generate scripts that collect performance data.
- **Data Processing**: PIPA can process the collected performance data, including alignment and segmentation, to support meaningful analysis.
- **Data Visualization**: PIPA can visualize the collected performance data to provide intuitive insights.
- **Data Analytics**: PIPA will integrate SPAIL's performance methodology and models to provide meaningful analysis and reveal software and hardware bottlenecks.

```mermaid
sequenceDiagram
    participant User as 👤 User/External System
    participant Fruit as 🍎 PIPA Fruit<br/>📊 Insights & Visualization
    participant Flower as 🌸 PIPA Flower<br/>🔍 Analysis & Attribution
    participant Tree as 🌳 PIPA Tree<br/>📥 Data Collection & QC
    participant Source as 🔌 Data Sources<br/>(perf, sar, eBPF...)

    Note over User, Source: 🚀 Main Data Flow: From Collection to Insights

    User->>Fruit: 📨 Submit performance analysis request
    activate Fruit
    Fruit->>Flower: 📋 Forward analysis requirements
    activate Flower
    Flower->>Tree: 📊 Request performance data
    activate Tree
    Tree->>Source: 🔍 Collect multi-source performance data
    activate Source
    Source-->>Tree: 📄 Return raw data
    deactivate Source

    Note over Tree: 🛡️ Execute Data Quality Control<br/>- ✅ Validation rules<br/>- 🧹 Auto-cleansing & tagging<br/>- 💻 Platform abstraction (x86/ARM/RISC-V)

    Tree-->>Flower: ✅ Return validated, reliable data
    deactivate Tree

    Note over Flower: 🔬 Perform Deep Analysis & Attribution<br/>- 📈 System-level workload characterization analysis<br/>- 🔥 Code-level profiling (hotspot/lock)<br/>- ⚙️ Instruction-level tracing<br/>- 🔗 Cross-layer correlation & attribution

    Flower-->>Fruit: 📊 Return bottleneck analysis results &<br/>🔍 root cause insights
    deactivate Flower

    Note over Fruit: 💡 Generate Actionable Insights<br/>- 📈 Interactive dashboards<br/>- 💡 Optimization opportunities<br/>- ✅ Actionable recommendations

    Fruit-->>User: 📋 Deliver visualization report &<br/>💡 suggestions
    deactivate Fruit

    Note over Flower, Tree: 🔄 Feedback Loop: Continuous Optimization

    Flower->>Tree: 💬 Provide data quality feedback &<br/>📋 new requirements
    activate Tree
    Tree-->>Flower: ✅ Confirm & adjust collection strategy
    deactivate Tree
```

## Installation

PIPA can be easily installed using pip:

```sh
pip install PyPIPA
```

## Quickstart

### Installation

#### Option 1: Python Package Installation

```sh
pip install PyPIPA
```

#### Option 2: pipa-tree Collector

```bash
git clone https://github.com/ZJU-SPAIL/pipa.git
cd pipa/script/pipa-tree
sudo ./install.sh  # Install to /usr/local/bin
sudo ./install.sh --all-users  # Optional: configure sudo for perf (recommended for multi-user environments)
```

For detailed installation instructions, see [pipa-tree Installation Guide](script/pipa-tree/INSTALL.md).

### Using pipa-tree

The pipa-tree collector provides a simple interface to gather performance metrics:

```bash
# Basic usage (default: 60s counting + 60s profiling)
pipa-tree collect

# Custom durations
pipa-tree collect --duration-stat 90 --duration-record 30

# Specify output file
pipa-tree collect --output ./mydata.tar.gz

# Advanced options
pipa-tree collect \
  --perf-record-freq 199 \
  --perf-events "cycles,instructions" \
  --output ./pipa-collection-demo.tar.gz
```

**Output**: A `.tar.gz` archive containing:

- `spec_info.yaml` - Hardware specifications
- `attach_session/pipatree.log` - Detailed collection logs
- `attach_session/perf_stat.txt` - Counting phase metrics
- `attach_session/perf.data` - Profiling data
- `attach_session/sar_*.csv` - System activity reports

### Using PIPA CLI

To generate a script that collects performance data:

```sh
pipa generate
```

The CLI then prompts interactively for the necessary parameters. You can either start the workload under perf or observe the running system directly.

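Before handing a pipa-tree archive to the analysis step, it can be useful to confirm that the files listed under **Output** above are actually present. The following is a minimal sketch using only Python's standard `tarfile` module, not part of PIPA itself. The helper names (`list_archive_members`, `missing_expected_files`, `build_demo_archive`) are hypothetical, the demo archive contains placeholder data rather than real collector output, and matching by path suffix is an assumption made because the archive may or may not wrap its contents in a top-level directory.

```python
import io
import tarfile
import tempfile
from pathlib import Path

# File names taken from the output listing above. The sar_*.csv files are
# omitted because their exact names are not specified.
EXPECTED = [
    "spec_info.yaml",
    "attach_session/pipatree.log",
    "attach_session/perf_stat.txt",
    "attach_session/perf.data",
]


def list_archive_members(archive_path):
    """Return the sorted names of regular files inside a .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(m.name for m in tar.getmembers() if m.isfile())


def missing_expected_files(members, expected):
    """Return the expected names that no archive member path ends with.

    Suffix matching is an assumption: it tolerates an optional top-level
    directory inside the archive.
    """
    return [name for name in expected if not any(m.endswith(name) for m in members)]


def build_demo_archive(path):
    """Create a stand-in archive with placeholder contents (not real PIPA output)."""
    with tarfile.open(path, "w:gz") as tar:
        for name in EXPECTED[:-1]:  # deliberately leave out perf.data
            data = b"placeholder\n"
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "demo.tar.gz"
        build_demo_archive(demo)
        members = list_archive_members(demo)
        print("members:", members)
        print("missing:", missing_expected_files(members, EXPECTED))
```

To check a real collection, call `list_archive_members` on the path passed to `--output` (for example `./pipa-collection-demo.tar.gz`) instead of the demo archive.
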
### Analyzing Collected Data

After collecting data with pipa-tree, analyze it with the PIPA CLI:

```bash
make  # Build and install pipa
pipa analyze ./pipa-collection-demo.tar.gz \
  --expected-cpus 0-7 \
  --symfs /path/to/symbols \
  --kallsyms /proc/kallsyms
```

This generates a `report.html` with CPU clustering, NUMA load, disk warnings, and hot spots.

### MCP Service

Start the MCP server for flamegraph analysis:

```bash
PYTHONPATH=src python -m pipa.service.mcp --host 0.0.0.0 --port 8000 --path /mcp
```

For detailed MCP tool usage and examples, see [MCP Service Guide](doc/mcp.md).

### Documentation

- **Quick Start Guide**: [doc/quick-start.md](doc/quick-start.md)
- **Development Guidelines**: [doc/development.md](doc/development.md)
- **MCP Service**: [doc/mcp.md](doc/mcp.md)

PIPA's API documentation is available at [https://zju-spail.github.io/pipa/](https://zju-spail.github.io/pipa/).

## Build

To build PIPA, run the `build` module with Python: `python -m build`. PIPA uses `hatchling` as the build backend.

## LICENSE

PIPA is distributed under the terms of the [MIT License](LICENSE).

## Contributing

Contributions to PIPA are always welcome. Whether it's feature enhancements, bug fixes, or documentation, your contributions are greatly appreciated.