# fluss

**Repository Path**: mirrors_apache/fluss

## Basic Information

- **Project Name**: fluss
- **Description**: Apache Fluss is a streaming storage built for real-time analytics.
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-07-02
- **Last Updated**: 2026-02-07

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README




## What is Apache Fluss (Incubating)?

Apache Fluss (Incubating) is a streaming storage built for real-time analytics & AI which can serve as the real-time data layer for Lakehouse architectures.

It bridges the gap between **data streaming** and **data Lakehouse** by enabling low-latency, high-throughput data ingestion and processing while integrating seamlessly with popular compute engines such as **Apache Flink**; support for Apache Spark and StarRocks is coming soon.

**Fluss (German: river, pronounced `/flus/`)** lets streaming data continuously converge, distribute, and flow into lakes, like a river 🌊

## Features

- **Sub-Second Data Freshness**: Continuous ingestion and immediate availability of data enable low-latency analytics and real-time decision-making at scale.
- **Streaming & Lakehouse Unification**: Streaming-native storage with low-latency access on top of the lakehouse, using tables as a single abstraction to unify real-time and historical data across engines.
- **Columnar Streaming**: Built on Apache Arrow, it enables database primitives on data streams along with techniques such as column pruning and predicate pushdown. This ensures engines read only the data they need, minimizing I/O and network costs.
- **Compute–Storage Separation**: Stream processors focus on pure computation while Fluss manages state and storage, with features like deduplication, partial updates, delta joins, and aggregation merge engines.
- **ML & AI–Ready Storage**: A unified storage layer supporting row-based, columnar, vector, and multi-modal data, enabling real-time feature stores and a centralized data repository for ML and AI systems.
- **Changelogs & Decision Tracking**: Built-in changelog generation provides an append-only history of state and decision evolution, enabling auditing, reproducibility, and deep system observability.

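To make the column pruning and predicate pushdown mentioned under Columnar Streaming more concrete, here is a minimal, library-free Python sketch of the idea. It does not use Fluss or Apache Arrow APIs: the `scan` function, the `orders` batch, and its column names are hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, List, Sequence

# A columnar batch: column name -> list of values (all columns have equal length).
ColumnarBatch = Dict[str, List[object]]


def scan(
    batch: ColumnarBatch,
    columns: Sequence[str],
    predicate_column: str,
    predicate: Callable[[object], bool],
) -> ColumnarBatch:
    """Return only `columns`, restricted to rows where `predicate` holds.

    Hypothetical helper for illustration; not part of any Fluss API.
    """
    # Predicate pushdown: evaluate the filter on a single column first,
    # so rows that do not match are never materialized.
    keep = [i for i, value in enumerate(batch[predicate_column]) if predicate(value)]
    # Column pruning: read only the requested columns; the rest are never touched.
    return {name: [batch[name][i] for i in keep] for name in columns}


# Made-up sample data with four columns.
orders: ColumnarBatch = {
    "order_id": [1, 2, 3, 4],
    "amount": [10.0, 250.0, 75.5, 300.0],
    "customer": ["a", "b", "c", "d"],
    "note": ["", "gift", "", "rush"],
}

result = scan(
    orders,
    columns=["order_id", "amount"],
    predicate_column="amount",
    predicate=lambda v: v > 100,
)
print(result)  # {'order_id': [2, 4], 'amount': [250.0, 300.0]}
```

Because the data is laid out by column, the `customer` and `note` columns are never read for this query, which is the I/O saving the feature description refers to.
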
## Building

Prerequisites for building Apache Fluss:

- Unix-like environment (we use Linux, Mac OS X, Cygwin, WSL)
- Git
- Maven (we require version >= 3.8.6)
- Java 11

```bash
git clone https://github.com/apache/fluss.git
cd fluss
./mvnw clean package -DskipTests
```

Apache Fluss is now installed in `build-target`. The build command uses the Maven Wrapper (`mvnw`), which ensures the correct Maven version is used.

## Contributing

Apache Fluss (Incubating) is open-source, and we’d love your help to keep it growing! Join the [discussions](https://github.com/apache/fluss/discussions), open [issues](https://github.com/apache/fluss/issues) to report bugs or request features, contribute code and documentation, or help us improve the project in any way. All contributions are welcome!

## License

The Apache Fluss (Incubating) project is licensed under the [Apache License 2.0](https://github.com/apache/fluss/blob/main/LICENSE).