# Dataflow System for Vision Transformer Tasks

This project implements a dataflow system optimized for Vision Transformer (ViT) tasks in CPU environments.

## Overview

The system provides comprehensive support for:

- Vision Transformer model architectures
- Data preprocessing and augmentation
- Training and evaluation pipelines
- Performance measurement and analysis
- Model weight conversion between frameworks

## Key Components

### Core Architecture

- **ViT Implementation**: Complete Vision Transformer implementations in both PyTorch (`dataflow/vit_model.py`) and TensorFlow/Keras (`tensorflow/vit_model.py`)
- **Model Variants**: Supports multiple ViT architectures, including the base, large, and huge variants with different patch sizes

### Data Handling

- Custom dataset implementations for classification and segmentation tasks
- Data splitting and preprocessing utilities
- Batch collation functions optimized for ViT tasks

### Training & Evaluation

- Complete training pipelines with configurable parameters
- Learning rate scheduling and optimization strategies
- Performance metrics tracking (GFLOPS, training time, accuracy)

### Utilities

- Model weight conversion tools between PyTorch and TensorFlow
- Performance benchmarking and analysis
- Prediction/inference capabilities

## Usage Guide

### Dataset Preparation

1. Download the flower classification dataset from:
   - Official source: https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz
   - Alternative Baidu Cloud link: https://pan.baidu.com/s/1QLCTA4sXnQAw_yvxPj9szg (extraction code: 58p0)
2. Pass the dataset path to the training script:

   ```bash
   --data-path /path/to/flower_photos
   ```

### Model Training

1. Download pretrained weights as specified in each model's documentation.
2. Pass the weights path to the training script:

   ```bash
   --weights /path/to/pretrained_weights.pth
   ```

3. Run training:

   ```bash
   python dataflow/train.py --data-path /path/to/dataset --weights /path/to/pretrained_weights.pth
   ```

### Prediction

1. Configure `predict.py` with:
   - the model architecture used during training
   - the trained model weights path
   - the input image path
2. Run prediction:

   ```bash
   python dataflow/predict.py
   ```

A hedged sketch of this flow appears in the appendix below.

## Performance Analysis

The system includes built-in performance measurement capabilities (see the timing sketch in the appendix below):

- Tracks GFLOPS during training
- Measures per-epoch execution time
- Calculates end-to-end training time
- Reports averaged performance metrics

## Implementation Details

### Model Architecture

The Vision Transformer implementation includes the following components (a patch-embedding sketch appears in the appendix below):

- Patch embedding layer
- Positional encoding
- Multi-head attention mechanisms
- MLP blocks
- Stochastic depth regularization

### Training Configuration

- Uses an SGD optimizer with a cosine learning-rate schedule (see the optimizer sketch in the appendix below)
- Includes data augmentation (random resized crop, horizontal flip)
- Implements proper weight initialization

## License

This project is licensed under Apache-2.0; see the LICENSE file in the repository root for the full text.
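
## Appendix: Illustrative Code Sketches

The sketches below illustrate the mechanisms described above. They are minimal, hedged examples, not excerpts from this repository: class names, file paths, and hyperparameter values are placeholders and should be matched to the actual code in `dataflow/`.

### Configuring prediction

A minimal sketch of the prediction flow from the Usage Guide. The stand-in model, the checkpoint path, and the image path are assumptions; in practice you would build the same ViT variant used for training (from `dataflow/vit_model.py`) and use preprocessing statistics that match training.

```python
import torch
from PIL import Image
from torchvision import transforms

# Evaluation-time preprocessing; normalization stats should match training.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

# Stand-in model so the sketch runs; in practice, build the trained ViT variant
# and load its checkpoint, e.g.:
#   model.load_state_dict(torch.load("/path/to/model_weights.pth", map_location="cpu"))
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 224 * 224, 5))
model.eval()

img = preprocess(Image.open("/path/to/test_image.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    probs = torch.softmax(model(img), dim=1)
print(f"predicted class: {probs.argmax(dim=1).item()}")
```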
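
### Measuring per-epoch time and GFLOPS

A sketch of how per-epoch timing and a throughput-style GFLOPS figure can be collected, in the spirit of the Performance Analysis section. The `flops_per_image` value is an assumed analytic estimate supplied by the caller, not a number measured by this repository, and `run_one_epoch` stands in for one pass over the training loader.

```python
import time

def timed_epochs(num_epochs, run_one_epoch, flops_per_image, images_per_epoch):
    """Time each epoch and report GFLOPS derived from an assumed FLOP count."""
    epoch_times = []
    for epoch in range(num_epochs):
        start = time.perf_counter()
        run_one_epoch(epoch)  # forward/backward over the whole dataset
        elapsed = time.perf_counter() - start
        epoch_times.append(elapsed)
        gflops = flops_per_image * images_per_epoch / elapsed / 1e9
        print(f"epoch {epoch}: {elapsed:.1f}s, {gflops:.1f} GFLOPS")
    total = sum(epoch_times)
    print(f"end-to-end: {total:.1f}s, avg epoch: {total / len(epoch_times):.1f}s")
```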
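
### Patch embedding

A minimal PyTorch sketch of the patch-embedding layer listed under Model Architecture. The class name, default sizes, and implementation choice (a strided convolution that performs patch extraction and linear projection in one step) are illustrative, not the identifiers used in `dataflow/vit_model.py`.

```python
import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    """Split an image into fixed-size patches and project each to an embedding."""
    def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=768):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2
        # kernel_size == stride == patch_size gives non-overlapping patches.
        self.proj = nn.Conv2d(in_chans, embed_dim,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, x):
        x = self.proj(x)                  # (B, embed_dim, H/P, W/P)
        x = x.flatten(2).transpose(1, 2)  # (B, num_patches, embed_dim)
        return x

tokens = PatchEmbed()(torch.randn(1, 3, 224, 224))
print(tokens.shape)  # torch.Size([1, 196, 768])
```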
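
### SGD with a cosine learning-rate schedule

A sketch of the training configuration described under Implementation Details: SGD with a cosine learning-rate schedule and the listed augmentations. The hyperparameter values and the tiny stand-in model are placeholders, not the ones hard-coded in `dataflow/train.py`.

```python
import math
import torch
import torch.optim as optim
from torchvision import transforms

# Augmentations named in the README: random resized crop and horizontal flip.
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

model = torch.nn.Linear(768, 5)  # stand-in for the ViT model
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9, weight_decay=5e-5)

epochs, lrf = 10, 0.01
# Cosine decay from the initial lr down to lr * lrf over all epochs.
lf = lambda x: ((1 + math.cos(x * math.pi / epochs)) / 2) * (1 - lrf) + lrf
scheduler = optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lf)

for epoch in range(epochs):
    # ... one pass over the training loader goes here ...
    scheduler.step()
    print(f"epoch {epoch}: lr = {optimizer.param_groups[0]['lr']:.6f}")
```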