# Diffusion-v-FlowMatching

**Repository Path**: Heconnor/Diffusion-v-FlowMatching

## Basic Information

- **Project Name**: Diffusion-v-FlowMatching
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-03-31
- **Last Updated**: 2025-03-31

## Categories & Tags

**Categories**: Uncategorized

**Tags**: DiffusionModel, FlowMatching, tutorial

## README

# Diffusion vs Flow Matching Comparison

This repository provides a visual comparison between two generative modeling approaches: Diffusion Models and Flow Matching. The implementation focuses on a simple 2D toy dataset to help understand and visualize the fundamental differences between these methods.

![Diffusion vs Flow Matching Animation](./comparison_animation.gif)

## Overview

### Diffusion Models

Diffusion models work by gradually adding noise to data and then learning to reverse this process. They:

- Start with pure noise
- Gradually denoise the data through multiple steps
- Use a neural network to predict and remove noise at each step
- Follow a predefined noise schedule

### Flow Matching

Flow matching directly learns a continuous transformation between noise and data distributions.
They:

- Learn velocity fields that transform noise to data
- Use a single neural network to predict velocities
- Transform data in a single continuous flow
- Don't require a noise schedule

In this implementation, we use a linear interpolation path with added noise:

```python
x_t = (1 - t) * z + t * x_0 + noise * torch.sqrt(t * (1 - t))
```

where:

- `z` is a sample from the noise distribution
- `x_0` is a sample of the target data
- `t` is the time parameter (0 to 1)
- `noise` is small Gaussian noise scaled by `sigma=0.1`
- The factor `torch.sqrt(t * (1 - t))` vanishes at `t = 0` and `t = 1`, so the path starts exactly at `z` and ends exactly at `x_0`, with the added noise largest at `t = 0.5`

This path choice provides a simple yet effective way to transform between distributions, with the noise term helping to stabilize training and prevent mode collapse.

## Installation

Create a conda environment and install dependencies:

```bash
conda create -n difffm python=3.10
conda activate difffm
pip install torch numpy matplotlib tqdm ipython
```

## Usage

Run the main script with one of the following subcommands:

### Run Diffusion Demo

```bash
python diff_vs_flowmatch.py run-diffusion-demo --n-epochs 200 --batch-size 128 --n-samples 1000
```

### Run Flow Matching Demo

```bash
python diff_vs_flowmatch.py run-flow-matching-demo --n-epochs 200 --batch-size 128 --n-samples 1000
```

### Run Comparison Demo

```bash
python diff_vs_flowmatch.py run-comparison-demo --n-epochs 200 --batch-size 128 --n-samples 1000
```

This will:

1. Train both models on a toy dataset (3 Gaussian clusters)
2. Generate samples from both models
3. Create various visualizations to compare the approaches

## Visualizations

The code generates several types of visualizations:

1. **Sample Comparison**
   - Real vs Generated samples for both methods
   - Shows final quality of generated samples
2. **Trajectory Visualization**
   - Shows how individual points move during generation
   - Helps understand the path from noise to data
3. **Intermediate Steps**
   - Displays snapshots of the generation process
   - Shows how the distribution evolves over time
4.
   **Vector Fields**
   - Visualizes the learned vector fields at different timesteps
   - Shows how each model guides points toward the data distribution
5. **Animations**
   - Dynamic visualization of the generation process
   - Creates GIFs showing the full transformation

## Implementation Details

The repository contains several key components:

- `SimpleMLP`: Neural network architecture shared by both models
- `DiffusionModel`: Implementation of the diffusion process
- `FlowMatching`: Implementation of the flow matching approach
- Various visualization functions for analysis and comparison

## Key Differences

1. **Training Objective**
   - Diffusion: Predicts noise at each step
   - Flow Matching: Predicts velocity vectors
2. **Generation Process**
   - Diffusion: Discrete steps from noise to data
   - Flow Matching: Continuous flow from noise to data
3. **Sampling**
   - Diffusion: Uses multiple denoising steps
   - Flow Matching: Uses ODE solvers (Euler or Heun's method)

## Results

The visualizations help understand how each method approaches the generation task:

- Diffusion models gradually denoise the data through many small steps
- Flow matching creates a smooth transformation from noise to data
- Both methods can successfully learn the target distribution
- Each method has its own characteristic generation trajectory

## References

- [Denoising Diffusion Probabilistic Models](https://arxiv.org/abs/2006.11239)
- [Flow Matching for Generative Modeling](https://arxiv.org/abs/2210.02747)
- [Blog Post on Diffusion vs Flow Matching](https://harshm121.medium.com/flow-matching-vs-diffusion-79578a16c510)
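
The flow matching path from the Overview and the Euler and Heun samplers listed under Key Differences can be illustrated with a small standalone sketch. It is not the repository's code: the names `noisy_path`, `euler_sample`, and `heun_sample` are hypothetical, it uses plain Python lists instead of PyTorch tensors so it runs without extra dependencies, and it assumes that `noise` in the path formula means `sigma` times a standard Gaussian sample. A fixed straight-line velocity stands in for the trained network.

```python
import math
import random
from typing import Callable, List, Optional, Sequence

# Hypothetical type for a velocity field v(x, t); in the repository this role
# would be played by a trained network, which is not reproduced here.
Velocity = Callable[[Sequence[float], float], Sequence[float]]


def noisy_path(z: Sequence[float], x0: Sequence[float], t: float,
               sigma: float = 0.1,
               rng: Optional[random.Random] = None) -> List[float]:
    """Sample x_t = (1 - t) * z + t * x0 + sigma * eps * sqrt(t * (1 - t))."""
    rng = rng or random.Random()
    # The scale is zero at t = 0 and t = 1, so the endpoints are exact.
    scale = sigma * math.sqrt(max(0.0, t * (1.0 - t)))
    return [(1.0 - t) * zi + t * xi + scale * rng.gauss(0.0, 1.0)
            for zi, xi in zip(z, x0)]


def euler_sample(velocity: Velocity, x: Sequence[float],
                 n_steps: int = 100) -> List[float]:
    """Integrate dx/dt = velocity(x, t) from t = 0 to t = 1 with Euler steps."""
    dt = 1.0 / n_steps
    x = list(x)
    for k in range(n_steps):
        v = velocity(x, k * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def heun_sample(velocity: Velocity, x: Sequence[float],
                n_steps: int = 100) -> List[float]:
    """Same integration with Heun's method (Euler predictor, averaged slope)."""
    dt = 1.0 / n_steps
    x = list(x)
    for k in range(n_steps):
        t = k * dt
        v1 = velocity(x, t)
        x_pred = [xi + dt * vi for xi, vi in zip(x, v1)]
        v2 = velocity(x_pred, t + dt)
        x = [xi + 0.5 * dt * (a + b) for xi, a, b in zip(x, v1, v2)]
    return x


# Usage: a constant velocity x0 - z moves a noise point z straight to x0.
z, x0 = (1.5, -0.5), (-2.0, 3.0)


def straight_line(x: Sequence[float], t: float) -> List[float]:
    return [b - a for a, b in zip(z, x0)]


print(noisy_path(z, x0, 0.5, rng=random.Random(0)))
print(euler_sample(straight_line, z, n_steps=10))
print(heun_sample(straight_line, z, n_steps=10))
```

With a learned, time-dependent velocity field the two samplers would generally differ: Heun's method costs two velocity evaluations per step but is second-order accurate, which is the usual reason to prefer it when few steps are used.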