About
Diffusion models have emerged as a popular choice for image and video synthesis due to their ability to model complex, high-dimensional data through an iterative denoising process.
We are systematically recreating Stable Diffusion from the ground up, and running targeted ablation experiments to identify which architectural components, training choices, and conditioning mechanisms most strongly impact performance, stability, and generation quality.
Building on this, we will fine-tune our model on axonometric data to test whether controlled training can encode stronger spatial reasoning, more consistent perspective, and improved geometric coherence in generated scenes. Coupled with the ablation experiments, we aim to gain a deep understanding of diffusion architectures and provide insights into building state-of-the-art models.
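The iterative denoising process at the heart of these models can be sketched in a few lines. The snippet below is an illustrative DDPM-style reverse loop, not our actual implementation; the function and parameter names (`reverse_denoise`, `eps_model`, `alphas`) are placeholders chosen for this example.

```python
import numpy as np

def reverse_denoise(x_T, eps_model, alphas, T, rng=None):
    """Minimal DDPM-style reverse process: start from pure noise x_T and
    iteratively denoise it using a noise-prediction network eps_model.
    This is a sketch for intuition, not a faithful Stable Diffusion loop."""
    rng = rng or np.random.default_rng(0)
    alpha_bars = np.cumprod(alphas)          # cumulative noise schedule
    x = x_T
    for t in range(T - 1, -1, -1):
        eps = eps_model(x, t)                # model's noise estimate at step t
        a_t, ab_t = alphas[t], alpha_bars[t]
        # mean of the reverse transition p(x_{t-1} | x_t)
        x = (x - (1 - a_t) / np.sqrt(1 - ab_t) * eps) / np.sqrt(a_t)
        if t > 0:                            # inject fresh noise except at the last step
            x += np.sqrt(1 - a_t) * rng.standard_normal(x.shape)
    return x
```

In practice the network is a U-Net conditioned on the timestep (and, for Stable Diffusion, on text embeddings), and the loop runs in a learned latent space rather than pixel space, but the structure of the sampler is the same.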
Key Insights
Our three most significant results from implementation and ablation, updated as experiments complete.
People