Create Breathtaking Mini Worlds with Capsule Micro LoRA and Stable Diffusion

ComfyUI.org
2025-05-07 01:14:58

Generate Stunning Miniature Cityscapes with AI! Learn how to combine Stable Diffusion with Capsule Micro World LoRA for breathtaking 2x upscaled city skylines. Discover the workflow, core models, and key nodes behind this innovative technique.

Models: LoRA, SD
Key Nodes: Upscaler
VRAM: Low VRAM (≤8GB)
Reading Time: 3 min


Content type: Workflow

Primary intent: Download

Required Models

  • LoRA
  • SD

Required Nodes

  • Upscaler

Setup Notes

  • Install the required models before opening the workflow template.
  • Recommended hardware: Low VRAM (≤8GB).

1. Workflow Overview


This workflow generates miniature cityscapes (for example, the Qingdao skyline inside a pill capsule) by combining Stable Diffusion with the "Capsule Micro World" LoRA. It covers text encoding, latent-space generation, and 2x upscaling for high-resolution output.


2. Core Models

  • Stable Diffusion (UNETLoader): Base model 基础算法_F.1 ("Base Algorithm_F.1"), loaded at FP8 precision.

  • Capsule Micro World LoRA: Enhances miniature details (weight=0.8).

  • T5-XXL & CLIP-L: Dual text encoders for prompt understanding.

  • 2xNomosUni Upscaler: Post-processing 2x super-resolution.
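
To make the weight=0.8 setting concrete, here is a minimal sketch of how a LoRA strength scales the low-rank update that gets added to a base weight matrix, W' = W + s·(B·A). The toy matrices and the helper name `apply_lora` are illustrative assumptions, not values from the actual Capsule Micro World LoRA, and real loaders also apply an alpha/rank scaling factor that is left out here.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(base, lora_b, lora_a, strength=0.8):
    """Return base + strength * (lora_b @ lora_a), element by element."""
    delta = matmul(lora_b, lora_a)
    return [
        [w + strength * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base, delta)
    ]


# Toy 2x2 base weight with a rank-1 update (hypothetical numbers).
base = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]   # shape 2x1
lora_a = [[0.5, 0.5]]     # shape 1x2

patched = apply_lora(base, lora_b, lora_a, strength=0.8)
unchanged = apply_lora(base, lora_b, lora_a, strength=0.0)
```

A strength of 0 leaves the base model untouched, while values closer to 1 push the output further toward the LoRA's style, which is why lowering the weight is a common first step when the capsule effect overwhelms the prompt.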


3. Key Nodes

| Node Name | Function | Installation | Dependencies |
|---|---|---|---|
| DualCLIPLoader | Loads T5+CLIP text encoders | Built-in | Requires t5xxl_fp8_e4m3fn |
| Lora Loader Stack | Dynamic LoRA loading | Install rgthree-comfy | Needs LoRA files |
| SamplerCustomAdvanced | Advanced noise-controlled sampler | Built-in | None |
| ImageUpscaleWithModel | 2x image upscaling | Built-in | Requires 2xNomosUni model |


4. Workflow Structure

  • Group 1: Text Input

    • CLIPTextEncode: Processes prompts (e.g., "Shanghai micro-city in a capsule").

  • Group 2: Image Generation

    • EmptyLatentImage: Sets resolution (768x1024).

    • SamplerCustomAdvanced: Generates latent image with LoRA.

  • Group 3: Post-Processing

    • VAEDecode: Decodes latent to image.

    • ImageUpscaleWithModel: 2x upscaling.
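
The three groups above can be sketched as a ComfyUI API-format graph, written here as a Python dict. This is a rough illustration rather than the article's actual workflow file: every file name, the sampler settings (euler, simple, 20 steps), the DualCLIPLoader `type`, and the VAELoader node are assumptions. The built-in LoraLoader is swapped in for rgthree's Lora Loader Stack to keep the sketch free of custom nodes. Input names follow the built-in nodes as commonly exported; compare them against an API-format export from your own install before relying on them.

```python
def build_workflow(prompt, width=768, height=1024, seed=0, lora_strength=0.8):
    """Return an API-format graph: {node_id: {"class_type": ..., "inputs": ...}}.

    A link to another node's output is written as [node_id, output_index].
    Every file name and sampler setting below is a placeholder assumption.
    """
    return {
        # Loaders
        "1": {"class_type": "UNETLoader", "inputs": {
            "unet_name": "base_model_fp8.safetensors", "weight_dtype": "fp8_e4m3fn"}},
        "2": {"class_type": "DualCLIPLoader", "inputs": {
            "clip_name1": "t5xxl_fp8_e4m3fn.safetensors",
            "clip_name2": "clip_l.safetensors",
            "type": "flux"}},  # assumed; choose the type that matches the base model
        # Built-in LoraLoader stands in for rgthree's Lora Loader Stack.
        "3": {"class_type": "LoraLoader", "inputs": {
            "model": ["1", 0], "clip": ["2", 0],
            "lora_name": "capsule_micro_world.safetensors",
            "strength_model": lora_strength, "strength_clip": lora_strength}},
        "4": {"class_type": "VAELoader", "inputs": {"vae_name": "vae.safetensors"}},
        # Group 1: text input
        "5": {"class_type": "CLIPTextEncode", "inputs": {
            "text": prompt, "clip": ["3", 1]}},
        # Group 2: image generation
        "6": {"class_type": "EmptyLatentImage", "inputs": {
            "width": width, "height": height, "batch_size": 1}},
        "7": {"class_type": "RandomNoise", "inputs": {"noise_seed": seed}},
        "8": {"class_type": "BasicGuider", "inputs": {
            "model": ["3", 0], "conditioning": ["5", 0]}},
        "9": {"class_type": "KSamplerSelect", "inputs": {"sampler_name": "euler"}},
        "10": {"class_type": "BasicScheduler", "inputs": {
            "model": ["3", 0], "scheduler": "simple", "steps": 20, "denoise": 1.0}},
        "11": {"class_type": "SamplerCustomAdvanced", "inputs": {
            "noise": ["7", 0], "guider": ["8", 0], "sampler": ["9", 0],
            "sigmas": ["10", 0], "latent_image": ["6", 0]}},
        # Group 3: post-processing
        "12": {"class_type": "VAEDecode", "inputs": {
            "samples": ["11", 0], "vae": ["4", 0]}},
        "13": {"class_type": "UpscaleModelLoader", "inputs": {
            "model_name": "2xNomosUni.pth"}},
        "14": {"class_type": "ImageUpscaleWithModel", "inputs": {
            "upscale_model": ["13", 0], "image": ["12", 0]}},
        "15": {"class_type": "SaveImage", "inputs": {
            "images": ["14", 0], "filename_prefix": "capsule_city"}},
    }


def dangling_links(graph):
    """List (node_id, input_name) pairs whose link targets a missing node."""
    bad = []
    for node_id, node in graph.items():
        for name, value in node["inputs"].items():
            is_link = isinstance(value, list) and len(value) == 2
            if is_link and value[0] not in graph:
                bad.append((node_id, name))
    return bad


graph = build_workflow("Shanghai micro-city in a capsule", seed=42)
```

The `dangling_links` helper is a hypothetical sanity check: it only confirms that every link points at a node that exists in the graph, not that the node types or file names are valid on a given machine.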


5. Inputs & Outputs

  • Input Parameters:

    • Required: Prompts, resolution (768x1024).

    • Optional: Seed (random by default), LoRA weight (default=0.8).

  • Output: 2x-upscaled PNG (1536x2048 for the default 768x1024 input), saved to the wangyi AI Studio folder.
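
As a small sketch of how these inputs relate to the output size, the parameters can be collected in one place with the defaults listed above. The class name `WorkflowParams` and its methods are hypothetical, and the latent-size calculation assumes the usual 8x spatial downscale of Stable Diffusion-style VAEs.

```python
import random
from dataclasses import dataclass, field


@dataclass
class WorkflowParams:
    """Inputs for the workflow, with the defaults given in the list above."""
    prompt: str                      # required
    width: int = 768                 # required in the workflow; default shown
    height: int = 1024
    lora_weight: float = 0.8         # optional
    upscale_factor: int = 2          # fixed by the 2x upscaler model
    # Optional: a fresh random seed is drawn when none is supplied.
    seed: int = field(default_factory=lambda: random.randrange(2**32))

    def latent_size(self):
        """Latent width and height, assuming an 8x VAE downscale."""
        return (self.width // 8, self.height // 8)

    def output_size(self):
        """Final image width and height after upscaling."""
        return (self.width * self.upscale_factor,
                self.height * self.upscale_factor)


params = WorkflowParams(prompt="Shanghai micro-city in a capsule")
latent = params.latent_size()
final = params.output_size()
```

Fixing the seed (for example `WorkflowParams(prompt=..., seed=42)`) is the usual way to reproduce a result while adjusting only the LoRA weight or the prompt.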


6. Notes

  1. Plugin: Install rgthree-comfy via ComfyUI Manager for LoRA stacking.

  2. Model Paths: Place the LoRA (胶囊微缩世界, "Capsule Micro World") in ComfyUI/models/loras and the upscaler (2xNomosUni) in ComfyUI/models/upscale_models.

  3. Performance: ≥8GB VRAM recommended; use the fp8_e4m3fn model variants to reduce memory usage.

  4. Debug: If prompts fail to encode, check that the CLIP models are loaded correctly in DualCLIPLoader.
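
Before opening the workflow, a quick check that the model files are where ComfyUI looks for them can save a failed run. The sketch below assumes the standard `models/loras` and `models/upscale_models` subfolders of a ComfyUI install. The file names and the helper `find_missing_models` are placeholders, so substitute the names of the files you actually downloaded.

```python
from pathlib import Path

# (subfolder under ComfyUI/models, file name): both file names are hypothetical.
REQUIRED_MODELS = [
    ("loras", "capsule_micro_world.safetensors"),
    ("upscale_models", "2xNomosUni.pth"),
]


def find_missing_models(comfy_root, required=REQUIRED_MODELS):
    """Return the expected paths that do not exist as files."""
    root = Path(comfy_root)
    missing = []
    for subfolder, filename in required:
        path = root / "models" / subfolder / filename
        if not path.is_file():
            missing.append(path)
    return missing


if __name__ == "__main__":
    # Adjust the root to wherever ComfyUI is installed on your machine.
    for path in find_missing_models("ComfyUI"):
        print(f"Missing model file: {path}")
```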

FAQ