Unleash Artistic Potential: Leveraging Flux.1 for Hand-Drawn Watercolor Images

ComfyUI.org
2025-03-12 08:01:43

Unlock artistic image generation with Flux.1! Discover how to transform photos into stunning hand-drawn watercolor-style images with depth control and AI-powered prompts. Learn more and create your own masterpieces!

Required Models

  • Flux
  • ControlNet
  • Lora

Required Nodes

  • ControlNet
  • Upscaler

Setup Notes

  • Install the required models before opening the workflow template.
  • Recommended hardware: Medium VRAM (12–16GB).

Workflow Overview

This workflow uses the Flux.1 model with depth-based ControlNet guidance to turn an input image into a high-quality hand-drawn watercolor-style image. Joy2 captioning derives a descriptive prompt from the same input. The specific goals are:

  • Image Processing and Generation: Generate a 1024x1024 artistic-style image based on the input image (20230304185125_b966e.jpg).

  • Depth Control: Use the DepthAnythingV2 model to extract depth information and guide generation via ControlNet.

  • Prompt Optimization: Use the Joy_caption_two node to reverse-engineer detailed descriptive text from the input image, then combine it with predefined prompts for the final generation.

This workflow is suitable for art creation, image stylization, or generating hand-drawn effects from photos.

Core Models

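  1. Flux.1 (基础算法_F.1, "Base Algorithm F.1")

    • Function: A text-to-image model supporting high-resolution generation, well suited to artistic-style images.

    • Source: Download from Civitai or official repositories, e.g., 基础算法_F.1_fp8_e4m3fn.safetensors. Because this workflow loads the file with UNETLoader, place it in ComfyUI/models/unet/ (ComfyUI/models/diffusion_models/ in recent ComfyUI versions) rather than ComfyUI/models/checkpoints/.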

  2. DepthAnythingV2 (depth_anything_v2_vitl_fp32.safetensors)

    • Function: Extracts depth information from images for ControlNet guidance, enhancing spatial structure.

    • Source: Automatically downloaded on first run by the DownloadAndLoadDepthAnythingV2Model node and stored under ComfyUI/models/ (typically in a depthanything subfolder).

  3. Lora Model (姑苏_F.1-手绘水彩风萌宠_V1.0, "Gusu F.1 hand-drawn watercolor cute pets V1.0")

    • Function: Adapts the Flux.1 model to generate hand-drawn watercolor-style pet images.

    • Source: Download from Civitai or custom Lora repositories, place in ComfyUI/models/loras/.

  4. Upscale Model (4x-UltraSharp)

    • Function: Upscales generated images to enhance details.

    • Source: Download through the ComfyUI Manager model list, place in ComfyUI/models/upscale_models/.
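Before opening the workflow, it can help to confirm that the manually downloaded files are where the loader nodes will look. The following is a minimal sketch, not part of the workflow itself: `EXPECTED_MODELS` and `missing_models` are hypothetical names, the folder names are the ones listed above, and the file extensions are assumptions that may not match your downloads.

```python
from pathlib import Path

# Assumed layout: folder names come from the model list above; the file
# extensions (.safetensors, .pth) are guesses and may differ for your files.
EXPECTED_MODELS = {
    "loras": "姑苏_F.1-手绘水彩风萌宠_V1.0.safetensors",
    "upscale_models": "4x-UltraSharp.pth",
}


def missing_models(comfy_root: Path, expected: dict[str, str]) -> list[str]:
    """Return the relative paths of expected model files that are absent."""
    missing = []
    for folder, filename in expected.items():
        relative = Path("models") / folder / filename
        if not (comfy_root / relative).is_file():
            missing.append(relative.as_posix())
    return missing


if __name__ == "__main__":
    # "ComfyUI" is a placeholder for your actual install directory.
    for path in missing_models(Path("ComfyUI"), EXPECTED_MODELS):
        print(f"missing: {path}")
```

Extend the dictionary with the other model files once you know which folders your installed plugins expect.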

Component Explanation

Below are the key nodes in the workflow, including their purpose, function, and installation method, along with dependencies:

  1. Joy_caption_two_load

    • Purpose: Loads the Joy2 pipeline for image captioning.

    • Function: Outputs a JoyTwoPipeline object backed by the Llama 3.1 model.

    • Installation: Requires the JoyCaption Two plugin; install via ComfyUI Manager (search “JoyCaption”) or GitHub (https://github.com/TTPlanetPig/Comfyui_JC2).

    • Dependencies: Requires the unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit model; download it and place it in ComfyUI/models/joy_caption/.

  2. Joy_caption_two

    • Purpose: Generates descriptive text from input images.

    • Function: Outputs a detailed string describing the image content. This workflow uses Descriptive mode with the caption length set to 150.

    • Installation: Shares plugin with Joy_caption_two_load.

    • Dependencies: Requires JoyTwoPipeline.

  3. ttN concat

    • Purpose: Concatenates multiple text strings.

    • Function: Merges predefined text (e.g., “Hand-drawn watercolor illustration”) with Joy2-generated descriptions.

    • Installation: Requires the tinyterraNodes (ttN) plugin; install via ComfyUI Manager (search “tinyterra”) or GitHub (https://github.com/TinyTerra/ComfyUI_tinyterraNodes).

  4. ShowText|pysssss

    • Purpose: Displays and passes text content.

    • Function: Shows Joy2-generated descriptions or merged text.

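    • Installation: Requires the ComfyUI-Custom-Scripts plugin by pythongosssss; install via ComfyUI Manager (search “Custom-Scripts”) or GitHub (https://github.com/pythongosssss/ComfyUI-Custom-Scripts).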

  5. LoadFluxControlNet

    • Purpose: Loads a Flux-compatible ControlNet model.

    • Function: Outputs a FluxControlNet object for depth control.

    • Installation: Requires the XLabs plugin; install via ComfyUI Manager (search “XLabs”) or GitHub (https://github.com/XLabs-AI/x-flux-comfyui).

    • Dependencies: Requires the XLabs-flux-depth-controlnet_v3 file; download it and place it in ComfyUI/models/xlabs/controlnets/, the folder the XLabs loader reads from.

  6. ApplyFluxControlNet

    • Purpose: Applies ControlNet depth control.

    • Function: Combines depth maps to generate conditioning, enhancing structure.

    • Installation: Shares plugin with LoadFluxControlNet.

    • Dependencies: Requires depth map input.

  7. DownloadAndLoadDepthAnythingV2Model

    • Purpose: Downloads and loads the DepthAnythingV2 model.

    • Function: Automatically retrieves the depth model for use.

    • Installation: Requires the DepthAnythingV2 plugin; install via ComfyUI Manager (search “DepthAnything”) or GitHub (https://github.com/kijai/ComfyUI-DepthAnythingV2).

  8. DepthAnything_V2

    • Purpose: Generates depth maps from input images.

    • Function: Outputs depth images for ControlNet use.

    • Installation: Shares plugin with DownloadAndLoadDepthAnythingV2Model.

    • Dependencies: Requires depth_anything_v2_vitl_fp32.safetensors.

  9. ImageResize+

    • Purpose: Resizes input images.

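    • Function: Resizes the image to fit within 1024x1024 while keeping its aspect ratio.

    • Installation: Requires the ComfyUI_essentials plugin; install via ComfyUI Manager (search “essentials”) or GitHub (https://github.com/cubiq/ComfyUI_essentials).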

  10. DualCLIPLoader

    • Purpose: Loads CLIP models.

    • Function: Outputs CLIP objects for text encoding.

    • Installation: Built into ComfyUI.

    • Dependencies: Requires clip_l and t5xxl_fp16 files, place in ComfyUI/models/clip/.

  11. UNETLoader

    • Purpose: Loads the Flux.1 UNET model.

    • Function: Outputs a model object to drive generation.

    • Installation: Built into ComfyUI.

    • Dependencies: Requires 基础算法_F.1_fp8_e4m3fn file.

  12. LoraLoader

    • Purpose: Loads a Lora model.

    • Function: Applies the Lora weights to the loaded model to steer it toward the hand-drawn watercolor style.

    • Installation: Built into ComfyUI.

    • Dependencies: Requires 姑苏_F.1-手绘水彩风萌宠_V1.0 file.

  13. EmptyLatentImage

    • Purpose: Creates an initial latent image.

    • Function: Provides a 1024x1024 latent space for generation.

    • Installation: Built into ComfyUI.

  14. XlabsSampler

    • Purpose: Performs sampling for generation.

    • Function: Combines model, conditioning, and ControlNet to generate latent images.

    • Installation: Requires XLabs plugin.

  15. VAEDecode

    • Purpose: Decodes latent images into pixel images.

    • Function: Outputs the generated image.

    • Installation: Built into ComfyUI.

    • Dependencies: Requires the ae.sft VAE file, normally loaded from ComfyUI/models/vae/.

  16. UpscaleModelLoader

    • Purpose: Loads an upscale model.

    • Function: Outputs an upscale model object.

    • Installation: Built into ComfyUI.

  17. ImageUpscaleWithModel

    • Purpose: Upscales the generated image.

    • Function: Increases the 1024x1024 image to a higher resolution.

    • Installation: Built into ComfyUI.

  18. SaveImage

    • Purpose: Saves the generated image.

    • Function: Writes the image to ComfyUI's output folder using the configured filename prefix.

    • Installation: Built into ComfyUI.

  19. Image Comparer (rgthree)

    • Purpose: Compares original and generated images.

    • Function: Offers a slide comparison mode to show input-output differences.

    • Installation: Requires rgthree plugin, install via ComfyUI Manager (search “rgthree”) or GitHub (https://github.com/rgthree/rgthree-comfy).
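The proportional resize performed before depth extraction can be illustrated with a little arithmetic. This is a sketch of the usual "fit inside a box" calculation, not the ImageResize+ source code; `fit_within` is a hypothetical helper, and the node's own rounding may differ by a pixel.

```python
def fit_within(width: int, height: int, max_width: int, max_height: int) -> tuple[int, int]:
    """Scale (width, height) to fit inside the box while keeping aspect ratio."""
    # The smaller of the two ratios is the largest scale that fits both sides.
    scale = min(max_width / width, max_height / height)
    return round(width * scale), round(height * scale)


if __name__ == "__main__":
    # 979x923 is the resolution of the article's example input image.
    print(fit_within(979, 923, 1024, 1024))
```

For the 979x923 example input, this calculation gives 1024x965: the longer side reaches the 1024 limit and the shorter side scales by the same factor.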

Workflow Structure

  1. Joy2 Reverse Prompt Group

    • Role: Generates descriptive text from the input image to optimize prompts.

    • Input Parameters: Input image (20230304185125_b966e.jpg), mode (Descriptive), length (150).

    • Output: Detailed descriptive text (e.g., a paragraph describing a panda).

  2. Depth Control Group

    • Role: Extracts depth information and applies ControlNet guidance.

    • Input Parameters: Input image, depth model (depth_anything_v2_vitl_fp32.safetensors), ControlNet weight (0.8).

    • Output: Depth map and ControlNet conditioning.

  3. Image Generation Group

    • Role: Executes image generation and post-processing.

    • Input Parameters: Latent image (1024x1024), positive prompt (merged text), negative prompt (“Worst quality, blurry, wrong, ugly”), Lora weight (1.2), guidance scale (3.5), sampling steps (20).

    • Output: Generated image (initial 1024x1024, upscaled).
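The three groups feed into one another, so the nodes run in dependency order. The sketch below models that order with Python's standard `graphlib` module. The edges are inferred from the node descriptions in this article rather than read from the workflow file, the text-encoding step between the prompt merge and the sampler is left out, and `DEPENDENCIES` and `execution_order` are hypothetical names.

```python
from graphlib import TopologicalSorter

# Each node maps to the nodes it depends on. These edges are an assumption
# based on the article's descriptions, not an export of the real workflow.
DEPENDENCIES = {
    "Joy_caption_two": {"Joy_caption_two_load"},
    "ttN concat": {"Joy_caption_two"},
    "DepthAnything_V2": {"DownloadAndLoadDepthAnythingV2Model"},
    "ApplyFluxControlNet": {"LoadFluxControlNet", "DepthAnything_V2"},
    "LoraLoader": {"UNETLoader", "DualCLIPLoader"},
    "XlabsSampler": {"LoraLoader", "ttN concat", "ApplyFluxControlNet", "EmptyLatentImage"},
    "VAEDecode": {"XlabsSampler"},
    "ImageUpscaleWithModel": {"VAEDecode", "UpscaleModelLoader"},
    "SaveImage": {"ImageUpscaleWithModel"},
}


def execution_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return one valid order in which every node follows its inputs."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for step, node in enumerate(execution_order(DEPENDENCIES), start=1):
        print(step, node)
```

Several orders are valid; what matters is that the captioning and depth branches both finish before XlabsSampler runs, and that SaveImage comes last.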

Inputs and Outputs

  • Expected Inputs:

    • Image: 20230304185125_b966e.jpg (initial resolution 979x923).

    • Resolution: 1024x1024.

    • Seed: 722511220491392 (randomizable).

    • Prompt: Dynamically generated (including “Hand-drawn watercolor illustration”).

    • Negative Prompt: “Worst quality, blurry, wrong, ugly”.

    • Lora Weight: 1.2.

    • Guidance Scale: 3.5.

    • Sampling Steps: 20.

  • Final Output:

    • High-quality artistic-style image (PNG format, upscaled beyond 1024x1024).

    • Before/after comparison view (displayed interactively in the Image Comparer node).
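The input values above can be collected in one place when experimenting with variations. The following sketch is only an illustration: `GenerationSettings` is a hypothetical container, not a ComfyUI class, and the upscale factor of 4 assumes that 4x-UltraSharp is a 4x model, as its name suggests.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GenerationSettings:
    """Defaults mirror the values listed in this article."""
    width: int = 1024
    height: int = 1024
    seed: int = 722511220491392
    negative_prompt: str = "Worst quality, blurry, wrong, ugly"
    lora_weight: float = 1.2
    guidance_scale: float = 3.5
    steps: int = 20
    upscale_factor: int = 4  # assumption: 4x-UltraSharp upscales by 4x

    def upscaled_size(self) -> tuple[int, int]:
        """Resolution after the upscale model is applied."""
        return self.width * self.upscale_factor, self.height * self.upscale_factor

    def with_random_seed(self, rng: random.Random) -> "GenerationSettings":
        """Return a copy with a new seed; the 2**63 bound is arbitrary here."""
        return replace(self, seed=rng.randrange(2**63))


if __name__ == "__main__":
    settings = GenerationSettings()
    print(settings.upscaled_size())
    print(settings.with_random_seed(random.Random()).seed)
```

Keeping the seed fixed reproduces a result; swapping it while leaving the other fields alone is the quickest way to explore alternatives.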

Notes and Tips

  1. Resource Requirements: Flux.1 and Lora generation require 12GB+ VRAM; an NVIDIA GPU is recommended.

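  2. Model Files: Ensure 基础算法_F.1_fp8_e4m3fn, ae.sft, and the Lora file are in the folders their loader nodes read from; otherwise the loaders will not list them and the workflow will fail to run.

  3. Plugin Installation: Install the JoyCaption, XLabs, DepthAnything, ttN, and rgthree plugins, plus ComfyUI-Custom-Scripts (for ShowText|pysssss) and ComfyUI_essentials (for ImageResize+); otherwise the corresponding nodes will be unavailable.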

  4. Performance Optimization: Reduce sampling steps (20→10) or resolution (1024→512) for faster generation.

  5. Compatibility: ComfyUI version should be 0.3.18 or higher, with plugins compatible with Flux.1.

  6. Input Image: Ensure 20230304185125_b966e.jpg exists in the specified path.
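Two of the notes above lend themselves to quick checks. The sketch below compares dotted version numbers numerically (plain string comparison would rank "0.3.9" above "0.3.18") and estimates how much work the suggested speed-ups save. The cost model, work proportional to steps times pixel count, is a rough assumption rather than a measured property of Flux.1, and all function names are hypothetical.

```python
def version_tuple(version: str) -> tuple[int, ...]:
    """Parse a dotted version such as '0.3.18' into (0, 3, 18)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def meets_minimum(installed: str, minimum: str = "0.3.18") -> bool:
    """True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def relative_work(steps: int, side: int, base_steps: int = 20, base_side: int = 1024) -> float:
    """Rough proxy: work relative to the baseline, assuming steps x pixels."""
    return (steps / base_steps) * (side * side) / (base_side * base_side)


if __name__ == "__main__":
    print(meets_minimum("0.3.9"))    # numeric comparison, not alphabetical
    print(relative_work(10, 1024))   # halving the steps
    print(relative_work(20, 512))    # halving the resolution
```

Under this simple model, halving the steps halves the work, while halving the resolution cuts it to a quarter, so lowering resolution is the stronger lever if speed matters more than detail.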

Example Illustration

Suppose the input image is a panda photo; the workflow will:

  • Reverse-engineer a description: “This photograph captures a large, adorable panda...”.

  • Merge prompt: “Hand-drawn watercolor illustration, This photograph...”.

  • Generate a hand-drawn watercolor-style panda image, upscaled and saved with the ComfyUI filename prefix (e.g., ComfyUI_00001_.png).
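The prompt merge in the second step amounts to joining text fragments with a delimiter. Here is a minimal sketch of that idea; `merge_prompt` is a hypothetical helper rather than the ttN concat implementation, and the comma-space delimiter is assumed from the merged example above.

```python
def merge_prompt(*parts: str, delimiter: str = ", ") -> str:
    """Join non-empty prompt fragments with a delimiter."""
    cleaned = [part.strip() for part in parts if part and part.strip()]
    return delimiter.join(cleaned)


if __name__ == "__main__":
    style = "Hand-drawn watercolor illustration"
    caption = "This photograph captures a large, adorable panda..."
    print(merge_prompt(style, caption))
```

Putting the style phrase first keeps it at the start of the prompt regardless of how long the generated caption turns out to be.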

FAQ