I built a native Comfy Cloud mobile app on nothing but the public API

CN
2026-06-23 02:04:29

This is a Comfy Vibes project: an intuitively built app that we release to show, rather than just describe, what is possible on our platform. Vibes projects are neither polished products nor data-collection tools. They are quick builds, rough edges included.

The short version first: Comfy Cloud generation now fits in your palm. Inputs come from your photo gallery, outputs land in Photos, and you pick models by tapping instead of typing.

Comfy Go is our SwiftUI-native mobile version of Comfy Cloud. It offers four generation workflows tailored to handheld devices and leaves out the full node editor. It launched on TestFlight on June 12 and demonstrates accessibility on two levels: the app is easy to use, and the API underneath is easy to build on. The second is what makes the first possible.

Accessible features

Four generation workflows, taking text or an image as input and producing an image or a video:

  • Text-to-image: Enter prompts, choose among 18 models, receive visuals

  • Image transformation: Modify gallery photos using the same model catalog

  • Text-to-video: Generate motion sequences without manual keyframing

  • Image animation: Breathe movement into static images

All four workflows draw on the same 18 models as the web version. You sign in with Comfy Sign-In. Outputs save directly to your Photos library, and an in-app gallery keeps a history of what you created. The whole flow stays native to the device: the gallery serves as both input and output, and models are picked by tapping.
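As a way to picture the four workflows, here is a minimal sketch that models them as a two-by-two grid of primary input and output type. The names `Modality`, `Workflow`, and `needsPhotoPicker` are illustrative assumptions and are not taken from the app or from ComfySwiftSDK.

```swift
// Illustrative model only; these names are assumptions, not SDK or app types.
enum Modality: String { case text, image, video }

enum Workflow: CaseIterable {
    case textToImage, imageToImage, textToVideo, imageToVideo

    /// The primary input the user supplies.
    var input: Modality {
        switch self {
        case .textToImage, .textToVideo: return .text
        case .imageToImage, .imageToVideo: return .image
        }
    }

    /// The kind of media the job produces.
    var output: Modality {
        switch self {
        case .textToImage, .imageToImage: return .image
        case .textToVideo, .imageToVideo: return .video
        }
    }

    /// Assumed rule: only image-input workflows open the photo picker.
    var needsPhotoPicker: Bool { input == .image }
}
```

A screen could iterate over `Workflow.allCases` to build its entry points and use `needsPhotoPicker` to decide whether to show the gallery picker first.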

That covers the user side. The infrastructure underneath is the part developers will care about.

Technical foundation

Comfy Go runs entirely on Comfy Cloud's public API, the same one available to external developers. There are no private endpoints, mobile-only routes, or privileged protocols. Anything the app does, an outside developer can replicate.

The underlying Swift SDK (ComfySwiftSDK) reduces the interaction to two core methods and one key event:

Submit workflows, monitor real-time events, receive outputs.

  1. Initiate workflows

  2. Track job progression (queued/running/completed)

  3. Receive outputs upon completion—no separate retrieval required

This submit-stream-receive sequence is the entire contract. SwiftUI components know nothing about HTTP mechanics and deal only with job states. Every screen is built on this minimal API surface.
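To illustrate what "dealing only with job states" can look like, here is a minimal sketch of a pure function that turns a job state into what a status view displays. `JobState`, `StatusViewModel`, and `viewModel(for:)` are hypothetical names chosen for this example; the SDK's actual types may be named and shaped differently.

```swift
// Hypothetical job states; not the SDK's actual types.
enum JobState: Equatable {
    case queued
    case running
    case completed(outputs: [String])  // identifiers of the finished results
}

/// Everything a status view needs, derived purely from the job state.
struct StatusViewModel: Equatable {
    let label: String
    let showsSpinner: Bool
    let outputs: [String]
}

/// Maps a job state to display data without touching any networking code.
func viewModel(for state: JobState) -> StatusViewModel {
    switch state {
    case .queued:
        return StatusViewModel(label: "Waiting in queue", showsSpinner: true, outputs: [])
    case .running:
        return StatusViewModel(label: "Generating", showsSpinner: true, outputs: [])
    case .completed(let outputs):
        return StatusViewModel(label: "Done", showsSpinner: false, outputs: outputs)
    }
}
```

Because the mapping is a pure function of the state, a view built this way can be previewed and tested with hand-written states and no server.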

In Swift, this translates to two calls and state handling:

Authentication, submission, streaming, output retrieval.
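The flow can be sketched in Swift roughly as follows. This is not the ComfySwiftSDK API: `WorkflowClient`, `JobEvent`, `MockClient`, and `generate(using:workflow:)` are hypothetical names, the mock replaces the network so the example runs offline, and authentication is left out because the post does not describe how the SDK handles it.

```swift
import Foundation

// Hypothetical shapes; none of these names are confirmed ComfySwiftSDK API.
enum JobEvent: Equatable {
    case queued
    case running
    case completed(outputs: [String])  // outputs arrive with the final event
}

protocol WorkflowClient {
    /// Call 1: submit a workflow and get back a job identifier.
    func submit(workflow: String) async throws -> String
    /// Call 2: stream events for that job until it finishes.
    func events(for jobID: String) -> AsyncStream<JobEvent>
}

/// In-memory stand-in so the sketch runs without network access or credentials.
struct MockClient: WorkflowClient {
    func submit(workflow: String) async throws -> String { "job-1" }

    func events(for jobID: String) -> AsyncStream<JobEvent> {
        AsyncStream { continuation in
            continuation.yield(.queued)
            continuation.yield(.running)
            continuation.yield(.completed(outputs: ["result.png"]))
            continuation.finish()
        }
    }
}

/// Submits, follows the event stream, and returns the outputs from the completion event.
func generate<C: WorkflowClient>(using client: C, workflow: String) async throws -> [String] {
    let jobID = try await client.submit(workflow: workflow)
    for await event in client.events(for: jobID) {
        if case .completed(let outputs) = event {
            return outputs
        }
    }
    return []  // the stream ended without a completion event
}
```

Calling `try await generate(using: MockClient(), workflow: "{}")` walks through the queued and running events and returns the outputs carried by the completion event, with no separate fetch step.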

Integrate via Package.swift:

.package(url: "https://github.com/Comfy-Org/ComfySwiftSDK.git", from: "0.1.0")

Version 0.1.0 is licensed under Apache-2.0 and depends only on Foundation and CryptoKit (iOS 17+/macOS 14+).
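For context, a complete manifest around that dependency line might look like the sketch below. The package and target name `MyComfyApp` is a placeholder, and the product name `ComfySwiftSDK` is an assumption based on the repository name. Check the SDK's own Package.swift for the actual product name.

```swift
// swift-tools-version:5.9
import PackageDescription

let package = Package(
    name: "MyComfyApp",  // placeholder name for your own package
    // Matches the minimum platforms stated for SDK version 0.1.0.
    platforms: [.iOS(.v17), .macOS(.v14)],
    dependencies: [
        .package(url: "https://github.com/Comfy-Org/ComfySwiftSDK.git", from: "0.1.0")
    ],
    targets: [
        .target(
            name: "MyComfyApp",
            dependencies: [
                // Product name assumed to match the repository name.
                .product(name: "ComfySwiftSDK", package: "ComfySwiftSDK")
            ]
        )
    ]
)
```

The tools version is set to 5.9 because the `.v17` and `.v14` platform constants require it.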

That accessibility shows up concretely: one developer built the entire application in approximately one week. Initial scaffolding began April 7 using agentic workflows (Claude Code + BMAD methodology), producing ~17,200 lines across 8 epics and 57 stories. The result includes working iOS generation pipelines, model selection, galleries, and authentication.

Such rapid development stems from the API's shallow learning curve. With only a submit call, an event stream, and a completion event to learn, a solo developer can turn a concept into a working product quickly.

The true challenge lies not in API complexity, but in envisioning valuable applications.

Availability

Beta access: Comfy Go is available through public TestFlight enrollment.
Development resources (Swift SDK, added via Package.swift):

.package(url: "https://github.com/Comfy-Org/ComfySwiftSDK.git", from: "0.1.0")

Development metrics

  • Functionality: 18 models across 4 workflows

  • API surface: Submission, event streaming, output delivery

  • Development: Solo developer

  • Timeline: ~1 week (commenced April 7)

  • Code volume: ~17,200 lines

  • Methodology: Agentic workflow (Claude Code + BMAD)

  • API: Public Comfy Cloud endpoints

Comfy Go comes out of our Vibes initiative: internally built projects that we think deserve a public release. What would you create with this submit-stream-receive contract? Share your ideas below.