
StyleFusion

Upload any reference image. The system extracts its visual DNA, enriches it with art vocabulary from the Grimoire, and generates consistent results across 55+ models from 8 AI providers. Same character, any style, full control.

Published

February 15, 2026

Tech Stack

Cloudflare Workers Cloudflare D1 Cloudflare R2 Workers AI AI Gateway React Vite Hono

Key Highlights

  • 8-provider, 55+ model orchestration with BYOK
  • Grimoire-enriched prompt compilation (167,000+ creative atoms)
  • Persistent character DNA with identity locking
  • Multi-provider simultaneous generation and comparison

Overview

Multi-provider AI image generation with structured visual mediation. Upload references, extract visual DNA, generate across 55+ models from 8 providers.

A Note on the Current State

StyleFusion is a live product under active development by a solo developer. It works, and when it works well, it produces results you can’t get anywhere else. But it’s not polished, and some things will break.

Provider reliability is inconsistent. The AI providers StyleFusion connects to (Google, xAI, fal.ai, etc.) have their own uptime issues. You’ll occasionally see 503 Service Unavailable errors, especially during peak hours or when a provider is running hot. This isn’t a StyleFusion bug; it means the upstream provider is temporarily overloaded or down. Wait a minute and try again, or switch to a different provider.
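If you're calling providers yourself (or just want to understand the retry advice above), the pattern is simple exponential backoff on 503s. This is a generic sketch, not StyleFusion's internal code; `callProvider` is a stand-in for any request function.

```typescript
// Sketch: retry a generation call when the upstream provider returns 503.
// `callProvider` is a hypothetical stand-in for any provider request.

type ProviderResult = { status: number; body?: string };

async function generateWithRetry(
  callProvider: () => Promise<ProviderResult>,
  maxAttempts = 3,
  baseDelayMs = 1000,
): Promise<ProviderResult> {
  let last: ProviderResult = { status: 0 };
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    last = await callProvider();
    if (last.status !== 503) return last; // success, or a non-retryable error
    // Exponential backoff (1s, 2s, 4s...) gives an overloaded provider time to recover.
    await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
  }
  return last; // still 503 after all attempts: switch to a different provider
}
```

If the final result is still a 503, that's the cue to switch providers rather than keep hammering the same one.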

There are bugs. Some features are half-built, some edge cases aren’t handled, and the UI has rough spots. I’m shipping fixes constantly, but there’s only one of me and a lot of moving parts across 8 providers and 55+ models. If something doesn’t work, try refreshing, switching providers, or coming back later.

This is BYOK (Bring Your Own Key). You need API keys from the providers you want to use. If you click generate and nothing happens, you probably haven’t added a key yet. The Provider Guide below explains how to get set up.

If you find a bug or have feedback, I genuinely want to hear it. Just know that this project represents thousands of hours of solo work across the pipeline, the knowledge graph, the character system, and the provider integrations. Patience is appreciated.


How It Works

StyleFusion sits between your creative intent and AI execution. Instead of writing prompts by hand and hoping for the best, you provide reference images and the system does the rest.

The pipeline has four stages.

Source. You upload one or more reference images and assign each a role: subject (your character or scene), style (the aesthetic you want), or composition (the framing and layout). Seven specialized AI agents analyze each reference independently, extracting subject descriptions, style vocabulary, color palettes, lighting conditions, textures, and compositional structure.

Detail. A role-scoped assembly enforces boundaries between references. Your subject’s scene stays clean; the style reference contributes only its aesthetic treatment, not its content. Identity colors from your subject are protected. Lighting and atmosphere blend by weight. This is where the system prevents the most common multi-reference problem: style contamination, where elements from one image leak into another.

Compile. The Grimoire (HobFarm’s visual knowledge graph of 167,000+ creative atoms) enriches the assembled data with real art vocabulary: movement references, technique descriptions, material properties. The system matches your style to the closest arrangement in the Grimoire and pulls relevant vocabulary to make the fusion feel intentional rather than random. The result is a structured Intermediate Representation (IR) that captures everything about what you want to generate.

Output. The IR gets compiled into provider-specific prompts and sent to whatever generation model you choose. Different models interpret the same prompt differently, and that’s a feature: you can generate from multiple providers simultaneously and compare results.
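The four stages can be sketched as data transformations. All type and field names below are illustrative, not StyleFusion's actual schema; the real IR carries far more detail.

```typescript
// Illustrative shapes for the Source -> Detail -> Compile -> Output flow.

type Role = "subject" | "style" | "composition";

interface Extraction {
  role: Role;
  description: string;
  palette: string[];    // hex colors
  vocabulary: string[]; // style terms
}

interface IR {
  subject: string;
  styleTerms: string[];
  identityColors: string[]; // protected: only the subject contributes these
}

// Detail stage: role-scoped assembly. The style reference contributes
// vocabulary only; identity colors come exclusively from the subject,
// which is what prevents style contamination.
function assemble(extractions: Extraction[]): IR {
  const subject = extractions.find((e) => e.role === "subject");
  const styles = extractions.filter((e) => e.role === "style");
  return {
    subject: subject?.description ?? "",
    styleTerms: styles.flatMap((s) => s.vocabulary),
    identityColors: subject?.palette ?? [],
  };
}

// Output stage: compile the IR into one provider-ready prompt string.
function compile(ir: IR): string {
  return `${ir.subject}, in the style of ${ir.styleTerms.join(", ")}`;
}
```

The key design point is that contamination is prevented structurally: the assembly function never reads the style reference's palette, so it cannot leak into the subject's identity colors.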


Getting Started

StyleFusion uses a BYOK (Bring Your Own Key) model. You provide API keys from the image generation providers you want to use. Your keys are sent per-request in headers and are never stored on our servers.
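The mechanics look roughly like this sketch: keys live client-side and get attached to each outgoing request. The header name and storage shape here are illustrative, not StyleFusion's actual wire format.

```typescript
// Sketch of the BYOK flow. Header names and the key map are hypothetical.

type KeyStore = Record<string, string>; // provider id -> API key

// In the browser this map would come from localStorage; a plain object
// stands in here so the logic is easy to test.
function buildHeaders(keys: KeyStore, provider: string): Record<string, string> {
  const key = keys[provider];
  if (!key) throw new Error(`No API key configured for ${provider}`);
  return {
    "Content-Type": "application/json",
    "X-Provider-Key": key, // sent with this request only; never persisted server-side
  };
}
```

In a real browser session the keys would be read with `localStorage.getItem(...)`, which is why clearing your browser storage also clears your configured providers.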

Here’s how to start generating in under five minutes.

Step 1: Get at least one API key. Pick a provider from the guide below based on what you want to generate. Most have free tiers or trial credits.

Step 2: Open StyleFusion. Go to sf.hob.farm and click the Providers tab in the navigation bar.

Step 3: Enter your key. Paste your API key into the field for your chosen provider. Keys are stored in your browser’s local storage only.

Step 4: Upload a reference image. Drop an image into the workspace. Select a role (Subject Reference for characters, Style Reference for aesthetics).

Step 5: Click Fuse. The extraction agents analyze your reference and build the IR. Review the extracted data in the center panel.

Step 6: Click Compile, then generate. Choose your generation model and aspect ratio, then generate.


Provider Guide

StyleFusion supports 8 providers with 55+ models. You only need one key to start, and you can add more anytime.

Generation Providers

fal.ai (Recommended starting point)

  • Models: FLUX.2 series, Nano Banana, Wan video models, and more
  • Good for: Fast generation, wide model variety, good free tier
  • Pricing: Pay per generation, free trial credits available
  • Get your key

RunPod

  • Models: FLUX, SDXL, custom endpoints
  • Good for: Running custom models, bulk generation, GPU access
  • Pricing: Pay per second of GPU time
  • Get your key (affiliate link)
  • Sign up through this link and you’ll receive a one-time credit of $5-$500 when you add $10 for the first time, plus instant access to RunPod’s GPU resources.

OpenAI

  • Models: GPT-Image (DALL-E), ChatGPT image generation
  • Good for: Strong style fusion, consistent character preservation, reliable across prompt formats
  • Pricing: Pay per generation
  • Get your key

Midjourney

  • Models: Midjourney v6, Niji7
  • Good for: Highest quality stylized output, strong style fusion
  • Pricing: Subscription based ($10/month+)
  • Note: Requires Midjourney subscription; API access through authorized endpoints

Bria (coming soon)

  • Models: Bria generation models
  • Good for: Commercial-safe generation (trained on licensed data only)
  • Pricing: Pay per generation
  • Direct API key integration is in progress. Bria models are currently accessible through fal.ai.

Extraction Providers

These providers power the AI agents that analyze your reference images.

Google Gemini

  • Models: Gemini Flash Lite (fast/cheap), Gemini Pro (detailed), Gemini Flash (balanced)
  • Good for: IR extraction (analyzing your reference images), image generation via Imagen
  • Pricing: Generous free tier (Flash Lite), pay per token for Pro
  • Get your key

xAI / Grok

  • Models: Grok vision, Grok image generation
  • Good for: Extraction with strong character interpretation
  • Pricing: Free tier available
  • Get your key

GLM / Z.AI

  • Models: GLM-4V (vision), CogView (generation)
  • Good for: Richest extraction output, deep art vocabulary, 200K context
  • Pricing: Free tier available
  • Get your key

Qwen / Alibaba (DashScope)

  • Models: Qwen VL (vision), Qwen Image 2512 (generation)
  • Good for: Genre-aware extraction (especially anime/illustration), strong style fusion in generation
  • Pricing: Free tier available
  • Get your key

Best Free Starting Points

You don’t need to spend money to start using StyleFusion. Two providers offer generous free tiers that cover both extraction and generation.

Google Gemini (Recommended first key)

The Gemini API has a free tier through Google AI Studio that requires no credit card. You get access to Gemini Flash Lite, Flash, and Pro models for IR extraction, plus Nano Banana (gemini-2.5-flash-image) for image generation. Nano Banana 2 and Imagen 4 require a paid tier key. Free tier limits are roughly 15 requests per minute and up to 1,000 per day depending on the model, which is more than enough for regular use. To unlock Nano Banana 2, link a billing account in AI Studio (pay-as-you-go, no monthly fee; Nano Banana 2 runs about $0.04 per image).

Sign up (just needs a Google account, no credit card)

Alibaba Cloud / Qwen (Best extraction quality)

Alibaba Cloud Model Studio gives new accounts a free quota for every model in the Singapore region, valid for 90 days from activation. This covers Qwen 3.5 Plus (which produced the best IR extraction results in our testing) and Qwen Image 2512 (which produced some of the strongest style fusion outputs). The free quota is substantial enough for heavy prototyping and regular creative use.

Sign up (requires Alibaba Cloud account, select Singapore region for free quota)

Important: Use the Singapore region when creating your account and API key. Other regions don’t offer free quota, and the API key must match the region it was created in.

Between these two free providers, you have access to high-quality extraction (Qwen for detailed, genre-aware IR output; Gemini for fast, reliable extraction) and strong generation models (Nano Banana on Gemini's free tier, or Wan 2.6 T2I via Alibaba's free quota). You can use StyleFusion extensively without spending anything.


Multi-Provider Generation

One of StyleFusion’s key capabilities is generating from the same prompt across different providers simultaneously. This matters because different models interpret the same instructions differently.

Some models integrate style directly into a character’s form (fractal patterns become dress fabric, hair texture, skin markings). Others place the character inside a styled environment. Others preserve the character cleanly and apply style only as background atmosphere. These aren’t quality differences; they’re different creative interpretations of the same instruction.

By generating across multiple providers, you get a range of creative outcomes from a single extraction and can pick the one that matches your intent.
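A fan-out like this is naturally expressed with `Promise.allSettled`, so one slow or failing provider doesn't sink the batch. This is a generic sketch; `Generator` is a stand-in for any provider call.

```typescript
// Sketch: fan one compiled prompt out to several providers at once.
// Names and shapes are illustrative, not StyleFusion's API.

type Generator = (prompt: string) => Promise<string>; // resolves to an image URL

async function fanOut(
  prompt: string,
  providers: Record<string, Generator>,
): Promise<Record<string, string>> {
  const entries = Object.entries(providers);
  // allSettled: a single 503 or timeout doesn't reject the whole batch.
  const settled = await Promise.allSettled(entries.map(([, gen]) => gen(prompt)));
  const results: Record<string, string> = {};
  settled.forEach((outcome, i) => {
    const [name] = entries[i];
    results[name] =
      outcome.status === "fulfilled" ? outcome.value : `error: ${outcome.reason}`;
  });
  return results;
}
```

Each provider's result (or error) comes back keyed by name, so successes render immediately while failures just show as unavailable.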


Character System

StyleFusion includes a Character DNA system for maintaining consistent character identity across generations.

When you extract a reference image, you can save the character with their full visual profile: physical description, facial features, hair, clothing, identity colors (with hex values), distinguishing features, and style anchors. This data persists and can be loaded back into the workspace at any time.

When you activate a saved character and load a new style reference, the compiler locks the character’s identity traits while adopting the new style. The character’s face, hair, signature colors, and distinguishing features stay consistent. The style, scene, lighting, and atmospheric palette come from the new reference. This is what “same character, any style” actually means at a technical level.

Characters can be exported as structured data sheets with all their visual profile information, or as PNG/PDF for sharing.


The Grimoire

Behind StyleFusion’s compilation step sits the Grimoire: HobFarm’s self-enriching knowledge graph of visual vocabulary. It contains over 167,000 creative atoms organized into arrangements (art movements, aesthetic traditions, visual styles) with harmonic relationships between them.

When StyleFusion compiles a prompt, the Grimoire’s Conductor matches your style reference to the closest arrangement and enriches the prompt with relevant vocabulary. A fractal art reference doesn’t just get “fractal” in the prompt; it gets specific technique references, material descriptions, and movement context that help the generation model understand what you’re actually going for.

This is the difference between telling a generation model “fractal art” and telling it about escape-time algorithms, orbit traps, flame fractals, Seahorse valleys, and the specific color relationships that define the genre. The Grimoire provides the depth; StyleFusion provides the structure.
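In miniature, the enrichment step is a lookup from a style term to the vocabulary of its closest arrangement. The two arrangements and their atoms below are hand-picked examples for illustration; the real Grimoire is a 167,000-atom graph with harmonic relationships, not a flat map.

```typescript
// Toy Grimoire: a tiny arrangement -> vocabulary map, for illustration only.

const arrangements: Record<string, string[]> = {
  "fractal art": ["escape-time algorithms", "orbit traps", "flame fractals"],
  "ukiyo-e": ["woodblock printing", "flat color planes", "bokashi gradients"],
};

// Enrich a bare style term with arrangement vocabulary; unknown terms pass
// through unchanged rather than failing.
function enrich(styleTerm: string): string[] {
  const atoms = arrangements[styleTerm.toLowerCase()] ?? [];
  return [styleTerm, ...atoms];
}
```

So "fractal art" leaves the compiler as a cluster of concrete technique references, which is exactly the depth the paragraph above describes.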


Tech Details

Stack: Cloudflare Workers (API), Cloudflare Pages (frontend), D1 (database), R2 (asset storage), Workers AI (edge-native models), AI Gateway (provider routing, logging, caching)

Frontend: React, Vite, Zustand, Tailwind CSS

Pipeline: 7 extraction agents, role-scoped fusion assembly, Grimoire enrichment via Conductor, multi-target compilation (generation JSON, creative slots, compact prompts, descriptions)

Architecture: Fractal Fusion Engine (FFE): INGEST > INDEX > MEDIATE > EXECUTE > VALIDATE > DELIVER. Same six-phase pattern at every scale. The difference between a simple single-image extraction and a complex multi-reference fusion is depth, not shape.
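The "same pattern at every scale" claim can be sketched as a fixed pipeline of six transforms: every job runs the same sequence, and complexity lives inside the phases rather than in the shape of the pipeline. The phase bodies here are placeholders; only the structure mirrors the description.

```typescript
// Sketch of the six-phase FFE pattern as a fixed pipeline of transforms.
// Phase implementations are hypothetical; the order is the architecture.

type Phase<T> = (input: T) => T;

function runFFE<T>(input: T, phases: Record<string, Phase<T>>): T {
  const order = ["ingest", "index", "mediate", "execute", "validate", "deliver"];
  // Missing phases fall through as identity, so a simple job and a complex
  // job share the same shape and differ only in phase depth.
  return order.reduce((acc, name) => (phases[name] ?? ((x: T) => x))(acc), input);
}
```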

System Features

Visual DNA Extraction

Seven specialized AI agents analyze each reference image independently, extracting subject descriptions, style vocabulary, color palettes, lighting, textures, and compositional structure.

Role-Scoped Fusion

Enforces strict boundaries between subject, style, and composition references. Identity colors are protected. Style contamination is eliminated at the assembly level.

Grimoire Enrichment

The Grimoire knowledge graph (167,000+ creative atoms) enriches prompts with real art vocabulary: movement references, technique descriptions, material properties.

Multi-Provider Output

Compile to provider-specific prompts and generate from multiple models simultaneously. Compare creative interpretations from the same structured input.

Character DNA System

Save and reload full character visual profiles. Identity traits lock during style changes: same face, same colors, same distinguishing features, any aesthetic.

BYOK Architecture

Bring Your Own Key. API keys are sent per-request in headers and never stored on our servers. Add providers anytime, pay only for what you use.