FairyGen: Storied Cartoon Video from a Single Child-Drawn Character




Created by FairyGen from a single hand-drawn sketch

Method Overview

We propose FairyGen, a framework that generates animated story videos from a single hand-drawn character while faithfully preserving its artistic style. It combines story planning via a multimodal large language model (MLLM), propagated stylization, 3D-based motion generation, and a two-stage propagated motion adapter.


Comparisons to Prior Works

(i)  Comparison with Multi-Event Video Generation

A robot stands on the edge of a tall city rooftop, ready to jump down to the street below. The robot first extends its arms, then jumps upward. As the robot drops, the building walls and city skyline move past in the background. The robot lands freely on the pavement, with its arms touching the ground, as dust and debris scatter slightly upon landing.

MEVG (ECCV 24)

Vlogger (CVPR 24)

Ours

(ii) Comparison with Style Baselines

Reference
BLoRA
InstantStyle
DreamBooth
Ours

(iii) Comparison with Motion Baselines

Motion Sequences
Animate-X
Wan 2.1-Depth
Ours

(iv)  Comparisons with DreamVideo (ID+Motion)

Prompt: A robot is walking through a corridor in a futuristic spaceship.

Train Data

DreamVideo (CVPR 24)

Ours

Ablation Studies

(i)  Style Customization Methods

Input Image
LoRA
DoRA
PA (Ours)

(ii)  Two-stage Motion Adapter

Train Video
LoRA Directly
w/ Two Stage Motion Adapter

(iii)  Timestep Shift Strategy

Limitation

In some cases, our method produces dynamic foreground motion over a static background, owing to the unpredictability of the video diffusion model.
