Oh My Assistant: Generative AI Assistant Service for Comic Artists

Oh My Assistant is a generative AI service for comic and illustration artists. It learns each artist's unique drawing style, converts realistic images into that style, and changes a character's pose to a desired pose.

Live Demo

The service is deployed at the following link (as of August 2024, the link may no longer work).

Background Image Generator

background.mp4

Pose Image Generator

pose.mp4

Member

Name	Role	Responsibilities
Chanwoo Kim	Modeling	Background image generation
Hyunwoo Nam	Modeling	Background image generation
Kyungyub Ryu	Backend	PL, infra, serving
Gyuseob Lee	Backend	BE implementation
Hyeonji Lee	Modeling	Pose image generation
Juhee Han	Frontend	UI/UX design, FE implementation

Service Architecture

The background generation service selects either the Img2Img or the Txt2Img model of Stable Diffusion, depending on whether the user provides a source image. It then creates a webtoon-style image in the following steps.

Noise Initialization
- If an original image is provided, the starting noise is produced by gradually adding noise to that image. If no original image is given, the starting noise is a completely random tensor.
Inject Condition
- The model takes a text prompt as input and extracts its embedding vectors with the text encoder of the CLIP model.
Denoising Process
- Starting from the noise generated earlier, the model progressively removes noise to produce a high-resolution image. A cross-attention mechanism guides the noise removal according to the features extracted from the prompt.
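
The noise initialization and cross-attention steps above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the project's code: the linear beta schedule, the `strength` parameter, the projection matrices `w_q`, `w_k`, `w_v`, and every function name are hypothetical. The Img2Img branch uses the standard closed form of the forward diffusion process, which yields the same distribution as adding noise step by step.

```python
import numpy as np


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def init_latents(shape, rng, source=None, strength=0.6, alpha_bar=None):
    """Return the starting latent and the timestep to begin denoising from."""
    if alpha_bar is None:
        alpha_bar = make_alpha_bar()
    noise = rng.standard_normal(shape)
    if source is None:
        # Txt2Img case: start from a completely random tensor at the last step.
        return noise, len(alpha_bar) - 1
    # Img2Img case: noise the source up to an intermediate step t, so that
    # part of the original structure survives (hypothetical `strength` knob).
    t = int(strength * (len(alpha_bar) - 1))
    noisy = np.sqrt(alpha_bar[t]) * source + np.sqrt(1.0 - alpha_bar[t]) * noise
    return noisy, t


def cross_attention(image_tokens, text_tokens, w_q, w_k, w_v):
    """Scaled dot-product attention: image tokens query the text embeddings."""
    q = image_tokens @ w_q   # (n_image, d)
    k = text_tokens @ w_k    # (n_text, d)
    v = text_tokens @ w_v    # (n_text, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights
```

Each row of `weights` sums to 1 and says how strongly one image location attends to each prompt token, which is how the prompt steers the denoising.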

In summary, given an original image or a prompt, a model trained on webtoon-style images generates noise from the input and then removes it to produce the background image the artist needs.
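
The choice between the two models can be sketched as a small dispatch function. The `BackgroundRequest` type, its field names, and the returned mode strings are hypothetical and only illustrate the rule described in this section; the actual backend may structure its requests differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BackgroundRequest:
    """Hypothetical request shape for the background generator."""
    prompt: str
    source_image: Optional[bytes] = None  # uploaded image data, if any


def select_mode(request: BackgroundRequest) -> str:
    """Pick Img2Img when a source image is provided, Txt2Img otherwise."""
    if request.source_image is not None:
        return "img2img"
    return "txt2img"
```

For example, `select_mode(BackgroundRequest(prompt="a quiet alley at dusk"))` takes the Txt2Img path because no image was supplied.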

Result

Modeling - Pose

Character Pose Transfer is divided into two steps: Pose Estimation and Pose Transfer.

Pose Estimation

The pose estimation step uses DWPose. DWPose uses a detector to find landmarks in the input image and a classifier to categorize body keypoints, predicting a total of 133 keypoints (COCO-WholeBody) for the face, hands, arms, and legs.
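
To make the 133-keypoint output concrete, the sketch below splits a keypoint array into the groups defined by the COCO-WholeBody layout (17 body, 6 feet, 68 face, and 21 per hand). The function name and the assumption that the estimator returns one array of shape (133, 2) or (133, 3) per person are illustrative; this is not DWPose's own API.

```python
import numpy as np

# COCO-WholeBody index ranges as half-open intervals [start, stop).
WHOLEBODY_GROUPS = {
    "body": (0, 17),
    "feet": (17, 23),
    "face": (23, 91),
    "left_hand": (91, 112),
    "right_hand": (112, 133),
}


def split_keypoints(keypoints):
    """Split one person's (133, D) keypoint array into named groups."""
    keypoints = np.asarray(keypoints)
    if keypoints.shape[0] != 133:
        raise ValueError(f"expected 133 keypoints, got {keypoints.shape[0]}")
    return {name: keypoints[start:stop]
            for name, (start, stop) in WHOLEBODY_GROUPS.items()}
```

A pose image for the transfer step is typically rendered from these groups, for instance by drawing the body skeleton and hand keypoints on a blank canvas.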

Pose Transfer

Pose Transfer uses a diffusion model based on AnimateAnyone. The target pose image extracted in the estimation step and the character image provided by the user are embedded using a VAE and a CLIP encoder. The Denoising UNet and ReferenceNet then generate the pose-changed character image from noise, and the VAE decoder decodes it into the final result.
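
The data flow described above can be outlined as follows. This is only a structural sketch: every component is passed in as a callable, the argument names and signatures are hypothetical, which encoder handles which input is an assumption, and the latent update is a placeholder rather than a real diffusion scheduler.

```python
import numpy as np


def transfer_pose(character, pose, vae_encode, clip_encode, reference_net,
                  denoising_unet, vae_decode, num_steps=20, seed=0):
    """Outline of the pose transfer flow with pluggable components."""
    rng = np.random.default_rng(seed)

    # Embed the user's character image and the target pose image.
    ref_latent = vae_encode(character)
    ref_embedding = clip_encode(character)
    pose_latent = vae_encode(pose)

    # ReferenceNet extracts appearance features from the character image.
    ref_features = reference_net(ref_latent, ref_embedding)

    # Start from random noise and denoise it step by step.
    latent = rng.standard_normal(ref_latent.shape)
    for step in reversed(range(num_steps)):
        predicted_noise = denoising_unet(
            latent, step, pose_latent, ref_features, ref_embedding)
        # Placeholder update: a real scheduler (e.g. DDIM) would go here.
        latent = latent - predicted_noise / num_steps

    # Decode the final latent back into image space.
    return vae_decode(latent)
```

Passing the components as arguments keeps the outline independent of any particular model implementation, so it can be exercised with simple stand-in functions.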

Directory

.github
backend
frontend
model/pose_transfer
.gitignore
README.md
README_KOR.md
docker-compose.yml

Repository: jh7316/AI_Comics_Assistant
