ControlNet Pose
Main Features: The jagilley/controlnet-pose model modifies images of people using pose detection. Supply an input image and a text prompt, as you would for Stable Diffusion; OpenPose detects the pose in the input image automatically.
Model Description: ControlNet is a neural network structure which allows control of pretrained large diffusion models to support additional input conditions beyond prompts. The ControlNet learns task-specific conditions in an end-to-end way, and the learning is robust even when the training dataset is small (< 50k samples). Large diffusion models like Stable Diffusion can be augmented with ControlNets to enable conditional inputs like edge maps, segmentation maps, keypoints, etc.
Input Parameters:
image: Input image (Required).
prompt: Prompt for the model (Required, default "an astronaut on the moon, digital art").
num_samples: Number of samples, higher values may OOM (default "1").
image_resolution: Image resolution to be generated (default "512").
low_threshold/high_threshold: Canny line detection low/high threshold (default 100/200).
ddim_steps: Steps (default 20).
scale: Scale for classifier-free guidance (default 9).
seed: Seed.
eta: Controls the amount of noise added during the denoising diffusion process; higher value -> more noise (default 0).
a_prompt: Additional text to be appended to prompt (default "best quality, extremely detailed").
n_prompt: Negative Prompt (default "longbody, lowres, bad anatomy...").
detect_resolution: Resolution at which detection method will be applied (default 512).
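The parameters above can be collected into a single input payload. The following is a minimal sketch in Python, not part of the model's own code: `DEFAULTS` and `build_input` are hypothetical names introduced for illustration, and the default values are copied from the list above. `n_prompt` and `seed` are deliberately left out of the defaults, since the full negative prompt is truncated in the list and no default seed is given, so the hosted model would apply its own values.

```python
from typing import Any, Dict, Optional

# Defaults copied from the parameter list. n_prompt and seed are omitted on
# purpose: the listed n_prompt default is truncated and seed has no default.
DEFAULTS: Dict[str, Any] = {
    "prompt": "an astronaut on the moon, digital art",
    "num_samples": "1",
    "image_resolution": "512",
    "low_threshold": 100,
    "high_threshold": 200,
    "ddim_steps": 20,
    "scale": 9,
    "eta": 0,
    "a_prompt": "best quality, extremely detailed",
    "detect_resolution": 512,
}

# Parameters that are accepted as overrides but have no default here.
OPTIONAL_WITHOUT_DEFAULT = {"n_prompt", "seed"}


def build_input(image: Any, prompt: Optional[str] = None, **overrides: Any) -> Dict[str, Any]:
    """Hypothetical helper: merge caller overrides onto the listed defaults."""
    if image is None:
        raise ValueError("image is required")
    unknown = set(overrides) - set(DEFAULTS) - OPTIONAL_WITHOUT_DEFAULT
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    payload = dict(DEFAULTS)
    payload.update(overrides)
    if prompt is not None:
        payload["prompt"] = prompt
    payload["image"] = image
    return payload
```

For example, `build_input("https://example.com/person.png", "a chef in a kitchen", ddim_steps=30)` keeps every listed default except the prompt and the step count. The image URL is a placeholder.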
Usage: The model can be run on Replicate through the Node.js client, the Python client, or the HTTP API, or run locally using Cog and Docker.
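As an illustration of the Python route, the sketch below wraps a call to the `replicate` client's `replicate.run` function. It assumes the `replicate` package is installed and the `REPLICATE_API_TOKEN` environment variable is set. The wrapper `run_pose` and its `runner` argument are hypothetical names added here so the call can be tried without network access; the model version identifier is not given in this listing and must be looked up on the model's Replicate page.

```python
from typing import Any, Callable, Optional


def run_pose(
    image: Any,
    prompt: str,
    model_ref: str,
    runner: Optional[Callable[..., Any]] = None,
) -> Any:
    """Hypothetical wrapper: send an image and prompt to the hosted model.

    image     -- an open binary file handle or an image URL
    model_ref -- "jagilley/controlnet-pose:<version-id>"; the version id is
                 a placeholder to be filled in from the model page
    runner    -- callable with the same shape as replicate.run; defaults to
                 the real client when omitted
    """
    if runner is None:
        import replicate  # imported lazily so the sketch loads without it

        runner = replicate.run
    return runner(model_ref, input={"image": image, "prompt": prompt})


# Example usage (requires network access and a valid API token):
# with open("person.png", "rb") as f:
#     output = run_pose(f, "a chef in a kitchen",
#                       "jagilley/controlnet-pose:<version-id>")
```

Only the two required inputs are sent, so all other parameters fall back to the model's defaults.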
Target Users & Core Advantages: Targeted at AI developers and image generation enthusiasts. The core advantage lies in generating new high-quality images based on text prompts while strictly preserving the human pose from the input image, combining the generation power of Stable Diffusion with the precise pose control of ControlNet.
Pricing & Cost: This model costs approximately $0.15 per run on Replicate (about 6 runs per $1), though the cost varies with the inputs. It runs on Nvidia A100 (80GB) GPU hardware. Predictions typically complete within 108 seconds. The first API call has to boot the model from a cold start and may take longer.
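The per-run figure can be turned into a rough budget estimate. The sketch below assumes a flat cost of $0.15 per run, which is only an approximation since the actual cost varies with the inputs; the function names are hypothetical. Amounts are kept in whole cents to avoid floating-point rounding.

```python
COST_PER_RUN_CENTS = 15  # approximate figure from the pricing note; varies with inputs


def runs_for_budget(budget_cents: int) -> int:
    """Number of whole runs that fit in a budget, at the approximate rate."""
    return budget_cents // COST_PER_RUN_CENTS


def estimated_cost_cents(runs: int) -> int:
    """Approximate total cost in cents for a given number of runs."""
    return runs * COST_PER_RUN_CENTS


print(runs_for_budget(100))      # 6 runs fit in $1.00
print(estimated_cost_cents(40))  # 600 cents, i.e. about $6.00 for 40 runs
```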