Replicate
Core Functionality
methexis-inc/img2prompt generates an approximate text prompt, including style, that can be used with Stable Diffusion to re-create similar-looking versions of an image or painting. It is optimized for Stable Diffusion (CLIP ViT-L/14). The underlying CLIP Interrogator uses OpenAI CLIP models to test a given image against a variety of artists, mediums, and styles to study how the different models see the content of the image. It then combines the results with a BLIP caption to suggest a text prompt.
Usage Instructions & Workflow
- Input: Upload a file from your machine or provide an image URL.
- Processing: The model interrogates the image using ViT-L/14.
- Output: Generates a text prompt containing image content, style, artists, etc. (e.g., "a cat wearing a suit and tie with green eyes, a stock photo by Hanns Katz, pexels, furry art, stockphoto, creative commons attribution, quantum wavetracing").
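The example output above reads as a comma-separated list: a leading description of the image content followed by style, artist, and source terms. A minimal Python sketch for separating the two, assuming the first comma-delimited segment is the caption (the helper name `split_prompt` is hypothetical and not part of the model or its API):

```python
def split_prompt(prompt: str) -> tuple[str, list[str]]:
    """Split a generated prompt into a leading caption and trailing modifiers.

    Assumption: the first comma-separated segment describes the image content
    and the remaining segments are style/artist/source terms.
    """
    parts = [p.strip() for p in prompt.split(",") if p.strip()]
    if not parts:
        return "", []
    return parts[0], parts[1:]


example = (
    "a cat wearing a suit and tie with green eyes, a stock photo by Hanns Katz, "
    "pexels, furry art, stockphoto, creative commons attribution, quantum wavetracing"
)
caption, modifiers = split_prompt(example)
print(caption)
print(modifiers)
```

Splitting this way makes it easier to keep the caption and swap or drop individual style terms before passing the prompt to Stable Diffusion.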
Running Methods
- API Calls: Supports Node.js, Python, and the HTTP API. Requires setting the REPLICATE_API_TOKEN environment variable and uses the model version ID 50adaf2d3ad20a6f911a8a9e3ccf777b263b8596fbd2c8fc26e8888f8a0edbb5.
- Local Execution: Can be downloaded and run locally using Cog or Docker. Requires Nvidia T4 GPU hardware.
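A Python API call might look like the sketch below. It uses `replicate.run` from the official `replicate` client package; the input key `image` and the assumption that the result is a single string are not stated in this listing and should be checked against the model's API page. The function name `image_to_prompt` is hypothetical.

```python
import os

MODEL = "methexis-inc/img2prompt"
VERSION = "50adaf2d3ad20a6f911a8a9e3ccf777b263b8596fbd2c8fc26e8888f8a0edbb5"


def model_ref() -> str:
    """Build the 'owner/name:version' reference expected by the client."""
    return f"{MODEL}:{VERSION}"


def image_to_prompt(image_url: str) -> str:
    """Ask the hosted model for a prompt describing the image at image_url."""
    if not os.environ.get("REPLICATE_API_TOKEN"):
        raise RuntimeError("Set REPLICATE_API_TOKEN before calling the API")
    # Imported lazily so the module loads even if the client is not installed
    # (install with: pip install replicate).
    import replicate

    # Assumption: the model's input field is named "image" and the output
    # is a single prompt string.
    output = replicate.run(model_ref(), input={"image": image_url})
    return str(output).strip()
```

With the token exported, calling `image_to_prompt` with a publicly reachable image URL should return the generated prompt. The first call may be slow if the model has to cold boot.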
Pricing & Costs
- This model costs approximately $0.011 to run on Replicate, or 90 runs per $1, but this varies depending on your inputs.
- Predictions typically complete within 49 seconds. The first API call will boot the model (cold boot) and may take longer, but subsequent responses will be fast.
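The two pricing figures are consistent with each other: $1 divided by roughly $0.011 per run gives about 90.9, which rounds down to 90 whole runs. A small Python sketch for budgeting a batch, treating the quoted per-run cost as a fixed estimate (actual cost varies with inputs):

```python
import math

COST_PER_RUN_USD = 0.011  # approximate figure quoted above; varies with inputs


def runs_per_dollar(cost_per_run: float = COST_PER_RUN_USD) -> int:
    """Whole number of runs that fit in one dollar."""
    return math.floor(1 / cost_per_run)


def estimated_cost(n_runs: int, cost_per_run: float = COST_PER_RUN_USD) -> float:
    """Rough total cost in USD for n_runs predictions."""
    return round(n_runs * cost_per_run, 2)


print(runs_per_dollar())     # 90
print(estimated_cost(1000))  # 11.0
```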
Technical Background & Source
The model is a slightly adapted version of the CLIP Interrogator notebook by @pharmapsychotic. It is open source under its own license, with source code hosted on GitHub (pharmapsychotic/clip-interrogator).
Visits: 9.9M
Country: India
Pricing model: Free