Replicate


Core Functionality

methexis-inc/img2prompt generates an approximate text prompt, including style, from an image. The prompt can be used with Stable Diffusion to re-create similar-looking versions of the image or painting. It is optimized for Stable Diffusion (CLIP ViT-L/14). The underlying CLIP Interrogator uses the OpenAI CLIP models to test a given image against a variety of artists, mediums, and styles, which shows how the different models see the content of the image. It then combines the results with a BLIP caption to suggest a text prompt.
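The ranking-and-combining idea described above can be illustrated with a toy sketch. This is not the CLIP Interrogator's actual code: the three-dimensional vectors are invented stand-ins for CLIP embeddings, and the names `cosine`, `rank_labels`, and `build_prompt` are hypothetical. It only shows how candidate style labels might be scored against an image by cosine similarity and appended to a caption.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_labels(image_vec, label_vecs, top_k=2):
    """Return the top_k labels whose vectors are most similar to the image."""
    scored = sorted(
        label_vecs.items(),
        key=lambda item: cosine(image_vec, item[1]),
        reverse=True,
    )
    return [label for label, _ in scored[:top_k]]

def build_prompt(caption, image_vec, label_vecs, top_k=2):
    """Join a caption with the best-matching labels, comma separated."""
    return ", ".join([caption] + rank_labels(image_vec, label_vecs, top_k))

# Toy vectors invented for illustration; real CLIP embeddings are much larger.
image_vec = [1.0, 0.0, 0.0]
label_vecs = {
    "stock photo": [0.9, 0.1, 0.0],
    "oil painting": [0.0, 1.0, 0.0],
    "furry art": [0.7, 0.0, 0.7],
}

prompt = build_prompt("a cat wearing a suit and tie", image_vec, label_vecs)
print(prompt)
```

In the real tool, the caption would come from BLIP and the vectors from a CLIP image and text encoder; here both are hard-coded assumptions.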

Usage Instructions & Workflow

Input: Upload a file from your machine or provide an image URL.
Processing: The model interrogates the image using ViT-L/14.
Output: Generates a text prompt containing image content, style, artists, etc. (e.g., "a cat wearing a suit and tie with green eyes, a stock photo by Hanns Katz, pexels, furry art, stockphoto, creative commons attribution, quantum wavetracing").
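Since the example output is a comma-separated string, a small helper can separate the leading description from the style and artist terms that follow. This is a minimal sketch that assumes the first comma-separated segment is the caption and that the caption itself contains no commas; the model does not guarantee either, and `split_prompt` is a hypothetical name.

```python
def split_prompt(prompt):
    """Split a generated prompt into (caption, modifiers).

    Assumes the first comma-separated segment is the image description
    and every later segment is a style, artist, or medium term.
    """
    parts = [part.strip() for part in prompt.split(",") if part.strip()]
    if not parts:
        return "", []
    return parts[0], parts[1:]

example = (
    "a cat wearing a suit and tie with green eyes, a stock photo by Hanns Katz, "
    "pexels, furry art, stockphoto, creative commons attribution, quantum wavetracing"
)
caption, modifiers = split_prompt(example)
print(caption)
print(modifiers)
```

Keeping the modifiers as a list makes it easy to drop unwanted terms before passing the prompt on to Stable Diffusion.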

Running Methods

API Calls: Supports Node.js, Python, and HTTP API. Requires setting the REPLICATE_API_TOKEN environment variable and uses the model version ID 50adaf2d3ad20a6f911a8a9e3ccf777b263b8596fbd2c8fc26e8888f8a0edbb5.
Local Execution: Can be downloaded and run locally using Cog or Docker. Requires Nvidia T4 GPU hardware.
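For the HTTP API route, a request might be assembled roughly as follows. This sketch only builds the request and does not send it. It assumes the Replicate predictions endpoint at https://api.replicate.com/v1/predictions, a Bearer authorization header, and an input field named `image`; check these against the model's API page before relying on them. The image URL is a placeholder and `build_prediction_request` is a hypothetical helper.

```python
import json
import os
import urllib.request

# Assumed endpoint for creating predictions over HTTP.
API_URL = "https://api.replicate.com/v1/predictions"
# Model version ID quoted in the listing above.
MODEL_VERSION = "50adaf2d3ad20a6f911a8a9e3ccf777b263b8596fbd2c8fc26e8888f8a0edbb5"

def build_prediction_request(image_url, token):
    """Build (but do not send) a POST request that starts a prediction."""
    payload = {
        "version": MODEL_VERSION,
        "input": {"image": image_url},  # input name "image" is an assumption
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# The token is read from the environment variable named in the listing.
token = os.environ.get("REPLICATE_API_TOKEN", "")
request = build_prediction_request("https://example.com/cat.jpg", token)
print(request.get_method(), request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` would start a prediction, whose result is normally fetched separately once it finishes. The official Python and Node.js clients wrap these steps.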

Pricing & Costs

This model costs approximately $0.011 to run on Replicate, or 90 runs per $1, but this varies depending on your inputs.
Predictions typically complete within 49 seconds. The first API call will boot the model (cold boot) and may take longer, but subsequent responses will be fast.
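The quoted figures are consistent with each other: 1 / 0.011 is about 90.9, which rounds down to 90 whole runs per dollar. A short sketch for budgeting a batch, treating $0.011 as a fixed per-run price even though the listing notes it varies with inputs (the function names are hypothetical):

```python
COST_PER_RUN_USD = 0.011  # approximate figure from the listing; varies by input

def runs_per_dollar(cost_per_run=COST_PER_RUN_USD):
    """Whole number of runs that fit into one dollar."""
    return int(1 / cost_per_run)

def estimated_cost(num_runs, cost_per_run=COST_PER_RUN_USD):
    """Approximate cost in USD for a batch of runs, rounded to cents."""
    return round(num_runs * cost_per_run, 2)

print(runs_per_dollar())
print(estimated_cost(1000))
```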

Technical Background & Source

The model is a slightly adapted version of the CLIP Interrogator notebook by @pharmapsychotic. It is open source under its own license, with source code hosted on GitHub (pharmapsychotic/clip-interrogator).

Visits: 9.9M

Country: India

Pricing model: Free

Art, Free
