Replicate

Core Functionality

methexis-inc/img2prompt generates an approximate text prompt, including style terms, from an image. The prompt can be used with Stable Diffusion to re-create similar-looking versions of the image or painting. The model is optimized for Stable Diffusion (CLIP ViT-L/14). It is based on the CLIP Interrogator, which uses the OpenAI CLIP models to test a given image against a variety of artists, mediums, and styles in order to study how the different models see the content of the image. It then combines those results with a BLIP caption to suggest a text prompt.

Usage Instructions & Workflow

  1. Input: Upload a file from your machine or provide an image URL.
  2. Processing: The model interrogates the image using ViT-L/14.
  3. Output: Generates a text prompt containing image content, style, artists, etc. (e.g., "a cat wearing a suit and tie with green eyes, a stock photo by Hanns Katz, pexels, furry art, stockphoto, creative commons attribution, quantum wavetracing").
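The output in step 3 is a single comma-separated string. As a minimal sketch, assuming the first segment is the caption and the remaining segments are style, artist, and medium terms (which matches the example above but is not guaranteed for every output), it could be split like this. The helper name `split_prompt` is hypothetical.

```python
def split_prompt(prompt):
    """Split a generated prompt into a caption and a list of modifiers.

    Assumption: the caption is the first comma-separated segment and
    everything after it is a style/artist/medium modifier.
    """
    parts = [p.strip() for p in prompt.split(",") if p.strip()]
    return {
        "caption": parts[0] if parts else "",
        "modifiers": parts[1:],
    }


example = (
    "a cat wearing a suit and tie with green eyes, a stock photo by Hanns Katz, "
    "pexels, furry art, stockphoto, creative commons attribution, quantum wavetracing"
)
result = split_prompt(example)
print(result["caption"])
print(result["modifiers"])
```

Separating the caption from the modifiers makes it easier to keep the content description and swap in different style terms before passing the prompt to Stable Diffusion.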

Running Methods

  • API Calls: Supports Node.js, Python, and HTTP API. Requires setting the REPLICATE_API_TOKEN environment variable and uses the model version ID 50adaf2d3ad20a6f911a8a9e3ccf777b263b8596fbd2c8fc26e8888f8a0edbb5.
  • Local Execution: Can be downloaded and run locally using Cog or Docker. Requires Nvidia T4 GPU hardware.
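The API route above can be sketched in Python as follows. This assumes the official `replicate` client package is installed (`pip install replicate`) and that the model's input field is named `image`; both are assumptions not stated in this listing, so check the model's API page before relying on them. The helper names `model_ref` and `image_to_prompt` are hypothetical.

```python
import os

MODEL_NAME = "methexis-inc/img2prompt"
MODEL_VERSION = "50adaf2d3ad20a6f911a8a9e3ccf777b263b8596fbd2c8fc26e8888f8a0edbb5"


def model_ref(name=MODEL_NAME, version=MODEL_VERSION):
    """Build the 'owner/name:version' reference string used by the client."""
    return f"{name}:{version}"


def image_to_prompt(image_url):
    """Request a prompt for the image at image_url (makes a network call)."""
    if not os.environ.get("REPLICATE_API_TOKEN"):
        raise RuntimeError("Set REPLICATE_API_TOKEN before calling the API")
    # Imported here so the rest of the module works without the package.
    import replicate

    # Assumption: the model accepts a single input named "image".
    return replicate.run(model_ref(), input={"image": image_url})
```

A call such as `image_to_prompt("https://example.com/cat.jpg")` (a placeholder URL) would return the generated prompt. The first call may be slower while the model boots.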

Pricing & Costs

  • This model costs approximately $0.011 to run on Replicate, or 90 runs per $1, but this varies depending on your inputs.
  • Predictions typically complete within 49 seconds. The first API call will boot the model (cold boot) and may take longer, but subsequent responses will be fast.
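The relationship between the per-run cost and the runs-per-dollar figure is simple arithmetic. The sketch below uses the approximate $0.011 figure quoted above; actual cost varies with inputs, so treat the results as rough estimates. The function names are hypothetical.

```python
import math

COST_PER_RUN_USD = 0.011  # approximate figure quoted above; varies by input


def runs_per_dollar(cost_per_run=COST_PER_RUN_USD):
    """Whole number of runs that fit in one dollar at the given cost."""
    return math.floor(1 / cost_per_run)


def estimate_cost(n_runs, cost_per_run=COST_PER_RUN_USD):
    """Rough total cost in USD for n_runs predictions."""
    return round(n_runs * cost_per_run, 2)


print(runs_per_dollar())
print(estimate_cost(1000))
```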

Technical Background & Source

The model is a slightly adapted version of the CLIP Interrogator notebook by @pharmapsychotic. It is open source, with its license and source code hosted on GitHub (pharmapsychotic/clip-interrogator).

Visits: 9.9M
Country: India
Pricing model: Free
Category: Art (Free)
