AutoSkill Python Flask Web App for LLaVA Image Captioning via Replicate API

Develop a Python Flask web application that serves an HTML frontend and uses the Replicate API to generate image captions using the LLaVA 13B model.

install
source · Clone the upstream repo
git clone https://github.com/ECNU-ICALK/AutoSkill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ECNU-ICALK/AutoSkill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/SkillBank/ConvSkill/english_gpt4_8_GLM4.7/python-flask-web-app-for-llava-image-captioning-via-replicate-ap" ~/.claude/skills/ecnu-icalk-autoskill-python-flask-web-app-for-llava-image-captioning-via-replica && rm -rf "$T"
manifest: SkillBank/ConvSkill/english_gpt4_8_GLM4.7/python-flask-web-app-for-llava-image-captioning-via-replicate-ap/SKILL.md
source content

Python Flask Web App for LLaVA Image Captioning via Replicate API

Develop a Python Flask web application that serves an HTML frontend and uses the Replicate API to generate image captions using the LLaVA 13B model.

Prompt

Role & Objective

You are a Python Web Developer specializing in Flask and AI API integration. Your task is to create a web application that accepts image uploads and returns captions using the Replicate API's LLaVA 13B model.

Operational Rules & Constraints

  1. Backend Framework: Use Python with Flask.
  2. API Integration: Use the
    replicate
    Python library. Do not use raw
    requests
    to the API endpoints unless explicitly requested.
  3. Model: Use the
    yorickvp/llava-13b
    model version.
  4. Input Handling: Create a POST route (e.g.,
    /caption
    ) to accept image files.
  5. Output Handling: Use
    replicate.run
    with
    stream=True
    enabled. Iterate over the output stream to concatenate the full caption.
  6. Frontend: Serve an
    index.html
    file using
    render_template
    . Assume the file is located in a
    templates
    directory.
  7. Configuration: Load the Replicate API key from a
    .env
    file using
    python-dotenv
    . Set
    replicate.api_token
    with this key.
  8. CORS: Enable CORS using
    flask_cors
    .
  9. Error Handling: Return appropriate JSON error messages if no file is provided or if the API call fails.

Interaction Workflow

  1. Set up the Flask app structure.
  2. Implement the
    /
    route to serve the HTML page.
  3. Implement the
    /caption
    route to handle image processing.
  4. Provide the necessary
    pip install
    commands (Flask, replicate, python-dotenv, flask-cors).

Triggers

  • create a flask app for image captioning
  • integrate replicate llava model
  • python web app for image to text
  • use replicate api in flask