2026-03-04 22:05:05 -08:00

name: image-inspector
description: Inspect images to answer Yes/No questions about visual content. Use when asking "Is a <thing> visible in this image?" or checking for specific objects, people, colors, text, or other visual elements. Always arrives at a definitive Yes/No conclusion.

Image Inspector

Inspect images using the Qwen3-VL vision model to answer Yes/No questions about visual content.

When to Use

Use this skill when you need to:

  • Check if a specific object is present in an image
  • Verify visual elements exist
  • Answer binary questions about image content
  • Confirm or deny the presence of things in images

How It Works

  1. You provide an image path and a Yes/No question
  2. You resize the image to at most 1MP (one megapixel)
  3. You ask @image-expert to examine the resized image and return Yes or No
  4. You receive a definitive Yes or No answer

Usage Pattern

Step 1: Read the Image

Use the Read tool to load the image file; it can read image files and return them as attachments.

Step 2: Resize the Image to 1MP

Use ImageMagick to resize the image to at most 1MP (one megapixel), writing the result to ./.tmp/
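
The resize step could be sketched as follows. This is a minimal sketch, not the skill's actual implementation: it assumes "1MP" means 1,000,000 pixels, that the output keeps the input file name, and that ImageMagick is installed as `magick` (version 7) or `convert` (version 6). The function names are hypothetical. The `<area>@>` geometry is ImageMagick's syntax for capping the pixel area while only shrinking images that exceed it.

```python
import math
import shutil
import subprocess
from pathlib import Path

MAX_PIXELS = 1_000_000  # assumption: "1MP" means 1,000,000 pixels


def fit_within_pixels(width: int, height: int, max_pixels: int = MAX_PIXELS) -> tuple[int, int]:
    """Return dimensions scaled down (never up) so that width * height <= max_pixels."""
    if width * height <= max_pixels:
        return width, height
    # Scaling both sides by the same factor preserves the aspect ratio.
    scale = math.sqrt(max_pixels / (width * height))
    return max(1, math.floor(width * scale)), max(1, math.floor(height * scale))


def build_resize_command(src: str, out_dir: str = "./.tmp", max_pixels: int = MAX_PIXELS) -> list[str]:
    """Build an ImageMagick command that shrinks src to at most max_pixels."""
    # "magick" is the ImageMagick 7 entry point; ImageMagick 6 ships "convert".
    binary = "magick" if shutil.which("magick") else "convert"
    dest = Path(out_dir) / Path(src).name
    # "<area>@>" limits the pixel area and leaves smaller images untouched.
    return [binary, src, "-resize", f"{max_pixels}@>", str(dest)]


def resize_image(src: str, out_dir: str = "./.tmp") -> str:
    """Run the resize and return the path of the resized copy."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    cmd = build_resize_command(src, out_dir)
    subprocess.run(cmd, check=True)
    return cmd[-1]
```

`fit_within_pixels` is only there to show the arithmetic; ImageMagick performs the equivalent calculation itself when given the area geometry, so `resize_image` does not call it.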

Step 3: Formulate the Question

Ask @image-expert a clear Yes/No question about the image:

  • "Is a [object] visible in this image?"
  • "Does this image contain [element]?"
  • "Can you see [thing] in this scene?"
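
One way to keep the phrasing consistent is to fill the three templates above from a small helper. The sketch below is illustrative only; `QUESTION_TEMPLATES`, `build_question`, and the template keys are hypothetical names, not part of the skill.

```python
# Templates mirror the three question forms listed above.
QUESTION_TEMPLATES = {
    "visible": "Is a {subject} visible in this image?",
    "contains": "Does this image contain {subject}?",
    "scene": "Can you see {subject} in this scene?",
}


def build_question(subject: str, kind: str = "visible") -> str:
    """Return a Yes/No question about subject using one of the templates."""
    subject = subject.strip()
    if not subject:
        raise ValueError("subject must not be empty")
    if kind not in QUESTION_TEMPLATES:
        raise ValueError(f"unknown question kind: {kind!r}")
    return QUESTION_TEMPLATES[kind].format(subject=subject)


if __name__ == "__main__":
    print(build_question("tree"))
    print(build_question("a person wearing a hat", kind="contains"))
```

Note that the "visible" template supplies its own article, so the subject is passed without one ("tree"), while the other two expect the subject to carry its article ("a water feature").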

Step 4: Provide the Answer

After the resized image has been analyzed, provide:

  1. The Answer: Yes or No (always definitive)
  2. Brief Justification: 1-2 sentences explaining why

Example Questions

  • "Is a tree visible in this image?"
  • "Does this image contain a person wearing a hat?"
  • "Is there text visible in this image?"
  • "Can you see a water feature in this scene?"
  • "Is the sky visible in this image?"
  • "Does this image show an indoor scene?"

Response Format

**Answer:** Yes/No

**Reasoning:** [1-2 sentences explaining what you see or don't see]
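
If the response needs to be produced or checked programmatically, the format above could be handled along these lines. This is a sketch under assumptions: the `Verdict` type and both function names are hypothetical, and the parser accepts only a reply that follows the two-field layout shown above.

```python
import re
from dataclasses import dataclass


@dataclass
class Verdict:
    answer: bool    # True for Yes, False for No
    reasoning: str  # the 1-2 sentence justification


# Matches "**Answer:** Yes|No" followed later by "**Reasoning:** <text>".
_VERDICT_RE = re.compile(
    r"\*\*Answer:\*\*\s*(Yes|No)\b.*?\*\*Reasoning:\*\*\s*(.+)",
    re.IGNORECASE | re.DOTALL,
)


def format_verdict(verdict: Verdict) -> str:
    """Render a verdict in the response format."""
    answer = "Yes" if verdict.answer else "No"
    return f"**Answer:** {answer}\n\n**Reasoning:** {verdict.reasoning}"


def parse_verdict(text: str) -> Verdict:
    """Parse a formatted response; reject anything without a Yes or No answer."""
    match = _VERDICT_RE.search(text)
    if match is None:
        raise ValueError("response does not contain a definitive Yes/No answer")
    return Verdict(match.group(1).lower() == "yes", match.group(2).strip())
```

Rejecting replies that lack a plain Yes or No is a design choice that matches the requirement for a definitive answer; a caller could catch the `ValueError` and ask again.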

Guidelines

  • Always provide a definitive Yes or No answer
  • Be specific about what you observe
  • If uncertain, describe what you see and make your best judgment
  • Don't hedge with "maybe" or "possibly" - commit to an answer
  • Focus only on the specific question asked

Limitations

  • The model can only analyze what's visually apparent
  • Small or partially obscured objects may be missed
  • The model cannot zoom or enhance the image
  • Text must be clearly legible to be detected