| name | description |
|---|---|
| image-inspector | Inspect images to answer Yes/No questions about visual content. Use when asking "Is a <thing> visible in this image?" or checking for specific objects, people, colors, text, or other visual elements. Always arrives at a definitive Yes/No conclusion. |
Image Inspector
Inspect images using the Qwen3-VL vision model to answer Yes/No questions about visual content.
When to Use
Use this skill when you need to:
- Check if a specific object is present in an image
- Verify visual elements exist
- Answer binary questions about image content
- Confirm or deny the presence of things in images
How It Works
- You provide an image path and a Yes/No question
- The image is resized to at most 1 MP (one megapixel)
- @image-expert examines the resized image and returns Yes or No
- You receive a definitive Yes or No answer
Usage Pattern
Step 1: Read the Image
Use the Read tool to load the image file. The Read tool can read image files and return them as attachments.
Step 2: Resize the Image to 1 MP
Use ImageMagick to resize the image to at most 1 MP (one megapixel), writing the result to ./.tmp/
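The resize step could be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the helper names `fit_within` and `resize_command` are hypothetical, the `magick` entry point assumes ImageMagick 7 (version 6 installs typically use `convert` instead), and the `./.tmp` directory is assumed to exist already. The `1000000@>` geometry asks ImageMagick to cap the image area at that many pixels and to shrink only, never enlarge.

```python
import math
from pathlib import Path

MAX_PIXELS = 1_000_000  # the 1 MP budget described in this step


def fit_within(width: int, height: int, max_pixels: int = MAX_PIXELS) -> tuple[int, int]:
    """Return dimensions scaled down (never up) to fit the pixel budget.

    Hypothetical helper: useful for predicting the output size up front.
    """
    if width * height <= max_pixels:
        return width, height
    # Scaling both sides by sqrt(budget / area) preserves the aspect ratio.
    scale = math.sqrt(max_pixels / (width * height))
    return max(1, math.floor(width * scale)), max(1, math.floor(height * scale))


def resize_command(src: str, out_dir: str = "./.tmp") -> list[str]:
    """Build (but do not run) an ImageMagick command for the resize.

    Assumes ImageMagick 7's `magick` binary; the output keeps the source
    file name and is placed in out_dir.
    """
    dest = Path(out_dir) / Path(src).name
    return ["magick", src, "-resize", f"{MAX_PIXELS}@>", str(dest)]
```

To run the command, pass the returned list to `subprocess.run(..., check=True)`. A 4000x3000 (12 MP) photo would come out at roughly 1154x866, which is worth keeping in mind when the question concerns small details.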
Step 3: Formulate the Question
Ask @image-expert a clear Yes/No question about the image:
- "Is a [object] visible in this image?"
- "Does this image contain [element]?"
- "Can you see [thing] in this scene?"
Step 4: Provide the Answer
After @image-expert has analyzed the resized image, provide:
- The Answer: Yes or No (always definitive)
- Brief Justification: 1-2 sentences explaining why
Example Questions
- "Is a tree visible in this image?"
- "Does this image contain a person wearing a hat?"
- "Is there text visible in this image?"
- "Can you see a water feature in this scene?"
- "Is the sky visible in this image?"
- "Does this image show an indoor scene?"
Response Format
**Answer:** Yes/No
**Reasoning:** [1-2 sentences explaining what you see or don't see]
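If a caller needs to produce or check this format programmatically, something like the sketch below may help. The function names `format_response` and `parse_response` are hypothetical and not part of the skill. The parser assumes the two labels appear exactly as shown above and rejects anything other than a plain Yes or No, including the unfilled `Yes/No` placeholder.

```python
import re

# Matches "**Answer:** Yes" or "**Answer:** No"; the lookahead rejects
# the literal template text "Yes/No".
_ANSWER = re.compile(r"\*\*Answer:\*\*\s*(Yes|No)\b(?!/)", re.IGNORECASE)
_REASONING = re.compile(r"\*\*Reasoning:\*\*\s*(.+)", re.DOTALL)


def format_response(answer: bool, reasoning: str) -> str:
    """Render a verdict in the two-line Answer/Reasoning format."""
    verdict = "Yes" if answer else "No"
    return f"**Answer:** {verdict}\n**Reasoning:** {reasoning.strip()}"


def parse_response(text: str) -> tuple[bool, str]:
    """Extract (answer, reasoning); raise ValueError if the format is not followed."""
    answer = _ANSWER.search(text)
    reasoning = _REASONING.search(text)
    if answer is None or reasoning is None:
        raise ValueError("response does not follow the Answer/Reasoning format")
    return answer.group(1).lower() == "yes", reasoning.group(1).strip()
```

Raising on a malformed reply, rather than guessing, keeps the "always definitive" guarantee visible to the caller: a missing or hedged verdict surfaces as an error instead of silently becoming a No.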
Guidelines
- Always provide a definitive Yes or No answer
- Be specific about what you observe
- If uncertain, describe what you see and make your best judgment
- Don't hedge with "maybe" or "possibly" - commit to an answer
- Focus only on the specific question asked
Limitations
- The model can only analyze what's visually apparent
- Small or partially obscured objects may be missed
- The model cannot zoom or enhance the image
- Text must be clearly legible to be detected