Idea 007: Spot the Difference Agent
Date: 2026-04-19
Status: Idea
Tags: hermes-agent, kimi-vision, puzzle, gamification, webapp
Concept
A daily "Spot the Difference" puzzle webapp in which the AI (Kimi + Hermes) analyzes two images and shows, step by step, how it finds the differences.
Core insight: Use visual analysis strength, minimize reasoning load.
User Flow
- User opens webapp → sees today's "Spot the Difference" puzzle (two similar images)
- User can play manually (click on differences) OR
- User clicks "Let AI Solve" → watches AI's step-by-step analysis
- AI shows its reasoning process: "Scanning left-to-right... Found difference #1: color mismatch in top-left..."
- Leaderboard shows attempt stats (anonymous)
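Manual play needs a way to decide whether a click landed on a difference. The sketch below is one possible approach, not a settled design: it assumes each difference is stored as a bounding box in normalized coordinates (0.0 to 1.0), and the names `Difference` and `register_click` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Difference:
    """One difference, stored as a box in normalized image coordinates (0.0 to 1.0)."""
    label: str
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        """True if the point (px, py) lies inside this box, edges included."""
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


def register_click(differences, found, px, py):
    """Return the label of a newly found difference at (px, py), or None.

    `found` is a set of labels already discovered; it is updated in place
    so that clicking the same difference twice does not count twice.
    """
    for diff in differences:
        if diff.label not in found and diff.contains(px, py):
            found.add(diff.label)
            return diff.label
    return None
```

Normalized coordinates keep the hit test independent of how large the images are rendered in the browser.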
Why This Works
| Aspect | Implementation |
|---|---|
| Visual Analysis | Kimi compares the two images at both the pixel and the semantic level |
| Low Reasoning | Pattern matching, not complex logic |
| Step-by-Step | Show each finding with visual highlight |
| Gamification | Daily puzzle, leaderboard, no auth |
Puzzle Types
Primary: Spot the Difference (v1)
- Two images with subtle differences
- Kimi identifies all differences
- Each found difference highlighted + explanation
Secondary (future):
- Find the anomaly (what's wrong in this image?)
- Count the objects (how many X in this image?)
- What's different? (semantic analysis)
Technical Stack
| Component | Source | Role |
|---|---|---|
| Frontend | Single HTML page | Display puzzle, show AI process |
| Image Analysis | Kimi Vision (via gateway) | Compare images, find differences |
| Orchestration | Hermes Agent | Coordinate flow, format output |
| Image Gen | Pollinations AI | Generate daily puzzle pairs |
Daily Puzzle Generation
Hermes + Pollinations → Generate base image
Hermes + Pollinations → Generate modified image (with subtle changes)
Store both → Serve to users daily
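The generation flow above could be sketched as follows. This is a minimal illustration under stated assumptions: it assumes Pollinations serves images from a prompt-in-URL endpoint with `seed`, `width`, and `height` query parameters, and the helper names `daily_seed` and `puzzle_pair_urls` are hypothetical. It only builds the two URLs; fetching and storing the images is left out.

```python
import hashlib
from datetime import date
from urllib.parse import quote, urlencode

# Assumption: Pollinations exposes image generation at a URL of this shape.
POLLINATIONS_BASE = "https://image.pollinations.ai/prompt/"


def daily_seed(day: date) -> int:
    """Derive a stable 32-bit seed from the calendar date."""
    digest = hashlib.sha256(day.isoformat().encode("utf-8")).hexdigest()
    return int(digest[:8], 16)


def puzzle_pair_urls(day: date, base_prompt: str, change: str, size: int = 768):
    """Return (base_url, modified_url) for the day's puzzle pair.

    Both URLs share one seed so the two images start from the same noise;
    only the prompt differs. This does not guarantee that the images differ
    only in the requested detail, so generated pairs still need review.
    """
    query = urlencode({"seed": daily_seed(day), "width": size, "height": size})
    modified_prompt = f"{base_prompt}, {change}"
    base_url = f"{POLLINATIONS_BASE}{quote(base_prompt, safe='')}?{query}"
    modified_url = f"{POLLINATIONS_BASE}{quote(modified_prompt, safe='')}?{query}"
    return base_url, modified_url
```

Deriving the seed from the date makes the day's pair reproducible, which helps when a stored image has to be regenerated.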
AI Solving Process
1. Hermes receives both images
2. Send to Kimi Vision for analysis
3. Kimi returns list of differences with locations
4. Hermes formats step-by-step explanation
5. Frontend animates each finding
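Steps 1 to 4 might look like the sketch below. The vision call is injected as a plain function (`ask_vision`, a hypothetical name) because the actual Kimi gateway interface is not specified here. The sketch also assumes Kimi can be prompted to answer with a JSON array of objects carrying `description` and `location` fields; that response shape is an assumption, not a documented format.

```python
import json
from typing import Callable

# Hypothetical prompt; the exact wording would need tuning against Kimi.
PROMPT = (
    "Compare the two images. Reply with a JSON array of objects, "
    'each with "description" and "location" keys, one per difference.'
)


def solve_puzzle(
    image_a: bytes,
    image_b: bytes,
    ask_vision: Callable[[bytes, bytes, str], str],
) -> list[dict]:
    """Ask the vision model for differences and return numbered steps.

    Malformed replies yield an empty list instead of raising, so the
    frontend can show a "no result" state.
    """
    raw = ask_vision(image_a, image_b, PROMPT)
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []

    steps = []
    for item in items:
        # Skip entries that lack the fields the frontend needs.
        if isinstance(item, dict) and "description" in item and "location" in item:
            steps.append({
                "step": len(steps) + 1,
                "description": str(item["description"]),
                "location": str(item["location"]),
            })
    return steps
```

Injecting `ask_vision` keeps the orchestration testable with a stub before the gateway is wired in.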
Features
Core
- Daily puzzle auto-rotates
- Two-image display (side by side)
- "Let AI Solve" button
- Step-by-step visualization of AI findings
- Show each difference with highlight + explanation
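Daily auto-rotation can be as simple as indexing a stored pool of puzzles by date. The sketch below assumes such a pool exists; `puzzle_index_for` and `launch_day` are hypothetical names.

```python
from datetime import date


def puzzle_index_for(day: date, puzzle_count: int, launch_day: date) -> int:
    """Pick which stored puzzle to serve on `day`.

    Counts days since launch and wraps around the pool, so every visitor
    sees the same puzzle on the same date without any server-side state.
    """
    if puzzle_count <= 0:
        raise ValueError("puzzle_count must be positive")
    return (day - launch_day).days % puzzle_count
```

With this scheme puzzles repeat after `puzzle_count` days, so the pool has to keep growing if repeats are to be avoided.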
Gamification (no auth)
- Attempt counter (per user session, localStorage)
- Leaderboard (anonymous, session-based)
- "Perfect solve" badge (AI found all differences on first pass)
Nice to Have
- Difficulty levels (Easy/Medium/Hard)
- Share result as image
- Hint system (Kimi finds 1, user finds rest)
Step-by-Step Output Format
🔍 Scanning image...
✅ Difference #1 found: "The lamp color changed from blue to red"
📍 Location: Top-left corner
👆 [Highlighted on image]
✅ Difference #2 found: "Window shape is slightly different"
📍 Location: Center-right
👆 [Highlighted on image]
...
🎯 Solved! Found X differences in Y steps.
⏱️ Time: Z seconds
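A small formatter could produce the transcript above from a list of findings. This sketch assumes each finding is a dict with `description` and `location` keys and that one step corresponds to one finding; `format_transcript` is a hypothetical name. The highlight line is a text placeholder, since the actual highlighting happens in the frontend.

```python
def format_transcript(steps, elapsed_seconds):
    """Render findings as the step-by-step text transcript.

    `steps` is a list of dicts with "description" and "location" keys.
    Assumption: each finding counts as one step.
    """
    lines = ["🔍 Scanning image..."]
    for number, step in enumerate(steps, start=1):
        lines.append(f'✅ Difference #{number} found: "{step["description"]}"')
        lines.append(f'📍 Location: {step["location"]}')
        lines.append("👆 [Highlighted on image]")
    count = len(steps)
    lines.append(f"🎯 Solved! Found {count} differences in {count} steps.")
    lines.append(f"⏱️ Time: {elapsed_seconds:.1f} seconds")
    return "\n".join(lines)
```

Returning a single string keeps the sketch simple; a frontend that animates each finding would more likely consume the list of lines one at a time.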
Comparison with Other Ideas
| Aspect | 001 Visual Narrative | 007 Spot the Difference |
|---|---|---|
| Visual Analysis | Heavy | Heavy |
| Reasoning | Medium | Light |
| Demo Impact | High | High |
| Gamification | Low | High |
| Uniqueness | 7/10 | 9/10 |
| Step-by-Step | Yes | Yes (more natural) |
Why Stronger than 001
- Tangible use case — People actually play spot the difference
- Clear AI demonstration — "Watch AI see what you see"
- Gamification — Daily puzzle + leaderboard = engagement
- Low reasoning, high vision — Perfect for Kimi's strength
- Step-by-step is natural — Not forced, it's how you'd solve it
Risks
- ⚠️ Need reliable daily puzzle generation (harder than it sounds: the modified image must differ from the base only in the intended details)
- ⚠️ Kimi's analysis quality depends on image complexity
- ⚠️ Need a diverse puzzle set to avoid repeats
Next Steps
- Test Kimi's spot-the-difference capability
- Design puzzle generation pipeline
- Mock up webapp UI
- Prototype step-by-step visualization
Related Ideas
- See: 001-visual-narrative-agent.md