# Idea 007: Spot the Difference Agent

**Date:** 2026-04-19
**Status:** Idea
**Tags:** hermes-agent, kimi-vision, puzzle, gamification, webapp

## Concept

A daily "Spot the Difference" puzzle webapp where AI (Kimi + Hermes) analyzes two images and shows its step-by-step process for finding the differences.

**Core insight:** Lean on the model's visual-analysis strength and keep the reasoning load low.

## User Flow

1. User opens the webapp → sees today's "Spot the Difference" puzzle (two similar images)
2. User can play manually (click on differences), or
3. User clicks "Let AI Solve" → watches the AI's step-by-step analysis
4. AI shows its reasoning process: "Scanning left-to-right... Found difference #1: color mismatch in top-left..."
5. Leaderboard shows anonymous attempt stats

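For the manual-play path in step 2, something has to decide whether a click lands on a difference. A minimal sketch, assuming each difference is stored as an axis-aligned box in normalized (0 to 1) image coordinates; `DifferenceBox`, `find_hit`, and the tolerance value are hypothetical and not part of any existing codebase:

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class DifferenceBox:
    """Hypothetical region of one difference, in normalized 0..1 coordinates."""
    id: int
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float

    def contains(self, px: float, py: float, tolerance: float = 0.02) -> bool:
        """True if the point lies inside the box, grown by `tolerance` on each side."""
        return (self.x - tolerance <= px <= self.x + self.width + tolerance
                and self.y - tolerance <= py <= self.y + self.height + tolerance)


def find_hit(boxes: Sequence[DifferenceBox], px: float, py: float) -> Optional[int]:
    """Return the id of the first box containing the click, or None on a miss."""
    for box in boxes:
        if box.contains(px, py):
            return box.id
    return None
```

Normalized coordinates keep the check independent of the rendered image size, so the same boxes work for both the left and right image and on any screen.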
## Why This Works

| Aspect | Implementation |
|--------|----------------|
| **Visual Analysis** | Kimi compares the images at both pixel and semantic level |
| **Low Reasoning** | Pattern matching, not complex logic |
| **Step-by-Step** | Show each finding with a visual highlight |
| **Gamification** | Daily puzzle, leaderboard, no auth |

## Puzzle Types

### Primary: Spot the Difference (v1)
- Two images with subtle differences
- Kimi identifies all differences
- Each found difference highlighted + explanation

### Secondary (future):
- Find the anomaly (what's wrong in this image?)
- Count the objects (how many X in this image?)
- What's different? (semantic analysis)

## Technical Stack

| Component | Source | Role |
|-----------|--------|------|
| Frontend | Single HTML page | Display puzzle, show AI process |
| Image Analysis | Kimi Vision (via gateway) | Compare images, find differences |
| Orchestration | Hermes Agent | Coordinate flow, format output |
| Image Gen | Pollinations AI | Generate daily puzzle pairs |

### Daily Puzzle Generation
```
Hermes + Pollinations → Generate base image
Hermes + Pollinations → Generate modified image (with subtle changes)
Store both → Serve to users daily
```

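One way the two generation calls could be wired up is to request both images with the same seed and a slightly altered prompt. The sketch below only builds the request URLs. The endpoint and the `seed`, `width`, and `height` query parameters are assumptions to verify against the current Pollinations documentation, and `image_url` and `daily_pair_urls` are hypothetical helper names:

```python
from datetime import date
from urllib.parse import quote, urlencode

# Assumed endpoint; verify against the current Pollinations documentation.
POLLINATIONS_BASE = "https://image.pollinations.ai/prompt/"


def image_url(prompt: str, seed: int, width: int = 768, height: int = 768) -> str:
    """Build a GET URL for one generated image (parameter names are assumptions)."""
    query = urlencode({"seed": seed, "width": width, "height": height})
    return f"{POLLINATIONS_BASE}{quote(prompt, safe='')}?{query}"


def daily_pair_urls(day: date, base_prompt: str, change: str) -> tuple[str, str]:
    """Return (base_url, modified_url) that share a date-derived seed."""
    seed = day.toordinal()  # same date -> same seed -> reproducible puzzle
    modified_prompt = f"{base_prompt}, {change}"
    return image_url(base_prompt, seed), image_url(modified_prompt, seed)
```

Sharing a seed does not guarantee that only the intended detail changes between the two images, so the generated pair would still need a check before it is stored and served.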
### AI Solving Process
```
1. Hermes receives both images
2. Send to Kimi Vision for analysis
3. Kimi returns list of differences with locations
4. Hermes formats step-by-step explanation
5. Frontend animates each finding
```

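Step 3 is easier to build on if the reply is parsed into a fixed structure before formatting. A minimal sketch, assuming Hermes prompts Kimi to answer with a JSON array of objects carrying `description`, `location`, and `box` keys; that schema, the `Difference` type, and `parse_differences` are illustrative assumptions, not a documented Kimi response format:

```python
import json
from dataclasses import dataclass


@dataclass
class Difference:
    description: str
    location: str                            # human-readable, e.g. "Top-left corner"
    box: tuple[float, float, float, float]   # x, y, width, height in 0..1


def parse_differences(raw_reply: str) -> list[Difference]:
    """Parse a model reply that is expected to contain a JSON array.

    Text around the array (such as a Markdown code fence) is ignored, and
    malformed entries are skipped rather than aborting the whole solve.
    """
    start, end = raw_reply.find("["), raw_reply.rfind("]")
    if start == -1 or end <= start:
        return []
    try:
        items = json.loads(raw_reply[start:end + 1])
    except json.JSONDecodeError:
        return []
    differences = []
    for item in items:
        try:
            x, y, w, h = (float(v) for v in item["box"])
            differences.append(
                Difference(str(item["description"]), str(item["location"]), (x, y, w, h))
            )
        except (KeyError, TypeError, ValueError):
            continue  # entry is missing a field or has a malformed box
    return differences
```

Skipping bad entries instead of failing keeps the "Let AI Solve" flow usable when the model returns a partially valid list.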
## Features

### Core
- [ ] Daily puzzle auto-rotates
- [ ] Two-image display (side by side)
- [ ] "Let AI Solve" button
- [ ] Step-by-step visualization of AI findings
- [ ] Show each difference with highlight + explanation

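The daily auto-rotation can be as simple as indexing a stored puzzle pool by date. A minimal sketch, assuming puzzles are pre-generated and identified by string ids; `puzzle_for_day` is a hypothetical helper:

```python
from datetime import date, datetime, timezone
from typing import Optional, Sequence


def puzzle_for_day(puzzle_ids: Sequence[str], day: Optional[date] = None) -> str:
    """Pick the puzzle for a given day, cycling through the pool in order."""
    if not puzzle_ids:
        raise ValueError("puzzle pool is empty")
    if day is None:
        day = datetime.now(timezone.utc).date()  # UTC so all users share one puzzle
    return puzzle_ids[day.toordinal() % len(puzzle_ids)]
```

With this scheme a pool of N puzzles repeats every N days, so the pool has to keep growing if repeats are to be avoided.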
### Gamification (no auth)
- [ ] Attempt counter (per user session, localStorage)
- [ ] Leaderboard (anonymous, session-based)
- [ ] "Perfect solve" badge (AI found all differences on first pass)

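An anonymous leaderboard needs only a random session token and a sort order. The sketch below assumes ranking by fewest attempts, then fastest time; the ranking rule, the `Attempt` fields, and the six-character masking are all illustrative choices rather than decisions made in this note:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attempt:
    session_id: str   # random token kept in the browser, no account needed
    attempts: int     # number of tries in this session, lower is better
    seconds: float    # time taken to find all differences


def rank(entries: list[Attempt], top: int = 10) -> list[tuple[int, str, int, float]]:
    """Return (position, masked_id, attempts, seconds) rows, best first."""
    ordered = sorted(entries, key=lambda e: (e.attempts, e.seconds))
    return [(position, e.session_id[:6], e.attempts, e.seconds)
            for position, e in enumerate(ordered[:top], start=1)]
```

Showing only a prefix of the session token keeps rows distinguishable without exposing the full value a client would use to submit results.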
### Nice to Have
- [ ] Difficulty levels (Easy/Medium/Hard)
- [ ] Share result as image
- [ ] Hint system (Kimi finds 1, user finds rest)

## Step-by-Step Output Format

```
🔍 Scanning image...
✅ Difference #1 found: "The lamp color changed from blue to red"
📍 Location: Top-left corner
👆 [Highlighted on image]

✅ Difference #2 found: "Window shape is slightly different"
📍 Location: Center-right
👆 [Highlighted on image]

...

🎯 Solved! Found X differences in Y steps.
⏱️ Time: Z seconds
```

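The transcript above could be produced by a small formatter on the Hermes side. A minimal sketch, assuming each finding is a dict with `description` and `location` keys and that one step is counted per difference; both are assumptions, and `format_solution` is a hypothetical name:

```python
def format_solution(differences: list[dict], elapsed_seconds: float) -> str:
    """Render findings in a layout close to the transcript shown above.

    Each item needs "description" and "location" keys (assumed schema).
    """
    lines = ["🔍 Scanning image..."]
    for number, diff in enumerate(differences, start=1):
        lines += [
            f'✅ Difference #{number} found: "{diff["description"]}"',
            f'📍 Location: {diff["location"]}',
            "👆 [Highlighted on image]",
            "",  # blank line between findings
        ]
    count = len(differences)
    # Assumption: one step per difference, so X and Y are the same number here.
    lines.append(f"🎯 Solved! Found {count} differences in {count} steps.")
    lines.append(f"⏱️ Time: {elapsed_seconds:.1f} seconds")
    return "\n".join(lines)
```

Returning plain text keeps the formatter independent of the frontend, which can split on blank lines to animate one finding at a time.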
## Comparison with Other Ideas

| Aspect | 001 Visual Narrative | 007 Spot the Difference |
|--------|---------------------|------------------------|
| Visual Analysis | Heavy | **Heavy** |
| Reasoning | Medium | **Light** |
| Demo Impact | High | **High** |
| Gamification | Low | **High** |
| Uniqueness | 7/10 | **9/10** |
| Step-by-Step | Yes | **Yes (more natural)** |

## Why Stronger than 001

1. **Tangible use case:** People actually play spot the difference
2. **Clear AI demonstration:** "Watch AI see what you see"
3. **Gamification:** Daily puzzle + leaderboard = engagement
4. **Low reasoning, high vision:** Perfect for Kimi's strength
5. **Step-by-step is natural:** Not forced, it's how you'd solve it

## Risks

- ⚠️ Need reliable daily puzzle generation (harder than it sounds)
- ⚠️ Kimi's analysis quality depends on image complexity
- ⚠️ Need a diverse puzzle set to avoid repeats

## Next Steps

- [ ] Test Kimi's spot-the-difference capability
- [ ] Design puzzle generation pipeline
- [ ] Mock up webapp UI
- [ ] Prototype step-by-step visualization

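The first item, testing Kimi's capability, could be made measurable by comparing predicted regions against a small hand-labelled set. A minimal sketch, assuming both sides are boxes in the same normalized coordinates; the greedy matching, the 0.3 IoU threshold, and the `iou` and `score` helpers are illustrative choices:

```python
Box = tuple[float, float, float, float]  # x, y, width, height


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def score(predicted: list[Box], truth: list[Box], threshold: float = 0.3) -> tuple[float, float]:
    """Return (precision, recall) using greedy one-to-one matching."""
    unmatched = list(truth)
    hits = 0
    for p in predicted:
        best = max(unmatched, key=lambda t: iou(p, t), default=None)
        if best is not None and iou(p, best) >= threshold:
            unmatched.remove(best)  # each labelled difference can be matched once
            hits += 1
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(truth) if truth else 0.0
    return precision, recall
```

Recall shows how many real differences the model misses, while precision shows how often it reports differences that are not there; tracking both across image complexity would speak directly to the quality risk listed above.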
## Related Ideas

- See: `001-visual-narrative-agent.md`