feat: Initial commit - Hermes Detective Agency concept
- Hermes Detective Agency: Open-ended mystery investigation game
- Roles: Chief (human), Witness (Kimi), Detective (Hermes)
- 5 difficulty levels, community cases, open-ended solving
- Scoring: Alignment %, Evidence %, Time
- Features: Retry, Journal, Observe mode
- Tech: Kimi Vision + Hermes Agent + Pollinations

Changelog:
- Research phase: Kimi capabilities, Hermes agent, image APIs
- Brainstorming: 14 ideas explored
- Comparison matrix: Detective selected as winner
- Concept finalized with all design decisions
138
docs/ideas/007-vision-spot-the-difference.md
Normal file
@@ -0,0 +1,138 @@
# Idea 007: Spot the Difference Agent

**Date:** 2026-04-19
**Status:** Idea
**Tags:** hermes-agent, kimi-vision, puzzle, gamification, webapp

## Concept

A daily "Spot the Difference" puzzle webapp in which the AI (Kimi + Hermes) analyzes two images and shows, step by step, how it finds the differences.

**Core insight:** Play to the model's visual-analysis strength and keep the reasoning load low.

## User Flow

1. User opens the webapp → sees today's "Spot the Difference" puzzle (two similar images)
2. User can play manually (clicking on the differences) OR
3. User clicks "Let AI Solve" → watches the AI's step-by-step analysis
4. The AI shows its reasoning process: "Scanning left-to-right... Found difference #1: color mismatch in top-left..."
5. A leaderboard shows anonymous attempt stats

## Why This Works

| Aspect | Implementation |
|--------|----------------|
| **Visual Analysis** | Kimi compares images pixel-level + semantic |
| **Low Reasoning** | Pattern matching, not complex logic |
| **Step-by-Step** | Show each finding with visual highlight |
| **Gamification** | Daily puzzle, leaderboard, no auth |

## Puzzle Types

### Primary: Spot the Difference (v1)
- Two images with subtle differences
- Kimi identifies all differences
- Each found difference highlighted + explanation

### Secondary (future):
- Find the anomaly (what's wrong in this image?)
- Count the objects (how many X in this image?)
- What's different? (semantic analysis)

## Technical Stack

| Component | Source | Role |
|-----------|--------|------|
| Frontend | Single HTML page | Display puzzle, show AI process |
| Image Analysis | Kimi Vision (via gateway) | Compare images, find differences |
| Orchestration | Hermes Agent | Coordinate flow, format output |
| Image Gen | Pollinations AI | Generate daily puzzle pairs |

### Daily Puzzle Generation
```
Hermes + Pollinations → Generate base image
Hermes + Pollinations → Generate modified image (with subtle changes)
Store both → Serve to users daily
```

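The generation flow above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the project's pipeline: it assumes Pollinations serves images from a URL of the form `https://image.pollinations.ai/prompt/<prompt>` and accepts `seed`, `width`, and `height` query parameters, which should be checked against the current Pollinations documentation. `PuzzlePair`, `image_url`, and `daily_pair` are hypothetical names. Reusing one seed for both prompts is only a heuristic for keeping the two images similar; it does not guarantee that nothing else changes, which is the generation risk noted under Risks.

```python
from dataclasses import dataclass
from datetime import date
from urllib.parse import quote, urlencode

# Assumed Pollinations image endpoint; confirm against the current docs.
POLLINATIONS_BASE = "https://image.pollinations.ai/prompt/"


@dataclass(frozen=True)
class PuzzlePair:
    """URLs for the two images that make up one day's puzzle."""
    puzzle_date: date
    base_url: str
    modified_url: str


def image_url(prompt: str, seed: int, size: int = 768) -> str:
    """Build a generation URL. `seed`, `width`, `height` are assumed parameters."""
    query = urlencode({"seed": seed, "width": size, "height": size})
    return f"{POLLINATIONS_BASE}{quote(prompt, safe='')}?{query}"


def daily_pair(day: date, scene: str, changes: list[str]) -> PuzzlePair:
    """Derive both image URLs for one day from a scene and a list of edits."""
    seed = day.toordinal()  # one deterministic seed per calendar day
    modified_prompt = f"{scene}, but " + " and ".join(changes)
    return PuzzlePair(day, image_url(scene, seed), image_url(modified_prompt, seed))
```

Building URLs without fetching keeps the sketch free of network calls; a real pipeline would download both images, store them, and most likely verify that the pair actually differs in the intended places before publishing.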
### AI Solving Process
```
1. Hermes receives both images
2. Send to Kimi Vision for analysis
3. Kimi returns list of differences with locations
4. Hermes formats step-by-step explanation
5. Frontend animates each finding
```

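Step 3 of the process above leaves the shape of Kimi's reply open. One option, sketched here as an assumption, is for Hermes to prompt Kimi to answer with a JSON array and to validate it before formatting. The schema (`description`, `location`, and a `box` of `[x, y, width, height]` given as fractions of the image) and the names `Difference` and `parse_differences` are hypothetical, not part of any Kimi or Hermes API.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Difference:
    description: str
    location: str
    box: tuple[float, float, float, float]  # x, y, width, height as image fractions


def _is_number(value: object) -> bool:
    # bool is a subclass of int, so exclude it explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def parse_differences(raw: str) -> list[Difference]:
    """Parse the model's JSON reply, skipping entries that fail validation."""
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []
    found = []
    for item in items:
        if not isinstance(item, dict):
            continue
        description = item.get("description")
        box = item.get("box")
        if not isinstance(description, str) or not description.strip():
            continue
        if not (isinstance(box, list) and len(box) == 4 and all(map(_is_number, box))):
            continue
        x, y, w, h = (float(v) for v in box)
        # Reject boxes that are empty or extend past the image edges.
        if not (x >= 0 and y >= 0 and w > 0 and h > 0 and x + w <= 1 and y + h <= 1):
            continue
        found.append(Difference(description.strip(), str(item.get("location", "")), (x, y, w, h)))
    return found
```

Dropping malformed entries instead of raising means a partly broken reply still yields the findings that did validate, which suits a demo where the frontend animates findings one by one.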
## Features

### Core
- [ ] Daily puzzle auto-rotates
- [ ] Two-image display (side by side)
- [ ] "Let AI Solve" button
- [ ] Step-by-step visualization of AI findings
- [ ] Show each difference with highlight + explanation

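For the auto-rotation item above, one simple approach is to derive the puzzle index from the calendar date, so every visitor sees the same puzzle on a given day without any server-side state. This is a sketch under assumptions: `LAUNCH_DAY` and `puzzle_index` are hypothetical names, and the launch date simply reuses this document's date.

```python
from datetime import date

# Hypothetical launch date, borrowed from this document's date field.
LAUNCH_DAY = date(2026, 4, 19)


def puzzle_index(today: date, puzzle_count: int) -> int:
    """Map a calendar day to a position in the stored puzzle list."""
    if puzzle_count <= 0:
        raise ValueError("puzzle_count must be positive")
    # Python's % returns a non-negative result for a positive modulus,
    # so dates before the launch day still map to a valid index.
    return (today - LAUNCH_DAY).days % puzzle_count
```

With a fixed puzzle list the rotation wraps around after `puzzle_count` days, so the stored set needs to keep growing if repeats are to be avoided.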
### Gamification (no auth)
- [ ] Attempt counter (per user session, localStorage)
- [ ] Leaderboard (anonymous, session-based)
- [ ] "Perfect solve" badge (AI found all differences on first pass)

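The leaderboard and badge rules above are not specified further. As one possible reading, the sketch below keeps each anonymous session's best attempt and ranks by differences found, then by time. The `Attempt` fields, the ranking order, and the `passes` argument are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attempt:
    session_id: str  # random ID kept in localStorage, not an account
    found: int       # differences found in this attempt
    total: int       # differences in the puzzle
    seconds: float   # time taken


def _sort_key(attempt: Attempt) -> tuple[int, float]:
    # More differences found ranks first; faster time breaks ties.
    return (-attempt.found, attempt.seconds)


def rank(attempts: list[Attempt]) -> list[Attempt]:
    """Keep each session's best attempt and order the leaderboard."""
    best: dict[str, Attempt] = {}
    for attempt in attempts:
        current = best.get(attempt.session_id)
        if current is None or _sort_key(attempt) < _sort_key(current):
            best[attempt.session_id] = attempt
    return sorted(best.values(), key=_sort_key)


def earns_perfect_badge(found: int, total: int, passes: int) -> bool:
    """True when every difference was found on the first pass."""
    return total > 0 and found == total and passes == 1
```

Because session IDs live in localStorage, clearing the browser creates a new identity, so a leaderboard like this is best treated as informal.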
### Nice to Have
- [ ] Difficulty levels (Easy/Medium/Hard)
- [ ] Share result as image
- [ ] Hint system (Kimi finds 1, user finds rest)

## Step-by-Step Output Format

```
🔍 Scanning image...
✅ Difference #1 found: "The lamp color changed from blue to red"
📍 Location: Top-left corner
👆 [Highlighted on image]

✅ Difference #2 found: "Window shape is slightly different"
📍 Location: Center-right
👆 [Highlighted on image]

...

🎯 Solved! Found X differences in Y steps.
⏱️ Time: Z seconds
```

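A formatter that produces the transcript above might look like the following sketch. It assumes each finding carries a description and a normalized center point, and it derives the location label from a 3x3 grid; `Finding`, `location_label`, and `render_transcript` are hypothetical names. The document does not define what counts as a "step", so the step count is passed in by the caller.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    description: str
    center_x: float  # 0.0 (left edge) to 1.0 (right edge)
    center_y: float  # 0.0 (top edge) to 1.0 (bottom edge)


def location_label(x: float, y: float) -> str:
    """Name the cell of a 3x3 grid that contains the point (x, y)."""
    col = "left" if x < 1 / 3 else "right" if x > 2 / 3 else "center"
    row = "Top" if y < 1 / 3 else "Bottom" if y > 2 / 3 else "Center"
    if row == "Center" and col == "center":
        return "Center"
    label = f"{row}-{col}"
    # Only the four true corners get the " corner" suffix.
    if row != "Center" and col != "center":
        label += " corner"
    return label


def render_transcript(findings: list[Finding], steps: int, seconds: float) -> str:
    """Render findings in the step-by-step text format."""
    lines = ["🔍 Scanning image..."]
    for number, finding in enumerate(findings, start=1):
        lines += [
            f'✅ Difference #{number} found: "{finding.description}"',
            f"📍 Location: {location_label(finding.center_x, finding.center_y)}",
            "👆 [Highlighted on image]",
            "",  # blank line between findings
        ]
    lines.append(f"🎯 Solved! Found {len(findings)} differences in {steps} steps.")
    lines.append(f"⏱️ Time: {seconds:.1f} seconds")
    return "\n".join(lines)
```

Deriving the label from coordinates rather than trusting a free-text location keeps the wording consistent with the highlight the frontend draws.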
## Comparison with Other Ideas

| Aspect | 001 Visual Narrative | 007 Spot the Difference |
|--------|---------------------|------------------------|
| Visual Analysis | Heavy | **Heavy** |
| Reasoning | Medium | **Light** |
| Demo Impact | High | **High** |
| Gamification | Low | **High** |
| Uniqueness | 7/10 | **9/10** |
| Step-by-Step | Yes | **Yes (more natural)** |

## Why Stronger than 001

1. **Tangible use case**: People actually play spot the difference
2. **Clear AI demonstration**: "Watch AI see what you see"
3. **Gamification**: Daily puzzle + leaderboard = engagement
4. **Low reasoning, high vision**: Perfect for Kimi's strength
5. **Step-by-step is natural**: Not forced, it's how you'd solve it

## Risks

- ⚠️ Need reliable daily puzzle generation (harder than it sounds)
- ⚠️ Kimi's analysis quality depends on image complexity
- ⚠️ Need a diverse puzzle set to avoid repeats

## Next Steps

- [ ] Test Kimi's spot-the-difference capability
- [ ] Design puzzle generation pipeline
- [ ] Mock up webapp UI
- [ ] Prototype step-by-step visualization

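For the first item above, testing Kimi's capability needs some way to score its output against known answers. A minimal sketch, assuming both the model's findings and the ground truth are available as `(x, y, width, height)` boxes: match them greedily by intersection-over-union (IoU) and report precision and recall. The `0.3` threshold and the function names are assumptions, and greedy matching is a simplification of optimal assignment.

```python
Box = tuple[float, float, float, float]  # x, y, width, height


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    overlap_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    overlap_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = overlap_w * overlap_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


def score(predicted: list[Box], truth: list[Box], threshold: float = 0.3) -> tuple[float, float]:
    """Return (precision, recall) using greedy one-to-one box matching."""
    unmatched = list(truth)
    hits = 0
    for box in predicted:
        best = max(unmatched, key=lambda t: iou(box, t), default=None)
        if best is not None and iou(box, best) >= threshold:
            unmatched.remove(best)  # each true difference can be matched once
            hits += 1
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(truth) if truth else 0.0
    return precision, recall
```

Recall shows how many real differences the model found, while precision penalizes invented ones; both matter for a demo in which every reported finding is highlighted on screen.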
## Related Ideas

- See: `001-visual-narrative-agent.md`