feat: Initial commit - Hermes Detective Agency concept
- Hermes Detective Agency: Open-ended mystery investigation game
- Roles: Chief (human), Witness (Kimi), Detective (Hermes)
- 5 difficulty levels, community cases, open-ended solving
- Scoring: Alignment %, Evidence %, Time
- Features: Retry, Journal, Observe mode
- Tech: Kimi Vision + Hermes Agent + Pollinations

Changelog:
- Research phase: Kimi capabilities, Hermes agent, image APIs
- Brainstorming: 14 ideas explored
- Comparison matrix: Detective selected as winner
- Concept finalized with all design decisions
79
docs/ideas/001-visual-narrative-agent.md
Normal file
@@ -0,0 +1,79 @@
# Idea 001: Visual Narrative Agent

**Date:** 2026-04-19
**Status:** Idea
**Tags:** hermes-agent, kimi-vision, storytelling, image-generation

## Concept

An agentic storytelling system where Hermes orchestrates a narrative loop with Kimi's visual analysis and built-in image generation skills to produce coherent visual stories.

## User Flow

1. User provides text prompt (e.g., "A lone astronaut discovers an ancient alien garden on Mars")
2. Hermes plans story structure (scenes, pacing, visual style)
3. For each scene:
   - Hermes generates image prompt
   - Generate image (Hermes built-in skill: manim / ascii)
   - Kimi analyzes generated image
   - Kimi's feedback refines next scene's prompt
4. Return compiled visual story to user

## Key Differentiator

Most story-to-image tools: **Generate → Done**

This concept: **Generate → Analyze → Refine → Loop**

Kimi serves as the **visual reasoning engine**: it tells Hermes whether the generated image matches the intended scene, catches inconsistencies, and informs prompt refinement for the next scene.
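
The Generate → Analyze → Refine → Loop cycle can be sketched as follows. This is a minimal sketch under stated assumptions, not the project's implementation: `Scene`, `run_story_loop`, and the three callables are hypothetical names, and the stand-in functions replace real calls to the image generator and to Kimi so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Scene:
    prompt: str
    image_ref: str = ""
    feedback: str = ""


def run_story_loop(
    scene_prompts: List[str],
    generate: Callable[[str], str],
    analyze: Callable[[str, str], str],
    refine: Callable[[str, str], str],
) -> List[Scene]:
    """Generate -> Analyze -> Refine, repeated for each planned scene."""
    scenes: List[Scene] = []
    feedback = ""
    for base_prompt in scene_prompts:
        # Fold the previous scene's feedback into this scene's prompt.
        prompt = refine(base_prompt, feedback) if feedback else base_prompt
        image_ref = generate(prompt)
        feedback = analyze(image_ref, prompt)
        scenes.append(Scene(prompt=prompt, image_ref=image_ref, feedback=feedback))
    return scenes


# Hypothetical stand-ins; real versions would call the image API and Kimi.
def fake_generate(prompt: str) -> str:
    return f"image-for:{prompt}"


def fake_analyze(image_ref: str, prompt: str) -> str:
    return "keep the red spacesuit"


def fake_refine(prompt: str, feedback: str) -> str:
    return f"{prompt} ({feedback})"


story = run_story_loop(
    ["astronaut lands", "astronaut finds garden"],
    fake_generate,
    fake_analyze,
    fake_refine,
)
```

Passing the three steps in as callables keeps the loop itself independent of any particular image provider or vision gateway.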

## Tech Stack

| Component | Source | Role |
|-----------|--------|------|
| Hermes Agent | Nous Research | Orchestration, planning, decision loop |
| Kimi Vision | Moonshot AI (via gateway) | Image analysis, visual feedback |
| Image Generation | Pollinations AI | Free tier, multiple models (Flux, etc.) |

### Image Generation Options

| Provider | Free Tier | Quality | Use Case |
|---------|-----------|---------|----------|
| **Pollinations** ✅ | ✅ Yes | Good | Primary (simple, free) |
| **Flux (local)** | ✅ Free | High | If GPU available |
| **Hermes skills** | ✅ Free | Niche | Fallback/ASCII aesthetic |

### Pollinations API (Primary)
- **Endpoint:** `https://gen.pollinations.ai/image/{prompt}`
- **Models:** flux, zimage, wan-image, qwen-image, etc.
- **Cost:** Free tier (Pollen credits); paid usage at roughly $1 per Pollen
- **Auth:** Optional for free tier
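
A minimal sketch of building a request URL for the endpoint pattern listed above. The prompt is percent-encoded into the path. The `model` query parameter and the `build_image_url` helper are assumptions made for illustration and should be checked against the Pollinations documentation before use.

```python
from urllib.parse import quote, urlencode

# Endpoint pattern taken from the list above.
BASE_URL = "https://gen.pollinations.ai/image/"


def build_image_url(prompt: str, model: str = "flux") -> str:
    """Return a GET URL for one image; `model` as a query key is an assumption."""
    encoded_prompt = quote(prompt, safe="")  # spaces become %20, slashes are escaped
    query = urlencode({"model": model})
    return f"{BASE_URL}{encoded_prompt}?{query}"


url = build_image_url("ancient alien garden on Mars")
```

Encoding with `safe=""` matters because a prompt containing `/` would otherwise be read as extra path segments.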

## Strengths

- ✅ Combines Hermes + Kimi + Pollinations natively
- ✅ Agentic visual feedback loop is unique
- ✅ Visual coherence check via Kimi ensures quality
- ✅ Free tier = low barrier to test
- ✅ User controls output format (default: image)

## Weaknesses

- ⚠️ Pollinations quality vs DALL-E/Midjourney (needs testing)
- ⚠️ Kimi requires gateway access (no direct API key)
- ⚠️ Loop adds latency (generate → analyze → refine)
- ⚠️ Need to verify Pollinations reliability

## Uniqueness Score

**7/10**: The agentic visual feedback loop is novel, but whether built-in image generation is compelling enough still needs to be verified.

## Next Steps

- [ ] Explore Hermes built-in image skills (manim, ascii)
- [ ] Define output format options
- [ ] Sketch technical architecture

## Related Ideas

- See: `002-xxx.md`, `003-xxx.md` for alternatives
138
docs/ideas/007-vision-spot-the-difference.md
Normal file
@@ -0,0 +1,138 @@
# Idea 007: Spot the Difference Agent

**Date:** 2026-04-19
**Status:** Idea
**Tags:** hermes-agent, kimi-vision, puzzle, gamification, webapp

## Concept

A daily "Spot the Difference" puzzle webapp where AI (Kimi + Hermes) analyzes two images and shows, step by step, how it finds the differences.

**Core insight:** Use visual analysis strength, minimize reasoning load.

## User Flow

1. User opens webapp → sees today's "Spot the Difference" puzzle (two similar images)
2. User can play manually (click on differences) OR
3. User clicks "Let AI Solve" → watches AI's step-by-step analysis
4. AI shows its reasoning process: "Scanning left-to-right... Found difference #1: color mismatch in top-left..."
5. Leaderboard shows attempt stats (anonymous)

## Why This Works

| Aspect | Implementation |
|--------|----------------|
| **Visual Analysis** | Kimi compares the images at both pixel and semantic level |
| **Low Reasoning** | Pattern matching, not complex logic |
| **Step-by-Step** | Show each finding with visual highlight |
| **Gamification** | Daily puzzle, leaderboard, no auth |

## Puzzle Types

### Primary: Spot the Difference (v1)
- Two images with subtle differences
- Kimi identifies all differences
- Each found difference highlighted + explanation

### Secondary (future):
- Find the anomaly (what's wrong in this image?)
- Count the objects (how many X in this image?)
- What's different? (semantic analysis)

## Technical Stack

| Component | Source | Role |
|-----------|--------|------|
| Frontend | Single HTML page | Display puzzle, show AI process |
| Image Analysis | Kimi Vision (via gateway) | Compare images, find differences |
| Orchestration | Hermes Agent | Coordinate flow, format output |
| Image Gen | Pollinations AI | Generate daily puzzle pairs |

### Daily Puzzle Generation
```
Hermes + Pollinations → Generate base image
Hermes + Pollinations → Generate modified image (with subtle changes)
Store both → Serve to users daily
```
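
Once pairs are stored, one simple way to serve a different pair each day is to index the stored list by calendar date. The sketch below assumes the pairs are kept as an ordered list of file names; `puzzle_for_day` and the file names are hypothetical.

```python
from datetime import date, timedelta
from typing import List, Tuple

PuzzlePair = Tuple[str, str]  # (base image, modified image)


def puzzle_for_day(pairs: List[PuzzlePair], day: date) -> PuzzlePair:
    """Pick the stored pair for a given day, cycling once the list is exhausted."""
    if not pairs:
        raise ValueError("no stored puzzle pairs")
    return pairs[day.toordinal() % len(pairs)]


pairs = [
    ("day1_a.png", "day1_b.png"),
    ("day2_a.png", "day2_b.png"),
    ("day3_a.png", "day3_b.png"),
]
today = date(2026, 4, 19)
todays_pair = puzzle_for_day(pairs, today)
tomorrows_pair = puzzle_for_day(pairs, today + timedelta(days=1))
```

Because the index depends only on the date, every visitor sees the same puzzle on a given day without any server-side state. The list repeats after `len(pairs)` days, so it needs to keep growing to avoid repeats.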

### AI Solving Process
```
1. Hermes receives both images
2. Send to Kimi Vision for analysis
3. Kimi returns list of differences with locations
4. Hermes formats step-by-step explanation
5. Frontend animates each finding
```

## Features

### Core
- [ ] Daily puzzle auto-rotates
- [ ] Two-image display (side by side)
- [ ] "Let AI Solve" button
- [ ] Step-by-step visualization of AI findings
- [ ] Show each difference with highlight + explanation

### Gamification (no auth)
- [ ] Attempt counter (per user session, localStorage)
- [ ] Leaderboard (anonymous, session-based)
- [ ] "Perfect solve" badge (AI found all differences on first pass)

### Nice to Have
- [ ] Difficulty levels (Easy/Medium/Hard)
- [ ] Share result as image
- [ ] Hint system (Kimi finds 1, user finds rest)

## Step-by-Step Output Format

```
🔍 Scanning image...
✅ Difference #1 found: "The lamp color changed from blue to red"
📍 Location: Top-left corner
👆 [Highlighted on image]

✅ Difference #2 found: "Window shape is slightly different"
📍 Location: Center-right
👆 [Highlighted on image]

...

🎯 Solved! Found X differences in Y steps.
⏱️ Time: Z seconds
```
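
The formatting step (Hermes turning Kimi's findings into the output above) might look like the sketch below. It assumes Kimi's answer has already been parsed into a list of description and location pairs; `Difference` and `format_findings` are hypothetical names, and the number of steps is passed in because the document does not define how steps are counted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Difference:
    description: str
    location: str


def format_findings(differences: List[Difference], steps: int, seconds: float) -> str:
    """Render parsed findings in the step-by-step output format."""
    lines = ["🔍 Scanning image..."]
    for number, diff in enumerate(differences, start=1):
        lines.append(f'✅ Difference #{number} found: "{diff.description}"')
        lines.append(f"📍 Location: {diff.location}")
        lines.append("👆 [Highlighted on image]")
        lines.append("")  # blank line between findings
    lines.append(f"🎯 Solved! Found {len(differences)} differences in {steps} steps.")
    lines.append(f"⏱️ Time: {seconds:.0f} seconds")
    return "\n".join(lines)


report = format_findings(
    [
        Difference("The lamp color changed from blue to red", "Top-left corner"),
        Difference("Window shape is slightly different", "Center-right"),
    ],
    steps=2,
    seconds=12.4,
)
```

Keeping the findings as structured data until this last step also gives the frontend what it needs to animate each highlight separately.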

## Comparison with Other Ideas

| Aspect | 001 Visual Narrative | 007 Spot the Difference |
|--------|---------------------|------------------------|
| Visual Analysis | Heavy | **Heavy** |
| Reasoning | Medium | **Light** |
| Demo Impact | High | **High** |
| Gamification | Low | **High** |
| Uniqueness | 7/10 | **9/10** |
| Step-by-Step | Yes | **Yes (more natural)** |

## Why Stronger than 001

1. **Tangible use case**: People actually play spot the difference
2. **Clear AI demonstration**: "Watch AI see what you see"
3. **Gamification**: Daily puzzle + leaderboard = engagement
4. **Low reasoning, high vision**: Perfect for Kimi's strength
5. **Step-by-step is natural**: Not forced, it's how you'd solve it

## Risks

- ⚠️ Need reliable daily puzzle generation (harder than it sounds)
- ⚠️ Kimi analysis quality depends on image complexity
- ⚠️ Need diverse puzzle set to not repeat

## Next Steps

- [ ] Test Kimi's spot-the-difference capability
- [ ] Design puzzle generation pipeline
- [ ] Mock up webapp UI
- [ ] Prototype step-by-step visualization

## Related Ideas

- See: `001-visual-narrative-agent.md`
397
docs/ideas/008-visual-detective.md
Normal file
@@ -0,0 +1,397 @@
# Idea 008: Visual Detective

**Date:** 2026-04-19

## Concept

Upload a "crime scene" or mystery image. Kimi analyzes every detail. Hermes pieces together clues and generates a detective story/hypothesis.

## Why Strong

- Heavy visual analysis (Kimi reads the scene)
- Low reasoning (observation, not complex logic)
- Storytelling naturally fits step-by-step
- Mystery genre = engaging

## User Flow

1. Upload image (or get random daily mystery)
2. Kimi: "I see a broken window, muddy footprints, overturned chair..."
3. Hermes: "Based on these clues, here's what likely happened..."
4. Output: Detective story with visual evidence

## Tech

- Kimi Vision: Scene analysis
- Hermes: Narrative orchestration
- Pollinations: Generate mystery images

## Unique?

- Nobody's doing "AI detective" with your photos
- Could be daily mystery + community solving

---

## 009: Image Tarot Reader

**Date:** 2026-04-19

## Concept

Upload any image. AI interprets it like a tarot card reading.

## Why Strong

- Fun/flirty, low stakes
- Heavy visual analysis (Kimi interprets symbolism)
- Storytelling fits perfectly
- Shareable results

## User Flow

1. Upload image OR random draw
2. Kimi: Analyzes composition, colors, objects, mood
3. Hermes: "This represents [Tarot card]. Your reading: [Narrative]"
4. Output: Tarot card + 3-card spread interpretation

## Step-by-Step

```
🃏 Drawing your card...
👁️ Analyzing your image...

Visual Elements Detected:
• A winding road (path in life)
• Setting sun (endings/new beginnings)
• Standing figure (you, the observer)

🎴 Your Card: The Fool
Interpretation: A new journey awaits. Trust the path ahead...

Past: Confusion about direction
Present: Standing at the crossroads
Future: Leap of faith required
```

## Tech

- Kimi Vision: Symbol analysis
- Hermes: Tarot narrative generation
- Pollinations: Generate thematic card visuals

---

## 010: Color Emotion Translator

**Date:** 2026-04-19

## Concept

Upload image. AI analyzes dominant colors and translates them into emotions/mood.

## Why Strong

- Pure visual analysis
- Art/design focused
- Generates color palette + emotion report
- Useful for designers

## User Flow

1. Upload image
2. Kimi: Extracts colors, analyzes saturation, harmony
3. Hermes: Translates to emotions, generates palette
4. Output: Color palette + emotion breakdown + suggested uses

## Step-by-Step

```
🔍 Scanning colors...
🎨 Extracting dominant palette...

Detected Colors:
• #2D4A3E (Deep Forest Green) - 45%
• #F5E6D3 (Warm Cream) - 30%
• #8B4513 (Saddle Brown) - 15%
• #CD853F (Peru Gold) - 10%

🎭 Emotional Profile:
Primary: Grounded, natural, calm
Secondary: Warm, nostalgic, organic
Accent: Vintage, artisanal, trustworthy

💡 Recommendations:
• Brand Identity for eco-friendly products
• Interior design: cozy cabin aesthetic
• Packaging: artisanal food products
```
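
The "Detected Colors" percentages could also be computed directly from pixel data rather than estimated by a vision model. The sketch below counts exact RGB values in a list of pixels; `dominant_palette` is a hypothetical helper, and a real image would first need its colors quantized into a small number of buckets, which is not shown here.

```python
from collections import Counter
from typing import Iterable, List, Tuple

RGB = Tuple[int, int, int]


def dominant_palette(pixels: Iterable[RGB], top_n: int = 4) -> List[Tuple[str, float]]:
    """Return the top_n colors as (hex code, percent of all pixels)."""
    counts = Counter(pixels)
    total = sum(counts.values())
    if total == 0:
        return []
    return [
        (f"#{r:02X}{g:02X}{b:02X}", round(100 * n / total, 1))
        for (r, g, b), n in counts.most_common(top_n)
    ]


# Synthetic pixel list matching the sample report above.
pixels = (
    [(45, 74, 62)] * 45
    + [(245, 230, 211)] * 30
    + [(139, 69, 19)] * 15
    + [(205, 133, 63)] * 10
)
palette = dominant_palette(pixels)
```

With the synthetic input, the first entry is `("#2D4A3E", 45.0)`, matching the first line of the sample report.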

---

## 011: Before/After Time Machine

**Date:** 2026-04-19

## Concept

Upload an old/historical photo. AI shows what it would look like today or vice versa.

## Why Strong

- Historical/educational angle
- Visual transformation is compelling
- Shows AI's understanding of time/changes

## User Flow

1. Upload old OR new photo
2. Select transformation direction
3. Kimi: Analyzes context, era, subject
4. Hermes: Predicts/adapts to target era
5. Output: Side-by-side transformation

## Step-by-Step

```
📸 Analyzing source image...
📅 Detected era: 1950s New York Street

Identifying elements:
• Black & white photography style
• Vintage automobiles (1950s models)
• Fashion: fedoras, swing coats
• Architecture: Art Deco buildings

🔮 Projecting to 2024...

Transformation breakdown:
• Colorization: Added natural skin tones + sky colors
• Vehicles: Replaced with modern equivalents
• Architecture: Updated signage, added modern elements
• Fashion: Modernized while preserving style

✨ Your 1950s scene in 2024!
```

---

## 012: Visual Haiku Generator

**Date:** 2026-04-19

## Concept

Upload any image. AI generates a haiku (5-7-5) based on visual elements.

## Why Strong

- Minimal reasoning, pure visual
- Artistic/creative output
- Japanese aesthetic + AI = unique
- Highly shareable

## User Flow

1. Upload image
2. Kimi: Analyzes scene, mood, elements
3. Hermes: Crafts haiku (strict 5-7-5)
4. Output: Image + haiku + syllable breakdown

## Step-by-Step

```
🖼️ Analyzing your image...

Scene Elements:
• Autumn forest path
• Golden leaves falling
• Soft morning light through trees

✍️ Crafting haiku...

Forest whispers low
Golden footsteps on the leaves
Silence speaks aloud

📝 Syllable breakdown:
For-est (2) + whis-pers (2) + low (1) = 5
Gold-en (2) + foot-steps (2) + on (1) + the (1) + leaves (1) = 7
Si-lence (2) + speaks (1) + a-loud (2) = 5
(5) - (7) - (5) ✅
```

---

## 013: Image Alchemy

**Date:** 2026-04-19

## Concept

Upload two random images. AI "fuses" them into a new concept based on their shared elements.

## Why Strong

- Surprising/comedic combinations
- Pure visual + semantic analysis
- Unique creative output
- Viral potential

## User Flow

1. Upload image A (or random)
2. Upload image B (or random)
3. Kimi: Analyzes both separately
4. Hermes: Finds connections, creates fusion
5. Output: New concept + fused image prompt

## Step-by-Step

```
🌀 Analyzing Image A: A Viking ship
• Norse aesthetic
• Ocean voyage
• Historical warrior culture

🌀 Analyzing Image B: A Coffee shop
• Cozy atmosphere
• Barista craft
• Modern social space

🔮 Alchemizing...

Found connections:
• Craft (warrior's craft → barista's craft)
• Ritual (battle ritual → coffee ritual)
• Journey (ocean voyage → daily commute)

⚗️ Alchemy Result:

"THE VIKING BARISTA"

A warrior of the morning,
steering through storms of exhaustion,
claiming the sacred cup.

Your coffee shop serves mead in horn-shaped mugs,
the barista wears a helmet of foam,
and every latte is a conquest.
```

---

## 014: Visual Lie Detector

**Date:** 2026-04-19

## Concept

Upload a photo + claim. AI analyzes if the image supports or contradicts the claim.

## Why Strong

- Useful in era of fake news
- Pure visual verification
- Educational about image analysis
- "Is this real?" tool

## User Flow

1. Paste claim + upload image
2. Kimi: Analyzes image details
3. Hermes: Compares claim vs evidence
4. Output: Verdict + reasoning

## Step-by-Step

```
🔍 Analyzing claim: "This photo was taken in Paris"

🔬 Image Analysis:
• Architecture: Haussmannian buildings ✓
• Street signs: French ✓
• License plates: European format ✓
• Language: French on signs ✓
• Vegetation: Consistent with Paris climate ✓
• Shadows: Consistent with claimed time of day ✓

✅ VERDICT: LIKELY AUTHENTIC

Confidence: 94%
Supporting evidence: 6/6 elements match
Caveats: Metadata not verified
```
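
The "Supporting evidence" line can be derived mechanically from the individual checks. The sketch below is only one possible rule: the two thresholds, the `LIKELY FALSE` and `INCONCLUSIVE` labels, and the `tally_verdict` name are assumptions, and the document does not say how the confidence percentage would be computed, so it is left out.

```python
from typing import Dict, Tuple


def tally_verdict(
    checks: Dict[str, bool],
    supports_at: float = 0.8,
    contradicts_at: float = 0.2,
) -> Tuple[str, str]:
    """Map per-element check results to a verdict and an evidence summary."""
    if not checks:
        return ("INCONCLUSIVE", "0/0 elements match")
    matched = sum(checks.values())
    ratio = matched / len(checks)
    if ratio >= supports_at:
        verdict = "LIKELY AUTHENTIC"
    elif ratio <= contradicts_at:
        verdict = "LIKELY FALSE"
    else:
        verdict = "INCONCLUSIVE"
    return (verdict, f"{matched}/{len(checks)} elements match")


# The six checks listed in the sample analysis, all passing.
paris_checks = {
    "architecture": True,
    "street_signs": True,
    "license_plates": True,
    "language": True,
    "vegetation": True,
    "shadows": True,
}
verdict, summary = tally_verdict(paris_checks)
```

Counting the checks in code keeps the summary line consistent with the checks actually shown to the user.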

---

## 015: Object Archaeology

**Date:** 2026-04-19

## Concept

Upload an object close-up. AI identifies it, tells its history/story.

## Why Strong

- Educational
- Heavy visual (identification + knowledge)
- Discovery/antiquities angle
- Could work with museum APIs

## User Flow

1. Upload object photo
2. Kimi: Visual identification + details
3. Hermes: Tells object's "story"
4. Output: Identity + history narrative

## Step-by-Step

```
🔍 Scanning object...

Visual Analysis:
• Material: Ceramic
• Style: Ming Dynasty blue and white
• Pattern: Dragon with cloud motifs
• Technique: Underglaze blue

🏺 Object Identified:
Ming Dynasty (1368-1644) Blue and White Porcelain
Dragon Pattern Bowl

📜 The Story:
This bowl was crafted during the reign of Emperor Wanli,
at the height of Jingdezhen's porcelain production.
The dragon motif signifies imperial power and protection...

[Full historical narrative]
```

---

## Quick Comparison Matrix

| # | Name | Visual | Reasoning | Uniqueness | Fun |
|---|------|--------|-----------|------------|-----|
| 007 | Spot the Difference | Heavy | Light | 9/10 | 8/10 |
| 008 | Visual Detective | Heavy | Light | 8/10 | 9/10 |
| 009 | Image Tarot | Heavy | Light | 8/10 | 10/10 |
| 010 | Color Emotion | Medium | Light | 7/10 | 7/10 |
| 011 | Before/After | Heavy | Medium | 8/10 | 8/10 |
| 012 | Visual Haiku | Heavy | Light | 9/10 | 8/10 |
| 013 | Image Alchemy | Heavy | Light | 10/10 | 10/10 |
| 014 | Lie Detector | Heavy | Medium | 9/10 | 8/10 |
| 015 | Object Archaeology | Heavy | Medium | 8/10 | 8/10 |

---

**My top picks for uniqueness + fun:**
1. **013 Image Alchemy**: Most unique, viral potential
2. **009 Image Tarot**: Fun, shareable, low friction
3. **007 Spot the Difference**: Game + AI demonstration
4. **014 Visual Lie Detector**: Useful, educational

What stands out to you?
132
docs/ideas/COMPARISON.md
Normal file
@@ -0,0 +1,132 @@
# Ideas Comparison Matrix

**Date:** 2026-04-19
**Purpose:** Compare all ideas to select final concept

---

## Scoring Criteria

| Criteria | Weight | Description |
|----------|--------|-------------|
| **Visual Analysis** | 30% | Heavy Kimi use (aligned with Kimi's strength) |
| **Multi-Turn** | 20% | Not single-turn, builds over time |
| **Human-AI Interaction** | 20% | Human participates, not passive |
| **Cost Efficiency** | 15% | Low API costs (image gen vs analysis) |
| **Uniqueness** | 10% | Stands out from competitors |
| **Fun/Engagement** | 5% | Enjoyable to play/watch |

**Scoring:** 1-5 (5 = best)

---

## Full Comparison Matrix

| # | Idea | Visual | Multi-Turn | Human-AI | Cost | Unique | Fun | **Total** |
|---|------|--------|------------|----------|------|--------|-----|-----------|
| 001 | Visual Narrative Agent | 4 | 4 | 3 | 2 | 3 | 4 | **3.5** |
| 002 | Visual Memory Journal | 3 | 3 | 2 | 3 | 4 | 3 | **3.0** |
| 003 | Design Critic | 3 | 2 | 2 | 3 | 2 | 3 | **2.6** |
| 004 | Visual Poem | 4 | 2 | 2 | 3 | 4 | 4 | **3.2** |
| 005 | Scene Journey | 4 | 3 | 2 | 2 | 3 | 4 | **3.2** |
| 007 | Spot the Difference | 4 | 2 | 3 | 2 | 4 | 5 | **3.4** |
| 008 | Visual Detective | 4 | 3 | 2 | 3 | 4 | 4 | **3.5** |
| 009 | Image Tarot | 4 | 2 | 3 | 3 | 4 | 5 | **3.5** |
| 013 | Image Alchemy | 4 | 2 | 3 | 2 | 5 | 5 | **3.6** |
| 014 | Lie Detector | 4 | 2 | 3 | 3 | 4 | 4 | **3.4** |
| 032v2 | Art Critic | 5 | 3 | 3 | 3 | 3 | 4 | **3.7** |
| **033v2** | **Detective** | **5** | **5** | **5** | **4** | **4** | **5** | **4.7** |
| 035 | Guess Artist | 5 | 2 | 3 | 3 | 3 | 4 | **3.5** |
| Auction | Auction | 3 | 4 | 5 | 4 | 4 | 4 | **3.9** |
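
For reference, a strict weighted sum using the weights from the Scoring Criteria table can be computed as in the sketch below. The dictionary keys and the `weighted_total` name are hypothetical. The Total column appears to have been entered by hand: the strict calculation gives 4.75 for 033v2 and 3.90 for Auction, but several other rows differ from it (013's scores, for example, work out to 3.25).

```python
from typing import Dict

# Weights in percent, from the Scoring Criteria table.
WEIGHTS: Dict[str, int] = {
    "visual": 30,
    "multi_turn": 20,
    "human_ai": 20,
    "cost": 15,
    "unique": 10,
    "fun": 5,
}


def weighted_total(scores: Dict[str, int]) -> float:
    """Weighted average of 1-5 scores, using integer percent weights."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


detective = {"visual": 5, "multi_turn": 5, "human_ai": 5, "cost": 4, "unique": 4, "fun": 5}
auction = {"visual": 3, "multi_turn": 4, "human_ai": 5, "cost": 4, "unique": 4, "fun": 4}
alchemy = {"visual": 4, "multi_turn": 2, "human_ai": 3, "cost": 2, "unique": 5, "fun": 5}

detective_total = weighted_total(detective)
auction_total = weighted_total(auction)
alchemy_total = weighted_total(alchemy)
```

Either way, 033v2 Detective remains the top-scoring idea by a wide margin.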

---

## Top Contenders

| Rank | Idea | Score | Key Strengths |
|------|------|-------|---------------|
| 🥇 | **033v2 Detective** | **4.7** | Best multi-turn, human directs, Kimi does real work |
| 🥈 | Auction | 3.9 | Human describes, human engages, cheap |
| 🥉 | 032v2 Art Critic | 3.7 | Kimi visual analysis, multi-turn |
| 4 | 013 Image Alchemy | 3.6 | Most unique, viral potential |
| 5 | 009 Image Tarot | 3.5 | Fun, shareable |

---

## 033v2 Detective: Why It Wins

### Alignment with User Goals

| User Goal | How Detective Meets It |
|-----------|----------------------|
| Heavy visual analysis | Kimi analyzes each piece of evidence |
| Low reasoning | Pattern matching, not complex logic |
| Multi-turn | 5-7 rounds per case |
| Human-AI collaboration | Human (Chief) directs the investigation |
| Cost efficient | Mostly text between Kimi calls |
| Fun/engagement | Mystery + competition |

### What Makes It Special

1. **Natural two-agent roles:** Witness (sees) + Detective (thinks)
2. **Human as boss:** Chief directs investigation, not passive observer
3. **Multi-turn structure:** Each round builds the case
4. **Kimi's strength shines:** Visual evidence analysis is the core mechanic
5. **Scoring system:** Track cases solved, rounds taken, accuracy
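
The round structure described above (Chief picks evidence, Witness describes it, Detective updates the theory) can be sketched as a simple loop. Everything here is an assumption for illustration: `CaseLog`, `run_case`, and the three callables are hypothetical, and the lambdas stand in for the human Chief, Kimi as Witness, and Hermes as Detective.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CaseLog:
    observations: List[str] = field(default_factory=list)
    hypotheses: List[str] = field(default_factory=list)


def run_case(
    evidence: List[str],
    choose_next: Callable[[List[str]], str],
    witness: Callable[[str], str],
    detective: Callable[[List[str]], str],
    max_rounds: int = 7,
) -> CaseLog:
    """One round = Chief picks evidence, Witness describes it, Detective theorizes."""
    log = CaseLog()
    remaining = list(evidence)
    for _ in range(max_rounds):
        if not remaining:
            break
        item = choose_next(remaining)  # the Chief directs the investigation
        remaining.remove(item)
        log.observations.append(witness(item))
        # The Detective sees every observation so far, so each round builds the case.
        log.hypotheses.append(detective(log.observations))
    return log


log = run_case(
    ["window.png", "footprints.png", "chair.png"],
    choose_next=lambda items: items[0],
    witness=lambda item: f"saw {item}",
    detective=lambda observations: f"theory from {len(observations)} clues",
)
```

The length of `log.hypotheses` gives the rounds taken, one of the quantities the scoring system is meant to track.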

### Comparison to Other Games

| Aspect | Spot the Difference | Tarot | Alchemy | **Detective** |
|--------|-------------------|-------|---------|---------------|
| Visual Analysis | 4 | 4 | 4 | **5** |
| Multi-Turn | 2 | 2 | 2 | **5** |
| Human Role | Judge | Receive | Submit | **Direct** |
| Narrative | None | Story | Surprise | **Full Mystery** |
| Replayability | Medium | Low | Medium | **High** |

---

## Recommendation

**Go with 033v2 Detective.**

### Why Not Others

| Idea | Why Not |
|------|---------|
| 001 Visual Narrative | Too similar to others, high cost |
| 007 Spot Difference | Fun but shallow (1-turn) |
| 009 Image Tarot | Not really interactive |
| 013 Image Alchemy | Unique but single interaction |
| Auction | Good but less "AI demonstration" |

### Detective's Edge

- **Multi-turn** = not just a quick demo
- **Human directs** = active participation
- **Kimi sees evidence** = clear AI capability showcase
- **Cost efficient** = mostly text
- **Daily cases** = reason to return

---

## Next Steps for 033v2 Detective

- [ ] Define case structure (5-7 evidence images)
- [ ] Design Chief interface (what buttons/actions)
- [ ] Plan Witness + Detective prompts
- [ ] Mock up UI
- [ ] Prototype with one case

---

## Appendix: Ideas That Could Combine with Detective

### Detective + Art Critic
Two types of daily content: Mystery case OR Art analysis

### Detective + Auction
Hybrid mode: Evidence auction where Chief describes to Detective

### Detective + Spot Difference
Mini-game within case: "Find the clue hidden in this photo"