hermes-detective/CHANGELOG.md
shoko ecfd0b1160 feat: Initial commit - Hermes Detective Agency concept
- Hermes Detective Agency: Open-ended mystery investigation game
- Roles: Chief (human), Witness (Kimi), Detective (Hermes)
- 5 difficulty levels, community cases, open-ended solving
- Scoring: Alignment %, Evidence %, Time
- Features: Retry, Journal, Observe mode
- Tech: Kimi Vision + Hermes Agent + Pollinations

Changelog:
- Research phase: Kimi capabilities, Hermes agent, image APIs
- Brainstorming: 14 ideas explored
- Comparison matrix: Detective selected as winner
- Concept finalized with all design decisions
2026-04-20 00:00:30 +00:00

# Session Log
## 2026-04-19
### 001 - Session Start: Hermes Hackathon
**What:** Started Hermes Agent Creative Hackathon collaboration.
**Context:**
- Hackathon: 16 days, $25k prizes (Main $15k, Kimi $5k, $5k Kimi credits)
- Presented by Kimi Moonshot & Nous Research
- Two tracks: Main (any creative use) and Kimi Track (must use Kimi models)
- Deadline: EOD Sunday, May 3rd
**Action:**
- Set up workflow structure (`.issues/`, `docs/`, git init)
- Created first issue file: `001-hermes-hackathon-project.md`
**Next:**
- Define project concept and creative domain focus
- Explore Hermes Agent capabilities
- Sketch initial prototype idea
---
### 002 - Research Completed
**What:** Validated Kimi and Hermes Agent capabilities.
**Findings:**
- Kimi K2.5: multimodal (text+image+video), video understanding, visual coding
- Kimi benchmarks: SWE-bench 65.8%, Tau2 ~64%
- Hermes 3: function calling, structured output, OpenAI-compatible
- Hermes built-in skills: manim-video, ascii-video, ascii-art (accessibility-focused)
**Action:**
- Created `docs/research-kimi-visual-capabilities.md`
- Created `docs/research-hermes-agent.md`
- Created `docs/research-image-generation-apis.md`
- Updated issue file with research summary
**Next:**
- Define concrete project concept
- Choose specific creative angle (visual coding? video analysis? image generation?)
- Start rapid prototyping
---
### 003 - Image Gen API Research
**What:** Found affordable/free image generation API.
**Findings:**
- **Pollinations AI** ✅: Free tier, OpenAI-compatible, multiple models (Flux, etc.)
  - Endpoint: `https://gen.pollinations.ai/image/{prompt}`
  - Simple: a plain `curl` request works; no auth needed for basic use
  - Models: flux, zimage, wan-image, qwen-image, gptimage
  - Cost: free tier (Pollen credits); paid usage at roughly $1 per Pollen
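
The endpoint noted above can be exercised with a small URL builder. This is a minimal sketch, assuming the prompt is passed as a single percent-encoded path segment and that the model is selected through a `model` query parameter. The parameter name and the helper `build_image_url` are assumptions for illustration and are not confirmed by these notes.

```python
from typing import Optional
from urllib.parse import quote, urlencode

BASE_URL = "https://gen.pollinations.ai/image"  # endpoint from the findings above


def build_image_url(prompt: str, model: Optional[str] = None) -> str:
    """Return an image URL for the given prompt (hypothetical helper)."""
    # Percent-encode the prompt so spaces and slashes stay inside one path segment.
    url = f"{BASE_URL}/{quote(prompt, safe='')}"
    if model is not None:
        # Assumption: the model is chosen via a `model` query parameter.
        url += "?" + urlencode({"model": model})
    return url


print(build_image_url("a foggy crime scene", model="flux"))
```

The resulting URL could then be fetched with `curl` or any HTTP client; the sketch only builds the string and makes no network request.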
**Action:**
- Created `docs/research-image-generation-apis.md`
- Updated idea 001 with image gen options
**Next:**
- Sketch more project ideas for comparison
- Do idea benchmark matrix
---
### 004 - Brainstorming Session
**What:** Generated 7 project ideas, with a deeper dive on Idea 007.
**Ideas Generated:**
1. 001: Visual Narrative Agent (text → image loop)
2. 002: Visual Memory Journal (AI scrapbook)
3. 003: Reverse Design Critic (UI critique + fix)
4. 004: Visual Poem Generator (two-AI art collaboration)
5. 005: Scene-to-Scene Video Storyteller (visual journey)
6. 006: Real-time Visual Debugger (screenshot → fix)
7. 007: Spot the Difference Agent (NEW FOCUS)
**User Preferences:**
- Want high visual analysis, low reasoning
- Single page webapp, no auth
- Show step-by-step AI process
- Gamification (leaderboard, daily puzzles)
**Selected for deeper dive:** 007 Spot the Difference Agent (vision puzzle)
**Action:**
- Created `docs/ideas/007-vision-spot-the-difference.md`
**Next:**
- Compare all ideas to pick winner
---
### 005 - Ideas Comparison
**What:** Created comparison matrix for all brainstormed ideas.
**Ideas Compared:** 14 concepts across visual-game, interactive, and creative categories
**Scoring Criteria:**
- Visual Analysis (30%)
- Multi-Turn (20%)
- Human-AI Interaction (20%)
- Cost Efficiency (15%)
- Uniqueness (10%)
- Fun (5%)
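
The weights above combine into a single score as a weighted sum. The sketch below assumes each idea is rated per criterion on a 1 to 5 scale (the scale is inferred from the scores in the results table, not stated in the log), and the example ratings are hypothetical rather than the values used in the actual matrix.

```python
# Weights from the scoring criteria above (they sum to 1.0).
WEIGHTS = {
    "visual_analysis": 0.30,
    "multi_turn": 0.20,
    "human_ai_interaction": 0.20,
    "cost_efficiency": 0.15,
    "uniqueness": 0.10,
    "fun": 0.05,
}


def weighted_score(ratings: dict) -> float:
    """Combine per-criterion ratings into one weighted score."""
    missing = WEIGHTS.keys() - ratings.keys()
    if missing:
        raise ValueError(f"missing ratings: {sorted(missing)}")
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)


# Hypothetical ratings for illustration only.
example_ratings = {
    "visual_analysis": 5,
    "multi_turn": 5,
    "human_ai_interaction": 5,
    "cost_efficiency": 4,
    "uniqueness": 4,
    "fun": 5,
}
print(round(weighted_score(example_ratings), 2))
```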
**Results:**
| Rank | Idea | Score |
|------|------|-------|
| 🥇 | **033v2 Detective** | **4.7** |
| 🥈 | Auction | 3.9 |
| 🥉 | 032v2 Art Critic | 3.7 |
| 4 | 013 Image Alchemy | 3.6 |
| 5 | 009 Image Tarot | 3.5 |
**Winner: 033v2 Detective**
**Why:**
- Best multi-turn (5+ rounds)
- Human actively directs (Chief role)
- Kimi does real visual work
- Cost efficient (mostly text)
- Natural mystery/narrative arc
**Action:**
- Created `docs/ideas/COMPARISON.md`
- Created `docs/ideas/008-visual-detective.md` (includes multi-agent v2)
**Next:**
- Discuss and finalize concept details
---
### 006 - Concept Documented
**What:** Documented chosen 033v2 Detective as `chosen-detective-game.md`.
**Documented:**
- Elevator pitch
- Game roles (Chief, Witness, Detective)
- Evidence types (crime scene, documents, photos, etc.)
- Round structure (7 rounds per case)
- Scoring system
- UI concept sketch
- Difficulty tiers (Rookie → Chief)
- Daily cases + leaderboard
**Action:**
- Created `docs/chosen-detective-game.md`
**Open Questions (for discussion):**
1. How much does the Witness describe unprompted?
2. Can Detective be wrong?
3. Red herrings — yes/no?
4. Plot twist mid-case?
5. Timer?
6. Replay past cases?
7. Hints system?
8. Skip evidence?
9. Case sources (pre-made/generated)?
10. Image sources (real/AI/illustrated)?
11. Share results?
12. Community cases?
**Next:**
- Discuss and finalize concept details
---
### 007 - Concept Finalized
**What:** Finalized Hermes Detective Agency concept after extensive discussion.
**Key Decisions:**
- **Difficulty:** 5 levels (Easy → Impossible), one case per day
- **Open-ended solving:** No single truth, multiple valid theories
- **Scoring:** Alignment %, Evidence cited %, Time (turns × 10 min)
- **Hints:** Embedded in evidence (too obvious, barely obvious, not too obvious)
- **Witness:** Dynamic appearance based on triggers (harder cases)
- **Truth reveal:** Available anytime, doesn't end game
- **Retry:** Unlimited attempts, every attempt documented
- **Journal:** Private by default, publish stats/journal optional
- **Observe:** Watch others' published solves
- **Case source:** 5 starter cases (one per difficulty) + community generation
- **Community:** Visits + reviews (no auth, manipulable but requires effort)
- **Discovery:** Jungle (browse) vs path (direct links from creator)
- **Case format:** YAML-based template
- **Creator tools:** Hermes skill + format validator
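
The three scoring components from the decisions above could be computed roughly as follows. This is a sketch under stated assumptions: the alignment percentage is taken as an input (how it is judged is not specified here), evidence is identified by plain string IDs, and the names `CaseScore` and `score_attempt` are hypothetical.

```python
from dataclasses import dataclass

MINUTES_PER_TURN = 10  # from the "turns × 10 min" decision


@dataclass
class CaseScore:
    alignment_pct: float  # how closely the theory matches the case's intended truth
    evidence_pct: float   # share of available evidence cited in the theory
    time_minutes: int     # in-game time: turns × 10 minutes


def score_attempt(alignment_pct: float, cited: set, available: set, turns: int) -> CaseScore:
    """Build the three-part score for one attempt (hypothetical helper)."""
    if not available:
        raise ValueError("a case needs at least one piece of evidence")
    # Only count cited items that actually belong to the case.
    evidence_pct = 100.0 * len(cited & available) / len(available)
    return CaseScore(alignment_pct, evidence_pct, turns * MINUTES_PER_TURN)


score = score_attempt(
    alignment_pct=80.0,
    cited={"photo", "letter", "receipt"},
    available={"photo", "letter", "receipt", "map"},
    turns=6,
)
print(score)
```

Keeping the three values separate, rather than folding them into one number, matches the decision to report Alignment, Evidence, and Time individually.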
**Action:**
- Updated `docs/chosen-detective-game.md` with full finalized concept
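
For the YAML-based case template and format validator decided above, a validator might look like the sketch below. The schema is entirely hypothetical: the field names `title`, `difficulty`, and `evidence`, and the encoding of the 5 difficulty levels as integers 1 to 5, are assumptions, since the actual template is not defined in this log. The function works on an already-parsed mapping, so any YAML parser could feed it.

```python
# Hypothetical schema: field name -> expected Python type after YAML parsing.
REQUIRED_FIELDS = {"title": str, "difficulty": int, "evidence": list}


def validate_case(case: dict) -> list:
    """Return a list of human-readable errors; an empty list means the case is valid."""
    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in case:
            errors.append(f"missing field: {name}")
        elif not isinstance(case[name], expected):
            errors.append(f"{name} must be of type {expected.__name__}")
    difficulty = case.get("difficulty")
    # Assumption: the 5 difficulty levels are encoded as integers 1..5.
    if isinstance(difficulty, int) and not 1 <= difficulty <= 5:
        errors.append("difficulty must be between 1 and 5")
    evidence = case.get("evidence")
    if isinstance(evidence, list) and not evidence:
        errors.append("evidence must not be empty")
    return errors


print(validate_case({"title": "The Missing Ledger", "difficulty": 9, "evidence": []}))
```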
**Next:**
- Technical architecture
- UI/UX design
- Prompt engineering
- Prototype development
---