hermes-detective/CHANGELOG.md
shoko ecfd0b1160 feat: Initial commit - Hermes Detective Agency concept
- Hermes Detective Agency: Open-ended mystery investigation game
- Roles: Chief (human), Witness (Kimi), Detective (Hermes)
- 5 difficulty levels, community cases, open-ended solving
- Scoring: Alignment %, Evidence %, Time
- Features: Retry, Journal, Observe mode
- Tech: Kimi Vision + Hermes Agent + Pollinations

Changelog:
- Research phase: Kimi capabilities, Hermes agent, image APIs
- Brainstorming: 14 ideas explored
- Comparison matrix: Detective selected as winner
- Concept finalized with all design decisions
2026-04-20 00:00:30 +00:00


Session Log

2026-04-19

001 - Session Start: Hermes Hackathon

What: Started Hermes Agent Creative Hackathon collaboration.

Context:

  • Hackathon: 16 days, $25k in prizes (Main $15k, Kimi $5k, plus $5k in Kimi credits)
  • Presented by Kimi Moonshot & Nous Research
  • Two tracks: Main (any creative use) and Kimi Track (must use Kimi models)
  • Deadline: EOD Sunday, May 3rd

Action:

  • Set up workflow structure (.issues/, docs/, git init)
  • Created first issue file: 001-hermes-hackathon-project.md

Next:

  • Define project concept and creative domain focus
  • Explore Hermes Agent capabilities
  • Sketch initial prototype idea

002 - Research Completed

What: Validated Kimi and Hermes Agent capabilities.

Findings:

  • Kimi K2.5: multimodal (text+image+video), video understanding, visual coding
  • Kimi benchmarks: SWE-bench 65.8%, Tau2 ~64%
  • Hermes 3: function calling, structured output, OpenAI-compatible
  • Hermes built-in skills: manim-video, ascii-video, ascii-art (accessibility-focused)

Action:

  • Created docs/research-kimi-visual-capabilities.md
  • Created docs/research-hermes-agent.md
  • Created docs/research-image-generation-apis.md
  • Updated issue file with research summary

Next:

  • Define concrete project concept
  • Choose specific creative angle (visual coding? video analysis? image generation?)
  • Start rapid prototyping

003 - Image Gen API Research

What: Found affordable/free image generation API.

Findings:

  • Pollinations AI: Free tier, OpenAI-compatible, multiple models (Flux, etc.)
    • Endpoint: https://gen.pollinations.ai/image/{prompt}
    • Simple: a plain curl request works; no auth needed for basic use
    • Models: flux, zimage, wan-image, qwen-image, gptimage
    • Cost: Free tier (Pollen credits); paid usage at roughly $1 per Pollen
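
To make the endpoint pattern above concrete, here is a minimal sketch that only builds the request URL (it does not call the service). The base URL and model names come from the findings above; the helper name `build_image_url` and the `model` query parameter are assumptions for illustration and should be checked against the Pollinations documentation.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the research notes: https://gen.pollinations.ai/image/{prompt}
BASE_URL = "https://gen.pollinations.ai/image/"


def build_image_url(prompt, model=None):
    """Return the image URL for a prompt (hypothetical helper, not an official client)."""
    # Percent-encode the prompt so spaces and punctuation are safe inside a URL path.
    url = BASE_URL + quote(prompt, safe="")
    if model is not None:
        # ASSUMPTION: the model is selected with a `model` query parameter.
        url += "?" + urlencode({"model": model})
    return url


if __name__ == "__main__":
    print(build_image_url("rainy crime scene, noir style", model="flux"))
```

The printed URL can then be fetched with curl or any HTTP client; keeping URL construction separate from the request makes the encoding step easy to test offline.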

Action:

  • Created docs/research-image-generation-apis.md
  • Updated idea 001 with image gen options

Next:

  • Sketch more project ideas for comparison
  • Do idea benchmark matrix

004 - Brainstorming Session

What: Generated 7 project ideas, with a deeper dive on Idea 007.

Ideas Generated:

  1. 001: Visual Narrative Agent (text → image loop)
  2. 002: Visual Memory Journal (AI scrapbook)
  3. 003: Reverse Design Critic (UI critique + fix)
  4. 004: Visual Poem Generator (two-AI art collaboration)
  5. 005: Scene-to-Scene Video Storyteller (visual journey)
  6. 006: Real-time Visual Debugger (screenshot → fix)
  7. 007: Spot the Difference Agent (NEW FOCUS)

User Preferences:

  • Want high visual analysis, low reasoning
  • Single page webapp, no auth
  • Show step-by-step AI process
  • Gamification (leaderboard, daily puzzles)

Selected for deeper dive: 007 Spot the Difference Agent (vision puzzle)

Action:

  • Created docs/ideas/007-vision-spot-the-difference.md

Next:

  • Compare all ideas to pick winner

005 - Ideas Comparison

What: Created comparison matrix for all brainstormed ideas.

Ideas Compared: 14 concepts across visual game, interactive, and creative categories

Scoring Criteria:

  • Visual Analysis (30%)
  • Multi-Turn (20%)
  • Human-AI Interaction (20%)
  • Cost Efficiency (15%)
  • Uniqueness (10%)
  • Fun (5%)
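
One way the weighted criteria above could be combined into a single score is sketched below. The weights are the ones listed; the 1 to 5 rating scale, the function name `weighted_score`, and the sample ratings are assumptions for illustration and are not the values used in COMPARISON.md.

```python
# Weights from the scoring criteria above (they sum to 100%).
WEIGHTS = {
    "visual_analysis": 0.30,
    "multi_turn": 0.20,
    "human_ai_interaction": 0.20,
    "cost_efficiency": 0.15,
    "uniqueness": 0.10,
    "fun": 0.05,
}


def weighted_score(ratings):
    """Combine per-criterion ratings into one weighted score.

    ASSUMPTION: each rating is on a 1-5 scale, so the result is also 1-5.
    """
    missing = set(WEIGHTS) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Hypothetical ratings for one idea, for illustration only.
    example = {
        "visual_analysis": 5,
        "multi_turn": 5,
        "human_ai_interaction": 5,
        "cost_efficiency": 4,
        "uniqueness": 4,
        "fun": 4,
    }
    print(round(weighted_score(example), 2))
```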

Results:

| Rank | Idea | Score |
|------|------|-------|
| 🥇 | 033v2 Detective | 4.7 |
| 🥈 | Auction | 3.9 |
| 🥉 | 032v2 Art Critic | 3.7 |
| 4 | 013 Image Alchemy | 3.6 |
| 5 | 009 Image Tarot | 3.5 |

Winner: 033v2 Detective

Why:

  • Best multi-turn (5+ rounds)
  • Human actively directs (Chief role)
  • Kimi does real visual work
  • Cost efficient (mostly text)
  • Natural mystery/narrative arc

Action:

  • Created docs/ideas/COMPARISON.md
  • Created docs/ideas/008-visual-detective.md (includes multi-agent v2)

Next:

  • Discuss and finalize concept details

006 - Concept Documented

What: Documented the chosen 033v2 Detective concept in chosen-detective-game.md.

Documented:

  • Elevator pitch
  • Game roles (Chief, Witness, Detective)
  • Evidence types (crime scene, documents, photos, etc.)
  • Round structure (7 rounds per case)
  • Scoring system
  • UI concept sketch
  • Difficulty tiers (Rookie → Chief)
  • Daily cases + leaderboard

Action:

  • Created docs/chosen-detective-game.md

Open Questions (for discussion):

  1. How much does the Witness describe unprompted?
  2. Can the Detective be wrong?
  3. Red herrings — yes/no?
  4. Plot twist mid-case?
  5. Timer?
  6. Replay past cases?
  7. Hints system?
  8. Skip evidence?
  9. Case sources (pre-made/generated)?
  10. Image sources (real/AI/illustrated)?
  11. Share results?
  12. Community cases?

Next:

  • Discuss and finalize concept details

007 - Concept Finalized

What: Finalized Hermes Detective Agency concept after extensive discussion.

Key Decisions:

  • Difficulty: 5 levels (Easy → Impossible), one case per day
  • Open-ended solving: No single truth, multiple valid theories
  • Scoring: Alignment %, Evidence cited %, Time (turns × 10 min)
  • Hints: Embedded in evidence (too obvious, barely obvious, not too obvious)
  • Witness: Dynamic appearance based on triggers (harder cases)
  • Truth reveal: Available anytime, doesn't end game
  • Retry: Unlimited attempts, every attempt documented
  • Journal: Private by default, publish stats/journal optional
  • Observe: Watch others' published solves
  • Case source: 5 starter cases (one per difficulty) + community generation
  • Community: Visits + reviews (no auth, manipulable but requires effort)
  • Discovery: Jungle (browse) vs path (direct links from creator)
  • Case format: YAML-based template
  • Creator tools: Hermes skill + format validator
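
The three scoring components listed above can be sketched as follows. Only the time rule (turns × 10 min) is stated in the log; how Alignment % is judged is not specified, so it is taken as an input here, and Evidence cited % is assumed to be the share of available evidence items that the theory cites. The names `AttemptScore` and `score_attempt` are hypothetical.

```python
from dataclasses import dataclass

MINUTES_PER_TURN = 10  # from the log: Time = turns × 10 min


@dataclass
class AttemptScore:
    alignment_pct: float  # how closely the theory matches the case (judged elsewhere)
    evidence_pct: float   # share of available evidence cited
    time_minutes: int     # in-game time derived from the number of turns


def score_attempt(alignment_pct, cited_count, total_evidence, turns):
    """Build the score for one attempt (a sketch, not the game's actual formula)."""
    if not 0 <= alignment_pct <= 100:
        raise ValueError("alignment_pct must be between 0 and 100")
    if total_evidence <= 0 or not 0 <= cited_count <= total_evidence:
        raise ValueError("cited_count must be between 0 and total_evidence")
    if turns < 0:
        raise ValueError("turns must be non-negative")
    # ASSUMPTION: Evidence cited % = cited items / available items.
    evidence_pct = 100.0 * cited_count / total_evidence
    return AttemptScore(alignment_pct, evidence_pct, turns * MINUTES_PER_TURN)


if __name__ == "__main__":
    print(score_attempt(alignment_pct=80.0, cited_count=3, total_evidence=4, turns=6))
```

Keeping the three values separate, rather than collapsing them into one number, matches the decision above to report Alignment, Evidence, and Time individually.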

Action:

  • Updated docs/chosen-detective-game.md with full finalized concept

Next:

  • Technical architecture
  • UI/UX design
  • Prompt engineering
  • Prototype development