Files

shoko ecfd0b1160 feat: Initial commit - Hermes Detective Agency concept

- Hermes Detective Agency: Open-ended mystery investigation game
- Roles: Chief (human), Witness (Kimi), Detective (Hermes)
- 5 difficulty levels, community cases, open-ended solving
- Scoring: Alignment %, Evidence %, Time
- Features: Retry, Journal, Observe mode
- Tech: Kimi Vision + Hermes Agent + Pollinations

Changelog:
- Research phase: Kimi capabilities, Hermes agent, image APIs
- Brainstorming: 14 ideas explored
- Comparison matrix: Detective selected as winner
- Concept finalized with all design decisions

2026-04-20 00:00:30 +00:00

1.4 KiB

Raw Blame History

Research: Hermes Agent Capabilities

Date: 2026-04-19
Purpose: Understand Hermes Agent framework for hackathon integration

Hermes 3 (Nous Research)

Core Capabilities

Advanced agentic capabilities
Reliable function calling - Trained specifically for tool use
Structured output - JSON mode / Pydantic schemas
ChatML prompt format - OpenAI-compatible
Multi-turn conversation
Long context coherence

Benchmark Performance

Benchmark	Hermes 3 Score
IFEval (0-shot)	61.70%
MMLU-Redux	92.7%
MMLU-Pro	81.1%
SimpleQA	31.0%

Function Calling

Trained on specific prompts for tool use
XML-based tool call format: <tool_call>{"name": "...", "arguments": {...}}</tool_call>
Supports recursive/chain tool calls
Native tool integration via NousResearch/Hermes-Function-Calling repo

Hermes Agent Framework

Key Components

ChatML format - Structured system/user/assistant turns
Tool definitions - JSON schema for function signatures
Tool parsing - Parse and execute function calls
Response loop - Multi-turn agentic execution

Integration Points

HuggingFace Transformers
vLLM inference
Ollama local deployment
OpenAI-compatible API

Sources

https://huggingface.co/NousResearch/Hermes-3-Llama-3.1-8B
https://github.com/NousResearch/Hermes-Function-Calling
https://arxiv.org/abs/2408.11857 (Hermes 3 Technical Report)

1.4 KiB Raw Blame History

Research: Hermes Agent Capabilities

Hermes 3 (Nous Research)

Core Capabilities

Benchmark Performance

Function Calling

Hermes Agent Framework

Key Components

Integration Points

Sources

1.4 KiB

Raw Blame History