Engineering Notes

Deep-dives into AI infrastructure, prompt engineering, and the tools we build.

2026-05-27
Hermes Agent Ships Security-Guidance Plugin With 25 Pattern-Matched Vulnerability Rules
Hermes Agent ports Anthropic's security-guidance plugin: 25 rules catch unsafe deserialization, command injection, and XSS at write time. Zero LLM tokens.
hermes, security, plugins, agent-safety, code-review
2026-05-27
Cloning Hermes Agent: Every Component, Config, and Cron Job for a Production AI Assistant
Every component, plugin, and cron job in a production Hermes Agent setup, mapped and explained for replication from scratch.
hermes, litellm, deepseek, skills, plugins, vps
2026-05-27
Agents Fixing Agents — Hermes Enters Production
Hermes Agent crossed a threshold this week: agents fixing agents, resurrecting decade-old phones, and running production workloads on 1M+ visitor sites.
hermes, agents, openclaw, production, ai
2026-05-26
Hermes Agent Desktop — 20 AI Specialists Collaborate in a Native macOS App
Community desktop app turns Hermes into a 20-specialist multi-agent system with task decomposition, visual Skill Store, and orchestration.
hermes, desktop, multi-agent, open-source, orchestration
2026-05-26
GBrain + Mnemosyne: Creating the Voltron of Memory Plugins With Hot Cache and Cold Storage
GBrain's 30-page knowledge graph wired into AxDSan's Mnemosyne: a 113-char digest for ambient awareness, and relevance-gated page injection on first turn. Zero MCP calls.
hermes, gbrain, mnemosyne, memory, agents
2026-05-26
Async Media Generation in Hermes Agent With OpenRouter Webhooks
Bridging OpenRouter's async video generation webhooks to Hermes Agent with a Cloudflare Worker signature translator and localhost tunnel.
hermes, openrouter, webhook, media-generation, cloudflare, tunnel
2026-05-25
GBrain vs Mnemosyne: Architecture, Not Benchmarks
GBrain runs at 250ms per CLI-spawned operation. Mnemosyne runs at 2ms in-process. The difference is architecture, not implementation quality.
hermes, memory, benchmarks, gbrain, mnemosyne
2026-05-25
Without Consolidation, Memory Is a Log
Mnemosyne's sleep consolidation cycle scored +70.8pp on BEAM multi-hop reasoning with 0.076ms SQLite reads. Zero cloud dependencies, zero cold start.
hermes, memory, mnemosyne, sqlite, benchmarks
2026-05-25
Deterministic Prefix Ordering: How Production Agents Get 90% Prompt Cache Hit Rates
Prompt caching cuts LLM input costs by up to 90% and TTFT by 80%, but only exact-prefix matches count. How production agents structure prompts for maximum cache hits.
hermes, llm, caching, optimization, tokens, infrastructure
2026-05-25
Memory Stacks, Desktops, and OpenClaw Migrations — What X Said About Hermes This Week
A 3-layer memory stack earned 183 likes and 270 bookmarks. Hermes Desktop open-sourced. OpenClaw users migrating. Roundup of Hermes Agent on X.
hermes, x, roundup, community, memory, desktop
2026-05-25
Logging Semantic Skill Injection Decisions
Instrumenting the pre_gateway_dispatch hook to log every skill injection decision — score, skip reason, and top-3 matches — so we can tune the threshold with data instead of guessing.
hermes, semantic-skills, observability, plugins
2026-05-25
Semantic Skills — Skill Pre-Loading That Keeps Prompt Caching Intact
semantic-skills v2 pre-loads skills into user messages via pre_gateway_dispatch hook. System prompt stays at 226 tokens, cache-friendly. No first-turn tool call needed.
hermes, skills, caching, tokens, optimization, plugin
2026-05-24
CodeGraph Slashes Agent Token Burn by 87% Across Our Repos
hermes, codegraph, optimization, tokens, mcp
2026-05-24
JIT Skills Slash Hermes System Prompt 95% — Multiplied Across 39 Calls Per Session
hermes, tokens, optimization, embeddings, plugins
2026-05-23
How Hermes Agent Builds Its System Prompt
A deep dive into the three-tier system prompt architecture inside Hermes Agent — how it works, why it costs 6,000+ tokens per session, and how we cut that by 95%.
Hermes Agent, Prompt Architecture, Token Optimization, AI Engineering
