GBrain vs Mnemosyne: Architecture, Not Benchmarks

GBrain and Mnemosyne are both memory systems for Hermes Agent. They share zero architectural DNA. One is a markdown-backed knowledge graph with a TypeScript CLI. The other is an in-process Python library against a SQLite file. We ran both through the same 20-fact, 6-query workload to understand the architectural trade-offs, not to declare a winner.
Architectures
GBrain stores knowledge as markdown pages in a git repository. Pages have typed edges from wikilinks, tags, timelines, facts fences, and chunk-level code metadata. A PGLite database indexes everything for hybrid search: HNSW cosine similarity, tsvector keyword search, reciprocal rank fusion, and graph signal boosting. Entity extraction is regex-based (zero LLM tokens per write). The CLI spawns a bun process per operation. Published numbers on BrainBench: P@5 49.1%, R@5 97.9%, +31.4pp from graph signals.
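To make the fusion step concrete, here is a minimal sketch of reciprocal rank fusion in Python. It is not GBrain's implementation (GBrain is TypeScript, and its exact scoring is not shown here). The function name, the page slugs, and the constant `k=60` are assumptions for illustration; 60 is simply a conventional default for this technique.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of page ids into one ranked list.

    Each page scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, conventional default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, page_id in enumerate(ranking, start=1):
            scores[page_id] += 1.0 / (k + rank)
    # Highest fused score first; ties break on page id for determinism.
    return sorted(scores, key=lambda page_id: (-scores[page_id], page_id))


# Hypothetical page slugs, one list per retriever.
vector_hits = ["auth-migration", "middleware-bug", "ci-setup"]
keyword_hits = ["middleware-bug", "rate-limiting", "auth-migration"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
# ['middleware-bug', 'auth-migration', 'rate-limiting', 'ci-setup']
```

Pages that appear in both lists rise to the top even when neither retriever ranked them first, which is the property that makes rank fusion useful for combining vector and keyword results without calibrating their raw scores against each other.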
Mnemosyne stores facts directly in SQLite using sqlite-vec for vector search and FTS5 for full-text. Facts are structured triples with temporal versioning - old facts get superseded_at timestamps instead of being overwritten. The sleep consolidation cycle runs between sessions, extracting fact triples from conversation transcripts. The Python API (remember / recall) is the primary interface. Published numbers on BEAM at 100K scale: 65.2% overall, +70.8pp on multi-hop reasoning vs v2.5, 0.076ms reads.
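The temporal versioning idea can be sketched with the standard-library `sqlite3` module. The schema, the function names, and the sample values below are hypothetical and are not taken from Mnemosyne's source; only the `superseded_at` column name comes from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE facts (
        id            INTEGER PRIMARY KEY,
        subject       TEXT NOT NULL,
        predicate     TEXT NOT NULL,
        object        TEXT NOT NULL,
        created_at    TEXT NOT NULL,
        superseded_at TEXT            -- NULL while the fact is current
    )
    """
)


def supersede_and_insert(conn, subject, predicate, obj, now):
    """Store a new fact and mark the current one for the same key as superseded."""
    with conn:  # both statements commit together or not at all
        conn.execute(
            "UPDATE facts SET superseded_at = ? "
            "WHERE subject = ? AND predicate = ? AND superseded_at IS NULL",
            (now, subject, predicate),
        )
        conn.execute(
            "INSERT INTO facts (subject, predicate, object, created_at) "
            "VALUES (?, ?, ?, ?)",
            (subject, predicate, obj, now),
        )


def current_value(conn, subject, predicate):
    """Return the object of the fact that has not been superseded, if any."""
    row = conn.execute(
        "SELECT object FROM facts "
        "WHERE subject = ? AND predicate = ? AND superseded_at IS NULL",
        (subject, predicate),
    ).fetchone()
    return row[0] if row else None


# Illustrative values only; timestamps are placeholders.
supersede_and_insert(conn, "project", "deploy_target", "staging-a", "t1")
supersede_and_insert(conn, "project", "deploy_target", "staging-b", "t2")

print(current_value(conn, "project", "deploy_target"))
history = conn.execute(
    "SELECT object, superseded_at FROM facts ORDER BY id"
).fetchall()
print(history)
```

Because the old row is kept, a temporal or contradiction query can still see what the agent believed earlier and when that belief was replaced.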
The Workload
Five simulated agent sessions across four days, 20 facts total:
- Session 1: project setup (codebase location, stack, deployment)
- Session 2: bug fix (middleware redirect loop, token check, error boundary)
- Session 3: feature work (3-step onboarding wizard, Zod validation, Supabase auth)
- Session 4: infrastructure (database migration, CI/CD, Sentry, rate limiting)
- Session 5: refactor (SSR package migration, cache removal, middleware extraction)
Six queries tested different recall patterns: single-fact, multi-hop, temporal, contradiction, keyword, and recency.
Operation Latency
| Operation | Mnemosyne | GBrain (CLI) | Ratio (GBrain / Mnemosyne) |
|---|---|---|---|
| Write (mean, 20 ops) | 2.4ms | 252ms | 104x |
| Read (mean, 6 queries) | 2.0ms | 242ms | 119x |
| Read P50 (100 iterations) | 0.54ms | 235ms | 435x |
| Source of overhead | Python function call | bun CLI spawn + PGLite connect | |
The latency gap is primarily CLI process spawn overhead. GBrain's MCP server path (persistent process) would eliminate the spawn cost, but the benchmark measured the CLI path used in development and one-shot queries. The ~250ms per operation includes bun binary startup, TypeScript runtime init, PGLite connection, query execution, and output formatting.
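The shape of that gap is easy to reproduce without either system. The sketch below compares an in-process function call against spawning a fresh process per operation. It uses the Python interpreter as a stand-in for the bun CLI, so the absolute numbers will differ from the table; the function names are made up for this illustration.

```python
import statistics
import subprocess
import sys
import time


def time_calls(fn, iterations):
    """Return per-call wall-clock latencies in milliseconds."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def in_process_read():
    # Stand-in for a library call that runs inside the agent's own process.
    return {"codebase": "/root/projects/mc"}.get("codebase")


def spawned_read():
    # Stand-in for a CLI call: pay runtime startup on every single operation.
    subprocess.run([sys.executable, "-c", "pass"], check=True)


in_proc = time_calls(in_process_read, 100)
spawned = time_calls(spawned_read, 5)

print(f"in-process P50: {statistics.median(in_proc):.4f} ms")
print(f"spawned    P50: {statistics.median(spawned):.1f} ms")
```

The spawned path does no useful work at all and still costs orders of magnitude more per call, which is why a persistent process changes the comparison so much.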
Recall Quality
| Query | Mnemosyne Hits | GBrain Hits |
|---|---|---|
| "project codebase location" | 5 | 0 |
| "supabase auth ssr package migration" | 5 | 0 |
| "redis cache may 22 may 24" | 5 | 0 |
| "supabase client REST ssr access" | 5 | 0 |
| "middleware redirect loop" | 1 | 0 |
| Overall | 26/30 | 0/30 |
GBrain's keyword search returned zero results on the benchmark pages because the tsvector index did not propagate between `gbrain put` and `gbrain search` within the benchmark's execution window. This is a cold-index issue, not a retrieval failure: in steady-state operation, GBrain's keyword search reliably returns matching pages (confirmed on a manually written test page). The 0/30 figure therefore describes the benchmark setup, not GBrain's retrieval quality. Accurate recall numbers would require a warm index, which did not materialize during the benchmark run.
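One way a benchmark harness could guard against measuring a cold index is to poll until a known page becomes searchable before starting the timed queries. The sketch below is an assumption about how such a guard might look, not something the benchmark script does, and it has not been verified against GBrain. The `search` argument is a placeholder for whatever wraps the real keyword search.

```python
import time


def wait_until_searchable(search, query, timeout_s=5.0, interval_s=0.1):
    """Poll search(query) until it returns hits or the timeout elapses.

    `search` is any callable that returns a list of hits; here it is a
    hypothetical wrapper, not a real GBrain API.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        hits = search(query)
        if hits:
            return hits
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no results for {query!r} after {timeout_s}s")
        time.sleep(interval_s)


# Fake search whose index only becomes visible on the third call.
calls = {"n": 0}


def fake_search(query):
    calls["n"] += 1
    return ["middleware-bug"] if calls["n"] >= 3 else []


hits = wait_until_searchable(fake_search, "middleware redirect loop", interval_s=0.01)
print(hits, calls["n"])
```

A timeout that fires is itself a useful result: it separates "the index never warmed up" from "the index was warm and retrieval missed".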
Fixes Required for the GBrain Setup
Three fixes were necessary to get GBrain working with a LiteLLM-proxied Nomic embed model: two source patches and one update for renamed commands.
- `isAvailable()` logic bug (`src/core/ai/gateway.ts:642`): The condition `(recipe.id === 'litellm' || isUserProvided)` causes the function to return `false` for litellm recipes when it should return `true`. Fix: `(!isUserProvided && recipe.id !== 'litellm')`.
- Missing `dims_options` in litellm recipe (`src/core/ai/recipes/litellm-proxy.ts:27`): Without `dims_options: [768]`, the init dimension validator rejects models that don't support Matryoshka-style dimension selection - including Nomic embed. The litellm recipe needs explicit dimension allow-listing.
- Command API changes: `gbrain ingest` was renamed to `gbrain put <slug>` (with stdin for content). `gbrain query` (hybrid) requires embedding generation per query. `gbrain search` (keyword-only) does not - and returns results without an embedding provider.
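The first patch is easier to read as a truth table. By De Morgan's law, the patched expression is the exact logical negation of the original one, so every input that used to take the branch now skips it and vice versa. The Python below only mirrors the two boolean expressions; how the surrounding code in `gateway.ts` uses the result is not shown here, and the `"other"` recipe id is a made-up value.

```python
from itertools import product


def original_condition(recipe_id, is_user_provided):
    # Mirrors: (recipe.id === 'litellm' || isUserProvided)
    return recipe_id == "litellm" or is_user_provided


def patched_condition(recipe_id, is_user_provided):
    # Mirrors: (!isUserProvided && recipe.id !== 'litellm')
    return (not is_user_provided) and recipe_id != "litellm"


rows = []
for recipe_id, is_user_provided in product(["litellm", "other"], [False, True]):
    rows.append(
        (
            recipe_id,
            is_user_provided,
            original_condition(recipe_id, is_user_provided),
            patched_condition(recipe_id, is_user_provided),
        )
    )

print("recipe_id  user_provided  original  patched")
for recipe_id, is_user_provided, original, patched in rows:
    print(f"{recipe_id:<10} {is_user_provided!s:<14} {original!s:<9} {patched}")
```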
Cost Model
| Dimension | Mnemosyne | GBrain |
|---|---|---|
| Per-write token cost | $0 | $0 |
| Per-read token cost | $0 | $0 (keyword), $varies (hybrid) |
| Embedding | On-the-fly via sqlite-vec | Separate step via embedding provider |
| Storage | Single SQLite file | Markdown files + PGLite + vectors |
| Setup | pip install | bun install + init + embedding config + patches |
When to Use Which
Mnemosyne fits when latency is the primary constraint, the memory model is fact-based, and operational simplicity matters. Zero dependencies beyond sqlite-vec. Sub-millisecond recall of structured triples with temporal versioning. If the agent needs to remember that "the codebase lives at /root/projects/mc" and recall it in under a millisecond, Mnemosyne is the direct path.
GBrain fits when the memory model is page-based and the value is in structured knowledge with typed edges, graph signals, and markdown ownership. If the agent accumulates hundreds of pages across weeks and needs graph-boosted retrieval with published benchmarks (BrainBench: P@5 49.1%, R@5 97.9%), GBrain's architecture supports that. The operational overhead is real - embedding provider, cron maintenance, source patches, more storage - but the retrieval quality on mature brains exceeds what a pure vector or keyword store can deliver.
These systems are not competitors. They target different points on the complexity-latency spectrum. The right choice follows from the shape of the agent's memory workload: fact-structured or page-structured.
[^1]: GBrain repository. Garry Tan. github.com/garrytan/gbrain. v0.41.2.0.
[^2]: Mnemosyne repository. AxDSan. github.com/AxDSan/Mnemosyne. v3.0.0.
[^3]: Benchmark script. github.com/underdown/catlabs. /tmp/memory_benchmark.py.