Benchmark

The architecture that scored #2 on LongMemEval is the substrate underneath every product.

LongMemEval Oracle (ICLR 2025, University of Michigan) is the standard benchmark for long-horizon agent memory. 500 questions across six categories. Our score is public, reproducible, and live on the leaderboard.

95.6%

LongMemEval Oracle · 500 questions · Claude Opus 4.6 Sibyl Memory placed second on the public leaderboard, tied with Chronos (PwC). The only file-based system in the top tier.

4 vCPU · 16GB EC2 · zero vectors · zero embeddings · zero external retrieval

Rank	System	Score	Architecture
1	agentmemory V4	96.2%	embedding-based
2	Sibyl Memory	95.6%	file-based · zero vectors
2	Chronos (PwC)	95.6%	embedding-based
4	Mastra Observational Memory	94.9%	embedding-based
5	MemMachine	93.0%	embedding-based
6	Hindsight (Vectorize)	91.4%	vector DB
Mem0 · Zep · Supermemory · Emergence AI · Oracle baseline — all below the top tier

Category Breakdown · Opus 4.6

100%

single-session-user

100%

single-session-assistant

96.2%

temporal-reasoning

93.3%

single-session-preference

93.2%

multi-session

92.3%

knowledge-update

We did not optimize for the benchmark. We optimized for production efficiency. The benchmark improvement was a side effect.
— Sibyl Labs · LongMemEval Report · April 2026

Read the full benchmark report →

Use Cases

One schema. Five use cases.

Same architecture across every deployment. File-based at the substrate, Postgres-backed in production, multi-tenant by namespace. Different access patterns optimized for different team problems.

Operator Memory

Live

For autonomous agent founders

Full-stack memory for one operator. Priorities, journal, entities, scars, relationships, arc. The same shape Sibyl uses to operate herself, packaged for delivery.

Buyers: Agent founders building autonomous agents · Scope: single-tenant per operator, multi-month operational continuity

User Profile Memory

Pilot

For consumer platforms with active users

Per-user persistent memory at platform scale. Tracks history, extracts patterns, surfaces back via your UI. Each user gets an isolated namespace that grows with their behavior.

Buyers: prediction markets, social platforms, agent platforms · Scope: 1K → 1M+ active users per platform

Conversational Continuity

Coming Q3

For AI app builders

Cross-session memory for LLM apps. The "remember yesterday I said X" primitive most chat products lack. Drop-in replacement for vector RAG memory with substantially lower cost.

Buyers: AI app builders, chat product startups · Scope: per-user threads, cross-session continuity

Agent Reputation

Coming Q3

For compliance-needing agents

Append-only decision log with full rationale. Hash-anchored to chain on demand. Exportable for audit. Designed for agents whose decisions need to be defensible.

Buyers: trading agents, advisory agents, fintech agents · Scope: per-agent decision history with cryptographic anchoring

Org Memory

Coming Q4

For DAOs · async-first orgs · foundations

Multi-actor memory with role-tagged extraction. Solves "we keep forgetting our own decisions" for distributed teams. Preserves who said what, when, in which thread.

Buyers: DAOs, foundations, async-first orgs · Scope: shared namespace per org, role-tagged contributions

Architecture

Memory is the state. Agents are stateless.

The multi-tenant pattern: per-tenant schema namespace, ephemeral agent runtime, the same hierarchical memory shape across every use case. Each request loads exactly the requesting tenant's slice, processes the turn, writes back, exits. Memory persists in Postgres. Agents don't persist at all.

Schema · Five Tiers

HOT

state_documents

treasury · priorities · session

WARM

entities · entity_relations

UNIQUE(tenant, category, name)

COLD

journal · revenue · errors · metrics

append-only · indexed by ts

REFERENCE

reference_documents

runbooks · rules · constants

Per-user pricing. No vector tax.

Cloud or self-host. Same SDK either way. Pricing reflects the cost we don't pay — embedding APIs and vector DB hosting — not the cost we do pay.

Free

$0

100 MAU · cloud only

100 monthly active users
10K writes / 100K reads per month
Cloud only
Community support

Request an API key

Starter

$99/mo

~$0.10 per active user

1,000 monthly active users
Unlimited writes & reads at fair use
Cloud only
Email support · 48h response

Start with Starter

What 10,000 active users actually cost

Apples-to-apples comparison at the Pro tier scale. Vector-DB stack numbers are approximate; verify against your own usage.

Vector-DB Stack

Pinecone Standard~$70/mo

OpenAI embeddings (~500K calls)~$1,000/mo

LangChain Memory infra~$200/mo

Engineering timeweeks

~$1,270/mo + engineering

Sibyl Memory · Pro

One SDKincluded

Schema migrationsincluded

Multi-tenant Postgresincluded

Dashboard, audit, GDPR cascadeincluded

$499/mo · done

~60% cheaper at this scale. The cost advantage compounds: every additional active user adds ~$0.10–0.30 of vector-DB cost on the left, ~$0.05 on the right.

Prices in USD, exclusive of taxes. Custom contracts available at every tier above Pro. Annual prepay: 20% discount. Self-host customers run the same SDK against their own Postgres — schema parity, not service parity. We never touch your data.

Integration Pipeline

What the pipeline looks like.

One SDK, two transports. Point sibyl-memory-client at Sibyl Cloud with an API key, or at your own Postgres with a connection string. No vector DB to provision. No embedding service to wire. No schema migrations to design — they're already shipped, idempotent, tested against production. Once your connection is configured, less than ten minutes from clone to first write.

Step 1

Configure your connection

Drop a Sibyl Cloud API key + tenant ID into your env, or a Postgres connection string if you're self-hosting. Same SDK reads either. Standard secret handling — env file, secrets manager, deployment config, your call.

# cloud SIBYL_API_KEY=sk_... SIBYL_TENANT_ID=... # or self-host DATABASE_URL=postgres://...

drop into env

Step 2

Install the SDK

One npm package. Same package for both transports. No vector DB, no embedding service, no extra infra to provision.

npm i sibyl-memory-client

~10 seconds

Step 3

Initialize the client

Same constructor takes either shape. SDK handles auth, retries, connection pool, schema validation.

// cloud new MemoryClient({ apiKey, tenantId }) // or self-host new MemoryClient({ databaseUrl })

~15 seconds

Step 4

Write your first memory

Entities, state, journal, jobs. Same call shape on both transports. Reads are immediate. Async work hits the queue automatically.

await memory.entities.upsert({...})

live

Self-host adds two steps before step 1: 1. provision a Postgres 14+ (Neon, RDS, Aurora, Supabase, on-prem — your choice) and 2. run the migration runner against it (node apply-sibyl-memory-schema.mjs $DATABASE_URL). The resulting DATABASE_URL becomes your step-1 connection. Same SDK, same code, same dashboard build. We never touch your data — schema parity, not service parity. ~half a day for a competent ops engineer end-to-end.

Day-2 management: a tenant dashboard at app.sibylcap.com (or your self-hosted equivalent) gives operators per-tenant overview, user-by-user drill-down, materialized metrics (24h activity, 7d growth, stale-active, dead-xrefs), audit log, GDPR cascade-delete, and cleanup triggers. All reads from the same Postgres. All writes logged to audit_events.

Production Pedigree

Sibyl Memory powers @sibylcap, an autonomous agent that has run continuously on Base since February 2026. The architecture documented on this page handles her priorities, journal, entities, treasury, partner relationships, and on-chain reputation in production. Every claim on this page is provable on a public surface.

View her record on X → Token on Basescan → Try the live demo →

Start with the free tier.
Talk to us when you outgrow it.

Cloud key requests are processed within 24 hours. Self-host setup is half a day for a competent ops engineer. No vector DB, no embedding service, no schema migrations to design — they're already shipped, idempotent, tested against production.

Request an API key Read the spec Schedule a 30-min eval

Persistent memory for AI agents.