Models & stack

Which LLM drives which path, and why.

The agent is a Mastra Agent instance with a tool registry and a streaming Anthropic model. The stack is deliberately small: one agent framework (Mastra), one LLM vendor (Anthropic), and one embedding vendor (Voyage).

| Role | Model | Where |
| --- | --- | --- |
| Primary streaming agent | claude-sonnet-4-5-20250929 | somaAgent in packages/agent/src/agents/soma.ts |
| Deep synthesis (weekly review, brief) | claude-opus-4-5 | somaAgentOpus, same file |
| Embeddings (entities + facts + query) | voyage-3-large (1024d) | @soma/tools/shared/embed.ts |
| Reranking (top-K over top-3K) | rerank-2 | @soma/tools/shared/rerank.ts |
| Voice transcription (bot) | Whisper (OpenAI) | @soma/tools/shared/transcribe.ts |
| Fact extraction | claude-haiku-4-5 | packages/agent/src/inngest/functions/fact-extract.ts |
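
For readers who prefer code to tables, the mapping above can be expressed as a typed lookup. This is only a sketch: the model IDs are taken from the table, while the `ModelRole` names, `MODEL_BY_ROLE`, and `modelFor` are hypothetical and do not necessarily exist in the SOMA codebase.

```typescript
// Hypothetical role names; only the model IDs come from the table above.
type ModelRole = 'chat' | 'synthesis' | 'factExtraction' | 'embedding' | 'rerank';

// One entry per role, so adding a role without choosing a model is a compile error.
const MODEL_BY_ROLE: Record<ModelRole, string> = {
  chat: 'claude-sonnet-4-5-20250929',
  synthesis: 'claude-opus-4-5',
  factExtraction: 'claude-haiku-4-5',
  embedding: 'voyage-3-large',
  rerank: 'rerank-2',
};

// Hypothetical helper: resolve the model ID for a given role.
function modelFor(role: ModelRole): string {
  return MODEL_BY_ROLE[role];
}
```

Keeping the mapping in one typed record would make a model swap a single-line change, which matches the "tight stack" intent described above.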

Why Sonnet for streaming

Sonnet 4.5 offers the right cost/latency/quality trade-off for dialog driven by typed tool calls. It reliably picks tools from the registry, rarely hallucinates entities that are absent from memory_recall results, and keeps tone consistent across turns. Opus is a noticeable upgrade for long-form synthesis (weekly review), but it is 3-5× slower and 5× more expensive, which is overkill for chat turns.

Why Haiku for fact extraction

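Fact extraction runs asynchronously on every conversation turn. We need structured output (an array of fact objects), good-enough quality, and very low cost. Haiku hits all three. Calling generateObject with a Zod schema enforces the output shape: a response that does not match the schema raises an error instead of returning a malformed object.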
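
To make "enforced shape" concrete, here is a dependency-free sketch of the kind of check a schema performs on the model's output. It is a hand-rolled stand-in for the Zod schema, not the actual code in fact-extract.ts, and the `Fact` fields (`subject`, `predicate`, `object`) are an assumed shape chosen for illustration.

```typescript
// Hypothetical fact shape; the real schema is not shown in this document.
interface Fact {
  subject: string;
  predicate: string;
  object: string;
}

// Type guard: true only when every required field is present and is a string.
function isFact(value: unknown): value is Fact {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.subject === 'string' &&
    typeof v.predicate === 'string' &&
    typeof v.object === 'string'
  );
}

// Accepts only an array in which every element is a well-formed fact.
// Anything else throws, so malformed output never reaches downstream code.
function parseFacts(raw: unknown): Fact[] {
  if (!Array.isArray(raw)) {
    throw new Error('expected an array of facts');
  }
  raw.forEach((item, i) => {
    if (!isFact(item)) throw new Error(`fact at index ${i} has the wrong shape`);
  });
  return raw as Fact[];
}
```

With a schema library the guard and the static type are derived from one definition, which is the main reason to prefer it over hand-written checks like these.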

Why Voyage instead of OpenAI embeddings

voyage-3-large scores top-1 on MTEB knowledge retrieval benchmarks at 1024d, and the accompanying rerank-2 cross-encoder lifts top-K precision dramatically over dense-only recall. The cost is comparable to text-embedding-3-large. Swapping later is a one-file change in @soma/tools/shared/embed.ts, but there's no reason to.
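
The recall-then-rerank flow described above can be sketched as two stages: a cheap dense pass that selects a candidate pool by cosine similarity, then a more precise scorer that reorders that pool. This is an illustration only. The `Candidate` and `Reranker` types and the `retrieve` function are hypothetical, the reranker is injected as a plain function rather than a call to rerank-2, and the pool size of 3×K is one reading of "top-K over top-3K" in the table.

```typescript
type Candidate = { id: string; text: string; vector: number[] };

// Stand-in for a cross-encoder: returns one relevance score per document.
type Reranker = (query: string, docs: Candidate[]) => number[];

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Stage 1: dense recall. Keep the n candidates closest to the query vector.
function denseRecall(queryVec: number[], corpus: Candidate[], n: number): Candidate[] {
  return corpus
    .map((c) => ({ c, score: cosine(queryVec, c.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, n)
    .map((x) => x.c);
}

// Stage 2: rerank the pool and keep the best k.
function retrieve(
  query: string,
  queryVec: number[],
  corpus: Candidate[],
  rerank: Reranker,
  k: number,
): Candidate[] {
  const pool = denseRecall(queryVec, corpus, 3 * k); // assumed pool size: 3×K
  const scores = rerank(query, pool);
  return pool
    .map((c, i) => ({ c, score: scores[i] }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.c);
}
```

The reranker only ever sees the pool, so its per-document cost is bounded by 3×K regardless of corpus size.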

Agent definition

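// packages/agent/src/agents/soma.ts
import { Agent } from '@mastra/core/agent';
import { anthropic } from '@ai-sdk/anthropic';
import { allTools } from '@soma/tools';

// SYSTEM_PROMPT and the optional `memory` instance are defined elsewhere (not shown).
export const somaAgent = new Agent({
  name: 'soma',
  description: 'SOMA — personal AI assistant with knowledge-graph memory.',
  instructions: SYSTEM_PROMPT,
  model: anthropic('claude-sonnet-4-5-20250929'),
  tools: allTools,
  // Pass `memory` only when it is configured; otherwise omit the key entirely.
  ...(memory ? { memory } : {}),
});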

allTools is the dictionary of tool handles from @soma/tools/index.ts. See Tool registry.
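
One detail in the definition worth spelling out is the conditional spread on the `memory` line. The sketch below isolates that pattern with a hypothetical `AgentConfig` type and `buildConfig` helper (neither is part of Mastra or SOMA): when `memory` is undefined, the key is absent from the resulting object rather than present with an `undefined` value.

```typescript
// Hypothetical, simplified config type for illustration only.
type AgentConfig = { name: string; memory?: { id: string } };

function buildConfig(memory?: { id: string }): AgentConfig {
  return {
    name: 'soma',
    // Spreads { memory } when set, or an empty object when not.
    ...(memory ? { memory } : {}),
  };
}

const withoutMemory = buildConfig();
const withMemory = buildConfig({ id: 'm1' });

console.log('memory' in withoutMemory); // false
console.log('memory' in withMemory); // true
```

This matters for consumers that distinguish a missing key from an explicit `undefined`, for example code that checks `'memory' in config` or iterates over `Object.keys`.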