# RAG & Memory
Build agents with persistent memory and semantic search - from session-scoped caching to cloud-synced long-term knowledge bases.
## Why This Matters
AI agents often need to remember context across conversations, retrieve relevant knowledge from large document collections, and persist important information between sessions. TEA provides a comprehensive memory and RAG (Retrieval-Augmented Generation) stack that works locally for development and scales to cloud-native deployments without code changes.
## Quick Example

```yaml
name: rag-memory-agent
version: "1.0"

settings:
  ltm:
    backend: duckdb
    catalog:
      type: sqlite
      path: ":memory:"
    storage:
      uri: "./ltm_data/"
    inline_threshold: 1024

nodes:
  - id: remember
    action: memory.store
    key: "user_preference"
    value: "{{ state.preference }}"
    ttl: 3600  # expires in 1 hour

  - id: search
    action: vector.query
    query: "{{ state.question }}"
    k: 5
    collection: "knowledge_base"

  - id: answer
    action: llm.call
    model: gpt-4
    messages:
      - role: system
        content: |
          Use this context: {{ state.search.results | tojson }}
      - role: user
        content: "{{ state.question }}"

edges:
  - from: __start__
    to: remember
  - from: remember
    to: search
  - from: search
    to: answer
  - from: answer
    to: __end__
```
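The graph runs `remember` → `search` → `answer`: the agent caches the user's preference in session memory (for one hour, per the `ttl`), pulls the five most similar documents from the `knowledge_base` collection, and passes those results to the model as system context.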
## Memory Types

| Type | Backend | Persistence | Use Case |
|---|---|---|---|
| Short-Term | In-Memory | Session only | Working memory, TTL-based caching |
| Long-Term (Local) | SQLite | Survives restarts | Local knowledge base, development |
| Long-Term (Cloud) | DuckDB + Catalog | Cloud-synced | Production, serverless, multi-agent |
| Graph Memory | CozoDB / Kuzu | Local or cloud | Entity relationships, reasoning |
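The configuration examples later on this page cover the first three types but not graph memory. A minimal sketch, assuming a `graph` settings block; the key names and values below are assumptions, since this page does not show the graph schema:

```yaml
# Hypothetical sketch: the `graph` block and its keys are assumptions.
settings:
  graph:
    backend: cozo            # CozoDB; `kuzu` would select Kuzu (values assumed)
    path: ./agent_graph.db   # local database file (key assumed)
```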
## LTM Backends

| Backend | Use Case | Install |
|---|---|---|
| `sqlite` (default) | Local development, single-node | Built-in |
| `duckdb` | Analytics-heavy, catalog-aware, cloud storage | |
| `litestream` | SQLite with S3 replication | |
| `blob-sqlite` | Distributed with blob storage | |
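Switching backends is a settings change. The sketch below selects `litestream`; the replication key name is an assumption, modeled on Litestream's pattern of replicating a local SQLite file to an S3 replica URL:

```yaml
# Sketch only: `replica_url` is an assumed key name for the replication target.
settings:
  ltm:
    backend: litestream
    path: ./agent_memory.db                      # local SQLite file, as with the sqlite backend
    replica_url: "s3://my-bucket/ltm-replica/"   # assumed key
```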
## Catalog Backends (for DuckDB LTM)

| Catalog | Use Case | Install |
|---|---|---|
| `sqlite` (default) | Local, development | Built-in |
| `firestore` | Serverless, Firebase ecosystem | |
| `postgres` | Self-hosted, SQL compatibility | |
| `supabase` | Edge, REST API, managed Postgres | |
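By analogy with the Firestore and PostgreSQL examples further down this page, a Supabase catalog would look roughly like the sketch below; the `url` and `key` fields are assumptions modeled on Supabase's usual project URL and API key:

```yaml
# Sketch only: the field names under `catalog` are assumed, by analogy
# with the firestore/postgres examples below.
settings:
  ltm:
    backend: duckdb
    catalog:
      type: supabase
      url: "${SUPABASE_URL}"   # assumed key
      key: "${SUPABASE_KEY}"   # assumed key
    storage:
      uri: "s3://my-bucket/agents/ltm/"
```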
## Available Actions

### Memory Actions

| Action | Description |
|---|---|
| `memory.store` | Store key-value pairs with optional TTL (session-scoped) |
| | Retrieve values by key with a fallback default |
| | Condense conversation history using an LLM |
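`memory.store` appears in the Quick Example above; the retrieval node in the sketch below assumes an action id of `memory.retrieve`, which is illustrative only (the Memory Actions Reference has the exact ids):

```yaml
nodes:
  - id: cache_response
    action: memory.store        # confirmed by the Quick Example
    key: "api_response"
    value: "{{ state.response }}"
    ttl: 600                    # evict after 10 minutes

  - id: read_cache
    action: memory.retrieve     # assumed id for the key-lookup action
    key: "api_response"
    default: ""                 # assumed parameter: fallback when missing or expired
```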
### Long-Term Memory Actions

| Action | Description |
|---|---|
| | Persist key-value pairs to durable storage |
| | Fetch values from long-term storage |
| | Remove entries from storage |
| | Full-text search with FTS5 and metadata filtering |
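As a sketch of how these compose, assuming hypothetical action ids `ltm.store` and `ltm.search` (the real ids are in the Memory Actions Reference):

```yaml
# All action ids and parameters here are assumptions for illustration.
nodes:
  - id: persist_fact
    action: ltm.store           # assumed id: persist across sessions
    key: "favorite_language"
    value: "{{ state.language }}"

  - id: recall_facts
    action: ltm.search          # assumed id: FTS5 full-text search
    query: "{{ state.question }}"
    limit: 10
```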
### RAG Actions

| Action | Description |
|---|---|
| | Generate embeddings (OpenAI, Ollama, or custom) |
| | Store documents with embeddings and metadata |
| `vector.query` | Semantic similarity search with filtering |
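`vector.query` is confirmed by the Quick Example; the other two action ids and their parameters in the sketch below are assumptions for illustration:

```yaml
nodes:
  - id: embed_note
    action: embed.text          # assumed id: generate an embedding
    text: "{{ state.document }}"

  - id: index_note
    action: vector.upsert       # assumed id: store document + embedding + metadata
    collection: "knowledge_base"
    document: "{{ state.document }}"
    metadata:
      source: "user_upload"

  - id: find_similar
    action: vector.query        # confirmed by the Quick Example
    query: "{{ state.question }}"
    k: 5
    collection: "knowledge_base"
```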
### Graph Memory Actions

| Action | Description |
|---|---|
| | Store nodes with type, properties, and embeddings |
| | Create edges between entities |
| | Datalog/Cypher queries for graph traversals |
| | Get relevant subgraph for a query (N-hop expansion) |
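A sketch of a small ingest-and-retrieve flow; every action id and parameter below is an assumption, since this page does not name the graph actions:

```yaml
# All ids and parameters assumed for illustration only.
nodes:
  - id: add_person
    action: graph.add_node      # assumed id: store an entity with type + properties
    node_type: "Person"
    properties:
      name: "{{ state.name }}"

  - id: link_employer
    action: graph.add_edge      # assumed id: relate two entities
    from: "{{ state.person_id }}"
    to: "{{ state.company_id }}"
    relation: "WORKS_AT"

  - id: context
    action: graph.subgraph      # assumed id: N-hop expansion around a query
    query: "{{ state.question }}"
    hops: 2
```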
## Key Features

| Feature | Description |
|---|---|
| Zero Dependencies | In-memory backends work with pure Python |
| Pluggable Providers | OpenAI, Ollama, or custom embedding providers |
| Serverless Ready | DuckDB with a Firestore/Supabase catalog for cloud functions |
| Content Deduplication | SHA-256 hashing prevents duplicate storage |
| Small Data Inlining | Entries <1 KB stored directly in the catalog (fast access) |
| Cloud Storage | Direct S3/GCS/Azure I/O via the httpfs extension |
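Inlining is governed by the `inline_threshold` setting shown in the Quick Example: with `inline_threshold: 1024`, entries smaller than 1 KB are written into the catalog row itself, while larger payloads go to the configured `storage` URI.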
## Configuration Examples

### Local Development

```yaml
settings:
  ltm:
    backend: sqlite
    path: ./agent_memory.db
  rag:
    embedding_provider: ollama
    embedding_model: nomic-embed-text
    vector_store: memory
```
### Serverless Production (Firebase)

```yaml
settings:
  ltm:
    backend: duckdb
    catalog:
      type: firestore
      project: my-tea-project
    storage:
      uri: "gs://my-bucket/agents/ltm/"
    inline_threshold: 1024
  rag:
    embedding_provider: openai
    embedding_model: text-embedding-3-small
```
### Self-Hosted Production (PostgreSQL)

```yaml
settings:
  ltm:
    backend: duckdb
    catalog:
      type: postgres
      connection_string: "${POSTGRES_URL}"
    storage:
      uri: "s3://my-bucket/agents/ltm/"
      s3_region: us-east-1
```
## Embedding Providers

### OpenAI (Remote or Local-Compatible API)

```yaml
settings:
  rag:
    embedding_provider: openai
    embedding_model: text-embedding-3-small  # 1536 dims
    # openai_base_url: http://localhost:8000/v1  # for local OpenAI-compatible APIs
```
### Ollama (Local)

```yaml
settings:
  rag:
    embedding_provider: ollama
    embedding_model: nomic-embed-text  # 768 dims, 8K context
    ollama_base_url: http://localhost:11434
```
| Ollama Model | Dimensions | Context | Use Case |
|---|---|---|---|
| `nomic-embed-text` | 768 | 8,192 tokens | Long documents, balanced |
| `mxbai-embed-large` | 1024 | 512 tokens | High accuracy |
| `all-minilm` | 384 | 256 tokens | Lightweight, fast |
| `bge-m3` | 1024 | 8,192 tokens | Highest retrieval accuracy |
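Note that embeddings are only comparable within a single model: the dimensions and vector spaces above differ, so switching embedding models means re-embedding and re-indexing the collection before querying it.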
## Examples

- Knowledge Graph Agent - Graph memory with Datalog reasoning
- Conversational Agent - Short-term memory with summarization
- RAG Pipeline - Embedding + vector search workflow
## Learn More

- Memory Actions Reference - Full API documentation
- LTM Backend Guide - Backend selection and configuration
- Checkpoint Guide - Save/resume with memory state
- TEA-BUILTIN-001.1: Memory Actions - Implementation story
- TEA-BUILTIN-001.4: Long-Term Memory - LTM architecture
- TEA-BUILTIN-001.6: DuckDB LTM - Cloud-native backend
- TEA-BUILTIN-002.2: RAG Actions - Embedding and vector search