# RAG & Memory

Build agents with persistent memory and semantic search, from session-scoped caching to cloud-synced long-term knowledge bases.

## Why This Matters

AI agents often need to remember context across conversations, retrieve relevant knowledge from large document collections, and persist important information between sessions. TEA provides a comprehensive memory and RAG (Retrieval-Augmented Generation) stack that works locally for development and scales to cloud-native deployments without code changes.

## Quick Example

```yaml
name: rag-memory-agent
version: "1.0"

settings:
  ltm:
    backend: duckdb
    catalog:
      type: sqlite
      path: ":memory:"
    storage:
      uri: "./ltm_data/"
    inline_threshold: 1024

nodes:
  - id: remember
    action: memory.store
    key: "user_preference"
    value: "{{ state.preference }}"
    ttl: 3600  # expires in 1 hour

  - id: search
    action: vector.query
    query: "{{ state.question }}"
    k: 5
    collection: "knowledge_base"

  - id: answer
    action: llm.call
    model: gpt-4
    messages:
      - role: system
        content: |
          Use this context: {{ state.search.results | tojson }}
      - role: user
        content: "{{ state.question }}"

edges:
  - from: __start__
    to: remember
  - from: remember
    to: search
  - from: search
    to: answer
  - from: answer
    to: __end__
```

## Memory Types

| Type | Backend | Persistence | Use Case |
|------|---------|-------------|----------|
| Short-Term | In-Memory | Session only | Working memory, TTL-based caching |
| Long-Term (Local) | SQLite | Survives restarts | Local knowledge base, development |
| Long-Term (Cloud) | DuckDB + Catalog | Cloud-synced | Production, serverless, multi-agent |
| Graph Memory | CozoDB / Kuzu | Local or cloud | Entity relationships, reasoning |

## LTM Backends

| Backend | Use Case | Install |
|---------|----------|---------|
| `sqlite` (default) | Local development, single-node | Built-in |
| `duckdb` | Analytics-heavy, catalog-aware, cloud storage | `pip install duckdb fsspec` |
| `litestream` | SQLite with S3 replication | `pip install litestream` |
| `blob-sqlite` | Distributed with blob storage | `pip install fsspec` |
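The `sqlite` and `duckdb` backends are configured in the examples further below. For `blob-sqlite`, a minimal sketch, assuming it accepts the same `storage.uri` shape as the DuckDB backend (that shape is an assumption, not a documented setting for this backend):

```yaml
settings:
  ltm:
    backend: blob-sqlite
    storage:
      uri: "s3://my-bucket/agents/ltm/"  # assumed: any fsspec-compatible URI
```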

## Catalog Backends (for DuckDB LTM)

| Catalog | Use Case | Install |
|---------|----------|---------|
| `sqlite` (default) | Local, development | Built-in |
| `firestore` | Serverless, Firebase ecosystem | `pip install firebase-admin` |
| `postgres` | Self-hosted, SQL compatibility | `pip install psycopg2` |
| `supabase` | Edge, REST API, managed Postgres | `pip install requests` |
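The Firestore and Postgres catalogs appear in the configuration examples below. For Supabase, a hedged sketch; the `url` and `api_key` keys are assumptions modeled on Supabase's REST credentials, not confirmed settings:

```yaml
settings:
  ltm:
    backend: duckdb
    catalog:
      type: supabase
      url: "${SUPABASE_URL}"      # assumed field name
      api_key: "${SUPABASE_KEY}"  # assumed field name
    storage:
      uri: "s3://my-bucket/agents/ltm/"
```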

## Available Actions

### Memory Actions

| Action | Description |
|--------|-------------|
| `memory.store` | Store key-value pairs with optional TTL (session-scoped) |
| `memory.retrieve` | Retrieve values by key with a fallback default |
| `memory.summarize` | Condense conversation history using an LLM |
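Putting these together: the sketch below reads a value back with a fallback and condenses history once it grows. Whether `memory.retrieve` takes a `default` field and `memory.summarize` reads history via a `messages` field are assumptions based on the descriptions above, not confirmed API:

```yaml
nodes:
  - id: load_pref
    action: memory.retrieve
    key: "user_preference"
    default: "no preference recorded"  # assumed fallback field

  - id: condense
    action: memory.summarize
    messages: "{{ state.history }}"    # assumed input field
```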

### Long-Term Memory Actions

| Action | Description |
|--------|-------------|
| `ltm.store` | Persist key-value pairs to durable storage |
| `ltm.retrieve` | Fetch values from long-term storage |
| `ltm.delete` | Remove entries from storage |
| `ltm.search` | Full-text search with FTS5 and metadata filtering |
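The same store/retrieve pattern applies to durable storage, plus full-text lookup. A sketch; the `filters` shape on `ltm.search` is an assumption (the doc only states that FTS5 and metadata filtering are supported):

```yaml
nodes:
  - id: persist
    action: ltm.store
    key: "project_notes"
    value: "{{ state.notes }}"

  - id: find
    action: ltm.search
    query: "deployment checklist"
    filters:            # assumed metadata-filter shape
      topic: "ops"
```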

### RAG Actions

| Action | Description |
|--------|-------------|
| `embedding.create` | Generate embeddings (OpenAI, Ollama, or custom) |
| `vector.store` | Store documents with embeddings and metadata |
| `vector.query` | Semantic similarity search with filtering |
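A typical ingestion-then-query flow pairs `vector.store` with the `vector.query` call from the Quick Example. A sketch; the `documents` and `metadata` field names on `vector.store` are assumptions:

```yaml
nodes:
  - id: ingest
    action: vector.store
    collection: "knowledge_base"
    documents: "{{ state.docs }}"  # assumed field name
    metadata:                      # assumed field name
      source: "handbook"

  - id: lookup
    action: vector.query
    query: "{{ state.question }}"
    k: 5
    collection: "knowledge_base"
```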

### Graph Memory Actions

| Action | Description |
|--------|-------------|
| `graph.store_entity` | Store nodes with type, properties, and embeddings |
| `graph.store_relation` | Create edges between entities |
| `graph.query` | Datalog/Cypher queries for graph traversals |
| `graph.retrieve_context` | Get the relevant subgraph for a query (N-hop expansion) |
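A sketch of building and querying a small graph. The field names `entity_type`, `properties`, `from`/`to`, `relation`, and `hops`, and the entity-identifier format, are assumptions based on the action descriptions above:

```yaml
nodes:
  - id: add_person
    action: graph.store_entity
    entity_type: "person"  # assumed field name
    properties:
      name: "Ada"

  - id: add_link
    action: graph.store_relation
    from: "person:ada"     # assumed identifier format
    to: "org:tea"
    relation: "works_on"

  - id: context
    action: graph.retrieve_context
    query: "{{ state.question }}"
    hops: 2                # assumed N-hop expansion parameter
```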

## Key Features

| Feature | Description |
|---------|-------------|
| Zero Dependencies | In-memory backends work with pure Python |
| Pluggable Providers | OpenAI, Ollama, or custom embedding providers |
| Serverless Ready | DuckDB with a Firestore/Supabase catalog for cloud functions |
| Content Deduplication | SHA-256 hashing prevents duplicate storage |
| Small Data Inlining | Entries under 1 KB (`inline_threshold`) are stored directly in the catalog for fast access |
| Cloud Storage | Direct S3/GCS/Azure I/O via the `httpfs` extension |

## Configuration Examples

### Local Development

```yaml
settings:
  ltm:
    backend: sqlite
    path: ./agent_memory.db
  rag:
    embedding_provider: ollama
    embedding_model: nomic-embed-text
    vector_store: memory
```

### Serverless Production (Firebase)

```yaml
settings:
  ltm:
    backend: duckdb
    catalog:
      type: firestore
      project: my-tea-project
    storage:
      uri: "gs://my-bucket/agents/ltm/"
    inline_threshold: 1024
  rag:
    embedding_provider: openai
    embedding_model: text-embedding-3-small
```

### Self-Hosted Production (PostgreSQL)

```yaml
settings:
  ltm:
    backend: duckdb
    catalog:
      type: postgres
      connection_string: "${POSTGRES_URL}"
    storage:
      uri: "s3://my-bucket/agents/ltm/"
      s3_region: us-east-1
```

## Embedding Providers

### OpenAI (Remote or Local-Compatible API)

```yaml
settings:
  rag:
    embedding_provider: openai
    embedding_model: text-embedding-3-small  # 1536 dims
    # openai_base_url: http://localhost:8000/v1  # for local OpenAI-compatible APIs
```

### Ollama (Local)

```yaml
settings:
  rag:
    embedding_provider: ollama
    embedding_model: nomic-embed-text  # 768 dims, 8K context
    ollama_base_url: http://localhost:11434
```

| Ollama Model | Dimensions | Context | Use Case |
|--------------|------------|---------|----------|
| `nomic-embed-text` | 768 | 8,192 tokens | Long documents, balanced |
| `mxbai-embed-large` | 1024 | 512 tokens | High accuracy |
| `all-minilm` | 384 | 256 tokens | Lightweight, fast |
| `bge-m3` | 1024 | 8,192 tokens | Highest retrieval accuracy |

## Examples

## Learn More