# Reflection Actions
Parent document: Actions Index · Story: TEA-AGENT-001.2 (Reflection Loop)
## Overview

Reflection actions provide self-correcting agent capabilities through generate-evaluate-correct loops. These patterns enable agents to iteratively improve their outputs based on structured feedback.

Key Features:

- Multiple evaluator types (schema, LLM, custom code)
- Configurable failure strategies
- Full iteration history tracking
- Type coercion for schema validation
- Lua and Prolog evaluator support
## Actions

| Action | Description |
|---|---|
| `reflection.loop` | Execute generate→evaluate→correct cycle |
| `reflection.evaluate` | Standalone evaluation action |
| `reflection.correct` | Standalone correction action |
## reflection.loop

Execute an iterative generate→evaluate→correct loop until the output passes validation or the maximum number of iterations is reached.
### Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `generator` | object | Yes | - | Generator configuration (action or run) |
| `evaluator` | object | Yes | - | Evaluator configuration (type + params) |
| `corrector` | object | No | - | Corrector configuration (action or run) |
| `max_iterations` | int | No | | Maximum correction attempts |
| `on_failure` | string | No | | Strategy when max iterations reached |
### Generator Configuration

The generator produces the output to be evaluated:

```yaml
generator:
  # Option 1: Use an action
  action: llm.call
  model: gpt-4
  prompt: "Generate a JSON user profile..."
  # Option 2: Inline Python code
  run: |
    import json
    return json.dumps({"name": "test", "email": "test@example.com"})
```
### Evaluator Configuration

Three evaluator types are supported:
#### Schema Evaluator

Uses JSON Schema validation:

```yaml
evaluator:
  type: schema
  schema:
    type: object
    required: [name, email]
    properties:
      name:
        type: string
        minLength: 1
      email:
        type: string
        pattern: "^[^@]+@[^@]+\\.[^@]+$"
```
Features:

- Automatic type coercion (string→int, string→boolean), sketched below
- External schema references via `$ref`
- Detailed error paths and suggestions
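The coercion and error-path behavior can be approximated outside the framework. The following is a minimal Python sketch using the `jsonschema` library; the coercion rules, function names, and result shape are illustrative assumptions, not the framework's exact implementation.

```python
# Illustrative sketch of schema evaluation with pre-validation type coercion.
# Not the framework's code; `coerce` and `evaluate_schema` are hypothetical names.
from jsonschema import Draft7Validator

def coerce(value, schema):
    """Best-effort string→int and string→boolean coercion before validation."""
    expected = schema.get("type")
    if isinstance(value, str):
        if expected == "integer":
            try:
                return int(value)
            except ValueError:
                return value
        if expected == "boolean" and value.lower() in ("true", "false"):
            return value.lower() == "true"
    if expected == "object" and isinstance(value, dict):
        props = schema.get("properties", {})
        return {k: coerce(v, props.get(k, {})) for k, v in value.items()}
    return value

def evaluate_schema(data, schema):
    """Validate after coercion and report error paths in the style shown on this page."""
    data = coerce(data, schema)
    errors = [
        {"path": "/".join(str(p) for p in err.absolute_path), "message": err.message}
        for err in Draft7Validator(schema).iter_errors(data)
    ]
    return {"valid": not errors, "score": 1.0 if not errors else 0.0, "errors": errors}
```

For example, `evaluate_schema({"age": "42"}, {"type": "object", "properties": {"age": {"type": "integer"}}})` coerces `"42"` to `42` before validating.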
#### LLM Evaluator

Uses LLM-as-judge:

```yaml
evaluator:
  type: llm
  model: gpt-4
  prompt: |
    Evaluate if this JSON is a valid user profile:
    {{ state.reflection_output | tojson }}
    Requirements:
    - Has name and email fields
    - Email looks valid
    - No placeholder data
  examples:
    - input: '{"name": "John", "email": "john@test.com"}'
      output: '{"valid": true, "score": 1.0, "reason": "Valid profile"}'
    - input: '{"name": "", "email": "invalid"}'
      output: '{"valid": false, "score": 0.2, "reason": "Empty name, invalid email"}'
```
#### Custom Evaluator

Uses inline Python, Lua, or Prolog code:

```yaml
evaluator:
  type: custom
  language: python  # or: lua, prolog
  run: |
    import json
    try:
        data = json.loads(output) if isinstance(output, str) else output
        has_name = bool(data.get("name"))
        has_email = "@" in data.get("email", "")
        score = (has_name + has_email) / 2
        result = {
            "valid": score >= 1.0,
            "score": score,
            "errors": [] if score >= 1.0 else [{"message": "Missing required fields"}],
            "suggestions": ["Add name field"] if not has_name else []
        }
    except Exception:
        result = {"valid": False, "score": 0.0, "errors": [{"message": "Invalid JSON"}]}
```
### Corrector Configuration

The corrector attempts to fix invalid output:

```yaml
corrector:
  action: llm.call
  model: gpt-4
  prompt: |
    Fix this JSON based on the validation errors:
    Original: {{ state.reflection_output | tojson }}
    Errors:
    {% for error in state.reflection_errors %}
    - {{ error.message }}
    {% endfor %}
    Return only the corrected JSON, no explanation.
```
### On-Failure Strategies

| Strategy | Description |
|---|---|
| `return_best` | Return the attempt with the highest score |
| | Return the final attempt |
| `raise` | Raise `ReflectionFailedError` |
### State Variables

The loop sets these state variables during execution:

| Variable | Type | Description |
|---|---|---|
| `reflection_iteration` | int | Current iteration (1-based) |
| `reflection_output` | any | Current generator output |
| `reflection_errors` | list | Errors from the current evaluation |
| `reflection_history` | list | All attempts with outputs and scores |
| `reflection_best` | any | Best output seen so far |
| `reflection_best_score` | float | Score of the best output (0.0-1.0) |
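Putting the pieces together, the control flow of the loop and the way these variables evolve can be sketched in plain Python. This is a conceptual model only, not the framework's implementation; `generate`, `evaluate`, and `correct` stand in for the configured generator, evaluator, and corrector.

```python
def reflection_loop(generate, evaluate, correct=None, max_iterations=3, on_failure="return_best"):
    """Conceptual sketch of the generate→evaluate→correct cycle (illustrative only)."""
    history, best, best_score = [], None, -1.0
    output = generate()
    for iteration in range(1, max_iterations + 1):
        result = evaluate(output)  # expected shape: {"valid", "score", "errors", ...}
        history.append({"iteration": iteration, "output": output,
                        "score": result["score"], "valid": result["valid"],
                        "errors": result.get("errors", [])})
        if result["score"] > best_score:  # tracks reflection_best / reflection_best_score
            best, best_score = output, result["score"]
        if result["valid"]:
            return {"success": True, "valid": True, "reflection_iteration": iteration,
                    "reflection_output": output, "reflection_history": history,
                    "reflection_best": best, "reflection_best_score": best_score}
        if correct is not None and iteration < max_iterations:
            output = correct(output, result.get("errors", []))  # attempt a fix, then re-evaluate
    # Max iterations exhausted: apply the configured on_failure strategy.
    if on_failure == "raise":
        raise RuntimeError("reflection failed")  # the framework raises ReflectionFailedError here
    final = best if on_failure == "return_best" else output
    return {"success": False, "valid": False, "exhausted": True,
            "reflection_iteration": max_iterations, "reflection_output": final,
            "reflection_history": history, "reflection_best": best,
            "reflection_best_score": best_score}
```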
### Return Value

```
# On success (valid output produced)
{
  "success": true,
  "valid": true,
  "reflection_iteration": 2,
  "reflection_output": {"name": "John", "email": "john@example.com"},
  "reflection_errors": [],
  "reflection_history": [...],
  "reflection_best": {"name": "John", "email": "john@example.com"},
  "reflection_best_score": 1.0,
  # Plus all keys from output if it's a dict
  "name": "John",
  "email": "john@example.com"
}

# On failure (max iterations exhausted)
{
  "success": false,
  "valid": false,
  "exhausted": true,
  "reflection_iteration": 3,
  "reflection_output": {...},
  "reflection_errors": [...],
  "reflection_history": [...],
  "reflection_best": {...},
  "reflection_best_score": 0.7
}
```
### Complete Example

```yaml
name: json-generator-with-reflection
description: Generate valid JSON with self-correction

nodes:
  - name: generate_profile
    uses: reflection.loop
    with:
      generator:
        action: llm.call
        model: gpt-4
        prompt: |
          Generate a JSON user profile with name, email, and age.
          User request: {{ state.request }}
          Return only valid JSON.
      evaluator:
        type: schema
        schema:
          type: object
          required: [name, email, age]
          properties:
            name:
              type: string
              minLength: 1
            email:
              type: string
              pattern: "^[^@]+@[^@]+\\.[^@]+$"
            age:
              type: integer
              minimum: 0
              maximum: 150
      corrector:
        action: llm.call
        model: gpt-4
        prompt: |
          Fix this JSON based on validation errors:
          Original: {{ state.reflection_output | tojson }}
          Errors:
          {% for error in state.reflection_errors %}
          - Path: {{ error.path }}, Message: {{ error.message }}
          {% endfor %}
          Return only corrected JSON.
      max_iterations: 3
      on_failure: return_best
    output: profile

  - name: use_profile
    run: |
      if state.get("success"):
          return {"message": f"Created profile for {state['name']}"}
      else:
          return {"message": "Failed to generate valid profile", "best_attempt": state.get("reflection_best")}
```
## reflection.evaluate

Standalone evaluation action for use outside the loop.
### Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `data` | any | No | | Data to evaluate |
| `evaluator_type` | string | No | | Type: `schema`, `llm`, or `custom` |
| `schema` | object | Conditional | - | JSON Schema (for `schema` type) |
| `prompt` | string | Conditional | - | Evaluation prompt (for `llm` type) |
| `model` | string | No | From state | LLM model (for `llm` type) |
| `examples` | list | No | | Few-shot examples (for `llm` type) |
| `run` | string | Conditional | - | Code (for `custom` type) |
| `language` | string | No | | Code language (for `custom` type) |
### Example

```yaml
nodes:
  - name: validate_output
    uses: reflection.evaluate
    with:
      data: "{{ state.generated_json }}"
      evaluator_type: schema
      schema:
        type: object
        required: [status, data]
    output: validation

  - name: check_result
    run: |
      if state["validation"]["valid"]:
          return {"status": "valid"}
      else:
          return {"status": "invalid", "errors": state["validation"]["errors"]}
```
## reflection.correct

Standalone correction action.
### Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `data` | any | No | | Data to correct |
| `errors` | list | No | | Validation errors |
| `action` | string | Conditional | - | Action to use |
| `run` | string | Conditional | - | Inline Python code |
| `prompt` | string | Conditional | - | LLM prompt (uses `llm.call`) |
### Example

```yaml
nodes:
  - name: fix_json
    uses: reflection.correct
    with:
      data: "{{ state.invalid_json }}"
      errors: "{{ state.validation_errors }}"
      prompt: |
        Fix this JSON:
        {{ state.invalid_json | tojson }}
        Errors: {{ state.validation_errors | tojson }}
    output: corrected
```
## Error Handling

### ReflectionFailedError

Raised when `on_failure: raise` is set and the maximum number of iterations is exhausted:

```python
from the_edge_agent.actions import ReflectionFailedError

try:
    result = graph.invoke({"request": "create user"})
except ReflectionFailedError as e:
    print(f"Reflection failed: {e}")
    print(f"History: {e.history}")
    # Each history entry: {"iteration": int, "output": any, "score": float, "valid": bool, "errors": list}
```
### Graceful Degradation

For non-critical workflows, use `on_failure: return_best`:

```yaml
- name: generate_with_fallback
  uses: reflection.loop
  with:
    generator:
      action: llm.call
      prompt: "Generate content..."
    evaluator:
      type: schema
      schema: {type: object, required: [content]}
    max_iterations: 3
    on_failure: return_best  # Return best attempt instead of failing
  output: result

- name: handle_result
  run: |
    if state.get("success"):
        return {"content": state["result"]["content"]}
    else:
        # Use best attempt or fallback
        best = state.get("reflection_best", {})
        return {"content": best.get("content", "Default content")}
```
## Best Practices

- Set appropriate `max_iterations`: 3-5 is usually sufficient; more can waste tokens
- Use schema validation first: it is faster and deterministic
- Provide good corrector prompts: include the specific errors and examples
- Track iteration history: use `reflection_history` for debugging (see the snippet below)
- Consider fallbacks: use `on_failure: return_best` for non-critical paths
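For example, a follow-up node's `run` block (or any code holding the final state) can dump the history for inspection. A small illustrative snippet, using the history entry fields documented under Error Handling above:

```python
# Illustrative: print a one-line summary per reflection attempt for debugging.
for entry in state.get("reflection_history", []):
    status = "PASS" if entry["valid"] else "FAIL"
    print(f"[{status}] iteration {entry['iteration']}: score={entry['score']:.2f}")
    for error in entry.get("errors", []):
        print(f"    - {error.get('message')}")
```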
## See Also

- Reasoning Actions - Chain-of-Thought, ReAct patterns
- Planning Actions - Task decomposition with correction
- Agent Actions - Multi-agent with reflection