OpenSRE's episodic memory is a system that remembers past investigations and uses them to guide future ones. After every investigation, OpenSRE extracts structured metadata from the outcome and stores it. When a similar incident occurs, this stored knowledge is retrieved and injected into the new investigation's context — like a senior SRE recalling what worked last time.
Without episodic memory, every incident investigation starts from scratch. An AI agent has no knowledge of past outages, no patterns to recognize, no proven approaches to try first. The first investigation of a payments-service timeout takes as long as the tenth.
With episodic memory, OpenSRE builds institutional knowledge:
The writeup node produces a structured incident report.
An LLM extracts structured metadata from the investigation, including labels such as high_error_rate, pod_crashloop, and latency_spike.
The episode is stored in PostgreSQL via the config-service API. All investigations are stored, not just resolved ones, because unresolved incidents are valuable for learning too.
Before a new investigation begins, init_context queries episodic memory for similar past episodes using weighted scoring:
| Factor | Weight |
|--------|--------|
| Alert type match | 0.5 |
| Service overlap | 0.3 |
| Resolution status | 0.2 |
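The weighted scoring can be sketched roughly as follows. Only the three weights come from the table; the `Episode` fields, the use of Jaccard overlap for services, and the choice to give resolved episodes the full resolution score are assumptions made for illustration, not OpenSRE's actual implementation.

```python
from dataclasses import dataclass, field

# Weights taken from the scoring table.
WEIGHTS = {"alert_type": 0.5, "service_overlap": 0.3, "resolution": 0.2}


@dataclass
class Episode:
    """Hypothetical episode record; the field names are assumptions."""
    alert_type: str
    services: set[str] = field(default_factory=set)
    resolved: bool = False
    summary: str = ""


def similarity(alert_type: str, services: set[str], episode: Episode) -> float:
    """Score a stored episode against a new incident, in the range 0.0 to 1.0."""
    # Alert type match: assumed to be an exact string comparison.
    alert_score = 1.0 if episode.alert_type == alert_type else 0.0

    # Service overlap: assumed to be Jaccard similarity of the service sets.
    union = services | episode.services
    overlap_score = len(services & episode.services) / len(union) if union else 0.0

    # Resolution status: assumed to favor episodes that were resolved.
    resolution_score = 1.0 if episode.resolved else 0.0

    return (
        WEIGHTS["alert_type"] * alert_score
        + WEIGHTS["service_overlap"] * overlap_score
        + WEIGHTS["resolution"] * resolution_score
    )


def top_episodes(alert_type: str, services: set[str],
                 episodes: list[Episode], limit: int = 3) -> list[tuple[float, Episode]]:
    """Return the highest-scoring episodes, best match first."""
    scored = [(similarity(alert_type, services, ep), ep) for ep in episodes]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]
```

Under these assumptions, an unresolved episode with a matching alert type still outranks a resolved episode of a different type, because the alert type weight (0.5) exceeds the resolution weight (0.2).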
The top matching episodes are formatted and injected into the planner's context. The planner can see: "Last time this alert fired on payments-service, it was a database connection pool issue resolved by restarting the connection pool manager."
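As a rough illustration of the injection step, the matched episodes could be rendered into a short text block that is prepended to the planner's prompt. The function name, the dictionary keys, and the output layout below are all hypothetical; the source does not specify the format OpenSRE uses.

```python
def format_episode_context(episodes: list[dict], limit: int = 3) -> str:
    """Render past episodes as plain text for the planner's context.

    Each episode is assumed to be a dict with the hypothetical keys
    'alert_type', 'services', 'resolved', and 'summary'.
    """
    if not episodes:
        return ""

    lines = ["Similar past incidents:"]
    for ep in episodes[:limit]:
        status = "resolved" if ep.get("resolved") else "unresolved"
        services = ", ".join(sorted(ep.get("services", [])))
        lines.append(f"- [{status}] {ep['alert_type']} on {services}: {ep['summary']}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_episode_context([
        {
            "alert_type": "latency_spike",
            "services": ["payments-service"],
            "resolved": True,
            "summary": "Database connection pool issue; restarted the pool manager.",
        }
    ]))
```

Keeping unresolved episodes visible, with an explicit status tag, matches the point that unresolved incidents are stored and are useful for learning too.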
When 2 or more episodes share the same alert type, OpenSRE automatically generates a reusable investigation strategy. This strategy captures the common investigation path: which skills to run first, what patterns to look for, which services to prioritize.
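The threshold rule can be sketched as a grouping step. The two-episode threshold comes from the text; the episode keys (`skills_used`, `services`) and the idea of ranking skills and services by how many episodes mention them are assumptions for illustration, since the source does not describe how a strategy is derived.

```python
from collections import Counter, defaultdict

# Threshold stated in the text: two or more episodes with the same alert type.
MIN_EPISODES_FOR_STRATEGY = 2


def build_strategies(episodes: list[dict], top_n: int = 3) -> dict[str, dict]:
    """Group episodes by alert type and summarize the common investigation path.

    Each episode is assumed to be a dict with the hypothetical keys
    'alert_type', 'skills_used', and 'services'.
    """
    by_alert: dict[str, list[dict]] = defaultdict(list)
    for ep in episodes:
        by_alert[ep["alert_type"]].append(ep)

    strategies: dict[str, dict] = {}
    for alert_type, group in by_alert.items():
        if len(group) < MIN_EPISODES_FOR_STRATEGY:
            continue  # not enough history to generalize from

        # Count each skill or service once per episode, so one noisy
        # investigation cannot dominate the ranking.
        skill_counts = Counter(s for ep in group for s in set(ep.get("skills_used", [])))
        service_counts = Counter(s for ep in group for s in set(ep.get("services", [])))

        strategies[alert_type] = {
            "episode_count": len(group),
            "skills_to_run_first": [s for s, _ in skill_counts.most_common(top_n)],
            "services_to_prioritize": [s for s, _ in service_counts.most_common(top_n)],
        }
    return strategies
```

An alert type seen only once produces no strategy in this sketch, which matches the rule that a strategy needs at least two episodes to draw on.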
| After N investigations | What improves |
|------------------------|--------------|
| 1 | Baseline performance |
| 2-3 | Similar incidents get context from past episodes |
| 5+ | Strategies auto-generate for common alert types |
| 10+ | High accuracy pattern recognition for recurring issues |
The web console at http://localhost:3002 includes an episodic memory browser: