Memory Storage Model
This is the canonical mental model for memories in xysq. If you’re touching anything in backend/memory/, backend/services/memory.py, backend/api/routes/memories.py, or backend/chat/wiring.py, read this first.
The two stores
xysq stores user memories in two places, by design:
- Postgres (memories table): durable source of truth for what the user wrote, what they tagged it with, and what state it’s in (pending/processing/completed/failed). The vault list endpoint reads from here.
- Hindsight (per-bank document store): the AI memory provider. Stores the original document text plus N atomic memory units the LLM extracted from it. Recall ranks units; mutations operate on documents.

Each memories row corresponds to exactly one Hindsight document, identified by the row’s document_id column. We send document_id at retain time; Hindsight indexes by it; we use it for all subsequent operations.
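The one-row-to-one-document relationship can be sketched as follows. This is a minimal illustration, not the real service code: MemoryRow, FakeHindsight, and create_memory are hypothetical stand-ins, and setting status to processing right before the retain call is an assumption about ordering.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class MemoryRow:
    """Hypothetical stand-in for a row in the Postgres memories table."""
    document_id: str
    text: str
    tags: list[str]
    status: str = "pending"


@dataclass
class FakeHindsight:
    """In-memory stand-in for one Hindsight bank, keyed by document_id."""
    documents: dict[str, str] = field(default_factory=dict)

    def retain(self, document_id: str, text: str) -> None:
        self.documents[document_id] = text


def create_memory(db: dict[str, MemoryRow], hindsight: FakeHindsight,
                  text: str, tags: list[str]) -> MemoryRow:
    # 1. Create the Postgres row first, so the document always has an owner row.
    row = MemoryRow(document_id=str(uuid.uuid4()), text=text, tags=list(tags))
    db[row.document_id] = row
    # 2. Send the same document_id at retain time. Extraction is asynchronous,
    #    so the row is assumed to sit in "processing" until the webhook fires.
    row.status = "processing"
    hindsight.retain(row.document_id, text)
    return row
```

Because both stores share document_id, any later update or delete can address the Hindsight document directly from the row.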
Document vs memory unit
Bank routing
The Hindsight bank for a given user item (memory or knowledge source) is deterministic from the row state. Privacy tags take precedence over the item’s default bank.

Personal items:

| Tags include pii? | Tags include confidential? | Item is knowledge_sources? | Bank(s) |
|---|---|---|---|
| ✓ | ✗ | any | {user_id}:private |
| ✗ | ✓ | any | {user_id}:confidential |
| ✓ | ✓ | any | {user_id}:private AND {user_id}:confidential (fan out) |
| ✗ | ✗ | ✓ | {user_id}:wiki (knowledge default) |
| ✗ | ✗ | ✗ | {user_id} (memory default) |

Team items:

| Item is knowledge_sources? | Bank(s) |
|---|---|
| ✓ | team:{owner_id}:wiki |
| ✗ | team:{owner_id} |
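
The routing tables above can be read as a single pure function. The sketch below is one possible shape for resolve_banks_for_row, not the actual implementation. It assumes rows are dict-like with user_id, owner_id, tags, and vault_type keys, that vault_type == "team" marks a team item, and that default_bank is either 'main' or 'wiki'. The tables do not say how privacy tags interact with team items, so the sketch routes team items by their default only.

```python
from typing import Any, Mapping

# Privacy tag -> bank suffix, in the order used when fanning out.
PRIVACY_SUFFIXES = {"pii": "private", "confidential": "confidential"}


def resolve_banks_for_row(row: Mapping[str, Any], *, default_bank: str) -> list[str]:
    """Return the Hindsight bank(s) for a row; default_bank is 'main' or 'wiki'."""
    # The main bank has no suffix; the knowledge default adds ":wiki".
    suffix = "" if default_bank == "main" else f":{default_bank}"

    # Assumption: team items are routed by their default bank only.
    if row.get("vault_type") == "team":
        return [f"team:{row['owner_id']}{suffix}"]

    user_id = row["user_id"]
    tags = set(row.get("tags", ()))
    # Privacy tags take precedence; both tags present means fan out to both banks.
    privacy = [f"{user_id}:{s}" for t, s in PRIVACY_SUFFIXES.items() if t in tags]
    return privacy or [f"{user_id}{suffix}"]
```

For example, a personal knowledge source tagged pii resolves to the private bank only, while an untagged one resolves to the wiki bank.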
Adding pii to a knowledge source moves it from :wiki to :private. Removing pii from a memory moves it from :private back to main. The update_memory_tags and update_knowledge_tags route handlers must compute the new banks under the new tags, PATCH whatever stays, retain into newly-targeted banks, and delete from no-longer-targeted banks. This is the same fan-out logic on both sides: a memory and a knowledge source are routed by the same rule, just with different defaults when no privacy tag is present.
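The PATCH / retain / delete split is a set difference between the banks under the old tags and the banks under the new tags. A minimal sketch, assuming the two bank lists have already been computed; plan_bank_changes is a hypothetical helper name, not an existing function:

```python
from typing import Iterable


def plan_bank_changes(old_banks: Iterable[str], new_banks: Iterable[str]) -> dict[str, list[str]]:
    """Split banks into those to PATCH, retain into, and delete from."""
    old, new = set(old_banks), set(new_banks)
    return {
        "patch": sorted(old & new),   # document stays here: update tags in place
        "retain": sorted(new - old),  # newly targeted: retain the document
        "delete": sorted(old - new),  # no longer targeted: delete the document
    }
```

Removing pii from a memory gives old_banks=["u1:private"] and new_banks=["u1"], so the plan retains into u1 and deletes from u1:private, with nothing to PATCH.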
Never iterate banks at mutation time. If a row is in either table, its bank(s) are computed by resolve_banks_for_row(row, *, default_bank) — single deterministic answer. If you need to operate on an item and the answer is “I don’t know what bank”, that’s a data inconsistency bug, not a “try them all” situation.
What memories.hindsight_id was, and why it’s gone
The column tried to store the per-unit Hindsight memory id so we could call per-unit endpoints (PATCH /memories/{memory_id}/tags). But:
- Hindsight’s RetainResponse has no per-unit id field, so the column was always NULL.
- Even if we could populate it, a “memory” has N units, so operating on one updated tags on 1/N facts and silently left the others stale.
What the webhook does
The retain.completed webhook from Hindsight fires once per document after extraction finishes. The handler matches the memories row by document_id and flips status from processing → completed. That’s it. It does not write a hindsight_id, because there’s nothing useful to write.
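The handler's entire job fits in a few lines. The sketch below uses an in-memory dict in place of the memories table and assumes the webhook payload carries a document_id field; the payload shape and the guard that only flips rows currently in processing are both assumptions.

```python
def handle_retain_completed(rows_by_document_id: dict[str, dict], payload: dict) -> bool:
    """Flip the matching row from processing to completed; return True if it changed."""
    row = rows_by_document_id.get(payload.get("document_id"))
    # Unknown document, or a row that is not waiting on extraction: do nothing.
    if row is None or row["status"] != "processing":
        return False
    row["status"] = "completed"
    return True
```

Nothing else is written back: the row already holds document_id, which is the only identifier later operations need.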
What asset uploads do (and don’t do)
When a user uploads a file via Organise:
- assets table row created (file metadata, GCS URI, extraction status)
- Hindsight retain to {user_id}:wiki bank (where the extracted content lives for recall)
- knowledge_sources table row of type='document' (so the file appears in the Knowledge Base scope of the unified vault)

An asset upload does not create a memories row. Asset uploads are knowledge content; they belong in knowledge_sources. The pre-2026-05-11 dual-surfacing pattern (asset → memories row + knowledge_sources row + Hindsight wiki) created rows whose vault_type='personal' claimed they were main-vault memories but whose actual content was in the wiki bank, a categorical lie. That pattern is removed; existing leaked rows are cleaned up by the cleanup_asset_memories.py one-shot.
Operational invariants
These should hold after the refactor lands. Add tests / asserts that catch violations:
- Every memories row with status='completed' has a corresponding Hindsight document at the bank computed by resolve_banks_for_row(row, default_bank='main'). (Holds modulo the orphan cleanup; the mark_orphan_memories_failed.py one-shot promotes violations to status='failed' so the UI surfaces them.)
- No memories row has role='asset' in tags or metadata after the cleanup runs. Asset uploads only live in knowledge_sources.
- memories.hindsight_id column does not exist (after Phase 5 migration).
- Every Hindsight document in a user’s bank has either a memories row (with matching document_id) or a knowledge_sources row. Backfilled by backfill_ghost_documents.py; ongoing invariant maintained because retain creates the Postgres row before calling Hindsight.
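
As a starting point for such tests, the first and last invariants can be checked over plain in-memory snapshots. This is a sketch under assumptions: find_invariant_violations is a hypothetical helper, rows are dicts, knowledge_sources rows are assumed to carry a document_id, hindsight_docs maps each bank name to the set of document ids it holds, and resolve_banks is passed in as a callable so the check stays independent of the real resolver.

```python
from typing import Callable, Iterable


def find_invariant_violations(
    memories: Iterable[dict],
    knowledge_sources: Iterable[dict],
    hindsight_docs: dict[str, set[str]],
    resolve_banks: Callable[[dict], list[str]],
) -> list[str]:
    """Return human-readable descriptions of invariant violations."""
    memories = list(memories)
    violations = []

    # Invariant: every completed memory has a document in each resolved bank.
    for row in memories:
        if row["status"] != "completed":
            continue
        for bank in resolve_banks(row):
            if row["document_id"] not in hindsight_docs.get(bank, set()):
                violations.append(f"missing document {row['document_id']} in {bank}")

    # Invariant: every Hindsight document is owned by some Postgres row.
    known_ids = {r["document_id"] for r in memories}
    known_ids |= {r["document_id"] for r in knowledge_sources}
    for bank, doc_ids in hindsight_docs.items():
        for doc_id in sorted(doc_ids - known_ids):
            violations.append(f"ghost document {doc_id} in {bank}")

    return violations
```

An empty list means both invariants hold for the snapshot; a test can assert exactly that.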