This document defines the canonical objects used by the ANDB v1 prototype. These objects are the first-class semantic units of the system and are the backbone for event evolution, retrieval, provenance, graph expansion, and structured response assembly.
The semantic definitions in this document should be read together with the implementation structs in src/internal/schemas/canonical.go. If there is a mismatch, current field names in code take precedence for immediate implementation, and the docs should then be updated.
ANDB explicitly moves away from the idea that agent cognition can be represented by a single table like:
memory(id, content, embedding, metadata)
That approach does not naturally capture:
- event source
- state evolution
- relation structure
- runtime state
- artifact linkage
- provenance
- versioning
Canonical objects separate those concerns into stable semantic units.
The v1 prototype includes:
- Agent
- Session
- Event
- Memory
- State
- Artifact
- Edge
- ObjectVersion
- User
- Policy
- Embedding
- ShareContract
Among these, the operational core of v1 is:
- Event
- Memory
- State
- Artifact
- Edge
- ObjectVersion
Agent and Session remain foundational because they define ownership, scope, and execution context.
Agent represents an execution identity inside the multi-agent system (MAS) context. It is the namespace anchor for actions, memories, and state.
Current Go fields:
- agent_id
- tenant_id
- workspace_id
- agent_type
- role_profile
- policy_ref
- capability_set
- default_memory_policy
- created_at
- status
Agent is used to:
- scope events and memories
- partition query context
- attach policy defaults
- define ownership boundaries
Session represents a task, thread, or execution context in which events occur and runtime state evolves.
Fields:
- session_id
- agent_id
- parent_session_id
- task_type
- goal
- context_ref
- start_ts
- end_ts
- status
- budget_token
- budget_time_ms
Session is used to:
- group event flows
- bind runtime state
- constrain retrieval context
- support local task-level reasoning
Event is the fundamental source of state evolution. Events capture messages, tool calls, tool results, plan updates, critiques, retrieval operations, and task transitions.
Event types include:
- user_message
- assistant_message
- tool_call_issued
- tool_result_returned
- retrieval_executed
- plan_updated
- critique_generated
- task_finished
- handoff_occurred
Fields:
- event_id
- tenant_id
- workspace_id
- agent_id
- session_id
- event_type
- event_time
- ingest_time
- visible_time
- logical_ts
- parent_event_id
- causal_refs
- payload
- source
- importance
- visibility
- version
visible_time and logical_ts already exist in the current Go schema. In v1 they should be treated as reserved-but-useful fields: present in the contract, but not yet backed by a full publication or logical-time system.
payload is intentionally flexible because event content varies by event type. Its semantic interpretation belongs to the materialization layer.
Event serves as:
- ingest-level source of truth
- provenance anchor
- replay-ready mutation record
- trigger source for canonical-object materialization
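The split between a fixed event envelope and a flexible payload can be sketched as follows. This is a minimal illustration, not the struct in canonical.go: the Go types, the use of json.RawMessage for payload, and the ToolCallPayload shape are all assumptions made for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// EventType names one of the event kinds listed above.
type EventType string

const (
	EventUserMessage    EventType = "user_message"
	EventToolCallIssued EventType = "tool_call_issued"
)

// Event carries a subset of the listed fields. Field types are assumptions.
type Event struct {
	EventID       string          `json:"event_id"`
	AgentID       string          `json:"agent_id"`
	SessionID     string          `json:"session_id"`
	EventType     EventType       `json:"event_type"`
	EventTime     time.Time       `json:"event_time"`
	ParentEventID string          `json:"parent_event_id,omitempty"`
	Payload       json.RawMessage `json:"payload"` // kept opaque at ingest
}

// ToolCallPayload is a hypothetical payload shape for tool_call_issued events.
type ToolCallPayload struct {
	Tool string            `json:"tool"`
	Args map[string]string `json:"args"`
}

// DecodeToolCall interprets the opaque payload only when the event type
// matches, which is the kind of work the materialization layer would own.
func DecodeToolCall(e Event) (ToolCallPayload, error) {
	var p ToolCallPayload
	if e.EventType != EventToolCallIssued {
		return p, fmt.Errorf("event %s has type %q, want %q", e.EventID, e.EventType, EventToolCallIssued)
	}
	err := json.Unmarshal(e.Payload, &p)
	return p, err
}

func main() {
	e := Event{
		EventID:   "evt_1",
		AgentID:   "agent_1",
		SessionID: "sess_1",
		EventType: EventToolCallIssued,
		EventTime: time.Date(2024, time.January, 1, 0, 0, 0, 0, time.UTC),
		Payload:   json.RawMessage(`{"tool":"search","args":{"q":"andb"}}`),
	}
	p, err := DecodeToolCall(e)
	fmt.Println(p.Tool, p.Args["q"], err) // prints: search andb <nil>
}
```

Keeping the payload as raw bytes at ingest means the envelope can be validated and stored without knowing every event type in advance.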
Memory is a reusable cognitive unit derived from one or more events or summaries. It is not identical to raw event payload. It represents something the system should later retrieve and reason over.
Suggested v1 memory categories:
- episodic
- semantic
- procedural
- social
- reflective
Fields:
- memory_id
- memory_type
- agent_id
- session_id
- owner_type
- scope
- level
- content
- summary
- source_event_ids
- confidence
- importance
- freshness_score
- ttl
- valid_from
- valid_to
- provenance_ref
- version
- is_active
owner_type is a coarse visibility/ownership class in v1, for example:
- private
- public
- partial
scope is a target identifier string whose meaning depends on owner_type. For example:
- when owner_type = private, scope may be empty or a user/agent id
- when owner_type = public, scope may be a workspace/tenant id or a well-known scope id
- when owner_type = partial, scope should carry a concrete target identifier (e.g. group id, allowlist id, or contract id)
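The pairing rules above lend themselves to a small validation helper. The sketch below is an assumption about how such a check might look. OwnerType and ValidateScope are hypothetical names, and treating an empty scope as acceptable for public is a permissive reading of "may be".

```go
package main

import (
	"errors"
	"fmt"
)

// OwnerType is the coarse visibility/ownership class carried by Memory.
type OwnerType string

const (
	OwnerPrivate OwnerType = "private"
	OwnerPublic  OwnerType = "public"
	OwnerPartial OwnerType = "partial"
)

// ValidateScope checks that scope is usable for the given owner_type.
// Only partial strictly requires a concrete target identifier here.
func ValidateScope(owner OwnerType, scope string) error {
	switch owner {
	case OwnerPrivate, OwnerPublic:
		return nil // scope is optional for these classes in this sketch
	case OwnerPartial:
		if scope == "" {
			return errors.New("owner_type=partial requires a concrete scope identifier")
		}
		return nil
	default:
		return fmt.Errorf("unknown owner_type %q", owner)
	}
}

func main() {
	fmt.Println(ValidateScope(OwnerPrivate, ""))
	fmt.Println(ValidateScope(OwnerPartial, "contract_7"))
	fmt.Println(ValidateScope(OwnerPartial, ""))
}
```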
level represents distillation depth, for example:
- 0: raw or near-raw record
- 1: summary
- 2: higher-level abstraction
source_event_ids is critical and should not be dropped. It is the most direct provenance bridge between event origin and reusable memory.
content is reserved for a content reference rather than inline text. In v1, it is expected to carry an embedding identifier so that Memory can reference an Embedding record (see section 12), which stores both the original text and the vector payload reference for this memory.
Memory is the primary retrieval-oriented cognitive object in v1. It should be:
- retrievable
- filterable
- provenance-linked
- relation-expandable
State captures the current or operational execution condition rather than reusable long-term knowledge. Examples include:
- current plan
- tool stack
- execution status
- budget state
- temporary blackboard
- failure marker
Fields:
- state_id
- agent_id
- session_id
- state_type
- state_key
- state_value
- derived_from_event_id
- checkpoint_ts
- version
State is used to:
- track runtime execution context
- explain why an agent is currently blocked or active
- support runtime-aware retrieval
- attach evidence to live operating conditions
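One way to picture how State stays tied to its originating event is an in-memory store that overwrites a slot, bumps its version, and records the causing event. This is only a sketch: StateStore and Apply are hypothetical names, the field types are guesses, and keying a slot by (session_id, state_key) is an assumption rather than something the schema mandates.

```go
package main

import (
	"fmt"
	"time"
)

// State carries a subset of the State fields listed above; types are assumed.
type State struct {
	SessionID          string
	StateKey           string
	StateValue         string
	DerivedFromEventID string
	CheckpointTS       time.Time
	Version            int64
}

// stateKey identifies one live state slot (an assumption of this sketch).
type stateKey struct{ sessionID, key string }

// StateStore is a hypothetical holder of the latest State per slot.
type StateStore struct{ current map[stateKey]State }

func NewStateStore() *StateStore {
	return &StateStore{current: make(map[stateKey]State)}
}

// Apply overwrites the slot with a new value, increments its version, and
// records which event caused the change.
func (s *StateStore) Apply(sessionID, key, value, eventID string, at time.Time) State {
	k := stateKey{sessionID, key}
	next := s.current[k] // zero value (Version 0) when the slot is new
	next.SessionID = sessionID
	next.StateKey = key
	next.StateValue = value
	next.DerivedFromEventID = eventID
	next.CheckpointTS = at
	next.Version++
	s.current[k] = next
	return next
}

func main() {
	store := NewStateStore()
	now := time.Now()
	store.Apply("sess_1", "execution_status", "active", "evt_1", now)
	st := store.Apply("sess_1", "execution_status", "blocked", "evt_2", now)
	fmt.Println(st.StateValue, st.Version, st.DerivedFromEventID) // prints: blocked 2 evt_2
}
```

Because each state value names the event it was derived from, a question such as "why is this agent blocked" can be answered by following derived_from_event_id back to the event log.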
Artifact represents external or derived work products. These are outputs that should remain linked to cognition and provenance rather than floating outside the database model. Examples include:
- documents
- code
- SQL
- reports
- files
- API outputs
- generated blobs
Fields:
- artifact_id
- session_id
- owner_agent_id
- artifact_type
- uri
- content_ref
- mime_type
- metadata
- hash
- produced_by_event_id
- version
Artifact is used to:
- preserve tool outputs
- bridge external actions back into the evidence graph
- support explainability and reproducibility
- anchor references to large content outside inline event payloads
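To illustrate how hash and produced_by_event_id support reproducibility, the sketch below registers an artifact for a tool output and later checks that fetched content still matches. The source does not specify a hash algorithm, so SHA-256 with hex encoding is an assumption here, and NewArtifact and Matches are hypothetical helpers.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Artifact carries a subset of the Artifact fields listed above; types are assumed.
type Artifact struct {
	ArtifactID        string
	ArtifactType      string
	URI               string
	MimeType          string
	Hash              string // hex-encoded SHA-256 in this sketch
	ProducedByEventID string
	Version           int64
}

// digest returns the hex-encoded SHA-256 of content.
func digest(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

// NewArtifact records where the content lives, what it hashes to, and which
// event produced it. The content itself stays outside the record.
func NewArtifact(id, artifactType, uri, mimeType, eventID string, content []byte) Artifact {
	return Artifact{
		ArtifactID:        id,
		ArtifactType:      artifactType,
		URI:               uri,
		MimeType:          mimeType,
		Hash:              digest(content),
		ProducedByEventID: eventID,
		Version:           1,
	}
}

// Matches reports whether content is byte-identical to what was registered.
func (a Artifact) Matches(content []byte) bool {
	return a.Hash == digest(content)
}

func main() {
	sql := []byte("SELECT 1;")
	a := NewArtifact("art_1", "sql", "file:///tmp/query.sql", "application/sql", "evt_9", sql)
	fmt.Println(a.ProducedByEventID, a.Matches(sql), a.Matches([]byte("SELECT 2;")))
	// prints: evt_9 true false
}
```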
Edge represents an explicit typed relation between canonical objects. ANDB does not want relation semantics to disappear inside implicit application joins.
Edge types include:
- caused_by
- derived_from
- supports
- contradicts
- summarizes
- updates
- uses_tool
- belongs_to_task
- shared_with
Fields:
- edge_id
- src_object_id
- src_type
- edge_type
- dst_object_id
- dst_type
- weight
- provenance_ref
- created_ts
Edge is essential for:
- graph expansion
- evidence assembly
- provenance chaining
- proof-trace explanation
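Graph expansion over explicit edges can be sketched as a bounded breadth-first walk. The example below is illustrative only: Expand is a hypothetical helper, edges are followed from src to dst, and the hop limit and edge-type filter are assumptions about what a retrieval layer might expose.

```go
package main

import "fmt"

// Edge carries a subset of the Edge fields listed above; types are assumed.
type Edge struct {
	EdgeID      string
	SrcObjectID string
	EdgeType    string
	DstObjectID string
	Weight      float64
}

// Expand returns the object ids reachable from start within maxHops,
// following edges from src to dst in breadth-first discovery order.
// A nil allowed map means every edge type is followed.
func Expand(edges []Edge, start string, maxHops int, allowed map[string]bool) []string {
	out := make(map[string][]Edge)
	for _, e := range edges {
		if allowed == nil || allowed[e.EdgeType] {
			out[e.SrcObjectID] = append(out[e.SrcObjectID], e)
		}
	}
	seen := map[string]bool{start: true}
	frontier := []string{start}
	var reached []string
	for hop := 0; hop < maxHops && len(frontier) > 0; hop++ {
		var next []string
		for _, id := range frontier {
			for _, e := range out[id] {
				if !seen[e.DstObjectID] {
					seen[e.DstObjectID] = true
					reached = append(reached, e.DstObjectID)
					next = append(next, e.DstObjectID)
				}
			}
		}
		frontier = next
	}
	return reached
}

func main() {
	edges := []Edge{
		{EdgeID: "e1", SrcObjectID: "mem_2", EdgeType: "summarizes", DstObjectID: "mem_1"},
		{EdgeID: "e2", SrcObjectID: "mem_1", EdgeType: "derived_from", DstObjectID: "evt_1"},
		{EdgeID: "e3", SrcObjectID: "evt_1", EdgeType: "caused_by", DstObjectID: "evt_0"},
	}
	fmt.Println(Expand(edges, "mem_2", 2, nil)) // prints: [mem_1 evt_1]
}
```

Restricting the walk to derived_from and caused_by edges would turn the same routine into a simple provenance chain from a memory back to its source events.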
User represents a human or service identity that can own objects, publish policies, and participate in governance.
In v1, User is intentionally minimal and mainly exists to:
- anchor publisher/owner identity
- support future visibility and access-control work
Fields:
- user_id
- user_name
- user_tenant_id
- user_workspace_id
- default_visibility
default_visibility means the identity-level default visibility scope used when creating new objects.
It is different from object-level visibility, which is the effective visibility on each specific object instance.
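The difference between the two can be shown with a small creation helper. This is a sketch under stated assumptions: NewEventFor is a hypothetical function, the visibility values "private" and "public" are placeholders, and Event is used only because it carries a visibility field in the list above.

```go
package main

import "fmt"

// User carries the identity-level default.
type User struct {
	UserID            string
	DefaultVisibility string
}

// Event carries the effective, object-level visibility.
type Event struct {
	EventID    string
	Visibility string
}

// NewEventFor creates an event on behalf of a user. An explicit visibility
// is stored as given; an empty one falls back to the user's default.
func NewEventFor(u User, eventID, explicitVisibility string) Event {
	v := explicitVisibility
	if v == "" {
		v = u.DefaultVisibility
	}
	return Event{EventID: eventID, Visibility: v}
}

func main() {
	u := User{UserID: "user_1", DefaultVisibility: "private"}
	fmt.Println(NewEventFor(u, "evt_1", "").Visibility)       // prints: private
	fmt.Println(NewEventFor(u, "evt_2", "public").Visibility) // prints: public
}
```

After creation the object's own visibility is what matters. Changing the user's default later would not, in this sketch, alter objects that already exist.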
User is used to:
- identify publishers of policies
- support auditing and governance attribution
- provide a stable identity handle beyond Agent
Embedding represents a reusable vector representation referenced by other objects. The system may store the vector payload externally and keep only a stable reference in the canonical record.
This object exists to reduce duplication and to support consistent embedding reuse across multiple object types.
Note: embedding_type is intentionally omitted for now because v1 does not yet have a stable, agreed-upon embedding taxonomy. It can be reintroduced once classification is defined.
Fields:
- vector_id
- vector_context
- original_text
- dim
- model_id
- vector_ref
- created_ts
Embedding is used to:
- share vector payloads across canonical objects
- allow multiple indexes to reference the same representation
- keep canonical object records lightweight while supporting retrieval views
Policy defines governance rules that an agent or system component can reference. It is a canonical descriptor for a published policy and is intentionally minimal in v1.
In addition, ANDB uses PolicyRecord as an object-level overlay that captures applied governance decisions (e.g. TTL, salience, quarantine) for a specific object.
Note: in v1, context is stored on PolicyRecord rather than Policy so that the concrete applied rule payload is captured alongside the object-level decision and can vary across records even under the same policy_id.
Policy fields:
- policy_id
- policy_version
- policy_start_time
- policy_end_time
- publisher_type
- publisher_id
- policy_type
PolicyRecord fields:
- policy_id
- policy_version
- context
- object_id
- object_type
- salience_weight
- ttl
- confidence_override
- verified_state
- quarantine_flag
- visibility_policy
- policy_reason
- policy_source
- policy_event_id
Policy and PolicyRecord are used to:
- express governance constraints without baking them into every object type
- support policy-aware retrieval filtering
- keep policy changes auditable via policy_event_id
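Policy-aware retrieval filtering can be sketched as a pass over candidate objects that consults the PolicyRecord overlay. The example below only honors quarantine_flag. FilterQuarantined and ObjectRef are hypothetical names, and the field types are assumptions rather than the definitions in canonical.go.

```go
package main

import "fmt"

// ObjectRef is the generic (object_type, object_id) cross-object reference.
type ObjectRef struct {
	ObjectType string
	ObjectID   string
}

// PolicyRecord carries a subset of the overlay fields listed above.
type PolicyRecord struct {
	PolicyID       string
	PolicyVersion  int
	ObjectType     string
	ObjectID       string
	QuarantineFlag bool
	PolicyReason   string
	PolicyEventID  string
}

// FilterQuarantined returns the candidates that no record has quarantined,
// preserving their original order.
func FilterQuarantined(candidates []ObjectRef, records []PolicyRecord) []ObjectRef {
	blocked := make(map[ObjectRef]bool)
	for _, r := range records {
		if r.QuarantineFlag {
			blocked[ObjectRef{ObjectType: r.ObjectType, ObjectID: r.ObjectID}] = true
		}
	}
	var kept []ObjectRef
	for _, c := range candidates {
		if !blocked[c] {
			kept = append(kept, c)
		}
	}
	return kept
}

func main() {
	candidates := []ObjectRef{{"memory", "mem_1"}, {"memory", "mem_2"}, {"artifact", "art_1"}}
	records := []PolicyRecord{
		{PolicyID: "pol_1", PolicyVersion: 1, ObjectType: "memory", ObjectID: "mem_2",
			QuarantineFlag: true, PolicyReason: "unverified source", PolicyEventID: "evt_20"},
	}
	fmt.Println(FilterQuarantined(candidates, records)) // prints: [{memory mem_1} {artifact art_1}]
}
```

Because the decision lives on the record rather than on Memory or Artifact, the same filter works for any object type, and policy_event_id points at the event that explains why an object was held back.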
ObjectVersion records lineage for mutable canonical objects.
Fields:
- object_id
- object_type
- version
- mutation_event_id
- valid_from
- valid_to
- snapshot_tag
ObjectVersion allows ANDB to:
- track object evolution
- attach version hints in responses
- preserve mutation provenance
- prepare for future rollback and time-travel behavior
In v1, this is intentionally lighter than a full visibility engine.
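A lightweight form of the time-travel lookup that valid_from and valid_to prepare for might look like the sketch below. It rests on several assumptions: validity windows are treated as half-open, a zero valid_to means the version is still current, and VersionAt and day are hypothetical helpers.

```go
package main

import (
	"fmt"
	"time"
)

// ObjectVersion carries a subset of the fields listed above; types are assumed.
type ObjectVersion struct {
	ObjectID        string
	ObjectType      string
	Version         int64
	MutationEventID string
	ValidFrom       time.Time
	ValidTo         time.Time // zero value means "still current" in this sketch
}

// day builds a UTC timestamp for day n of January 2024 (example data only).
func day(n int) time.Time {
	return time.Date(2024, time.January, n, 0, 0, 0, 0, time.UTC)
}

// VersionAt returns the version of (objectType, objectID) valid at the given
// instant, treating each window as half-open: [valid_from, valid_to).
func VersionAt(versions []ObjectVersion, objectType, objectID string, at time.Time) (ObjectVersion, bool) {
	for _, v := range versions {
		if v.ObjectType != objectType || v.ObjectID != objectID {
			continue
		}
		open := v.ValidTo.IsZero()
		if !at.Before(v.ValidFrom) && (open || at.Before(v.ValidTo)) {
			return v, true
		}
	}
	return ObjectVersion{}, false
}

func main() {
	history := []ObjectVersion{
		{ObjectID: "mem_1", ObjectType: "memory", Version: 1, MutationEventID: "evt_1", ValidFrom: day(1), ValidTo: day(5)},
		{ObjectID: "mem_1", ObjectType: "memory", Version: 2, MutationEventID: "evt_7", ValidFrom: day(5)},
	}
	v, ok := VersionAt(history, "memory", "mem_1", day(3))
	fmt.Println(v.Version, v.MutationEventID, ok) // prints: 1 evt_1 true
}
```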
ShareContract defines an explicit governance contract for sharing within a scope. It exists to ensure that "shared memory" is not just a string label but a policy-controlled, auditable agreement.
Fields:
- contract_id
- scope
- read_acl
- write_acl
- derive_acl
- ttl_policy
- consistency_level
- merge_policy
- quarantine_policy
- audit_policy
ShareContract is used to:
- define who can read/write/derive within a shared scope
- attach TTL/merge/quarantine/audit rules to sharing
- serve as the target identifier for partial sharing modes (e.g. Memory.owner_type = partial)
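The read/write/derive split can be sketched as a per-action ACL check. The representation of each ACL as a list of principal ids is an assumption, as are the Action type and the Allows method. The source does not define the ACL format.

```go
package main

import "fmt"

// Action is one of the operations a ShareContract governs.
type Action string

const (
	ActionRead   Action = "read"
	ActionWrite  Action = "write"
	ActionDerive Action = "derive"
)

// ShareContract carries a subset of the fields listed above.
// ACLs are modeled as plain lists of principal ids (an assumption).
type ShareContract struct {
	ContractID string
	Scope      string
	ReadACL    []string
	WriteACL   []string
	DeriveACL  []string
}

// Allows reports whether principalID appears in the ACL for the action.
// Unknown actions are denied.
func (c ShareContract) Allows(principalID string, action Action) bool {
	var acl []string
	switch action {
	case ActionRead:
		acl = c.ReadACL
	case ActionWrite:
		acl = c.WriteACL
	case ActionDerive:
		acl = c.DeriveACL
	}
	for _, id := range acl {
		if id == principalID {
			return true
		}
	}
	return false
}

func main() {
	c := ShareContract{
		ContractID: "contract_7",
		Scope:      "team_research",
		ReadACL:    []string{"agent_a", "agent_b"},
		WriteACL:   []string{"agent_a"},
		DeriveACL:  []string{"agent_b"},
	}
	fmt.Println(c.Allows("agent_b", ActionRead), c.Allows("agent_b", ActionWrite), c.Allows("agent_b", ActionDerive))
	// prints: true false true
}
```

A Memory with owner_type = partial would carry the contract_id in its scope field, so a retrieval layer could look up the contract and call a check like this before returning the memory.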
Common v1 relationships include:
- Event -> Memory
- Event -> State
- Event -> Artifact
- Event -> ObjectVersion
- Memory -> Event
- Memory -> Artifact
- Memory -> Memory
- State -> Event
- Artifact -> Event
In addition, the v1 prototype recognizes the following governance and representation relationships:
- Agent -> Policy (via agent.policy_ref -> policy.policy_id)
- PolicyRecord -> Object (via (policy_record.object_type, policy_record.object_id), targeting any canonical object)
- PolicyRecord -> Policy (via (policy_record.policy_id, policy_record.policy_version))
- PolicyRecord -> Event (via policy_record.policy_event_id -> event.event_id)
- Policy -> User|Agent (publisher identity; see naming conventions below)
- ShareContract -> Scope (governs a shareable scope)
- Memory (owner_type=partial) -> ShareContract (via memory.scope -> share_contract.contract_id)
Embedding relationships are currently represented as relations rather than strong-typed fields on every object:
- Object -> Embedding (recommended via Edge when an object is associated with a specific embedding vector)
- Event -> Embedding (optional via payload references when embeddings are produced during ingestion)
For Memory, the embedding relationship is a direct reference:
- Memory -> Embedding (via memory.content -> embedding.vector_id)
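The direct reference from Memory to Embedding can be sketched as a lookup keyed by vector_id. In this illustration the in-memory map stands in for whatever store holds Embedding records, ResolveText is a hypothetical helper, and the vector_ref value is a placeholder.

```go
package main

import "fmt"

// Embedding carries a subset of the Embedding fields; types are assumed.
type Embedding struct {
	VectorID     string
	OriginalText string
	ModelID      string
	Dim          int
	VectorRef    string // reference to the externally stored vector payload
}

// Memory carries only the fields needed for this example.
type Memory struct {
	MemoryID string
	Content  string // holds embedding.vector_id in v1, not inline text
}

// ResolveText follows memory.content to the Embedding record and returns
// the original text stored there.
func ResolveText(m Memory, embeddings map[string]Embedding) (string, error) {
	emb, ok := embeddings[m.Content]
	if !ok {
		return "", fmt.Errorf("memory %s references unknown embedding %q", m.MemoryID, m.Content)
	}
	return emb.OriginalText, nil
}

func main() {
	embeddings := map[string]Embedding{
		"vec_42": {
			VectorID:     "vec_42",
			OriginalText: "User prefers concise answers.",
			ModelID:      "example-model", // placeholder model id
			Dim:          768,
			VectorRef:    "vectors/vec_42", // placeholder payload reference
		},
	}
	m := Memory{MemoryID: "mem_1", Content: "vec_42"}
	text, err := ResolveText(m, embeddings)
	fmt.Println(text, err) // prints: User prefers concise answers. <nil>
}
```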
These relationships may be represented through explicit edges or direct object references depending on the layer and implementation maturity.
To keep schemas stable and unambiguous, v1 uses the following conventions:
- *_id: canonical object identifiers (e.g. agent_id, session_id, memory_id)
- *_ref: references to external payloads (e.g. object store blobs, large text, vector payloads), not canonical objects
- (object_type, object_id): a generic cross-object reference used by governance/versioning records
For publisher identity fields, prefer a typed pair:
- publisher_type: user or agent
- publisher_id: the corresponding user_id or agent_id
The following are explicitly acceptable in v1:
- policy/governance objects may remain reserved rather than fully operational
- share contracts are not required
- logical time semantics may remain shallow
- conflict/merge objects are deferred
- some fields may be present before their full runtime behavior exists
These simplifications are acceptable only if the contracts remain extensible.
Every retrievable cognitive unit should map back to a canonical object.
Every derived object should preserve provenance to source event(s).
Every mutable object should carry version semantics.
Every structure needed for evidence assembly should be representable through explicit edges or object references.
In v1, schema stability is more important than field completeness.
Canonical objects are the semantic backbone of ANDB. They define what the system fundamentally stores, materializes, retrieves, relates, and returns.
They should be treated as the primary abstraction of the repository, not as incidental structs.