
# ADR-001: Three-Tiered Memory System Architecture
## Status

Accepted

## Date

2023-11-15

## Context
The NeuroCognitive Architecture (NCA) requires a memory system that mimics human cognitive processes while providing practical functionality for Large Language Models (LLMs). Current LLM architectures lack sophisticated memory management that can differentiate between immediate context, working knowledge, and long-term storage. This limitation restricts their ability to maintain contextual awareness across interactions, learn from experiences, and develop a persistent knowledge base.
We need to design a memory architecture that:
- Supports different types of information retention based on importance and recency
- Enables efficient retrieval of relevant information during interactions
- Allows for dynamic learning and knowledge consolidation
- Scales effectively with increasing amounts of data
- Maintains performance under high load conditions
- Integrates seamlessly with existing LLM capabilities
## Decision
We will implement a three-tiered memory system inspired by human cognitive architecture:
### 1. Short-Term Memory (STM)

**Purpose:** Maintain immediate context and recent interactions.

**Characteristics:**

- High-speed access with limited capacity
- Volatile storage with automatic decay
- Recency-biased retention
- Direct integration with the LLM context window

**Implementation Details:**

- In-memory data structures (priority queues, circular buffers)
- Time-based expiration policies
- Importance scoring for retention decisions
- Context window management algorithms

**Capacity:** Configurable, defaulting to approximately 10-15 conversation turns or equivalent information units.

**Decay Rate:** Exponential, with a half-life of 5-10 interaction cycles.
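
The decay and eviction behaviour described above can be sketched in a few lines of Python. This is a minimal illustration only; the class name, the default capacity of 15 items, and the 7-cycle half-life are assumptions drawn from the ranges above, not a committed interface.

```python
from collections import deque

class ShortTermMemory:
    """Fixed-capacity recency buffer with exponential importance decay (illustrative sketch)."""

    def __init__(self, capacity=15, half_life_cycles=7):
        self.capacity = capacity
        self.half_life = half_life_cycles
        self.cycle = 0                  # current interaction cycle
        self.items = deque()            # entries of (cycle_added, importance, content)

    def add(self, content, importance=1.0):
        self.cycle += 1
        self.items.append((self.cycle, importance, content))
        self._evict()

    def _decayed(self, cycle_added, importance):
        # Exponential decay with the configured half-life, measured in interaction cycles.
        age = self.cycle - cycle_added
        return importance * 0.5 ** (age / self.half_life)

    def _evict(self):
        # When over capacity, drop whichever entry has the lowest decayed score.
        while len(self.items) > self.capacity:
            weakest = min(self.items, key=lambda e: self._decayed(e[0], e[1]))
            self.items.remove(weakest)

    def context_view(self):
        """Retained entries ordered by decayed importance, ready for the context window."""
        return sorted(
            ((self._decayed(c, imp), content) for c, imp, content in self.items),
            reverse=True,
        )
```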
### 2. Working Memory (WM)

**Purpose:** Store actively used information and facilitate reasoning processes.

**Characteristics:**

- Medium-term persistence
- Structured organization by topic/domain
- Bidirectional flow with STM and LTM
- Support for active manipulation and reasoning

**Implementation Details:**

- Distributed cache system (Redis, Memcached)
- Graph-based knowledge representation
- Attention mechanisms for focus management
- Chunking algorithms for information organization

**Capacity:** Larger than STM but still constrained (configurable based on deployment resources).

**Persistence:** Hours to days, with activity-based reinforcement.

**Retrieval Mechanism:** Associative retrieval with relevance scoring.
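
As a hedged sketch of how the Redis-backed option might behave, the wrapper below stores topic-keyed entries with a TTL and reinforces them on access; the `wm:` key prefix, the default TTL, and the method names are illustrative assumptions rather than a defined NCA interface.

```python
import json
import redis

class WorkingMemory:
    """Topic-keyed store with hours-to-days persistence and activity-based reinforcement."""

    def __init__(self, ttl_seconds=6 * 3600, host="localhost", port=6379):
        self.ttl = ttl_seconds
        self.client = redis.Redis(host=host, port=port, decode_responses=True)

    def store(self, topic, payload):
        # Write the entry with a TTL so unused knowledge ages out on its own.
        self.client.setex(f"wm:{topic}", self.ttl, json.dumps(payload))

    def recall(self, topic):
        raw = self.client.get(f"wm:{topic}")
        if raw is None:
            return None
        # Activity-based reinforcement: every successful recall resets the TTL.
        self.client.expire(f"wm:{topic}", self.ttl)
        return json.loads(raw)
```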
### 3. Long-Term Memory (LTM)

**Purpose:** Persistent knowledge storage and retrieval.

**Characteristics:**

- Durable, high-capacity storage
- Semantic organization and indexing
- Consolidation processes from working memory
- Multiple retrieval pathways

**Implementation Details:**

- Vector database (Pinecone, Weaviate, or Milvus)
- Document database for unstructured content (MongoDB)
- Embedding models for semantic representation
- Hierarchical categorization system
- Periodic consolidation processes

**Capacity:** Effectively unlimited, constrained only by infrastructure resources.

**Persistence:** Permanent until explicitly removed or deprecated.

**Retrieval Efficiency:** Optimized for semantic similarity and relevance.
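
The sketch below stands in for the vector-database layer with an in-process NumPy index so the retrieval idea is concrete; a real deployment would delegate `add` and `search` to Pinecone, Weaviate, or Milvus, and the embedding size of 384 is an arbitrary placeholder, not a requirement.

```python
import numpy as np

class LongTermMemory:
    """In-process stand-in for the LTM vector store, using cosine similarity."""

    def __init__(self, dim=384):
        self.dim = dim
        self.vectors = np.empty((0, dim), dtype=np.float32)
        self.documents = []

    def add(self, embedding, document):
        # Normalize on insert so search reduces to a dot product.
        vec = np.asarray(embedding, dtype=np.float32).reshape(1, self.dim)
        self.vectors = np.vstack([self.vectors, vec / np.linalg.norm(vec)])
        self.documents.append(document)

    def search(self, query_embedding, k=5):
        """Return the k most semantically similar documents with their scores."""
        q = np.asarray(query_embedding, dtype=np.float32)
        q = q / np.linalg.norm(q)
        scores = self.vectors @ q
        top = np.argsort(scores)[::-1][:k]
        return [(float(scores[i]), self.documents[i]) for i in top]
```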
## Memory Transfer Mechanisms

### STM → WM Transfer
- Importance-based promotion
- Repetition reinforcement
- Explicit marking by system or user
- Contextual relevance threshold crossing
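
A minimal promotion rule combining the criteria in this list might look like the sketch below; the 0.6 relevance threshold and 0.1 repetition bonus are placeholder values to be tuned, not part of this decision.

```python
PROMOTION_THRESHOLD = 0.6   # contextual-relevance threshold (tunable assumption)
REPETITION_BONUS = 0.1      # reinforcement added per repeated mention (tunable assumption)

def should_promote(importance, repetitions, explicitly_marked=False):
    """Decide whether an STM item crosses into working memory."""
    if explicitly_marked:                       # explicit marking by system or user
        return True
    score = importance + REPETITION_BONUS * repetitions
    return score >= PROMOTION_THRESHOLD
```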
### WM → LTM Consolidation
- Scheduled consolidation processes
- Usage frequency analysis
- Knowledge graph integration
- Semantic clustering and abstraction
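
One way to express the usage-frequency criterion is a periodic job like the sketch below; the item shape, the `embed` callable, and the usage cutoff of 3 are illustrative assumptions, and the scheduling mechanism (cron, task queue, etc.) is deliberately left open.

```python
MIN_USAGE_FOR_CONSOLIDATION = 3   # hypothetical cutoff for "frequently used"

def consolidate(working_memory_items, ltm, embed):
    """Move frequently used working-memory items into long-term storage."""
    for item in working_memory_items:
        if item["usage_count"] >= MIN_USAGE_FOR_CONSOLIDATION:
            # Embed the content and persist it in the LTM vector store.
            ltm.add(embed(item["content"]), item["content"])
```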
### LTM → WM Activation
- Query-based retrieval during interactions
- Associative activation based on current context
- Explicit recall requests
- Predictive pre-loading for anticipated topics
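
Tying the earlier sketches together, query-based activation might embed the current turn, search LTM, and pre-load hits into working memory; the 0.5 relevance floor and the use of a document prefix as the topic key are arbitrary illustrative choices.

```python
def activate(context_text, embed, ltm, wm, k=3, min_score=0.5):
    """Pull contextually relevant long-term memories into working memory."""
    results = ltm.search(embed(context_text), k=k)
    for score, document in results:
        if score >= min_score:   # associative activation above a relevance floor
            wm.store(topic=document[:40], payload={"content": document, "score": score})
```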
## Technical Implementation Considerations

### Data Structures
- STM: Priority queues, circular buffers with time-based expiration
- WM: Graph databases, distributed caches with TTL
- LTM: Vector stores, document databases with semantic indexing
### Scaling Strategy
- Horizontal scaling for LTM components
- Vertical scaling for WM performance
- Sharding strategies based on knowledge domains
- Read replicas for high-demand knowledge areas
### Persistence Layers
- STM: In-memory with optional persistence for recovery
- WM: Cache with background persistence
- LTM: Durable storage with backup and replication
### Query Optimization
- Embedding-based similarity search
- Hierarchical filtering
- Parallel query execution across memory tiers
- Caching frequently accessed patterns
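
Parallel execution across tiers can be illustrated with a simple thread-pool fan-out; the per-tier query callables are placeholders for whatever interfaces the tiers eventually expose.

```python
from concurrent.futures import ThreadPoolExecutor

def query_all_tiers(query, stm_query, wm_query, ltm_query):
    """Issue the same query to STM, WM, and LTM concurrently and merge the results."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = {
            "stm": pool.submit(stm_query, query),
            "wm": pool.submit(wm_query, query),
            "ltm": pool.submit(ltm_query, query),
        }
        return {tier: future.result() for tier, future in futures.items()}
```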
## Consequences

### Positive
- Enhanced contextual awareness across interactions
- Improved information retention and recall capabilities
- More natural conversation flow with appropriate memory references
- Ability to learn and adapt from past interactions
- Reduced redundancy in information processing
- Better handling of complex, multi-session tasks
### Negative
- Increased system complexity
- Higher resource requirements for memory management
- Potential privacy concerns with persistent memory
- Need for sophisticated memory management policies
- Challenges in determining optimal memory transfer thresholds
### Neutral
- Requires ongoing tuning of memory parameters
- Necessitates clear policies for memory retention and deletion
- May require user controls for memory management
## Compliance and Privacy Considerations
- All memory tiers must implement configurable retention policies
- Personal or sensitive information must be clearly marked and handled according to privacy requirements
- Users must have the ability to request memory deletion across all tiers
- Audit trails for memory access and modification must be maintained
- Memory contents must be encrypted at rest and in transit
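
To make the configurability requirement concrete, a per-tier retention policy might take a shape like the following; the keys and defaults are a hypothetical schema sketched for illustration, not a finalized one.

```python
# Hypothetical per-tier retention configuration (keys and defaults are placeholders).
RETENTION_POLICY = {
    "stm": {"max_age_cycles": 15, "persist_on_shutdown": False},
    "wm":  {"ttl_hours": 24, "reinforce_on_access": True},
    "ltm": {"retention": "until_deleted", "encrypt_at_rest": True, "audit_access": True},
    "pii": {"flag_required": True, "honor_deletion_requests_across_tiers": True},
}
```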
## Performance Metrics
We will measure the effectiveness of this memory architecture using:
- Retrieval latency across tiers
- Contextual relevance of retrieved information
- Memory utilization efficiency
- Consolidation processing overhead
- User-perceived continuity in multi-session interactions
- Knowledge retention accuracy over time
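
As an example of how the first metric could be instrumented, a thin timing wrapper around any tier's query function suffices; the metric name and the print-based sink are placeholders for the real monitoring pipeline.

```python
import time

def timed_retrieval(tier_name, query_fn, query):
    """Measure per-tier retrieval latency for the metrics listed above."""
    start = time.perf_counter()
    result = query_fn(query)
    latency_ms = (time.perf_counter() - start) * 1000
    print(f"retrieval_latency_ms{{tier={tier_name!r}}} {latency_ms:.2f}")
    return result
```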
## Implementation Phases
- Phase 1: Core STM implementation with basic decay mechanisms
- Phase 2: WM implementation with STM integration
- Phase 3: LTM storage and basic retrieval
- Phase 4: Advanced memory transfer mechanisms
- Phase 5: Optimization and scaling improvements
## Alternatives Considered
- **Single unified memory store:** Rejected; a single store cannot efficiently satisfy the distinct latency, capacity, and persistence requirements of the three tiers.
- **Two-tier system (short/long only):** Rejected; it lacks the intermediate tier needed for active reasoning and consolidation.
- **External knowledge base only:** Rejected; it cannot maintain conversation context across turns.
- **Pure neural storage approach:** Rejected due to current limitations in retrieval precision.
## References
- Baddeley, A. D., & Hitch, G. (1974). Working memory. Psychology of Learning and Motivation, 8, 47-89.
- Atkinson, R. C., & Shiffrin, R. M. (1968). Human memory: A proposed system and its control processes. Psychology of Learning and Motivation, 2, 89-195.
- Tulving, E. (1985). How many memory systems are there? American Psychologist, 40(4), 385-398.
- Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63(2), 81-97.
- Cowan, N. (2001). The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behavioral and Brain Sciences, 24(1), 87-114.
## Related Decisions
- ADR-002: Health Dynamics System (Pending)
- ADR-003: LLM Integration Architecture (Pending)
- ADR-004: Knowledge Representation Format (Pending)
- ADR-005: Privacy and Data Retention Policies (Pending)
## Notes
This ADR establishes the foundational memory architecture for the NCA system. Implementation details will be refined in subsequent technical specifications and component designs. Regular review of memory performance metrics will inform ongoing optimization efforts.