Memory Stores in AI Systems - Building Conversational AI with Long-Term Context


December 10, 2024

One of the most significant limitations of traditional AI systems is their stateless nature - they forget everything between conversations. Memory stores change this paradigm by enabling AI systems to maintain context, learn from interactions, and provide personalized experiences over time.

Understanding Memory in AI Systems

Memory in AI systems refers to the ability to store, retrieve, and utilize information from past interactions. Unlike human memory, which is organic and associative, AI memory systems require deliberate architectural design.

Types of Memory in AI

1. Short-Term Memory (Working Memory)

Handles the current conversation context, typically limited by the model's context window (e.g., 200K tokens for Claude).

2. Long-Term Memory (Persistent Storage)

Stores information across sessions, enabling AI to remember user preferences, past conversations, and learned facts.

3. Semantic Memory

Stores general knowledge and facts extracted from interactions, organized by meaning rather than chronology.

4. Episodic Memory

Records specific events and conversations, maintaining temporal context and relational information.
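The four types above can be modeled as a simple tagged record, which is useful when deciding where a given memory should live. A minimal sketch (the names here are illustrative, not a standard):

```typescript
type MemoryType = 'short-term' | 'long-term' | 'semantic' | 'episodic';

interface MemoryRecord {
  type: MemoryType;
  content: string;
  createdAt: Date;
}

// Route a record to a storage tier: short-term memory lives in the
// prompt context; everything else belongs in persistent storage.
function isPersistent(record: MemoryRecord): boolean {
  return record.type !== 'short-term';
}
```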

Architecture of Memory Stores

A robust memory system for AI typically includes several components:

1. Vector Database

Stores embeddings of conversations and facts for semantic search:

```typescript
import { Pinecone } from '@pinecone-database/pinecone';

class VectorMemoryStore {
  private client: Pinecone;
  private indexName: string;

  constructor(apiKey: string, indexName: string) {
    this.client = new Pinecone({ apiKey });
    this.indexName = indexName;
  }

  async storeMemory(userId: string, content: string, metadata: any) {
    // Generate an embedding for the content
    const embedding = await this.generateEmbedding(content);

    // Store in the vector database
    const index = this.client.index(this.indexName);
    await index.upsert([{
      id: `${userId}-${Date.now()}`,
      values: embedding,
      metadata: {
        userId,
        content,
        timestamp: new Date().toISOString(),
        ...metadata
      }
    }]);
  }

  async searchMemory(userId: string, query: string, topK: number = 5) {
    const queryEmbedding = await this.generateEmbedding(query);
    const index = this.client.index(this.indexName);

    const results = await index.query({
      vector: queryEmbedding,
      topK,
      filter: { userId: { $eq: userId } },
      includeMetadata: true
    });

    return results.matches.map(match => ({
      content: match.metadata?.content,
      score: match.score,
      timestamp: match.metadata?.timestamp
    }));
  }

  private async generateEmbedding(text: string): Promise<number[]> {
    // Use OpenAI, Cohere, or another embedding model;
    // the implementation depends on your chosen provider.
    throw new Error('generateEmbedding not implemented');
  }
}
```
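The scoring behind semantic search is typically cosine similarity between embedding vectors. The hosted database computes this for you, but it is worth seeing in isolation; `cosineSimilarity` below is a self-contained illustration, not part of the Pinecone client:

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|), in [-1, 1] for real vectors.
// Vector databases use this (or a close relative) to rank stored
// embeddings against a query embedding.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Because the measure is scale-invariant, two embeddings pointing in the same direction score 1 regardless of magnitude, which is why normalized embeddings and dot-product search are often interchangeable.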

2. Structured Database

Maintains relational data, user profiles, and conversation metadata:

```typescript
interface UserMemory {
  userId: string;
  preferences: Record<string, any>;
  facts: MemoryFact[];
  conversationHistory: Conversation[];
}

interface MemoryFact {
  id: string;
  fact: string;
  confidence: number;
  source: string;
  createdAt: Date;
  lastAccessed: Date;
}

class StructuredMemoryStore {
  // `db` is assumed to be a Prisma-style client configured elsewhere
  async saveFact(userId: string, fact: MemoryFact) {
    await db.userMemories.upsert({
      where: { userId },
      update: {
        facts: {
          push: fact
        }
      },
      create: {
        userId,
        facts: [fact],
        preferences: {},
        conversationHistory: []
      }
    });
  }

  async getUserFacts(userId: string): Promise<MemoryFact[]> {
    const memory = await db.userMemories.findUnique({
      where: { userId }
    });
    return memory?.facts || [];
  }
}
```

3. Memory Manager

Orchestrates different memory types and manages retrieval:

```typescript
class MemoryManager {
  private vectorStore: VectorMemoryStore;
  private structuredStore: StructuredMemoryStore;

  constructor(vectorStore: VectorMemoryStore, structuredStore: StructuredMemoryStore) {
    this.vectorStore = vectorStore;
    this.structuredStore = structuredStore;
  }

  async addMemory(userId: string, content: string, type: 'fact' | 'conversation') {
    // Store in the vector database for semantic search
    await this.vectorStore.storeMemory(userId, content, { type });

    // Extract and store structured facts
    if (type === 'fact') {
      const fact: MemoryFact = {
        id: generateId(), // assumes an ID helper, e.g. crypto.randomUUID()
        fact: content,
        confidence: 0.9,
        source: 'user-provided',
        createdAt: new Date(),
        lastAccessed: new Date()
      };
      await this.structuredStore.saveFact(userId, fact);
    }
  }

  async getRelevantMemories(userId: string, query: string): Promise<string[]> {
    // Retrieve from the vector store (semantic search)
    const vectorResults = await this.vectorStore.searchMemory(userId, query, 5);

    // Retrieve structured facts
    const facts = await this.structuredStore.getUserFacts(userId);

    // Combine and rank results
    return this.combineAndRankMemories(vectorResults, facts, query);
  }

  private combineAndRankMemories(
    vectorResults: any[],
    facts: MemoryFact[],
    query: string
  ): string[] {
    // Rank by relevance, recency, and importance
    const combined = [
      ...vectorResults.map(r => ({ content: r.content, score: r.score })),
      ...facts.map(f => ({ content: f.fact, score: f.confidence }))
    ];

    return combined
      .sort((a, b) => b.score - a.score)
      .slice(0, 10)
      .map(m => m.content);
  }
}
```

Implementing Memory-Aware Conversations

Basic Conversation Flow with Memory

```typescript
class MemoryAwareAssistant {
  private memoryManager: MemoryManager;
  private llmClient: any; // Your LLM client (OpenAI, Anthropic, etc.)

  async chat(userId: string, message: string): Promise<string> {
    // 1. Retrieve relevant memories
    const relevantMemories = await this.memoryManager.getRelevantMemories(
      userId,
      message
    );

    // 2. Build context with memories
    const context = this.buildContext(relevantMemories, message);

    // 3. Generate a response with memory context
    const response = await this.llmClient.generateResponse({
      systemPrompt: `You are a helpful assistant with access to conversation history.
Use the following memories to provide personalized responses:
${relevantMemories.join('\n')}`,
      userMessage: message,
      context
    });

    // 4. Store the new interaction
    await this.memoryManager.addMemory(
      userId,
      `User: ${message}\nAssistant: ${response}`,
      'conversation'
    );

    // 5. Extract and store new facts
    const extractedFacts = await this.extractFacts(message, response);
    for (const fact of extractedFacts) {
      await this.memoryManager.addMemory(userId, fact, 'fact');
    }

    return response;
  }

  private buildContext(memories: string[], currentMessage: string): string {
    return `
Previous relevant interactions and facts:
${memories.join('\n---\n')}

Current message: ${currentMessage}
`;
  }

  private async extractFacts(message: string, response: string): Promise<string[]> {
    // Use the LLM to extract factual information
    const prompt = `Extract key facts from this conversation that should be remembered:
User: ${message}
Assistant: ${response}

Return a JSON object of the form {"facts": ["..."]}.`;

    const result = await this.llmClient.generateResponse({
      userMessage: prompt,
      responseFormat: 'json'
    });

    return JSON.parse(result).facts || [];
  }
}
```

Advanced Memory Patterns

1. Hierarchical Memory Organization

```typescript
interface MemoryHierarchy {
  immediate: string[]; // Current conversation
  session: string[];   // Current session memories
  user: string[];      // User-level memories
  global: string[];    // System-wide knowledge
}

class HierarchicalMemory {
  async retrieveMemories(userId: string, query: string): Promise<MemoryHierarchy> {
    return {
      immediate: await this.getImmediateContext(),
      session: await this.getSessionMemories(userId),
      user: await this.getUserMemories(userId, query),
      global: await this.getGlobalKnowledge(query)
    };
  }
}
```
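One way to consume the hierarchy is to flatten it into prompt context with the most specific tier first, so immediate and session memories win when the token budget is tight. A minimal sketch (the tier ordering is a design choice, not part of the interface above; the parameter type mirrors the `MemoryHierarchy` shape):

```typescript
// Flatten a memory hierarchy, most specific tier first,
// capped at `limit` entries for the prompt.
function flattenHierarchy(
  h: { immediate: string[]; session: string[]; user: string[]; global: string[] },
  limit: number
): string[] {
  return [...h.immediate, ...h.session, ...h.user, ...h.global].slice(0, limit);
}
```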

2. Memory Consolidation

Periodic process to summarize and compress old memories:

```typescript
class MemoryConsolidation {
  async consolidateMemories(userId: string) {
    const oldMemories = await this.getOldMemories(userId, 30); // 30 days old

    // Summarize old conversations
    const summary = await this.llmClient.summarize({
      content: oldMemories.join('\n'),
      maxLength: 500
    });

    // Store the consolidated summary
    await this.memoryManager.addMemory(
      userId,
      `Summary of previous interactions: ${summary}`,
      'fact'
    );

    // Archive or delete the original memories
    await this.archiveMemories(oldMemories);
  }
}
```

3. Memory Importance Weighting

```typescript
interface WeightedMemory {
  content: string;
  importance: number;
  recency: number;
  accessCount: number;
}

class MemoryRanking {
  calculateImportance(memory: WeightedMemory): number {
    const recencyScore = this.calculateRecencyScore(memory.recency);
    const frequencyScore = Math.log(memory.accessCount + 1);
    const importanceScore = memory.importance;

    return (
      recencyScore * 0.3 +
      frequencyScore * 0.3 +
      importanceScore * 0.4
    );
  }

  private calculateRecencyScore(daysSinceAccess: number): number {
    // Exponential decay with a time constant of one week
    return Math.exp(-daysSinceAccess / 7);
  }
}
```

Memory Store Options

Popular Vector Databases

  1. Pinecone: Managed, highly scalable
  2. Weaviate: Open-source, hybrid search
  3. Qdrant: Performance-focused, written in Rust
  4. Chroma: Lightweight, developer-friendly
  5. Milvus: Open-source, distributed

Selection Criteria

```typescript
interface VectorDBRequirements {
  scale: 'small' | 'medium' | 'large';
  latency: 'low' | 'medium' | 'high';
  features: string[];
  budget: 'low' | 'medium' | 'high';
}

function selectVectorDB(requirements: VectorDBRequirements): string {
  if (requirements.scale === 'small' && requirements.budget === 'low') {
    return 'Chroma';
  }
  if (requirements.latency === 'low' && requirements.scale === 'large') {
    return 'Pinecone';
  }
  // Add more selection logic for other combinations
  return 'Weaviate'; // fallback: open-source with hybrid search
}
```

Best Practices

1. Privacy and Security

  • Encrypt sensitive memories
  • Implement user data deletion
  • Use tenant isolation in multi-user systems
```typescript
class SecureMemoryStore {
  async storeMemory(userId: string, content: string, sensitive: boolean = false) {
    const data = sensitive ? await this.encrypt(content) : content;
    await this.memoryManager.addMemory(userId, data, 'fact');
  }

  async deleteUserData(userId: string) {
    await this.vectorStore.deleteByFilter({ userId });
    await this.structuredStore.deleteUser(userId);
  }
}
```

2. Memory Refresh and Validation

Periodically validate and update stored memories:

```typescript
async validateMemories(userId: string) {
  const facts = await this.memoryManager.getUserFacts(userId);

  for (const fact of facts) {
    if (this.isOutdated(fact)) {
      await this.updateOrRemoveFact(fact);
    }
  }
}
```
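What counts as "outdated" is application-specific. A simple age-based heuristic over the `lastAccessed` field from `MemoryFact` might look like this (the 90-day threshold is an arbitrary assumption to tune per application):

```typescript
const STALE_AFTER_DAYS = 90; // assumption: tune per application

// A fact is considered outdated once it has gone unaccessed
// for longer than the staleness threshold.
function isOutdated(fact: { lastAccessed: Date }, now: Date = new Date()): boolean {
  const msPerDay = 24 * 60 * 60 * 1000;
  const ageDays = (now.getTime() - fact.lastAccessed.getTime()) / msPerDay;
  return ageDays > STALE_AFTER_DAYS;
}
```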

3. Context Window Management

Balance between memory context and available token budget:

```typescript
function selectMemoriesForContext(
  memories: string[],
  maxTokens: number
): string[] {
  let totalTokens = 0;
  const selected: string[] = [];

  for (const memory of memories) {
    const tokens = estimateTokens(memory);
    if (totalTokens + tokens <= maxTokens) {
      selected.push(memory);
      totalTokens += tokens;
    } else {
      break;
    }
  }

  return selected;
}
```
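The `estimateTokens` helper is left undefined above. Exact counts require the model's own tokenizer, but a common rough heuristic for English text is about four characters per token, which is usually close enough for budgeting (this ratio is an assumption, not a guarantee):

```typescript
// Rough token estimate: ~4 characters per token for English text.
// For exact counts, use the tokenizer of the model you are targeting.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}
```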

Real-World Applications

Customer Support Bots

Remember customer issues, preferences, and previous interactions for personalized support.

Personal AI Assistants

Learn user habits, preferences, and routines to provide proactive assistance.

Educational Platforms

Track learning progress, adapt to student needs, and provide personalized recommendations.

Healthcare AI

Maintain patient history while ensuring HIPAA compliance and data security.

Conclusion

Memory stores are transforming AI systems from stateless processors to intelligent, context-aware assistants. By implementing robust memory architectures, you can build AI applications that:

  • Provide personalized experiences
  • Learn and improve over time
  • Maintain long-term context
  • Build meaningful relationships with users

The key is choosing the right combination of storage technologies, implementing efficient retrieval mechanisms, and maintaining data privacy and security.

As AI systems become more sophisticated, memory management will be crucial for creating truly intelligent applications that understand and remember their users.


Start building memory-aware AI today. Experiment with vector databases, implement fact extraction, and create personalized AI experiences that users will love.