zmzhace/memfog-py
Memfog Python

Intent recognition and conversational memory management for AI applications with enterprise-grade storage.

Features

Core Features (v1)

  • Intent Classification: Determine user intent for document processing, follow-up questions, and memory retrieval
  • Memory Management: Extract, store, and retrieve conversational memories with intelligent deduplication
  • Vector & Keyword Search: Hybrid retrieval combining vector similarity, BM25, and temporal ranking
  • Document Processing: Section-aware chunking and semantic search for long documents
  • Prompt Enhancement: Automatically enrich prompts with relevant historical context

Storage Layer (v2) 🆕

  • Pluggable Storage: In-memory and SQLite adapters with protocol-based extensibility
  • Multi-Tenant Support: User and team-based memory isolation
  • Vector Search: Efficient similarity search with filtering
  • Session Management: Track conversation sessions with context
  • Memory Lifecycle: Short-term, long-term, and archived memory states
  • Batch Operations: High-performance bulk inserts and queries

Installation

pip install memfog

For development:

git clone https://github.com/zmzhace/memfog-py.git
cd memfog-py
pip install -e ".[dev]"

Quick Start

v1: Memory Extraction & Intent Classification

Memory Extraction

import asyncio
from memfog import MemoryExtractor, OpenAIChat

llm = OpenAIChat(
    api_key="your-api-key",
    base_url="https://api.openai.com/v1",
    model="gpt-4"
)

extractor = MemoryExtractor(llm)

async def main():
    memories = await extractor.extract(
        user_input="I prefer using Python for data analysis",
        ai_response="That's a great choice! Python has excellent libraries."
    )
    print(memories)

asyncio.run(main())

v2: Enterprise Storage Layer 🆕

Basic Memory Operations

The v2 snippets below use top-level await for brevity; run them inside an async function (for example via asyncio.run) or in an asyncio-enabled REPL such as python -m asyncio.

from memfog.v2.storage.factory import create_storage
from memfog.v2.types import MemoryItem, MemoryScope, MemoryType

# Create storage adapter (in-memory or SQLite)
storage = create_storage("sqlite", db_path="./memories.db")
await storage.initialize()

# Create a memory
memory: MemoryItem = {
    "id": "mem-1",
    "content": "User prefers dark mode for coding",
    "category": "preference",
    "memory_type": MemoryType.TEXT,
    "scope": MemoryScope.LONG_TERM,
    "user_id": "user-123",
    "importance": 8,
    "confidence": 0.95,
    "embedding": [0.1, 0.2, 0.3, ...]  # 1536-dim vector
}
await storage.create_memory(memory)

# Query memories
memories = await storage.query_memories({
    "user_id": "user-123",
    "scope": MemoryScope.LONG_TERM.value,
    "archived": False
}, limit=10)

# Vector similarity search (query_embedding is a vector from your embedding model)
results = await storage.vector_search(
    embedding=query_embedding,
    filters={"user_id": "user-123"},
    limit=5,
    min_score=0.7
)

for memory, score in results:
    print(f"{memory['content']} (similarity: {score:.2f})")

await storage.close()
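Under the hood, vector search in the in-memory adapter presumably reduces to filtering followed by brute-force cosine scoring. The following is a minimal sketch of that idea, with illustrative names, not memfog's actual implementation:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def vector_search(memories, query_emb, filters, limit=5, min_score=0.7):
    """Brute-force sketch: filter, score, threshold, sort, truncate."""
    # keep only memories matching every filter key/value
    candidates = [m for m in memories
                  if all(m.get(k) == v for k, v in filters.items())]
    scored = [(m, cosine(m["embedding"], query_emb)) for m in candidates]
    scored = [(m, s) for m, s in scored if s >= min_score]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]
```

A production backend would replace the linear scan with an index, but the filter-then-score shape stays the same.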

Multi-Tenant Memory Management

# User management
user = {
    "id": "user-123",
    "name": "Alice",
    "team_id": "team-1",
    "metadata": {"role": "developer"}
}
await storage.create_user(user)

# Session tracking
session = {
    "id": "session-456",
    "user_id": "user-123",
    "context": {"project": "memfog-py"},
    "started_at": "2026-03-07T10:00:00Z"
}
await storage.create_session(session)

# Isolate memories by user
user_memories = await storage.query_memories({"user_id": "user-123"})

# Team-based access
team_memories = await storage.query_memories({"team_id": "team-1"})

Memory Lifecycle Management

# Create short-term memory
temp_memory = {
    "id": "temp-1",
    "content": "Current task: refactor auth module",
    "scope": MemoryScope.SHORT_TERM,
    "importance": 5
}
await storage.create_memory(temp_memory)

# Promote to long-term
await storage.update_memory("temp-1", {
    "scope": MemoryScope.LONG_TERM,
    "importance": 9
})

# Archive old memories
await storage.update_memory("temp-1", {"archived": True})

v1: Vector Memory Store (Legacy)

import asyncio

from memfog import VectorMemoryStore, OpenAIChat, OpenAIEmbedding, MemoryStorage

llm = OpenAIChat(api_key="...", base_url="...", model="gpt-4")
embedding = OpenAIEmbedding(
    api_key="...",
    base_url="...",
    model="text-embedding-3-small",
    dimensions=1536
)
storage = MemoryStorage()

store = VectorMemoryStore(llm=llm, embedding=embedding, storage=storage)

async def main():
    # Add memory
    await store.add({
        "content": "User prefers dark mode",
        "category": "preference",
        "importance": 4
    })

    # Search
    results = await store.search("What are my UI preferences?")
    for mem in results:
        print(f"{mem['content']} (score: {mem['score']:.2f})")

asyncio.run(main())

Intent Classification

import asyncio
from memfog import DocumentIntentClassifier, RetrievalIntentClassifier

async def main():
    # Document intent (llm is the OpenAIChat instance from the setup above)
    doc_classifier = DocumentIntentClassifier(llm)
    intent = await doc_classifier.classify("Summarize this document")
    print(intent)  # "global" or "detail"

    # Retrieval intent
    retrieval_classifier = RetrievalIntentClassifier()
    should_retrieve = await retrieval_classifier.should_retrieve("What's my preference?")
    print(should_retrieve)  # True

asyncio.run(main())

Document Store

import asyncio
from memfog import DocumentStore

doc_store = DocumentStore(embedding)

async def main():
    # Ingest document
    with open("document.txt") as f:
        text = f.read()
    await doc_store.ingest("doc1", text)

    # Search
    results = await doc_store.search("machine learning algorithms", limit=3)
    for result in results:
        print(f"Score: {result['score']:.2f}")
        print(f"Content: {result['content'][:100]}...")

asyncio.run(main())
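Section-aware chunking, which DocumentStore's description suggests, can be sketched as splitting on headings and then capping oversized sections. This is an illustrative sketch under that assumption, not memfog's actual chunker:

```python
import re

def section_chunks(text, max_chars=800):
    """Split text at markdown-style headings, keeping each heading with its body,
    then fall back to fixed-size slices for sections that exceed max_chars."""
    sections = re.split(r"(?m)^(?=#{1,6} )", text)  # zero-width split before headings
    chunks = []
    for sec in sections:
        sec = sec.strip()
        if not sec:
            continue
        for i in range(0, len(sec), max_chars):
            chunks.append(sec[i:i + max_chars])
    return chunks
```

Keeping the heading attached to its body preserves section context for embedding and retrieval.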

Prompt Enhancement

import asyncio
from memfog import PromptEnhancer

enhancer = PromptEnhancer(
    vector_store=store,
    embedding=embedding,
    llm=llm,
    retrieval_classifier=retrieval_classifier
)

async def main():
    result = await enhancer.enhance(
        user_input="What should I use for the project?",
        recent_history=[
            {"role": "user", "content": "I'm starting a new data project"},
            {"role": "assistant", "content": "That sounds exciting!"}
        ]
    )
    print(result["structured_memory"])

asyncio.run(main())

Architecture

v1: Core Modules

Core Module

  • Interfaces: LLMProvider, EmbeddingProvider, StorageAdapter
  • Providers: OpenAIChat, OpenAIEmbedding
  • Storage: MemoryStorage (in-memory)
  • Math: cosine, tokenize, bm25_score, rrf_fuse
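The rrf_fuse helper presumably implements standard reciprocal rank fusion, which is how hybrid retrieval typically combines vector, BM25, and temporal rankings. A generic sketch of the algorithm (the signature is assumed, not memfog's actual API):

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion over several best-first ranked lists of ids.

    A document's fused score is the sum of 1 / (k + rank) across every
    list it appears in; k=60 is the constant from the original RRF paper.
    """
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

RRF needs only ranks, not raw scores, so it fuses cosine and BM25 results without any score normalization.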

Intent Module

  • DocumentIntentClassifier: Classify document processing intent
  • FollowUpIntentClassifier: Determine if follow-up needs search
  • RetrievalIntentClassifier: Decide if memory retrieval is needed

Memory Module

  • MemoryExtractor: Extract structured memories from conversations
  • VectorMemoryStore: Vector-based memory with deduplication
  • LocalMemoryStore: Keyword-based fallback memory
  • DocumentStore: Section-aware document chunking and search
  • PromptEnhancer: 5-step memory-enhanced prompt pipeline
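Embedding-based deduplication, as VectorMemoryStore's description suggests, can be sketched as skipping any new memory whose embedding is too close to an existing one. The function name and the 0.9 threshold below are illustrative, not memfog's actual values:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def add_if_novel(store, new_emb, new_content, threshold=0.9):
    """store: list of {"content", "embedding"} dicts. Returns True if added."""
    for mem in store:
        if cosine(mem["embedding"], new_emb) >= threshold:
            return False  # near-duplicate; keep the existing memory
    store.append({"content": new_content, "embedding": new_emb})
    return True
```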

v2: Storage Layer 🆕

Core Components

  • StorageAdapter Protocol: Interface for pluggable storage backends
  • MemoryStorage: In-memory implementation (development/testing)
  • SQLiteStorage: Production SQLite with vector search
  • StorageFactory: Registry pattern for creating adapters
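The registry pattern behind StorageFactory can be sketched generically: a module-level dict maps backend names to adapter classes, and a factory function looks them up. Names here are illustrative, not memfog's actual classes:

```python
_REGISTRY = {}

def register(name):
    """Class decorator that records an adapter class under a backend name."""
    def wrap(cls):
        _REGISTRY[name] = cls
        return cls
    return wrap

def create_storage(name, **kwargs):
    """Instantiate the adapter registered under `name`."""
    try:
        return _REGISTRY[name](**kwargs)
    except KeyError:
        raise ValueError(f"unknown storage backend: {name!r}") from None

@register("memory")
class InMemoryStorage:
    def __init__(self):
        self.items = {}
```

New backends then plug in with a single decorator, without touching the factory itself.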

Data Models

  • MemoryItem: Comprehensive memory schema with metadata
  • User: User entity with team support
  • Session: Session tracking with context
  • Enums: MemoryScope, MemoryVisibility, MemoryType, Permission

Key Features

  • Multi-tenant isolation (user/team level)
  • Vector similarity search with filters
  • Batch operations for performance
  • Memory lifecycle management (short-term → long-term → archived)
  • Session context tracking

For detailed v2 documentation, see memfog/v2/README.md

Version Comparison

| Feature | v1 | v2 |
| --- | --- | --- |
| Memory Storage | In-memory only | In-memory + SQLite |
| Multi-tenancy | ❌ | ✅ User & team isolation |
| Vector Search | Basic | Advanced, with filters |
| Session Management | ❌ | ✅ Full session tracking |
| Batch Operations | ❌ | ✅ High-performance bulk ops |
| Memory Lifecycle | Basic | ✅ Short/long-term + archive |
| Production Ready | Development only | ✅ SQLite persistence |

Recommendation: Use v2 for production applications requiring persistence and multi-user support. v1 remains available for simple use cases.

Testing

# Run all tests
pytest

# Run v1 tests
pytest tests/test_*.py -v

# Run v2 tests
pytest tests/v2/ -v

# Run specific v2 test suites
pytest tests/v2/test_memory_storage.py -v
pytest tests/v2/test_sqlite_storage.py -v
pytest tests/v2/test_integration.py -v

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please open an issue or submit a pull request.

Roadmap

  • PostgreSQL storage adapter with pgvector
  • Automatic v1 → v2 migration tool
  • Memory compression and archival strategies
  • Distributed storage support (Redis, etc.)
  • Advanced conflict resolution
  • Memory graph relationships
