zmzhace/memfog-py
Memfog Python

Intent recognition and conversational memory management for AI applications with enterprise-grade storage.

Features

Core Features (v1)

  • Intent Classification: Determine user intent for document processing, follow-up questions, and memory retrieval
  • Memory Management: Extract, store, and retrieve conversational memories with intelligent deduplication
  • Vector & Keyword Search: Hybrid retrieval combining vector similarity, BM25, and temporal ranking
  • Document Processing: Section-aware chunking and semantic search for long documents
  • Prompt Enhancement: Automatically enrich prompts with relevant historical context

Storage Layer (v2) 🆕

  • Pluggable Storage: In-memory and SQLite adapters with protocol-based extensibility
  • Multi-Tenant Support: User and team-based memory isolation
  • Vector Search: Efficient similarity search with filtering
  • Session Management: Track conversation sessions with context
  • Memory Lifecycle: Short-term, long-term, and archived memory states
  • Batch Operations: High-performance bulk inserts and queries

Installation

pip install memfog

For development:

git clone https://github.com/zmzhace/memfog-py.git
cd memfog-py
pip install -e ".[dev]"

Quick Start

v1: Memory Extraction & Intent Classification

Memory Extraction

import asyncio
from memfog import MemoryExtractor, OpenAIChat

llm = OpenAIChat(
    api_key="your-api-key",
    base_url="https://api.openai.com/v1",
    model="gpt-4"
)

extractor = MemoryExtractor(llm)

async def main():
    memories = await extractor.extract(
        user_input="I prefer using Python for data analysis",
        ai_response="That's a great choice! Python has excellent libraries."
    )
    print(memories)

asyncio.run(main())

v2: Enterprise Storage Layer 🆕

Basic Memory Operations

The v2 snippets below use top-level await for brevity; run them inside an async function (for example via asyncio.run) or in an asyncio-enabled REPL such as python -m asyncio.

from memfog.v2.storage.factory import create_storage
from memfog.v2.types import MemoryItem, MemoryScope, MemoryType

# Create storage adapter (in-memory or SQLite)
storage = create_storage("sqlite", db_path="./memories.db")
await storage.initialize()

# Create a memory
memory: MemoryItem = {
    "id": "mem-1",
    "content": "User prefers dark mode for coding",
    "category": "preference",
    "memory_type": MemoryType.TEXT,
    "scope": MemoryScope.LONG_TERM,
    "user_id": "user-123",
    "importance": 8,
    "confidence": 0.95,
    "embedding": [0.1, 0.2, 0.3, ...]  # 1536-dim vector
}
await storage.create_memory(memory)

# Query memories
memories = await storage.query_memories({
    "user_id": "user-123",
    "scope": MemoryScope.LONG_TERM.value,
    "archived": False
}, limit=10)

# Vector similarity search (query_embedding is a vector from your embedding model)
results = await storage.vector_search(
    embedding=query_embedding,
    filters={"user_id": "user-123"},
    limit=5,
    min_score=0.7
)

for memory, score in results:
    print(f"{memory['content']} (similarity: {score:.2f})")

await storage.close()
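Under the hood, vector search in the in-memory adapter presumably reduces to filtering followed by brute-force cosine scoring. The following is a minimal sketch of that idea, with illustrative names, not memfog's actual implementation:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def vector_search(memories, query_emb, filters, limit=5, min_score=0.7):
    """Brute-force sketch: filter, score, threshold, sort, truncate."""
    # keep only memories matching every filter key/value
    candidates = [m for m in memories
                  if all(m.get(k) == v for k, v in filters.items())]
    scored = [(m, cosine(m["embedding"], query_emb)) for m in candidates]
    scored = [(m, s) for m, s in scored if s >= min_score]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]
```

A production backend would replace the linear scan with an index, but the filter-then-score shape stays the same.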

Multi-Tenant Memory Management

# User management
user = {
    "id": "user-123",
    "name": "Alice",
    "team_id": "team-1",
    "metadata": {"role": "developer"}
}
await storage.create_user(user)

# Session tracking
session = {
    "id": "session-456",
    "user_id": "user-123",
    "context": {"project": "memfog-py"},
    "started_at": "2026-03-07T10:00:00Z"
}
await storage.create_session(session)

# Isolate memories by user
user_memories = await storage.query_memories({"user_id": "user-123"})

# Team-based access
team_memories = await storage.query_memories({"team_id": "team-1"})

Memory Lifecycle Management

# Create short-term memory
temp_memory = {
    "id": "temp-1",
    "content": "Current task: refactor auth module",
    "scope": MemoryScope.SHORT_TERM,
    "importance": 5
}
await storage.create_memory(temp_memory)

# Promote to long-term
await storage.update_memory("temp-1", {
    "scope": MemoryScope.LONG_TERM,
    "importance": 9
})

# Archive old memories
await storage.update_memory("temp-1", {"archived": True})

v1: Vector Memory Store (Legacy)

import asyncio

from memfog import VectorMemoryStore, OpenAIChat, OpenAIEmbedding, MemoryStorage

llm = OpenAIChat(api_key="...", base_url="...", model="gpt-4")
embedding = OpenAIEmbedding(
    api_key="...",
    base_url="...",
    model="text-embedding-3-small",
    dimensions=1536
)
storage = MemoryStorage()

store = VectorMemoryStore(llm=llm, embedding=embedding, storage=storage)

async def main():
    # Add memory
    await store.add({
        "content": "User prefers dark mode",
        "category": "preference",
        "importance": 4
    })

    # Search
    results = await store.search("What are my UI preferences?")
    for mem in results:
        print(f"{mem['content']} (score: {mem['score']:.2f})")

asyncio.run(main())

Intent Classification

import asyncio
from memfog import DocumentIntentClassifier, RetrievalIntentClassifier

async def main():
    # Document intent (llm is the OpenAIChat instance from the setup above)
    doc_classifier = DocumentIntentClassifier(llm)
    intent = await doc_classifier.classify("Summarize this document")
    print(intent)  # "global" or "detail"

    # Retrieval intent
    retrieval_classifier = RetrievalIntentClassifier()
    should_retrieve = await retrieval_classifier.should_retrieve("What's my preference?")
    print(should_retrieve)  # True

asyncio.run(main())

Document Store

import asyncio
from memfog import DocumentStore

doc_store = DocumentStore(embedding)

async def main():
    # Ingest document
    with open("document.txt") as f:
        text = f.read()
    await doc_store.ingest("doc1", text)

    # Search
    results = await doc_store.search("machine learning algorithms", limit=3)
    for result in results:
        print(f"Score: {result['score']:.2f}")
        print(f"Content: {result['content'][:100]}...")

asyncio.run(main())
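Section-aware chunking, which DocumentStore's description suggests, can be sketched as splitting on headings and then capping oversized sections. This is an illustrative sketch under that assumption, not memfog's actual chunker:

```python
import re

def section_chunks(text, max_chars=800):
    """Split text at markdown-style headings, keeping each heading with its body,
    then fall back to fixed-size slices for sections that exceed max_chars."""
    sections = re.split(r"(?m)^(?=#{1,6} )", text)  # zero-width split before headings
    chunks = []
    for sec in sections:
        sec = sec.strip()
        if not sec:
            continue
        for i in range(0, len(sec), max_chars):
            chunks.append(sec[i:i + max_chars])
    return chunks
```

Keeping the heading attached to its body preserves section context for embedding and retrieval.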

Prompt Enhancement

import asyncio
from memfog import PromptEnhancer

enhancer = PromptEnhancer(
    vector_store=store,
    embedding=embedding,
    llm=llm,
    retrieval_classifier=retrieval_classifier
)

async def main():
    result = await enhancer.enhance(
        user_input="What should I use for the project?",
        recent_history=[
            {"role": "user", "content": "I'm starting a new data project"},
            {"role": "assistant", "content": "That sounds exciting!"}
        ]
    )
    print(result["structured_memory"])

asyncio.run(main())

Architecture

v1: Core Modules

Core Module

  • Interfaces: LLMProvider, EmbeddingProvider, StorageAdapter
  • Providers: OpenAIChat, OpenAIEmbedding
  • Storage: MemoryStorage (in-memory)
  • Math: cosine, tokenize, bm25_score, rrf_fuse
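The rrf_fuse helper presumably implements standard reciprocal rank fusion, which is how hybrid retrieval typically combines vector, BM25, and temporal rankings. A generic sketch of the algorithm (the signature is assumed, not memfog's actual API):

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion over several best-first ranked lists of ids.

    A document's fused score is the sum of 1 / (k + rank) across every
    list it appears in; k=60 is the constant from the original RRF paper.
    """
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

RRF needs only ranks, not raw scores, so it fuses cosine and BM25 results without any score normalization.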

Intent Module

  • DocumentIntentClassifier: Classify document processing intent
  • FollowUpIntentClassifier: Determine if follow-up needs search
  • RetrievalIntentClassifier: Decide if memory retrieval is needed

Memory Module

  • MemoryExtractor: Extract structured memories from conversations
  • VectorMemoryStore: Vector-based memory with deduplication
  • LocalMemoryStore: Keyword-based fallback memory
  • DocumentStore: Section-aware document chunking and search
  • PromptEnhancer: 5-step memory-enhanced prompt pipeline
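Embedding-based deduplication, as VectorMemoryStore's description suggests, can be sketched as skipping any new memory whose embedding is too close to an existing one. The function name and the 0.9 threshold below are illustrative, not memfog's actual values:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def add_if_novel(store, new_emb, new_content, threshold=0.9):
    """store: list of {"content", "embedding"} dicts. Returns True if added."""
    for mem in store:
        if cosine(mem["embedding"], new_emb) >= threshold:
            return False  # near-duplicate; keep the existing memory
    store.append({"content": new_content, "embedding": new_emb})
    return True
```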

v2: Storage Layer 🆕

Core Components

  • StorageAdapter Protocol: Interface for pluggable storage backends
  • MemoryStorage: In-memory implementation (development/testing)
  • SQLiteStorage: Production SQLite with vector search
  • StorageFactory: Registry pattern for creating adapters
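The registry pattern behind StorageFactory can be sketched generically: a module-level dict maps backend names to adapter classes, and a factory function looks them up. Names here are illustrative, not memfog's actual classes:

```python
_REGISTRY = {}

def register(name):
    """Class decorator that records an adapter class under a backend name."""
    def wrap(cls):
        _REGISTRY[name] = cls
        return cls
    return wrap

def create_storage(name, **kwargs):
    """Instantiate the adapter registered under `name`."""
    try:
        return _REGISTRY[name](**kwargs)
    except KeyError:
        raise ValueError(f"unknown storage backend: {name!r}") from None

@register("memory")
class InMemoryStorage:
    def __init__(self):
        self.items = {}
```

New backends then plug in with a single decorator, without touching the factory itself.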

Data Models

  • MemoryItem: Comprehensive memory schema with metadata
  • User: User entity with team support
  • Session: Session tracking with context
  • Enums: MemoryScope, MemoryVisibility, MemoryType, Permission

Key Features

  • Multi-tenant isolation (user/team level)
  • Vector similarity search with filters
  • Batch operations for performance
  • Memory lifecycle management (short-term → long-term → archived)
  • Session context tracking

For detailed v2 documentation, see memfog/v2/README.md

Version Comparison

| Feature | v1 | v2 |
| --- | --- | --- |
| Memory Storage | In-memory only | In-memory + SQLite |
| Multi-tenancy | ❌ | ✅ User & team isolation |
| Vector Search | Basic | Advanced, with filters |
| Session Management | ❌ | ✅ Full session tracking |
| Batch Operations | ❌ | ✅ High-performance bulk ops |
| Memory Lifecycle | Basic | ✅ Short/long-term + archive |
| Production Ready | Development only | ✅ SQLite persistence |

Recommendation: Use v2 for production applications requiring persistence and multi-user support. v1 remains available for simple use cases.

Testing

# Run all tests
pytest

# Run v1 tests
pytest tests/test_*.py -v

# Run v2 tests
pytest tests/v2/ -v

# Run specific v2 test suites
pytest tests/v2/test_memory_storage.py -v
pytest tests/v2/test_sqlite_storage.py -v
pytest tests/v2/test_integration.py -v

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please open an issue or submit a pull request.

Roadmap

  • PostgreSQL storage adapter with pgvector
  • Automatic v1 → v2 migration tool
  • Memory compression and archival strategies
  • Distributed storage support (Redis, etc.)
  • Advanced conflict resolution
  • Memory graph relationships
