Advanced AI-Powered Document Analysis with Multimodal RAG Capabilities
Cortex AI Hub integrates multiple Large Language Models (LLMs) with a sophisticated multimodal Retrieval-Augmented Generation (RAG) system, enabling you to extract insights from text, visual content, and video transcripts.
NEW: Premium Dark Theme UI with Glassmorphism - Modern, sleek interface with neon green accents, smooth animations, and frosted glass effects!
- Video Transcript Extraction: Automatically fetch YouTube video transcripts
- AI-Powered Summaries: Generate comprehensive video summaries with key takeaways
- Interactive Chat: Ask questions about video content using RAG technology
- Hybrid Search: Semantic + keyword search across video transcripts
- Real-Time Analysis: Instant insights from any YouTube video
- Visual Content Understanding: Analyze images, charts, graphs, and infographics
- Unified Text-Image Search: Search across both textual and visual content
- Context-Aware Analysis: Enhanced understanding with specialized prompts
- Persistent Storage: Efficient multimodal embeddings with pickle storage
- Free & Local: Uses open-source models (BLIP, BLIP-2, GIT)
- Hybrid Search: Combines semantic vector search with BM25 keyword search
- Multi-Document Support: Upload PDFs or provide URLs
- Persistent Vector Database: ChromaDB-powered storage
- Accurate Citations: Source-linked responses with references
- Real-Time Research: ArXiv, Wikipedia, and Tavily web search tools
- Current Information: Up-to-date news and research insights
- Instant Responses: Fast, context-aware answers
- Text-to-Speech: Read-aloud feature using Edge TTS (en-US-AriaNeural voice)
- Glassmorphic Dark Theme: Sleek dark interface with frosted glass effects
- Smooth Animations: Hover effects, transitions, and micro-animations
- Modern Typography: Inter font family with gradient text effects
- Responsive Design: Works beautifully on all screen sizes
- Neon Accents: Eye-catching neon green highlights
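Hybrid search means fusing two independently ranked result lists, one from semantic vector search and one from BM25 keyword search. One standard way to fuse them is Reciprocal Rank Fusion (RRF); the sketch below is illustrative (the function name is not from the codebase, and the app may instead use weighted score blending, e.g. via LangChain's EnsembleRetriever):

```python
def rrf_fuse(semantic_ranked, keyword_ranked, k=60):
    """Fuse two ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores 1 / (k + rank + 1) per list it appears in; documents
    ranked highly by BOTH retrievers float to the top of the fused list.
    """
    scores = {}
    for ranked in (semantic_ranked, keyword_ranked):
        for rank, doc_id in enumerate(ranked):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# A doc that tops both rankings ("a" here) wins the fused ranking
fused = rrf_fuse(["a", "b", "c"], ["a", "c", "d"])
```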
| Model | Provider | Best For |
|---|---|---|
| llama-3.3-70b-versatile | Meta | Complex reasoning, analysis |
| llama-3.1-8b-instant | Meta | Quick queries, fast responses |
| meta-llama/llama-guard-4-12b | Meta | Safety and content moderation |
| openai/gpt-oss-120b | OpenAI | Complex analysis tasks |
| openai/gpt-oss-20b | OpenAI | Balanced performance |
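Selecting one of the Groq-hosted models above might be wired up as in the sketch below, using LangChain's Groq integration. `GROQ_MODELS` and `make_llm` are illustrative names, not from the codebase, and instantiating `ChatGroq` requires a `GROQ_API_KEY` in the environment:

```python
# Illustrative registry of the Groq-hosted models listed above
GROQ_MODELS = {
    "llama-3.3-70b-versatile": "complex reasoning, analysis",
    "llama-3.1-8b-instant": "quick queries, fast responses",
    "meta-llama/llama-guard-4-12b": "safety and content moderation",
    "openai/gpt-oss-120b": "complex analysis tasks",
    "openai/gpt-oss-20b": "balanced performance",
}

def make_llm(model_id: str, temperature: float = 0.0):
    """Instantiate a Groq-backed chat model (requires GROQ_API_KEY)."""
    if model_id not in GROQ_MODELS:
        raise ValueError(f"Unknown model: {model_id}")
    from langchain_groq import ChatGroq  # lazy import; needs langchain-groq
    return ChatGroq(model=model_id, temperature=temperature)
```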
| Model | Description | Best For |
|---|---|---|
| BLIP | Quick image captioning | Speed, basic analysis |
| BLIP-2 | Advanced understanding | Complex visual content |
| GIT | Detailed descriptions | Charts, graphs, infographics |
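These vision models are typically loaded through the Hugging Face Transformers `image-to-text` pipeline. The checkpoint names below are common public checkpoints for each family, assumed for illustration, not necessarily the exact ones the app ships with:

```python
# Illustrative mapping of UI model names to Hugging Face checkpoints (assumed)
VISION_CHECKPOINTS = {
    "BLIP": "Salesforce/blip-image-captioning-base",
    "BLIP-2": "Salesforce/blip2-opt-2.7b",
    "GIT": "microsoft/git-base-coco",
}

def caption_image(image_path: str, model_name: str = "BLIP") -> str:
    """Caption an image with the chosen vision model.

    Downloads model weights on first use; needs transformers + torch installed.
    """
    from transformers import pipeline  # lazy import
    captioner = pipeline("image-to-text", model=VISION_CHECKPOINTS[model_name])
    return captioner(image_path)[0]["generated_text"]
```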
Traditional RAG chatbot with document upload and multi-LLM selection
Enhanced multimodal interface with vision model selection and image analysis
AI-powered search agent with real-time research capabilities
Complete RAG chatbot workflow with document processing, hybrid search, and multi-LLM response generation
AI-powered search agent workflow with multi-tool research and intelligent orchestration
Enhanced multimodal workflow combining text and visual content analysis
- Python 3.11+
- Git
- API Keys: Groq and Tavily
1. Clone the repository

       git clone https://github.com/RobinMillford/Cortex-AI-Multi-Model-Insights-Hub.git
       cd Cortex-AI-Multi-Model-Insights-Hub

2. Set up the environment

       python -m venv venv
       source venv/bin/activate  # Windows: venv\Scripts\activate
       pip install -r requirements.txt

3. Configure API keys

       cp .env.template .env
       # Add your GROQ_API_KEY and TAVILY_API_KEY to .env

4. Run the application

       streamlit run Main_Page.py
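Apps structured like this usually load the keys from `.env` with python-dotenv's `load_dotenv()`; how exactly this project does it is not shown here. A minimal stdlib-only equivalent, for illustration:

```python
import os

def load_env(path: str = ".env") -> None:
    """Minimal .env loader: KEY=VALUE lines, '#' comments, blanks ignored.

    Uses setdefault so variables already in the environment win.
    """
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```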
- Navigate to "YouTube Analyst" page
- Paste a YouTube URL in the sidebar
- Click "Analyze Video" to extract transcript
- View auto-generated summary
- Ask questions about the video content
- Get AI-powered insights with context from the transcript
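Under the hood, analyzing a video means turning the pasted URL into an 11-character video ID and fetching the transcript with the YouTube Transcript API. A sketch, with illustrative function names; note the exact fetch call varies across youtube-transcript-api versions (older releases use the classmethod shown, newer 1.x releases use an instance `fetch()` method):

```python
from urllib.parse import urlparse, parse_qs

def extract_video_id(url: str) -> str:
    """Pull the video ID out of common YouTube URL shapes."""
    parsed = urlparse(url)
    if parsed.hostname == "youtu.be":          # short links: youtu.be/<id>
        return parsed.path.lstrip("/")
    return parse_qs(parsed.query)["v"][0]      # watch links: ?v=<id>

def fetch_transcript(url: str) -> str:
    """Join transcript segments into one string (needs youtube-transcript-api)."""
    from youtube_transcript_api import YouTubeTranscriptApi  # lazy import
    segments = YouTubeTranscriptApi.get_transcript(extract_video_id(url))
    return " ".join(seg["text"] for seg in segments)
```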
- Navigate to "Multimodal RAG" page
- Choose vision model (BLIP for speed, GIT for accuracy)
- Upload PDF with images/charts
- Enable "Extract and analyze images"
- Ask questions about text and visual content
- Go to "RAG Chatbot" page
- Upload PDFs or enter URLs
- Configure retrieval parameters
- Select LLM models for comparison
- Ask questions and get cited responses
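"Retrieval parameters" here typically means chunk size, chunk overlap, and how many chunks to retrieve per query. The sketch below shows the core chunking idea in plain Python; the app itself likely delegates this to a LangChain text splitter, so treat the function as illustrative:

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200):
    """Split text into overlapping character windows for embedding.

    Overlap keeps sentences that straddle a boundary retrievable from
    both neighboring chunks.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```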
- Visit "Search Agent" page
- Enter research queries
- Choose preferred LLM model
- Get real-time answers with sources
- Frontend: Streamlit with premium glassmorphic dark theme
- Backend: Python, LangChain/LangGraph
- Vector DB: ChromaDB (text embeddings)
- Embeddings: HuggingFace sentence-transformers
- Vision: BLIP, BLIP-2, GIT (Hugging Face Transformers)
- LLMs: Groq API (Meta Llama, OpenAI models)
- Search: Tavily, ArXiv, Wikipedia APIs
- Video: YouTube Transcript API
- Text-to-Speech: Edge TTS (Microsoft Azure Neural Voices)
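The read-aloud feature maps naturally onto the edge-tts package's async API. A sketch, assuming that package (synthesis itself needs network access to Microsoft's service, so this only illustrates the call shape):

```python
import asyncio

VOICE = "en-US-AriaNeural"  # the voice named above

async def speak(text: str, out_path: str = "answer.mp3") -> str:
    """Synthesize speech to an MP3 file with Edge TTS (needs edge-tts installed)."""
    import edge_tts  # lazy import
    await edge_tts.Communicate(text, VOICE).save(out_path)
    return out_path

# e.g. asyncio.run(speak("Hello from Cortex AI Hub"))
```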
    ├── Main_Page.py            # App entry point with hero section
    ├── multimodal_helpers.py   # Multimodal processing utilities
    ├── helpers.py              # Text processing utilities
    ├── chain_setup.py          # LLM configuration
    ├── styles.py               # Premium dark theme CSS
    ├── config.py               # Model configurations
    ├── pages/
    │   ├── 1_RAG_Chatbot.py     # Traditional RAG interface
    │   ├── 2_Search_Agent.py    # Web search agent
    │   ├── 3_Multimodal_RAG.py  # Multimodal interface
    │   └── 4_YouTube_Analyst.py # YouTube video analysis (NEW!)
    ├── chroma_db/              # Text vector storage
    ├── multimodal_stores/      # Multimodal embeddings storage
    └── requirements.txt        # Python dependencies
- YouTube Integration: Transcript extraction with RAG-powered Q&A
- Two-Layer Vision: Vision models → descriptions, embeddings → search
- Hybrid Search: Semantic + BM25 for optimal retrieval
- Model Caching: Global cache prevents reloading
- Session Management: Streamlit state for persistence
- Glassmorphism UI: Backdrop blur and frosted glass effects
- Vision models cached globally
- Processed embeddings saved for reuse
- Lazy loading when needed
- Real-time progress feedback
- Efficient pickle-based storage
- Optimized ChromaDB collection naming
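The "processed embeddings saved for reuse" point comes down to a small pickle-backed store: compute once, persist, and skip reprocessing on the next load. A sketch with illustrative function names (the app's actual store layout under `multimodal_stores/` may differ):

```python
import pickle
from pathlib import Path

def save_store(embeddings: dict, path: str) -> None:
    """Persist computed embeddings so reprocessing can be skipped later."""
    Path(path).parent.mkdir(parents=True, exist_ok=True)
    with open(path, "wb") as fh:
        pickle.dump(embeddings, fh)

def load_store(path: str):
    """Return cached embeddings if present, else None (caller reprocesses)."""
    p = Path(path)
    if not p.exists():
        return None
    with p.open("rb") as fh:
        return pickle.load(fh)
```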
- Glassmorphic Design: Frosted glass effects with backdrop blur
- Gradient Text Effects: Animated gradient titles
- Smooth Animations: Cubic-bezier transitions
- Neon Glow Effects: Interactive hover states
- Modern Typography: Inter font family
- Custom Scrollbars: Styled with gradient effects
- Enhanced Components: Buttons, inputs, expanders, and more
- YouTube Analyst: NEW feature for video transcript analysis and chat
- Text-to-Speech: Read-aloud feature in Search Agent using Edge TTS
- Glassmorphic UI: Complete redesign with frosted glass effects
- Inter Font: Modern typography with gradient text effects
- Enhanced Animations: Smooth cubic-bezier transitions
- Improved Components: All UI elements redesigned
- Updated Main Page: 2x2 grid layout for 4 tools
- CSS Centralization: Unified styles.py for consistency
- Premium Dark Theme: Complete UI overhaul with modern design
- Updated Model List: Added llama-guard-4-12b, removed deprecated models
- Dependency Cleanup: Removed pysqlite3-binary for better compatibility
- Enhanced Animations: Smooth transitions and hover effects
- Stats Section: Added visual statistics on main page
- Improved Navigation: Better sidebar organization
1. Fork the repository
2. Create a feature branch:

       git checkout -b feature/your-feature

3. Make changes and test locally
4. Commit and push:

       git commit -m "Add feature"
       git push origin feature/your-feature

5. Create a Pull Request
- Enhanced video analysis features
- New vision models or analysis techniques
- Better retrieval algorithms
- UI/UX improvements
- Analytics and metrics
- Testing and documentation
This project is licensed under the AGPL-3.0 License.
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Hugging Face: Free open-source vision models
- Meta: Llama models and vision transformers
- Salesforce: BLIP vision models
- Microsoft: GIT vision model
- Groq: Fast LLM inference
- Streamlit: Amazing app framework
- Tavily: Advanced web search API
- YouTube Transcript API: Video transcript extraction

Made with ❤️ by Yamin Hossain