PDF Helper AI 📄🤖

A modern, AI-powered PDF management and interaction platform built with Next.js and Express.js. Upload, analyze, and chat with your PDF documents using advanced AI technology.

✨ Features

🚀 AI-Powered PDF Chat: Interact with your PDF documents using advanced AI
🧠 Qwen2-VL-2B-Instruct Model: Local vision-language model for advanced PDF understanding
🤖 Multi-Model AI Integration: Google Generative AI + Qwen2-VL for optimal performance
📱 Modern UI/UX: Beautiful, responsive design with glass morphism effects
🔍 Smart Analysis: Extract insights and analyze PDF content with AI
📝 Note Management: Create and manage notes from your PDFs
🖼️ Image Extraction: Extract and analyze images from PDF documents
💬 Conversational AI: Natural language processing for interactive discussions
🎯 Context-Aware Responses: AI maintains conversation history for meaningful interactions
📊 Automated Summarization: Generate intelligent summaries of PDF content
🔐 User Authentication: Secure user management with JWT
📊 Dashboard: Centralized management of all your PDFs and notes
🌐 RESTful API: Well-documented API with Swagger/OpenAPI

🤖 AI & Machine Learning

Generative AI Integration

Our PDF Helper AI leverages cutting-edge generative AI technologies to provide intelligent document interaction:

🧠 Qwen2-VL-2B-Instruct: Advanced vision-language model deployed locally for superior PDF understanding and multimodal analysis
🌟 Google Generative AI (Gemini): Integrated for advanced reasoning, content generation, and multi-modal understanding
🔄 Hybrid AI Architecture: Combines the power of cloud-based GenAI with local custom models for optimal performance and privacy

Qwen2-VL-2B-Instruct Model Features

👁️ Vision-Language Understanding: Specialized model capable of processing both text and visual content from PDFs
📄 Document Analysis: Optimized for document understanding with 2B parameters for efficient local inference
🖼️ Image Comprehension: Advanced visual reasoning capabilities for charts, diagrams, and images within PDFs
💡 Instruction Following: Fine-tuned for following complex instructions and providing detailed responses
⚡ Lightweight Architecture: 2B parameter model optimized for local deployment with minimal resource requirements
🔒 Privacy-First: Runs entirely offline, ensuring document confidentiality and data security

Custom Model Features

📚 Vision-Language Processing: Qwen2-VL-2B-Instruct model trained for comprehensive document understanding
🏠 Local Deployment: Runs entirely on-premise for maximum privacy and data security
⚡ Optimized Inference: GPU-accelerated processing with model quantization for fast responses
🔒 Privacy-First: All document processing happens locally, ensuring confidentiality
🎯 Multimodal Understanding: Enhanced capability for processing text, images, charts, and diagrams in PDFs
📊 Efficient Architecture: 2B parameter model provides excellent performance with minimal resource usage

AI-Powered Features

💬 Intelligent Conversations: Natural language interface for document queries and analysis
📊 Smart Summarization: Automatic generation of key insights and executive summaries
🔍 Semantic Search: Advanced content discovery using vector embeddings and similarity matching
🖼️ Multimodal Analysis: Process both text and images within PDFs using OCR and vision models
🎨 Content Generation: Create structured notes, outlines, and reports from PDF content
🔮 Predictive Analysis: AI suggests relevant questions and topics based on document context
📈 Performance Optimization: Continuous model improvement through feedback loops and usage analytics

Technical Implementation

Qwen2-VL-2B-Instruct: Local vision-language model for document understanding and analysis
LM Studio SDK: Local model management and inference optimization
Redis Vector Store: Efficient storage and retrieval of document embeddings
Custom Training Pipeline: Automated model fine-tuning and deployment workflow
API Gateway: Seamless integration between multiple AI models and services

🛠️ Tech Stack

Frontend

Next.js 15 - React framework with App Router
TypeScript - Type-safe JavaScript
Tailwind CSS - Utility-first CSS framework
Lucide React - Beautiful icons
Zustand - State management
React Hot Toast - Notifications

Backend

Express.js - Web framework for Node.js
MongoDB - NoSQL database with Mongoose ODM
Google Generative AI - AI integration
Redis - Caching and session management
JWT - Authentication
Multer - File upload handling
PDF-Parse - PDF text extraction
Tesseract.js - OCR for image text extraction

🚀 Getting Started

Prerequisites

Node.js 18+
MongoDB database
Redis server
Google Generative AI API key

Installation

Clone the repository

git clone https://github.com/govindmehta/pdfHelper.git
cd pdfHelper

Backend Setup

cd backend
npm install

Create a .env file in the backend directory:

PORT=5000
MONGODB_URI=mongodb://localhost:27017/pdfhelper
JWT_SECRET=your-jwt-secret-key
GOOGLE_API_KEY=your-google-generative-ai-key
REDIS_URL=redis://localhost:6379

Frontend Setup

cd ../frontend
npm install

Create a .env.local file in the frontend directory:

NEXT_PUBLIC_API_URL=http://localhost:5000

Running the Application

Start the backend server
```
cd backend
npm run dev
```
Start the frontend development server
```
cd frontend
npm run dev
```
Access the application
- Frontend: http://localhost:3000
- Backend API: http://localhost:5000
- API Documentation: http://localhost:5000/api-docs

📁 Project Structure

pdfhelper/
├── backend/
│   ├── config/           # Configuration files
│   ├── controllers/      # Route controllers
│   ├── middlewares/      # Custom middleware
│   ├── models/          # Database models
│   ├── routes/          # API routes
│   ├── services/        # Business logic
│   ├── utils/           # Utility functions
│   ├── uploads/         # File uploads
│   └── server.js        # Main server file
├── frontend/
│   ├── src/
│   │   ├── app/         # Next.js app router
│   │   ├── components/  # React components
│   │   └── lib/         # Utility libraries
│   ├── public/          # Static assets
│   └── package.json
└── README.md

🔧 API Endpoints

Authentication

POST /api/users/register - Register new user
POST /api/users/login - User login
GET /api/users/profile - Get user profile

PDF Management

POST /api/pdfs/upload - Upload PDF file
GET /api/pdfs - Get user's PDFs
GET /api/pdfs/:id - Get specific PDF
DELETE /api/pdfs/:id - Delete PDF

AI Chat

POST /api/ai/chat - Chat with PDF content
GET /api/ai/conversation/:pdfId - Get conversation history

Notes

POST /api/notes - Create note
GET /api/notes - Get user's notes
PUT /api/notes/:id - Update note
DELETE /api/notes/:id - Delete note

🎨 UI Components

Landing Page: Modern hero section with gradient animations
Dashboard: Glass morphism design with PDF management
Chat Interface: Real-time AI conversation with structured responses
Authentication: Clean login/register forms
AI Response: Structured content parsing with icons and formatting

🔒 Security Features

JWT-based authentication
Input validation and sanitization
File upload restrictions
CORS configuration
Rate limiting (recommended for production)

📱 Responsive Design

Mobile-first approach
Breakpoint-specific layouts
Touch-friendly interactions
Optimized performance

🚀 Deployment

Backend Deployment

Set up MongoDB Atlas or your preferred database
Configure Redis instance
Set environment variables
Deploy to your preferred platform (Heroku, AWS, etc.)

Frontend Deployment

Build the application: npm run build
Deploy to Vercel, Netlify, or your preferred platform
Configure environment variables

🤝 Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add some amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

📄 License

This project is licensed under the ISC License.

🙏 Acknowledgments

Google Generative AI for powerful AI capabilities
The open-source community for amazing tools and libraries
Contributors who help improve this project

📞 Support

For support, please create an issue in the GitHub repository or contact the maintainers.

Made with ❤️ by Govind Mehta

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PDF Helper AI 📄🤖

✨ Features

🤖 AI & Machine Learning

Generative AI Integration

Qwen2-VL-2B-Instruct Model Features

Custom Model Features

AI-Powered Features

Technical Implementation

🛠️ Tech Stack

Frontend

Backend

🚀 Getting Started

Prerequisites

Installation

Running the Application

📁 Project Structure

🔧 API Endpoints

Authentication

PDF Management

AI Chat

Notes

🎨 UI Components

🔒 Security Features

📱 Responsive Design

🚀 Deployment

Backend Deployment

Frontend Deployment

🤝 Contributing

📄 License

🙏 Acknowledgments

📞 Support

FilesExpand file tree

README.md

Latest commit

History

README.md

File metadata and controls

PDF Helper AI 📄🤖

✨ Features

🤖 AI & Machine Learning

Generative AI Integration

Qwen2-VL-2B-Instruct Model Features

Custom Model Features

AI-Powered Features

Technical Implementation

🛠️ Tech Stack

Frontend

Backend

🚀 Getting Started

Prerequisites

Installation

Running the Application

📁 Project Structure

🔧 API Endpoints

Authentication

PDF Management

AI Chat

Notes

🎨 UI Components

🔒 Security Features

📱 Responsive Design

🚀 Deployment

Backend Deployment

Frontend Deployment

🤝 Contributing

📄 License

🙏 Acknowledgments

📞 Support