A modern, AI-powered PDF management and interaction platform built with Next.js and Express.js. Upload, analyze, and chat with your PDF documents using advanced AI technology.
- 🚀 AI-Powered PDF Chat: Interact with your PDF documents using advanced AI
- 🧠 Qwen2-VL-2B-Instruct Model: Local vision-language model for advanced PDF understanding
- 🤖 Multi-Model AI Integration: Google Generative AI + Qwen2-VL for optimal performance
- 📱 Modern UI/UX: Beautiful, responsive design with glass morphism effects
- 🔍 Smart Analysis: Extract insights and analyze PDF content with AI
- 📝 Note Management: Create and manage notes from your PDFs
- 🖼️ Image Extraction: Extract and analyze images from PDF documents
- 💬 Conversational AI: Natural language processing for interactive discussions
- 🎯 Context-Aware Responses: AI maintains conversation history for meaningful interactions
- 📊 Automated Summarization: Generate intelligent summaries of PDF content
- 🔐 User Authentication: Secure user management with JWT
- 📊 Dashboard: Centralized management of all your PDFs and notes
- 🌐 RESTful API: Well-documented API with Swagger/OpenAPI
Our PDF Helper AI leverages cutting-edge generative AI technologies to provide intelligent document interaction:
- 🧠 Qwen2-VL-2B-Instruct: Advanced vision-language model deployed locally for superior PDF understanding and multimodal analysis
- 🌟 Google Generative AI (Gemini): Integrated for advanced reasoning, content generation, and multi-modal understanding
- 🔄 Hybrid AI Architecture: Combines the power of cloud-based GenAI with local custom models for optimal performance and privacy
- 👁️ Vision-Language Understanding: Specialized model capable of processing both text and visual content from PDFs
- 📄 Document Analysis: Optimized for document understanding with 2B parameters for efficient local inference
- 🖼️ Image Comprehension: Advanced visual reasoning capabilities for charts, diagrams, and images within PDFs
- 💡 Instruction Following: Fine-tuned for following complex instructions and providing detailed responses
- ⚡ Lightweight Architecture: 2B parameter model optimized for local deployment with minimal resource requirements
- 🔒 Privacy-First: Runs entirely offline, ensuring document confidentiality and data security
- 📚 Vision-Language Processing: Qwen2-VL-2B-Instruct model trained for comprehensive document understanding
- 🏠 Local Deployment: Runs entirely on-premise for maximum privacy and data security
- ⚡ Optimized Inference: GPU-accelerated processing with model quantization for fast responses
- 🔒 Privacy-First: Documents handled by the local model are processed on your own machine, which keeps their contents confidential
- 🎯 Multimodal Understanding: Enhanced capability for processing text, images, charts, and diagrams in PDFs
- 📊 Efficient Architecture: 2B parameter model provides excellent performance with minimal resource usage
- 💬 Intelligent Conversations: Natural language interface for document queries and analysis
- 📊 Smart Summarization: Automatic generation of key insights and executive summaries
- 🔍 Semantic Search: Advanced content discovery using vector embeddings and similarity matching
- 🖼️ Multimodal Analysis: Process both text and images within PDFs using OCR and vision models
- 🎨 Content Generation: Create structured notes, outlines, and reports from PDF content
- 🔮 Predictive Analysis: AI suggests relevant questions and topics based on document context
- 📈 Performance Optimization: Continuous model improvement through feedback loops and usage analytics
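The semantic search item above mentions vector embeddings and similarity matching. As a rough illustration of the idea, the sketch below ranks stored text chunks by cosine similarity to a query embedding. The `EmbeddedChunk` shape and the `topK` helper are hypothetical names chosen for this example; the project's actual storage schema and Redis queries may look quite different.

```typescript
// Hypothetical shape for a stored chunk; the project's real schema may differ.
interface EmbeddedChunk {
  text: string;
  embedding: number[];
}

// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions must match");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks whose embeddings are most similar to the query.
function topK(query: number[], chunks: EmbeddedChunk[], k: number): EmbeddedChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}
```

A linear scan like this is only practical for small collections; a vector store would normally perform the nearest-neighbour lookup itself.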
- Qwen2-VL-2B-Instruct: Local vision-language model for document understanding and analysis
- LM Studio SDK: Local model management and inference optimization
- Redis Vector Store: Efficient storage and retrieval of document embeddings
- Custom Training Pipeline: Automated model fine-tuning and deployment workflow
- API Gateway: Seamless integration between multiple AI models and services
- Next.js 15 - React framework with App Router
- TypeScript - Type-safe JavaScript
- Tailwind CSS - Utility-first CSS framework
- Lucide React - Beautiful icons
- Zustand - State management
- React Hot Toast - Notifications
- Express.js - Web framework for Node.js
- MongoDB - NoSQL database with Mongoose ODM
- Google Generative AI - AI integration
- Redis - Caching and session management
- JWT - Authentication
- Multer - File upload handling
- PDF-Parse - PDF text extraction
- Tesseract.js - OCR for image text extraction
- Node.js 18+
- MongoDB database
- Redis server
- Google Generative AI API key
- Clone the repository
    git clone https://github.com/govindmehta/pdfHelper.git
    cd pdfHelper
- Backend Setup
    cd backend
    npm install
  Create a .env file in the backend directory:
    PORT=5000
    MONGODB_URI=mongodb://localhost:27017/pdfhelper
    JWT_SECRET=your-jwt-secret-key
    GOOGLE_API_KEY=your-google-generative-ai-key
    REDIS_URL=redis://localhost:6379
- Frontend Setup
    cd ../frontend
    npm install
  Create a .env.local file in the frontend directory:
    NEXT_PUBLIC_API_URL=http://localhost:5000
- Start the backend server
    cd backend
    npm run dev
- Start the frontend development server
    cd frontend
    npm run dev
- Access the application
  - Frontend: http://localhost:3000
  - Backend API: http://localhost:5000
  - API Documentation: http://localhost:5000/api-docs
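A missing environment variable is an easy setup mistake to make, so it can help to check the backend settings once at startup. The sketch below is one possible way to do that. The variable names are the ones listed in the backend .env example; the `BackendConfig` type and `loadConfig` function are hypothetical and are not taken from the project's code.

```typescript
// Hypothetical config shape; only the variable names come from the .env example.
interface BackendConfig {
  port: number;
  mongodbUri: string;
  jwtSecret: string;
  googleApiKey: string;
  redisUrl: string;
}

// Variables without a sensible default. PORT falls back to 5000 below.
const REQUIRED_VARS = ["MONGODB_URI", "JWT_SECRET", "GOOGLE_API_KEY", "REDIS_URL"] as const;

function loadConfig(env: Record<string, string | undefined>): BackendConfig {
  // Collect every missing name so the error reports them all at once.
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }

  const port = Number(env.PORT ?? "5000");
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error(`PORT must be a positive integer, got "${env.PORT}"`);
  }

  return {
    port,
    mongodbUri: env.MONGODB_URI as string,
    jwtSecret: env.JWT_SECRET as string,
    googleApiKey: env.GOOGLE_API_KEY as string,
    redisUrl: env.REDIS_URL as string,
  };
}
```

In a Node.js process this could be called as `loadConfig(process.env)` before the server starts listening, so that a misconfigured environment fails immediately with a readable message.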
pdfhelper/
├── backend/
│ ├── config/ # Configuration files
│ ├── controllers/ # Route controllers
│ ├── middlewares/ # Custom middleware
│ ├── models/ # Database models
│ ├── routes/ # API routes
│ ├── services/ # Business logic
│ ├── utils/ # Utility functions
│ ├── uploads/ # File uploads
│ └── server.js # Main server file
├── frontend/
│ ├── src/
│ │ ├── app/ # Next.js app router
│ │ ├── components/ # React components
│ │ └── lib/ # Utility libraries
│ ├── public/ # Static assets
│ └── package.json
└── README.md
- POST /api/users/register - Register new user
- POST /api/users/login - User login
- GET /api/users/profile - Get user profile
- POST /api/pdfs/upload - Upload PDF file
- GET /api/pdfs - Get user's PDFs
- GET /api/pdfs/:id - Get specific PDF
- DELETE /api/pdfs/:id - Delete PDF
- POST /api/ai/chat - Chat with PDF content
- GET /api/ai/conversation/:pdfId - Get conversation history
- POST /api/notes - Create note
- GET /api/notes - Get user's notes
- PUT /api/notes/:id - Update note
- DELETE /api/notes/:id - Delete note
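To show how a client might call the AI endpoints listed above, here is a small sketch that builds the request descriptions without sending them. The paths come from the endpoint list; the JSON payload fields (`pdfId`, `message`) and the `Authorization: Bearer` header are assumptions made for illustration, so check the Swagger documentation at /api-docs for the real contract.

```typescript
// Plain description of an HTTP request, suitable for passing on to fetch().
interface ApiRequest {
  url: string;
  method: "GET" | "POST";
  headers: Record<string, string>;
  body?: string;
}

// Remove trailing slashes so the base URL and path join cleanly.
function joinUrl(baseUrl: string, path: string): string {
  return `${baseUrl.replace(/\/+$/, "")}${path}`;
}

// POST /api/ai/chat. The payload shape is assumed, not documented in this README.
function buildChatRequest(baseUrl: string, token: string, pdfId: string, message: string): ApiRequest {
  return {
    url: joinUrl(baseUrl, "/api/ai/chat"),
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`, // assumed scheme for sending the JWT
    },
    body: JSON.stringify({ pdfId, message }),
  };
}

// GET /api/ai/conversation/:pdfId
function buildHistoryRequest(baseUrl: string, token: string, pdfId: string): ApiRequest {
  return {
    url: joinUrl(baseUrl, `/api/ai/conversation/${encodeURIComponent(pdfId)}`),
    method: "GET",
    headers: { Authorization: `Bearer ${token}` },
  };
}
```

A caller could then send a request with `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. Keeping request construction separate from the network call makes the URL and header logic easy to test.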
- Landing Page: Modern hero section with gradient animations
- Dashboard: Glass morphism design with PDF management
- Chat Interface: Real-time AI conversation with structured responses
- Authentication: Clean login/register forms
- AI Response: Structured content parsing with icons and formatting
- JWT-based authentication
- Input validation and sanitization
- File upload restrictions
- CORS configuration
- Rate limiting (recommended for production)
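As an illustration of what "file upload restrictions" can mean in practice, the sketch below checks an uploaded file's declared MIME type, its size, and its leading bytes. The 10 MB limit, the `UploadCandidate` shape, and the `validatePdfUpload` name are all assumptions for this example; the project's actual Multer configuration is not shown in this README.

```typescript
// Assumed limit for illustration; the project's real limit is not stated here.
const MAX_UPLOAD_BYTES = 10 * 1024 * 1024;

// Hypothetical view of an uploaded file held in memory.
interface UploadCandidate {
  originalName: string;
  mimeType: string;
  bytes: Uint8Array;
}

// PDF files start with the ASCII signature "%PDF-".
const PDF_SIGNATURE = [0x25, 0x50, 0x44, 0x46, 0x2d];

// Returns null when the upload looks acceptable, otherwise a reason string.
function validatePdfUpload(file: UploadCandidate): string | null {
  if (file.mimeType !== "application/pdf") {
    return "Only application/pdf uploads are accepted";
  }
  if (file.bytes.length > MAX_UPLOAD_BYTES) {
    return "File exceeds the upload size limit";
  }
  // The declared MIME type is client-controlled, so also inspect the content.
  const hasSignature = PDF_SIGNATURE.every((byte, i) => file.bytes[i] === byte);
  if (!hasSignature) {
    return "File content does not look like a PDF";
  }
  return null;
}
```

Checking the signature as well as the MIME type catches files that were merely renamed or mislabelled, although it does not prove that the rest of the file is a well-formed PDF.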
- Mobile-first approach
- Breakpoint-specific layouts
- Touch-friendly interactions
- Optimized performance
- Set up MongoDB Atlas or your preferred database
- Configure Redis instance
- Set environment variables
- Deploy to your preferred platform (Heroku, AWS, etc.)
- Build the application: npm run build
- Deploy to Vercel, Netlify, or your preferred platform
- Configure environment variables
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the ISC License.
- Google Generative AI for powerful AI capabilities
- The open-source community for amazing tools and libraries
- Contributors who help improve this project
For support, please create an issue in the GitHub repository or contact the maintainers.
Made with ❤️ by Govind Mehta