🚧 Work in Progress (Draft): This project is currently in active development. Features and APIs are subject to change.
RAG Refinement is a full-stack proof-of-concept application designed to enhance the quality of Retrieval-Augmented Generation (RAG) systems. Instead of relying solely on switching backend LLMs to improve answer quality—which can be computationally expensive and operationally complex—this project introduces an ML-driven Prompt Refinement layer.
By analyzing prompt characteristics (clarity, specificity, relevance) and applying regression models (Linear, Polynomial, Ensemble, Neural), the system iteratively optimizes user queries before they are sent to the LLM, resulting in higher-quality generated responses.
In industrial RAG applications, improving query answer quality often requires upgrading to larger, slower, or more expensive models. This project aims to solve this by:
- Developing a lightweight machine learning model to score and refine prompts.
- Reducing the dependency on "smart" models by making the input "smarter".
- Providing a feedback loop for continuous prompt improvement.
- 📄 PDF Document RAG: Upload and chat with PDF documents using local embeddings.
- 🤖 ML-Powered Refinement: Automatically rewrite vague prompts using Ensemble, Linear, or Neural regression methods.
- 📊 Refinement Metrics: Real-time feedback on prompt improvement (e.g., "+15% Clarity").
- 💬 Interactive Chat UI: Modern, responsive interface built with React and Tailwind CSS.
- 🔌 Local LLM Integration: Fully privacy-focused using Ollama (Mistral 7B & Nomic Embed Text).
- Core: Java 17, Spring Boot 3.2.4
- AI/ML: DeepLearning4j (DL4J), LangChain4j
- Database: H2 (In-Memory)
- Build Tool: Maven
- Framework: React 19
- Build Tool: Vite 7
- Styling: Tailwind CSS 4, Lucide React (Icons)
- LLM Serving: Ollama (running locally)
- Java Development Kit (JDK) 17 or higher.
- Node.js (v18+ recommended) and npm.
- Ollama: Installed and running (Download here).
- Pull required models:

  ```sh
  ollama pull mistral:7b
  ollama pull nomic-embed-text   # embedding model used for local embeddings
  ollama run mistral:7b
  ```
- Clone the repository:

  ```sh
  git clone https://github.com/mahedjaved/ReactRAGpdf.git
  cd rag-refinement
  ```

Navigate to the project root and run the Spring Boot application:
```sh
# Windows
mvn spring-boot:run

# Linux/Mac
./mvnw spring-boot:run
```

The backend will start on http://localhost:8080.
To test the backend separately with an example prompt refinement, run the following curl command:

```sh
curl -X POST http://localhost:8080/api/rag/refine \
  -H "Content-Type: application/json" \
  -d '{"prompt": "<enter your prompt>"}'
```

Open a new terminal, navigate to the frontend directory, and start the development server:
```sh
cd src/main/rag-frontend
npm install
npm run dev
```

The frontend will start on http://localhost:5173.
The system follows a Refine-Then-Retrieve pattern:
- Input: User enters a raw prompt (e.g., "code for login").
- Feature Extraction: The system analyzes the prompt for features like `avg_token_length`, `specificity_score`, and `ambiguity`.
- Regression: An ML model (Ensemble/Linear/Neural) predicts a `quality_score`.
- Optimization: If the score is below a threshold, the prompt is rewritten to maximize these feature weights.
- Retrieval: The optimized prompt is used to query the vector database (PDF chunks).
- Generation: The LLM generates a final answer based on the refined context.
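The refine-then-retrieve scoring step above can be sketched as a small, self-contained example. The feature heuristics, weights, and threshold below are illustrative assumptions, not the project's actual values (the real pipeline lives in the Spring backend and uses trained regression models):

```java
import java.util.Arrays;

// Minimal sketch of the Refine-Then-Retrieve scoring step.
// Feature proxies and weights are illustrative, not the project's actual values.
public class PromptScorer {

    // Extract simple features: avg_token_length, specificity_score, ambiguity.
    static double[] extractFeatures(String prompt) {
        String[] tokens = prompt.trim().split("\\s+");
        double avgTokenLength = Arrays.stream(tokens)
                .mapToInt(String::length).average().orElse(0);
        // Toy specificity proxy: fraction of tokens longer than 4 characters.
        double specificity = Arrays.stream(tokens)
                .filter(t -> t.length() > 4).count() / (double) tokens.length;
        // Toy ambiguity proxy: very short prompts are treated as more ambiguous.
        double ambiguity = tokens.length < 5 ? 1.0 : 5.0 / tokens.length;
        return new double[] {avgTokenLength, specificity, ambiguity};
    }

    // A linear model over the features predicts a quality_score squashed to (0, 1).
    static double qualityScore(double[] f, double[] weights, double bias) {
        double z = bias;
        for (int i = 0; i < f.length; i++) z += weights[i] * f[i];
        return 1.0 / (1.0 + Math.exp(-z));
    }

    public static void main(String[] args) {
        double[] weights = {0.1, 1.5, -2.0}; // illustrative weights
        double[] f = extractFeatures("code for login");
        double score = qualityScore(f, weights, 0.0);
        // Below the threshold, the prompt would be rewritten (e.g. expanded
        // with context) before querying the vector database.
        System.out.println(score < 0.5 ? "refine" : "retrieve");
        // → refine
    }
}
```

A vague prompt like "code for login" scores low on specificity and high on ambiguity, so it gets routed through the rewrite step before retrieval.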
Note on Dependencies:
To keep the build lightweight, the core deeplearning4j dependency is limited to the nn module, with slf4j-api excluded to prevent classpath conflicts.
```xml
<dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>deeplearning4j-nn</artifactId>
    <version>${dl4j.version}</version>
    <exclusions>
        <exclusion>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-api</artifactId>
        </exclusion>
    </exclusions>
</dependency>
```

This project is being improved step-by-step. Below is the current record of improvements:
- Prompt Evaluation Criteria: Quality, Accuracy, Clarity, Relevance, Specificity, Completeness.
- Refinement Strategy: Storing refinement features in a dedicated `refinement_features` table for historical analysis.
- ✅ Build Advanced Regression Pipeline
- ✅ 4 Regression Methods: Linear, Polynomial, Neural Networks, Ensemble
- ✅ Gradient Descent Learning: Dynamic weight optimization
- ✅ Iterative Refinement: Feedback-driven improvement loop
- ✅ Comprehensive Metrics: MSE, RMSE, MAE, R²
- ✅ Basic RAG Implementation (Java-Spring-React)
- ✅ Frontend Refinement UI (Integration complete)
- ☐ Reinforcement Learning: Future work to improve prompt refinement using RL from human feedback.
- ☐ Monitoring of prediction confidence for each prompt to detect overconfident models.
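The gradient-descent learning and metrics items above (MSE, RMSE, MAE, R²) can be illustrated with a toy one-dimensional linear regression. This is a plain-Java sketch for exposition only; the project's actual implementation uses DL4J:

```java
// Toy gradient-descent linear regression with the metrics listed above
// (MSE, RMSE, MAE, R^2). Illustrative only; the project uses DL4J.
public class RegressionSketch {
    double w = 0.0, b = 0.0; // model: y = w*x + b

    void fit(double[] x, double[] y, double lr, int epochs) {
        int n = x.length;
        for (int e = 0; e < epochs; e++) {
            double gw = 0, gb = 0;
            for (int i = 0; i < n; i++) {
                double err = (w * x[i] + b) - y[i];
                gw += 2 * err * x[i] / n; // d(MSE)/dw
                gb += 2 * err / n;        // d(MSE)/db
            }
            w -= lr * gw;
            b -= lr * gb;
        }
    }

    // Returns {MSE, RMSE, MAE, R^2} on the given data.
    double[] metrics(double[] x, double[] y) {
        int n = x.length;
        double mse = 0, mae = 0, mean = 0, ssTot = 0;
        for (double v : y) mean += v / n;
        for (int i = 0; i < n; i++) {
            double err = (w * x[i] + b) - y[i];
            mse += err * err / n;
            mae += Math.abs(err) / n;
            ssTot += (y[i] - mean) * (y[i] - mean);
        }
        double rmse = Math.sqrt(mse);
        double r2 = 1.0 - (mse * n) / ssTot;
        return new double[] {mse, rmse, mae, r2};
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4}, y = {2, 4, 6, 8}; // underlying relation: y = 2x
        RegressionSketch m = new RegressionSketch();
        m.fit(x, y, 0.05, 2000);
        double[] stats = m.metrics(x, y);
        System.out.printf("R2 = %.3f%n", stats[3]); // close to 1.0 on this data
    }
}
```

The same loop generalizes to the multi-feature prompt-quality case: the gradient of MSE drives the dynamic weight updates, and the four metrics report how well the regression tracks the target quality scores.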
Contributions are welcome! Since this is a draft project, please open an issue first to discuss what you would like to change.
- Fork the Project
- Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
- Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the Branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
Distributed under the MIT License. See LICENSE for more information.