Rename the `env.example` file to `.env` and provide the required keys.
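The exact keys depend on your setup; check `env.example` for the actual list. A `.env` for an LLM-backed app typically looks something like this (the key names below are placeholders, not the project's real variables):

```bash
# Hypothetical keys for illustration -- see env.example for the real ones
OPENAI_API_KEY=your-api-key-here
EMBEDDING_MODEL=text-embedding-3-small
```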
```bash
make install-all
```

This will install the requirements for all applications (ingestion, embedding, and the RAG app).
To run the ingestion process, use the following command:
```bash
make run-ingestion
```

This will scrape Wikipedia for the following pages:
- Artificial intelligence
- Machine learning
- Deep learning
- Natural language processing
- Computer vision
The raw files will be stored in the `data/raw` directory.
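The ingestion step can be sketched roughly like this (a minimal, stdlib-only example that pulls plain-text extracts via the public MediaWiki API; the actual Makefile target may use a different scraper):

```python
import json
import urllib.parse
import urllib.request
from pathlib import Path

# The five pages listed above
PAGES = [
    "Artificial intelligence",
    "Machine learning",
    "Deep learning",
    "Natural language processing",
    "Computer vision",
]

def slug(title: str) -> str:
    """Turn a page title into a safe file name."""
    return title.replace(" ", "_")

def fetch_plain_text(title: str) -> str:
    """Fetch the plain-text extract of a page via the MediaWiki API."""
    params = urllib.parse.urlencode({
        "action": "query", "prop": "extracts", "explaintext": 1,
        "format": "json", "titles": title,
    })
    url = f"https://en.wikipedia.org/w/api.php?{params}"
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    # The API keys pages by numeric ID; take the first (only) one.
    page = next(iter(data["query"]["pages"].values()))
    return page.get("extract", "")

def ingest(out_dir: str = "data/raw") -> None:
    """Download each page and write it to the raw data directory."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for title in PAGES:
        (Path(out_dir) / f"{slug(title)}.txt").write_text(fetch_plain_text(title))
```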
To run the embedding process, use the following command:
```bash
make run-embedding
```

This will clean the raw data and chunk it for embedding. The cleaned data is stored in the `data/processed` directory; the processed files are then read, and their embeddings are stored in a local vector database at `data/vectordb`.
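The clean-and-chunk step might look something like this (a simplified sketch; the chunk size and overlap here are illustrative, not the project's actual settings):

```python
import re

def clean(text: str) -> str:
    """Collapse runs of whitespace left over from scraping."""
    return re.sub(r"\s+", " ", text).strip()

def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows for embedding.

    Overlap keeps sentences that straddle a boundary present in
    both neighboring chunks, which helps retrieval quality.
    """
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```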
To run the RAG (Retrieval-Augmented Generation) process, use the following command:
```bash
make run-rag
```

This will start the FastAPI RAG application on http://localhost:8000.
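Under the hood, the `/rag` handler conceptually does something like this (a framework-agnostic sketch; the `embed`, `search`, and `llm` callables are hypothetical stand-ins for the real embedding model, vector database, and LLM client):

```python
from typing import Callable

def answer_query(
    query: str,
    max_results: int,
    embed: Callable,   # query text -> embedding vector
    search: Callable,  # (vector, k) -> top-k context chunks
    llm: Callable,     # (query, context) -> generated answer
) -> dict:
    """Retrieve the top-k context chunks for the query, then generate an answer."""
    context = search(embed(query), max_results)
    return {"answer": llm(query, context), "sources": context}
```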
When you ask the RAG app a question, it retrieves context from the local vector database and passes it to the LLM to generate a response. Below is a sample request to test the endpoint:
```bash
curl -X POST "http://localhost:8000/rag" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is deep learning?", "max_results": 5}'
```
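The same call can be made from Python with the standard library (a minimal client assuming the server above is running on localhost:8000):

```python
import json
import urllib.request

def build_payload(query: str, max_results: int = 5) -> bytes:
    """Serialize the request body in the shape the /rag endpoint expects."""
    return json.dumps({"query": query, "max_results": max_results}).encode()

def ask_rag(query: str, max_results: int = 5,
            url: str = "http://localhost:8000/rag") -> dict:
    """POST a question to the RAG endpoint and return the parsed JSON response."""
    req = urllib.request.Request(
        url,
        data=build_payload(query, max_results),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    print(ask_rag("What is deep learning?"))
```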