This project implements a hallucination mitigation research framework using Google's Agent Development Kit (ADK). The primary goal is to explore and compare different agent architectures—specifically RAG (Retrieval-Augmented Generation), CoT (Chain-of-Thought), and NLI (Natural Language Inference) filtered agents—in their ability to reduce hallucinations in AI-powered chatbots. The system provides a web-based interface for interacting with these agents and visualizing their reasoning processes, resources used, and final answers, thereby offering insights into their hallucination mitigation strategies.
The application is deployed on Google Cloud Run and is currently accessible via the following link:
https://dietitian-api-411547369.us-central1.run.app/dev-ui?app=RAG_agent
The system is designed around a modular architecture, leveraging Google's ADK to facilitate the development and deployment of various agent types. At a high level, the architecture consists of:
- Agent Development Kit (ADK) Core: Provides the foundational framework for building, testing, and deploying AI agents.
- Multiple Agent Implementations: Separate modules for RAG, CoT, and NLI-filtered agents, each designed to address hallucination through distinct mechanisms.
- Web User Interface (UI): The ADK-served web interface that allows users to select an agent, input queries, and view the agent's responses, including detailed state information (reasoning, resources, final answer).
- Data Management: Handles the ingestion and processing of knowledge bases (e.g., `nutrition_handbook.pdf` for the RAG agent) and manages test prompts and results.
- Evaluation and Monitoring: Tools and frameworks (as indicated by the `batch_test.py` script and `results` directories) for evaluating agent performance and tracking metrics related to hallucination.
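The evaluation loop is not shown in this README, but its core idea can be sketched in plain Python. All names below (`run_agent`, `is_supported`, the prompt) are hypothetical stand-ins for illustration, not the project's actual API:

```python
# Hypothetical sketch of a batch hallucination-evaluation loop.
# `run_agent` stands in for whatever callable queries a deployed agent.

def run_agent(prompt: str) -> dict:
    # Stub: a real implementation would call the ADK agent and return
    # its state (reasoning, resources, final_answer).
    return {"final_answer": "Spinach is rich in iron.",
            "resources": ["Leafy greens such as spinach are rich in iron."]}

def is_supported(answer: str, resources: list[str]) -> bool:
    # Naive support check: every content word of the answer must appear
    # somewhere in the retrieved resources. A real check would use NLI.
    source = " ".join(resources).lower()
    words = [w.strip(".,").lower() for w in answer.split()]
    return all(w in source for w in words if len(w) > 3)

def batch_test(prompts: list[str]) -> float:
    # Fraction of answers judged grounded in their retrieved resources.
    supported = 0
    for prompt in prompts:
        state = run_agent(prompt)
        if is_supported(state["final_answer"], state["resources"]):
            supported += 1
    return supported / len(prompts)

print(batch_test(["Which foods are high in iron?"]))  # → 1.0
```

A real harness would load prompts from a file and write per-prompt verdicts into the `results` directory, but the grounded-fraction metric above captures the basic shape.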
This high-level design emphasizes flexibility and extensibility, allowing for easy integration of new agent types and evaluation methodologies.
Each agent (RAG, CoT, NLI-filtered) is implemented as a distinct ADK agent, typically residing in its own directory (e.g., `RAG_agent`, `COT_agent`, `nli_filtered_agent`).
- RAG Agent: This agent likely uses the `rag` subdirectory, which contains `embed_pdf.py` for embedding document content (e.g., `nutrition_handbook.pdf`) and `retriever.py` for retrieving relevant information based on user queries. The `agent.py` file orchestrates the retrieval and generation process.
- CoT Agent: The `COT_agent` directory contains `agent.py` and `batch_test.py`. The CoT agent generates intermediate reasoning steps to arrive at a final answer, which can be inspected in the UI's state section.
- NLI-Filtered Agent: The `nli_filtered_agent` directory includes `nli_verifier.py` and `rag_wrapper.py`, suggesting that it uses Natural Language Inference to verify the factual consistency of generated responses, potentially filtering out hallucinated content. It also appears to leverage a knowledge base (`full_nutrition_knowledge_base.txt`) and a retriever.
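The NLI filtering idea can be illustrated with a simplified sketch. The threshold, the overlap-based scorer, and the function names below are illustrative assumptions; the real `nli_verifier.py` presumably runs a trained NLI model:

```python
# Simplified sketch of NLI-based response filtering. The scorer below is a
# stand-in: a real verifier would use a trained NLI model that returns an
# entailment probability for each (premise, hypothesis) pair.

ENTAILMENT_THRESHOLD = 0.5  # assumed cutoff, not the project's actual value

def entailment_score(premise: str, hypothesis: str) -> float:
    # Stub scorer: fraction of hypothesis tokens present in the premise.
    p, h = set(premise.lower().split()), set(hypothesis.lower().split())
    return len(p & h) / len(h) if h else 0.0

def filter_response(sentences: list[str], evidence: str) -> list[str]:
    # Keep only sentences the evidence entails above the threshold;
    # everything else is treated as potentially hallucinated.
    return [s for s in sentences
            if entailment_score(evidence, s) >= ENTAILMENT_THRESHOLD]

evidence = "vitamin c is found in citrus fruits"
draft = ["vitamin c is found in citrus fruits",
         "bananas cure all known diseases"]
print(filter_response(draft, evidence))  # → ['vitamin c is found in citrus fruits']
```

The unsupported second sentence scores zero overlap with the evidence and is dropped, which is the behavior an NLI filter aims for with a much stronger entailment model.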
- User Query: A user submits a query through the web UI.
- Agent Selection: The UI routes the query to the selected ADK agent (RAG, CoT, or NLI-filtered).
- Agent Processing: The selected agent processes the query. This may involve:
- Retrieval: For RAG and NLI-filtered agents, relevant information is retrieved from their respective knowledge bases.
- Reasoning: CoT agents generate a chain of thought.
- Verification: NLI-filtered agents perform factual verification.
- Response Generation: The agent generates a response based on its processing.
- State Information: During processing, the agent populates a `state` object with details like `reasoning`, `resources`, and `final_answer`. This state is then displayed in the UI.
- UI Display: The UI renders the agent's response and the detailed state information, allowing users to understand the agent's internal workings and hallucination mitigation strategies.
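The flow above can be sketched as a single function that assembles such a state object. The field names follow the README; the retrieval and generation logic here are stubs, not the project's implementation:

```python
# Sketch of the state object described above. Retrieval and generation
# are stubs; only the shape of the state dict follows the README.

def answer_with_state(query: str, knowledge_base: list[str]) -> dict:
    # Retrieval: pick passages sharing any word with the query (stub).
    q_words = set(query.lower().split())
    resources = [p for p in knowledge_base
                 if q_words & set(p.lower().split())]
    # Reasoning: a human-readable trace of what the agent did.
    reasoning = f"Retrieved {len(resources)} passage(s) matching the query."
    # Generation: here we simply quote the top passage (stub).
    final_answer = resources[0] if resources else "No supporting evidence found."
    return {"reasoning": reasoning,
            "resources": resources,
            "final_answer": final_answer}

state = answer_with_state("iron rich foods", ["spinach is a food rich in iron"])
print(state["final_answer"])  # → spinach is a food rich in iron
```

Keeping `reasoning`, `resources`, and `final_answer` in one dict is what lets the UI's State panel show each stage of processing side by side.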
The `Dockerfile` at the root of the `TestSuite` directory indicates that the entire application can be containerized, ensuring consistent environments for development, testing, and deployment. The `requirements.txt` file lists all necessary Python dependencies.
To run this project locally, you will need to have Google's Agent Development Kit (ADK) installed, along with Python and Docker.
- Python 3.9+: Ensure you have a compatible Python version installed.
- Google ADK: Install the ADK by following the official documentation. You can typically install it via pip:

  ```bash
  pip install google-adk
  ```
- Docker: For building and running the application in a containerized environment.
1. Clone the repository (or extract the provided `TestSuite.zip`): If you received a `.zip` file, extract it to your desired working directory. If it's a Git repository, clone it:

   ```bash
   git clone https://github.com/jskoerner/TestSuite.git
   cd TestSuite
   ```

   Note: The provided `TestSuite.zip` contains a nested `TestSuite` directory. Navigate into the inner `TestSuite` directory after extraction. For example, if you extracted to `/home/ubuntu/TestSuite`, then `cd /home/ubuntu/TestSuite/TestSuite`.

2. Install Python dependencies: Navigate to the root of the `TestSuite` project (where `requirements.txt` is located) and install the required packages:

   ```bash
   pip install -r requirements.txt
   ```

3. Run the ADK web interface: From the root of the `TestSuite` project, execute the following command:

   ```bash
   adk web
   ```

   This will typically launch the web interface in your browser at `http://localhost:8000` (or a similar port).
To containerize the application using Docker, follow these steps from the root of the `TestSuite` project (where the `Dockerfile` is located):

1. Build the Docker image:

   ```bash
   docker build -t hallucination-mitigation-adk .
   ```

2. Run the Docker container:

   ```bash
   docker run -p 8000:8000 hallucination-mitigation-adk
   ```

   This maps port 8000 in the container to port 8000 on your local machine, allowing you to access the ADK web interface at `http://localhost:8000`.
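For reference, a `Dockerfile` for this kind of ADK project often takes roughly the following shape. This is a hedged sketch, not the repository's actual `Dockerfile`; the base image and CLI flags are assumptions:

```dockerfile
# Illustrative Dockerfile shape for an ADK web app (not the project's
# actual Dockerfile; base image and flags are assumptions).
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
# Bind to 0.0.0.0 so the UI is reachable from outside the container.
CMD ["adk", "web", "--host", "0.0.0.0", "--port", "8000"]
```

Installing dependencies before copying the source keeps the pip layer cached across code-only rebuilds.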
Once the ADK web interface is running (either locally via `adk web` or through Docker), you can interact with the different agents:
1. Access the Web Interface: Open your web browser and navigate to `http://localhost:8000` (or the URL provided by `adk web`).

2. Select an ADK Agent: On the left-hand side of the interface, you will find a dropdown menu from which you can select the desired ADK agent. The available agents are typically:

   - `RAG_agent` (Retrieval-Augmented Generation)
   - `CoT_agent` (Chain-of-Thought)
   - `nli_filtered_agent` (Natural Language Inference filtered)

3. Interact with the Chatbot: Once an agent is selected, type your queries into the chat input field at the bottom of the screen and press Enter.

4. Analyze the State Section: The right-hand side of the interface features a "State" section. This section is crucial for understanding the agent's internal workings and hallucination mitigation process. It provides:

   - Reasoning: The steps or logic the agent followed to arrive at its answer.
   - Resources: The external information or knowledge base entries the agent utilized.
   - Final Answer: The agent's ultimate response to your query.
This detailed state information allows for a transparent analysis of how each agent addresses potential hallucinations and provides its answer.
Contributions are welcome! If you have suggestions for improving this project, please feel free to:
1. Fork the repository.
2. Create a new branch (`git checkout -b feature/YourFeature`).
3. Make your changes.
4. Commit your changes (`git commit -m 'Add some feature'`).
5. Push to the branch (`git push origin feature/YourFeature`).
6. Open a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.