
This open-source Python application demonstrates an interactive way to apply NVIDIA NeMo Guardrails to an LLM. It features a custom implementation of the Chain-of-Verification (CoVe) technique and lets the user dynamically select and activate a suite of other built-in guardrails at runtime.
This project is intended for Security Engineers, LLM Security Researchers, and developers looking to build more secure and reliable AI applications.
- Interactive Guardrail Selection: Choose which guardrails to activate for each session from a command-line menu.
- Dynamic Configuration: The application builds the guardrails configuration on the fly based on your selections.
- Chain-of-Verification (CoVe): A custom action implements the full CoVe pipeline to reduce hallucinations and improve factual accuracy.
- Library of Guardrails: Includes pre-configured rails for:
- Jailbreak Detection
- Input/Output Content Moderation
- Topical Rails (to keep the conversation on specific topics)
- LLM Agnostic: Supports both local LLMs via Ollama and API-based LLMs like OpenAI.
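The CoVe pipeline mentioned above can be sketched, in spirit, as four LLM calls: draft, plan verification questions, answer them independently, and revise. Everything below (the function name, prompt wording, and the `fake_llm` stub) is illustrative of the technique, not this repository's actual implementation:

```python
# Chain-of-Verification (CoVe) sketch: the four canonical stages,
# written against a generic `llm` callable (prompt -> str).

def chain_of_verification(question: str, llm) -> str:
    # 1. Draft a baseline answer.
    baseline = llm(f"Answer concisely: {question}")

    # 2. Plan verification questions that probe the draft for errors.
    plan = llm(
        "List fact-checking questions, one per line, for this answer:\n"
        f"Q: {question}\nA: {baseline}"
    )
    checks = [q.strip() for q in plan.splitlines() if q.strip()]

    # 3. Answer each check independently (the draft is NOT in context,
    #    so the model cannot simply repeat its own mistake).
    evidence = [f"{q} -> {llm(q)}" for q in checks]

    # 4. Produce a final answer revised in light of the evidence.
    return llm(
        f"Question: {question}\nDraft: {baseline}\n"
        "Verification results:\n" + "\n".join(evidence) +
        "\nWrite a corrected final answer."
    )

# Stub LLM so the sketch runs without a model server.
def fake_llm(prompt: str) -> str:
    if prompt.startswith("List fact-checking"):
        return "Is the capital correct?"
    if prompt.startswith("Question:"):
        return "Paris is the capital of France."
    return "Paris"

if __name__ == "__main__":
    print(chain_of_verification("What is the capital of France?", fake_llm))
```

The key design point is step 3: answering the verification questions without the draft in context is what lets CoVe catch hallucinations rather than reinforce them.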
- When you run app.py, it presents a menu of available guardrails defined in the GUARDRAILS_LIBRARY dictionary.
- You select one or more guardrails to activate.
- The script loads a base config.yml file.
- It then iterates through your selections, dynamically merging the YAML configurations and appending the Colang flow definitions for each chosen guardrail.
- If a guardrail requires a custom Python action (like our CoVe implementation), the application imports the necessary module.
- Finally, it initializes LLMRails with the complete, dynamically generated configuration and launches the interactive chat.
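The merge-and-assemble steps above can be sketched as follows. The `GUARDRAILS_LIBRARY` entries, flow names, and `deep_merge`/`build_config` helpers here are invented for illustration; only the `RailsConfig`/`LLMRails` calls noted in the final comment correspond to the real NeMo Guardrails API:

```python
# Sketch of the dynamic assembly step: each selected guardrail
# contributes a YAML fragment (as a dict) and a Colang flow, which
# are merged into one configuration.

GUARDRAILS_LIBRARY = {
    "jailbreak": {
        "yaml": {"rails": {"input": {"flows": ["check jailbreak"]}}},
        "colang": "define flow check jailbreak\n  execute check_jailbreak",
    },
    "cove": {
        "yaml": {"rails": {"output": {"flows": ["self check facts"]}}},
        "colang": "define flow self check facts\n  execute self_check_facts",
    },
}

def deep_merge(base: dict, extra: dict) -> dict:
    """Recursively merge dicts; concatenate lists; otherwise overwrite."""
    out = dict(base)
    for key, val in extra.items():
        if isinstance(val, dict) and isinstance(out.get(key), dict):
            out[key] = deep_merge(out[key], val)
        elif isinstance(val, list) and isinstance(out.get(key), list):
            out[key] = out[key] + val
        else:
            out[key] = val
    return out

def build_config(selected):
    # Start from the base config (here reduced to a models block).
    merged_yaml = {"models": [{"type": "main", "engine": "ollama"}]}
    colang_parts = []
    for name in selected:
        rail = GUARDRAILS_LIBRARY[name]
        merged_yaml = deep_merge(merged_yaml, rail["yaml"])
        colang_parts.append(rail["colang"])
    return merged_yaml, "\n\n".join(colang_parts)
    # The real app then serializes the merged YAML and initializes the
    # rails via RailsConfig.from_content(...) and LLMRails(config).
```
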
- Python 3.9+
- An Ollama installation (if using a local LLM) or an OpenAI API key.
- Clone the repository:
  git clone <repository_url>
  cd <repository_name>
- Create a virtual environment (recommended):
  python -m venv venv
  source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
- Install the dependencies:
  pip install -r requirements.txt
- Configure your LLM:
  - For Ollama:
    - Make sure the Ollama service is running.
    - In app.py, set LLM_PROVIDER = "ollama".
    - In cove_guardrails/config.yml, update the model name if you want to use a specific Ollama model (e.g., llama3).
  - For OpenAI:
    - Create a .env file in the root of the project:
      OPENAI_API_KEY="your_openai_api_key"
    - In app.py, set LLM_PROVIDER = "openai".
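For the Ollama case, a minimal models block in cove_guardrails/config.yml might look like the following; the key layout follows NeMo Guardrails' standard config.yml schema, but the model name is just an example:

```yaml
models:
  - type: main
    engine: ollama
    model: llama3
```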
Once everything is set up, run the application:
  python app.py
You will first be prompted to select the guardrails you want to activate. After that, you can begin your conversation with the guarded LLM.
This project has a lot of potential for growth. Here are some ideas for future improvements and contributions:
- Expand the Guardrails Library: Integrate more of the official NeMo Guardrails examples, such as fact-checking against a document (RAG) or detecting sensitive data (PII).
- Advanced CoVe: Enhance the self_check_facts action to use external tools or APIs (e.g., a Google Search or Wikipedia API) for more robust verification, rather than relying solely on the LLM's internal knowledge.
- Web Interface: Build a simple web UI using a framework like Streamlit or Flask to make the application more accessible to non-developers.
- Configuration Management: Allow users to save and load their guardrail selections and configurations.
- Enhanced Logging: Add detailed logging to track which guardrails are triggered and what actions are taken, which is crucial for security auditing.
- Support More LLMs: Add easy configuration options for other popular LLM providers like Anthropic, Cohere, or local models via Hugging Face Transformers.
Contributions are welcome and appreciated! If you'd like to contribute, please follow these steps:
- Fork the repository on GitHub.
- Create a new branch for your feature or bug fix:
  git checkout -b feature/your-feature-name
- Make your changes and ensure the code follows a consistent style.
- Test your changes thoroughly to ensure they don't break existing functionality.
- Submit a pull request with a clear description of your changes and why they are needed.
This project is licensed under the MIT License. See the LICENSE file for more details.
For questions, issues, or to get involved with the project, please open an issue on the GitHub repository.
Project Maintainer: @gueriila7 | Ron F. Del Rosario | ronsurf23@gmail.com