A specialized command-line tool that generates Python unit tests for your functions using the Groq LLM API. Built with engineering rigor: AST-based validation, prompt injection prevention, and deterministic output.
- Python 3.11+
- A Groq API key
# 1. Clone the repository
git clone https://github.com/Abdullahali77/AI_Testing_CLI.git
cd AI_Testing_CLI
# 2. Create and activate a virtual environment
python -m venv venv
source venv/bin/activate # macOS/Linux
venv\Scripts\activate # Windows
# 3. Install dependencies
pip install -r requirements.txt
# 4. Set up your API key
touch .env # Linux
ni .env -ItemType File # windows powershell
# create .env and put real Groq API key
GROQ_API_KEY= ...python main.py --file path/to/your_module.pypython main.py --file path/to/your_module.py --output tests/test_module.pycat your_module.py | python main.pypython main.py --file your_module.py --model mixtral-8x7b-32768Given example_input.py:
def divide(a: float, b: float) -> float:
if b == 0:
raise ValueError("Cannot divide by zero")
return a / bRun:
python main.py --file example_input.pyExpected output (printed to stdout):
import unittest
from example_input import divide
class TestDivide(unittest.TestCase):
def test_divide_positive_numbers(self):
self.assertAlmostEqual(divide(10, 2), 5.0)
def test_divide_by_zero_raises_value_error(self):
with self.assertRaises(ValueError):
divide(5, 0)
...pytest tests/ -vInstead of fragile regex, the tool uses Python's built-in ast module to parse source code into an Abstract Syntax Tree. This makes function detection mathematically correct — immune to whitespace tricks or unusual formatting.
A common attack vector is embedding malicious LLM instructions inside comments (e.g., # Ignore previous instructions and...). The tokenize module surgically removes all comments before the source code ever reaches the API.
The API is called with temperature=0, ensuring the model produces consistent, reproducible output rather than creative variations.
The LLM is given a tightly scoped system prompt that forbids it from producing anything other than raw Python test code — no markdown fences, no explanations, no prose.
Any prompt containing keywords like "explain", "refactor", "debug", or "fix" is rejected immediately before touching the API, returning: Error: This tool only generates unit tests for functions.
The LLM's response is stripped of markdown fences and validated to confirm it contains test functions before being returned to the user.
AI_Testing_CLI/
├── .gitignore
├── README.md
├── requirements.txt
├── main.py # CLI entry point (Typer)
├── core/
│ ├── parser.py # AST-based function detection
│ ├── sanitizer.py # Comment stripping (prompt injection prevention)
│ └── llm_client.py # Groq API integration
├── utils/
│ ├── validators.py # Out-of-scope request rejection
│ └── formatter.py # LLM output post-processing
└── tests/
├── test_parser.py
├── test_sanitizer.py
├── test_validators.py
└── test_formatter.py