We propose a human-in-the-loop (HITL) framework that improves LLM-enabled domain model generation through a refinement loop. The workflow is organized in two main phases:
- Initial Modeling Phase: Start with a domain description to create a draft domain model.
- Iterative Improvement Phase: Refine the domain model via a Q&A feedback loop.
The ToT-Q framework is supported by seven components:
- ToT & Confidence Quantification – Creates the domain model using the ToT4DM framework and estimates the confidence of each recommended element.
- Concept Prioritization – Prioritizes validation from central concepts outward, structuring the refinement process around concepts and their confidence scores.
- Element Relevance Validation – Detects the elements with the lowest scores so the user can validate whether they are necessary in the domain model.
- Modeling Pattern Matching – Detects modeling patterns in the domain model and uses the enabled patterns to drive a sequence of questions until each pattern is completed.
- Pattern Selection – Selects matched patterns, prioritizing areas of uncertainty in the domain model via a confidence threshold.
- Question Generation – Generates questions from the selected patterns with a rule-based agent, adapting them to the user's modeling expertise.
- Model Refinement – Updates the domain model and its confidence scores based on the domain expert's answers, until all questions are addressed or the question limit is reached.
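As a rough illustration of how the last component could behave, the sketch below implements a hypothetical Q&A refinement loop. All names are invented for illustration (the real logic lives in the repo's rule-based agent), and the defaults mirror the .env variables described later in this README:

```python
def refine(model: dict[str, float], ask,
           max_questions: int = 15,   # MAX_QUESTIONS
           relevance: float = 0.35,   # RELEVANCE_THRESHOLD
           refinement: float = 0.8,   # REFINEMENT_THRESHOLD
           high: float = 0.95,        # HIGH_CONFIDENCE
           low: float = 0.4) -> dict[str, float]:
    """Hypothetical Q&A loop: `model` maps element name -> confidence score,
    `ask(element)` poses a question to the expert and returns True/False."""
    asked = 0
    # Visit the least confident elements first (lowest scores validated first).
    for element, confidence in sorted(model.items(), key=lambda kv: kv[1]):
        if asked >= max_questions:
            break  # stop once the question limit is reached
        if confidence >= refinement:
            continue  # confident enough: no question triggered
        asked += 1
        if ask(element):
            model[element] = high      # expert confirmed the element
        elif confidence < relevance:
            del model[element]         # rejected low-relevance element: drop it
        else:
            model[element] = low       # rejected but plausible: keep, lower score
    return model
```

Elements above the refinement threshold are never questioned, which keeps the Q&A loop focused on the uncertain parts of the model.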
The ToT-Q tool is developed using the ToT4DM DSL tool, the BESSER Web Modeling Editor, and the BESSER Agentic Framework.
Request OpenAI or Azure keys to access the LLM API. Instructions are available at the following links:
To configure the ToT DSL:
- Create the .env file as instructed in the ToT4DM repo.
- Review the examples to configure the ToT4DM DSL.
To configure the BESSER Agentic framework:
- Configure the config.ini file with the websocket options indicated in the BESSER Agentic framework docs.
To configure the question templates, you can modify the question variables in the following Python file.
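For illustration only, such question template variables might look like the following sketch (template names and wording are hypothetical; the actual file in the repo defines its own):

```python
# Hypothetical question templates keyed by modeling pattern (illustrative only).
QUESTION_TEMPLATES = {
    "missing_attribute": "Should the class '{cls}' have an attribute '{attr}'?",
    "association_multiplicity": (
        "How many '{target}' instances can one '{source}' be linked to?"
    ),
}

def render(template_key: str, **kwargs) -> str:
    """Fill a template with concrete model element names."""
    return QUESTION_TEMPLATES[template_key].format(**kwargs)
```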
Add the following variables to the .env file to configure when questions are triggered:
# Maximum number of questions in the Q&A loop
MAX_QUESTIONS = 15
# Relevance threshold for filtering low-confidence elements
RELEVANCE_THRESHOLD = 0.35 # Suggested range: [0.1, 0.5]
# Refinement threshold for triggering questions on uncertain elements
REFINEMENT_THRESHOLD = 0.8 # Suggested range: [0.5, 0.9]
# Confidence values used when updating the model based on expert answers
HIGH_CONFIDENCE = 0.95 # Suggested range: [0.8, 1.0]
LOW_CONFIDENCE = 0.4 # Suggested range: [0.1, 0.5]

Quick Start: Run both the rule-based agent and the editor, then access the tool at http://localhost:5000
- Install Python 3.11 and create a virtual environment
- Install the required packages:
pip install -r requirements.txt
- Configure the templates and question triggers in the .env file.
- Run the rule-based agent (this agent calls the LLM agents):
python tot_rules_q/rule_agent.py
- A log will capture all the thoughts created by the LLM and questions triggered by the rule-based agent.
- Navigate to BESSER_WME and install Node.js dependencies:
cd BESSER_WME
npm install
- Start the web modeling editor:
npm run start:webapp
The results of the experiments include the reference models and the experiments' output. To run the experiments, use the input data with the domain descriptions, then execute the experiment:
python tot_rules_q/rule_agent.py
Then start the BESSER Web Modeling Editor in a separate terminal:
cd BESSER_WME
npm run start:webapp