
Numeric Online Action Model Learning

A Python-based framework for learning numeric action models in planning domains using various online and offline learning algorithms. This repository implements several state-of-the-art algorithms for learning PDDL+ action models from observed trajectories, with support for both discrete and numeric state variables.

Overview

This framework provides implementations of multiple action model learning algorithms:

  • SAM (Safe Action Model) Learning: Learns discrete action models from observations
  • Numeric SAM: Extension of SAM for learning numeric state variables and effects
  • NOAM (Numeric Online Action Model Learning): Online algorithm for learning numeric action models
  • Semi-Online Learning: Hybrid approach combining exploration and solver-based planning
  • Informative Explorer: Uses informative state selection for efficient exploration
  • Goal-Oriented Explorer: Focuses exploration toward goal states
  • Optimistic Explorer: Uses optimistic model assumptions for exploration

The framework supports learning from:

  • Pre-recorded trajectory files
  • Online interaction with planning environments
  • Solver-generated plans (using ENHSP or Metric-FF)

Installation

Prerequisites

  • Python 3.7+
  • Java Runtime Environment (for ENHSP solver)
  • C++ compiler (for Metric-FF solver, optional)

Setup

  1. Clone the repository:
git clone https://github.com/SPL-BGU/numeric-online-action-model-learning.git
cd numeric-online-action-model-learning
  2. Install Python dependencies:
pip install -r requirements.txt
  3. Configure solver paths (if using solvers):

Set environment variables for the solvers:

# For ENHSP solver
export ENHSP_FILE_PATH=/path/to/enhsp/enhsp.jar

# For Metric-FF solver
export METRIC_FF_DIRECTORY=/path/to/metric-ff

# For Plan Miner (optional)
export PLAN_MINER_DIR_PATH=/path/to/planminer

On Windows PowerShell:

$env:ENHSP_FILE_PATH="C:\path\to\enhsp\enhsp.jar"
$env:METRIC_FF_DIRECTORY="C:\path\to\metric-ff"
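Before launching experiments, it can help to verify these variables programmatically. The sketch below is a hypothetical helper (not part of the repository) that checks the two solver variables named above; `PLAN_MINER_DIR_PATH` is optional and therefore skipped:

```python
import os
from pathlib import Path
from typing import List

# Variable names taken from the setup instructions above.
REQUIRED_SOLVER_VARS = {
    "ENHSP_FILE_PATH": "path to the ENHSP jar file",
    "METRIC_FF_DIRECTORY": "path to the Metric-FF build directory",
}

def check_solver_paths() -> List[str]:
    """Return human-readable problems with the solver configuration."""
    problems = []
    for var, description in REQUIRED_SOLVER_VARS.items():
        value = os.environ.get(var)
        if not value:
            problems.append(f"{var} is not set ({description})")
        elif not Path(value).exists():
            problems.append(f"{var} points to a missing path: {value}")
    return problems

if __name__ == "__main__":
    for problem in check_solver_paths():
        print("WARNING:", problem)
```

Running this once before a long experiment batch avoids failing midway because a solver could not be launched.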

Datasets

The repository includes benchmark domains in the benchmark/ directory. Each domain is provided as a zip file containing PDDL problem and domain files:

  • counters: Simple counter manipulation domain
  • depots: Warehouse logistics with trucks and hoists
  • driverlog: Package delivery with drivers and trucks
  • farmland: Agricultural planning domain
  • pogo_stick: Minecraft task involving the multi-step creation of a pogo stick
  • rovers: Mars rover exploration and sample collection
  • sailing: Sailboat navigation with wind dynamics, where boats must rescue people at sea
  • satellite: Satellite observation and data transmission
  • wooden_sword: Minecraft task involving the multi-step creation of a wooden sword

Dataset Format

Each benchmark contains:

  • Domain PDDL file (.pddl) - Defines the planning domain structure
  • Problem PDDL files (pfile*.pddl) - Individual problem instances
  • Partial domain files - Domains with incomplete action models for learning

Running Experiments

Online Learning Experiments

Run online learning experiments using the Planning with Online Learning (PIL) framework:

python experiments/concurrent_execution/planning_with_online_learning.py \
  --working_directory_path /path/to/benchmark/domain \
  --domain_file_name partial_domain.pddl \
  --problem_prefix pfile \
  --polynomial_degree 1 \
  --exploration_type noam_learning

Key Parameters:

  • --working_directory_path: Path to the domain directory
  • --domain_file_name: Name of the partial domain file
  • --problem_prefix: Prefix for problem files (default: "pfile")
  • --polynomial_degree: Degree of polynomial for numeric effects (0=linear, 1=quadratic, etc.)
  • --exploration_type: Learning algorithm to use:
    • noam_learning - Numeric Online Action Model Learning
    • semi_online - Semi-Online Learning with solver integration
    • informative_explorer - Informative state-based exploration
    • goal_oriented_explorer - Goal-oriented exploration
    • optimistic_explorer - Optimistic model exploration

Parallel Numeric Experiments

Run multiple experiments in parallel across different domains or folds:

python experiments/concurrent_execution/parallel_numeric_experiment_runner.py \
  --working_directory_path /path/to/benchmark \
  --domain_file_name domain.pddl \
  --polynom_degree 1 \
  --learning_algorithm numeric_sam \
  --fold_num 0

Offline Learning from Trajectories

Train models from pre-recorded trajectories:

from sam_learning.learners import NumericSAMLearner
from pddl_plus_parser.lisp_parsers import DomainParser, ProblemParser, TrajectoryParser

# Load domain (file paths are placeholders)
domain = DomainParser("domain.pddl").parse_domain()

# Initialize learner
learner = NumericSAMLearner(
    partial_domain=domain,
    polynomial_degree=1
)

# Parse the observed trajectories and learn from all of them at once
problem = ProblemParser(problem_path="pfile1.pddl", domain=domain).parse_problem()
trajectories = [TrajectoryParser(domain, problem).parse_trajectory("trace.txt")]
learner.learn_action_model(trajectories)

# Export learned domain
learned_domain = learner.partial_domain

Repository Structure

Core Components

sam_learning/

The main learning framework containing all learning algorithms and utilities.

  • learners/: Implementation of learning algorithms

    • sam_learning.py - Base SAM learner for discrete models
    • numeric_sam.py - Numeric SAM extension for numeric state variables
    • noam_algorithm.py - Numeric Online Action Model Learning
    • semi_online_learning_algorithm.py - Semi-online learning with solver integration
  • core/: Core learning components and utilities

    • numeric_learning/: Numeric precondition and effect learning

      • Convex hull learning for safe preconditions
      • Linear regression for effect learning
      • Polynomial regression for complex effects
    • online_learning/: Online learning specific components

      • online_discrete_models_learner.py - Discrete model updates
      • online_numeric_models_learner.py - Numeric model updates
      • informative_states_learner.py - Informative state selection
      • episode_info_recorder.py - Episode statistics recording
    • online_learning_agents/: Environment interaction agents

      • abstract_agent.py - Abstract agent interface
      • ipc_agent.py - IPC environment agent implementation
    • propositional_operations/: Discrete precondition learning
    • predicates_matcher.py - Matches predicates to action parameters
    • vocabulary_creator.py - Creates lifted state vocabularies
    • environment_snapshot.py - Stores state transition snapshots
    • matching_utils.py - Utilities for matching and grounding

solvers/

Planning solver integrations for problem-solving and plan generation.

  • abstract_solver.py - Abstract solver interface
  • enhsp_solver.py - ENHSP (Expressive Numeric Heuristic Search Planner) integration
  • metric_ff_solver.py - Metric-FF solver integration

experiments/

Experiment execution and management utilities.

  • concurrent_execution/: Parallel experiment runners

    • planning_with_online_learning.py - Main PIL framework
    • parallel_numeric_experiment_runner.py - Parallel numeric experiments
    • parallel_basic_experiment_runner.py - Base parallel runner
    • distributed_results_collector.py - Collects results from parallel runs
    • folder_creation_for_parallel_execution.py - Setup for parallel runs
  • plotting/: Visualization and result plotting

    • plot_nsam_results.py - Plot NSAM performance
    • plot_online_learning_results.py - Plot online learning metrics
    • plot_numeric_precision.py - Plot numeric precision metrics

statistics/

Performance metrics and statistics calculation.

  • learning_statistics_manager.py - Manages learning statistics
  • numeric_performance_calculator.py - Calculates numeric precision/recall
  • semantic_performance_calculator.py - Semantic model evaluation
  • discrete_precision_recall_calculator.py - Discrete model metrics
  • trajectories_statistics.py - Trajectory-based statistics
  • utils.py - Statistical utilities

validators/

Model validation and correctness checking.

  • safe_domain_validator.py - Validates learned domains for safety
  • validator_script_data.py - Validation data structures
  • common.py - Common validation utilities

trajectory_creators/

Tools for generating training trajectories.

  • experiments_trajectories_creator.py - Creates experimental trajectories
  • plan_miner_trajectories_creator.py - Integration with PlanMiner
  • random_walk_trajectories_creator.py - Random walk trajectory generation

utilities/

General utility functions and type definitions.

  • util_types.py - Enumerations for algorithms, solvers, and policies
  • k_fold_split.py - K-fold cross-validation splitting
  • distributed_k_fold_split.py - Distributed k-fold splitting
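To illustrate what the k-fold utilities provide, a minimal split over problem files might look like the following (a hypothetical sketch, not the actual implementation in k_fold_split.py):

```python
import random
from pathlib import Path
from typing import Iterator, List, Tuple

def k_fold_problem_split(
    problem_paths: List[Path], num_folds: int, seed: int = 0
) -> Iterator[Tuple[List[Path], List[Path]]]:
    """Yield (train, test) problem-file splits, one pair per fold."""
    shuffled = sorted(problem_paths)  # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(shuffled)
    # Round-robin slicing distributes problems evenly across folds.
    folds = [shuffled[i::num_folds] for i in range(num_folds)]
    for i, test_fold in enumerate(folds):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        yield train, test_fold
```

Each problem file lands in exactly one test fold, matching the usual cross-validation contract.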

tests/

Unit tests and integration tests.

  • sam_learning_test.py - Tests for SAM learning
  • numeric_learning_tests/ - Tests for numeric learning components
  • online_learning_tests/ - Tests for online learning algorithms
  • general_utilities_tests/ - Tests for utility functions
  • conftest.py - Pytest configuration and fixtures

Learning Algorithms

SAM (Safe Action Model) Learning

Learns discrete action models with a safety guarantee: the learned preconditions never label an action applicable in a state where the real action is inapplicable, so plans produced with the learned model do not fail due to unmet preconditions.

Use case: Discrete planning domains without numeric state variables
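The core idea can be illustrated on a propositional toy model (a simplified sketch; the repository's sam_learning.py works on lifted PDDL actions): candidate preconditions start maximal and only shrink by intersection, so they can never admit a transition that was not observed.

```python
def update_safe_action_model(model, action, pre_state, post_state):
    """One SAM-style update from an observed (pre_state, action, post_state) transition.

    `model` maps action names to {'preconditions': set, 'effects': set}.
    Preconditions start as every literal seen in the first pre-state and can
    only shrink afterwards, which is what makes the learned model safe.
    """
    entry = model.setdefault(action, {"preconditions": None, "effects": set()})
    if entry["preconditions"] is None:
        entry["preconditions"] = set(pre_state)   # first observation: assume all literals required
    else:
        entry["preconditions"] &= set(pre_state)  # later observations only remove candidates
    entry["effects"] |= set(post_state) - set(pre_state)  # literals that became true
    return model
```

After observing the same action in several states, only the literals common to all of them remain as preconditions.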

Numeric SAM

Extends SAM to handle numeric state variables using:

  • Convex hull learning for safe numeric preconditions
  • Linear/polynomial regression for numeric effects

Use case: Numeric planning domains with known effect structures
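Both ingredients can be sketched in a few lines (illustrative only: the framework computes a true convex hull over observed pre-states, while this toy uses a coarser axis-aligned bounding box, and it fits the linear, degree-1 effect case with least squares):

```python
import numpy as np

def learn_numeric_model(pre_values: np.ndarray, post_values: np.ndarray):
    """Learn a conservative numeric precondition and a linear effect model.

    pre_values / post_values: (num_samples, num_fluents) arrays of fluent
    values before and after applications of the same action.
    """
    # Conservative precondition region: stay inside observed bounds.
    lower, upper = pre_values.min(axis=0), pre_values.max(axis=0)

    # Fit post = [pre, 1] @ weights per fluent (linear effect with bias term).
    design = np.hstack([pre_values, np.ones((len(pre_values), 1))])
    weights, *_ = np.linalg.lstsq(design, post_values, rcond=None)

    def is_safe(state: np.ndarray) -> bool:
        return bool(np.all(state >= lower) and np.all(state <= upper))

    def predict(state: np.ndarray) -> np.ndarray:
        return np.append(state, 1.0) @ weights

    return is_safe, predict
```

States outside every observed pre-state are rejected, which is the conservative behavior that makes the learned model safe at the cost of some applicability.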

NOAM (Numeric Online Action Model Learning)

Online learning algorithm that:

  • Selects informative state-action pairs for exploration
  • Incrementally updates models during execution
  • Balances exploration and exploitation

Use case: Online learning scenarios with environment interaction
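Schematically, one online episode interleaves informative action selection with incremental model updates. The agent and learner interfaces below are illustrative stand-ins, not the actual classes in the repository:

```python
def run_online_episode(agent, learner, max_steps: int = 100) -> int:
    """Run one exploration episode; return the number of executed steps."""
    state = agent.observe()
    for step in range(max_steps):
        # Pick the action whose outcome the current model is least sure about.
        action = learner.select_informative_action(state)
        if action is None:  # nothing informative left to try
            return step
        next_state, success = agent.apply(action)
        # Incremental update: failed applications also refine preconditions.
        learner.update(state, action, next_state, success)
        state = next_state
    return max_steps
```

The exploration/exploitation balance lives inside `select_informative_action`: once no action promises new information, the episode ends early.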

Semi-Online Learning

Hybrid approach that:

  • Attempts to solve problems using learned models and solvers
  • Falls back to exploration when solvers fail
  • Integrates solver-generated trajectories into training

Use case: Domains where solver assistance can accelerate learning
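The control flow can be sketched as follows (the solver, explorer, and learner interfaces here are illustrative, not the repository's actual API):

```python
def solve_or_explore(problem, learned_domain, solvers, explorer, learner):
    """Try each solver with the current learned model; explore on failure."""
    for solver in solvers:
        plan = solver.solve(learned_domain, problem)
        if plan is not None:
            # Execute the plan to obtain a trajectory and feed it back.
            trajectory = solver.execute(plan, problem)
            learner.learn_from(trajectory)
            return plan
    # No solver succeeded with the current model: gather more observations.
    trajectory = explorer.explore(problem)
    learner.learn_from(trajectory)
    return None
```

Either branch produces a trajectory for the learner, so even failed solving attempts still improve the model.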

Exploration Strategies

  • Informative Explorer: Prioritizes exploring states that provide maximum information gain
  • Goal-Oriented Explorer: Directs exploration toward goal states
  • Optimistic Explorer: Uses optimistic model assumptions to encourage exploration

Solvers

ENHSP (Expressive Numeric Heuristic Search Planner)

Supports full PDDL+ planning with:

  • Numeric state variables
  • Complex numeric expressions
  • Non-linear effects

Configuration: Set ENHSP_FILE_PATH environment variable

Metric-FF

Fast planning for domains with simple numeric constraints.

Configuration: Set METRIC_FF_DIRECTORY environment variable

Output and Results

Experiments generate several output files:

  • Learned domains: safe_domain.pddl, optimistic_domain.pddl
  • Episode statistics: exploration_statistics.csv
  • Trajectories: trajectory_*.txt files with observed state transitions
  • Performance metrics: Precision, recall, and F1 scores for learned models
  • Solution files: .solution files containing generated plans
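Since the exact columns of exploration_statistics.csv depend on the experiment configuration, a generic loader that keeps each row as a dictionary is a convenient way to post-process results (a hypothetical helper, not part of the repository):

```python
import csv
from pathlib import Path
from typing import Dict, List

def load_episode_statistics(csv_path: Path) -> List[Dict[str, str]]:
    """Load episode statistics rows without assuming specific column names."""
    with csv_path.open(newline="") as f:
        return list(csv.DictReader(f))
```

From here the rows can be handed to pandas or plotted directly by indexing the columns of interest.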

Examples

Example 1: Learning from Online Exploration

from pathlib import Path
from sam_learning.learners import SemiOnlineNumericAMLearner
from sam_learning.core.online_learning_agents import IPCAgent
from solvers import ENHSPSolver
from pddl_plus_parser.lisp_parsers import DomainParser, ProblemParser

# Setup
workdir = Path("benchmark/rovers")
domain = DomainParser(workdir / "partial_domain.pddl").parse_domain()
agent = IPCAgent(domain)
solver = ENHSPSolver()

# Initialize learner
learner = SemiOnlineNumericAMLearner(
    workdir=workdir,
    partial_domain=domain,
    polynomial_degree=1,
    agent=agent,
    solvers=[solver]
)

# Initialize learning algorithms
learner.initialize_learning_algorithms()

# Run learning on problems
problem_paths = list(workdir.glob("pfile*.pddl"))
learner.try_to_solve_problems(problem_paths)

Example 2: Offline Learning from Trajectories

from sam_learning.learners import NumericSAMLearner
from pddl_plus_parser.lisp_parsers import DomainParser, ProblemParser, TrajectoryParser

# Load domain and problem (paths are placeholders)
domain = DomainParser("domain.pddl").parse_domain()
problem = ProblemParser(problem_path="pfile1.pddl", domain=domain).parse_problem()

# Initialize learner
learner = NumericSAMLearner(
    partial_domain=domain,
    polynomial_degree=1
)

# Parse and learn from trajectory
trajectory = TrajectoryParser(domain, problem).parse_trajectory("trace.txt")
learner.learn_action_model([trajectory])

# Export learned domain
learner.partial_domain.to_pddl_file("learned_domain.pddl")

Testing

Run the test suite:

# Run all tests
pytest

# Run specific test module
pytest tests/sam_learning_test.py

# Run with verbose output
pytest -v

# Run tests with coverage
pytest --cov=sam_learning

Contributing

Contributions are welcome! Please follow these guidelines:

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Ensure all tests pass
  5. Submit a pull request

Citation

If you use this work in your research, please cite:

@inproceedings{mordoch2026online,
  title={Online Learning of Numeric Action Models for Planning},
  author={Argaman Mordoch and Yarin Benyamin and Shahaf S. Shperberg and Brendan Juba and Roni Stern},
  booktitle={International Conference on Autonomous Agents \& Multiagent Systems ({AAMAS})},
  year={2026}
}

License

MIT License - see LICENSE file for details.

Copyright (c) 2026 SPL@BGU

Contact

For questions or issues, please open an issue on GitHub or contact the maintainers.

Acknowledgments

This project builds upon:

  • PDDL+ Parser library for PDDL parsing
  • ENHSP and Metric-FF solvers for planning
  • Various action model learning research (NSAM, SAM, etc.)

Troubleshooting

Common Issues

Issue: ENHSP_FILE_PATH not set

  • Solution: Set the environment variable to point to your ENHSP jar file

Issue: Out of memory errors

  • Solution: Reduce polynomial degree or batch size, or increase Java heap size

Issue: Solver timeout

  • Solution: Increase timeout parameters in solver configuration

Issue: No solution found

  • Solution: Check that the partial domain has correct action signatures and that problems are solvable
