
Numeric Online Action Model Learning

A Python-based framework for learning numeric action models in planning domains using various online and offline learning algorithms. This repository implements several state-of-the-art algorithms for learning PDDL+ action models from observed trajectories, with support for both discrete and numeric state variables.

Overview

This framework provides implementations of multiple action model learning algorithms:

  • SAM (Safe Action Model) Learning: Learns discrete action models from observations
  • Numeric SAM: Extension of SAM for learning numeric state variables and effects
  • NOAM (Numeric Online Action Model Learning): Online algorithm for learning numeric action models
  • Semi-Online Learning: Hybrid approach combining exploration and solver-based planning
  • Informative Explorer: Uses informative state selection for efficient exploration
  • Goal-Oriented Explorer: Focuses exploration toward goal states
  • Optimistic Explorer: Uses optimistic model assumptions for exploration

The framework supports learning from:

  • Pre-recorded trajectory files
  • Online interaction with planning environments
  • Solver-generated plans (using ENHSP or Metric-FF)

Installation

Prerequisites

  • Python 3.7+
  • Java Runtime Environment (for ENHSP solver)
  • C++ compiler (for Metric-FF solver, optional)

Setup

  1. Clone the repository:
git clone https://github.com/SPL-BGU/numeric-online-action-model-learning.git
cd numeric-online-action-model-learning
  2. Install Python dependencies:
pip install -r requirements.txt
  3. Configure solver paths (if using solvers):

Set environment variables for the solvers:

# For ENHSP solver
export ENHSP_FILE_PATH=/path/to/enhsp/enhsp.jar

# For Metric-FF solver
export METRIC_FF_DIRECTORY=/path/to/metric-ff

# For Plan Miner (optional)
export PLAN_MINER_DIR_PATH=/path/to/planminer

On Windows PowerShell:

$env:ENHSP_FILE_PATH="C:\path\to\enhsp\enhsp.jar"
$env:METRIC_FF_DIRECTORY="C:\path\to\metric-ff"
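Before launching experiments, it can help to verify these variables programmatically. The sketch below is a hypothetical helper (not part of the repository) that checks the two solver variables named above; `PLAN_MINER_DIR_PATH` is optional and therefore skipped:

```python
import os
from pathlib import Path
from typing import List

# Variable names taken from the setup instructions above.
REQUIRED_SOLVER_VARS = {
    "ENHSP_FILE_PATH": "path to the ENHSP jar file",
    "METRIC_FF_DIRECTORY": "path to the Metric-FF build directory",
}

def check_solver_paths() -> List[str]:
    """Return human-readable problems with the solver configuration."""
    problems = []
    for var, description in REQUIRED_SOLVER_VARS.items():
        value = os.environ.get(var)
        if not value:
            problems.append(f"{var} is not set ({description})")
        elif not Path(value).exists():
            problems.append(f"{var} points to a missing path: {value}")
    return problems

if __name__ == "__main__":
    for problem in check_solver_paths():
        print("WARNING:", problem)
```

Running this once before a long experiment batch avoids failing midway because a solver could not be launched.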

Datasets

The repository includes benchmark domains in the benchmark/ directory. Each domain is provided as a zip file containing PDDL problem and domain files:

  • counters: Simple counter manipulation domain
  • depots: Warehouse logistics with trucks and hoists
  • driverlog: Package delivery with drivers and trucks
  • farmland: Agricultural planning domain
  • pogo_stick: Minecraft task involving the multi-step creation of a pogo stick
  • rovers: Mars rover exploration and sample collection
  • sailing: Sailboat navigation with wind dynamics, where boats must rescue people at sea
  • satellite: Satellite observation and data transmission
  • wooden_sword: Minecraft task involving the multi-step creation of a wooden sword

Dataset Format

Each benchmark contains:

  • Domain PDDL file (.pddl) - Defines the planning domain structure
  • Problem PDDL files (pfile*.pddl) - Individual problem instances
  • Partial domain files - Domains with incomplete action models for learning

Running Experiments

Online Learning Experiments

Run online learning experiments using the Planning with Online Learning (PIL) framework:

python experiments/concurrent_execution/planning_with_online_learning.py \
  --working_directory_path /path/to/benchmark/domain \
  --domain_file_name partial_domain.pddl \
  --problem_prefix pfile \
  --polynomial_degree 1 \
  --exploration_type noam_learning

Key Parameters:

  • --working_directory_path: Path to the domain directory
  • --domain_file_name: Name of the partial domain file
  • --problem_prefix: Prefix for problem files (default: "pfile")
  • --polynomial_degree: Degree of polynomial for numeric effects (0=linear, 1=quadratic, etc.)
  • --exploration_type: Learning algorithm to use:
    • noam_learning - Numeric Online Action Model Learning
    • semi_online - Semi-Online Learning with solver integration
    • informative_explorer - Informative state-based exploration
    • goal_oriented_explorer - Goal-oriented exploration
    • optimistic_explorer - Optimistic model exploration

Parallel Numeric Experiments

Run multiple experiments in parallel across different domains or folds:

python experiments/concurrent_execution/parallel_numeric_experiment_runner.py \
  --working_directory_path /path/to/benchmark \
  --domain_file_name domain.pddl \
  --polynom_degree 1 \
  --learning_algorithm numeric_sam \
  --fold_num 0

Offline Learning from Trajectories

Train models from pre-recorded trajectories:

from sam_learning.learners import NumericSAMLearner
from pddl_plus_parser.lisp_parsers import DomainParser, ProblemParser, TrajectoryParser

# Load domain (file paths are placeholders)
domain = DomainParser("domain.pddl").parse_domain()

# Initialize learner
learner = NumericSAMLearner(
    partial_domain=domain,
    polynomial_degree=1
)

# Parse the observed trajectories and learn from all of them at once
problem = ProblemParser(problem_path="pfile1.pddl", domain=domain).parse_problem()
trajectories = [TrajectoryParser(domain, problem).parse_trajectory("trace.txt")]
learner.learn_action_model(trajectories)

# Export learned domain
learned_domain = learner.partial_domain

Repository Structure

Core Components

sam_learning/

The main learning framework containing all learning algorithms and utilities.

  • learners/: Implementation of learning algorithms

    • sam_learning.py - Base SAM learner for discrete models
    • numeric_sam.py - Numeric SAM extension for numeric state variables
    • noam_algorithm.py - Numeric Online Action Model Learning
    • semi_online_learning_algorithm.py - Semi-online learning with solver integration
  • core/: Core learning components and utilities

    • numeric_learning/: Numeric precondition and effect learning

      • Convex hull learning for safe preconditions
      • Linear regression for effect learning
      • Polynomial regression for complex effects
    • online_learning/: Online learning specific components

      • online_discrete_models_learner.py - Discrete model updates
      • online_numeric_models_learner.py - Numeric model updates
      • informative_states_learner.py - Informative state selection
      • episode_info_recorder.py - Episode statistics recording
    • online_learning_agents/: Environment interaction agents

      • abstract_agent.py - Abstract agent interface
      • ipc_agent.py - IPC environment agent implementation
    • propositional_operations/: Discrete precondition learning
    • predicates_matcher.py - Matches predicates to action parameters
    • vocabulary_creator.py - Creates lifted state vocabularies
    • environment_snapshot.py - Stores state transition snapshots
    • matching_utils.py - Utilities for matching and grounding

solvers/

Planning solver integrations for problem-solving and plan generation.

  • abstract_solver.py - Abstract solver interface
  • enhsp_solver.py - ENHSP (Expressive Numeric Heuristic Search Planner) integration
  • metric_ff_solver.py - Metric-FF solver integration

experiments/

Experiment execution and management utilities.

  • concurrent_execution/: Parallel experiment runners

    • planning_with_online_learning.py - Main PIL framework
    • parallel_numeric_experiment_runner.py - Parallel numeric experiments
    • parallel_basic_experiment_runner.py - Base parallel runner
    • distributed_results_collector.py - Collects results from parallel runs
    • folder_creation_for_parallel_execution.py - Setup for parallel runs
  • plotting/: Visualization and result plotting

    • plot_nsam_results.py - Plot NSAM performance
    • plot_online_learning_results.py - Plot online learning metrics
    • plot_numeric_precision.py - Plot numeric precision metrics

statistics/

Performance metrics and statistics calculation.

  • learning_statistics_manager.py - Manages learning statistics
  • numeric_performance_calculator.py - Calculates numeric precision/recall
  • semantic_performance_calculator.py - Semantic model evaluation
  • discrete_precision_recall_calculator.py - Discrete model metrics
  • trajectories_statistics.py - Trajectory-based statistics
  • utils.py - Statistical utilities

validators/

Model validation and correctness checking.

  • safe_domain_validator.py - Validates learned domains for safety
  • validator_script_data.py - Validation data structures
  • common.py - Common validation utilities

trajectory_creators/

Tools for generating training trajectories.

  • experiments_trajectories_creator.py - Creates experimental trajectories
  • plan_miner_trajectories_creator.py - Integration with PlanMiner
  • random_walk_trajectories_creator.py - Random walk trajectory generation

utilities/

General utility functions and type definitions.

  • util_types.py - Enumerations for algorithms, solvers, and policies
  • k_fold_split.py - K-fold cross-validation splitting
  • distributed_k_fold_split.py - Distributed k-fold splitting
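To illustrate what the k-fold utilities provide, a minimal split over problem files might look like the following (a hypothetical sketch, not the actual implementation in k_fold_split.py):

```python
import random
from pathlib import Path
from typing import Iterator, List, Tuple

def k_fold_problem_split(
    problem_paths: List[Path], num_folds: int, seed: int = 0
) -> Iterator[Tuple[List[Path], List[Path]]]:
    """Yield (train, test) problem-file splits, one pair per fold."""
    shuffled = sorted(problem_paths)  # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(shuffled)
    # Round-robin slicing distributes problems evenly across folds.
    folds = [shuffled[i::num_folds] for i in range(num_folds)]
    for i, test_fold in enumerate(folds):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        yield train, test_fold
```

Each problem file lands in exactly one test fold, matching the usual cross-validation contract.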

tests/

Unit tests and integration tests.

  • sam_learning_test.py - Tests for SAM learning
  • numeric_learning_tests/ - Tests for numeric learning components
  • online_learning_tests/ - Tests for online learning algorithms
  • general_utilities_tests/ - Tests for utility functions
  • conftest.py - Pytest configuration and fixtures

Learning Algorithms

SAM (Safe Action Model) Learning

Learns discrete action models with a safety guarantee: the learned preconditions never label an action applicable in a state where the real action is inapplicable, so plans produced with the learned model do not fail due to unmet preconditions.

Use case: Discrete planning domains without numeric state variables
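The core idea can be illustrated on a propositional toy model (a simplified sketch; the repository's sam_learning.py works on lifted PDDL actions): candidate preconditions start maximal and only shrink by intersection, so they can never admit a transition that was not observed.

```python
def update_safe_action_model(model, action, pre_state, post_state):
    """One SAM-style update from an observed (pre_state, action, post_state) transition.

    `model` maps action names to {'preconditions': set, 'effects': set}.
    Preconditions start as every literal seen in the first pre-state and can
    only shrink afterwards, which is what makes the learned model safe.
    """
    entry = model.setdefault(action, {"preconditions": None, "effects": set()})
    if entry["preconditions"] is None:
        entry["preconditions"] = set(pre_state)   # first observation: assume all literals required
    else:
        entry["preconditions"] &= set(pre_state)  # later observations only remove candidates
    entry["effects"] |= set(post_state) - set(pre_state)  # literals that became true
    return model
```

After observing the same action in several states, only the literals common to all of them remain as preconditions.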

Numeric SAM

Extends SAM to handle numeric state variables using:

  • Convex hull learning for safe numeric preconditions
  • Linear/polynomial regression for numeric effects

Use case: Numeric planning domains with known effect structures
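Both ingredients can be sketched in a few lines (illustrative only: the framework computes a true convex hull over observed pre-states, while this toy uses a coarser axis-aligned bounding box, and it fits the linear, degree-1 effect case with least squares):

```python
import numpy as np

def learn_numeric_model(pre_values: np.ndarray, post_values: np.ndarray):
    """Learn a conservative numeric precondition and a linear effect model.

    pre_values / post_values: (num_samples, num_fluents) arrays of fluent
    values before and after applications of the same action.
    """
    # Conservative precondition region: stay inside observed bounds.
    lower, upper = pre_values.min(axis=0), pre_values.max(axis=0)

    # Fit post = [pre, 1] @ weights per fluent (linear effect with bias term).
    design = np.hstack([pre_values, np.ones((len(pre_values), 1))])
    weights, *_ = np.linalg.lstsq(design, post_values, rcond=None)

    def is_safe(state: np.ndarray) -> bool:
        return bool(np.all(state >= lower) and np.all(state <= upper))

    def predict(state: np.ndarray) -> np.ndarray:
        return np.append(state, 1.0) @ weights

    return is_safe, predict
```

States outside every observed pre-state are rejected, which is the conservative behavior that makes the learned model safe at the cost of some applicability.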

NOAM (Numeric Online Action Model Learning)

Online learning algorithm that:

  • Selects informative state-action pairs for exploration
  • Incrementally updates models during execution
  • Balances exploration and exploitation

Use case: Online learning scenarios with environment interaction
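Schematically, one online episode interleaves informative action selection with incremental model updates. The agent and learner interfaces below are illustrative stand-ins, not the actual classes in the repository:

```python
def run_online_episode(agent, learner, max_steps: int = 100) -> int:
    """Run one exploration episode; return the number of executed steps."""
    state = agent.observe()
    for step in range(max_steps):
        # Pick the action whose outcome the current model is least sure about.
        action = learner.select_informative_action(state)
        if action is None:  # nothing informative left to try
            return step
        next_state, success = agent.apply(action)
        # Incremental update: failed applications also refine preconditions.
        learner.update(state, action, next_state, success)
        state = next_state
    return max_steps
```

The exploration/exploitation balance lives inside `select_informative_action`: once no action promises new information, the episode ends early.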

Semi-Online Learning

Hybrid approach that:

  • Attempts to solve problems using learned models and solvers
  • Falls back to exploration when solvers fail
  • Integrates solver-generated trajectories into training

Use case: Domains where solver assistance can accelerate learning
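The control flow can be sketched as follows (the solver, explorer, and learner interfaces here are illustrative, not the repository's actual API):

```python
def solve_or_explore(problem, learned_domain, solvers, explorer, learner):
    """Try each solver with the current learned model; explore on failure."""
    for solver in solvers:
        plan = solver.solve(learned_domain, problem)
        if plan is not None:
            # Execute the plan to obtain a trajectory and feed it back.
            trajectory = solver.execute(plan, problem)
            learner.learn_from(trajectory)
            return plan
    # No solver succeeded with the current model: gather more observations.
    trajectory = explorer.explore(problem)
    learner.learn_from(trajectory)
    return None
```

Either branch produces a trajectory for the learner, so even failed solving attempts still improve the model.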

Exploration Strategies

  • Informative Explorer: Prioritizes exploring states that provide maximum information gain
  • Goal-Oriented Explorer: Directs exploration toward goal states
  • Optimistic Explorer: Uses optimistic model assumptions to encourage exploration

Solvers

ENHSP (Expressive Numeric Heuristic Search Planner)

Supports full PDDL+ planning with:

  • Numeric state variables
  • Complex numeric expressions
  • Non-linear effects

Configuration: Set ENHSP_FILE_PATH environment variable

Metric-FF

Fast planning for domains with simple numeric constraints.

Configuration: Set METRIC_FF_DIRECTORY environment variable

Output and Results

Experiments generate several output files:

  • Learned domains: safe_domain.pddl, optimistic_domain.pddl
  • Episode statistics: exploration_statistics.csv
  • Trajectories: trajectory_*.txt files with observed state transitions
  • Performance metrics: Precision, recall, and F1 scores for learned models
  • Solution files: .solution files containing generated plans
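Since the exact columns of exploration_statistics.csv depend on the experiment configuration, a generic loader that keeps each row as a dictionary is a convenient way to post-process results (a hypothetical helper, not part of the repository):

```python
import csv
from pathlib import Path
from typing import Dict, List

def load_episode_statistics(csv_path: Path) -> List[Dict[str, str]]:
    """Load episode statistics rows without assuming specific column names."""
    with csv_path.open(newline="") as f:
        return list(csv.DictReader(f))
```

From here the rows can be handed to pandas or plotted directly by indexing the columns of interest.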

Examples

Example 1: Learning from Online Exploration

from pathlib import Path
from sam_learning.learners import SemiOnlineNumericAMLearner
from sam_learning.core.online_learning_agents import IPCAgent
from solvers import ENHSPSolver
from pddl_plus_parser.lisp_parsers import DomainParser, ProblemParser

# Setup
workdir = Path("benchmark/rovers")
domain = DomainParser(workdir / "partial_domain.pddl").parse_domain()
agent = IPCAgent(domain)
solver = ENHSPSolver()

# Initialize learner
learner = SemiOnlineNumericAMLearner(
    workdir=workdir,
    partial_domain=domain,
    polynomial_degree=1,
    agent=agent,
    solvers=[solver]
)

# Initialize learning algorithms
learner.initialize_learning_algorithms()

# Run learning on problems
problem_paths = list(workdir.glob("pfile*.pddl"))
learner.try_to_solve_problems(problem_paths)

Example 2: Offline Learning from Trajectories

from sam_learning.learners import NumericSAMLearner
from pddl_plus_parser.lisp_parsers import DomainParser, ProblemParser, TrajectoryParser

# Load domain and problem (paths are placeholders)
domain = DomainParser("domain.pddl").parse_domain()
problem = ProblemParser(problem_path="pfile1.pddl", domain=domain).parse_problem()

# Initialize learner
learner = NumericSAMLearner(
    partial_domain=domain,
    polynomial_degree=1
)

# Parse and learn from trajectory
trajectory = TrajectoryParser(domain, problem).parse_trajectory("trace.txt")
learner.learn_action_model([trajectory])

# Export learned domain
learner.partial_domain.to_pddl_file("learned_domain.pddl")

Testing

Run the test suite:

# Run all tests
pytest

# Run specific test module
pytest tests/sam_learning_test.py

# Run with verbose output
pytest -v

# Run tests with coverage
pytest --cov=sam_learning

Contributing

Contributions are welcome! Please follow these guidelines:

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Ensure all tests pass
  5. Submit a pull request

Citation

If you use this work in your research, please cite:

@inproceedings{mordoch2026online,
  title={Online Learning of Numeric Action Models for Planning},
  author={Argaman Mordoch and Yarin Benyamin and Shahaf S. Shperberg and Brendan Juba and Roni Stern},
  booktitle={International Conference on Autonomous Agents \& Multiagent Systems ({AAMAS})},
  year={2026}
}

License

MIT License - see LICENSE file for details.

Copyright (c) 2026 SPL@BGU

Contact

For questions or issues, please open an issue on GitHub or contact the maintainers.

Acknowledgments

This project builds upon:

  • PDDL+ Parser library for PDDL parsing
  • ENHSP and Metric-FF solvers for planning
  • Various action model learning research (NSAM, SAM, etc.)

Troubleshooting

Common Issues

Issue: ENHSP_FILE_PATH not set

  • Solution: Set the environment variable to point to your ENHSP jar file

Issue: Out of memory errors

  • Solution: Reduce polynomial degree or batch size, or increase Java heap size

Issue: Solver timeout

  • Solution: Increase timeout parameters in solver configuration

Issue: No solution found

  • Solution: Check that the partial domain has correct action signatures and that problems are solvable
