An open-source web automation framework powered by large multimodal models.
Developed under Nexus Labs.
This project is based on Avenir-Web developed by the Princeton AI2 Lab.
OpenFlo enables autonomous web agents to perform tasks on any website using vision-language models. The system combines robust browser automation with intelligent action prediction to execute complex workflows.
- `src/openflo/`: core agent implementation (`OpenFloAgent`)
  - `agent/`: main agent logic
    - `agent.py`: central agent class and execution flow
    - `config.py`: configuration loading and validation
    - `reporting.py`: result saving and summary generation
    - `evaluation.py`: task success evaluation and termination logic
    - `executor.py`: action execution logic
    - `predictor.py`: LLM interaction and action prediction
  - `managers/ux_synthesis.py`: SEQ-to-SUS evaluation orchestration
  - `ux/`: UX scoring components (SEQ scorer, SUS calculator, report generator)
  - `personas/profile.py`: `PersonaProfile` dataclass for persona-biased evaluation
- `src/run_agent.py`: single-process runner (demo + batch)
- `src/config/*.toml`: sample configs (including `persona.toml`)
- `data/`: example data and task files
- Python >= 3.9 (see `src/pyproject.toml`)
- A browser for Playwright (Chromium recommended)
- An API key for your chosen provider (OpenRouter preferred)
From the repository root:
```bash
# Create a conda environment
conda create -n openflo python=3.11
conda activate openflo

# Install the package in editable mode
pip install -e src

# Install the Playwright browser binaries
playwright install
```

Set your API key in the `.env` file at the project root:

```bash
cp .env.example .env
# then edit .env and fill in your key
```

Environment variables take precedence over anything in `[api_keys]` inside the config.
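The precedence rule can be sketched in a few lines (this key-resolution helper is illustrative, not the actual OpenFlo loader):

```python
import os

def resolve_api_key(config_api_keys, env_var="OPENROUTER_API_KEY"):
    """Environment variables win over [api_keys] values from the TOML config."""
    # A value exported in the shell (or loaded from .env) takes precedence.
    env_value = os.environ.get(env_var)
    if env_value:
        return env_value
    # Otherwise fall back to whatever was uncommented in the [api_keys] table.
    return config_api_keys.get(env_var.lower())

# Example: the environment wins even when the config also provides a key.
os.environ["OPENROUTER_API_KEY"] = "sk-from-env"
print(resolve_api_key({"openrouter_api_key": "sk-from-toml"}))  # sk-from-env
```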
Run scripts from `src/` (paths in configs are written relative to `src/`):

```bash
cd src
```

In your config (`src/config/auto_mode.toml`), set `experiment.task_file_path`, then run:

```bash
uv run run_agent.py -c config/auto_mode.toml
```

Pass a persona config with `-p` to bias UX evaluation from a specific user's perspective:

```bash
uv run run_agent.py -c config/auto_mode.toml -p config/persona.toml
```

The persona is injected into the final UX report: scores are evaluated as if the described user performed the task. See `src/config/persona.toml` for all available fields and inline documentation.
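A minimal persona config might look like the following (all values here are illustrative; consult `src/config/persona.toml` for the authoritative field list):

```toml
[persona]
id = "cautious_beginner"
display_name = "Margaret"
age_range = "65-74"
digital_literacy = "beginner"
primary_device = "tablet_touch"
reading_speed = "slow"
tolerance_for_friction = "low"
prior_experience = "Uses email and a news app daily; rarely shops online."
description = "A retired teacher who is careful online and re-reads instructions before acting."

[persona.scoring_bias]
seq_modifier = -1
```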
Batch mode expects a JSON array of tasks like:
```json
[
  {
    "task_id": "task_001",
    "confirmed_task": "Find the official API docs for X",
    "website": "https://example.com/"
  }
]
```

Configs are TOML files; see `src/config/auto_mode.toml`.
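A quick sanity check over such a file can be sketched in a few lines (this validator is illustrative, not part of OpenFlo; the required field names are taken from the example above):

```python
import json

# Fields every batch task entry must carry (per the example task file above).
REQUIRED_FIELDS = {"task_id", "confirmed_task", "website"}

def validate_tasks(raw):
    """Parse a batch task file and check each entry for the required fields."""
    tasks = json.loads(raw)
    if not isinstance(tasks, list):
        raise ValueError("task file must be a JSON array")
    for i, task in enumerate(tasks):
        missing = REQUIRED_FIELDS - task.keys()
        if missing:
            raise ValueError(f"task {i} is missing fields: {sorted(missing)}")
    return tasks

sample = '[{"task_id": "task_001", "confirmed_task": "Find the official API docs for X", "website": "https://example.com/"}]'
print(len(validate_tasks(sample)))  # 1
```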
`[basic]`
- `save_file_dir`: output root directory
- `default_task`, `default_website`: defaults for single-task runs

`[experiment]`
- `task_file_path`: JSON task list for batch mode
- `overwrite`: skip or overwrite existing task output folders
- `max_op`, `max_continuous_no_op`, `highlight`

`[model]`
- `name`: model identifier (commonly `openrouter/...`)
- `temperature`, `rate_limit`
- optional: `reasoning_model`, `checklist_model`, `completion_eval_model`

`[api_keys]`
- keys are loaded from `.env` (`OPENROUTER_API_KEY`)
- individual keys can be uncommented in the TOML to override

`[playwright]`
- `headless`, `viewport`, `tracing`, `save_video`, `locale`, `geolocation`

`[ux]` (optional)
- `enable_synthesis`: enable SEQ/SUS evaluation (default `false`)
- `generate_report`: write `sus_report.json` at session end (default `true`)
- `ux_model`: model for SEQ scoring; defaults to the main model if omitted
- `seq_screenshot_context`: include screenshots in step evaluation (default `true`)

`[persona]` (optional, or pass via `-p persona.toml`)
- `id`, `display_name`, `age_range`: identification fields
- `digital_literacy`: `"expert"` | `"intermediate"` | `"beginner"` | `"very_low"`
- `primary_device`: `"desktop_keyboard"` | `"desktop_mouse"` | `"tablet_touch"` | `"mobile_touch"`
- `reading_speed`: `"fast"` | `"normal"` | `"slow"`
- `tolerance_for_friction`: `"high"` | `"medium"` | `"low"` | `"very_low"`
- `prior_experience`: free text fed to the LLM
- `description`: 3–4 sentence narrative the LLM embodies when scoring
- `common_friction_types`: list of friction labels surfaced in the report
- `[persona.scoring_bias]`: integer offsets applied to metric scores after the LLM response (e.g. `seq_modifier = -1`)
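How a `[persona.scoring_bias]` offset might combine with a raw LLM score can be sketched as follows (the 1–7 SEQ scale and the clamping behavior are assumptions, not confirmed from the source):

```python
def apply_scoring_bias(raw_score, modifier, lo=1, hi=7):
    """Add a persona's integer offset to a raw metric score, clamped to the scale."""
    return max(lo, min(hi, raw_score + modifier))

# A persona with seq_modifier = -1 pulls each step's SEQ score down by one.
print(apply_scoring_bias(5, -1))  # 4
print(apply_scoring_bias(1, -1))  # 1 (already at the bottom of the scale)
```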
Each task writes to `basic.save_file_dir/<task_id>/`:

- `agent.log`: per-task execution log
- `result.json`: final summary (handled by `src/openflo/agent/reporting.py`)
- `config.toml`: resolved config snapshot
- `all_predictions.json`: recorded LLM I/O for the task
- `screenshots/`: `screen_<step>.png` and sometimes `screen_<step>_labeled.png`
- `sus_report.json`: UX evaluation report (SEQ scores, SUS score, friction points, persona context); written when `ux.enable_synthesis = true`
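Because every task folder follows this layout, run-level post-processing is simple. A minimal sketch that gathers SUS scores across a run (the `sus_score` key inside `sus_report.json` is an assumed schema detail):

```python
import json
from pathlib import Path

def collect_sus_scores(save_file_dir):
    """Collect the SUS score from each task's sus_report.json, keyed by task_id."""
    scores = {}
    for report in Path(save_file_dir).glob("*/sus_report.json"):
        data = json.loads(report.read_text())
        scores[report.parent.name] = data["sus_score"]  # assumed key name
    return scores
```

Tasks that ran with `ux.enable_synthesis = false` have no `sus_report.json` and are skipped automatically.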
The runners also write run-level logs to `src/logs/`.
- Missing API key: fill in `OPENROUTER_API_KEY` in `.env` (copy from `.env.example`)
- Playwright browser not found: run `uv run playwright install chromium`
- Want to watch the browser: set `playwright.headless = false`
- Config paths look wrong: run from `src/` or pass an absolute `-c` config path
OpenFlo is built upon Avenir-Web.
This project maintains the same license as the original framework. See the LICENSE file for details.