Commit 00002f2

Standardize gotap
1 parent 8cd2955 commit 00002f2

2 files changed

Lines changed: 63 additions & 69 deletions

README.md

Lines changed: 44 additions & 52 deletions
@@ -3,84 +3,76 @@
 [![Docker Image CI](https://github.com/tool-spec/tool_template_python/actions/workflows/docker-image.yml/badge.svg)](https://github.com/tool-spec/tool_template_python/actions/workflows/docker-image.yml)
 [![DOI](https://zenodo.org/badge/558416591.svg)](https://zenodo.org/badge/latestdoi/558416591)

-This is the template for a generic containerized Python tool following the [Tool Specification](https://tool-spec.github.io/tool-specs/) for reusable research software using Docker.
+Template repository for building a Python tool that follows the [Tool Specification](https://tool-spec.github.io/tool-specs/) container contract.

-This template can be used to generate new Github repositories from it.
+## How `gotap` works here

+This template ships with [`gotap`](https://github.com/tool-spec/gotap) inside the image. The default container command is:

-## How generic?
+```Dockerfile
+CMD ["gotap", "run", "foobar", "--input-file", "/in/input.json"]
+```

-Tools using this template can be run by the [toolbox-runner](https://github.com/tool-spec/tool-runner).
-That is only convenience, the tools implemented using this template are independent of any framework.
+At build time, `gotap generate` creates `parameters.py` from `src/tool.yml`. At runtime, `run.py` uses the generated bindings to:

-The main idea is to implement a common file structure inside container to load inputs and outputs of the
-tool. The template shares this structures with the [R template](https://github.com/tool-spec/tool_template_r),
-[NodeJS template](https://github.com/tool-spec/tool_template_node) and [Octave template](https://github.com/tool-spec/tool_template_octave),
-but can be mimiced in any container.
+- validate `/in/input.json`
+- load typed parameters and data paths
+- emit structured run logs

-Each container needs at least the following structure:
+## Required file structure

-```
+```text
 /
 |- in/
-| |- parameters.json
+| |- input.json
 |- out/
 | |- ...
 |- src/
 | |- tool.yml
 | |- run.py
+| |- parameters.py (generated at build time)
+| |- CITATION.cff
 ```

-* `parameters.json` are parameters. Whichever framework runs the container, this is how parameters are passed.
-* `tool.yml` is the tool specification. It contains metadata about the scope of the tool, the number of endpoints (functions) and their parameters
-* `run.py` is the tool itself, or a Python script that handles the execution. It has to capture all outputs and either `print` them to console or create files in `/out`
-
-## How to build the image?
-
-You can build the image from within the root of this repo by
-```
-docker build -t tbr_python_tempate .
-```
+- `/in/input.json` contains parameter values and data references
+- `/out/` receives generated files plus `gotap` metadata such as `_metadata.json`
+- `/src/tool.yml` defines the tool metadata and command
+- `/src/run.py` is the tool entrypoint referenced by `tool.yml`

-Use any tag you like. If you want to run and manage the container with [toolbox-runner](https://github.com/tool-spec/tool-runner)
-they should be prefixed by `tbr_` to be recognized.
+## Build and run

-Alternatively, the contained `.github/workflows/docker-image.yml` will build the image for you
-on new releases on Github. You need to change the target repository in the aforementioned yaml.
+Build the image from the template root:

-## How to run?
+```bash
+docker build -t tbr_python_template .
+```

-This template installs the json2args python package to parse the parameters in the `/in/parameters.json`. This assumes that
-the files are not renamed and not moved and there is actually only one tool in the container. For any other case, the environment variables
-`PARAM_FILE` can be used to specify a new location for the `parameters.json` and `TOOL_RUN` can be used to specify the tool to be executed.
-The `run.py` has to take care of that.
+Run the sample tool with the bundled input and output folders:

-To invoke the docker container directly run something similar to:
-```
-docker run --rm -it -v /path/to/local/in:/in -v /path/to/local/out:/out -e TOOL_RUN=foobar tbr_python_template
+```bash
+docker run --rm -it \
+-v "$(pwd)/in:/in" \
+-v "$(pwd)/out:/out" \
+-e TOOL_RUN=foobar \
+tbr_python_template
 ```

-Then, the output will be in your local out and based on your local input folder. Stdout and Stderr are also connected to the host.
+`TOOL_RUN` is only needed when the image contains more than one tool entry. The primary runtime contract is still `/in/input.json` plus `gotap run`.

-With the [toolbox runner](https://github.com/tool-spec/tool-runner), this is simplyfied:
+## Customize

-```python
-from toolbox_runner import list_tools
-tools = list_tools() # dict with tool names as keys
-
-foobar = tools.get('foobar') # it has to be present there...
-foobar.run(result_path='./', foo_int=1337, foo_string="Please change me")
-```
-The example above will create a temporary file structure to be mounted into the container and then create a `.tar.gz` on termination of all
-inputs, outputs, specifications and some metadata, including the image sha256 used to create the output in the current working directory.
+1. Update `src/tool.yml` to describe your real tool.
+2. Add Python or system dependencies in `Dockerfile`.
+3. Implement the tool logic in `src/run.py`.
+4. Rebuild the image so `gotap generate` refreshes `parameters.py`.

-## What about real tools, no foobar?
+## Generated bindings and logging

-Yeah.
+The generated `parameters.py` file is not edited by hand. It exposes:

-1. change the `tool.yml` to describe your actual tool
-2. add any `pip install` or `apt-get install` needed to the dockerfile
-3. add additional source code to `/src`
-4. change the `run.py` to consume parameters and data from `/in` and useful output in `out`
-5. build, run, rock!
+- `get_parameters()`
+- `get_data()`
+- `get_run_context()`
+- `get_logger()`

+The starter code uses `get_logger()` to write structured JSON Lines logs to the file chosen by `gotap`.
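
Note on the README change: the new "Required file structure" section names four paths that the container contract depends on (`/in/input.json`, `/out/`, `/src/tool.yml`, `/src/run.py`). A minimal sketch of a check for that layout, run against a local checkout before building, might look like the following. `REQUIRED` and `missing_paths` are hypothetical names chosen for illustration; they are not part of the template or of `gotap`, and the sketch assumes the local `in/`, `out/` and `src/` folders mirror the container layout.

```python
from pathlib import Path

# Paths taken from the "Required file structure" section of the new README.
# parameters.py is generated at build time, so it is deliberately not listed.
REQUIRED = ("in/input.json", "out", "src/tool.yml", "src/run.py")


def missing_paths(root, required=REQUIRED):
    """Return the required relative paths that do not exist under root."""
    base = Path(root)
    return [rel for rel in required if not (base / rel).exists()]
```

Calling `missing_paths(".")` from the template root returns an empty list when the layout is complete, and otherwise lists the entries that would have to be created before `docker run` can mount `in/` and `out/`.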

src/run.py

Lines changed: 19 additions & 17 deletions
@@ -1,29 +1,31 @@
-import logging
 import os
-from datetime import datetime as dt
+import sys

-from parameters import get_parameters, get_data
+from parameters import get_data, get_logger, get_parameters

-logging.basicConfig(level=logging.INFO)
-logger = logging.getLogger(__name__)
-
-# parse parameters (generated by gotap at container build time)
 params = get_parameters()
 data = get_data()
+logger = get_logger()

-# check if a toolname was set in env
-toolname = os.environ.get('TOOL_RUN', 'foobar').lower()
+toolname = os.environ.get("TOOL_RUN", "foobar").lower()

-# switch the tool
-if toolname == 'foobar':
-    # RUN the tool here and create the output in /out
-    logger.info('This toolbox does not include any tool. Did you run the template?\n')
+logger.info("start", "Starting tool run", tool=toolname)
+logger.info(
+    "input-loaded",
+    "Loaded validated parameters and data paths",
+    tool=toolname,
+    parameter_count=len(vars(params)),
+    data_keys=sorted(data.keys()),
+)

-    logger.info(vars(params))
+if toolname == "foobar":
+    sys.stderr.write("This toolbox does not include any tool. Did you run the template?\n")
+    sys.stderr.write(f"{vars(params)}\n")

     for name, path in data.items():
-        logger.info(f"\n### {name}: %s", path)
+        sys.stderr.write(f"\n### {name}: {path}\n")

-# In any other case, it was not clear which tool to run
+    logger.info("finished", "Template run finished successfully", tool=toolname)
 else:
-    raise AttributeError(f"[{dt.now().isocalendar()}] Either no TOOL_RUN environment variable available, or '{toolname}' is not valid.\n")
+    logger.error("error", "Requested tool is not implemented in the template", tool=toolname)
+    raise AttributeError(f"Either no TOOL_RUN environment variable available, or '{toolname}' is not valid.\n")
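
Note on the `src/run.py` change: the logger returned by `get_logger()` is called as `logger.info(event, message, **fields)`, which differs from the standard library's `logging.Logger.info(msg, *args)`. The stand-in below is a rough sketch of an object with that call shape, writing one JSON object per line (JSON Lines). It is not `gotap`'s implementation: the class name `SketchLogger` and the record keys `time`, `level`, `event` and `message` are assumptions made for illustration only.

```python
import io
import json
import sys
from datetime import datetime, timezone


class SketchLogger:
    """Hypothetical stand-in with the call shape used in run.py (not gotap's code)."""

    def __init__(self, stream=None):
        # Default to stderr; the real logger writes to a file chosen by gotap.
        self.stream = stream if stream is not None else sys.stderr

    def _emit(self, level, event, message, **fields):
        # Assumed record layout: fixed keys first, then caller-supplied fields.
        record = {
            "time": datetime.now(timezone.utc).isoformat(),
            "level": level,
            "event": event,
            "message": message,
            **fields,
        }
        # One JSON document per line is what makes the output JSON Lines.
        self.stream.write(json.dumps(record) + "\n")

    def info(self, event, message, **fields):
        self._emit("info", event, message, **fields)

    def error(self, event, message, **fields):
        self._emit("error", event, message, **fields)


# Usage mirroring two of the calls in the diff, captured in memory.
buffer = io.StringIO()
logger = SketchLogger(buffer)
logger.info("start", "Starting tool run", tool="foobar")
logger.error("error", "Requested tool is not implemented in the template", tool="other")

# Each line parses independently, so a log can be filtered line by line.
records = [json.loads(line) for line in buffer.getvalue().splitlines()]
```

A stand-in like this can be handy for exercising tool logic in unit tests outside the container, where the generated `parameters.py` is not available.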
