Add logging guide and docs navigation

tobiasraabe · tobiasraabe · commit 72a646bee8ef · 2026-03-22T22:22:00.000+01:00
diff --git a/docs/AGENTS.md b/docs/AGENTS.md
@@ -1,12 +1,7 @@
 # Documentation
 
-- Link to existing docs/API refs instead of re-explaining concepts - reduces duplication
-    and keeps info in sync - Prevents documentation drift and outdated explanations by
-    maintaining a single source of truth for each concept
-- Link to canonical docs rather than duplicating content - prevents drift and
-    maintenance burden - Consolidating documentation into existing files with
-    cross-references keeps information consistent and reduces the effort needed to
-    update multiple locations when changes occur.
+## General
+
 - Document only public APIs and user-facing behavior - exclude internals, framework
     abstractions, and implementation plumbing - Users need actionable documentation on
     what they can use, not confusing details about internal mechanics they can't control
@@ -17,9 +12,6 @@
     comprehensive coverage vs. fragmented mentions - Prevents users from missing
     features when they approach from different contexts (CLI vs. API) and allows
     features to be documented holistically rather than buried in subsections.
-- Avoid `# ruff: noqa` or `# type: ignore` in doc examples - ensures examples stay
-    correct and runnable - Skip directives hide bugs and type errors in documentation
-    code that users will copy, leading to broken examples in the wild
 - Explicitly mark parameters/features as 'optional' in docs, even when types show it -
     reduces cognitive load for readers - Users shouldn't need to parse type signatures
     to understand optionality; explicit labels make documentation scannable and
@@ -31,3 +23,21 @@
 - Strip boilerplate from docs examples - show only the feature being demonstrated -
     Reduces cognitive load and helps readers focus on the specific API or pattern being
     taught without distraction from scaffolding code.
+
+## Linking
+
+- Link to existing docs/API refs instead of re-explaining concepts - reduces duplication
+    and keeps info in sync - Prevents documentation drift and outdated explanations by
+    maintaining a single source of truth for each concept
+- Link to canonical docs rather than duplicating content - prevents drift and
+    maintenance burden - Consolidating documentation into existing files with
+    cross-references keeps information consistent and reduces the effort needed to
+    update multiple locations when changes occur.
+
+## Code Examples
+
+- Avoid `# ruff: noqa` or `# type: ignore` in doc examples - ensures examples stay
+    correct and runnable - Skip directives hide bugs and type errors in documentation
+    code that users will copy, leading to broken examples in the wild
+- Code file examples should have a title that shows the file name.
+- Important lines should be highlighted or annotated with a comment.
diff --git a/docs/source/how_to_guides/index.md b/docs/source/how_to_guides/index.md
@@ -13,6 +13,7 @@ specific tasks with pytask.
 - [Remote Files](remote_files.md)
 - [Functional Interface](functional_interface.md)
 - [Capture Warnings](capture_warnings.md)
+- [Manage Logging](logging.md)
 - [How To Influence Build Order](how_to_influence_build_order.md)
 - [Hashing Inputs Of Tasks](hashing_inputs_of_tasks.md)
 - [Using Task Returns](using_task_returns.md)
diff --git a/docs/source/how_to_guides/logging.md b/docs/source/how_to_guides/logging.md
@@ -0,0 +1,238 @@
+# Manage logging
+
+pytask can capture log records emitted during task execution, show them for failing
+tasks, stream them live to the terminal, and write them to a file.
+
+If you do not use Python's [`logging`](https://docs.python.org/3/library/logging.html)
+module often, think of log records simply as structured status messages such as
+"starting download", "loaded 200 rows", or "publishing failed".
+
+This guide focuses on the most common ways to work with logging in pytask.
+
+## Quick start
+
+Choose the command that matches what you want to do:
+
+- see log messages only when a task fails: run `pytask`
+- show only logs in failure reports: run `pytask --show-capture=log`
+- see logs immediately while tasks run: run `pytask --log-cli --log-cli-level=INFO`
+- save logs to a file: run `pytask --log-file=build.log`
+- capture more detailed messages such as `INFO` or `DEBUG`: add `--log-level=INFO` or
+    `--log-level=DEBUG`
+
+## A minimal example
+
+```py title="task_logging.py"
+import logging
+import sys
+
+
+logger = logging.getLogger(__name__)
+
+
+def task_prepare_report():
+    logger.info("preparing report.txt")
+
+
+def task_publish_report():
+    logger.warning("publishing report is about to fail")
+    print("stdout from publish")
+    sys.stderr.write("stderr from publish\n")
+    raise RuntimeError("simulated publish failure")
+```
+
+The most common logging levels are:
+
+- `DEBUG`: very detailed information for debugging
+- `INFO`: normal progress messages
+- `WARNING`: something unexpected happened, but execution can continue
+- `ERROR`: a more serious problem
+
+If you are just getting started, `INFO` and `WARNING` are usually the most useful
+levels.
+
+Here is what this looks like with live logging enabled and failure output restricted to
+captured logs:
+
+```console
+$ pytask --log-cli --log-cli-level=INFO --show-capture=log
+```
+
+--8<-- "docs/source/_static/md/logging-live.md"
+
+## Show captured logs for failing tasks
+
+Log records emitted with Python's
+[`logging`](https://docs.python.org/3/library/logging.html) module are attached to the
+report of a failing task in the same way as captured `stdout` and `stderr`.
+
+```py title="task_logging.py"
+import logging
+
+
+logger = logging.getLogger(__name__)
+
+
+def task_example():
+    logger.warning("something went wrong")
+    raise RuntimeError("fail")
+```
+
+```console
+$ pytask
+```
+
+By default, pytask shows captured log output for failing tasks together with the
+traceback and any captured `stdout` or `stderr`.
+
+This is useful when a task fails and you want to see what happened right before the
+error.
+
+Use `--show-capture` to control which captured output is shown:
+
+```console
+$ pytask --show-capture=log
+$ pytask --show-capture=all
+$ pytask --show-capture=no
+```
+
+`--show-capture=log` is useful when you only want log records in the failure report and
+want to hide captured `stdout` and `stderr`.
+
+## Control which log records are captured
+
+By default, pytask does not change the logging level. Captured output therefore depends
+on your normal logging configuration.
+
+In practice this often means that `WARNING` and `ERROR` messages appear, while `INFO`
+and `DEBUG` messages do not, unless you configure logging more explicitly.
+
+Use `--log-level` to set the threshold for captured log records explicitly:
+
+```console
+$ pytask --log-level=INFO
+$ pytask --log-level=DEBUG
+```
+
+As a rule of thumb:
+
+- use `INFO` if you want to see normal progress messages,
+- use `DEBUG` only when you need very detailed diagnostics.
+
+This option affects:
+
+- log records attached to failing task reports,
+- live logs shown with `--log-cli`,
+- exported logs written with `--log-file`.
+
+You can customize the formatting of captured log records with:
+
+```console
+$ pytask --log-format="%(asctime)s %(levelname)s %(message)s" \
+         --log-date-format="%Y-%m-%d %H:%M:%S"
+```
+
+## Stream logs live while tasks run
+
+Use `--log-cli` to print log records directly to the terminal during task execution.
+
+```console
+$ pytask --log-cli --log-cli-level=INFO
+```
+
+This is helpful when tasks take a while and you want immediate feedback instead of
+waiting for the final report.
+
+You can customize live logs separately from the captured report output:
+
+```console
+$ pytask --log-cli \
+         --log-cli-level=INFO \
+         --log-cli-format="%(levelname)s:%(message)s" \
+         --log-cli-date-format="%H:%M:%S"
+```
+
+If `--log-cli-format` or `--log-cli-date-format` is not provided, pytask falls back to
+`--log-format` or `--log-date-format`, respectively.
+
+## Write logs to a file
+
+Use `--log-file` to export log records from executed tasks to a file.
+
+```console
+$ pytask --log-file=build.log
+```
+
+This is useful for CI runs, long builds, or when you want to inspect logs after the run
+has finished.
+
+The file is overwritten by default. Use `--log-file-mode=a` to append instead.
+
+```console
+$ pytask --log-file=build.log --log-file-mode=a
+```
+
+You can set the level and format of the file output independently:
+
+```console
+$ pytask --log-file=build.log \
+         --log-file-level=INFO \
+         --log-file-format="%(asctime)s %(name)s %(levelname)s %(message)s" \
+         --log-file-date-format="%Y-%m-%d %H:%M:%S"
+```
+
+Relative log file paths are resolved relative to the project root detected by pytask.
+
+## A good beginner setup
+
+If you want a practical setup without spending much time on logging configuration, this
+is a good default:
+
+```console
+$ pytask --log-cli --log-cli-level=INFO --log-file=build.log --show-capture=log
+```
+
+This gives you:
+
+- live progress messages in the terminal,
+- a log file you can inspect later,
+- only log output in failure reports, without extra `stdout` and `stderr` noise.
+
+## Configure logging defaults in `pyproject.toml`
+
+All logging options can be configured in `pyproject.toml`.
+
+```toml title="pyproject.toml"
+[tool.pytask.ini_options]
+log_level = "INFO"
+log_format = "%(asctime)s %(levelname)s %(message)s"
+log_date_format = "%Y-%m-%d %H:%M:%S"
+
+log_cli = true
+log_cli_level = "INFO"
+log_cli_format = "%(levelname)s:%(message)s"
+
+log_file = "build.log"
+log_file_mode = "w"
+log_file_level = "INFO"
+log_file_format = "%(asctime)s %(name)s %(levelname)s %(message)s"
+log_file_date_format = "%Y-%m-%d %H:%M:%S"
+```
+
+## Use logging with the programmatic interface
+
+The same options are available via
+[`pytask.build`](../api/functional_interfaces.md#build-workflow).
+
+```py title="build.py"
+from pytask import build
+
+
+session = build(
+    log_level="INFO",
+    log_cli=True,
+    log_cli_level="INFO",
+    log_file="build.log",
+    log_file_format="%(levelname)s:%(message)s",
+)
+```
diff --git a/docs/source/tutorials/capturing_output.md b/docs/source/tutorials/capturing_output.md
@@ -26,6 +26,11 @@ By default, capturing is done by intercepting writes to low-level file descripto
 allows capturing output from simple `print` statements as well as output from a
 subprocess started by a task.
 
+!!! seealso
+
+    [Manage logging](../how_to_guides/logging.md) for a dedicated guide to captured logs,
+    live logs, log files, and logging configuration.
+
 ## Setting capturing methods or disabling capturing
 
 There are three ways in which `pytask` can perform capturing:
@@ -49,23 +54,6 @@ $ pytask --capture=tee-sys   # combines 'sys' and '-s', capturing sys.stdout/std
                              # and passing it along to the actual sys.stdout/stderr
 ```
 
-## Controlling captured log output
-
-Use `--show-capture=log` to only display captured log records for failing tasks or
-`--show-capture=all` to display logs together with captured `stdout` and `stderr`.
-
-Use `--log-cli` to stream log records to the terminal while tasks run. You can customize
-the live output with `--log-cli-level`, `--log-cli-format`, and `--log-cli-date-format`.
-
-You can also export task logs to a file with `--log-file` and customize the formatting
-with `--log-format`, `--log-date-format`, `--log-file-format`, and
-`--log-file-date-format`.
-
-The animation below shows the same warning appearing once as a live log line during
-execution and again as captured log output in the failure report.
-
---8<-- "docs/source/_static/md/logging-live.md"
-
 ## Using print statements for debugging
 
 One primary benefit of the default capturing of stdout/stderr output is that you can use
diff --git a/mkdocs.yml b/mkdocs.yml
@@ -41,6 +41,7 @@ nav:
       - Remote Files: how_to_guides/remote_files.md
       - Functional Interface: how_to_guides/functional_interface.md
       - Capture Warnings: how_to_guides/capture_warnings.md
+      - Manage Logging: how_to_guides/logging.md
       - How To Influence Build Order: how_to_guides/how_to_influence_build_order.md
       - Hashing Inputs Of Tasks: how_to_guides/hashing_inputs_of_tasks.md
       - Using Task Returns: how_to_guides/using_task_returns.md
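
A note on the new guide's statement that, by default, `WARNING` and `ERROR` records usually appear while `INFO` and `DEBUG` do not: this follows from the standard library, where an unconfigured logger inherits the root logger's `WARNING` threshold. The sketch below uses only Python's `logging` module, not pytask. The logger name `task_logging_demo` is hypothetical, and `setLevel` merely stands in for what the guide describes `--log-level=INFO` as doing.

```python
import logging

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("task_logging_demo")

# Without explicit configuration, the logger inherits the root logger's
# default threshold, which is WARNING in the standard library.
default_level = logging.getLevelName(logger.getEffectiveLevel())
info_enabled_by_default = logger.isEnabledFor(logging.INFO)
warning_enabled_by_default = logger.isEnabledFor(logging.WARNING)

# Lowering the threshold (an assumption-level stand-in for --log-level=INFO)
# lets INFO records through, while DEBUG records are still filtered out.
logger.setLevel(logging.INFO)
info_enabled_after = logger.isEnabledFor(logging.INFO)
debug_enabled_after = logger.isEnabledFor(logging.DEBUG)

print(default_level, info_enabled_by_default, warning_enabled_by_default)
print(info_enabled_after, debug_enabled_after)
```

This is why the guide's `task_prepare_report` example, which logs at `INFO`, would typically need a lowered threshold before its message shows up anywhere.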