Commit 942d2fa

authored
[DOCS-1740] Add details about managing bucket storage (#2328)
1 parent 26aa1fb commit 942d2fa

6 files changed

Lines changed: 124 additions & 27 deletions

docs.json

Lines changed: 1 addition & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -168,6 +168,7 @@
168168
"platform/hosting/data-security/data-encryption"
169169
]
170170
},
171+
"platform/hosting/managing-bucket-storage",
171172
"platform/hosting/env-vars"
172173
]
173174
},

models/artifacts/delete-artifacts.mdx

Lines changed: 18 additions & 14 deletions
@@ -8,6 +8,10 @@ Delete artifacts interactively with the W&B App or programmatically with the W&B
88

99
The contents of the artifact remain in a soft-deleted (pending deletion) state until a regularly run garbage collection process reviews all artifacts marked for deletion. The garbage collection process deletes the associated files from storage only if the artifact and its files are not used by previous or subsequent artifact versions.
1010

11+
<Note>
12+
Garbage collection is **best-effort**. W&B does not guarantee how quickly freed space appears in your object storage after you delete an artifact. Large deployments or backlogs can take longer than expected. For how this fits with run data, retention settings, and optional operator actions, see [Manage bucket storage and costs](/platform/hosting/managing-bucket-storage).
13+
</Note>
14+
1115
## Artifact garbage collection workflow
1216

1317
The following diagram illustrates the complete artifact garbage collection process:
@@ -24,7 +28,7 @@ graph TB
2428
SDKDelete --> SoftDelete
2529
TTLDelete --> SoftDelete
2630
27-
SoftDelete --> GCWait[(Wait for<br/>Garbage Collection<br/>Process)]
31+
SoftDelete --> GCWait[(Wait for<br/>best-effort<br/>Garbage Collection)]
2832
2933
GCWait --> GCRun[Garbage Collection<br/>Process Runs<br/><br/>- Reviews all soft-deleted artifacts<br/>- Checks file dependencies]
3034
@@ -196,31 +200,31 @@ Artifacts with protected aliases have special deletion restrictions. [Protected
196200

197201

198202
## Enable garbage collection based on how W&B is hosted
199-
Garbage collection is enabled by default if you use W&B's shared cloud. Based on how you host W&B, you might need to take additional steps to enable garbage collection, this includes:
200203

204+
<Note>Garbage collection timing is not guaranteed. See [Manage bucket storage and costs](/platform/hosting/managing-bucket-storage) for details.</Note>
205+
206+
Garbage collection is active by default if you use W&B Multi-tenant Cloud. In W&B Dedicated and Self-Managed, you might need to take these additional steps to activate garbage collection.
207+
208+
209+
1. **W&B Self-Managed**: Set `GORILLA_ARTIFACT_GC_ENABLED=true`.
210+
1. **Dedicated Cloud**: Contact support to verify that garbage collection is active.
211+
1. Enable bucket versioning if you use [AWS](https://docs.aws.amazon.com/AmazonS3/latest/userguide/manage-versioning-examples.html), [Google Cloud](https://cloud.google.com/storage/docs/object-versioning) or any other storage provider such as [Minio](https://min.io/docs/minio/linux/administration/object-management/object-versioning.html#enable-bucket-versioning). If you use Azure, [enable soft deletion](https://learn.microsoft.com/azure/storage/blobs/soft-delete-blob-overview), which is equivalent to bucket versioning.
201212

202-
* Set the `GORILLA_ARTIFACT_GC_ENABLED` environment variable to true: `GORILLA_ARTIFACT_GC_ENABLED=true`
203-
* Enable bucket versioning if you use [AWS](https://docs.aws.amazon.com/AmazonS3/latest/userguide/manage-versioning-examples.html), [Google Cloud](https://cloud.google.com/storage/docs/object-versioning) or any other storage provider such as [Minio](https://min.io/docs/minio/linux/administration/object-management/object-versioning.html#enable-bucket-versioning). If you use Azure, [enable soft deletion](https://learn.microsoft.com/azure/storage/blobs/soft-delete-blob-overview).
204-
<Note>
205-
Soft deletion in Azure is equivalent to bucket versioning in other storage providers.
206-
</Note>
207213

208214
The following table describes how to satisfy requirements to enable garbage collection based on your deployment type.
209215

210216
The `X` indicates you must satisfy the requirement:
211217

212218
| | Environment variable | Enable versioning |
213219
| -----------------------------------------------| ------------------------| ----------------- |
214-
| Shared cloud | | |
215-
| Shared cloud with [secure storage connector](/platform/hosting/data-security/secure-storage-connector)| | X |
220+
| Multi-tenant Cloud | | |
221+
| Multi-tenant Cloud with [BYOB storage](/platform/hosting/data-security/secure-storage-connector)| | X |
216222
| Dedicated Cloud | | |
217-
| Dedicated Cloud with [secure storage connector](/platform/hosting/data-security/secure-storage-connector)| | X |
218-
| Self-Managed cloud | X | X |
219-
| Self-Managed on-prem | X | X |
220-
223+
| Dedicated Cloud with [BYOB storage](/platform/hosting/data-security/secure-storage-connector)| | X |
224+
| Self-Managed | X | X |
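
The requirements in the table can be encoded as a small lookup, for example in a deployment preflight script. This is an illustrative sketch only: the `gc_requirements` function, the `GCRequirements` type, and the deployment labels are hypothetical names chosen here and are not part of any W&B API. The returned values mirror the table above.

```python
from typing import NamedTuple


class GCRequirements(NamedTuple):
    """What an operator must configure before artifact garbage collection runs."""

    env_var: bool     # set GORILLA_ARTIFACT_GC_ENABLED=true
    versioning: bool  # enable bucket versioning (or soft delete on Azure)


def gc_requirements(deployment: str, byob: bool = False) -> GCRequirements:
    """Return the table's requirements for a deployment type.

    `deployment` is one of "multi-tenant", "dedicated", or "self-managed"
    (hypothetical labels used only in this sketch).
    """
    if deployment == "self-managed":
        return GCRequirements(env_var=True, versioning=True)
    if deployment in ("multi-tenant", "dedicated"):
        # On these deployment types, only BYOB storage adds a requirement.
        return GCRequirements(env_var=False, versioning=byob)
    raise ValueError(f"unknown deployment type: {deployment!r}")
```

For example, `gc_requirements("dedicated", byob=True)` reports that only bucket versioning is required.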
221225

222226

223227
<Note>
224228
note
225229
Secure storage connector is currently only available for Google Cloud Platform and Amazon Web Services.
226-
</Note>
230+
</Note>

models/runs/delete-runs.mdx

Lines changed: 45 additions & 11 deletions
@@ -1,25 +1,51 @@
11
---
22
title: Delete runs
3-
description: Learn how to delete runs from a W&B project using the W&B App.
3+
description: Delete runs from a W&B project using the W&B App or the Public API, and learn how deleted run data is removed from storage on Dedicated Cloud and Self-Managed deployments.
44
---
55

6-
Delete one or more runs from a project with the W&B App.
6+
## Delete runs
7+
Delete runs from a project with the W&B App or the Python API.
78

9+
<Tabs>
10+
<Tab title="W&B App" value="ui">
811
1. Navigate to the project that contains the runs you want to delete.
912
2. Select the **Runs** tab.
1013
3. Select the checkbox next to the runs you want to delete.
1114
4. Choose the **Delete** button (trash can icon) above the table.
1215
5. From the drawer that appears, choose **Delete**.
1316

14-
<Note>
17+
For projects that contain a large number of runs, narrow the list before you select runs to delete: use the search bar to filter runs with a regular expression, or use the filter button to filter runs by status, tags, or other properties.
18+
</Tab>
19+
20+
<Tab title="Python" value="python">
21+
You can delete runs programmatically with [`Run.delete()`](/models/ref/python/public-api/run#method-run-delete). Set `delete_artifacts=True` if you also want to remove artifacts associated with the run.
22+
23+
```python
24+
import wandb
25+
26+
api = wandb.Api()
27+
runs = api.runs("<entity>/<project>")
28+
for run in runs:
29+
    if run.state == "finished":  # Replace with your own condition
30+
        run.delete(delete_artifacts=False)
31+
```
32+
33+
For the full method signature and behavior, see the [`Run.delete` reference](/models/ref/python/public-api/run#method-run-delete).
34+
35+
To remove individual files attached to a run, like logged media:
36+
1. Obtain the relevant file handles with [`Run.files()`](/models/ref/python/public-api/run#method-run-files).
37+
1. Use [`File.delete()`](/models/ref/python/public-api/file#method-file-delete) to delete individual files.
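
The two steps above can be sketched as a small helper. `delete_run_files` and its `should_delete` argument are hypothetical names for illustration. The helper assumes only that `Run.files()` yields objects with a `name` attribute and a `delete()` method, as the linked reference pages describe. Try the filter on a scratch project before you run it against data you care about.

```python
from typing import Callable, List


def delete_run_files(run, should_delete: Callable[[str], bool]) -> List[str]:
    """Delete files attached to `run` whose name passes `should_delete`.

    `run` is a Public API run object, for example from
    `wandb.Api().run("<entity>/<project>/<run_id>")`.
    Returns the names of the files that were deleted.
    """
    deleted = []
    # Materialize the listing first so deletions do not disturb pagination.
    for file in list(run.files()):
        if should_delete(file.name):
            file.delete()
            deleted.append(file.name)
    return deleted


# Example usage (requires the `wandb` package and API credentials):
# import wandb
# run = wandb.Api().run("<entity>/<project>/<run_id>")
# delete_run_files(run, lambda name: name.startswith("media/"))
```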
38+
39+
</Tab>
40+
</Tabs>
41+
1542
A run ID cannot be reused, even after the run is deleted. A new run that specifies the ID of a deleted run fails with an error.
16-
</Note>
1743

18-
<Note>
19-
For projects that contain a large number of runs, you can use either the search bar to filter runs you want to delete using Regex or the filter button to filter runs based on their status, tags, or other properties.
20-
</Note>
44+
<Warning>
45+
When you delete a run and choose to delete associated artifacts, the artifacts are permanently removed and can't be recovered, even if the run is restored later. This includes artifacts linked to the Registry.
46+
</Warning>
2147

22-
### Run deletion flowchart
48+
## Run deletion flowchart
2349

2450
The following diagram illustrates the complete run deletion process, including the handling of associated artifacts and Registry links:
2551

@@ -55,6 +81,14 @@ graph TB
5581
style FullEnd fill:#ffcdd2,stroke:#333,stroke-width:2px,color:#000
5682
```
5783

58-
<Warning>
59-
When you delete a run and choose to delete associated artifacts, the artifacts are permanently removed and can't be recovered, even if the run is restored later. This includes artifacts linked to the Registry.
60-
</Warning>
84+
## When deleted run data is removed from storage
85+
86+
On [W&B Dedicated Cloud](/platform/hosting/hosting-options/dedicated-cloud) and [W&B Self-Managed](/platform/hosting/hosting-options/self-managed), the `GORILLA_DATA_RETENTION_PERIOD` environment variable controls how long **deleted run data** is retained before it can be permanently removed from object storage. **Artifacts are not removed by this setting**; they follow the artifact deletion and garbage collection flow described in [Delete an artifact](/models/artifacts/delete-artifacts).
87+
88+
Setting or changing `GORILLA_DATA_RETENTION_PERIOD` is irreversible for data past the retention window. Back up your database and bucket before enabling or tightening retention. See [Configure environment variables](/platform/hosting/env-vars) for the reference table and warnings.
89+
90+
Even after run or file deletion and retention processing, **bucket usage can lag** while background jobs catch up. W&B does not guarantee immediate reclamation of object storage. For a full overview of artifacts versus run data, timing expectations, and optional operator actions, see [Manage bucket storage and costs](/platform/hosting/managing-bucket-storage).
91+
92+
<Note>
93+
If deletions do not appear as expected in the W&B App when using the Public API, upgrade the W&B Python SDK to a current release and retry.
94+
</Note>

platform/hosting/data-security/secure-storage-connector.mdx

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ Bring your own bucket (BYOB) allows you to store W&B artifacts and other related
1212

1313
<Note>
1414
* Communication between W&B SDK / CLI / UI and your buckets occurs using [pre-signed URLs](./presigned-urls).
15-
* W&B uses a garbage collection process to delete W&B Artifacts. For more information, see [Deleting Artifacts](/models/artifacts/delete-artifacts).
15+
* W&B uses garbage collection and related processes to remove deleted **artifacts** and **run data** from your bucket over time. Artifact deletion is covered in [Delete an artifact](/models/artifacts/delete-artifacts). Deleted run data on Dedicated Cloud and Self-Managed deployments also depends on `GORILLA_DATA_RETENTION_PERIOD` as described in [Configure environment variables](/platform/hosting/env-vars). Cleanup timing is not guaranteed. For a single overview of bucket usage and costs, see [Manage bucket storage and costs](/platform/hosting/managing-bucket-storage).
1616
* You can specify a sub-path when configuring a bucket, to ensure that W&B does not store any files in a folder at the root of the bucket. This can help you conform to your organization's bucket governance policy.
1717
</Note>
1818

platform/hosting/env-vars.mdx

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -35,7 +35,9 @@ In addition to configuring instance level settings via the System Settings admin
3535
| `WANDB_DIR` | Where to store all generated files. If unset, defaults to the `wandb` directory relative to your training script. Make sure this directory exists and the running user has permission to write to it. This does not control the location of downloaded artifacts, which you can set using the `WANDB_ARTIFACT_DIR` environment variable. |
3636
| `WANDB_IDENTITY_TOKEN_FILE` | For [identity federation](/platform/hosting/iam/identity_federation/), the absolute path to the local directory where JSON Web Tokens (JWTs) are stored. |
3737
<Note>
38-
Use the GORILLA_DATA_RETENTION_PERIOD environment variable cautiously. Data is removed immediately once the environment variable is set. We also recommend that you backup both the database and the storage bucket before you enable this flag.
38+
Use the `GORILLA_DATA_RETENTION_PERIOD` environment variable cautiously. It applies to **deleted run data** (including run-associated files such as media after deletion flows). It does **not** delete artifacts; use artifact deletion and `GORILLA_ARTIFACT_GC_ENABLED` as described in [Delete an artifact](/models/artifacts/delete-artifacts). For how deleting runs and files relates to storage and this setting, see [When deleted run data is removed from storage](/models/runs/delete-runs#when-deleted-run-data-is-removed-from-storage) in **Delete runs**. Data is removed according to the retention window once the variable is set. Back up both the database and the storage bucket before you enable or change this value.
39+
40+
Background removal of objects from your bucket is **best-effort** and not guaranteed to finish within a specific time. For expectations, troubleshooting, and how this relates to storage costs, see [Manage bucket storage and costs](/platform/hosting/managing-bucket-storage).
3941
</Note>
4042

4143
## Advanced reliability settings
Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
1+
---
2+
title: Manage bucket storage and costs
3+
description: Understand how W&B uses object storage, how deletion maps to bucket bytes, and how to reduce usage on Self-Managed, Dedicated Cloud, and bring-your-own-bucket deployments.
4+
---
5+
6+
When you use [Bring your own bucket (BYOB)](/platform/hosting/data-security/secure-storage-connector), [W&B Dedicated Cloud](/platform/hosting/hosting-options/dedicated-cloud), or [W&B Self-Managed](/platform/hosting/hosting-options/self-managed), your team often pays cloud storage providers directly. This page explains what occupies your bucket, how W&B removes objects after deletion in the app or API, and what you should expect in practice.
7+
8+
## What uses bucket space
9+
10+
W&B stores several categories of data in your configured object storage. The [BYOB overview](/platform/hosting/data-security/secure-storage-connector#data-stored-in-the-central-database-vs-buckets) lists examples, including experiment files and metrics, artifact files, media files, run files, and exported history in Parquet form. Together these drive bucket size and cost.
11+
12+
## How W&B removes data from storage
13+
14+
Deletion in the W&B App or [Public API](/models/ref/python/public-api/api) updates W&B metadata first. **Removing a run, artifact, or file from the product does not guarantee an immediate drop in reported bucket usage.** Object storage cleanup runs as background work that can lag, especially on busy instances.
15+
16+
### Artifacts
17+
18+
Deleted artifacts are soft-deleted, then processed by artifact garbage collection. Self-Managed deployments must set `GORILLA_ARTIFACT_GC_ENABLED=true` and meet provider requirements such as bucket versioning or soft delete. See [Delete an artifact](/models/artifacts/delete-artifacts) and [Configure environment variables](/platform/hosting/env-vars).
19+
20+
### Run data and run files
21+
22+
After runs or run-associated files are deleted, permanent removal of the underlying stored objects is controlled separately from artifacts. On Dedicated Cloud and Self-Managed deployments, `GORILLA_DATA_RETENTION_PERIOD` sets how long **deleted run data** is retained before it can be removed from storage. **This setting does not delete artifacts.** See [Configure environment variables](/platform/hosting/env-vars), [Data retention policy](/platform/hosting/hosting-options/dedicated-cloud#data-retention-policy) for Dedicated Cloud, and [Delete runs](/models/runs/delete-runs#when-deleted-run-data-is-removed-from-storage) for how run and file deletion relates to storage.
23+
24+
## What to expect from background cleanup
25+
26+
Garbage collection and related jobs that free object storage are **best-effort**. W&B does **not** guarantee that a given object disappears from your bucket within a specific time after you delete content in the UI or API. For projects with a large number of files per run, such as when logging many media files per run, expect **longer delays** before storage usage is released.
27+
28+
Monitor your bucket in your cloud provider and contact [W&B Support](mailto:support@wandb.ai) or your account team if cleanup appears stuck relative to your expectations.
29+
30+
## Reduce bucket usage
31+
32+
Use supported product flows first:
33+
34+
- [Delete runs in the W&B App](/models/runs/delete-runs#ui) or [with Python](/models/runs/delete-runs#python) when you no longer need them.
35+
- [Delete artifacts](/models/artifacts/delete-artifacts) you no longer need, and use [Artifact TTL](/models/artifacts/ttl) where it fits your workflow.
36+
37+
If you must **reclaim space immediately**, operators with access to the bucket may delete specific object keys directly in cloud storage. Be aware of the following:
38+
39+
- Objects you remove will **no longer be available to download** through W&B.
40+
- You should delete **only** keys you intend to remove. Incorrect deletes can break access to data the app still references.
41+
- If your bucket uses **object versioning** or **provider soft delete** (for example on Google Cloud Storage), storage charges can persist until non-current versions or soft-deleted objects expire under your cloud lifecycle rules.
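
To check whether versioning is holding on to space after a cleanup, you can total the bytes in current and non-current object versions under a prefix. The sketch below assumes an S3-compatible bucket and a `boto3` S3 client supplied by the caller. `version_usage` is a hypothetical helper name, and other storage providers need their own equivalent.

```python
from typing import Dict


def version_usage(s3_client, bucket: str, prefix: str = "") -> Dict[str, int]:
    """Sum the bytes held by current and non-current object versions.

    `s3_client` is expected to be a boto3 S3 client, for example
    `boto3.client("s3")`.
    """
    totals = {"current_bytes": 0, "noncurrent_bytes": 0}
    paginator = s3_client.get_paginator("list_object_versions")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages without any versions omit the "Versions" key entirely.
        for version in page.get("Versions", []):
            key = "current_bytes" if version["IsLatest"] else "noncurrent_bytes"
            totals[key] += version["Size"]
    return totals
```

A large `noncurrent_bytes` total suggests that your lifecycle rules for non-current versions, rather than W&B cleanup, determine when the space is released.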
42+
43+
For high-level usage in W&B Multi-tenant Cloud, organization admins can review storage-related usage from organization settings. See [Billing settings](/platform/app/settings-page/billing-settings).
44+
45+
## Troubleshooting
46+
47+
If deletions do not appear correctly in the W&B App after you use the Public API, **upgrade the W&B Python SDK** to a current release and retry. Very large per-run file counts can increase how long background cleanup takes across the instance.
48+
49+
For scripted cleanup patterns that match your deployment, contact [W&B Support](mailto:support@wandb.ai) or your account team.
50+
51+
## Related documentation
52+
53+
- [Delete runs](/models/runs/delete-runs#delete-runs)
54+
- [Delete an artifact](/models/artifacts/delete-artifacts)
55+
- [Configure environment variables](/platform/hosting/env-vars)
56+
- [Bring your own bucket (BYOB)](/platform/hosting/data-security/secure-storage-connector)
