28 changes: 28 additions & 0 deletions src/multiplayer-servers/monitoring/dashboards.md
Contributor

Please add a highlighted section stating that this dashboard is not causally consistent with network issues. It only provides indicators that specific routes from the server to the predefined targets might be disrupted. Network issues may occur even when no probes fail, and conversely, game servers might not experience connectivity issues even though probes are failing. Failing probes do not equal "network issues" per se.

Because of the vantage point of the probes, this is also only a local view. Probes towards cloud provider endpoints are (a) highly selective: there is one public, global endpoint per cloud provider, with no regional or other paths to the targets, and those paths might not be disrupted in the same way as the path to the global endpoint, depending on the provider's implementation; and (b) aimed at individual cloud provider services, not the entire cloud platform. They therefore give only a selective view of which services might encounter issues. For example, failing probes towards AWS S3 do not mean that all traffic towards AWS experiences connection issues.

Contributor Author

thank you, will do

@@ -0,0 +1,28 @@
# Dashboards

GameFabric provides predefined Grafana dashboards for monitoring your infrastructure.
You can find these under "Dashboards" in your Grafana instance.

## BBE Probes from Nodes

This dashboard shows BlackBox Exporter (BBE) probe results from each of your assigned nodes to predefined targets, including major cloud providers (AWS, Azure, GCP) and DNS servers (such as 1.1.1.1 and 8.8.8.8).

Copilot AI Jan 14, 2026

The list items in parentheses should follow the technical writing guideline to define acronyms/abbreviations on first use. While "AWS, Azure, GCP" are defined in the BBE expansion above, "1.1.1.1 and 8.8.8.8" are presented without context. Consider clarifying what these IP addresses represent (Cloudflare and Google DNS respectively) for readers who may not immediately recognize them.

Copilot generated this review using guidance from repository custom instructions.

### Purpose

Use this dashboard to quickly identify whether game server issues are caused by network connectivity problems to a particular cloud provider rather than bugs in your application code.
Comment on lines +8 to +12
Member

Suggested change
This dashboard shows BlackBox Exporter (BBE) probe results from each of your assigned nodes to predefined targets, including major cloud providers (AWS, Azure, GCP) and DNS servers (such as 1.1.1.1 and 8.8.8.8).
### Purpose
Use this dashboard to quickly identify whether game server issues are caused by network connectivity problems to a particular cloud provider rather than bugs in your application code.
This dashboard shows BlackBox Exporter (BBE) probe results from each of your assigned nodes to predefined targets, including major cloud providers (AWS, Azure, GCP) and DNS servers (such as 1.1.1.1 and 8.8.8.8).
This dashboard helps you determine whether game server incidents originate from cloud-provider connectivity issues rather than defects in the application.

IMO no need for a Purpose sub-section, just state the purpose directly


### Interpreting the Dashboard

- **Red sections** indicate the timespan during which a probe failed.
- **Short probe failures** are usually nothing to worry about.
- **Prolonged failures** to a single target (for example, a cloud provider your game doesn't use, or a backup DNS server) may have no impact on your game servers.
Contributor

see above comment

Comment on lines +16 to +18

Copilot AI Jan 14, 2026

The list items lack parallel structure in punctuation. The first two items use periods, but the third item does not. According to the technical writing guidelines, list items should be consistent in punctuation. Either add a period to line 18 or remove periods from lines 16 and 17.

- If probe failures to **multiple targets persist**, GameFabric automatically sets the status to degraded on [status.gamefabric.com](https://status.gamefabric.com).

:::warning Probe results are not causally consistent with network issues

Copilot AI Jan 14, 2026

The warning title "Probe results are not causally consistent with network issues" uses unclear terminology. The phrase "causally consistent" is ambiguous and may confuse readers. Consider rephrasing to something clearer like "Probe results do not always reflect network issues" or "Probes provide limited visibility into network issues" to better match the technical writing guideline for clarity and directness.

Failing probes do not necessarily indicate network issues, and network issues may occur even when all probes succeed. Probes only test specific routes from nodes to predefined targets.

The dashboard provides a limited view:

- Only one public, global endpoint is probed per cloud provider. Regional routes may behave differently.
- Probes target specific cloud services (for example, AWS S3), not the entire cloud platform. Other services on the same provider may be unaffected.

Copilot AI Jan 14, 2026

The second list item appears to use "for example" parenthetically but is describing a specific example with "AWS S3". According to the technical writing guidelines for clarity and consistency, consider rephrasing to "Probes target specific cloud services (such as AWS S3), not the entire cloud platform." to match the usage pattern established earlier in the document on line 18.

:::
Comment on lines +14 to +28
Member

IMO this is a weird use for a bullet list: the 1st point is about what red sections are, and the 3 following ones are about how to interpret various durations of the 1st point. Those are not siblings/parallel if that makes sense. I'd suggest rephrasing this whole section like such, wdyt?

Suggested change
### Interpreting the Dashboard
- **Red sections** indicate the timespan during which a probe failed.
- **Short probe failures** are usually nothing to worry about.
- **Prolonged failures** to a single target (for example, a cloud provider your game doesn't use, or a backup DNS server) may have no impact on your game servers.
- If probe failures to **multiple targets persist**, GameFabric automatically sets the status to degraded on [status.gamefabric.com](https://status.gamefabric.com).
:::warning Probe results are not causally consistent with network issues
Failing probes do not necessarily indicate network issues, and network issues may occur even when all probes succeed. Probes only test specific routes from nodes to predefined targets.
The dashboard provides a limited view:
- Only one public, global endpoint is probed per cloud provider. Regional routes may behave differently.
- Probes target specific cloud services (for example, AWS S3), not the entire cloud platform. Other services on the same provider may be unaffected.
:::
### Interpreting the Dashboard
Red segments represent periods where a probe failed.
In practice:
- Brief probe failures are common and usually not actionable.
- A sustained failure to a single target may still have no impact—for example, if the target is a provider your game does not use or a backup DNS endpoint.
- If failures persist across multiple targets, GameFabric automatically marks the service as **Degraded** on [status.gamefabric.com](https://status.gamefabric.com).
:::note About probe results
Probe results are not a definitive measure of network health: a failing probe does not necessarily indicate a network issue, and network issues can occur even when probes succeed. Probes test only specific routes from our nodes to a fixed set of predefined targets.
Limitations:
- Only one public, global endpoint is probed per cloud provider; regional routes may behave differently.
- Probes target specific cloud services (for example, AWS S3), not the entire cloud platform. Other services on the same provider may be unaffected.
:::

Also, the admonition here should be more of a note than a warning, there's nothing dangerous here, it's more of an information panel/PSA IMO, would you agree?
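
For readers following the thread, the interpretation rules discussed for this section (brief failures, a sustained failure to one target, persistent failures across several targets) can be sketched as a small classifier. This is only an illustration under stated assumptions, not GameFabric's actual logic: the names `FailureSpan` and `classify`, the 120-second cutoff for a "brief" failure, and the two-target minimum are all hypothetical, since the page under review gives no concrete thresholds. The sketch also does not check whether failures to different targets overlap in time.

```python
from dataclasses import dataclass

# Assumed values: the dashboard page does not state any thresholds.
BRIEF_FAILURE_SECONDS = 120
MIN_TARGETS_FOR_MULTI = 2


@dataclass(frozen=True)
class FailureSpan:
    """One red segment: a continuous period during which a probe failed."""
    target: str   # e.g. "aws" or "8.8.8.8" (labels are hypothetical)
    start: float  # seconds since epoch
    end: float    # seconds since epoch


def classify(spans: list[FailureSpan]) -> str:
    """Give a rough reading of a set of red segments.

    "ok"            - no failures, or only brief ones
    "single-target" - sustained failure to exactly one target
    "multi-target"  - sustained failures to several targets
    """
    # Collect the targets that have at least one failure longer than the cutoff.
    sustained_targets = {
        span.target
        for span in spans
        if (span.end - span.start) > BRIEF_FAILURE_SECONDS
    }
    if not sustained_targets:
        return "ok"
    if len(sustained_targets) < MIN_TARGETS_FOR_MULTI:
        return "single-target"
    return "multi-target"


# Usage: one short blip to AWS and one ten-minute failure to a DNS target.
reading = classify([
    FailureSpan("aws", start=0, end=30),
    FailureSpan("8.8.8.8", start=0, end=600),
])
```

As the review comments point out, such a reading is only an indicator: an "ok" result does not rule out network issues, and a "multi-target" result does not prove that game servers are affected.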

4 changes: 4 additions & 0 deletions src/multiplayer-servers/monitoring/sidebar.json
@@ -7,6 +7,10 @@
"text": "Introduction",
"link": "/monitoring/introduction"
},
{
"text": "Dashboards",
"link": "/monitoring/dashboards"
},
{
"text": "Audit Logs",
"link": "/monitoring/auditlogs"