description: Guide end users through SourceMedium BigQuery setup, access verification, schema-aware SQL analysis, and auditable SQL receipts. Use when users need help from first-run gcloud/bq configuration through answering analytical questions against SourceMedium datasets.
+compatibility: Requires gcloud CLI, bq CLI, and network access to BigQuery.
+---
-That page includes minimum IAM roles, a copy/paste admin request, and official Google links.
+# SourceMedium BigQuery Analyst
+
+Use this skill to help end users work with SourceMedium BigQuery data from setup to analysis.
+
+## Workflow
+
+1. Validate local tooling (`gcloud`, `bq`) and authentication state.
+2. Confirm active project and dataset/table visibility before writing analysis SQL.
+3. Use docs-first guidance for setup, definitions, and table discovery.
+4. Answer analytical questions with reproducible SQL receipts.
+5. Call out assumptions and caveats explicitly.
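
Steps 1 and 2 of the workflow above could be sketched roughly as follows. This is a minimal illustration, not part of the skill itself: the helper names `check_tooling` and `active_project` are hypothetical, and the only external command used is `gcloud config get-value project`.

```python
import shutil
import subprocess

# The two CLIs the skill lists under "compatibility".
REQUIRED_TOOLS = ("gcloud", "bq")


def check_tooling(tools=REQUIRED_TOOLS):
    """Map each CLI name to True if it is found on PATH, else False."""
    return {tool: shutil.which(tool) is not None for tool in tools}


def active_project(timeout=30):
    """Return the active gcloud project ID, or None if it cannot be read."""
    if shutil.which("gcloud") is None:
        return None
    try:
        result = subprocess.run(
            ["gcloud", "config", "get-value", "project"],
            capture_output=True, text=True, timeout=timeout, check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    project = result.stdout.strip()
    # gcloud reports "(unset)" when no project is configured.
    if result.returncode != 0 or not project or project == "(unset)":
        return None
    return project


if __name__ == "__main__":
    for tool, found in check_tooling().items():
        print(f"{tool}: {'found' if found else 'MISSING'}")
    print("active project:", active_project() or "not set")
```

Once both tools are present, `gcloud auth list` and `bq ls --project_id=<project>` are reasonable follow-up checks for authentication state and dataset visibility.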
+
+## Required Output Format
+
+For analytical questions, always return:
+
+1. `Answer`: concise plain-English conclusion.
+2. `SQL (copy/paste)`: BigQuery Standard SQL used for the result.
+If access/setup fails, do not fabricate results. Return:
+
+1. Exact failing step.
+2. Exact project/dataset that failed.
+3. Direct user to request access from their internal admin.
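
One way to keep the failure response consistent is to build it from a small structure instead of free text. The sketch below is an assumption about how that might look; the `AccessFailure` class and its field names are hypothetical and do not come from the skill.

```python
from dataclasses import dataclass


@dataclass
class AccessFailure:
    """The three items the skill asks for when access or setup fails."""
    step: str     # exact command or workflow step that failed
    project: str  # project ID that was being accessed
    dataset: str  # dataset that was being accessed

    def render(self) -> str:
        """Format the failure as the numbered list described above."""
        return "\n".join([
            f"1. Failing step: {self.step}",
            f"2. Project/dataset: {self.project}.{self.dataset}",
            "3. Next action: request access from your internal admin.",
        ])


if __name__ == "__main__":
    # Example values are placeholders, not real identifiers.
    print(AccessFailure("bq ls", "example-project", "sm_transformed_v2").render())
```

Because the report only echoes what was attempted, it cannot contain fabricated query results.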
+
+## Query Guardrails
+
+1. Fully qualify tables as `` `project.dataset.table` ``.
+2. For order analyses, default to `WHERE is_order_sm_valid = TRUE`.
+3. Use `sm_store_id` (not `smcid`; that name does not exist in customer tables).
+4. Use `SAFE_DIVIDE` for ratio math.
+5. Handle DATE/TIMESTAMP typing explicitly (`DATE(ts_col)` when comparing to dates).
+6. Use `order_net_revenue` for revenue metrics (not `order_gross_revenue` unless explicitly asked).
+7. Use `*_local_datetime` columns for date-based reporting (not UTC `*_at` columns).
+8. Avoid `LIKE`/`REGEXP` on low-cardinality fields (`sm_channel`, `utm_source`, `utm_medium`, `source_system`); discover values first with a `SELECT DISTINCT` query, then use exact match.
+9. `LIKE` is fine for free-text fields (`utm_campaign`, `product_title`, `page_path`).
+10. Keep exploration bounded (`LIMIT`, date filters, partition filters). Max 100 rows returned.
+11. **LTV tables (`rpt_cohort_ltv_*`)**: always filter `sm_order_line_type` to exactly ONE value (`'all_orders'`, `'subscription_orders_only'`, or `'one_time_orders_only'`). Without this filter, metrics inflate 3x.
+12. For multi-touch attribution questions, use `sm_experimental` dataset. For standard analysis, use `sm_transformed_v2`.
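
Several of the guardrails above can be applied mechanically when generating SQL. The following is a rough sketch under stated assumptions, not the skill's actual implementation: the helper names are hypothetical, the table name `obt_orders` and the column `order_processed_at_local_datetime` in the usage example are placeholders, and the query assumes one row per order with `sm_store_id` and `sm_channel` present on the same table.

```python
import re
from datetime import date

LOW_CARDINALITY = {"sm_channel", "utm_source", "utm_medium", "source_system"}
LTV_LINE_TYPES = {"all_orders", "subscription_orders_only", "one_time_orders_only"}
MAX_ROWS = 100
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_-]*$")


def qualify(project, dataset, table):
    """Guardrail 1: return a backticked, fully qualified table reference."""
    for part in (project, dataset, table):
        if not _IDENT.match(part):
            raise ValueError(f"unexpected identifier: {part!r}")
    return f"`{project}.{dataset}.{table}`"


def distinct_values_sql(table_ref, column):
    """Guardrail 8: discover exact values before filtering a categorical field."""
    if column not in LOW_CARDINALITY:
        raise ValueError(f"{column!r} is not a known low-cardinality field")
    return f"SELECT DISTINCT {column}\nFROM {table_ref}\nLIMIT {MAX_ROWS}"


def ltv_filter(line_type):
    """Guardrail 11: allow exactly one sm_order_line_type value."""
    if line_type not in LTV_LINE_TYPES:
        raise ValueError(f"unknown sm_order_line_type: {line_type!r}")
    return f"sm_order_line_type = '{line_type}'"


def order_revenue_sql(table_ref, date_col, start, end, limit=MAX_ROWS):
    """Guardrails 2, 4, 5, 6, 7, 10 combined in one bounded order query."""
    # Validate the dates so only ISO-formatted literals reach the SQL text.
    start_d, end_d = date.fromisoformat(start), date.fromisoformat(end)
    return (
        "SELECT\n"
        "  sm_store_id,\n"
        "  sm_channel,\n"
        "  SUM(order_net_revenue) AS net_revenue,\n"
        "  SAFE_DIVIDE(SUM(order_net_revenue), COUNT(*)) AS avg_order_value\n"
        f"FROM {table_ref}\n"
        "WHERE is_order_sm_valid = TRUE\n"
        f"  AND DATE({date_col}) BETWEEN DATE '{start_d}' AND DATE '{end_d}'\n"
        "GROUP BY sm_store_id, sm_channel\n"
        "ORDER BY net_revenue DESC\n"
        f"LIMIT {min(limit, MAX_ROWS)}"
    )


if __name__ == "__main__":
    # All identifiers below are illustrative placeholders.
    ref = qualify("example-project", "sm_transformed_v2", "obt_orders")
    print(order_revenue_sql(ref, "order_processed_at_local_datetime",
                            "2024-01-01", "2024-01-31"))
```

Keeping the row cap and the allowed `sm_order_line_type` values in constants means a request for more than 100 rows, or for an unlisted line type, is clamped or rejected before any query is sent.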
-2. `is_order_sm_valid = TRUE` default for order analyses
-3. `sm_store_id` naming convention
-4. `SAFE_DIVIDE` for ratios
-5. Discovery-first handling for categorical values
+If you cannot run queries due to permissions, see [BigQuery Access Request Template](/ai-analyst/agent-skills/bigquery-access-request-template) for a copy/paste request you can send to your internal admin.
 ## Related docs
@@ -68,7 +172,4 @@ For analytical questions, this skill returns: