Tracking quality, attribution coverage, and data health queries.
Use the same query standards from the SQL Query Library overview: is_order_sm_valid = TRUE for order analyses, sm_store_id scoping for multi-store setups, and your_project placeholders.
These templates help you assess attribution coverage and basic data health before doing deeper analysis.If you want table-level freshness/coverage metadata, start with: dim_data_dictionary.
Which tables are stale or missing data?
What you’ll learn: Which tables in your data warehouse haven’t been updated recently or are missing data entirely. Run this first to identify pipeline issues before diving into analysis.
-- Assumptions: timeframe=all_time | metric=table_freshness | grain=dataset+table | scope=sm_metadataSELECT dataset_name, table_name, MAX(CAST(table_has_data AS INT64)) > 0 AS table_has_data, MAX(CAST(table_has_fresh_data_14d AS INT64)) > 0 AS table_has_fresh_data_14d, MAX(table_last_data_date) AS table_last_data_date, ANY_VALUE(table_description) AS table_descriptionFROM `your_project.sm_metadata.dim_data_dictionary`WHERE dataset_name IN ('sm_transformed_v2', 'sm_experimental') AND dataset_name IS NOT NULL AND table_name IS NOT NULLGROUP BY 1, 2ORDER BY table_has_fresh_data_14d ASC, table_has_data ASC, table_last_data_date ASC, dataset_name, table_nameLIMIT 200;
Attribution column coverage on orders
What you’ll learn: How complete your attribution data is—what percentage of orders have UTM source, zero-party attribution, discount codes, landing pages, and referrer domains. Low coverage in key columns signals tracking gaps.
-- Assumptions: timeframe=all_time | metric=column_coverage | grain=column | scope=sm_metadata_obt_ordersWITH cols AS ( SELECT 'sm_utm_source' AS column_name UNION ALL SELECT 'sm_utm_medium' UNION ALL SELECT 'sm_utm_source_medium' UNION ALL SELECT 'sm_zero_party_attribution_source' UNION ALL SELECT 'order_discount_codes_csv' UNION ALL SELECT 'sm_order_landing_page' UNION ALL SELECT 'sm_order_referrer_domain')SELECT d.table_name, d.column_name, ROUND(100 - d.column_null_percentage, 1) AS non_null_pct, d.column_distinct_count, ( SELECT STRING_AGG( CONCAT(v.value, ' (', FORMAT('%.1f', v.pct), '%)'), ', ' ORDER BY v.pct DESC LIMIT 8 ) FROM UNNEST(IFNULL(d.categorical_value_distribution, [])) AS v WHERE v.value IS NOT NULL AND v.pct IS NOT NULL ) AS top_valuesFROM `your_project.sm_metadata.dim_data_dictionary` dINNER JOIN cols c ON d.column_name = c.column_nameWHERE d.dataset_name = 'sm_transformed_v2' AND d.table_name = 'obt_orders'ORDER BY non_null_pct DESC, d.column_distinct_count DESC, d.column_name;
When UTMs are missing, what other attribution signals exist?
What you’ll learn: For orders without UTM tracking, what fallback attribution data is available (zero-party surveys, discount codes, landing pages, referrer domains). Helps you understand how much attribution you can recover.
-- Assumptions: timeframe=last_90_days | metric=fallback_coverage | grain=overall | scope=valid_orders_only_missing_utmsWITH missing_utms AS ( SELECT sm_order_key, order_net_revenue, sm_zero_party_attribution_source, order_discount_codes_csv, sm_order_landing_page, sm_order_referrer_domain FROM `your_project.sm_transformed_v2.obt_orders` WHERE is_order_sm_valid = TRUE AND order_cancelled_at IS NULL AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY) AND COALESCE(NULLIF(LOWER(TRIM(sm_utm_source_medium)), ''), '(none) / (none)') = '(none) / (none)')SELECT COUNT(DISTINCT sm_order_key) AS orders_missing_utms, SUM(order_net_revenue) AS order_net_revenue_missing_utms, COUNTIF(sm_zero_party_attribution_source IS NOT NULL AND TRIM(sm_zero_party_attribution_source) NOT IN ('', '(none)')) AS orders_with_zero_party, SAFE_DIVIDE( COUNTIF(sm_zero_party_attribution_source IS NOT NULL AND TRIM(sm_zero_party_attribution_source) NOT IN ('', '(none)')), NULLIF(COUNT(DISTINCT sm_order_key), 0) ) AS pct_with_zero_party, COUNTIF(order_discount_codes_csv IS NOT NULL AND TRIM(order_discount_codes_csv) NOT IN ('', '(none)')) AS orders_with_discount_code, SAFE_DIVIDE( COUNTIF(order_discount_codes_csv IS NOT NULL AND TRIM(order_discount_codes_csv) NOT IN ('', '(none)')), NULLIF(COUNT(DISTINCT sm_order_key), 0) ) AS pct_with_discount_code, COUNTIF(sm_order_landing_page IS NOT NULL AND TRIM(sm_order_landing_page) NOT IN ('', '(none)')) AS orders_with_landing_page, SAFE_DIVIDE( COUNTIF(sm_order_landing_page IS NOT NULL AND TRIM(sm_order_landing_page) NOT IN ('', '(none)')), NULLIF(COUNT(DISTINCT sm_order_key), 0) ) AS pct_with_landing_page, COUNTIF(sm_order_referrer_domain IS NOT NULL AND TRIM(sm_order_referrer_domain) NOT IN ('', '(none)')) AS orders_with_referrer_domain, SAFE_DIVIDE( COUNTIF(sm_order_referrer_domain IS NOT NULL AND TRIM(sm_order_referrer_domain) NOT IN ('', '(none)')), NULLIF(COUNT(DISTINCT sm_order_key), 0) ) AS pct_with_referrer_domainFROM missing_utms;
Top referrer domains for orders missing UTMs
What you’ll learn: Which external sites are sending you traffic that isn’t tagged with UTMs. Use this to identify partners, affiliates, or other untracked sources that need proper tracking or attribution rules.
-- Assumptions: timeframe=last_90_days | metric=orders+net_revenue | grain=referrer_domain | scope=valid_orders_only_missing_utmsWITH base AS ( SELECT LOWER(TRIM(sm_order_referrer_domain)) AS referrer_domain, sm_order_key, order_net_revenue FROM `your_project.sm_transformed_v2.obt_orders` WHERE is_order_sm_valid = TRUE AND order_cancelled_at IS NULL AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY) AND COALESCE(NULLIF(LOWER(TRIM(sm_utm_source_medium)), ''), '(none) / (none)') = '(none) / (none)' AND sm_order_referrer_domain IS NOT NULL AND TRIM(sm_order_referrer_domain) NOT IN ('', '(none)'))SELECT referrer_domain, COUNT(DISTINCT sm_order_key) AS orders, SUM(order_net_revenue) AS order_net_revenueFROM baseGROUP BY 1HAVING orders >= 25ORDER BY order_net_revenue DESCLIMIT 50;
What you’ll learn: The percentage of orders missing customer keys and order lines missing SKUs. Critical for data integrity—high null rates here break customer-level analysis and product reporting.
-- Assumptions: timeframe=last_30_days | metric=null_rate_checks | grain=overall | scope=valid_orders_onlyWITH orders AS ( SELECT COUNT(*) AS orders_total, COUNTIF(sm_customer_key IS NULL) AS orders_missing_customer_key FROM `your_project.sm_transformed_v2.obt_orders` WHERE is_order_sm_valid = TRUE AND order_cancelled_at IS NULL AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)),lines AS ( SELECT COUNT(*) AS lines_total, COUNTIF(sku IS NULL OR TRIM(sku) = '' OR LOWER(sku) = 'missing sku') AS lines_missing_sku FROM `your_project.sm_transformed_v2.obt_order_lines` WHERE is_order_sm_valid = TRUE AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))SELECT orders_total, orders_missing_customer_key, SAFE_DIVIDE(orders_missing_customer_key, NULLIF(orders_total, 0)) AS pct_orders_missing_customer_key, lines_total, lines_missing_sku, SAFE_DIVIDE(lines_missing_sku, NULLIF(lines_total, 0)) AS pct_lines_missing_skuFROM ordersCROSS JOIN lines;
Attribution health trend (weekly)
What you’ll learn: How your attribution coverage has changed week-over-week—UTM coverage, unattributed orders, and direct traffic share. Spot tracking regressions or improvements over time.
-- Assumptions: timeframe=last_26_weeks | metric=utm_coverage+direct_share+unattributed_share | grain=week | scope=valid_orders_onlyWITH base AS ( SELECT DATE_TRUNC(DATE(order_processed_at_local_datetime), WEEK(MONDAY)) AS week_start, COALESCE(NULLIF(LOWER(TRIM(sm_utm_source_medium)), ''), '(none) / (none)') AS source_medium, sm_order_key, order_net_revenue FROM `your_project.sm_transformed_v2.obt_orders` WHERE is_order_sm_valid = TRUE AND order_cancelled_at IS NULL AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 182 DAY)),weekly AS ( SELECT week_start, COUNT(DISTINCT sm_order_key) AS orders, SUM(order_net_revenue) AS order_net_revenue, COUNTIF(source_medium = '(none) / (none)') AS unattributed_orders, SUM(CASE WHEN source_medium = '(none) / (none)' THEN order_net_revenue ELSE 0 END) AS unattributed_revenue, COUNTIF(source_medium IN ('(direct) / (none)', 'direct / (none)')) AS direct_orders, SUM(CASE WHEN source_medium IN ('(direct) / (none)', 'direct / (none)') THEN order_net_revenue ELSE 0 END) AS direct_revenue, COUNTIF(source_medium != '(none) / (none)') AS orders_with_utm_source_medium FROM base GROUP BY 1)SELECT week_start, orders, order_net_revenue, SAFE_DIVIDE(orders_with_utm_source_medium, NULLIF(orders, 0)) AS pct_orders_with_utm_source_medium, SAFE_DIVIDE(unattributed_orders, NULLIF(orders, 0)) AS pct_orders_unattributed, SAFE_DIVIDE(unattributed_revenue, NULLIF(order_net_revenue, 0)) AS pct_revenue_unattributed, SAFE_DIVIDE(direct_orders, NULLIF(orders, 0)) AS pct_orders_direct, SAFE_DIVIDE(direct_revenue, NULLIF(order_net_revenue, 0)) AS pct_revenue_directFROM weeklyORDER BY week_start;
Attribution health by store and sales channel
What you’ll learn: How attribution coverage varies across your stores and sales channels (online vs. POS vs. wholesale). Some channels naturally have lower attribution—this helps set expectations.
-- Assumptions: timeframe=last_30_days | metric=unattributed_share | grain=sm_store_id+sm_channel | scope=valid_orders_onlyWITH base AS ( SELECT sm_store_id, COALESCE(NULLIF(LOWER(TRIM(sm_channel)), ''), '(unknown)') AS sm_channel, COALESCE(NULLIF(LOWER(TRIM(sm_utm_source_medium)), ''), '(none) / (none)') AS source_medium, sm_order_key, order_net_revenue FROM `your_project.sm_transformed_v2.obt_orders` WHERE is_order_sm_valid = TRUE AND order_cancelled_at IS NULL AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) AND NULLIF(LOWER(TRIM(sm_channel)), '') IS NOT NULL)SELECT sm_store_id, sm_channel, COUNT(DISTINCT sm_order_key) AS orders, SUM(order_net_revenue) AS order_net_revenue, SAFE_DIVIDE(COUNTIF(source_medium = '(none) / (none)'), NULLIF(COUNT(DISTINCT sm_order_key), 0)) AS pct_orders_unattributed, SAFE_DIVIDE( SUM(CASE WHEN source_medium = '(none) / (none)' THEN order_net_revenue ELSE 0 END), NULLIF(SUM(order_net_revenue), 0) ) AS pct_revenue_unattributedFROM baseGROUP BY 1, 2HAVING orders >= 50ORDER BY pct_revenue_unattributed DESC, orders DESCLIMIT 100;
Discount code parsing (top codes by revenue)
What you’ll learn: Your top discount codes ranked by revenue, with order counts and AOV. Use this to evaluate promo effectiveness and identify codes that might be over-used or under-attributed.
-- Assumptions: timeframe=last_90_days | metric=orders+net_revenue | grain=discount_code | scope=valid_orders_only-- Note: If an order has multiple discount codes, its revenue will be counted under each code (this is a code-usage view, not strict attribution).WITH orders_with_codes AS ( SELECT sm_order_key, order_net_revenue, order_discount_codes_csv FROM `your_project.sm_transformed_v2.obt_orders` WHERE is_order_sm_valid = TRUE AND order_cancelled_at IS NULL AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY) AND order_discount_codes_csv IS NOT NULL AND TRIM(order_discount_codes_csv) != ''),exploded AS ( SELECT sm_order_key, order_net_revenue, TRIM(code_raw) AS discount_code FROM orders_with_codes, UNNEST(SPLIT(order_discount_codes_csv, ',')) AS code_raw WHERE TRIM(code_raw) != '')SELECT discount_code, COUNT(DISTINCT sm_order_key) AS orders, SUM(order_net_revenue) AS order_net_revenue, SAFE_DIVIDE(SUM(order_net_revenue), NULLIF(COUNT(DISTINCT sm_order_key), 0)) AS aovFROM explodedGROUP BY 1HAVING orders >= 25ORDER BY order_net_revenue DESCLIMIT 50;
Top landing pages for orders missing UTMs
What you’ll learn: Which pages customers land on when they arrive without UTM tracking. Useful for identifying untracked entry points and pages that need better tracking implementation.
-- Assumptions: timeframe=last_90_days | metric=orders+net_revenue | grain=landing_host+landing_path | scope=valid_orders_only_missing_utmsWITH base AS ( SELECT sm_order_key, order_net_revenue, sm_order_landing_page FROM `your_project.sm_transformed_v2.obt_orders` WHERE is_order_sm_valid = TRUE AND order_cancelled_at IS NULL AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY) AND COALESCE(NULLIF(LOWER(TRIM(sm_utm_source_medium)), ''), '(none) / (none)') = '(none) / (none)' AND sm_order_landing_page IS NOT NULL AND TRIM(sm_order_landing_page) NOT IN ('', '(none)')),parsed AS ( SELECT sm_order_key, order_net_revenue, REGEXP_EXTRACT(sm_order_landing_page, r'^(?:https?://)?([^/?#]+)') AS landing_host, REGEXP_EXTRACT(sm_order_landing_page, r'^(?:https?://)?[^/?#]+(/[^?#]*)') AS landing_path FROM base)SELECT COALESCE(NULLIF(LOWER(TRIM(landing_host)), ''), '(unknown)') AS landing_host, COALESCE(NULLIF(landing_path, ''), '/') AS landing_path, COUNT(DISTINCT sm_order_key) AS orders, SUM(order_net_revenue) AS order_net_revenueFROM parsedGROUP BY 1, 2HAVING orders >= 25ORDER BY order_net_revenue DESCLIMIT 50;
Click-id coverage vs UTM coverage (gclid/fbclid)
What you’ll learn: The overlap between UTM tracking and ad-platform click IDs (gclid for Google, fbclid for Meta). Reveals orders where click IDs exist but UTMs don’t—potential attribution recovery opportunities.
-- Assumptions: timeframe=last_90_days | metric=utm_coverage+click_id_coverage | grain=week | scope=valid_orders_onlyWITH base AS ( SELECT DATE_TRUNC(DATE(order_processed_at_local_datetime), WEEK(MONDAY)) AS week_start, sm_order_key, order_net_revenue, COALESCE(NULLIF(LOWER(TRIM(sm_utm_source_medium)), ''), '(none) / (none)') AS source_medium, sm_gclid, sm_fbclid FROM `your_project.sm_transformed_v2.obt_orders` WHERE is_order_sm_valid = TRUE AND order_cancelled_at IS NULL AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY))SELECT week_start, COUNT(DISTINCT sm_order_key) AS orders, SUM(order_net_revenue) AS order_net_revenue, COUNTIF(source_medium != '(none) / (none)') AS orders_with_utm_source_medium, SAFE_DIVIDE(COUNTIF(source_medium != '(none) / (none)'), NULLIF(COUNT(DISTINCT sm_order_key), 0)) AS pct_orders_with_utm_source_medium, COUNTIF(sm_gclid IS NOT NULL AND sm_gclid NOT IN ('', '(none)')) AS orders_with_gclid, SAFE_DIVIDE(COUNTIF(sm_gclid IS NOT NULL AND sm_gclid NOT IN ('', '(none)')), NULLIF(COUNT(DISTINCT sm_order_key), 0)) AS pct_orders_with_gclid, COUNTIF(sm_fbclid IS NOT NULL AND sm_fbclid NOT IN ('', '(none)')) AS orders_with_fbclid, SAFE_DIVIDE(COUNTIF(sm_fbclid IS NOT NULL AND sm_fbclid NOT IN ('', '(none)')), NULLIF(COUNT(DISTINCT sm_order_key), 0)) AS pct_orders_with_fbclid, COUNTIF(source_medium = '(none) / (none)' AND sm_gclid IS NOT NULL AND sm_gclid NOT IN ('', '(none)')) AS utm_missing_but_gclid_orders, COUNTIF(source_medium = '(none) / (none)' AND sm_fbclid IS NOT NULL AND sm_fbclid NOT IN ('', '(none)')) AS utm_missing_but_fbclid_ordersFROM baseGROUP BY 1ORDER BY week_start;
These are deeper-dive investigations for when attribution looks “weird” (too much direct/unattributed), or when downstream metrics are being skewed by edge-case orders.
$0 / negative net-revenue order share by source/medium (last 90 days)
What you’ll learn: Which source/mediums have an unusually high share of valid orders with order_net_revenue = 0 (or negative). This often indicates replacements/comp orders or heavy discounts that can skew repeat/retention metrics.
-- Assumptions: timeframe=last_90_days | metric=paid_vs_zero_vs_negative_order_share | grain=source_medium | scope=valid_orders_onlyWITH base AS ( SELECT COALESCE(NULLIF(LOWER(TRIM(sm_utm_source_medium)), ''), '(none) / (none)') AS source_medium, sm_order_key, order_net_revenue FROM `your_project.sm_transformed_v2.obt_orders` WHERE is_order_sm_valid = TRUE AND order_cancelled_at IS NULL AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY))SELECT source_medium, COUNT(DISTINCT sm_order_key) AS orders, SUM(order_net_revenue) AS order_net_revenue, COUNTIF(order_net_revenue > 0) AS paid_orders, SAFE_DIVIDE(COUNTIF(order_net_revenue > 0), NULLIF(COUNT(DISTINCT sm_order_key), 0)) AS pct_paid_orders, COUNTIF(order_net_revenue = 0) AS zero_net_revenue_orders, SAFE_DIVIDE(COUNTIF(order_net_revenue = 0), NULLIF(COUNT(DISTINCT sm_order_key), 0)) AS pct_zero_net_revenue_orders, COUNTIF(order_net_revenue < 0) AS negative_net_revenue_orders, SAFE_DIVIDE(COUNTIF(order_net_revenue < 0), NULLIF(COUNT(DISTINCT sm_order_key), 0)) AS pct_negative_net_revenue_ordersFROM baseGROUP BY 1HAVING orders >= 200ORDER BY pct_zero_net_revenue_orders DESC, orders DESCLIMIT 50;
Unattributed share by source system and sales channel (last 90 days)
What you’ll learn: Where unattributed orders are coming from by commerce platform (source_system) and sales channel (sm_channel). Some channels (e.g., marketplaces or POS) naturally have lower UTM coverage—this helps separate “expected” vs “broken tracking.”
-- Assumptions: timeframe=last_90_days | metric=unattributed_share | grain=source_system+sm_channel | scope=valid_orders_onlyWITH base AS ( SELECT COALESCE(NULLIF(LOWER(TRIM(source_system)), ''), '(unknown)') AS source_system, COALESCE(NULLIF(LOWER(TRIM(sm_channel)), ''), '(unknown)') AS sm_channel, COALESCE(NULLIF(LOWER(TRIM(sm_utm_source_medium)), ''), '(none) / (none)') AS source_medium, sm_order_key, order_net_revenue FROM `your_project.sm_transformed_v2.obt_orders` WHERE is_order_sm_valid = TRUE AND order_cancelled_at IS NULL AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY) AND NULLIF(LOWER(TRIM(source_system)), '') IS NOT NULL AND NULLIF(LOWER(TRIM(sm_channel)), '') IS NOT NULL)SELECT source_system, sm_channel, COUNT(DISTINCT sm_order_key) AS orders, SUM(order_net_revenue) AS order_net_revenue, COUNTIF(source_medium = '(none) / (none)') AS unattributed_orders, SAFE_DIVIDE(COUNTIF(source_medium = '(none) / (none)'), NULLIF(COUNT(DISTINCT sm_order_key), 0)) AS pct_orders_unattributed, SAFE_DIVIDE( SUM(CASE WHEN source_medium = '(none) / (none)' THEN order_net_revenue ELSE 0 END), NULLIF(SUM(order_net_revenue), 0) ) AS pct_revenue_unattributedFROM baseGROUP BY 1, 2HAVING orders >= 200ORDER BY pct_revenue_unattributed DESC, orders DESCLIMIT 100;
Top landing pages for direct traffic orders (last 90 days)
What you’ll learn: Which landing pages are most associated with “direct” orders (based on sm_utm_source_medium)—and whether landing page capture is missing. This helps diagnose tracking gaps (e.g., missing UTMs or missing landing-page capture on key entry flows).
-- Assumptions: timeframe=last_90_days | metric=orders+net_revenue | grain=landing_host+landing_path | scope=valid_orders_only_directWITH base AS ( SELECT sm_order_key, order_net_revenue, NULLIF(TRIM(sm_order_landing_page), '') AS sm_order_landing_page FROM `your_project.sm_transformed_v2.obt_orders` WHERE is_order_sm_valid = TRUE AND order_cancelled_at IS NULL AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY) AND COALESCE(NULLIF(LOWER(TRIM(sm_utm_source_medium)), ''), '(none) / (none)') IN ('(direct) / (none)', 'direct / (none)')),parsed AS ( SELECT sm_order_key, order_net_revenue, REGEXP_EXTRACT(sm_order_landing_page, r'^(?:https?://)?([^/?#]+)') AS landing_host, REGEXP_EXTRACT(sm_order_landing_page, r'^(?:https?://)?[^/?#]+(/[^?#]*)') AS landing_path FROM base)SELECT COALESCE(NULLIF(LOWER(TRIM(landing_host)), ''), '(missing_landing_page)') AS landing_host, COALESCE(NULLIF(landing_path, ''), '(missing_landing_page)') AS landing_path, COUNT(DISTINCT sm_order_key) AS orders, SUM(order_net_revenue) AS order_net_revenueFROM parsedGROUP BY 1, 2HAVING orders >= 25ORDER BY order_net_revenue DESCLIMIT 50;
Attribution health trend with week-over-week deltas (weekly)
What you’ll learn: A “tracking regression detector”: week-over-week changes in unattributed/direct order share and revenue share. Sudden jumps typically indicate tagging/measurement changes.
-- Assumptions: timeframe=last_26_weeks | metric=utm_coverage+direct_share+unattributed_share+wow_deltas | grain=week | scope=valid_orders_onlyWITH base AS ( SELECT DATE_TRUNC(DATE(order_processed_at_local_datetime), WEEK(MONDAY)) AS week_start, COALESCE(NULLIF(LOWER(TRIM(sm_utm_source_medium)), ''), '(none) / (none)') AS source_medium, sm_order_key, order_net_revenue FROM `your_project.sm_transformed_v2.obt_orders` WHERE is_order_sm_valid = TRUE AND order_cancelled_at IS NULL AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 182 DAY)),weekly AS ( SELECT week_start, COUNT(DISTINCT sm_order_key) AS orders, SUM(order_net_revenue) AS order_net_revenue, COUNTIF(source_medium = '(none) / (none)') AS unattributed_orders, SUM(CASE WHEN source_medium = '(none) / (none)' THEN order_net_revenue ELSE 0 END) AS unattributed_revenue, COUNTIF(source_medium IN ('(direct) / (none)', 'direct / (none)')) AS direct_orders, SUM(CASE WHEN source_medium IN ('(direct) / (none)', 'direct / (none)') THEN order_net_revenue ELSE 0 END) AS direct_revenue, COUNTIF(source_medium != '(none) / (none)') AS orders_with_utm_source_medium FROM base GROUP BY 1),metrics AS ( SELECT week_start, orders, order_net_revenue, SAFE_DIVIDE(orders_with_utm_source_medium, NULLIF(orders, 0)) AS pct_orders_with_utm_source_medium, SAFE_DIVIDE(unattributed_orders, NULLIF(orders, 0)) AS pct_orders_unattributed, SAFE_DIVIDE(unattributed_revenue, NULLIF(order_net_revenue, 0)) AS pct_revenue_unattributed, SAFE_DIVIDE(direct_orders, NULLIF(orders, 0)) AS pct_orders_direct, SAFE_DIVIDE(direct_revenue, NULLIF(order_net_revenue, 0)) AS pct_revenue_direct FROM weekly)SELECT week_start, orders, order_net_revenue, pct_orders_with_utm_source_medium, pct_orders_unattributed, pct_orders_unattributed - LAG(pct_orders_unattributed) OVER (ORDER BY week_start) AS delta_pct_orders_unattributed, pct_revenue_unattributed, pct_revenue_unattributed - LAG(pct_revenue_unattributed) OVER (ORDER BY week_start) AS delta_pct_revenue_unattributed, pct_orders_direct, pct_orders_direct - LAG(pct_orders_direct) OVER (ORDER BY week_start) AS delta_pct_orders_direct, pct_revenue_direct, pct_revenue_direct - LAG(pct_revenue_direct) OVER (ORDER BY week_start) AS delta_pct_revenue_directFROM metricsORDER BY week_start;
UTM source/medium discovery (top normalized values, last 90 days)
What you’ll learn: What your UTM source/medium values actually look like in practice (normalized with LOWER(TRIM())). Use this to discover the exact strings you should filter on—without guessing.
-- Assumptions: timeframe=last_90_days | metric=orders+net_revenue | grain=source_medium | scope=valid_orders_onlyWITH base AS ( SELECT COALESCE(NULLIF(LOWER(TRIM(sm_utm_source_medium)), ''), '(none) / (none)') AS source_medium, sm_order_key, order_net_revenue FROM `your_project.sm_transformed_v2.obt_orders` WHERE is_order_sm_valid = TRUE AND order_cancelled_at IS NULL AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY))SELECT source_medium, COUNT(DISTINCT sm_order_key) AS orders, SUM(order_net_revenue) AS order_net_revenue, SAFE_DIVIDE(COUNT(DISTINCT sm_order_key), NULLIF(SUM(COUNT(DISTINCT sm_order_key)) OVER (), 0)) AS pct_orders, SAFE_DIVIDE(SUM(order_net_revenue), NULLIF(SUM(SUM(order_net_revenue)) OVER (), 0)) AS pct_revenueFROM baseGROUP BY 1ORDER BY orders DESCLIMIT 50;
What you’ll learn: Whether key join fields are getting worse over time. Spikes in missing sm_customer_key (orders) or missing sku (order lines) will break customer-level and product-level analysis.
-- Assumptions: timeframe=last_26_weeks | metric=missing_key_trends | grain=week | scope=valid_orders_onlyWITH orders_weekly AS ( SELECT DATE_TRUNC(DATE(order_processed_at_local_datetime), WEEK(MONDAY)) AS week_start, COUNT(*) AS orders_total, COUNTIF(sm_customer_key IS NULL) AS orders_missing_customer_key FROM `your_project.sm_transformed_v2.obt_orders` WHERE is_order_sm_valid = TRUE AND order_cancelled_at IS NULL AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 182 DAY) GROUP BY 1),lines_weekly AS ( SELECT DATE_TRUNC(DATE(order_processed_at_local_datetime), WEEK(MONDAY)) AS week_start, COUNT(*) AS lines_total, COUNTIF(sku IS NULL OR TRIM(sku) = '' OR LOWER(sku) = 'missing sku') AS lines_missing_sku FROM `your_project.sm_transformed_v2.obt_order_lines` WHERE is_order_sm_valid = TRUE AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 182 DAY) GROUP BY 1)SELECT o.week_start, o.orders_total, o.orders_missing_customer_key, SAFE_DIVIDE(o.orders_missing_customer_key, NULLIF(o.orders_total, 0)) AS pct_orders_missing_customer_key, l.lines_total, l.lines_missing_sku, SAFE_DIVIDE(l.lines_missing_sku, NULLIF(l.lines_total, 0)) AS pct_lines_missing_skuFROM orders_weekly oLEFT JOIN lines_weekly l USING (week_start)ORDER BY o.week_start;
Multiple discount codes prevalence (double-counting risk, last 90 days)
What you’ll learn: How often orders have multiple discount codes applied. This matters because any “revenue by discount code” view will double-count revenue across codes when multiple codes exist.
-- Assumptions: timeframe=last_90_days | metric=discount_code_multiplicity | grain=code_count_bucket | scope=valid_orders_onlyWITH base AS ( SELECT sm_order_key, order_net_revenue, order_discount_codes_csv FROM `your_project.sm_transformed_v2.obt_orders` WHERE is_order_sm_valid = TRUE AND order_cancelled_at IS NULL AND DATE(order_processed_at_local_datetime) >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY) AND order_discount_codes_csv IS NOT NULL AND TRIM(order_discount_codes_csv) NOT IN ('', '(none)')),code_counts AS ( SELECT sm_order_key, order_net_revenue, ARRAY_LENGTH( ARRAY( SELECT TRIM(code_raw) FROM UNNEST(SPLIT(order_discount_codes_csv, ',')) AS code_raw WHERE TRIM(code_raw) != '' ) ) AS code_count FROM base),bucketed AS ( SELECT sm_order_key, order_net_revenue, LEAST(code_count, 5) AS code_count_bucket FROM code_counts)SELECT CASE code_count_bucket WHEN 5 THEN '5+' ELSE CAST(code_count_bucket AS STRING) END AS code_count_bucket, COUNT(DISTINCT sm_order_key) AS orders, SAFE_DIVIDE(COUNT(DISTINCT sm_order_key), NULLIF(SUM(COUNT(DISTINCT sm_order_key)) OVER (), 0)) AS pct_orders, SUM(order_net_revenue) AS order_net_revenue, SAFE_DIVIDE(SUM(order_net_revenue), NULLIF(SUM(SUM(order_net_revenue)) OVER (), 0)) AS pct_revenueFROM bucketedGROUP BY 1, code_count_bucketORDER BY code_count_bucket;