SQL in Sheets
Use analyze_range and transform_range to run Postgres SQL against live sheet data. The definitive AX cookbook.
Note
Agent Experience first. This guide is written for AI agents using sheets-mcp, not just human developers. Every pattern here is optimised to minimise tool calls, protect the agent context window, and produce reliable, deterministic outcomes.
Why SQL instead of read → process → write?
The naive pattern for an AI agent operating on spreadsheet data is:
read_range(large range) → 50,000 cells enter context → LLM processes → write_rangeThis is the worst possible approach for agents:
- Context bloat: 50k cells can consume the entire context window, leaving no room for reasoning
- Token cost: Unnecessary token spend on raw data the LLM doesn't need to see
- Hallucination risk: The LLM may produce incorrect aggregations on large numeric datasets
The correct pattern:
analyze_range(range, SQL) → only the aggregated result enters context → doneSQL runs in Postgres inside Vercel's infrastructure. The sheet data never touches the LLM context beyond the final summary.
The data model
analyze_range and transform_range both expose your sheet as a CTE named sheet_data with a single JSONB column data. Row 1 of your range is treated as the header row. Each subsequent row becomes one JSONB object with header keys.
Sheet:
| Name | Revenue | Region |
|---------|---------|--------|
| Acme | 12000 | South |
| Globex | 8500 | North |
JSONB objects:
{ "Name": "Acme", "Revenue": "12000", "Region": "South" }
{ "Name": "Globex", "Revenue": "8500", "Region": "North" }Warning: All values in data are strings — even numbers. Always cast before arithmetic: (data ->>'Revenue')::numeric
Step 0 — Always inspect headers first
Before writing any query, run this to see the exact header keys (case, spaces, special chars):
SELECT jsonb_object_keys(data) AS header
FROM sheet_data
LIMIT 50This prevents null results from silent key mismatches like 'Revenue ' vs 'Revenue'.
Core patterns
Aggregate with GROUP BY
SELECT
data ->>'Region' AS region,
COUNT(*)::int AS deals,
SUM((data ->>'Revenue')::numeric) AS total_revenue,
ROUND(AVG((data ->>'Revenue')::numeric), 2) AS avg_deal
FROM sheet_data
WHERE data ->>'Status' = 'Closed Won'
GROUP BY region
ORDER BY total_revenue DESCFilter rows
SELECT data ->>'Name', data ->>'Revenue', data ->>'Due Date'
FROM sheet_data
WHERE (data ->>'Revenue')::numeric > 10000
AND data ->>'Status' != 'Cancelled'Find duplicates
SELECT data ->>'Email' AS email, COUNT(*) AS occurrences
FROM sheet_data
GROUP BY email
HAVING COUNT(*) > 1
ORDER BY occurrences DESCRunning total (window function)
SELECT
data ->>'Date' AS date,
(data ->>'Amount')::numeric AS amount,
SUM((data ->>'Amount')::numeric) OVER (ORDER BY data ->>'Date') AS running_total
FROM sheet_dataConditional classification
SELECT
data ->>'Name' AS name,
(data ->>'Score')::numeric AS score,
CASE
WHEN (data ->>'Score')::numeric >= 90 THEN 'A'
WHEN (data ->>'Score')::numeric >= 75 THEN 'B'
WHEN (data ->>'Score')::numeric >= 60 THEN 'C'
ELSE 'F'
END AS grade
FROM sheet_data
ORDER BY score DESCanalyze_range — read-only
Use when you need insight without mutation. Results go to the agent only — the sheet is untouched.
{
"tool": "analyze_range",
"spreadsheetId": "1BxiMVs0XRA...",
"range": "Sheet1!A:F",
"query": "SELECT data ->>'Region' AS region, SUM((data ->>'Revenue')::numeric) AS total FROM sheet_data GROUP BY region ORDER BY total DESC"
}Caching behaviour: analyze_range caches the raw sheet data in Supabase for 60 seconds. Repeated calls within this window skip the Sheets API fetch entirely, reducing latency and API quota usage.
transform_range — read → SQL → write back
Use when you need to mutate the sheet data in-place using SQL logic. The agent writes the transformation rule, not the transformed data.
{
"tool": "transform_range",
"spreadsheetId": "1BxiMVs0XRA...",
"range": "Sheet1!A:D",
"query": "SELECT data ->>'Date', INITCAP(TRIM(data ->>'Name')), (data ->>'Revenue')::numeric * 1.1, data ->>'Region' FROM sheet_data"
}Critical: transform_range never uses cache. It always fetches fresh data from Google Sheets before executing SQL. This prevents stale data from being written back.
Cell limit: transform_range is capped at 50,000 cells per call. For larger datasets, narrow the range or use column subsets.
Pre-write snapshot: Before every write, transform_range saves the original range to write_history in Supabase. Use restore_snapshot to undo.
Safe workflow for transforms
1. analyze_range(range, query) → Preview the SQL output
2. If output looks correct →
transform_range(range, same_query) → Write it back
3. If something went wrong →
restore_snapshot(action: "list") → Find the snapshot
restore_snapshot(action: "restore", snapshotId: "...") → UndoAX anti-patterns to avoid
| ✗ Anti-pattern | ✓ Correct approach |
|---|---|
read_range on 10k+ rows then summarise in LLM | analyze_range with SQL GROUP BY |
Multiple sequential analyze_range calls for the same range | One query with multiple CTEs or subqueries |
Using transform_range without previewing with analyze_range first | Always preview before mutating |
| Casting headers as numbers without inspecting them first | SELECT jsonb_object_keys(data) first |
Passing raw LLM-generated strings directly into values[][] | Clean and validate in SQL, then write |
Security constraints (automatic — agents cannot bypass)
| Constraint | Effect |
|---|---|
BEGIN READ ONLY transaction | Postgres blocks any mutation even if SQL contains INSERT/UPDATE |
; stripped from query | Statement chaining blocked |
DROP, DELETE, UNION rejected | Destructive keywords rejected before execution |
LIMIT 1000 auto-appended | Result sets capped to prevent context overflow |
SET LOCAL ROLE sheet_analyzer | Query runs as a restricted Postgres role |