Autopilots runbook
Overview
| Autopilot | Repo | Hosting | Supabase project | Clients |
|---|---|---|---|---|
| HH Google Ads | hh-google-ads-autopilot | Vercel | fcwuqbexvggjvwgdpnya | 14 HH clients with Google Ads |
| HH Meta Ads | hh-meta-ads-autopilot | Vercel | fcwuqbexvggjvwgdpnya | HH clients with Meta Ads |
| MM Google Ads | mm-google-ads-autopilot | Vercel | texktztyyacsvdkgxbkr | 8 MM mechanic clients |
All three post approval cards to Slack and log actions to ClickUp. The MM autopilot is internally nicknamed Lightning McQueen.
HH Google Ads
Detection logic
| Rule | Condition |
|---|---|
| Keyword pause | Spend ≥ $50 with 0 conversions (14-day lookback), or CPA > 3× ad group average |
| Negative keyword | Search term spend ≥ $20 with 0 conversions, matched against irrelevant terms list. Max 10 per campaign per run |
| Ad rewrite | CTR below 1% after 1,000+ impressions. Minimum 2 active ads per group (won't rewrite the last one) |
| Circuit breaker | Cumulative spend impact > 15% bumps all actions to Tier 2. Max 5 actions per ad group |
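The keyword pause and ad rewrite thresholds above can be sketched as two small predicates. This is a minimal illustration, not the repo's actual rules engine: the type and function names (`KeywordStats`, `shouldPauseKeyword`, `AdStats`, `shouldRewriteAd`) are hypothetical, and reading "minimum 2 active ads per group" as "skip the rewrite when fewer than 2 ads are active" is an assumption.

```typescript
// Hypothetical shape for a keyword's 14-day stats; field names are assumptions.
interface KeywordStats {
  spend: number;         // dollars over the 14-day lookback
  conversions: number;
  adGroupAvgCpa: number; // average CPA of the parent ad group
}

const PAUSE_MIN_SPEND = 50; // spend >= $50 with 0 conversions
const CPA_MULTIPLIER = 3;   // CPA > 3x ad group average

function shouldPauseKeyword(k: KeywordStats): boolean {
  if (k.conversions === 0) {
    return k.spend >= PAUSE_MIN_SPEND;
  }
  const cpa = k.spend / k.conversions;
  return cpa > CPA_MULTIPLIER * k.adGroupAvgCpa;
}

// Hypothetical shape for an ad's stats.
interface AdStats {
  impressions: number;
  clicks: number;
}

const REWRITE_MIN_IMPRESSIONS = 1000; // 1,000+ impressions
const REWRITE_MAX_CTR = 0.01;         // CTR below 1%
const MIN_ACTIVE_ADS = 2;             // assumed reading of "won't rewrite the last one"

function shouldRewriteAd(ad: AdStats, activeAdsInGroup: number): boolean {
  if (activeAdsInGroup < MIN_ACTIVE_ADS) return false;
  if (ad.impressions < REWRITE_MIN_IMPRESSIONS) return false;
  return ad.clicks / ad.impressions < REWRITE_MAX_CTR;
}
```

Note the boundary behaviour in this sketch: a keyword at exactly $50 with no conversions qualifies for a pause, while a CPA of exactly 3× the ad group average does not.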
Chain runner architecture
Clients are split into two chains (A/B) processed sequentially via /api/run-chain and /api/run-chain-b. Each chain uses Vercel after() to hand off to the next segment. A chain watchdog cron fires every 5 minutes during the run window to rescue stalled chains. Per-client timeout: 240 seconds (60s headroom under Vercel's 300s max).
Pipeline per client: fetchAllClientData() → parseAllData() → runRulesEngine() → runAIAnalysis() (6 Claude sub-agents: Keyword, Negative Keyword, Ad Evaluation, Headline, Description, Reviewer) → classifyActions() → applyCircuitBreaker() → execute or queue for approval.
Chain halt was fixed in PRs #8, #10, and #13. The reject/intervene modal sync path was fixed in PR #12. PR #11 (ClickUp subtask ID capture) is open as of June 2026.
Cron schedule (all times AEST)
| Job | Schedule | Purpose |
|---|---|---|
| Full pipeline | Monday 7:00 AM | All 14 clients, chained A/B |
| Weekly digest | Monday 9:00 AM | Portfolio summary to Slack |
| Anomaly check | Mon-Fri 8:00 AM | Daily anomaly detection + Slack alerts |
| Monthly report | 1st of month 7:00 AM | Monthly trend reports |
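Vercel evaluates cron expressions in UTC, so the AEST times in this table have to be shifted before they go into a cron definition. The helper below is a sketch of that conversion for weekly jobs, assuming a fixed UTC+10 offset (AEST, no daylight saving). `weeklyAestToUtcCron` is a hypothetical name, and whether the repos generate their cron strings this way is not stated in this runbook.

```typescript
const AEST_OFFSET_HOURS = 10; // AEST is UTC+10; daylight saving is deliberately ignored here

// dayOfWeek uses the cron convention: 0 = Sunday ... 6 = Saturday.
function weeklyAestToUtcCron(dayOfWeek: number, hour: number, minute: number): string {
  let utcHour = hour - AEST_OFFSET_HOURS;
  let utcDay = dayOfWeek;
  if (utcHour < 0) {
    // Before 10:00 AM AEST the UTC time falls on the previous day.
    utcHour += 24;
    utcDay = (dayOfWeek + 6) % 7;
  }
  return `${minute} ${utcHour} * * ${utcDay}`;
}

// Monday 7:00 AM AEST (full pipeline) is Sunday 21:00 UTC.
const fullPipelineCron = weeklyAestToUtcCron(1, 7, 0); // "0 21 * * 0"
```

The same shift moves the Mon-Fri 8:00 AM anomaly check to Sunday-Thursday 22:00 UTC. The monthly report is the awkward case: 7:00 AM AEST on the 1st falls on the previous calendar day in UTC, so it needs separate handling.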
Action tiers
| Tier | Actions | Approval |
|---|---|---|
| Tier 1 | Pause keywords, add negative keywords | Auto in tiered mode |
| Tier 1.5 | Ad copy rewrites | Always requires approval |
| Tier 2 | Budget changes, bid changes, campaign/ad group status changes | Always requires approval |
Safety bumps: confidence < 0.7 → Tier 2. >3 entities in same ad group → Tier 2. >15% cumulative spend impact → all Tier 2.
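The tiering and safety bumps can be read as a base tier per action type plus three overrides. The sketch below is one way to express that. The action type names, the `ProposedAction` shape, and the function names are hypothetical. Treating ">3 entities in same ad group" as "more than 3 proposed actions targeting one ad group" and summing per-action spend impact to get the cumulative figure are both assumptions.

```typescript
type Tier = 1 | 1.5 | 2;

// Hypothetical action type names.
type ActionType =
  | "pause_keyword"
  | "add_negative_keyword"
  | "rewrite_ad"
  | "change_budget"
  | "change_bid"
  | "change_status";

interface ProposedAction {
  type: ActionType;
  adGroupId: string;
  confidence: number;     // 0..1
  spendImpactPct: number; // percent of spend this action affects (assumed field)
}

function baseTier(type: ActionType): Tier {
  switch (type) {
    case "pause_keyword":
    case "add_negative_keyword":
      return 1;
    case "rewrite_ad":
      return 1.5;
    default:
      return 2; // budget, bid, and status changes
  }
}

function classifyWithBumps(actions: ProposedAction[]): Tier[] {
  // Circuit breaker: > 15% cumulative spend impact bumps everything to Tier 2.
  const totalImpact = actions.reduce((sum, a) => sum + a.spendImpactPct, 0);
  if (totalImpact > 15) return actions.map((): Tier => 2);

  // Count proposed actions per ad group.
  const perGroup = new Map<string, number>();
  for (const a of actions) {
    perGroup.set(a.adGroupId, (perGroup.get(a.adGroupId) ?? 0) + 1);
  }

  return actions.map((a): Tier => {
    if (a.confidence < 0.7) return 2;
    if ((perGroup.get(a.adGroupId) ?? 0) > 3) return 2;
    return baseTier(a.type);
  });
}
```

Bumps only ever move an action up to Tier 2 in this sketch, so an action that already requires approval is unaffected.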
Slack channel: #hh-ads-autopilot
MCP read: my-mcp-server-nwchtq.fly.dev | MCP write: hedgehog-ads-writer.fly.dev
HH Meta Ads
Detection logic
| Rule | Condition |
|---|---|
| Ad set pause | Spend ≥ threshold with 0 conversions, or CPA > N× account average. TOFU/awareness objectives (OUTCOME_AWARENESS, REACH, VIDEO_VIEWS) are excluded |
| Creative fatigue | Dedicated fatigue detection module monitors frequency and performance decay |
| Landing pages | Landing page performance analysis module |
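The ad set pause rule, including the TOFU exclusion, might look like the predicate below. This runbook does not give the spend threshold or the CPA multiplier N, so both are passed in as parameters. The type and function names are hypothetical.

```typescript
// Objectives excluded from pause checks, as listed in the table above.
const EXCLUDED_OBJECTIVES = new Set(["OUTCOME_AWARENESS", "REACH", "VIDEO_VIEWS"]);

// Hypothetical shape for an ad set's stats.
interface AdSetStats {
  objective: string;
  spend: number;
  conversions: number;
}

// The real threshold values are not documented here; callers must supply them.
interface PauseThresholds {
  minSpend: number;
  cpaMultiplier: number;
}

function shouldPauseAdSet(
  adSet: AdSetStats,
  accountAvgCpa: number,
  thresholds: PauseThresholds
): boolean {
  // Awareness-style ad sets are not expected to convert, so never pause them on CPA.
  if (EXCLUDED_OBJECTIVES.has(adSet.objective)) return false;
  if (adSet.conversions === 0) return adSet.spend >= thresholds.minSpend;
  return adSet.spend / adSet.conversions > thresholds.cpaMultiplier * accountAvgCpa;
}
```

Checking the objective first means an awareness campaign with high spend and zero conversions never reaches the spend test.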
Architecture
A single /api/run-all endpoint processes all clients sequentially (no chain runner), which makes this autopilot simpler than the Google Ads one. The original A/B chain pattern was removed in PR #6 after the two chains ping-ponged into an HTTP 508 (Loop Detected) error. Approval cards include Hook Rate and Hold Rate metrics. Competitor ad intelligence comes from the Meta Ads Library (PR #23).
Cron schedule (all times AEST)
| Job | Schedule | Purpose |
|---|---|---|
| Full pipeline | Monday 8:00 AM | All clients sequentially |
| Weekly analysis | Monday 8:30 AM | Creative fatigue, landing pages, digest |
| Anomaly check | Mon-Fri 9:00 AM | Daily snapshot + threshold checks |
| Expire objections | Daily 4:00 PM | Expire held client objections older than 7 days |
Slack channel: #hh-meta-autopilot
MCP: hedgehog-meta-ads MCP server
MM Google Ads (Lightning McQueen)
Detection logic
Identical thresholds to HH Google Ads: keyword pause at $50/0 conversions, negative keyword at $20/0 conversions, ad rewrite at CTR < 1% after 1,000 impressions, circuit breaker at >15% cumulative spend impact. Same 3-tier action classification.
Architecture
Uses staggered per-client crons (3-minute gaps) instead of the A/B chain runner. Each client is its own Vercel cron invocation. This avoids the chain stall bugs that plagued HH.
Cron schedule (all times AEST)
| Time | Client or job |
|---|---|
| Mon 7:00 AM | Auto Response |
| Mon 7:03 AM | Automotive Insight |
| Mon 7:06 AM | Ballarat Roadworthy |
| Mon 7:09 AM | Ballarat Service Centre |
| Mon 7:12 AM | Core Diesel |
| Mon 7:15 AM | Noranda Service Centre |
| Mon 7:18 AM | Performance Plus |
| Mon 7:21 AM | Ultra Tune North Ryde |
| Mon 7:30 AM | Weekly digest |
| Mon-Fri 8:00 AM | Anomaly check (all clients) |
| 1st of month 7:00 AM | Monthly report |
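The staggered client slots follow a simple pattern: a start time plus a fixed gap per client. The sketch below reproduces the Monday slots from a client list. The function and type names are hypothetical, and the real crons may simply be written out by hand.

```typescript
interface Slot {
  client: string;
  time: string; // 24-hour "HH:MM", AEST
}

function pad2(n: number): string {
  return (n < 10 ? "0" : "") + String(n);
}

// startMinutes is minutes after midnight; each client runs gapMinutes after the previous one.
function staggerClients(clients: string[], startMinutes: number, gapMinutes: number): Slot[] {
  return clients.map((client, i) => {
    const total = startMinutes + i * gapMinutes;
    const hours = Math.floor(total / 60) % 24;
    const minutes = total % 60;
    return { client, time: `${pad2(hours)}:${pad2(minutes)}` };
  });
}

const mmClients = [
  "Auto Response",
  "Automotive Insight",
  "Ballarat Roadworthy",
  "Ballarat Service Centre",
  "Core Diesel",
  "Noranda Service Centre",
  "Performance Plus",
  "Ultra Tune North Ryde",
];

// 7:00 AM start with 3-minute gaps, matching the table above.
const mondaySlots = staggerClients(mmClients, 7 * 60, 3);
```

With eight clients the last slot lands at 07:21, which leaves nine minutes before the 7:30 AM weekly digest. Adding a ninth or tenth client at the same gap would still fit, but an eleventh would collide with the digest.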
Slack channel: #mm-ads-autopilot
MCP read: mechanic-mcp-server.fly.dev | MCP write: hedgehog-ads-writer.fly.dev (shared, multi-MCC)
ClickUp bot name: Lightning McQueen (bot user ID 89551429)
Approval flow
Approval decisions are recorded in the Supabase `approvals` table and synced to ClickUp. The Burrow portal also surfaces pending approvals for client-side review.
Intervening
Step-by-step for a human taking over:
- Pause for a client: Set the client's row to `is_active = false` in the relevant Supabase table (`google_ads_clients` or `meta_clients`). The next cron run will skip them.
- Halt a running chain (HH Google Ads only): Set `chain_status = 'halted'` in the `chain_runs` table. The watchdog will not restart it.
- Check what was last done: Query the Supabase `approvals` table filtered by client and date. ClickUp subtasks under the autopilot's parent task also show the full action log.
- Reverse a change in Google Ads: Use the Google Ads UI or the `hedgehog-ads-writer` MCP to undo the specific action (re-enable a paused keyword, remove a negative keyword, revert ad copy).
- Reverse a change in Meta Ads: Use Meta Ads Manager or the `hedgehog-meta-ads` MCP to undo the action.
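To make the effect of the pause step concrete, the sketch below models client rows in memory and shows how a run that filters on `is_active` would skip a paused client. It is an illustration of the behaviour only and does not call Supabase. Only the `is_active` column comes from this runbook; the `id` and `name` fields and both function names are hypothetical.

```typescript
// Hypothetical row shape; only is_active is taken from the runbook.
interface ClientRow {
  id: string;
  name: string;
  is_active: boolean;
}

// Equivalent in spirit to setting is_active = false on one row.
function pauseClient(rows: ClientRow[], clientId: string): ClientRow[] {
  return rows.map((row) => (row.id === clientId ? { ...row, is_active: false } : row));
}

// What the next cron run would pick up if it filters on is_active.
function clientsForNextRun(rows: ClientRow[]): ClientRow[] {
  return rows.filter((row) => row.is_active);
}

const before: ClientRow[] = [
  { id: "c1", name: "Client One", is_active: true },
  { id: "c2", name: "Client Two", is_active: true },
];

const after = pauseClient(before, "c2");
const nextRun = clientsForNextRun(after).map((row) => row.name); // ["Client One"]
```

Pausing is reversible by setting the flag back to true. It does not undo actions the autopilot has already taken, which is what the two "Reverse a change" steps are for.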
Failure modes
- Chain halts mid-run (HH Google Ads): Clients with no active campaigns complete in ~6 seconds, which can cause Vercel to reap the function before `after()` registers. The watchdog cron rescues stalled chains. Fixed across PRs #8, #10, #13 but watch for recurrence with new clients.
- Slack card posted but buttons unresponsive: The `trigger_id` expires if there's a double `after()` hop (fixed in PR #12). Also check that the Slack app's interactivity URL is correctly pointed at the autopilot's `/api/slack/interactions` endpoint.
- Duplicate recommendations: Expired feedback rules staying `is_active = true` despite being filtered at query time (fixed in PR #15). If duplicates reappear, check the `feedback_rules` table for stale active rows.
- MCP 503s under load: The Fly.io MCP servers can return 503 when multiple autopilots hit them simultaneously. MM autopilot uses 3-minute client gaps to mitigate. HH uses chain sequencing. If 503s spike, check Fly.io machine status.
- Negative keyword execution failures: Campaign IDs missing from AI agent output, or `keywords` passed as array instead of comma-separated string. Fixed in PRs #9-#10. Watch for regressions when prompt templates change.
- Meta HTTP 508 loop: Original A/B chain pattern caused ping-pong. Replaced with single `/api/run-all` in PR #6. Should not recur unless chain architecture is reintroduced.
- Wrong account ID stored (Meta): In the Visions incident (Apr 2026), the wrong ad account was stored because of similarly named accounts. Always get the account ID from the client's own Ads Manager, not from search results.
- Client added without service verification (Meta): In the Doctors on Demand incident (Apr 2026), a client was added to `meta_clients` without confirming Meta Ads was a purchased service. Both repos now have a service-scope verification gate.
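The negative keyword failure mode above comes down to two input problems: a missing campaign ID and `keywords` arriving as an array. A defensive normalisation step along these lines is one way to guard against regressions. It is a sketch, not the fix that shipped in PRs #9-#10, and the `NegativeKeywordAction` shape and function names are hypothetical.

```typescript
// Accepts either form and always returns a comma-separated string.
function normalizeKeywords(keywords: string | string[]): string {
  const list = Array.isArray(keywords) ? keywords : keywords.split(",");
  return list
    .map((k) => k.trim())
    .filter((k) => k.length > 0)
    .join(",");
}

// Hypothetical shape of what an AI agent might emit for a negative keyword action.
interface NegativeKeywordAction {
  campaignId?: string;
  keywords: string | string[];
}

function prepareNegativeKeywordCall(
  action: NegativeKeywordAction
): { campaignId: string; keywords: string } {
  // Fail loudly before execution rather than sending a call that cannot succeed.
  if (!action.campaignId) {
    throw new Error("Missing campaign ID in agent output");
  }
  return { campaignId: action.campaignId, keywords: normalizeKeywords(action.keywords) };
}
```

Normalising at the boundary means a prompt template change that flips the output format no longer breaks execution, and a missing campaign ID surfaces as a clear error instead of a failed write.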