The Anthropic cost story stopped being a vibes piece today. The Information confirmed Together AI is processing 400 trillion tokens a month (up from 30B a year ago), Hugging Face paid subs doubled in five months, Not Diamond’s router quietly saves coding customers 20-40% by silently demoting work to older Anthropic models, and Jeff Hunter (TheTip) is claiming Fable 5 just moved behind a metered paywall at $10/M input / $50/M output — outside Pro/Max/Team/Enterprise inclusion. The Hunter claim is editorial and unverified from a first-party source as of this brief — treat with caution until Anthropic publishes — but the direction is corroborated four ways.
What this means for your stack: Ben (Xero/Anthropic API), Always-On Reeve (Sonnet 4.6 daemon), and your daily Claude Code routine are all uncapped Anthropic spend that you’ve never seriously load-shed. This week, do two things: (1) verify Hunter’s Fable 5 pricing claim against the Anthropic pricing page / billing console before you commit to anything, and (2) cost out one week of Reeve heartbeats + Ben daily scans + your CC sessions. If the bill is sliding past $50/mo on PaperClip-budgeted agents, the GLM-5.2 / Together AI / OpenRouter escape valve is now mature enough to route the cheap turns through. The Apr 16 pricing-change tracker entry was the warning shot. This is the follow-through.
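A rough way to run the cost-out in step (2) is a small script like the one below. It is a minimal sketch, not a billing tool: the weekly token volumes are made-up placeholders to swap for figures from your billing console, the $10/M input / $50/M output rates are Hunter's unverified claim used only as a worst-case ceiling, and every name in it (WeeklyUsage, monthly_projection) is illustrative.

```python
from dataclasses import dataclass

# ASSUMPTION: Hunter's unverified Fable 5 rates, used as a worst-case ceiling.
INPUT_USD_PER_M = 10.00
OUTPUT_USD_PER_M = 50.00
MONTHLY_BUDGET_USD = 50.00   # the $50/mo threshold mentioned in the brief
WEEKS_PER_MONTH = 52 / 12


@dataclass
class WeeklyUsage:
    """One week of token usage for a single surface (hypothetical shape)."""
    name: str
    input_tokens: int
    output_tokens: int

    def weekly_cost(self) -> float:
        # Rates are quoted per million tokens.
        return (self.input_tokens / 1e6 * INPUT_USD_PER_M
                + self.output_tokens / 1e6 * OUTPUT_USD_PER_M)


def monthly_projection(usages) -> float:
    """Scale one measured week up to an average month."""
    return sum(u.weekly_cost() for u in usages) * WEEKS_PER_MONTH


# PLACEHOLDER volumes: replace with real numbers from the billing console.
usage = [
    WeeklyUsage("Reeve heartbeats", 400_000, 40_000),
    WeeklyUsage("Ben daily scans", 300_000, 60_000),
    WeeklyUsage("Claude Code sessions", 1_500_000, 200_000),
]

projected = monthly_projection(usage)
over_budget = projected > MONTHLY_BUDGET_USD

for u in usage:
    print(f"{u.name}: ${u.weekly_cost():.2f}/week")
print(f"Projected month: ${projected:.2f} (over budget: {over_budget})")
```

Swapping the two rate constants for the confirmed price of whichever model each surface actually calls turns this from a worst case into an estimate.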
1 What to Know Today
Tier 1 — ElevenLabs Ads Engine ships inside ElevenCreative (MACA)
Verified shipped. ElevenLabs pushed an Ads Engine that pulls ad creatives directly from connected Google/Meta/LinkedIn accounts, localizes across 50+ languages using Dubbing v2 (preserves the original voice, emotion, pacing), adapts text overlays, and pushes finished creatives back to the platform. Point-and-click only at launch: no developer API yet, which is the only reason this isn’t an emergency. Recommended action: spend 20 minutes this week running one of your UBX South Bank ads through it as a localisation pass (Mandarin, Vietnamese, Spanish; the South Brisbane premium-CrossFit demographic skews multicultural) and capture the output as a MACA differentiator demo. The moment ElevenLabs ships the API, “localized ad variants” stops being a MACA wedge and becomes a feature anyone with a credit card has. Get the demo in your pocket before that happens.
Tier 1 — Practicaly’s overnight “build a second brain” Codex + Obsidian loop (Always-On Reeve)
Beta / verified pattern. Nick Vasiles published a working pattern this morning: install Codex, connect email/docs/notes/calendar, point it at a fresh Obsidian vault, run /goal mode for ~12 hours, wake up to a structured searchable knowledge base. This is the Always-On Reeve Phase 2 demo you’ve been describing in voice notes — except it’s Codex, not Claude. Recommended action: clone the pattern to validate the architecture (one overnight run on a throwaway vault), then port the loop shape into Reeve’s headless prompt — the value is the heartbeat/scheduling shape, not the model. If it works, that’s your first credible “Reeve did this while I slept” artefact to show clients during CourseBuilds Aria pitches.
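The heartbeat/scheduling shape mentioned above could look roughly like the loop below. This is a sketch under assumptions, not Practicaly's or Reeve's actual implementation: run_heartbeat_loop, its parameters, and the idea that each beat is a callable returning True when the goal is reached are all hypothetical. The clock and sleep functions are injectable so the loop can be exercised without waiting 12 hours.

```python
import time
from typing import Callable


def run_heartbeat_loop(
    step: Callable[[int], bool],
    interval_s: float,
    max_runtime_s: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call step(beat_number) once per interval until it returns True
    (goal reached) or max_runtime_s has elapsed.

    Returns the number of beats that were executed.
    """
    start = clock()
    beats = 0
    while clock() - start < max_runtime_s:
        beats += 1
        if step(beats):
            break              # the step reported the goal as done
        sleep(interval_s)      # wait for the next heartbeat
    return beats
```

For an overnight run, step would wrap one headless model invocation, with something like interval_s=900 and max_runtime_s=12 * 3600. The hard runtime ceiling is the part worth keeping regardless of which model sits inside the step.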
Tier 1 — Open-source escape valve goes mainstream (Ben, Reeve, CC routines)
Verified: Information cited specific numbers. Together AI’s annualised revenue projection has been raised 3+ times in recent months; tokens processed jumped from 30B/month a year ago to 400T/month now. Hugging Face paid subs doubled Jan-June. CEO Vipul Ved Prakash: at DeepSeek v4 Pro’s ~18¢/M tokens, the spend delta vs. closed labs is roughly $70M/month for Together’s customer base. Cisco and Adobe are both using model routers; Not Diamond is shifting AI-coding work to older Anthropic models and saving customers 20-40%. Recommended action: wire OpenRouter or Not Diamond as a fallback provider behind Ben for non-judgment turns (categorisation, balance-reading, settlement parsing) and keep Opus only for the corrections-learning path. Same pattern for Reeve’s heartbeat checks. The 20-40% savings number is real customer data, not vendor marketing.
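The split between non-judgment turns and the corrections-learning path can be sketched as a small routing function. Everything here is an assumption for illustration: the task labels are lifted from the recommendation above, but the Route type, the model placeholder strings, and the provider callables are hypothetical and do not reflect Ben's real code or any vendor SDK. Unknown task types default to the premium tier, on the theory that overpaying is the safer failure than degrading a judgment call.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Task labels taken from the brief; which tier each belongs to is the design choice.
CHEAP_TASKS = {"categorisation", "balance-reading", "settlement-parsing"}


@dataclass(frozen=True)
class Route:
    tier: str    # "cheap" or "premium"
    model: str   # PLACEHOLDER identifier, not a real model slug


CHEAP = Route("cheap", "cheap-model-placeholder")
PREMIUM = Route("premium", "opus-placeholder")

Provider = Callable[[str, str], str]   # (model, prompt) -> completion text


def pick_route(task_type: str) -> Route:
    """Cheap tier for known non-judgment turns, premium for everything else."""
    return CHEAP if task_type in CHEAP_TASKS else PREMIUM


def call_with_fallback(task_type: str, prompt: str,
                       providers: Dict[str, Provider]) -> str:
    """Try the routed tier; if a cheap-tier call fails, retry on premium."""
    route = pick_route(task_type)
    try:
        return providers[route.tier](route.model, prompt)
    except Exception:
        if route.tier == "premium":
            raise              # nothing left to fall back to
        return providers["premium"](PREMIUM.model, prompt)
```

In practice the two entries in providers would be thin wrappers around whichever clients you settle on, and the same pick_route table could serve Reeve's heartbeat checks.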
2 What You Already Know That Most People Don't
Josh Elman (a16z) just published your harness thesis
Elman joined a16z as consumer partner today and his opening essay (“The World-Building Doors Are Open, Again”) delivers the line: “It isn’t the models that matter, but the harnesses, loops, and context which will lead to so many new opportunities ahead.” You’ve been building exactly this since Ben — Agent SDK + MCP (Xero, Google Workspace) + Telegram + Playwright + SQLite + PaperClip heartbeat, all wrapped around Claude as a replaceable inference engine. ben/tools/paperclip_client.py is the harness. ~/Reeve/reeve-headless.md + HEARTBEAT.md + GUARDRAILS.md is the loop. Your LEARNINGS.md / CAPABILITIES_WANTED.md is the context layer that survives model swaps. While the discourse argues about Opus 4.8 vs Fable 5 vs GLM-5.2, you’ve already shipped the architecture Elman is now telling LPs to look for. Use this in your CourseBuilds pitch deck — “we already build at the layer a16z is now funding.”
3 Worth a Deeper Look This Week
Sakana Fugu — the routing model that benchmarks above Anthropic’s frontier
[Source: Sakana AI launch via Information + Rundown, 2026-06-23]. Tokyo’s Sakana shipped Fugu and Fugu Ultra — orchestration models that route each request to a proprietary pool of underlying models behind one API. Information cites benchmarks: Fugu Ultra 93.2 vs Fable 5’s 89.8 on LiveCodeBench, Fugu/Fugu Ultra 95.5 vs Mythos Preview 94.6 on GPQA-D. Ethan Mollick and others flag that real-world performance lags the benchmark numbers, and the model mix is a black box. Why it’s worth 30 mins for you: the routing pattern is exactly what Ben needs — Opus for corrections-learning, GLM-5.2 / Mythos / cheaper Sonnets for categorisation. Even if you never use Fugu itself, reading the architecture sharpens the case for shipping a router layer into MACA + Ben before the Anthropic pricing pressure (see PAY ATTENTION) actually hits your card. Treat the benchmarks as a Sakana marketing artifact, not gospel — read for the architecture, not the leaderboard.
4 Conversation Capital
“Anthropic’s paying SpaceX one-and-a-quarter billion dollars a month for Colossus compute. Google’s paying nine-twenty million. Together AI just hit four hundred trillion tokens a month processed — up from thirty billion a year ago. Not Diamond’s router is quietly saving its customers twenty to forty percent by silently demoting work to older Anthropic models. The model wars aren’t ending — the pricing wars are starting. The question for every CIO right now isn’t which model is best, it’s whether their stack is router-ready.”
Use case: Drop into any Aria, Rio Tinto, or AI-pro conversation where someone says “we’re standardising on Claude” or “we’re worried about model lock-in.” Specific dollar figures land — they’re all from The Information (June 23) so you can defend every number. Pivots a vague worry into a concrete architectural decision: router-ready or not.
5 Something You Haven't Thought About
The Codex → Claude-Opus adversarial review loop. Practicaly buried this in their pro-tip section: after Codex finishes designing an API or schema, automatically pipe the output into claude -p for an Opus second opinion. Two-model adversarial review catches blind spots neither sees alone. You haven’t wired this into MACA, Ben, or Reeve, but the pattern is a 30-minute add to any Claude Code routine: tee the output, fire codex exec "review this for X", and fail loudly if the reviewer disagrees. Cost is trivial (one extra round on a cheap model), value is real (catches the class of error where Opus is overconfident on its own work). Act/queue/drop: act this weekend for MACA’s ad-copy quality gate, since you’ve explicitly flagged copy quality as the blocker for pitching. An adversarial Opus-reviews-Codex (or Codex-reviews-Opus) pass on every ad variant is the cheapest possible “is this human-grade?” filter you can build before pitching. Doesn’t need a router, doesn’t need new infra.
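A quality gate along these lines could be sketched as follows. This is a hedged illustration rather than Practicaly's recipe: it assumes the claude CLI is installed and reads the draft from stdin in print mode (claude -p), and the APPROVE/REJECT first-line convention, the REVIEW_PROMPT wording, and the names review_gate and ReviewRejected are all invented here. The reviewer is passed in as a function so the other direction (Codex reviewing Opus) only needs a different callable.

```python
import subprocess
from typing import Callable

# ASSUMPTION: a made-up verdict convention so the gate can be parsed mechanically.
REVIEW_PROMPT = (
    "Review the following draft for errors and blind spots. "
    "Reply with APPROVE or REJECT on the first line, then your reasons."
)


class ReviewRejected(Exception):
    """Raised when the second model does not approve the draft."""


def claude_reviewer(draft: str) -> str:
    """Pipe the draft into the Claude Code CLI in print mode and return its reply."""
    result = subprocess.run(
        ["claude", "-p", REVIEW_PROMPT],
        input=draft,
        capture_output=True,
        text=True,
        check=True,   # a non-zero exit from the CLI also fails loudly
    )
    return result.stdout


def review_gate(draft: str,
                reviewer: Callable[[str], str] = claude_reviewer) -> str:
    """Return the draft unchanged if approved; raise ReviewRejected otherwise."""
    verdict = reviewer(draft).strip()
    first_line = verdict.splitlines()[0].strip().upper() if verdict else ""
    if not first_line.startswith("APPROVE"):
        raise ReviewRejected(verdict or "reviewer returned no verdict")
    return draft
```

Calling review_gate(ad_copy) on each ad variant before it leaves the pipeline gives the fail-loud behaviour: anything other than an explicit approval raises, including an empty reply.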
6 Skip File
- [TLDR — “SpaceX-Reflection $6.3B Colossus deal”]: covered in Conversation Capital via Information’s $1.25B/mo Anthropic comparison; standalone deal value isn’t actionable.
- [Information — “Trump signs sweeping quantum executive orders”]: AI-adjacent but not your stack.
- [Information — “Meta WhatsApp leader replaced by Cred CEO Kunal Shah”]: org news, no read-through.
- [Information — “ByteDance Seedance 2.5”]: video model, not in your build path.
- [Information — “Meta paused internal Model Capability Initiative after security lapse”]: governance lesson but already covered as a pattern in Apr 23 Meta keystroke-logging story.
- [Rundown — “Google DeepMind $75M in A24 filmmaking”]: media-AI, off-thesis.
- [Rundown — “Micron strategic deal + Series H investment in Anthropic”]: supply-chain plumbing.
- [Rundown — “Baseten $1.5B at $13B valuation”]: inference compute, infra-layer.
- [Practicaly — “NVIDIA water-cooling at 45°C”]: cool, not actionable.
- [Practicaly — “NotebookLM flashcards now editable”]: incremental feature.
- [Practicaly — “PAI Labs Tavily competitor brief tutorial”]: pattern useful but you already run discovery loops.
- [BagelBots — “Personal AI Operating System mega-prompt”]: filler, not actionable.
- [BagelBots — “Pew 49% AI chatbot adoption”]: stat-bait.
- [BagelBots — “Microsoft 20-year Chevron natural gas deal”]: data-center politics.
- [BagelBots — “Worldcoin Thailand bribery probe”]: gossip.
- [BagelBots — “GLM-5.2 from z.AI”]: already on your radar from yesterday; absorbed into Tier 1 escape-valve item.
- [TheTip — “Anthropic ID + selfie verification July 8”]: real but Team/Enterprise/API exempt — your Reeve and Ben surfaces are not in scope. Reference, don’t act.
- [TheTip — “AI Meet Live launch”]: tiny startup, $29/mo alpha, real pattern but not GA — queue for if/when CourseBuilds Aria embedded work needs an in-meeting agent.
- [TheTip — “Jeff Hunter hates social media”]: editorial, no signal.
- [a16z — “How to Win a Space War” (June 22)]: off-thesis.
- [Neil Patel — “ChatGPT Ads Manager opportunity”]: SEO-pitch promo for npdigital consulting, no new product info.
- [TLDR — “Alibaba HappyHorse 1.1 #2 globally”]: video model leaderboard.
- [TLDR — “Anthropic Cowork mobile + task scheduling”]: track for Reeve Phase 2 mobile surface but no detail yet.
- [TLDR — “claude-sonnet-5 slug spotted”]: rumour, no announcement.
- [TLDR — “Loop engineering” essay]: solid principles but you’re already shipping at this layer.
- [Information — “As Anthropic Costs Rise, Some Customers Eye Cheaper AI” promo]: redundant pointer to the Applied AI story already absorbed into Tier 1 and PAY ATTENTION.
Brief Metadata
- Sources scanned: 13 (9 primary newsletters returned 23 threads; 4 secondary-account queries returned nothing — account not connected this session)
- Items extracted: ~45
- Items surfaced: 7 (1 PAY ATTENTION cluster, 3 Tier 1, 1 anxiety-flip, 1 deeper-look, 1 first-mover pattern)
- Items skipped: 26
- Read time: ~8 minutes