
Cost and Budget

Where Claude costs come from, how prompt caching cuts them by ~70%, where to see total cost per workflow run, and how to set spend caps.

What you actually pay for

Two distinct cost streams in TikTok Army:

  1. Claude API costs — every agent step that calls Claude. Typically pennies per call; a full Campaign Launch runs in the low single-digit dollars for a small brand.
  2. TikTok Ads spend — only when you run the Campaign Launch workflow and approve the ad spend gate. This is the paid media budget you set on the campaign brief, not a cost of using TikTok Army itself.

This article covers Claude costs. TikTok Ads spend is whatever you approve at gate #3 of How to Launch a Campaign.

In mock mode (the default for local dev — CLAUDE_MODE=mock), no real Claude calls are made and all costs are simulated: you'll see realistic-looking dollar figures in the dashboard, but nothing is actually charged. Useful for testing workflows end-to-end without a Claude key.

How Claude costs are calculated

Three Claude models are used across the fleet. Pricing (per 1M tokens) is pinned in ~/projects/tiktok-army/tiktok_army/lib/claude.py:111:

Model               Input    Output   Cache read   Cache write
Claude Opus 4.7     $15.00   $75.00   $1.50        $18.75
Claude Sonnet 4.6   $3.00    $15.00   $0.30        $3.75
Claude Haiku 4.5    $0.80    $4.00    $0.08        $1.00

Which model each agent uses is set per-agent (see the catalog — typical_model on each AgentSpec in ~/projects/tiktok-army/tiktok_army/agents/_catalog.py):

  • Haiku — Account Health, Trend Watcher, Comment Triage, Performance Feedback, Catalog Sync, Inventory Sync. Fast classification and triage work.
  • Sonnet — Audience Mapper, Compliance, Listing Optimizer, Content Producer, Ad Campaign Director, Creator Outreach, Live Stream Ops, plus the synthesis step. Reasoning and drafting work.
  • Opus — Reserved for orchestration/planning where it's needed; not used by default in any of the seeded workflows. (Available on the model parameter of call_claude_cached.)
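Per the catalog reference above, here is a stripped-down sketch of how typical_model might sit on each spec — the real AgentSpec in agents/_catalog.py has more fields, and the entries below are just examples drawn from the lists:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentSpec:
    """Stripped-down sketch; the real AgentSpec carries more fields."""
    name: str
    typical_model: str  # which Claude tier this agent defaults to

CATALOG = [
    AgentSpec("comment_triage", "haiku-4.5"),     # fast classification work
    AgentSpec("content_producer", "sonnet-4.6"),  # reasoning and drafting work
]

# Triage-style agents default to Haiku; drafting agents to Sonnet.
assert CATALOG[0].typical_model.startswith("haiku")
```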

A single Claude call rolls up like this:

cost_usd = (input_tokens   × input_rate
          + output_tokens  × output_rate
          + cache_read_tokens  × cache_read_rate
          + cache_write_tokens × cache_write_rate) / 1_000_000

Tokens are measured by Anthropic's tokenizer, not character counts. A rough rule: 1 token ≈ 4 characters of English.
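Under the table's rates, the roll-up can be sketched as a small helper. PRICING and call_cost_usd are illustrative names — the real pinned dict is _PRICING in lib/claude.py:

```python
# Per-1M-token rates, copied from the pricing table above.
PRICING = {
    "sonnet-4.6": {"input": 3.00, "output": 15.00,
                   "cache_read": 0.30, "cache_write": 3.75},
    "haiku-4.5":  {"input": 0.80, "output": 4.00,
                   "cache_read": 0.08, "cache_write": 1.00},
}

def call_cost_usd(model: str, input_tokens: int, output_tokens: int,
                  cache_read_tokens: int = 0,
                  cache_write_tokens: int = 0) -> float:
    """Cost of a single Claude call; mirrors the formula above."""
    r = PRICING[model]
    return (input_tokens * r["input"]
            + output_tokens * r["output"]
            + cache_read_tokens * r["cache_read"]
            + cache_write_tokens * r["cache_write"]) / 1_000_000

# A typical Sonnet step: 2,000 input tokens, 500 output tokens.
print(round(call_cost_usd("sonnet-4.6", 2_000, 500), 4))  # → 0.0135
```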

Prompt caching — the big saving

Every Claude call in TikTok Army uses prompt caching. The wrapper at lib.claude.call_claude_cached marks the system prompt with cache_control: ephemeral, which tells Anthropic to keep it in a 5-minute cache. Subsequent calls within that window with the same system prompt get charged the cache read rate instead of the full input rate — about 90% off.

Concretely on Sonnet:

  • First call with a 2,000-token brand profile in the system prompt: ~$0.0075 to write the cache (2,000 tokens × $3.75/1M cache write), plus input cost for the per-request user message and output costs.
  • Every call within 5 minutes after, same system prompt: ~$0.0006 for the cache read (2,000 × $0.30/1M), plus output. About 10× cheaper on the cached portion.
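The arithmetic behind those bullets can be checked directly against the Sonnet rates in the pricing table ($3.00/1M input, $3.75/1M cache write, $0.30/1M cache read):

```python
SYSTEM_TOKENS = 2_000  # brand profile held in the system prompt

# First call: the system prompt is charged at the cache-write rate.
first_call_system_cost = SYSTEM_TOKENS * 3.75 / 1_000_000    # $0.0075
# Calls inside the 5-minute window: same tokens at the cache-read rate.
cached_call_system_cost = SYSTEM_TOKENS * 0.30 / 1_000_000   # $0.0006
# Without caching, every call would pay the full input rate.
uncached_system_cost = SYSTEM_TOKENS * 3.00 / 1_000_000      # $0.0060

# Savings on the system-prompt portion once the cache is warm:
print(round(uncached_system_cost / cached_call_system_cost))  # → 10
```

Note that the cache write itself costs slightly more than a plain input pass ($3.75 vs $3.00 per 1M), so caching only pays off when the same system prompt is read back at least once within the window — which is exactly the multi-call pattern the agents run.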

The Comment Triage agent benefits enormously from this — it makes one call per comment and they all share the same brand-voice system prompt. A batch of 50 comments runs at the cache-read rate from comment 2 onward.

Why it works. The orchestrator runs steps as fast as it can, and most workflow runs complete inside the 5-minute window. The cache only misses on the first call of each agent, plus any call where the system prompt actually changed.

What breaks the cache. The system prompt has to be byte-identical call to call. Putting a timestamp, a UUID, or per-request data into the system prompt invalidates it and you pay full input rate every time. The Claude wrapper enforces the discipline by separating system_stable (cached) from user_message (per-request, never cached).
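That separation can be sketched as the payload shape the wrapper plausibly builds. build_request is a hypothetical name; the cache_control block follows Anthropic's Messages API prompt-caching format:

```python
def build_request(system_stable: str, user_message: str) -> dict:
    """Sketch of the payload discipline: system_stable must be
    byte-identical call to call; anything per-request (timestamps,
    IDs, comment text) goes in user_message instead."""
    return {
        "system": [
            {
                "type": "text",
                "text": system_stable,  # cached — never embed time.time() here
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": user_message}],
    }

brand_voice = "You are the brand voice for Acme Co. Tone: warm, concise."
req1 = build_request(brand_voice, "Triage comment #1: 'love this!'")
req2 = build_request(brand_voice, "Triage comment #2: 'where is my order?'")

# Identical system block → the second call hits the 5-minute cache.
assert req1["system"] == req2["system"]
```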

Where to see total cost per run

Three views in the dashboard.

Run summary card. On the run page, top-right: Total cost: $0.X. Pulled from the total_cost_usd column on tiktok_workflow_runs. Sums every step's cost_usd, which in turn is the sum of every Claude call's cost via tiktok_agent_steps. See ~/projects/tiktok-army/migration/009_tiktok_workflows.py:189 for the column definition.

Per-step breakdown. Inside the run page, each step row shows its individual cost. Useful for spotting an expensive outlier — an agent that ran 50 Claude calls when others ran 2.

Per-call trace. Click any agent step → "View trace" → see every tiktok_agent_steps row for that run. Each row is one Claude call (or one provider call, etc.) with input tokens, output tokens, cache reads, cache writes, and exact dollar cost. This is the lowest-level cost detail and it's where you go when "the audit cost $4 last time and $0.40 this time, why?"
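The relationship between the three views is just nested sums: each tiktok_agent_steps row carries one call's cost_usd, a step's cost is the sum of its calls, and total_cost_usd on the run is the sum of its steps. A toy sketch with made-up rows:

```python
# Hypothetical trace rows for one run; each row is one Claude call.
agent_step_rows = [
    {"step": "account_health",  "cost_usd": 0.0021},
    {"step": "account_health",  "cost_usd": 0.0019},
    {"step": "audience_mapper", "cost_usd": 0.0135},
]

# Per-step breakdown (the middle dashboard view).
per_step: dict[str, float] = {}
for row in agent_step_rows:
    per_step[row["step"]] = per_step.get(row["step"], 0.0) + row["cost_usd"]

# Run summary card: total_cost_usd on tiktok_workflow_runs.
total_cost_usd = sum(per_step.values())
print(round(total_cost_usd, 4))  # → 0.0175
```

Reading the views bottom-up like this is how you answer the "$4 last time, $0.40 this time" question: find the step whose sum changed, then the calls inside it.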

Daily spend caps

Two layers — one hard-enforced in the codebase today, one convention:

Per-call spend cap on paid TikTok Ads launches. tiktok_army/lib/spend_cap.py wraps axion_studio.lib.spend_cap.charge. The Ad Campaign Director (and any agent that launches ads) must call charge(...) before invoking TikTok Ads. The function fails fast if the requested amount would exceed the workspace's daily cap. This is hard-enforced — the agent will fail the workflow step, not silently overspend.
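A minimal sketch of that fail-fast shape — the real charge lives in axion_studio.lib.spend_cap, and the class and field names here are illustrative:

```python
class SpendCapExceeded(RuntimeError):
    pass

class DailySpendCap:
    """Illustrative hard cap: charge() either succeeds or raises."""
    def __init__(self, daily_cap_usd: float):
        self.daily_cap_usd = daily_cap_usd
        self.spent_today_usd = 0.0

    def charge(self, amount_usd: float) -> None:
        # Check BEFORE committing any spend — fail fast, never overspend.
        if self.spent_today_usd + amount_usd > self.daily_cap_usd:
            raise SpendCapExceeded(
                f"charge of ${amount_usd:.2f} would exceed the "
                f"daily cap of ${self.daily_cap_usd:.2f}"
            )
        self.spent_today_usd += amount_usd

cap = DailySpendCap(daily_cap_usd=500.00)
cap.charge(300.00)         # fine
try:
    cap.charge(250.00)     # would take the day to $550 — refused
except SpendCapExceeded:
    print("step fails before any ad spend is committed")
```

The key property is that the exception fires before the TikTok Ads call is made, which is why the workflow step fails loudly rather than overspending silently.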

Per-call Claude budget guards. Not enforced at the model wrapper today (April 2026). The convention is that workflow definitions set a reasonable max_tokens per step, and the synthesis step caps at 2,048 (see runner.py:432). A claude_spend_cap is on the roadmap: it would block any call that would push the workspace past a daily Claude budget.

For now, the practical defense against runaway Claude cost is:

  • Run in mock mode while iterating on workflows.
  • Watch the run summary cost on the first real-mode run of any new workflow.
  • Set a reasonable max_tokens on agents whose options include it.

Rough cost benchmarks

These are mock-mode estimates from the fixture data — actual numbers in real mode depend on the size of the account being audited and the verbosity of the model on the day. Treat them as order-of-magnitude.

  • Profile Audit on a small account (~100 posts): $0.10 – $0.40.
  • Profile Audit on a large account (~5,000 posts): $0.80 – $2.50, mostly because Account Health and Audience Mapper have more data to chew through.
  • Campaign Launch: $1.50 – $5.00 in Claude costs alone, plus whatever you approve in TikTok Ads spend.
  • Post-Launch Loop (single run): $0.20 – $0.60. Designed to be cheap enough to run nightly.

Things to know

  • The total_cost_usd is approximate — token counts come from Anthropic's response and pricing is pinned in code. A retroactive Anthropic price change would show up as a discrepancy until _PRICING is updated. There's a comment in ~/projects/tiktok-army/tiktok_army/lib/claude.py:96 that says "Update if Anthropic prices change."
  • Costs accrue as a workflow runs. If you reject an approval gate halfway through, you've already paid for the steps that ran. That's typically pennies, but worth knowing.
  • Mock mode shows simulated dollar figures so the dashboard renders realistically. Don't trust the numbers in mock mode for budgeting — switch to real mode for the actual cost shape on your data.