BUILDING · proxy v0 · OpenAI + Anthropic first · shipping soon
FOUNDER · solo · built on AWS EKS
PRICING · beta · free · post-beta two tiers · BYOK + managed tokens
PROVIDERS · day-one targets · Anthropic · OpenAI · Google · Mistral · Groq · DeepSeek
SIGNUP · shapes what ships in v1
CALL · 20 minutes · tell me what you'd actually pay for · link below
DEVICES · desktop app + Chrome · track team spend per person · coming soon

Status · pre-launch · building
Day-one providers · all major LLM APIs
Two modes · BYOK or managed tokens
Hosted on · AWS EKS
Beta access · free

FILED · PRE-LAUNCH EDITION
BUILT IN PUBLIC · NO TRACKERS · NO BS

YOUR AI BILL HAS A PROBLEM. IT IS NOT A SECRET.

LLMBurner meters every dollar your AI burns — in the products you ship and across your team. One proxy for your code's LLM calls; a desktop app + Chrome extension for everyone else. Every cost, one panel. Currently being built — get in early.

BOOK A 20-MIN CALL →
JUST NOTIFY ME

Solo founder · I personally reply · krupesh.dev

ENTERPRISE LLM SPEND · 6 MONTHS
$3.5B → $8.4B
more than doubled in half a year · most teams have no idea where it went
SRC: Menlo Ventures · 2025

TEAMS WHOSE LLM BILL EXCEEDED THEIR FORECAST
78%
within the first year of deployment
SRC: a16z survey · 2024

PRODUCT SPEND · TO INSTALL
1 url
proxy your code's LLM calls · no SDK swap · no lock-in

TEAM SPEND · WHO BURNS WHAT
per seat
desktop app + Chrome extension · per person, per machine · coming soon

CLAIM 01.

Spend is exploding.

$8.4B

Enterprise LLM spend more than doubled in six months. The teams writing those cheques mostly can't tell you which feature spent what.

SRC: Menlo Ventures · 2025

CLAIM 02.

Budgets miss every time.

78%

Of enterprise AI teams report LLM spend exceeded their first-year projections. Surprise bills are the norm, not the exception.

SRC: Andreessen Horowitz · 2024

CLAIM 03.

Not re-routing is overpaying.

10×/yr

The cost of equivalent-quality inference drops 10× every year. Teams that don't re-route stay locked to last year's prices — paying multiples for the same answer.

SRC: a16z · "LLMflation" 2024

CLAIM 04.

The fix is one line.

1 url

To meter your product's own LLM calls — one base URL, no SDK swap, no middleware. For your team's AI spend, the desktop app + Chrome extension are coming.

SRC: How the proxy works.

FROM THE FOUNDER
PRE-LAUNCH · OPEN BUILD

I'm building this because I needed it.

I was shipping a small AI product — a Claude wrapper, the kind every indie builder ships these days. Every user request hit the API. Every reply ate tokens. And every month the bill arrived without telling me which feature, which user, or which prompt ate the most.

I looked for a tool that would just meter the calls and tell me where the money was going. What I found was either an enterprise observability suite priced for Series B, or a logging library I'd have to wire into every SDK. Nothing in between. Nothing for someone who just wanted to see the bill, route the cheap calls to a cheap model, and cap the budget before something runs away overnight.

So I'm building it. Solo, on AWS, in public. If you've ever opened an LLM invoice and squinted — I'd love to talk to you before I write the next line of code.

Krupesh · krupesh.dev
Book 20 minutes →

WHAT IT WILL LOOK LIKE.
ALL THE DATA. ONE SCREEN.

8 panels · design preview
numbers below are illustrative

Telemetry · spend / 24h · USD · LIVE

$84.17 · −12.4% vs. yesterday · forecast $340/mo · trending down

Provider	Model	Share	Spend
OpenAI	gpt-5	33%	$27.84
Anthropic	opus 4.7	31%	$26.22
Anthropic	sonnet 4.6	16%	$13.40
Google	gemini 2.5	11%	$9.63
Groq	llama 3.3	9%	$7.08

AI advisory · this week · RANKED BY $

Route support-bot through Haiku 4.5 instead of Opus 4.7.

opus 4.7 → haiku 4.5
40 prompts tested · quality Δ ≈ 0%

Saved / mo · −$340
Quality drop · ~0%

Alerts · last 4h · 5 EVENTS

14:07	CRIT	support-bot hit $30/day cap · calls blocked
13:54	WARN	Opus 4.7 at 80% of monthly budget
13:31	INFO	Suggested: gpt-5 → nano on /summary · −$58
12:48	WARN	DeepSeek V3 latency p95 above SLO · 1.84s
12:11	OK	Gemini Flash repriced · −13.6% / 1M tokens

Playground · same prompt, every model · RAN 14:06

"Reset my password. The link in the email isn't working."

opus 4.7 · current

"I understand how frustrating that is. Let me walk you through resetting your password step by step…"

$0.0186 · 412 ms · 186 tok

haiku 4.5 · ★ PICK

"Sorry — try this fresh reset link: [link]. Expires in 30 min. If it still fails, reply with the error."

$0.0014 · 189 ms · 42 tok

gpt-5

"Apologies for the trouble! I can help you reset your password. Could you click this new link: [link]…"

$0.0098 · 340 ms · 58 tok

Prompt history · 6 EDITS

Version	Author · age	Status
system_v3.2	@you · 2d	live
system_v3.1	@maya · 5d	pinned
system_v3.0	@you · 7d	rolled back
system_v2.9	@maya · 12d	archived

Live pricing · per 1M tokens · UPDATED 14:00

Model	In	Out	Change
haiku 4.5	$0.80	$4.00	−12%
gpt-5 nano	$0.05	$0.40	0%
gemini flash	$0.38	$1.50	−14%
llama 3.3	$0.09	$0.59	0%
sonnet 4.6	$3.00	$15.00	0%
opus 4.7	$15.00	$75.00	n/a
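Per-1M-token prices like the ones above turn into a per-call cost with simple arithmetic: tokens in times the input price, plus tokens out times the output price, divided by one million. A minimal sketch, assuming the illustrative prices from the preview table; `PRICES_PER_1M` and `call_cost` are hypothetical names for this example, not part of any shipped API.

```python
from decimal import Decimal

# Illustrative prices from the design preview (USD per 1M tokens), not live data.
# Each entry is (input price, output price).
PRICES_PER_1M = {
    "haiku 4.5": (Decimal("0.80"), Decimal("4.00")),
    "sonnet 4.6": (Decimal("3.00"), Decimal("15.00")),
    "opus 4.7": (Decimal("15.00"), Decimal("75.00")),
}


def call_cost(model: str, tokens_in: int, tokens_out: int) -> Decimal:
    """Return the USD cost of one call, scaling per-1M prices down to tokens used."""
    price_in, price_out = PRICES_PER_1M[model]
    return (tokens_in * price_in + tokens_out * price_out) / Decimal(1_000_000)


if __name__ == "__main__":
    # Same call shape on two models, to compare what each would cost.
    for model in ("opus 4.7", "haiku 4.5"):
        print(model, call_cost(model, tokens_in=1_000, tokens_out=200))
```

`Decimal` is used instead of `float` so that many tiny per-call amounts add up without rounding drift.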

Call heatmap · 7d · BY HOUR

[Heatmap: rows MON to SUN, columns by hour of day (0 to 21 in 3-hour steps), shaded from LOW to HIGH calls / hr]

PEAK · Tue 11:00 · 48 calls/hr
COSTLIEST · Mon 14:00 · $4.20/hr
QUIETEST · Sun 03:00 · 0 calls

Devices & seats · burn by source · 3 PCS · 2 SEATS

One company, every machine — desktop app + Chrome, per PC, per person.

Device	Source · user	Share	Spend
MacBook Pro	desktop · @krupesh	41%	$34.50
Mac Studio	desktop · @maya	22%	$18.40
WIN-DEV-01	chrome · @krupesh	19%	$16.00
LINUX-CI	chrome · service	12%	$10.10
misc	2 idle seats	6%	$5.17

SEEN ENOUGH?
HELP ME BUILD THE RIGHT THING.

I'm before v1. Talk to me for 20 minutes — show me your stack, your worst LLM bill, the feature you wish existed — and your priorities go straight into v1. The first 10 callers get free beta access for life.

No sales people · no funnel · your priorities go straight into v1

BOOK 20 MIN →

or just drop your email →

WHERE YOUR BILL COMES FROM.

Every route, every feature flag, every model — attributed in real time. No more looking at a monthly invoice and guessing which feature caused the spike.

Attributed per route, per flag, per team
Daily, weekly, 30-day rollups
Export to CSV or query the data directly with SQL

SPEND BY ROUTE · LAST 24H · $84.17 TOTAL

Route	Model or schedule	Spend	Share
/support-bot	opus 4.7	$32.00	38%
/summariser	sonnet 4.6	$20.20	24%
/embeddings cron	nightly	$14.31	17%
/intent-routing	gpt-5 nano	$9.26	11%
/classify	haiku	$5.89	7%
misc · under 4 flags		$2.51	3%
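A breakdown like the one above is a group-and-sum over per-call records. The sketch below is one possible way to compute it, assuming each logged call carries a route label and a cost; the `(route, cost)` record shape and the `spend_by_route` name are assumptions for illustration, not the proxy's actual schema.

```python
from collections import defaultdict
from decimal import Decimal


def spend_by_route(calls):
    """Sum cost per route and return (route, total, share_percent) rows, largest first.

    `calls` is an iterable of (route, cost) pairs, a hypothetical record shape.
    """
    totals = defaultdict(Decimal)  # missing routes start at Decimal("0")
    for route, cost in calls:
        totals[route] += cost

    grand_total = sum(totals.values(), Decimal(0))
    rows = []
    for route, total in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        # Share of the grand total, rounded to a whole percent.
        share = round(total / grand_total * 100) if grand_total else 0
        rows.append((route, total, share))
    return rows


if __name__ == "__main__":
    sample = [
        ("/support-bot", Decimal("20.00")),
        ("/classify", Decimal("8.00")),
        ("/support-bot", Decimal("12.00")),
    ]
    for route, total, share in spend_by_route(sample):
        print(f"{route}\t${total}\t{share}%")
```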

TWO WAYS TO PLUG IT IN.

CHOOSE WHAT FITS · NO LOCK-IN EITHER WAY

MODE 01 · BYOK

Bring your own API keys.

FOR · teams already paying providers directly

Keys are stored encrypted on AWS · we never see plaintext
You keep your existing billing with OpenAI, Anthropic, etc.
LLMBurner only meters · routes · alerts · advises
Switch back to calling the provider directly anytime · just change the URL back

PRICING · flat per tracked request · well below what the meter saves

MODE 02 · MANAGED TOKENS

Buy tokens through us. One bill.

FOR · solo builders + small teams tired of 5 provider invoices

One topup · access to every supported provider at our rates
Starting discount when you load tokens · no minimum
Single monthly invoice instead of five
Same observability, same routing, same caps

PRICING · provider rate + small margin · margin disclosed up front

ONE LINE. EVERY PROVIDER. NO LOCK-IN.

BEFORE.PY · BLIND

# no visibility
# no cost tracking
# no alerts

from openai import OpenAI

client = OpenAI(
  api_key="sk-...",
  base_url="https://api.openai.com/v1",
)

AFTER.PY · METERED

# every call tracked
# costs · tokens · latency
# budgets · alerts · routes

from openai import OpenAI

client = OpenAI(
  api_key="sk-...",
  base_url="https://api.llmburner.app/v1",
)
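Because the only change is the base URL, switching between the proxy and the provider can be a configuration flag rather than a code change. A minimal sketch, assuming a hypothetical `USE_LLMBURNER` environment variable (not something the product defines); the proxy URL is the one from the example above and is not live yet.

```python
import os

PROVIDER_URL = "https://api.openai.com/v1"
PROXY_URL = "https://api.llmburner.app/v1"  # from the AFTER.PY example; pre-launch


def resolve_base_url(env=None) -> str:
    """Return the proxy URL when USE_LLMBURNER is "1", else the provider URL.

    USE_LLMBURNER is a made-up flag name for this sketch.
    """
    env = os.environ if env is None else env
    use_proxy = env.get("USE_LLMBURNER", "0") == "1"
    return PROXY_URL if use_proxy else PROVIDER_URL


# Usage with the client from the examples above:
#   client = OpenAI(api_key="sk-...", base_url=resolve_base_url())
if __name__ == "__main__":
    print(resolve_base_url())
```

Unsetting the flag points calls straight back at the provider, which is the "no lock-in" path described on this page.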

TWO SURFACES.
ONE SPEND PANEL.

PRODUCT SPEND + TEAM SPEND · ALL COMING SOON

SURFACE 01 · API / TOKEN

Your product's AI spend.

FOR · apps, products & internal tools you ship

One base URL · every model · no SDK swap
Attributed per route, per feature, per flag
Works in any language that can hit an API
Hard spend caps + alerts before something runs away

● COMING SOON

SURFACE 02 · DESKTOP + CHROME

Your team's AI spend.

FOR · seeing which employee is burning how much

Lightweight desktop app · drops in via npm
Chrome extension · tracks browser AI tools your team uses
Per-person, per-machine breakdown in dollars
Rolls into the same one panel as your product spend

● COMING SOON

WHAT YOU GET.

07 MODULES · ONE PROXY

01.

PER-FEATURE COST

Every call, attributed to the route or feature flag it came from. SQL on top if you want it. CSV export if you don't.

live · $/call

02.

SIDE-BY-SIDE MODELS

Same prompt across GPT, Claude, Gemini, Mistral, Groq. Stacked outputs, tokens, latency, cost. Pick the cheapest one that's correct.

8+ providers
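"Pick the cheapest one that's correct" can be written down as a filter followed by a minimum. The sketch below assumes you already have a pass/fail judgment per model output (from your own eval or a human check); `Candidate` and `pick_cheapest_correct` are hypothetical names, and the figures are the illustrative ones from the playground preview.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    model: str
    cost_usd: float
    latency_ms: int
    passed_review: bool  # set by your own eval or a human check


def pick_cheapest_correct(candidates: list[Candidate]) -> Optional[Candidate]:
    """Return the lowest-cost candidate that passed review, or None if none did."""
    acceptable = [c for c in candidates if c.passed_review]
    return min(acceptable, key=lambda c: c.cost_usd, default=None)


if __name__ == "__main__":
    results = [
        Candidate("opus 4.7", 0.0186, 412, True),
        Candidate("haiku 4.5", 0.0014, 189, True),
        Candidate("gpt-5", 0.0098, 340, True),
    ]
    print(pick_cheapest_correct(results))
```

Correctness is checked first and cost second, so a cheap model that fails review is never picked.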

03.

HARD LIMITS

Cap any provider by dollars or requests. When a runaway loop hits the ceiling, the burner stops the burn.

trips < 30s
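The cap logic itself is small: add up spend, warn near the ceiling, refuse calls once it is reached. A minimal in-process sketch, assuming a dollar cap with a warning threshold at 80% (mirroring the $30/day cap and 80% warning in the alerts preview); `SpendCap` is a hypothetical name, and a hosted proxy would need shared, persistent counters rather than an in-memory object.

```python
from decimal import Decimal


class SpendCap:
    """Track spend against a dollar cap and report ok / warn / blocked."""

    def __init__(self, cap_usd: Decimal, warn_at: Decimal = Decimal("0.80")):
        self.cap_usd = cap_usd
        self.warn_at = warn_at  # fraction of the cap that triggers a warning
        self.spent_usd = Decimal("0")

    def status(self) -> str:
        if self.spent_usd >= self.cap_usd:
            return "blocked"
        if self.spent_usd >= self.cap_usd * self.warn_at:
            return "warn"
        return "ok"

    def record(self, cost_usd: Decimal) -> str:
        """Add the cost of a finished call and return the new status."""
        self.spent_usd += cost_usd
        return self.status()

    def allow_call(self) -> bool:
        """Check before forwarding a call; False once the cap is reached."""
        return self.status() != "blocked"


if __name__ == "__main__":
    cap = SpendCap(Decimal("30"))
    for cost in (Decimal("10"), Decimal("14"), Decimal("6")):
        print(cap.record(cost), cap.allow_call())
```

This version checks after each recorded call, so the last call before the cap can overshoot it slightly; a stricter design would estimate cost before forwarding.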

04.

LIVE PRICING

Every model, every provider, current price. Refreshed when providers push. No more stale spreadsheets.

real-time

05.

VERSIONED PROMPTS

Folders per project. Pin the winning system prompt. Diff and roll back when the new one ships a regression.

∞ versions

06.

PLAIN-ENGLISH INSIGHTS

"Route support-bot through Haiku 4.5, save ~42%, no quality drop." A sentence. Not a chart you have to interpret.

weekly digest

07.

DEVICE & SEAT BURN

Spend by machine and by person, not just by route. Desktop app and Chrome usage land in one panel — track a single account across three PCs, or a whole company with many seats, and see exactly which device is burning.

per-device · per-seat

WHERE THIS ACTUALLY IS.

I'm not going to fake a green LIVE badge to make this look further along than it is. Here's the real status — built, building, and the two things I'd genuinely like your opinion on.

BUILT · v0

The design & spec for the proxy interface (this page)
AWS EKS deployment plan, key-encryption approach
Two-mode pricing decided: BYOK + managed tokens
Confidence in the problem · validated by my own LLM bill

BUILDING NOW · v1 targets

Proxy router · OpenAI + Anthropic first
Per-call cost logging · costs calculated in batches in the first release
One dashboard view · spend breakdown by route
Hard spend caps · per-provider, per-day
Email alerts when spend nears or exceeds a cap

HELP ME DECIDE · your input matters

Dashboard: when you log in, what's the first view you want? Spend by route? By model? A "what changed today" summary?
Streaming responses: the proxy has to handle SSE/streaming cleanly. What's broken about how today's tools do it for you?

Strong opinion? Book 20 min → · or reply to the waitlist email when it lands.

OBJECTIONS.

Is the product actually working right now?

No. LLMBurner is pre-launch — I'm a solo founder, building v1 right now. The console screenshots above are design previews of where I'm headed. The numbers in them are illustrative, not real customers. If you sign up, you'll get a personal email from me, and the first version when it ships. If that's not what you want, please don't sign up — I'd rather have 30 honest signups than 1,000 strangers.

How is this different from Helicone, LangSmith, OpenRouter?

Helicone and LangSmith are observability-first; OpenRouter is routing-first. LLMBurner combines both in one proxy and gives you a second mode where you can buy tokens through us at a small discount instead of juggling five provider accounts. That managed-tokens mode is the differentiator — nobody else does it that I've seen. If they ship it before me, I'll tell you here honestly.

Is this another middleware I have to maintain?

No. LLMBurner will be a hosted proxy on AWS. You change one base URL in your existing OpenAI/Anthropic SDK. No library to install, no service mesh, no agents. Point the URL back to the provider any time you want — there is no lock-in either way.

What about my prompts and data?

The plan: we don't train on your data, ever. By default we'd log sizes, tokens, latency, timestamps — not prompt or response bodies. Body logging will be opt-in per project, encrypted at rest. I'd rather lock this down right than ship it loose, so if you have strong opinions on the privacy model, that's another thing worth a call.
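To make the planned default concrete, here is a sketch of a metadata-only log record: sizes, tokens, latency, and a timestamp, with bodies left empty unless a project opts in. The field names and `build_record` are assumptions for illustration, not a committed schema, and the encryption-at-rest step for opted-in bodies is left out of this sketch.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class CallRecord:
    timestamp: float
    model: str
    request_bytes: int
    response_bytes: int
    tokens_in: int
    tokens_out: int
    latency_ms: int
    prompt_body: Optional[str] = None    # stays None unless the project opts in
    response_body: Optional[str] = None  # stays None unless the project opts in


def build_record(prompt: str, reply: str, *, model: str, tokens_in: int,
                 tokens_out: int, latency_ms: int, timestamp: float,
                 log_bodies: bool = False) -> dict:
    """Build a log entry that keeps sizes and counts, and bodies only on opt-in."""
    record = CallRecord(
        timestamp=timestamp,
        model=model,
        request_bytes=len(prompt.encode("utf-8")),
        response_bytes=len(reply.encode("utf-8")),
        tokens_in=tokens_in,
        tokens_out=tokens_out,
        latency_ms=latency_ms,
        prompt_body=prompt if log_bodies else None,
        response_body=reply if log_bodies else None,
    )
    return asdict(record)


if __name__ == "__main__":
    print(build_record("hello", "hi there", model="haiku 4.5", tokens_in=2,
                       tokens_out=3, latency_ms=189, timestamp=0.0))
```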

What will it cost?

Free during beta. After that, two modes: BYOK = flat per tracked request, well below what the meter saves you. Managed tokens = provider rate plus a disclosed margin, with a starting discount on first topup. If it doesn't save you more than it costs, don't pay me.

Does the proxy add latency?

Plan is to deploy in the same regions as the providers. Target added latency <15ms p50. I'll publish the real number on a public status page once v1 is live — not before, because measuring an unbuilt thing is exactly the kind of fake number I'm trying to avoid putting on this page.

Who are you and why should I trust you?

I'm Krupesh — solo developer, building in public. krupesh.dev. I'm building this because I ran into the problem myself on a small Claude-wrapper project. I personally read and reply to every waitlist signup. If you book a call, you'll be talking to me, not a salesperson — because there are no salespeople.

STOP GUESSING. START METERING.

Drop your email. You'll get a personal note from me, an honest update when v1 ships, and the chance to shape what gets built first. No drip campaign. No newsletter. Just real signal.

or skip the queue · book a 20-min call →