Mastering Constant Product AMM Math
I understand x * y = k now. I understand that AMM pools hold reserves of two tokens, that their product stays constant, and that every trade slides along that hyperbolic curve. I can look at a pool's reserves and tell you the spot price. I can explain slippage to someone in plain English. But understanding the concept and being able to compute exact outputs are two very different things — and my bot needs the latter.
Today I'm sitting down to derive the actual formula. The one that takes a pool's current reserves and a specific input amount and spits out the precise number of tokens I'll receive. Then I'm layering fees on top of that. Because there's a gap between "I know x * y = k" and "I can compute, to the lamport, what a swap will return" — and that gap is where my bot either makes money or loses it.
This is the SAT math problem I need to ace. Not the kind where you can estimate and pick the closest answer choice. The kind where the answer has to be exact, because close enough means losing money.
Starting from What I Know
Let me lay out the setup cleanly.
A constant product AMM pool holds two tokens. Call them Token X and Token Y. The pool's reserves are x and y. The invariant — the fundamental rule that governs everything — is:
x · y = k
k is the constant product. Before any trade, after any trade, always k. That's the constraint.
The spot price — the theoretical price of one unit of Token X denominated in Token Y — is simply the ratio:
Spot Price = y / x
If the pool has 1,000 SOL and 150,000 USDC, the spot price of SOL is 150,000 / 1,000 = 150 USDC per SOL. Straightforward division. This is the price you'd get if you traded an infinitesimally small amount — so small that it doesn't meaningfully change the reserves. It's the price printed on the sticker, not the price you actually pay at the register.
I already know this much. But now comes the question that actually matters for implementation: if I put Δx tokens into the pool, how many tokens Δy do I get out?
Deriving the Swap Output Formula
This is where I stop hand-waving and start doing algebra. And it's more satisfying than I expected, because the formula emerges cleanly from the single constraint.
Before the trade:
x · y = k
I'm adding Δx of Token X to the pool (that's my input). I'm receiving Δy of Token Y from the pool (that's my output). After the trade:
(x + Δx) · (y - Δy) = k
The pool's Token X reserves go up by Δx (because I put tokens in). The pool's Token Y reserves go down by Δy (because I took tokens out). And the product must still equal k.
Now I can solve for Δy. Since both expressions equal k, I can set them equal:
(x + Δx) · (y - Δy) = x · y
Expanding the left side:
x·y - x·Δy + Δx·y - Δx·Δy = x·y
The x·y terms cancel:
-x·Δy + Δx·y - Δx·Δy = 0
Move the Δy terms to the right-hand side and factor out Δy:
Δy · (x + Δx) = Δx · y
And there it is:
Δy = y · Δx / (x + Δx)
That's the formula. Given reserves x and y, if I put in Δx of Token X, I receive Δy of Token Y. Clean. Deterministic. No ambiguity. And I can see immediately why it produces slippage — the denominator is (x + Δx), which means the more I put in (larger Δx), the larger the denominator grows, and the less output I get per unit of input. The formula is the slippage.
Let me sanity-check this with the example from my earlier work. Pool: 100 SOL, 50,000 USDC. Someone puts in 5,000 USDC to buy SOL. Here Token X = USDC, Token Y = SOL (because USDC is what we're putting in).
Δy = 100 · 5,000 / (50,000 + 5,000) = 500,000 / 55,000 ≈ 9.09 SOL
That matches. The trader puts in 5,000 USDC and gets about 9.09 SOL — not the 10 SOL they'd get at the spot price. The formula checks out.
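To double-check the derivation numerically, here is a minimal Python sketch of the fee-free formula. The function name swap_output_no_fee and its parameter names are my own, not taken from any protocol's code, and it uses floats purely for illustration.

```python
def swap_output_no_fee(reserve_in: float, reserve_out: float, amount_in: float) -> float:
    """Fee-free constant product output: dy = y * dx / (x + dx)."""
    if reserve_in <= 0 or reserve_out <= 0 or amount_in < 0:
        raise ValueError("reserves must be positive and amount_in non-negative")
    return reserve_out * amount_in / (reserve_in + amount_in)


# Pool from the example: 50,000 USDC on the input side, 100 SOL on the output side.
sol_out = swap_output_no_fee(reserve_in=50_000, reserve_out=100, amount_in=5_000)
print(f"{sol_out:.4f} SOL")  # 9.0909 SOL

# Without a fee, the product of the reserves is unchanged by the trade
# (up to floating-point rounding).
k_before = 50_000 * 100
k_after = (50_000 + 5_000) * (100 - sol_out)
print(abs(k_after - k_before) < 1e-6)  # True
```

The second check is the invariant itself: whatever output the formula returns, the post-trade reserves multiply back to the same k.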
This is the formula that drives half of DeFi. Every constant product AMM — Raydium's standard pools, Uniswap V2 forks, PancakeSwap, and dozens of others — uses this exact calculation. The code that executes a swap on these protocols is, at its core, doing this single division. Everything else is plumbing.
But here's the thing. In the real world, a pool doesn't just hand you Δy tokens and call it a day. It takes a cut first.
Fee Application: Where the Textbook Ends and Reality Begins
Every AMM pool charges a fee on every swap. This is how liquidity providers earn their return — the fee is the interest on the capital they've deposited. Without fees, there'd be no incentive to provide liquidity, and without liquidity, there'd be no pool.
The way fees work in constant product AMMs is elegant but specific: the fee is deducted from the input before the swap formula is applied. Not from the output. From the input.
Let me say that again because it matters: the fee reduces the effective input amount, and then the reduced input goes through the formula.
If the pool charges a 0.3% fee and I'm swapping Δx of Token X, the effective input — the amount that actually enters the pool's reserves — is:
amountInWithFee = Δx × (1 - fee)
For a 0.3% fee:
amountInWithFee = Δx × 0.997
And the swap output formula becomes:
Δy = y · amountInWithFee / (x + amountInWithFee)
Or, expanding it:
Δy = y · (Δx × 0.997) / (x + Δx × 0.997)
This is a small difference, numerically. On a 1,000 USDC swap with a 0.3% fee, you're computing the swap with 997 USDC instead of 1,000 USDC. But when you're looking for arbitrage opportunities where the profit margin is 0.1% or 0.2%, this "small" fee is the difference between profit and loss. Getting the fee calculation wrong by even a fraction of a percent means my bot will think it sees profitable opportunities that don't actually exist. It'll execute trades expecting to make money and come out with less than it started.
Think of it like calculating your take-home pay. If your gross salary is $100,000, you don't actually take home $100,000. Federal income tax takes a chunk. State tax takes another chunk. Social Security. Medicare. Each one is a percentage deducted before you get your check. Your "effective" salary — the amount that actually hits your bank account — is lower than the headline number. And if you're budgeting based on the gross number, you're going to overdraw your account. The AMM fee works the same way. The headline input amount isn't what goes into the swap. The after-fee amount is.
Let me run through the concrete example again. Pool: 100 SOL, 50,000 USDC. Swap: 5,000 USDC input. Fee: 0.3%.
Without fee:
Δy = 100 × 5,000 / (50,000 + 5,000) = 9.0909 SOL
With 0.3% fee:
amountInWithFee = 5,000 × 0.997 = 4,985
Δy = 100 × 4,985 / (50,000 + 4,985) = 498,500 / 54,985 ≈ 9.0661 SOL
The fee cost the trader about 0.0248 SOL on this trade. At the pool's spot price of 500 USDC per SOL, that's around 12.4 USDC, or roughly 0.25% of the 5,000 USDC trade value. Not exactly 0.3%, because the fee interacts with slippage in a nonlinear way, but in the same ballpark. The fee feels small, but it accumulates. Across millions of swaps per day across all the pools in the ecosystem, it adds up to substantial revenue for liquidity providers.
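The same comparison can be sketched in a few lines of Python. This assumes the fee is a single flat rate taken entirely from the input, as described above; the function and variable names are hypothetical, and floats are used only to keep the illustration short.

```python
def swap_output_with_fee(reserve_in: float, reserve_out: float,
                         amount_in: float, fee: float = 0.003) -> float:
    """Constant product output with a flat fee deducted from the input."""
    amount_in_with_fee = amount_in * (1 - fee)  # only this part is priced by the curve
    return reserve_out * amount_in_with_fee / (reserve_in + amount_in_with_fee)


no_fee = swap_output_with_fee(50_000, 100, 5_000, fee=0.0)
with_fee = swap_output_with_fee(50_000, 100, 5_000, fee=0.003)

print(f"{no_fee:.4f}")    # 9.0909
print(f"{with_fee:.4f}")  # 9.0661
print(f"output lost to the fee: {(1 - with_fee / no_fee) * 100:.3f}%")
```

The printed loss is a little under 0.3%, because the missing 15 USDC would have been traded at the tail end of the curve, where each unit of input buys the least.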
And here's the subtle but important detail: k is not actually constant over time. After the fee is deducted and the swap happens, the product of the new reserves is slightly larger than the old k. Why? Because the fee portion of the input goes into the pool's reserves without a corresponding withdrawal. The pool absorbed 5,000 USDC, sent out 9.0661 SOL, and the 15 USDC fee stayed in the pool as pure addition to reserves. The new product is 55,000 × 90.9339 ≈ 5,001,364, up from 5,000,000. New k > old k. Over time, k grows. This is the mechanism by which fees accrue to LPs: the pool literally gets richer with every trade.
This detail doesn't change how I compute a single swap's output (I apply the formula to the after-fee input and to the reserves as they stand at that moment), but it matters when I'm trying to predict the pool's state after a series of trades. If I'm simulating "what happens if Trade A executes and then I execute Trade B right after," I need to account for the fact that Trade A's fees slightly changed k. In practice, the difference is tiny for a single trade. But when you're modeling multi-hop routes through several pools, these tiny differences compound.
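Here is a rough sketch of that kind of state tracking. It assumes the entire fee stays in the pool's reserves (pools that route part of the fee elsewhere would need a different reserve update), and apply_swap is a name I made up for illustration.

```python
def apply_swap(reserve_in: float, reserve_out: float,
               amount_in: float, fee: float = 0.003):
    """Return (amount_out, new_reserve_in, new_reserve_out).

    Assumption: the full input, fee included, is added to the input reserve.
    """
    amount_in_with_fee = amount_in * (1 - fee)
    amount_out = reserve_out * amount_in_with_fee / (reserve_in + amount_in_with_fee)
    return amount_out, reserve_in + amount_in, reserve_out - amount_out


usdc, sol = 50_000.0, 100.0
k_old = usdc * sol

# Trade A: 5,000 USDC in.
out_a, usdc, sol = apply_swap(usdc, sol, 5_000)
k_new = usdc * sol
print(k_new > k_old)  # True: the fee stayed in the pool, so k grew

# Trade B: same size, but priced against the reserves Trade A left behind.
out_b, usdc, sol = apply_swap(usdc, sol, 5_000)
print(out_b < out_a)  # True: B trades further along the curve
```

Quoting Trade B against the original reserves instead of the updated ones is exactly the kind of small error that compounds across a multi-hop route.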
Price Impact: The Gap Between Sticker Price and Checkout Price
There's a concept that falls naturally out of the swap formula, and it's one that every trader — human or bot — needs to internalize: price impact.
Spot price is y/x. That's the "sticker price" — the theoretical price of an infinitesimally small trade. It's the number you see on the DEX interface, the one that catches your eye and makes you think "hey, SOL is $150 here."
But the moment you trade any meaningful amount, you don't get the sticker price. You get a worse price. Because your trade changes the reserves, and the price you get is the average across the entire curve from your entry point to your exit point.
The execution price — the price you actually paid — is simply:
Execution Price = Δx / Δy
Using the fee-inclusive example above: 5,000 USDC / 9.0661 SOL ≈ 551.50 USDC per SOL. The sticker price was 500 USDC per SOL. The trader paid 551.50. That's a 10.3% price impact.
And the post-trade spot price? After the swap, the pool has 50,000 + 5,000 = 55,000 USDC and 100 - 9.0661 = 90.9339 SOL (this assumes the entire fee stays in the pool's reserves; a pool that routes part of the fee elsewhere would differ slightly, but the point stands). The new spot price is 55,000 / 90.9339 ≈ 604.8 USDC per SOL. The pool's quoted price moved from 500 to about 605 because of a single trade.
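The three prices in this section (sticker, execution, post-trade) can be computed together. The sketch below is illustrative only: price_impact_report is a hypothetical helper, it assumes a flat fee on the input that stays entirely in the pool, and it measures impact as execution price relative to the pre-trade spot price, fee included.

```python
def price_impact_report(reserve_in: float, reserve_out: float,
                        amount_in: float, fee: float = 0.003) -> dict:
    """Prices are quoted as input tokens per output token (USDC per SOL here)."""
    spot = reserve_in / reserve_out
    amount_in_with_fee = amount_in * (1 - fee)
    amount_out = reserve_out * amount_in_with_fee / (reserve_in + amount_in_with_fee)
    execution = amount_in / amount_out
    # Assumption: the whole input, fee included, ends up in the reserves.
    post_spot = (reserve_in + amount_in) / (reserve_out - amount_out)
    return {
        "spot": spot,
        "execution": execution,
        "post_spot": post_spot,
        "impact": execution / spot - 1,
    }


report = price_impact_report(50_000, 100, 5_000)
print(f"spot:       {report['spot']:.2f}")       # 500.00
print(f"execution:  {report['execution']:.2f}")  # 551.50
print(f"post-trade: {report['post_spot']:.1f}")  # 604.8
print(f"impact:     {report['impact']:.1%}")     # 10.3%
```

Note that the execution price always lands between the pre-trade and post-trade spot prices, which is the "blended average across the curve" in numeric form.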
Price impact is the tax on impatience and size. It's the AMM saying, "Sure, I'll sell you SOL, but the more you want, the more you'll pay per unit." It's exactly like the progressive tax brackets the IRS uses. Your first $11,000 of income is taxed at 10%. The next chunk at 12%. Then 22%. Then 24%. You don't pay the top marginal rate on your entire income — you pay escalating rates on each successive bracket. The AMM does the same thing, but continuously. The first tiny fraction of SOL costs you 500 USDC each. The last fraction costs you way more. Your effective price is the blended average across all those "brackets."
For my bot, price impact cuts both ways. On one hand, it limits how much I can profit from any single opportunity — if I try to push too much volume through a mispriced pool, the price impact eats up the margin. On the other hand, price impact creates opportunities in the first place. When someone else makes a large trade and pushes a pool's price away from the market price, that displacement is the opportunity my bot is looking for. Price impact is both the obstacle and the origin of profit. Understanding exactly how much impact a given trade will cause is non-negotiable for computing whether an arbitrage is actually profitable.
Fee Layers: What the Textbooks Don't Teach
Here's where I start running into the gap between theory and practice. The formula I derived above handles a single, flat fee percentage. That's the textbook version. And for some pools, it's accurate enough. But real DEXs on Solana often have more complex fee structures, and getting them right is essential.
Most constant product AMMs on Solana decompose fees into multiple layers:
LP Fee: This is the fee that goes back to liquidity providers. It's the primary fee, and it's the one that corresponds to the 0.3% in my examples above. This fee is added to the pool's reserves, growing k, benefiting LPs. This is the one that's in the textbooks.
Protocol Fee: Many DEXs take a cut of the total swap fee (or charge an additional fee) that goes to the protocol's treasury. This is like the franchise fee at a fast-food restaurant: the individual restaurant (pool) earns revenue from customers, but the franchise (protocol) takes a percentage off the top. On some DEXs, the protocol fee might be 1/6 of the total fee. So if the total fee is 0.3%, the protocol takes 0.05% and LPs get 0.25%.
Host Fee: Some DEXs have a "host" or "referrer" fee, a cut that goes to the frontend that routed the trade. If you swap through a particular DEX aggregator's interface, the aggregator might receive a small slice. This is the commission that a real estate agent takes for bringing in a buyer: a separate layer on top of the other costs.
Why do these layers matter to me? Because they affect the effective swap output differently depending on how the protocol implements them.
In some implementations, the total fee (LP + protocol + host) is deducted from the input before the swap. In this case, the formula is exactly what I showed above — you just use the combined fee percentage. Simple.
In other implementations, the LP fee is deducted from the input (and added to reserves), but the protocol fee is deducted from the output. This changes the math. Now I need to compute the swap using the LP-fee-adjusted input, get the output, and then subtract the protocol fee from that output. The formula becomes:
Step 1: amountInWithFee = Δx × (1 - lpFee)
Step 2: grossOutput = y × amountInWithFee / (x + amountInWithFee)
Step 3: netOutput = grossOutput × (1 - protocolFee)
That gives a different result than applying (lpFee + protocolFee) as a single upfront deduction. Not a huge difference, but a difference. And when I'm computing whether a 0.08% arbitrage margin is profitable, a 0.02% error in fee calculation is catastrophic.
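To see the size of that difference, here is a small comparison in Python. The 0.25% LP fee and 0.05% protocol fee are hypothetical numbers chosen to add up to 0.3%, and both function names are mine; neither is meant to describe any specific DEX.

```python
def out_all_fees_on_input(x: float, y: float, dx: float,
                          lp_fee: float, protocol_fee: float) -> float:
    """Every fee layer is deducted from the input before the swap."""
    dx_eff = dx * (1 - lp_fee - protocol_fee)
    return y * dx_eff / (x + dx_eff)


def out_protocol_fee_on_output(x: float, y: float, dx: float,
                               lp_fee: float, protocol_fee: float) -> float:
    """LP fee comes off the input; protocol fee comes off the output."""
    dx_eff = dx * (1 - lp_fee)
    gross = y * dx_eff / (x + dx_eff)
    return gross * (1 - protocol_fee)


# Hypothetical fee split: 0.25% LP + 0.05% protocol.
args = dict(x=50_000, y=100, dx=5_000, lp_fee=0.0025, protocol_fee=0.0005)
combined = out_all_fees_on_input(**args)
split = out_protocol_fee_on_output(**args)

print(f"{combined:.6f}")  # 9.066109
print(f"{split:.6f}")     # 9.065708
print(f"relative gap: {(combined - split) / combined:.4%}")
```

For these numbers the gap is on the order of a few thousandths of a percent. That sounds like nothing until it is stacked across hops and compared against a margin measured in hundredths of a percent.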
The formula x * y = k is public knowledge. It's on the Wikipedia page for "Constant Product Market Maker." You can learn it in ten minutes. But the specific fee structures — how each DEX decomposes fees, where each layer is applied, which fees go to reserves and which are siphoned off separately — that's in the source code, often buried in Rust structs and conditional logic. The textbook stops at "there's a 0.3% fee." Reality has fee schedules that look like IRS withholding tables — multiple line items, each applied at a different stage, each flowing to a different recipient.
I'm finding that mastering the formula itself was the easy part. Mastering how each protocol applies the formula, with its specific fee architecture, is the actual implementation work. And it has to be perfect. Not "pretty close." Perfect. My bot is competing against other bots that have this math nailed down. If my fee calculations are off by a few basis points, I'll either miss real opportunities (because I think they're unprofitable) or execute on phantom opportunities (because I think they're profitable when they're not). Both are bad. The first leaves money on the table. The second actively loses money.
The Integer Math Trap
There's one more thing about this formula that the theory doesn't adequately warn you about, and it bit me harder than I expected: all of this math happens in integers on-chain.
On a blockchain, there are no floating-point numbers. Token amounts are represented as integers in their smallest denomination — lamports for SOL (1 SOL = 10^9 lamports), and whatever decimal precision the token's mint specifies. When the AMM program computes the swap output, it's doing integer division. And integer division truncates.
This matters more than it sounds. Consider a toy example. Pool: 1,000 of Token A, 1,000 of Token B. I'm swapping 1 of Token A. The formula says:
Δy = 1,000 × 1 / (1,000 + 1) = 1,000 / 1,001 = 0.999...
In floating-point, that's approximately 0.999. In integer math, 1,000 / 1,001 = 0. Zero. The swap returns nothing. The input is consumed, the output is rounded down to zero, and the trader just donated their token to the pool.
Real AMM implementations handle this by working with the raw integer amounts at full precision — the reserves are in lamports or the token's smallest unit, which are large enough numbers that the rounding errors are negligible relative to the trade size. But "negligible" isn't "zero." On very small trades, or when computing the output of the last hop in a multi-hop route where the intermediate amount is small, that rounding truncation can be a few lamports — and a few lamports in the wrong direction can turn a marginally profitable trade into a marginally unprofitable one.
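The toy example and the fix are easy to reproduce with Python's floor division operator. This is only a sketch of the truncation behavior, not any program's actual code; the 9-decimal scale factor is an assumption standing in for whatever precision the token's mint specifies.

```python
def swap_out_int(reserve_in: int, reserve_out: int, amount_in: int) -> int:
    """Fee-free constant product output using integer (floor) division."""
    return reserve_out * amount_in // (reserve_in + amount_in)


# Whole-token units: the output truncates to zero.
print(swap_out_int(1_000, 1_000, 1))  # 0

# Same pool and trade expressed in smallest units (assuming 9 decimals).
UNIT = 10**9
out = swap_out_int(1_000 * UNIT, 1_000 * UNIT, 1 * UNIT)
print(out)  # 999000999

# The exact quotient is 999000999.000999..., so the truncation discards
# less than one smallest unit here. Small, but never rounded in my favor.
```

Multiplying before dividing matters too: computing reserve_out // (reserve_in + amount_in) first and then multiplying would truncate far more aggressively.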
My bot has to do the math the same way the on-chain program does it. Not in floating-point. In integers, with the same truncation behavior. If my off-chain calculation says "this trade should return 1,000,001 lamports of SOL" and the on-chain program actually returns 1,000,000 lamports because of integer truncation, that one lamport difference might not matter on its own — but it means my math is wrong, and being wrong by a little today might mean being wrong by a lot tomorrow when a different set of numbers makes the truncation more significant.
This is like doing your taxes by hand versus using accounting software. If you round every line item to the nearest dollar, your final number might be off by several dollars. The IRS might not care about a few dollars. But the AMM cares about every lamport, because the on-chain program's math is the final arbiter. If your calculation overestimates what the program will actually pay out, and you set your minimum expected amount from it, your transaction fails. There's no "close enough": either the actual output meets your minimum expected amount or it doesn't, and if it doesn't, the transaction reverts and you wasted your transaction fee for nothing.
Putting It All Together
Let me write out the complete picture — the full calculation that my bot needs to perform for every potential swap it evaluates.
Given:
- reserveIn: the pool's reserve of the input token
- reserveOut: the pool's reserve of the output token
- amountIn: the amount I want to swap
- feeNumerator, feeDenominator: the fee rate (e.g., 25 and 10000 for 0.25%)
Step 1: Apply fee to input
amountInWithFee = amountIn × (feeDenominator - feeNumerator) / feeDenominator
Note: real implementations often express this as integer math to avoid precision loss. Instead of multiplying by 0.997, they multiply by 9970 and divide by 10000 (or use the specific numerator/denominator the protocol defines).
Step 2: Compute output
amountOut = reserveOut × amountInWithFee / (reserveIn + amountInWithFee)
Step 3: Apply additional fee layers (if applicable)
If the protocol has a separate output fee:
netAmountOut = amountOut - (amountOut × protocolFeeNum / protocolFeeDenom)
Step 4: Truncate to integer
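finalOutput = floor(netAmountOut)
Note: if Steps 1 through 3 are already carried out in integer arithmetic, each division truncates on its own, so this final floor is implicit and the order of operations decides the last lamport.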
That's it. Four steps. The formula fits on a napkin. But getting each step exactly right — matching the on-chain program's integer arithmetic, fee structure, and truncation behavior — is the engineering challenge. The formula is the skeleton key. The implementation details are the specific cuts on the key's blade that determine whether it actually turns the lock.
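The four steps translate into a short integer-only function. Treat this as a sketch under stated assumptions rather than a match for any particular program: quote_swap and its parameters are hypothetical names, every division here rounds down, and real programs may round some steps differently (for example, rounding the fee up), which has to be checked against each protocol's source.

```python
def quote_swap(reserve_in: int, reserve_out: int, amount_in: int,
               fee_num: int, fee_den: int,
               protocol_fee_num: int = 0, protocol_fee_den: int = 1) -> int:
    """Quote a constant product swap using integer arithmetic only.

    All amounts are in each token's smallest unit.
    Assumption: every division truncates toward zero.
    """
    if reserve_in <= 0 or reserve_out <= 0 or amount_in < 0:
        raise ValueError("reserves must be positive and amount_in non-negative")

    # Step 1: apply the fee to the input.
    amount_in_with_fee = amount_in * (fee_den - fee_num) // fee_den

    # Step 2: constant product output (multiply first, then divide).
    amount_out = reserve_out * amount_in_with_fee // (reserve_in + amount_in_with_fee)

    # Step 3: optional fee layer taken from the output.
    protocol_cut = amount_out * protocol_fee_num // protocol_fee_den

    # Step 4: the result is already an integer; no separate floor is needed.
    return amount_out - protocol_cut


# Example pool: 50,000 USDC and 100 SOL, assuming 6 and 9 decimals respectively.
USDC = 10**6
LAMPORTS_PER_SOL = 10**9
lamports_out = quote_swap(
    reserve_in=50_000 * USDC,
    reserve_out=100 * LAMPORTS_PER_SOL,
    amount_in=5_000 * USDC,
    fee_num=30, fee_den=10_000,  # 0.3% flat fee on the input
)
print(lamports_out, "lamports")
print(f"about {lamports_out / LAMPORTS_PER_SOL:.4f} SOL")  # about 9.0661 SOL
```

Python's integers never overflow, so this sketch sidesteps the fixed-width integer handling that an implementation in a language like Rust has to get right. The float division in the last line is for display only and should never feed back into a decision.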
The Competitive Reality
Here's what sits with me as I work through all of this. The swap output formula is not a secret. It's in the Uniswap V2 whitepaper, which Raydium and dozens of other AMMs derive from. It's in open-source Rust code on GitHub. Anyone building a trading bot has access to the same formula. The math is table stakes. You can't compete without it, but having it doesn't give you an edge.
The edge comes from three things that sit on top of the math:
First, precision. Not just knowing the formula, but implementing it in a way that exactly matches the on-chain computation. Using the right integer types, applying fees in the right order, truncating in the right direction. Every lamport matters.
Second, speed. Computing the output of one swap is trivial — any language can evaluate that formula in nanoseconds. But computing the outputs of thousands of potential swaps across hundreds of pools, comparing them, finding the combination that yields a profit, and doing all of this faster than competing bots — that's the performance challenge.
Third, coverage. Different DEXs have different fee structures. Different rounding behaviors. Different edge cases. A bot that only handles the simple 0.3%-flat-fee case will miss (or miscalculate) opportunities on pools with more complex fee architectures. The more accurately I model each protocol's specific implementation, the more of the market I can see clearly.
The formula is public. The optimization parameters — how I prioritize which pools to check, how I size my trades to balance slippage and profit, how I chain multiple swaps together for multi-hop arbitrage — those are where the competitive advantage lives. But none of that optimization matters if the underlying math is wrong. You can't optimize on top of an incorrect foundation.
It's like unit pricing at the grocery store. The formula for "price per ounce" is trivial: total price divided by total ounces. Every shopper can do it. But the shopper who actually calculates unit prices for every item, across every brand, across every package size, and does so quickly enough to make decisions in real time while navigating the store — that shopper gets the better deal. The formula is free. The systematic application of the formula, at scale, under time pressure — that's where value gets created.
I now have the mathematical foundation. The formula. The fee mechanics. The integer arithmetic constraints. The understanding that textbook math gets me to the starting line, but the specific on-chain implementation details are what separate a correct bot from a broken one. The formula x * y = k is elegant in its simplicity, deceptive in its apparent completeness, and absolutely non-negotiable in its demand for precision.
This one formula drives half of DeFi. And I'm just getting started with the other half.