This article is a personal engineering essay describing the author's experience debugging an MEV / on-chain arbitrage bot on Solana. It is for informational and educational purposes only and does not constitute investment, legal, or financial advice, nor a recommendation to engage in MEV extraction, arbitrage, or any specific trading strategy on any blockchain. Tokens, pools, and protocols are named only as technical illustrations of the system being debugged; their inclusion is not an endorsement. The publisher holds no positions in and has received no compensation from any project named herein.

Chasing the Same Phantom Opportunity for Ten Hours

I'm staring at a spreadsheet I just built from my bot's execution log, and something isn't right.

It's the morning after the 0% landing rate — 220 bundles sent, zero landed, zero profit. I didn't sleep well. I kept thinking about those eight layers of failure I found, the cascade of problems hiding behind each other. But I told myself the first step is to understand before I fix. So I'm doing what any obsessive engineer does when the system fails: I'm reading every single line of log output, timestamping events, building a timeline.

And the timeline is telling me something that makes my skin crawl.

There's a cycle — SOL to hoodrat to USDC back to SOL — that my bot tried to execute 15 times over the course of 10.2 hours. Ten hours. The same three-hop cycle. The same pools. The same token pair sequence. My bot found it, evaluated it, decided it was profitable, built a transaction, submitted a bundle to Jito, watched it fail, and then — like a golden retriever chasing the same tennis ball into the same empty field — went back and did it again. And again. And again. For ten straight hours.

This isn't a bug I can point to. This is something worse. This is my bot telling me it sees opportunity where none exists, and I need to figure out why it keeps seeing ghosts.

The Spreadsheet from Hell

I pull every execution event from the log, sort by cycle ID, and start counting. The hoodrat cycle is bad, but it's not the worst offender.

There's another one. SOL to cbBTC to USDC back to SOL. Forty-seven execution attempts. Over 43.3 hours. Nearly two full days of my bot hammering the same phantom opportunity, sending bundle after bundle after bundle into the void. Forty-seven attempts. Zero landings. Not a single one.

I start mapping the execution timestamps and a pattern jumps out: clustering. These aren't evenly spaced attempts over 43 hours. They come in bursts. The fifth cluster alone contains 17 executions in a two-minute window. Seventeen bundles fired off in 120 seconds, each one chasing the same ghost, each one coming back on_chain_not_found. It's like watching someone feed dollar bills into a vending machine that clearly has an "Out of Order" sign on it, except the sign is invisible to my bot.
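
The grouping pass itself is nothing fancy. Here's a minimal sketch in Python; the timestamps and the 30-second gap threshold are invented for illustration, not pulled from the bot:

```python
from datetime import datetime, timedelta

def cluster_bursts(timestamps, max_gap=timedelta(seconds=30)):
    """Group execution timestamps into bursts: start a new cluster
    whenever the gap to the previous event exceeds max_gap."""
    clusters = []
    for ts in sorted(timestamps):
        if clusters and ts - clusters[-1][-1] <= max_gap:
            clusters[-1].append(ts)
        else:
            clusters.append([ts])
    return clusters

# Hypothetical execution times for one cycle, parsed from the log.
events = [datetime(2024, 1, 1, 3, 0, s) for s in (0, 9, 18, 30, 41, 50, 59)]
for i, burst in enumerate(cluster_bursts(events), 1):
    span = (burst[-1] - burst[0]).total_seconds()
    print(f"cluster {i}: {len(burst)} executions over {span:.1f}s")
```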

And here's the detail that really gets me: I check the predicted profit values for the cbBTC cycle. The number $0.5458 appears four times. The exact same number. Not approximately the same — identically the same, down to the fourth decimal place. That means for at least four consecutive evaluations, spanning roughly 33 seconds, my bot's view of this opportunity didn't change at all. The pool state it was computing against was frozen. The data was stale. My bot was making decisions based on a snapshot of reality that was at least half a minute old — which, in the world of Solana where slots tick every 400 milliseconds, might as well be a geological epoch.
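
That bit-for-bit repetition is itself a detectable signal. Here's a sketch of the heuristic I'm applying by eye, with hypothetical tuples standing in for the real log records:

```python
def frozen_spans(evals, min_repeats=3):
    """Yield (start_ts, end_ts, value, count) for runs of consecutive
    evaluations with an *identical* predicted profit. Fresh pool state
    almost never reproduces the same number to four decimal places,
    so long runs are a cheap staleness flag.
    `evals` is a time-sorted list of (timestamp_seconds, profit)."""
    i = 0
    while i < len(evals):
        j = i
        while j + 1 < len(evals) and evals[j + 1][1] == evals[i][1]:
            j += 1
        if j - i + 1 >= min_repeats:
            yield evals[i][0], evals[j][0], evals[i][1], j - i + 1
        i = j + 1

# Hypothetical stream: $0.5458 repeats four times across ~33 seconds.
evals = [(0, 0.5458), (11, 0.5458), (22, 0.5458), (33, 0.5458), (44, 0.5521)]
for start, end, value, n in frozen_spans(evals):
    print(f"${value} repeated {n}x over {end - start}s -- likely stale input")
```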

But it gets worse. The hoodrat cycle shows the same clustering behavior, except tighter. Seven consecutive executions in 71.4 seconds. That's one attempt every 10.2 seconds, like clockwork. And the predicted profits oscillate wildly: $0.29, then $1.55, then $0.19, then $1.48. The numbers are jumping around, which means the data IS updating — the pool state is changing between evaluations. The bot is seeing different numbers each time. It's not frozen. It's watching a moving target.

So... the data is fresh and the predictions are still wrong?

I put my coffee down. That's not the answer I expected.

The Anatomy of a Ghost

I need to understand what a phantom opportunity actually looks like from the bot's perspective. I pick the cleanest example: the JUP cycle. SOL to JUP to USDC back to SOL. Thirteen bundles submitted in two minutes — average interval 9.4 seconds. All thirteen failed.

I pull the evaluation logs for this cycle. There are 690 evaluate events generated during that two-minute window. Six hundred and ninety times, the screener looked at this cycle and computed a potential profit. Out of those 690 evaluations, I find 139 unique profit values. The data is clearly updating — you don't get 139 different numbers from stale data. The pool states are changing. The WebSocket subscriptions are delivering updates. The information flowing into the AMM math engine is, by every measurable standard, fresh.

Predicted profit starts at $0.163 and trends downward to $0.105 over those two minutes. That's a declining curve — which makes sense, because if a real opportunity exists, other traders will eat into it. The declining profit looks like a real market signal. My bot is watching what appears to be a legitimate opportunity slowly closing, and it's racing to capture whatever's left before it disappears entirely.

Except there's nothing to capture. All thirteen bundles come back dead. The opportunity isn't closing — it never existed. What my bot interprets as a shrinking-but-real profit window is actually the mathematical residue of a calculation error that happens to get smaller as pool states update. It's not a closing window. It's a mirage that shimmers less as you get closer.

I think about those carnival games on the midway — the ones where you throw the softball at the milk bottles stacked on a shelf. The bottles look normal. The ball feels normal. The throw feels good. But the bottles are weighted at the bottom, and the ball is slightly undersized, and the shelf is angled, and the whole thing is engineered so that it looks like you almost knocked them over every single time. You keep paying three dollars for three throws because each throw looks so close. You almost had it. The physics feel right. But the game is rigged — not by anyone malicious, just by a system whose parameters are slightly off from what you think they are.

My bot is throwing softballs at weighted bottles. The physics feel right. The math says it should work. But the parameters are off, and "almost profitable" is the same as "not profitable at all."

Stale Data: The Obvious Suspect

The first hypothesis is the obvious one: stale data. My bot is computing profits against pool states that don't reflect reality. The opportunity exists in my cache but not on-chain. Simple. Understandable. Fixable.

I start digging into the data freshness metrics, and I find what I'm looking for almost immediately. One of the pools in my most-traded cycles is a Meteora DLMM pool for the WEN token. I check its update history in my logs and find something horrifying: the same predicted-profit value, unchanged, for over 14 minutes straight. Fourteen minutes of completely frozen data. On Solana, where blocks are produced roughly every 400 milliseconds, 14 minutes is about 2,100 slots. My bot was making trading decisions based on pool state that was 2,100 blocks behind reality.

This is like trying to trade stocks using a ticker that's running on a 14-minute delay and not knowing it. You're looking at prices that feel live — they're on your screen, they're updating (they're not), the interface looks real-time — but you're watching the past. Every decision you make based on that data is already wrong by the time you make it.

I dig further. Why is this pool's data frozen? The answer is in my WebSocket subscription coverage. Out of 1,099 Meteora DLMM pools that my bot tracks, only 215 are subscribed via WebSocket. That's 19.5%. The other 884 pools — over 80% — are relying on HTTP polling for their state updates. And HTTP polling runs on an interval. Every 30 to 60 seconds, my bot sends an RPC request, gets the current state, and updates its cache.

Thirty to sixty seconds. In a market where opportunities appear and disappear in milliseconds. It's like trying to photograph hummingbirds with a camera that takes 30 seconds between shots. You'll get a lot of pictures of empty feeders.
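
One guard my bot doesn't have, sketched here with an assumed (not tuned) two-second budget: refuse to build a transaction if any pool in the cycle hasn't updated recently.

```python
import time

# Hard freshness budget (an assumption, not a tuned value): refuse to
# execute a cycle if any pool's cached state is older than this.
FRESHNESS_BUDGET_S = 2.0

def cycle_is_executable(pool_updated_at, now=None):
    """pool_updated_at: monotonic timestamps of the last state update
    for each pool in the cycle. One 14-minute-old pool poisons the
    whole three-hop calculation, so every pool must pass."""
    now = time.monotonic() if now is None else now
    return all(now - t <= FRESHNESS_BUDGET_S for t in pool_updated_at)

# Two fresh pools plus one pool last polled 840 seconds (14 min) ago.
now = time.monotonic()
print(cycle_is_executable([now - 0.3, now - 1.1, now - 840.0], now))  # False
```

It wouldn't make the data any fresher, but it would have silenced the frozen WEN pool instead of trading on it.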

So stale data is real. It's a legitimate, documented, measurable problem. Some of my pools are updating once a minute while the market moves thousands of times faster. Mystery solved, right? The bot sees ghosts because it's looking at old pictures. Fix the data freshness, fix the ghosts.

Except...

The Paradox

I run the numbers on data freshness across all my execution attempts. Not just the frozen WEN pool — all of them. I look at every cycle that generated an execution, and I check how often the predicted profit value changed between consecutive evaluations.

82.8% of profit predictions changed within 5 seconds.

I read that number three times to make sure I'm not hallucinating. Eighty-two point eight percent. That means for the overwhelming majority of my execution attempts, the underlying data was updating rapidly. The pool states were fresh. The WebSocket subscriptions were delivering real-time updates. The AMM math engine was receiving new inputs and producing new outputs multiple times per second.
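
For the record, here's one reasonable way to compute that metric from the evaluation log: take consecutive evaluations at most five seconds apart, and count a pair as changed if the predicted profit differs at all. The sample stream is hypothetical:

```python
def fraction_changed_within(evals, window_s=5.0):
    """Of all consecutive evaluation pairs at most `window_s` apart,
    the fraction where the predicted profit changed at all.
    `evals` is a time-sorted list of (timestamp_seconds, profit)."""
    pairs = [(a, b) for a, b in zip(evals, evals[1:]) if b[0] - a[0] <= window_s]
    if not pairs:
        return 0.0
    return sum(1 for a, b in pairs if a[1] != b[1]) / len(pairs)

# Hypothetical stream: mostly moving values, one frozen pair.
evals = [(0, 0.163), (2, 0.160), (4, 0.160), (6, 0.151), (8, 0.140)]
print(f"{fraction_changed_within(evals):.0%} changed within 5s")  # 75%
```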

The data isn't stale. Or rather — some of it is, but most of it isn't. The WEN pool is an outlier. For most of my cycles, the data pipeline is working exactly as designed. Fresh data in, fresh calculations out, fresh predicted profits displayed.

And still: 273 total executions across all my runs. Zero landings. Zero.

This is the moment where the ground shifts under my feet. The stale data hypothesis was comfortable. It was a clean explanation — bad inputs lead to bad outputs. Fix the inputs, fix the outputs. Simple cause and effect. The kind of problem that has a straightforward solution.

But if the data is fresh and the predictions are still wrong, then the problem isn't in the data pipeline. The problem is in the prediction itself. The math is getting good inputs and producing bad outputs. Which means the math is wrong.

I sit with that thought for a minute. The math is wrong. Not the data. The math.

I think about GPS again — the one from the previous episode that drives you into a lake. I said the map was wrong. But what if the map is perfectly accurate and the routing algorithm is broken? The GPS has a correct, up-to-date satellite image of every road. It knows where the lake is. It knows where the bridges are. But the algorithm that computes the route has a flaw in how it calculates distances, and it consistently picks paths that look shorter on paper but are actually impassable. The map is right. The logic is wrong.

That's scarier. A wrong map is easy to replace. A wrong algorithm is embedded in the bones of the system.

Peeling Back the Simulation

I need to understand exactly how my predictions diverge from reality. Fortunately, I have simulation data — before submitting a bundle, my bot can simulate the transaction against the current on-chain state to see what would happen. I haven't been paying close attention to the simulation results because the bot was running in production mode, just firing bundles. But the data is there in the logs.

I pull the simulation results for my failed executions. The breakdown:

67% fail with error code 6000 — InsufficientProfit. The on-chain router program has a guard that checks whether the final output of the cycle exceeds the input by enough to cover costs. Sixty-seven percent of the time, the answer is no. My bot predicts profit. The chain says no profit.

22% fail with error code 3007 — ExceededSlippageTolerance. The price moved between when my bot calculated the expected output and when the transaction tried to execute. Even though the data was "fresh" by my metrics — updated within the last few seconds — those few seconds were enough for the price to shift beyond the acceptable range.

Together, that's 89% of failures explained by a single root cause: my bot thinks there's more profit than actually exists. It's not a data timing issue. It's not a network latency issue. It's a mathematical overestimation issue. The screener is looking at real, current pool data and consistently computing a profit number that's higher than what the blockchain will actually deliver.
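
The breakdown itself is a few lines with a Counter. Here's the shape of it, with hypothetical records standing in for my real logs; only the error codes and their names are real:

```python
from collections import Counter

# Hypothetical simulation records pulled from the execution log:
# (cycle_id, error_code), with None meaning the simulation passed.
records = [
    ("cbBTC", 6000), ("cbBTC", 6000), ("hoodrat", 3007), ("JUP", 6000),
    ("JUP", 3007), ("WEN", 6000), ("hoodrat", None),
]

# Error names are from the on-chain router program's error table.
ERROR_NAMES = {6000: "InsufficientProfit", 3007: "ExceededSlippageTolerance"}

failures = [code for _, code in records if code is not None]
for code, n in Counter(failures).most_common():
    name = ERROR_NAMES.get(code, "other")
    print(f"{code} {name}: {n}/{len(failures)} ({n / len(failures):.0%})")
```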

How much higher?

I start comparing predicted profits to simulation results where I can get them. The screener predicts $0.14 profit on a cycle. The simulation says the cycle actually loses money after slippage and fees. The screener says +0.3%. The chain says -0.1%. Over and over, the same direction of error: the screener overestimates.

I run a statistical analysis on the divergence. The average overestimation is 47%.

Forty-seven percent.

My AMM math engine is telling me that, on average, 47% of every predicted profit is fictional. Nearly half the number on the screen doesn't exist. It's like a restaurant menu where every dish is listed at 47% below what you'll actually be charged. The menu looks great. The prices look reasonable. You order with confidence. And then the bill arrives at nearly double what you expected, and suddenly none of the meals were worth it.

Where 47% Goes to Die

Forty-seven percent overestimation. I need to find where those phantom percentage points come from. This is a math problem now — the kind where you take apart an equation term by term and figure out which components are contributing to the error.

The AMM math engine computes swap outputs for each hop in the cycle. For a three-hop cycle like SOL→USDC→BONK→SOL, it computes three swap outputs sequentially: how much USDC do I get for my SOL? How much BONK do I get for that USDC? How much SOL do I get for that BONK? Each computation uses the constant-product formula (or the concentrated liquidity formula for CLMM pools, or the bin-based formula for DLMM pools), and each one needs to account for fees, price impact, and slippage.
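
To make that structure concrete, here's the constant-product version of the three-hop chain in Python. The reserves and fees are invented for illustration, and real CLMM and DLMM pools need their own formulas:

```python
def swap_out(amount_in, reserve_in, reserve_out, fee):
    """Constant-product (x*y=k) swap output, fee taken on the input.
    Price impact is inherent: the larger the trade relative to the
    reserves, the worse the effective rate."""
    net_in = amount_in * (1.0 - fee)
    return net_in * reserve_out / (reserve_in + net_in)

def cycle_output(amount_in, hops):
    """Chain the hops: each swap's output is the next swap's input."""
    amount = amount_in
    for reserve_in, reserve_out, fee in hops:
        amount = swap_out(amount, reserve_in, reserve_out, fee)
    return amount

# Invented reserves for SOL -> USDC -> BONK -> SOL, one pool per hop.
hops = [
    (5_000.0, 750_000.0, 0.0025),   # SOL/USDC
    (900_000.0, 3.1e10, 0.0025),    # USDC/BONK
    (2.5e10, 5_000.0, 0.0030),      # BONK/SOL
]
start = 10.0  # SOL in
end = cycle_output(start, hops)
print(f"in {start} SOL, out {end:.4f} SOL, edge {end / start - 1:+.3%}")
```

That final edge number is, in spirit, what the screener computes, and it's only as good as the reserves fed into it.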

There are at least three places where overestimation can creep in.

First: fee calculation. Each pool charges a swap fee — typically 0.25% to 1% for standard AMM pools, but variable for concentrated liquidity pools. If my math underestimates the fee on any hop, the predicted output will be higher than reality. A 0.1% fee error on each of three hops compounds to roughly 0.3% total error — small, but when margins are already razor-thin, it's the difference between profit and loss.

Second: price impact. When you swap a large amount relative to the pool's liquidity, you move the price against yourself. The constant-product formula handles this inherently — the larger your swap, the worse your rate. But "large relative to the pool's liquidity" is a moving target. If the pool's liquidity has shifted since my last update (even by a little), my price impact calculation will be off. And concentrated liquidity pools make this worse, because liquidity isn't spread evenly across the price range — it's concentrated in specific ticks, and missing even one tick of liquidity data can dramatically change the price impact calculation.

Third: the compounding effect. Each hop's error feeds into the next hop's input. If hop 1 overestimates output by 2%, then hop 2 receives an inflated input, which produces an inflated output, which feeds into hop 3. The errors don't just add — they multiply. A 2% error per hop across three hops isn't 6% total error. It's closer to 6.1% due to compounding. And when the base margins are 0.1% to 0.5%, a 6% calculation error means the bot sees profit where there's actually a loss.
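
Both of those compounding numbers, the 0.3% from fee errors and the 6.1% from per-hop output errors, fall out of the same one-liner:

```python
def compounded_error(per_hop_error, hops=3):
    """Per-hop relative errors multiply through the cycle; they don't add."""
    return (1.0 + per_hop_error) ** hops - 1.0

print(f"{compounded_error(0.001):.4%}")  # 0.1% fee error per hop -> 0.3003%
print(f"{compounded_error(0.02):.4%}")   # 2% output error per hop -> 6.1208%
```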

I'm sitting here staring at the math, and I'm realizing something deeply uncomfortable: this is not a bug in the traditional sense. There's no line of code that's wrong. There's no off-by-one error, no misplaced decimal, no logic flaw. The formulas are correct. They're textbook AMM math. They produce correct outputs for the inputs they receive.

But the inputs are approximations of reality, and the formulas assume precision. The gap between approximate input and precise reality is where the 47% lives. Each tiny imprecision — a fee rate that's 0.05% off, a liquidity distribution that's slightly stale, a tick boundary that was crossed between my data fetch and the on-chain execution — adds a little bit of phantom profit. Individually, each imprecision is negligible. Collectively, they're catastrophic.

It's like measuring the distance from New York to Los Angeles with a foot-long ruler that's off by one thirty-second of an inch. For a single measurement, the error is invisible. Laid end to end across the nearly thirteen million foot-lengths between the two cities, those invisible errors add up to more than six miles. My bot is making thousands of tiny measurements, each one imperceptibly wrong, and the accumulated error is 47% of the total predicted profit.

The Cruelest Kind of Wrong

There's a taxonomy of software failures that every developer learns, usually the hard way. At the bottom — the easiest to fix — are crashes. The program stops working and tells you why. Above that are logic errors — the program runs but does the wrong thing, and you can see the wrong output. Above that are race conditions — the program works sometimes and fails sometimes, depending on timing you can't control.

But at the very top of the pyramid — the hardest, cruelest, most maddening category — is the program that works correctly and produces wrong answers. Not because the code has a bug. Not because the logic is flawed. But because the model of reality that the code implements is an imperfect approximation, and the imperfection accumulates to the point where the output is meaningless.

This is where I am.

My code is correct. Every function does what it's supposed to do. Every formula matches the textbook. Every test passes. If you give my AMM math engine the exact pool state from the blockchain at the exact moment of execution, it will compute the correct swap output. I've verified this.

But I can never give it the exact pool state at the exact moment of execution. I can give it the pool state from 200 milliseconds ago. Or 2 seconds ago. Or 30 seconds ago, if the pool is on HTTP polling instead of WebSocket. And in those 200 milliseconds, other traders have swapped through the pool. Liquidity providers have adjusted their positions. The price has moved. The reality my code models has diverged from the reality the blockchain enforces.

The code is correct. The answers are wrong.

I think about weather forecasting. Modern weather models are mathematically rigorous — they solve the Navier-Stokes equations, they incorporate satellite data, they run on supercomputers that process billions of calculations per second. The physics is right. The math is right. The code is right. And yet, the forecast for next Tuesday is wrong. Not because anyone made a mistake, but because the atmosphere is a chaotic system where tiny measurement errors in initial conditions amplify exponentially over time. You can have perfect math and perfect code and still get the wrong answer, because the inputs are imperfect and the system amplifies imperfection.

That's my bot. Perfect math. Imperfect inputs. And a system — the DEX ecosystem on Solana — that amplifies every imperfection until my predicted profit of $0.14 becomes an on-chain loss.

What Ten Hours of Ghost-Chasing Looks Like

I go back to the hoodrat cycle — the one that ran for 10.2 hours — and I reconstruct the timeline minute by minute. Not because I need to. Because I can't stop myself. I need to see the full shape of the failure.

Hour 1: The bot identifies the cycle as profitable. Predicted profit: $0.29. It builds a transaction, submits a bundle, gets back on_chain_not_found. The opportunity evaluation cooldown kicks in — a brief pause before it tries again. But the cycle keeps showing up as profitable in the next scan, so the bot queues another execution.

Hour 3: Still going. The predicted profit has jumped to $1.55. That's a huge number — if real, it would be one of the most profitable cycles I've ever seen. My bot's excitement (if code can be excited) is justified. A dollar fifty-five of profit on a single three-hop cycle is extraordinary. It submits another bundle. on_chain_not_found.

Hour 5: Profit prediction drops to $0.19. The number has fallen by 87% from the hour-3 peak. To my bot, this looks like the opportunity is closing — other traders are eating into the spread. Rational behavior would be to give up. But $0.19 is still above the minimum profit threshold, so the bot tries again. Failed.

Hour 7: Profit shoots back up to $1.48. Wait — what? The opportunity that looked like it was closing is suddenly wide open again? That doesn't make sense if real traders are arbitraging it away. Real opportunities don't bounce back like this. They close and stay closed. The oscillation between $0.19 and $1.48 is a signal that something is fundamentally wrong with the underlying data or math, but my bot doesn't have that kind of meta-reasoning. It sees $1.48 and says "profitable!" and fires another bundle into the void.

Hour 10: The seventh consecutive execution in a 71.4-second window. Ten-second intervals. The bot is in a frenzy — it's found what it believes is a rich vein of opportunity and it's mining as fast as it can. Seven bundles in just over a minute. Seven tips paid to Jito validators. Seven transactions constructed, serialized, signed, submitted, and silently discarded.

And then the cycle finally falls below the profit threshold, or the consecutive failure filter kicks in hard enough to suppress it, and the bot moves on. Ten hours. Fifteen attempts. Zero results. My bot spent almost half a day chasing the same phantom, and I didn't even know it was happening because I was asleep for most of it.

This is the part that chills me. While I was sleeping, my bot was running. I trusted it to make good decisions. I trusted its math. I trusted its data pipeline. And it spent ten hours doing the MEV equivalent of running on a treadmill — exerting maximum effort, burning resources, going absolutely nowhere. If I hadn't pulled the logs and built that spreadsheet, I might never have known. I would have woken up, seen "15 executions, 0 landings" in the summary, attributed it to bad luck or high competition, and moved on. The phantom would have remained invisible.

But I did pull the logs. And now I see it. And now I see it everywhere.

The Full Horror

Once I know what to look for, the phantoms are everywhere. I expand my analysis to every cycle that triggered more than five executions.

The cbBTC cycle: 47 attempts over 43 hours. Forty-three hours. That's almost two full days. My bot chased this single phantom opportunity for two days straight, sending 47 bundles, paying 47 tips, and receiving 47 rejections. The fifth cluster alone — 17 executions in two minutes — represents a burst of frantic activity that, from the bot's perspective, must have looked like hitting the jackpot. Seventeen opportunities in two minutes! But from reality's perspective, it's seventeen letters to Santa.

I count the total: across all phantom cycles, 273 execution attempts over the analysis period. Two hundred and seventy-three bundles submitted. Two hundred and seventy-three tips paid. Two hundred and seventy-three transactions signed, serialized, and broadcast to the network. Every single one of them chasing something that didn't exist.

It's like being a door-to-door salesman who knocks on 273 doors and every single one of them is a vacant house. The address is in the database. The house number is on the mailbox. The listing says someone lives there. But when you knock, nobody answers, because nobody's home. Nobody has been home for a long time.

The difference is that a salesman, after maybe ten empty houses in a row, would stop and check the database. My bot doesn't have that instinct. It trusts its data implicitly. If the math says "profit," the bot says "execute." There's no skepticism layer. No meta-reasoning that asks "I've tried this cycle fifteen times and failed every time — maybe the cycle isn't real?" The bot has no concept of suspicion.

I built a system that's maximally trusting of its own calculations, and I deployed it into an environment where its calculations are consistently wrong. It's like teaching someone to navigate purely by compass and then dropping them in an area with massive magnetic anomalies. The compass is a precision instrument. It's well-calibrated. It's functioning perfectly. But the ground underneath is full of iron, and the needle points everywhere except north.
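
What would suspicion even look like in code? Probably something as simple as per-cycle exponential backoff: each consecutive on-chain failure doubles how long the cycle is ignored, no matter what the math claims. This is a sketch of the idea, not something the bot has today:

```python
from collections import defaultdict

class CycleSuppressor:
    """Sketch of a 'skepticism layer': per-cycle exponential backoff.
    Each consecutive on-chain failure doubles how long the cycle is
    muted, whatever the screener says. A landed bundle resets it."""

    def __init__(self, base_s=10.0, cap_s=3600.0):
        self.base_s, self.cap_s = base_s, cap_s
        self.failures = defaultdict(int)  # cycle_id -> consecutive fails
        self.muted_until = {}             # cycle_id -> unmute timestamp

    def allow(self, cycle_id, now):
        return now >= self.muted_until.get(cycle_id, 0.0)

    def record_failure(self, cycle_id, now):
        self.failures[cycle_id] += 1
        delay = min(self.base_s * 2 ** (self.failures[cycle_id] - 1), self.cap_s)
        self.muted_until[cycle_id] = now + delay

    def record_landing(self, cycle_id):
        self.failures.pop(cycle_id, None)
        self.muted_until.pop(cycle_id, None)

suppressor = CycleSuppressor()
for _ in range(15):                       # fifteen straight failures...
    suppressor.record_failure("hoodrat", now=0.0)
print(suppressor.muted_until["hoodrat"])  # 3600.0 -- muted for an hour
```

Fifteen failures in, the hoodrat cycle would be muted for an hour instead of retried every ten seconds.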

The Questions I Can't Answer Yet

It's afternoon now. I've been at this for six hours — since the moment I woke up and opened the logs. My coffee is cold. My spreadsheet has 47 tabs. I know things I didn't know this morning, and what I know is that the problem is deeper than I thought.

The stale data problem is real but partial. Some pools are on 30-to-60-second polling cycles and that's clearly too slow. But most of the data is fresh, and the predictions are still wrong. Fixing the polling won't fix the phantoms.

The AMM math overestimation is the bigger monster. Forty-seven percent average overestimation means my bot's entire world model is distorted. It thinks opportunities are nearly twice as profitable as they actually are. It's operating in a fantasy version of the market where spreads are wider, fees are lower, and price impact is gentler than reality. Every decision it makes is based on this fantasy.

And I don't know yet how to fix it. The formulas are correct in isolation. The error comes from the gap between the model and reality — the accumulated imprecision of approximate inputs flowing through exact equations. Making the inputs more precise will help, but by how much? If I reduce the data staleness from seconds to milliseconds, does the overestimation drop from 47% to 5%? To 20%? To 45%? I don't know. The relationship between input freshness and prediction accuracy isn't linear. It might not even be monotonic.

There's a possibility I haven't fully confronted yet: what if the fundamental approach is flawed? What if computing swap outputs from cached pool states and comparing them across a multi-hop cycle is inherently too imprecise for the margins available in Solana MEV? What if the only way to know whether an opportunity is real is to simulate the transaction on-chain, and by the time you get the simulation result, the opportunity is gone? What if the phantom problem isn't solvable within the architecture I've built?

I don't think that's the case. I think the math can be made more accurate. I think the data can be made more fresh. I think the gap between prediction and reality can be narrowed enough that the bot stops chasing ghosts and starts catching real opportunities. But I'm not sure. And "not sure" is a hard place to stand when you've just watched your system fail 273 consecutive times.

The 0% landing rate from the first runs was demoralizing, but it felt diagnosable. Eight problems. Fix them one by one. Rebuild. Try again. It felt like work — hard work, but manageable work.

The phantom opportunity problem feels different. It feels like the system is lying to me, and I'm not sure which part of the system is doing the lying, and I'm not sure the system even knows it's lying. The code is correct. The data is mostly fresh. The math is textbook. And the answers are wrong by 47%.

The code is correct. The data is fresh. The math is textbook.

And the answers are wrong.

How do you fix correct code that produces wrong answers? How do you debug a system that works perfectly but believes in ghosts? I'm sitting here with 43 hours of phantom-chasing data spread across my screen, and I genuinely don't know.
