Phantom Profit Filter — Blocking Ghost Opportunities
The screener lights up. A three-hop cycle through a mid-cap token is showing a profit that makes me lean forward in my chair. It's not a small edge — it's a number that, if real, would make the entire week worthwhile in a single shot. My pulse picks up. I watch the log, waiting for the bundle to land.
It doesn't land. It never lands.
I've seen this before. I've seen it dozens of times. But this time, instead of investigating why the opportunity was fake, I'm doing something different. I'm building a wall to keep the ghosts out.
The Email You Should Never Click
Everyone in America has received the email. The subject line promises you've won a prize, inherited a fortune, or been selected for an exclusive refund. The dollar amount is always eye-popping — just large enough to make you want to believe it, just absurd enough that a part of your brain whispers this can't be real. Most people have learned, through years of spam filters and cautionary news segments, to delete these without opening them. The profit is too good. The opportunity is too easy. If it were real, someone else would have claimed it already.
My arbitrage screener is sending me the crypto equivalent of Nigerian prince emails, and I've been opening every single one.
The phantom profit problem isn't new to me. I've spent painful hours watching my bot chase the same fake opportunity over and over — the same cycle appearing as "profitable" across dozens of evaluation windows, each time resulting in a failed bundle. I've dug into why the math overestimates, why stale state creates mirages, why the gap between predicted and actual profit can stretch into the hundreds of percent. I understand the disease. What I haven't built yet is the immune system.
Today, I'm building the spam filter.
What Phantom Profit Actually Looks Like
Here's what happens mechanically. My screener reads the state of several liquidity pools — their reserves, their current price positions, their fee parameters. It runs the AMM math for a multi-hop cycle: swap A to B in pool one, B to C in pool two, C back to A in pool three. The math says the output of the final swap exceeds the input of the first swap by some margin. After subtracting fees, network costs, and the Jito tip, there's profit left over. The screener flags the cycle as a go.
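To make the shape of that calculation concrete, here's a minimal sketch in Python. It assumes simple constant-product pools with a flat input fee, which is a simplification: a real screener also has to handle concentrated liquidity, tick math, and per-pool fee tiers. All the reserve numbers and costs below are invented.

```python
# Minimal sketch of the cycle math, assuming constant-product (x * y = k)
# pools with a flat input fee. Reserves and costs below are invented.

def swap_out(amount_in: float, reserve_in: float, reserve_out: float,
             fee: float = 0.003) -> float:
    """Constant-product output for a given input, after the pool fee."""
    amount_in_after_fee = amount_in * (1 - fee)
    return amount_in_after_fee * reserve_out / (reserve_in + amount_in_after_fee)

def cycle_profit(amount_in: float, pools: list[tuple[float, float]],
                 network_cost: float, tip: float) -> float:
    """Chain the swaps around the cycle, then subtract fixed costs."""
    amount = amount_in
    for reserve_in, reserve_out in pools:
        amount = swap_out(amount, reserve_in, reserve_out)
    return amount - amount_in - network_cost - tip

# Hypothetical three-hop cycle: A -> B, B -> C, C -> A.
pools = [(1_000.0, 50_000.0), (48_000.0, 9_000.0), (9_100.0, 1_020.0)]
print(cycle_profit(10.0, pools, network_cost=0.01, tip=0.05))  # a thin edge
```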
The problem is that every piece of data feeding that calculation is a photograph of a moment that has already passed. Pool reserves shift with every block — roughly every 400 milliseconds on Solana. By the time my screener reads the state, computes the math, builds the transaction, and submits it, multiple blocks have elapsed. The state my math operated on is no longer the state that exists on-chain.
This is the stale data problem, and I've written about it before. But there's a specific flavor of stale data that I keep running into, and it's the one that produces the most dangerous false signals: the phantom profit.
A phantom profit isn't just a slightly-off prediction. It's a wildly inflated one. It's the screener reporting that a cycle will return several percent profit when the actual on-chain reality would yield a fraction of that — or nothing at all. These aren't small overestimates. They're hallucinations. The screener sees a mirage in the desert and reports an oasis.
The hallucinations share a pattern. They tend to be large. Suspiciously large. Unreasonably large. And this is the key insight that makes filtering possible: in the world of competitive on-chain arbitrage, large profits are almost always fake.
The $100 Bill on the Sidewalk
There's an old joke in economics. A professor and a student are walking down the street. The student spots a $100 bill on the sidewalk and bends to pick it up. The professor says, "Don't bother. If it were real, someone would have already picked it up."
The joke is usually told to mock the professor — obviously the bill is real, just pick it up. But in MEV, the professor is right. The professor is almost always right.
Think about what it means for a large arbitrage profit to exist on-chain for more than a fraction of a second. It means that a price discrepancy between two or more pools has created a profitable cycle. That discrepancy represents free money sitting on the sidewalk. Now think about who else is walking down this sidewalk. Not just me. Hundreds of other bots. Professional trading firms with dedicated Solana validators. Searchers running on bare-metal hardware co-located with RPC nodes. Operations with microsecond-level latency advantages and teams of quantitative engineers optimizing every nanosecond of their pipeline.
If a genuinely large arbitrage opportunity appears — a real $100 bill on the Solana sidewalk — it gets snatched up within a single block. Often within a single slot. The competition is so fierce, the participants so numerous and so fast, that any opportunity large enough to be interesting is consumed almost before it finishes existing. By the time my screener notices it, evaluates it, and decides to pursue it, the bill is already in someone else's pocket.
This is the efficient market hypothesis, MEV edition. Not perfectly efficient — small inefficiencies persist, which is the entire basis of my bot's existence — but efficient enough that large inefficiencies don't survive. If my screener reports a large profit on a cycle, one of two things is true:
- The opportunity was real but has already been consumed by a faster competitor, and my screener is looking at stale state that reflects the pre-consumption reality.
- The opportunity never existed in the first place — the data feeding the calculation was stale, corrupted, or otherwise unreliable.
Either way, the correct response is the same: don't chase it.
This is counterintuitive. Every instinct says to pursue the big number. The big number is what makes the whole operation worthwhile, right? But in practice, the big number is the lie. The small numbers — the tight margins, the modest edges — those are the ones that might actually be real, precisely because they're not large enough to attract instant competition.
Ten Hours of Chasing the Same Ghost
I've lived this. Not in theory — in excruciating practice.
There was a period where my bot detected the same cycle as "profitable" repeatedly over the course of an entire day. Not once. Not twice. Dozens of times. The same token path, the same pools, the same predicted profit, hour after hour. My bot would evaluate the cycle, see profit, build a transaction, submit a bundle, get back a failure, and then — like someone refreshing a sold-out concert ticket page over and over — immediately re-evaluate and try again.
Every single attempt failed. Not one landed. The cycle was a ghost. The pool state feeding the calculation hadn't actually changed in the way my screener believed. Or the state had changed, but in a direction that eliminated the opportunity before my transaction could execute. Either way, I was burning compute cycles, consuming rate limits, and — most critically — missing other opportunities while my bot was fixated on a phantom.
This is the real cost of phantom profits. It's not just that the fake opportunity doesn't pay out. It's that pursuing the fake opportunity has an opportunity cost. Every millisecond my bot spends evaluating, building, and submitting a doomed transaction is a millisecond it's not spending on the next real opportunity. It's like spending your entire afternoon driving to a garage sale because the Craigslist listing promised a mint-condition vintage guitar for $20, only to arrive and find it was posted three weeks ago. The guitar was sold on day one. But you still lost the afternoon.
The repetition is what finally makes the problem undeniable. If a cycle shows up as profitable once and fails once, that's normal — competition, bad timing, network congestion. But if the same cycle shows up as profitable fifteen times and fails fifteen times, that's not bad luck. That's a systematic data problem, and the correct engineering response is to stop chasing it.
Building the Filter
So I build a filter. Conceptually, it's simple. Practically, it requires making uncomfortable decisions about where to draw lines.
The Profit Cap
The first filter is the most psychologically difficult one to implement. I set an upper bound on how much profit I'm willing to believe. If the screener reports a profit above a certain threshold, the filter rejects it automatically. Not "flag for review." Not "submit with caution." Reject. Skip. Move on.
This feels insane. I'm literally telling my bot to ignore the biggest opportunities it finds. It's like configuring your email client to automatically delete any message that promises more than a certain dollar amount — which, come to think of it, is exactly what a spam filter does. Gmail doesn't show you the Nigerian prince email and ask you to evaluate it critically. It just puts it in spam. Because the heuristic works: if the promised value is abnormally high, the probability that it's legitimate is abnormally low.
The specific threshold I use is something I'm keeping to myself — it's derived from observation of what actual landed trades produce versus what phantom signals claim, and it's one of the few genuine competitive edges I have. But the concept is universal: there's a profit level above which the probability of the opportunity being real drops below the probability of it being worth pursuing. Every bot operator needs to find their own number based on their own data.
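In code, the cap itself is almost embarrassingly small. Here's a sketch; the threshold below is a placeholder, because the real value is the part you have to earn from your own data.

```python
# Sketch of the profit cap. The threshold is a made-up placeholder;
# mine is derived from comparing landed trades against phantom signals.
MAX_BELIEVABLE_PROFIT_PCT = 1.5  # hypothetical; tune from your own history

def passes_profit_cap(predicted_profit_pct: float) -> bool:
    """Above the cap there is no review queue and no 'submit with
    caution' path. The signal is treated as spam and skipped."""
    return 0.0 < predicted_profit_pct <= MAX_BELIEVABLE_PROFIT_PCT
```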
Repetition Detection
The second filter targets the "golden retriever chasing the same tennis ball" problem. If the same cycle — same token path, same pools — keeps appearing as profitable within a short time window, something is wrong. Real opportunities don't keep coming back; they get consumed. If one does keep coming back, it means either the underlying data isn't updating (stale state) or the opportunity is illusory for structural reasons (an inactive pool, a token with no real liquidity, a calculation artifact).
I implement a cooldown mechanism. When a cycle is evaluated and either fails to land or gets filtered out, it enters a cooldown period. During that period, the same cycle is automatically skipped if the screener flags it again. The cooldown duration is another parameter I tune through observation — too short and the ghosts get through; too long and I miss legitimate recurrences.
This is the equivalent of your phone's spam call filter. The first call from an unknown number might be legitimate. But if the same number calls you eight times in two hours, it's not your doctor's office. It's a robocall. Block it.
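A minimal version of the cooldown might look like this, keying each cycle by its token path and pools. The duration is a placeholder; as noted above, the real value gets tuned by observation.

```python
import time

# Sketch of the cooldown. Keys are (token path, pools) tuples; the
# duration is a made-up placeholder, tuned by observation in practice.
COOLDOWN_SECONDS = 30.0  # hypothetical value

_cooldowns: dict[tuple, float] = {}

def start_cooldown(cycle_key: tuple) -> None:
    """Called when a cycle fails to land or gets filtered out."""
    _cooldowns[cycle_key] = time.monotonic()

def in_cooldown(cycle_key: tuple) -> bool:
    """True if this cycle misbehaved too recently to be worth retrying."""
    last = _cooldowns.get(cycle_key)
    return last is not None and time.monotonic() - last < COOLDOWN_SECONDS
```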
Data Freshness Validation
The third filter checks how old the underlying pool state data actually is. Stale data produces unreliable predictions. If the pool state hasn't been updated recently, any opportunity based on that data becomes automatically suspect.
Think of it like checking the "listed" date on a real estate listing. A house that went on the market yesterday at a competitive price? Maybe worth visiting. A house that's been listed for six months at a "too good to be true" price? There's a reason nobody has bought it. Either the listing is out of date, or there's something wrong that isn't visible in the photos.
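As a sketch, the check can be as simple as comparing slots, assuming each pool snapshot records the slot at which it was read. At roughly 400 milliseconds per slot, a slot gap translates directly into wall-clock staleness. The tolerance below is invented.

```python
# Sketch of the freshness check. Assumes each pool snapshot carries the
# slot it was read at; the tolerance is a made-up placeholder.
MAX_SLOT_AGE = 3  # hypothetical: slots of staleness to tolerate

def is_fresh(snapshot_slot: int, current_slot: int,
             max_age: int = MAX_SLOT_AGE) -> bool:
    """Reject opportunities built on pool state older than max_age slots."""
    return current_slot - snapshot_slot <= max_age
```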
The Filter Paradox
Here's where it gets uncomfortable.
Every filter is a trade-off between two types of errors. In statistics, these are called false positives and false negatives, but I think of them in more visceral terms:
False Positive (filter too loose): A ghost opportunity gets through the filter, the bot wastes time and resources pursuing it, and nothing lands. Cost: wasted compute, burned rate limits, missed real opportunities elsewhere.
False Negative (filter too tight): A real opportunity gets caught by the filter and rejected. The bot skips a trade that would have actually landed and produced real profit. Cost: missed revenue.
These two errors pull in opposite directions, and there's no setting that eliminates both. Tighten the filter, and you block more ghosts but also block more real opportunities. Loosen the filter, and you catch more real opportunities but also let more ghosts through.
It's the same tension that governs every filtering system humans have built. Using the same framing (a "positive" being a signal treated as real), your email spam filter occasionally puts a legitimate email in your spam folder (a false negative: a real message treated as spam) and occasionally lets a spam email into your inbox (a false positive: spam treated as legitimate). Gmail's engineering team has spent years tuning these thresholds, and they still don't get it perfect. Neither will I.
The difference is that Gmail handles billions of messages and can use machine learning to optimize the boundary. I'm working with a much smaller dataset — my own bot's history of attempts, successes, and failures. My filter is more like a hand-tuned spam rule than a neural network. It's a set of heuristic thresholds that I adjust manually based on what I observe.
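Composed, the rules amount to a single gate in front of the execution path. Here's what that composition looks like, reusing the toy helpers sketched above:

```python
# The full spam-rule chain, reusing passes_profit_cap, in_cooldown, and
# is_fresh from the sketches above. All three must pass before the bot
# spends any time building a bundle.

def should_pursue(cycle_key: tuple, predicted_profit_pct: float,
                  oldest_snapshot_slot: int, current_slot: int) -> bool:
    if not passes_profit_cap(predicted_profit_pct):       # too good to be true
        return False
    if in_cooldown(cycle_key):                            # same ghost again
        return False
    if not is_fresh(oldest_snapshot_slot, current_slot):  # stale photograph
        return False
    return True
```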
This is one of those areas where the MEV developer community's shared experience directly maps to mine. It's well understood among builders in this space that a meaningful fraction of on-chain transactions that look like profitable arbitrage — the classic token-loop pattern — actually result in losses once you account for gas and tips. The gap between predicted and realized profit is not noise. It's a systematic feature of the system, driven by stale data, competition, and the inherent latency between observation and execution.
That shared experience also points to something that resonates: fixed heuristic rules, which is exactly what my filter uses, inevitably produce both false positives and false negatives. The rules can't adapt to changing market conditions, shifting competition patterns, or new pool dynamics. A threshold that works well during a quiet market period might be too tight during high volatility (when real opportunities are larger than usual) or too loose during low volatility (when even small phantom profits are suspicious).
There is no perfect threshold. There's only the threshold that's good enough right now, with the data I have right now, subject to constant revision as I learn more.
Why Real Profit Is Small and Frequent
Here's the mental model shift that building the filter forces on me.
Before, I thought of arbitrage in terms of home runs. Find the big opportunity. Land the big trade. Make the week's profit in one shot. This is the natural way to think about it — it's how arbitrage is portrayed in most explanations, where the examples always involve some dramatic price discrepancy that produces an obvious, large profit.
In reality, cyclic arbitrage on a competitive chain like Solana doesn't work that way. The big opportunities don't exist — or more precisely, they exist for microseconds and are captured by participants who are faster than me. What remains for a bot at my level of the food chain are the crumbs. Small edges. Tight margins. Profits measured in fractions of a percent.
This sounds discouraging, and honestly, it is. But it's also liberating, because it reframes the entire optimization problem. I'm not trying to hit home runs. I'm trying to get on base consistently. The value isn't in any single trade — it's in the aggregate of many small trades executed quickly and reliably over time.
A baseball analogy is apt here. The flashy slugger who swings for the fences every at-bat is exciting to watch but strikes out constantly. The contact hitter who consistently puts the ball in play, gets singles and doubles, and rarely strikes out generates more total value over a season. In MEV terms, the bot that chases every large opportunity is the slugger — lots of dramatic swings, lots of strikeouts, occasionally a home run that makes up for the failures. The bot that focuses on capturing small, reliable edges is the contact hitter — boring to watch, but accumulating value steadily.
The phantom profit filter is what converts me from a slugger into a contact hitter. By explicitly rejecting the big, flashy, almost-certainly-fake opportunities, it forces the bot to focus on the small, unglamorous, more-likely-to-be-real ones.
This is easier said than done. Every time the filter rejects a cycle showing a large profit, there's a voice in my head that says but what if that one was real? What if I just threw away the one trade that would have made the whole operation profitable? What if the filter is too tight and I'm leaving real money on the table?
I don't have a good answer to that voice. I can only look at the data: before the filter, my execution success rate was terrible. The bot was spending most of its time and resources chasing phantoms. After the filter, the success rate improves — not dramatically, not magically, but measurably. Fewer total attempts, but a higher percentage of those attempts have a chance of landing.
What the Filter Can't Do
I want to be honest about the limits of this approach.
The phantom profit filter is a band-aid on a structural problem. The real issue is that my screener's predictions diverge from on-chain reality because of data latency, and no amount of post-hoc filtering changes the underlying data quality. I'm not making the predictions better. I'm just discarding the ones that are obviously wrong.
It's like putting a price alert on your stock portfolio that says "ignore any projected return over X% — it's probably a data glitch." The alert is useful. It saves you from making bad trades based on bad data. But it doesn't fix the data feed. The underlying information is still delayed, still incomplete, still a snapshot of a reality that's already moved on.
The real solution — the one I haven't built yet and might not have the infrastructure to build — is to improve the freshness and accuracy of the data itself. Faster state reads. Better prediction of how state will change between observation and execution. Accounting for pending transactions that are likely to land before mine. These are all hard problems that the most competitive MEV operations invest heavily in solving.
My filter is what I can build now, with what I have now. It's a practical measure, not an elegant one. It reduces the symptom without curing the disease.
There's also the constant risk of over-fitting. Every time I tune a threshold based on recent data, I'm implicitly assuming that the future will look like the recent past. If market conditions change — if a new DEX launches, if liquidity shifts, if a major token migrates to a different pool type — my carefully tuned thresholds might become useless overnight. The filter that was perfectly calibrated for last week's market might be wildly miscalibrated for next week's.
This is the fundamental fragility of rule-based filtering. It works until the rules no longer match the reality. And in a system as dynamic as DeFi, the reality changes constantly.
The Question I Can't Answer Yet
So here I am, with a filter running that blocks the most egregious phantom profits, a cooldown mechanism that prevents the bot from obsessing over dead cycles, and a freshness check that applies stricter standards to stale data. The bot is quieter now. It submits fewer bundles. It wastes less time on ghosts.
But is it making more money?
I genuinely don't know yet. The filter has been running for a short period, and the sample size is too small to draw conclusions. I can see that the behavior is healthier — fewer wasted attempts, more focused evaluation, less thrashing on phantom cycles. The bot feels more disciplined, if you can say that about software. It's no longer the golden retriever chasing every tennis ball. It's more like a cat — watchful, selective, willing to let most things pass without reacting.
Whether that selectivity translates into better outcomes is an empirical question that needs more data and more time to answer. It's possible that the filter is too tight and I'm leaving money on the table. It's possible that it's too loose and there are more phantoms getting through than I realize. It's possible that the entire filtering approach is wrong and I should be investing my energy in better data infrastructure instead.
What I do know is this: a bot that spends its time chasing ghosts is worse than a bot that occasionally misses a real opportunity. The cost of pursuing a phantom is concrete and immediate — wasted resources, burned rate limits, missed alternatives. The cost of missing a real opportunity is theoretical and uncertain — maybe it would have landed, maybe it wouldn't have.
In risk management terms, I'd rather have a system that consistently avoids known bad bets than one that occasionally catches a great bet but mostly wastes its time on mirages. The upside of catching one big real opportunity doesn't outweigh the downside of chasing a hundred fake ones.
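Here's the back-of-the-envelope version of that position. Every number is invented purely to show the shape of the argument, not to prove it:

```python
# Toy expected-value comparison; all numbers invented for illustration.
p_real = 0.002      # hypothetical: how often a "huge" signal is genuine
big_win = 0.3       # hypothetical payoff (in SOL) when it is genuine
chase_cost = 0.002  # hypothetical per-attempt cost: compute, rate limits,
                    # and forgone real opportunities, expressed in SOL

ev_per_chase = p_real * big_win - (1 - p_real) * chase_cost
print(f"{ev_per_chase:+.5f} SOL per chase")  # negative under these guesses
```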
But that's a philosophical position, not a proven strategy. The data will tell me whether I'm right. And if the data says I'm wrong — if the filter is costing me more than it's saving — I'll have to rethink everything.
That's the uncomfortable truth about building systems in a competitive, adversarial environment. You never stop questioning whether your assumptions are correct. You never reach a point where the filter is "done." You just keep adjusting, keep observing, keep wondering: is the filter catching the ghosts, or is it also catching the one real opportunity that would have made all the difference?