Solana TX Byte Anatomy
I'm staring at a raw transaction as a hex dump. Two hundred and forty-six characters of seemingly random letters and numbers. Somewhere inside this blob is a signature, a list of accounts, a set of instructions, and the entirety of a swap operation. I've been building transactions for months, but I've never actually cracked one open and looked at the individual bytes. Today I'm doing exactly that — tearing a Solana transaction down to its atoms.
The reason is practical. My bot keeps bumping into a wall, and the wall is exactly 1,232 bytes tall.
The Carry-On Bag
If you've ever flown a budget airline in the United States, you know the drill. There's a metal sizer at the gate — a rigid frame that your carry-on bag must fit inside. It doesn't matter that your bag is soft-sided and could technically squish. It doesn't matter that you've flown a hundred times with a bag this size. If it doesn't fit the frame, it doesn't board the plane. No exceptions, no negotiations, no upgrades. The frame is the frame.
Solana transactions have a frame, and it measures exactly 1,232 bytes. Not 1,233. Not "roughly 1,200." Exactly 1,232. And unlike airline carry-on policies, which seem to vary by gate agent mood and lunar phase, this limit is enforced with absolute mechanical precision. A transaction that's 1,232 bytes gets processed. A transaction that's 1,233 bytes gets rejected. The network doesn't care about the contents.
The number itself isn't arbitrary. It comes from the physical constraints of the internet's plumbing.
IPv6, the successor to IPv4, defines a minimum Maximum Transmission Unit (MTU) of 1,280 bytes. This is the smallest packet size that every IPv6-compliant link is required to carry without fragmentation. Packets at or below this size are guaranteed to get through in one piece. Packets above it might get through, or might need fragmenting, depending on the path. So 1,280 bytes is the largest size you can count on being delivered as a single unit everywhere.
But you don't get all 1,280 bytes for your data. Every network packet needs headers — the addressing information that tells routers where to send it. An IPv6 header takes 40 bytes. An additional fragment header takes 8 bytes. That's 48 bytes of overhead before a single byte of actual payload data.
1,280 minus 48 equals 1,232.
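Written out as a quick sanity check, the subtraction looks like this. The constant names are mine, chosen for illustration; only the values come from the text above.

```python
# Constant names are illustrative; the values are the ones quoted above.
IPV6_MIN_MTU = 1280     # smallest MTU every IPv6 link must support
IPV6_HEADER = 40        # fixed IPv6 header
FRAGMENT_HEADER = 8     # fragment header, as counted above

MAX_TX_SIZE = IPV6_MIN_MTU - IPV6_HEADER - FRAGMENT_HEADER
print(MAX_TX_SIZE)
```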
That's it. That's the origin of Solana's transaction size limit. It's not a design decision that some engineer made while weighing trade-offs on a whiteboard. It's a physical constraint inherited from the network layer. It's like asking why shipping containers are a specific width — the answer is that railroad tracks are a specific width, and railroad tracks are a specific width because of historical horse cart dimensions. The constraint propagates upward from the physical layer.
This means every Solana transaction must fit inside a single network packet. No splitting across packets. No reassembly on the other end. One packet, one transaction. It keeps the networking path simple: validators never have to reassemble packets or handle partial deliveries, which helps when ingesting transactions at high volume.
I've known this number for months. What I haven't done until now is figure out exactly how those 1,232 bytes are allocated.
Two Halves of the Package
A Solana transaction is a FedEx package. There's the shipping label and there's the contents. The label proves who sent it and provides routing information. The contents are the actual instructions.
In transaction terms:
Transaction = Signatures + Message
The signatures are the label — cryptographic proof of who authorized this transaction. The message is the contents — everything the network needs to execute the instructions.
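Here is a minimal sketch of that split, assuming a serialized transaction held in a Python bytes object. The function name and the sample bytes are made up for illustration, and the signature is a zero-filled placeholder, not a real one.

```python
SIGNATURE_LEN = 64  # Ed25519 signature size in bytes


def split_transaction(raw: bytes) -> tuple[list[bytes], bytes]:
    """Split a serialized transaction into (signatures, message)."""
    count = raw[0]
    if count & 0x80:
        # A multi-byte count would mean 128+ signatures (8,192+ bytes),
        # which cannot fit inside a 1,232-byte transaction.
        raise ValueError("multi-byte signature count not handled in this sketch")
    end = 1 + count * SIGNATURE_LEN
    if len(raw) < end:
        raise ValueError("transaction is shorter than its signature section")
    signatures = [
        raw[1 + i * SIGNATURE_LEN : 1 + (i + 1) * SIGNATURE_LEN]
        for i in range(count)
    ]
    return signatures, raw[end:]


# Synthetic bytes for illustration only: one placeholder signature,
# followed by a fake message that starts with a 3-byte header.
fake_tx = bytes([1]) + bytes(64) + b"\x01\x00\x01rest-of-message"
sigs, message = split_transaction(fake_tx)
print(len(sigs), len(sigs[0]), message[:3].hex())
```

Everything after the signature section is the message, which is also the exact byte range the signatures are computed over.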
Let me start with the signatures, because they're the most predictable piece.
Signatures: The Notary Stamps
Every Solana transaction begins with a signatures array. The format is simple: a length prefix followed by the signature data.
The length is encoded as a compact-u16 (more on this encoding later), and for most transactions it's a single byte with value 01 — one signer. Each signature is 64 bytes of Ed25519 cryptographic data. This is the mathematical proof that the owner of a specific private key authorized this transaction.
For a typical single-signer transaction:
- 1 byte for the length prefix
- 64 bytes for the signature itself
- Total: 65 bytes
Sixty-five bytes. That's already 5.3% of my 1,232-byte budget consumed just to prove who I am. It's like having to show your driver's license before you can say anything — and the license is oversized.
If a transaction requires multiple signers (multisig wallets, co-signing operations), each additional signer adds another 64 bytes. Two signers: 129 bytes. Three signers: 193 bytes. The authentication overhead scales linearly and unforgivingly.
For my arbitrage bot, it's always one signer — my wallet. So 65 bytes, locked in, non-negotiable. I have 1,167 bytes remaining for everything else.
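The scaling is easy to tabulate. This sketch assumes the count prefix stays at one byte, which holds for any signer count that could fit in a transaction; the names are my own.

```python
SIGNATURE_LEN = 64      # bytes per Ed25519 signature
COUNT_PREFIX_LEN = 1    # assumption: fewer than 128 signers, so a one-byte count
MAX_TX_SIZE = 1232


def signature_section_size(num_signers: int) -> int:
    """Bytes consumed by the signatures array for a given signer count."""
    return COUNT_PREFIX_LEN + SIGNATURE_LEN * num_signers


for n in (1, 2, 3):
    used = signature_section_size(n)
    print(f"{n} signer(s): {used} bytes used, {MAX_TX_SIZE - used} left")
```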
The Message: Where Everything Happens
The remaining bytes form the message — the actual content of the transaction. The message contains four sections, packed sequentially:
- Header (3 bytes)
- Account keys array (variable)
- Recent blockhash (32 bytes)
- Instructions array (variable)
Let me take these apart one by one.
The Message Header: Three Bytes of Permission Design
Three bytes. The entire permission model for a Solana transaction is encoded in three bytes. This might be the most elegant piece of engineering in the entire protocol.
- Byte 0, num_required_signatures: how many of the listed accounts must have signed this transaction
- Byte 1, num_readonly_signed: of the signed accounts, how many are read-only
- Byte 2, num_readonly_unsigned: of the unsigned accounts, how many are read-only
That's it. Three numbers, three bytes, and the entire permission matrix is defined.
Think of it like a VIN on a car. A VIN is seventeen characters, and each position encodes specific information: country of manufacture, brand, engine type, model year, assembly plant, serial number. You can decode an entire vehicle's identity from that string if you know the schema. The message header works the same way: three tiny numbers encode the read/write permissions and signing requirements for every account in the transaction.
Here's why this matters. Consider a transaction with six accounts. From the header alone, the runtime knows:
- Accounts at positions 0 and 1 must have provided signatures (num_required_signatures = 2)
- Account at position 1 is signed but read-only (num_readonly_signed = 1)
- Accounts at positions 4 and 5 are unsigned and read-only (num_readonly_unsigned = 2)
- Everything else (positions 2 and 3) is writable and unsigned
No per-account permission flags. No permission bitmasks. No JSON permission objects. Three bytes for the entire transaction. The compression comes from the account ordering convention, which I'll explain in the next section.
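To convince myself the three numbers really are enough, here is a small sketch that recovers each account's permissions from its position alone. The function name is my own; the logic just applies the partition rule described above to the six-account example.

```python
def classify_accounts(
    num_accounts: int,
    num_required_signatures: int,
    num_readonly_signed: int,
    num_readonly_unsigned: int,
) -> list[tuple[bool, bool]]:
    """Return (is_signer, is_writable) for each account position."""
    roles = []
    for i in range(num_accounts):
        is_signer = i < num_required_signatures
        if is_signer:
            # Read-only signers sit at the end of the signed block.
            is_writable = i < num_required_signatures - num_readonly_signed
        else:
            # Read-only non-signers sit at the very end of the list.
            is_writable = i < num_accounts - num_readonly_unsigned
        roles.append((is_signer, is_writable))
    return roles


# The six-account example: header bytes 2, 1, 2.
for position, (signer, writable) in enumerate(classify_accounts(6, 2, 1, 2)):
    print(position, "signer" if signer else "unsigned",
          "writable" if writable else "read-only")
```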
Three bytes consumed. 1,164 remaining.
Account Keys: Where the Bytes Pile Up
This is where transactions get fat. Fast.
The account keys array starts with a compact-u16 length prefix (usually 1 byte), followed by a sequence of 32-byte public keys. Every account that any instruction in the transaction touches must be listed here. Every program that gets invoked. Every token account that gets read or written. Every system account, every config account, every authority.
Each account is a Pubkey: 32 bytes. For a wallet this is an Ed25519 public key. For a program-derived address it's a 32-byte hash that deliberately doesn't correspond to any key pair. Either way, it's not compressible. It's not shortenable. Thirty-two bytes, take it or leave it.
The ordering within the array isn't random. It follows a strict convention:
- Writable + Signed accounts come first
- Read-only + Signed accounts come next
- Writable + Unsigned accounts follow
- Read-only + Unsigned accounts come last
This ordering is what makes the three-byte header work. Because accounts are sorted by permission category, the header's three numbers create clean partition boundaries. The runtime doesn't need to check each account individually — it just needs to know where the boundaries fall.
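Going the other direction, from a bag of accounts to an ordered list plus header, can be sketched as a sort followed by three counts. The account names and the tuple layout are hypothetical, and the sketch ignores the extra rule that the fee payer has to be the first account.

```python
# Each account is (name, is_signer, is_writable). Names are made up.
Account = tuple[str, bool, bool]


def order_and_header(accounts: list[Account]) -> tuple[list[Account], bytes]:
    """Sort accounts into the four permission groups and build the header."""
    # False sorts before True, so signers come first, and writable
    # accounts come first within each signer/non-signer block.
    ordered = sorted(accounts, key=lambda a: (not a[1], not a[2]))
    num_required_signatures = sum(1 for a in ordered if a[1])
    num_readonly_signed = sum(1 for a in ordered if a[1] and not a[2])
    num_readonly_unsigned = sum(1 for a in ordered if not a[1] and not a[2])
    header = bytes(
        [num_required_signatures, num_readonly_signed, num_readonly_unsigned]
    )
    return ordered, header


accounts = [
    ("token_program", False, False),
    ("wallet", True, True),
    ("pool_reserve", False, True),
    ("cosigner", True, False),
]
ordered, header = order_and_header(accounts)
print([name for name, _, _ in ordered], header.hex())
```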
Now let me do the math that keeps me up at night.
A simple token swap on a single DEX touches roughly 10-12 accounts: the user's wallet, the token accounts for both sides, the pool's reserve accounts, the pool authority, the AMM program, the Token program, and a few more depending on the specific DEX. Taking the low end, that's 10 accounts times 32 bytes = 320 bytes. Add the length prefix: 321 bytes.
My bot doesn't do simple single-hop swaps. It does cyclic arbitrage — three-hop cycles through three different pools. Each hop adds its own set of accounts. Some accounts overlap (my wallet, the Token program, SOL accounts), but many don't. A three-hop cycle through three different DEX protocols can easily require 25-30 unique accounts.
Thirty accounts times 32 bytes = 960 bytes. Just for account addresses. Out of 1,232 total.
That's 77.9% of my entire transaction budget consumed by addresses alone. Before a single instruction byte. Before the blockhash. Before the program data that tells the DEX what to actually do.
This is the wall I keep hitting. This is why I'm dissecting the byte structure today. Understanding exactly where the bytes go is the first step toward figuring out how to reclaim them.
Recent Blockhash: 32 Bytes of Freshness
Between the account keys and the instructions sits a 32-byte recent blockhash. It's the hash of a recently produced block on the Solana network, and it serves two purposes.
First, replay protection. Without a blockhash, someone who intercepted a valid signed transaction could resubmit it later. The blockhash ties the transaction to a specific moment in the chain's history, making it unreplayable after that moment passes.
Second, expiration. A transaction's blockhash must come from one of roughly the last 150 blocks, which works out to about 60 seconds at Solana's nominal 400 ms slot time. If the blockhash is older than that, the transaction is rejected outright, regardless of whether the instructions are valid. This is Solana's equivalent of a check with an expiration date. The bank won't cash a check that's too old, and the network won't process a transaction with a stale blockhash.
Thirty-two bytes, fixed, non-negotiable. It's a flat tax on every transaction. At this point I've consumed:
- Signatures: 65 bytes
- Header: 3 bytes
- Account keys (1 byte count prefix + 30 × 32 bytes): 961 bytes
- Blockhash: 32 bytes
- Subtotal: 1,061 bytes
I have 171 bytes left for the actual instructions. The things that tell the network what to do. This is like packing a shipping container and realizing that after the packing materials, the insurance documents, and the customs forms, you have about 14% of the container's volume left for actual merchandise.
Instructions: Index References That Save Bytes
The instructions array is where Solana's designers got clever about byte conservation. An instruction, at its core, needs three things: which program to call, which accounts to pass, and what data to include. Without any optimization, each instruction would need to repeat the 32-byte program address and the 32-byte address of every account it touches. This would be catastrophically wasteful.
Instead, instructions use index references into the account keys array.
Each instruction is structured as:
- program_id_index (1 byte): the index of the program in the account keys array
- account_indices array: compact-u16 length + 1 byte per index
- data array: compact-u16 length + raw instruction data
The program_id_index is a single byte pointing to the position of the target program in the account keys array. Instead of repeating the 32-byte program address, the instruction says "call the program at position 7 in the account list." One byte instead of thirty-two. That's a 97% savings on program identification.
The account indices work the same way. Instead of listing full 32-byte addresses for every account an instruction needs, it lists their positions in the account keys array. Each position is a single byte. An instruction that touches 8 accounts uses 8 bytes for account references instead of 256 bytes. Another 97% savings.
This is like how ZIP codes work. Instead of writing "the geographic region encompassing these specific coordinates with these boundaries," you write five digits. The ZIP code is a reference into a lookup table. Everyone agrees on the table, so the reference is unambiguous and much shorter than the full information it represents.
The instruction data — the actual payload that tells the program what to do — is the one part that can't be compressed through references. It contains the discriminator (usually 8 bytes for Anchor programs, sometimes 1 byte for native programs), the amount to swap, slippage parameters, or whatever else the program expects. This data is protocol-specific and varies widely.
For a typical DEX swap instruction:
- program_id_index: 1 byte
- account_indices length: 1 byte
- account_indices: ~8-12 bytes (one per account the instruction touches)
- data length: 1 byte
- data: ~9-17 bytes (discriminator + amount + minimum out)
Total per instruction: roughly 20-30 bytes. For a three-hop cycle with three swap instructions, that's 60-90 bytes of instructions, plus one byte for the instruction count. It fits within my remaining 171 bytes, with roughly 80-110 bytes to spare.
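The layout above can be sketched as a serializer. The discriminator value, the field layout of the data, and the account positions are all hypothetical; a real DEX defines its own. Only the framing (index byte, length-prefixed index list, length-prefixed data) follows the structure described here.

```python
import struct


def encode_compact_u16(value: int) -> bytes:
    """Variable-length encoding used for array lengths (1 to 3 bytes)."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value does not fit in 16 bits")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: another byte follows
        else:
            out.append(low7)
            return bytes(out)


def serialize_instruction(
    program_id_index: int, account_indices: list[int], data: bytes
) -> bytes:
    """Pack one instruction using indices into the account keys array."""
    return (
        bytes([program_id_index])
        + encode_compact_u16(len(account_indices))
        + bytes(account_indices)  # one byte per referenced account
        + encode_compact_u16(len(data))
        + data
    )


# Hypothetical swap payload: 1-byte discriminator, u64 amount in, u64 minimum out.
swap_data = struct.pack("<BQQ", 9, 1_000_000, 990_000)
ix = serialize_instruction(7, [0, 3, 4, 9, 10, 11, 12, 6], swap_data)
print(len(swap_data), len(ix))
```

With eight account references and a 17-byte payload this comes to 28 bytes, whereas spelling out the nine addresses in full would take 288 bytes before any data.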
But here's the critical insight: the instruction data is efficient because it references the account keys array. The account keys array is where the real cost lives. Every unique account I add to the transaction costs 32 bytes in the array, regardless of how many instructions reference it. And the instructions themselves are cheap precisely because they avoid repeating that information.
The bottleneck isn't instructions. It's accounts.
compact-u16: Squeezing Length Prefixes
I've mentioned compact-u16 several times. It's a variable-length encoding for unsigned integers that Solana uses for array lengths and counts throughout the transaction format. Understanding it reveals how carefully the protocol designers thought about byte conservation.
Standard encoding for a 16-bit unsigned integer uses 2 bytes, always. Whether the value is 0 or 65,535, it takes the same space. compact-u16 uses a variable-length encoding:
- Values 0-127: 1 byte. The high bit is 0, indicating no continuation byte.
- Values 128-16,383: 2 bytes. The first byte's high bit is 1, indicating a continuation byte follows.
- Values 16,384-65,535: 3 bytes. Both leading bytes have high bits set.
In practice, almost every array in a Solana transaction has fewer than 128 elements. The number of signatures is usually 1. The number of accounts is typically under 40. The number of instructions is typically under 10. The length of instruction data is almost always under 128 bytes.
This means compact-u16 saves exactly 1 byte per length field compared to a fixed 2-byte encoding. One byte doesn't sound like much, but count the length fields in a transaction: the signature array length, the account keys array length, the instructions array length, and within each instruction, the account indices length and the data length. A three-instruction transaction has at least 9 length fields. That's 9 bytes saved by using compact-u16 instead of fixed-width encoding.
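For reference, here is a minimal encoder and decoder for the scheme as described above: seven bits of value per byte, low bits first, with the high bit marking a continuation. The function names are my own.

```python
def encode_compact_u16(value: int) -> bytes:
    """Encode a 16-bit unsigned integer in 1 to 3 bytes."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value does not fit in 16 bits")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: another byte follows
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def decode_compact_u16(buf: bytes, offset: int = 0) -> tuple[int, int]:
    """Return (value, bytes_consumed) starting at offset."""
    value = 0
    for i in range(3):
        byte = buf[offset + i]
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            if value > 0xFFFF:
                raise ValueError("decoded value exceeds 16 bits")
            return value, i + 1
    raise ValueError("compact-u16 longer than 3 bytes")


for n in (1, 127, 128, 16_383, 16_384, 65_535):
    encoded = encode_compact_u16(n)
    print(n, encoded.hex(), len(encoded))
```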
Nine bytes out of 1,232. Less than 1%. But in a world where I'm scraping for every available byte, that 1% matters. It's the difference between a transaction that barely fits and one that barely doesn't.
This is the kind of engineering that only makes sense at scale. Saving 9 bytes on a web application's HTTP request would be absurd. Saving 9 bytes on a structure that gets created and processed millions of times per day, with a hard size cap that determines whether the operation succeeds or fails — that's worth an entire custom encoding scheme.
A Complete Byte Map
Let me lay out a concrete example. A minimal three-hop cyclic arbitrage transaction with one signer and 28 unique accounts:
| Section | Calculation | Bytes |
|---|---|---|
| Signature count | compact-u16(1) | 1 |
| Signature | Ed25519 | 64 |
| Header | 3 fixed bytes | 3 |
| Account count | compact-u16(28) | 1 |
| Account keys | 28 x 32 | 896 |
| Recent blockhash | SHA-256 hash | 32 |
| Instruction count | compact-u16(3) | 1 |
| Instruction 1 | program index + accounts + data | ~28 |
| Instruction 2 | program index + accounts + data | ~28 |
| Instruction 3 | program index + accounts + data | ~28 |
| Total | | ~1,082 |
That leaves about 150 bytes of headroom. Sounds comfortable until I add a fourth instruction (maybe a tip transfer), a few more accounts for a different DEX type, or any program that requires extra accounts for authority derivations or oracle feeds. Twenty-eight accounts becomes thirty-two accounts. Four more accounts times 32 bytes = 128 bytes. Suddenly I'm at 1,210 bytes with 22 bytes of headroom, before that fourth instruction is even counted, and any additional complexity pushes me over.
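The byte map is mechanical enough to turn into a small estimator. This is a sketch under the same assumptions as the table (legacy format, ~28 bytes per swap instruction); the function names are mine.

```python
MAX_TX_SIZE = 1232


def compact_u16_len(n: int) -> int:
    """Bytes needed to encode n as a compact-u16 length prefix."""
    return 1 if n < 0x80 else 2 if n < 0x4000 else 3


def legacy_tx_size(
    num_signers: int, num_accounts: int, instruction_sizes: list[int]
) -> int:
    """Estimate the serialized size of a legacy transaction in bytes."""
    signatures = compact_u16_len(num_signers) + 64 * num_signers
    header = 3
    account_keys = compact_u16_len(num_accounts) + 32 * num_accounts
    blockhash = 32
    instructions = compact_u16_len(len(instruction_sizes)) + sum(instruction_sizes)
    return signatures + header + account_keys + blockhash + instructions


def max_accounts(num_signers: int, instruction_sizes: list[int]) -> int:
    """Largest account count that still fits under the size limit."""
    n = 0
    while legacy_tx_size(num_signers, n + 1, instruction_sizes) <= MAX_TX_SIZE:
        n += 1
    return n


three_swaps = [28, 28, 28]
print(legacy_tx_size(1, 28, three_swaps))  # the table's example
print(legacy_tx_size(1, 32, three_swaps))  # four more accounts
print(max_accounts(1, three_swaps))        # where the wall is
```

Under these assumptions the wall sits at 32 accounts for three swap instructions: a thirty-third address pushes the estimate to 1,242 bytes.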
This is why different DEX protocols have different "costs" in my system. A swap through a protocol that requires 8 accounts per instruction is cheaper, byte-wise, than a swap through one that requires 12. When my cycle detector evaluates possible routes, the byte cost is an invisible filter. Some cycles that are mathematically profitable are physically impossible — they simply don't fit in 1,232 bytes.
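As a rough illustration of that invisible filter, here is a sketch that drops routes whose estimated size exceeds the limit. The Route fields, route names, and profit figures are all invented for the example; this is not how my cycle detector is actually structured, just the shape of the check.

```python
from dataclasses import dataclass

MAX_TX_SIZE = 1232
# Fixed overhead for one signer in a legacy transaction, assuming every
# count fits in a one-byte prefix: signature count (1) + signature (64)
# + header (3) + blockhash (32) + account count (1) + instruction count (1).
FIXED_OVERHEAD = 1 + 64 + 3 + 32 + 1 + 1


@dataclass
class Route:
    name: str                # hypothetical label
    unique_accounts: int     # distinct addresses across all hops
    instruction_bytes: int   # total serialized instruction bytes
    expected_profit: float   # made-up units


def estimated_size(route: Route) -> int:
    return FIXED_OVERHEAD + 32 * route.unique_accounts + route.instruction_bytes


def feasible(routes: list[Route]) -> list[Route]:
    """Keep routes that are both profitable and small enough to send."""
    return [
        r for r in routes
        if r.expected_profit > 0 and estimated_size(r) <= MAX_TX_SIZE
    ]


routes = [
    Route("A-B-C", unique_accounts=28, instruction_bytes=84, expected_profit=0.004),
    Route("A-D-E", unique_accounts=34, instruction_bytes=96, expected_profit=0.009),
]
for r in routes:
    print(r.name, estimated_size(r), estimated_size(r) <= MAX_TX_SIZE)
print([r.name for r in feasible(routes)])
```

The second route has the better expected profit and still gets discarded, because 34 addresses alone take 1,088 bytes.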
Why This Matters for MEV
Every byte of transaction space is a competitive resource. Here's the arithmetic that makes this tangible.
My bot scans for cyclic arbitrage opportunities — price discrepancies across three pools that can be exploited by swapping in sequence. The more pools it can consider, the more potential cycles it finds. The more accounts a cycle requires, the closer the transaction gets to the 1,232-byte limit. Hit the limit, and the opportunity might as well not exist.
This creates a direct link between byte efficiency and revenue. A bot that can fit the same arbitrage operation into fewer bytes has access to a wider range of opportunities. It can consider cycles involving more complex DEX protocols. It can include more precise slippage parameters. It can add tip instructions without blowing the size limit.
Here's where Address Lookup Tables enter the picture. Solana's versioned transactions (v0) support a mechanism where frequently used account addresses can be stored on-chain in a lookup table. Instead of including the full 32-byte address in the transaction, the transaction references the lookup table and provides a 1-byte index into it. Thirty-two bytes becomes one byte — the same index trick that instructions use for program IDs, but applied to the entire account list.
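The savings are dramatic. Take my 28-account example. If 20 of those accounts are stored in a lookup table, those 20 accounts cost 1 byte each instead of 32. That's 20 bytes instead of 640 bytes, a saving of 620 bytes. The lookup itself isn't free: the transaction has to carry a version byte, the table's own 32-byte address, and a few length prefixes, about 36 bytes in total for a single table. Even so, the transaction drops from 1,082 bytes to roughly 500 bytes. Suddenly I have room for 40+ accounts, more instructions, larger data payloads. Cycles that were physically impossible become comfortably feasible. The one thing a lookup table can't hold is a signer: signing accounts still have to appear in full in the transaction's own key list.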
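Here is a sketch of the same estimate for a v0 transaction. Unlike the back-of-envelope saving of 32 bytes down to 1 byte per account, it also counts what the lookup itself costs: a version byte, a lookup count, and for each table its 32-byte address plus two length-prefixed index lists (writable and read-only). The function names and the 12/8 split between writable and read-only indices are my own assumptions.

```python
def compact_u16_len(n: int) -> int:
    """Bytes needed to encode n as a compact-u16 length prefix."""
    return 1 if n < 0x80 else 2 if n < 0x4000 else 3


def v0_tx_size(
    num_signers: int,
    num_static_accounts: int,
    lookups: list[tuple[int, int]],
    instruction_sizes: list[int],
) -> int:
    """Estimate a v0 transaction's size.

    lookups holds one (num_writable_indices, num_readonly_indices)
    pair per address lookup table referenced.
    """
    size = compact_u16_len(num_signers) + 64 * num_signers  # signatures
    size += 1                                               # version prefix
    size += 3                                               # header
    size += compact_u16_len(num_static_accounts) + 32 * num_static_accounts
    size += 32                                              # recent blockhash
    size += compact_u16_len(len(instruction_sizes)) + sum(instruction_sizes)
    size += compact_u16_len(len(lookups))                   # lookup count
    for writable, readonly in lookups:
        size += 32                                          # table address
        size += compact_u16_len(writable) + writable        # 1 byte per index
        size += compact_u16_len(readonly) + readonly
    return size


# 28 accounts total: 8 listed in full, 20 loaded from one table
# (assumed here to be 12 writable and 8 read-only).
print(v0_tx_size(1, 8, [(12, 8)], [28, 28, 28]))
```

For the 28-account example this lands just under 500 bytes, a little above what the raw 620-byte saving alone would suggest, and still less than half of the legacy size.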
But lookup tables aren't free. They cost rent to maintain on-chain. The addresses in them must be registered in advance. If a pool's accounts change, the table needs updating. Managing lookup tables is its own operational overhead — a cost paid in engineering time and on-chain rent to buy byte efficiency in transactions.
This trade-off — operational complexity for byte savings — is one of the fundamental decisions in MEV infrastructure. Every serious searcher manages lookup tables. The question is how aggressively, how many, and for which accounts.
The NFL Roster Analogy
Here's how I think about the 1,232-byte constraint after spending today pulling transactions apart byte by byte.
An NFL team has a 53-player active roster limit. It doesn't matter how talented your 54th player is. It doesn't matter that having 54 players would make your team better. The rule is 53. Every roster decision happens within that constraint — which positions to double up on, which specialist roles justify a roster spot, which players can play multiple positions and thus save a spot.
The 1,232-byte limit works the same way. Every design decision I make about my bot's transaction construction happens within this constraint. Which accounts can be shared across instructions? Which programs can I avoid invoking by restructuring the operation? Which accounts can I move into a lookup table? Can I pack two operations into fewer instructions?
The teams that win in the NFL aren't necessarily the ones with the most talented players in isolation. They're the ones that build the best roster under the 53-player constraint. They find players who fill multiple roles. They structure their depth chart to maximize coverage with minimum roster spots.
The bots that win in MEV aren't necessarily the ones with the best math or the fastest connections. They're the ones that build the best transactions under the 1,232-byte constraint. They find ways to minimize account count. They structure their instructions to maximize operations with minimum bytes.
What I'm Seeing Differently
After today's dissection, the transaction format looks less like an opaque blob and more like a packing problem. Every section has a cost. Every byte has a purpose. The design reveals priorities: cryptographic integrity first (one signature plus the blockhash take 97 bytes, nearly 8% of the 1,232-byte budget), then permission modeling (3 bytes, beautifully minimal), then the raw tension between account addressing (expensive) and instruction encoding (cheap, thanks to index references).
The 1,232-byte limit isn't a limitation to work around. It's a constraint to design within. Like the carry-on sizer at the gate, it forces a specific kind of discipline. You don't bring what you want; you bring what fits. And the skill is in knowing what fits, what can be compressed, and what you can store in the overhead bin before you board — the on-chain equivalent of Address Lookup Tables.
I'm looking at my bot's transactions differently now. Every account in the keys array is a question: does this need to be here, or can it be referenced from a lookup table? Every instruction is a question: can I restructure this to touch fewer accounts? Every DEX protocol is a question: how many accounts does this swap actually require, and is there a byte-cheaper alternative?
Twelve hundred and thirty-two bytes. That's the arena. That's the ring. Everything I build has to fight within it. The question I'm sitting with now is whether understanding the byte structure at this level changes how I think about which cycles to pursue in the first place — whether the byte cost of a route should factor into the profitability calculation alongside AMM math, fee estimation, and competition analysis.
It's a small number, 1,232. It's remarkable how much complexity you can fit inside it, and how quickly it fills up when you're not paying attention.