Gas pattern obfuscation: why sybil wallets look identical to chain analysis
Most operators running multi-wallet setups spend their obfuscation budget in the wrong places. They rotate proxies, stagger funding, use different browsers through tools like GoLogin or Multilogin, even buy dedicated SIM cards for each account. Then they execute all fifty wallets through the same Python script with the same hardcoded gas parameters and wonder why they get wiped in the airdrop eligibility review.
Gas metadata is one of the most reliable clustering signals chain analysis firms and protocol teams use. It is underappreciated because gas feels like infrastructure, not identity. You set a gas price, the transaction goes through, you move on. But every choice you make, including the choices you delegate to your RPC provider or your wallet library, leaves a fingerprint. When that fingerprint repeats across dozens of wallets, it is not a coincidence. It is a pattern that any decent Dune query can surface.
The stakes are real and rising. Protocols like LayerZero, Arbitrum, and Optimism have all run explicit sybil filtering before token distributions, with some publishing their methodology or running community bounty programs to surface clusters. The methodology keeps getting more sophisticated. If you are still treating gas as a pure throughput concern rather than an operational security concern, this piece is for you.
background and prior art
Blockchain clustering is not new. Chainalysis published foundational research on address clustering using common-input heuristics for Bitcoin over a decade ago, and the same firm’s Reactor product has been applied to Ethereum wallet graph analysis for years. The core insight from Bitcoin clustering, that wallets controlled by the same entity tend to behave similarly because they share code, configuration, and operational patterns, transfers directly to EVM chains.
What changed with the airdrop farming era is who is doing the analysis and why. It is not just law enforcement or exchange compliance teams anymore. Protocol teams and their token distribution advisors run their own clustering pipelines, often open-source or built on top of Dune Analytics, specifically to protect their distributions from sybil exploitation. The LayerZero sybil bounty program in 2024 was notable because it externalised this work to the community, rewarding third-party researchers with a share of the intended allocation of the sybil addresses they reported. That program surfaced a very large number of addresses, many caught precisely because of gas pattern clustering.
The academic framing here is “transaction fingerprinting.” A 2020 paper from researchers at University College London demonstrated that wallet software, not just wallet owners, creates identifiable on-chain signatures through transaction construction choices. EIP-1559, introduced in the London hard fork in August 2021, actually made this worse for operators because it added two additional fields, maxFeePerGas and maxPriorityFeePerGas, that scripted wallets tend to set in highly uniform ways.
the core mechanism
To understand why gas creates fingerprints, you need to understand what choices go into a transaction’s gas parameters and where those choices come from.
For a standard EIP-1559 transaction on Ethereum mainnet or any EVM-compatible L2, the relevant fields are:
gasLimit // maximum gas units the tx can consume
maxFeePerGas // absolute cap on fee per gas unit (wei)
maxPriorityFeePerGas // tip to validator per gas unit (wei)
The baseFee is set by the protocol for each block and is fully determined by the parent block’s base fee and gas usage. Your wallet does not choose it. What your wallet (or script) chooses is maxFeePerGas and maxPriorityFeePerGas, and how it chooses them is entirely determined by how you wrote your code or configured your wallet.
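The relationship between these fields can be sketched in a few lines. This is a minimal illustration of the EIP-1559 fee rules, not any wallet's actual implementation; the function name and the sample numbers are made up for the example.

```python
GWEI = 10**9

def effective_fees(base_fee: int, max_fee: int, max_priority_fee: int) -> tuple[int, int]:
    """Return (effective gas price, tip per gas) in wei under EIP-1559 rules."""
    if max_fee < base_fee:
        raise ValueError("maxFeePerGas is below baseFee: the tx cannot be included")
    # The tip is capped by whatever is left of maxFeePerGas after the base fee.
    tip = min(max_priority_fee, max_fee - base_fee)
    return base_fee + tip, tip

# Two blocks with different base fees, same script-chosen parameters:
# the tip is identical whenever the maxFeePerGas cap is not binding.
price_a, tip_a = effective_fees(30 * GWEI, 60 * GWEI, 2 * GWEI)
price_b, tip_b = effective_fees(41 * GWEI, 60 * GWEI, 2 * GWEI)
```

The point of the sketch is that the base fee moves with the network, while the tip and the cap are the parts that carry the sender's configuration from block to block.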
RPC provider gas estimation clustering. When your script calls eth_gasPrice or eth_maxPriorityFeePerGas on your RPC endpoint, you get back the provider’s current estimate. Infura, Alchemy, and QuickNode each have slightly different estimation algorithms, and they update on different schedules. If fifty wallets all call the same Infura endpoint within the same thirty-second window to get their gas price, they will get identical or near-identical values. When all fifty transactions land in the same few blocks with exactly 1.5 gwei priority fee because that is what Infura was returning at 14:32 UTC, that is a cluster.
Hardcoded gas values in scripts. This is more common than it should be. A script might set maxPriorityFeePerGas = web3.utils.toWei('2', 'gwei') because that worked during testing and no one changed it. Every wallet using that script submits exactly 2 gwei tip regardless of network conditions. On a busy day when most users are bidding 3-4 gwei, a cohort of wallets all bidding exactly 2 gwei stands out in the mempool data and in post-hoc analysis.
Gas limit patterns. A plain ETH transfer between externally owned accounts costs exactly 21000 gas, fixed by the protocol, so wallets set that limit and it carries no signal. For contract interactions, though, the gas limit is an estimate. MetaMask and other consumer wallets call eth_estimateGas and add a buffer, often 20-30%, which produces values that track the estimate and vary from transaction to transaction. Scripted wallets often hardcode gas limits, for example gasLimit: 200000 for all contract interactions, or use library defaults that are consistent across all instances. A cohort where every wallet uses exactly 150000 gas for a bridge deposit, regardless of whether the estimate would have been 87000 or 134000, is trivially identifiable.
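To make the gas limit signal concrete, here is a rough sketch of how an analyst might separate buffered estimates from hardcoded limits. The record layout (gas_limit, gas_used), the function names, and the sample values are assumptions for illustration, not a real pipeline.

```python
from collections import Counter

def modal_limit_share(txs: list[dict]) -> tuple[int, float]:
    """Return the most common gas limit and the share of txs that use it."""
    limit, count = Counter(tx["gas_limit"] for tx in txs).most_common(1)[0]
    return limit, count / len(txs)

def headroom_spread(txs: list[dict]) -> float:
    """Range of gas_limit / gas_used across a cohort.

    A percentage buffer on top of an estimate keeps this ratio roughly
    constant; a hardcoded limit makes it drift with the gas actually used.
    """
    ratios = [tx["gas_limit"] / tx["gas_used"] for tx in txs]
    return max(ratios) - min(ratios)

# Hypothetical cohort with a hardcoded limit of 150000.
scripted = [
    {"gas_limit": 150000, "gas_used": 87000},
    {"gas_limit": 150000, "gas_used": 134000},
    {"gas_limit": 150000, "gas_used": 101500},
]
# Hypothetical cohort where each limit is the gas used plus a 30% buffer.
wallet = [
    {"gas_limit": 113100, "gas_used": 87000},
    {"gas_limit": 174200, "gas_used": 134000},
    {"gas_limit": 131950, "gas_used": 101500},
]
```

On the scripted cohort every transaction shares one limit and the headroom ratio swings widely; on the buffered cohort the limits differ and the ratio stays flat.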
Timing correlation. This is not technically a gas parameter, but it amplifies every other signal. When you run a script across fifty wallets sequentially with 5-second delays, those fifty transactions land in a narrow block window. Combine that with identical gas parameters and the clustering is overwhelming. Random delays do not fully fix this: if the delays are drawn uniformly between 5 and 60 seconds, a statistical analysis of inter-transaction timing across your wallet set will still surface the pattern, because organic human usage follows a completely different distribution.
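One way to picture the statistical analysis mentioned here is a one-sample Kolmogorov-Smirnov test of the observed gaps against a uniform distribution. The sketch below implements the statistic by hand using only the standard library. The function names are hypothetical, and the log-normal stand-in for organic timing is an assumption chosen purely for illustration.

```python
import math
import random

def ks_uniform(samples: list[float], low: float, high: float) -> float:
    """One-sample Kolmogorov-Smirnov statistic against Uniform(low, high)."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = min(max((x - low) / (high - low), 0.0), 1.0)
        # Compare the model CDF with the empirical CDF just before and at x.
        d = max(d, cdf - i / n, (i + 1) / n - cdf)
    return d

def looks_uniform(samples: list[float], low: float, high: float) -> bool:
    """True if uniformity is not rejected at roughly the 5% level.

    1.36 / sqrt(n) is the usual large-sample critical value for the KS test.
    """
    return ks_uniform(samples, low, high) < 1.36 / math.sqrt(len(samples))

rng = random.Random(7)
# Delays as a script with uniform random sleeps would produce them.
scripted_delays = [rng.uniform(5, 60) for _ in range(200)]
# Assumption for illustration only: heavy-tailed gaps stand in for organic use.
organic_delays = [rng.lognormvariate(3.0, 1.2) for _ in range(200)]
```

A cohort whose gaps are consistent with Uniform(5, 60) is suspicious precisely because the fit is good; the same test applied to heavy-tailed gaps rejects uniformity easily.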
Nonce trajectories. A fresh wallet starts at nonce 0. If all your wallets were created on the same day, funded from the same source, and have executed the same sequence of actions, they all have the same nonce at the same point in the campaign. Nonce 7 after interacting with the same sequence of five protocols is a fingerprint on its own.
The combination of these signals is what makes gas pattern analysis powerful. A single signal might be ambiguous. Five signals aligned across a wallet cohort are not.
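As a rough sketch of how aligned signals combine, the snippet below groups senders by a composite fingerprint of target contract, gas limit, priority fee, and nonce. The field names, addresses, and the choice of fingerprint are all hypothetical; a real pipeline would work from indexed chain data and use looser matching.

```python
from collections import defaultdict

def cluster_by_fingerprint(txs: list[dict], min_size: int = 3) -> dict:
    """Group senders that share the same (to, gas_limit, tip, nonce) tuple."""
    groups = defaultdict(set)
    for tx in txs:
        key = (tx["to"], tx["gas_limit"], tx["max_priority_fee"], tx["nonce"])
        groups[key].add(tx["sender"])
    # Keep only fingerprints shared by at least min_size distinct senders.
    return {k: sorted(v) for k, v in groups.items() if len(v) >= min_size}

# Hypothetical records: three scripted wallets and two unrelated users.
txs = [
    {"sender": "0xA1", "to": "0xBridge", "gas_limit": 150000, "max_priority_fee": 2_000_000_000, "nonce": 7},
    {"sender": "0xA2", "to": "0xBridge", "gas_limit": 150000, "max_priority_fee": 2_000_000_000, "nonce": 7},
    {"sender": "0xA3", "to": "0xBridge", "gas_limit": 150000, "max_priority_fee": 2_000_000_000, "nonce": 7},
    {"sender": "0xB1", "to": "0xBridge", "gas_limit": 113100, "max_priority_fee": 1_500_000_000, "nonce": 42},
    {"sender": "0xC1", "to": "0xBridge", "gas_limit": 131950, "max_priority_fee": 3_000_000_000, "nonce": 311},
]
clusters = cluster_by_fingerprint(txs)
```

Each field on its own matches plenty of unrelated wallets; requiring all of them to match at once is what shrinks the group down to one operator.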
worked examples
Example 1: the Arbitrum One bridge cluster
Consider fifty wallets all created in December 2022, funded with ETH from the same Binance withdrawal address pattern (consecutive withdrawal nonces), then bridged to Arbitrum One via the official bridge at arbitrum.io. The bridge contract interaction requires a specific function call with a gas estimate that typically lands between 95,000 and 105,000 gas.
If the script hardcodes gasLimit: 120000 and uses Alchemy’s gas price API called at the same time for each wallet, the resulting transactions show: same gas limit (120000), same maxPriorityFeePerGas (whatever Alchemy returned in that window, say 1.7 gwei), transactions spread across a twenty-minute window, funded from addresses with a clear common-origin cluster. A Dune query looking for wallets with gasLimit = 120000 on the Arbitrum bridge contract, grouped by funding source proximity, surfaces this in minutes.
Example 2: the zkSync Era interaction with varied timing but uniform fees
An operator runs 30 wallets through zkSync Era in February 2024. They correctly stagger execution over 48 hours. But they apply a single GAS_PRICE_MULTIPLIER = 1.2 value from one .env file to whatever the RPC returns. zkSync Era’s L2 fee mechanism works differently from Ethereum mainnet, and its base fee does not fluctuate much on low-traffic days, so a fixed multiplier on a near-constant quote yields exactly the same fee value for many transactions.
Twenty-eight of the thirty wallets show identical gasPerPubdata values (a zkSync-specific parameter that scripted wallets almost always hardcode) and identical fee token approval amounts. The two-day spread disguises the timing correlation, but the zkSync-specific parameter clustering makes the cohort visible to anyone running analysis on that field. This is the kind of detail that consumer wallet interfaces handle automatically with per-session variation, but scripts do not.
Example 3: the LayerZero OFT interaction cluster
A cohort of 100 wallets interacts with a LayerZero OFT bridge across multiple chains in Q1 2024. The operator correctly randomises amounts (varying between 0.01 and 0.05 ETH per transaction) but uses a fixed _adapterParams value across all transactions. In LayerZero’s OFT standard, _adapterParams encodes the destination gas limit and airdrop amount. Using the same _adapterParams hex string across 100 wallets is a direct cluster signal, visible in the transaction data itself without any gas price analysis. Combined with the fact that all 100 wallets were funded from a Tornado Cash successor contract within the same week, the cluster is high-confidence.
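To show why a reused _adapterParams value is so legible, here is a small decoding sketch. It assumes the packed layout commonly documented for LayerZero v1 adapter parameters (a 2-byte version, a 32-byte destination gas value, and for version 2 a 32-byte native amount plus a 20-byte receiver); treat that layout, the function name, and the wallet labels as assumptions to check against the protocol's own documentation.

```python
from collections import Counter

def decode_adapter_params(hex_str: str) -> dict:
    """Decode a packed adapter-params payload under the assumed v1 layout."""
    raw = bytes.fromhex(hex_str.removeprefix("0x"))
    version = int.from_bytes(raw[0:2], "big")
    if version == 1 and len(raw) == 34:
        return {"version": 1, "dst_gas": int.from_bytes(raw[2:34], "big")}
    if version == 2 and len(raw) == 86:
        return {
            "version": 2,
            "dst_gas": int.from_bytes(raw[2:34], "big"),
            "native_amount": int.from_bytes(raw[34:66], "big"),
            "native_receiver": "0x" + raw[66:86].hex(),
        }
    raise ValueError("unrecognised adapter params layout")

# Build a version-1 payload requesting 200000 gas on the destination chain.
v1_payload = "0x" + (1).to_bytes(2, "big").hex() + (200_000).to_bytes(32, "big").hex()
other_payload = "0x" + (1).to_bytes(2, "big").hex() + (253_411).to_bytes(32, "big").hex()

# Hypothetical wallets and the payload each one submitted.
payloads = {"0xW1": v1_payload, "0xW2": v1_payload, "0xW3": other_payload}
reuse = Counter(payloads.values())
```

Counting identical payloads per contract is all it takes; no statistics are needed when the bytes match exactly.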
This example illustrates that gas parameters are not the only metadata signal, but they compound with every other indicator.
edge cases and failure modes
Failure mode 1: per-wallet RPC endpoints that share the same backend.
Some operators rotate RPC endpoints across wallets thinking this adds entropy. But if they are all pointing to Alchemy’s free tier or Infura’s standard endpoints, they are hitting the same gas estimation backend with only superficial URL differences. The returned values are identical. Using different RPC providers across wallet cohorts, not just different endpoints from the same provider, is required for this to create actual signal divergence.
Failure mode 2: gas randomisation that is too uniform.
Adding random noise to gas values is the right instinct, but it is commonly implemented badly. Applying random.uniform(1.5, 2.5) to the priority fee in gwei produces a flat uniform distribution, which is not how human wallets behave. Real human wallets produce something closer to a bimodal distribution: most use the wallet’s suggested value (which clusters tightly), and a minority manually override to a round number. A cohort with perfectly uniform random gas values between 1.5 and 2.5 gwei is statistically distinguishable from organic usage by the shape of the distribution alone. Drawing from a log-normal or Pareto distribution would better approximate organic behaviour.
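The shape difference described above can be summarised with a few crude statistics. This is only a sketch from the analyst's side: the function name is hypothetical, and the "organic" sample is hand-built to mimic the bimodal pattern described in the text rather than taken from chain data.

```python
from collections import Counter
import random

def fee_shape(tips_gwei: list[float], decimals: int = 3) -> dict:
    """Summarise how concentrated a set of priority fees is."""
    rounded = [round(t, decimals) for t in tips_gwei]
    _, modal_count = Counter(rounded).most_common(1)[0]
    whole = sum(1 for t in rounded if t == int(t))
    n = len(rounded)
    return {
        "modal_share": modal_count / n,         # share at the most common value
        "round_share": whole / n,               # share at whole-gwei values
        "distinct_share": len(set(rounded)) / n,
    }

rng = random.Random(11)
# Flat noise as produced by random.uniform(1.5, 2.5).
scripted = [rng.uniform(1.5, 2.5) for _ in range(100)]
# Assumed organic-like sample: most at a suggested 1.5 gwei, some round overrides.
organic = [1.5] * 70 + [2.0] * 20 + [3.0] * 10

scripted_shape = fee_shape(scripted)
organic_shape = fee_shape(organic)
```

Flat noise shows up as almost every value being distinct with no dominant mode, while the assumed organic sample concentrates on a handful of values.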
Failure mode 3: ignoring L2-specific parameters.
Each L2 has its own gas mechanism that introduces additional fields beyond standard EIP-1559. Optimism and Base have l1Fee components. zkSync has gasPerPubdata. Starknet has entirely different fee structures. Scripted wallets that do not handle these correctly either use protocol defaults consistently (creating a cluster) or fail transactions (creating a different kind of pattern, repeated failed transactions followed by success). Reading the Ethereum Foundation’s gas documentation gives you the Ethereum-layer picture, but each L2’s documentation is mandatory reading before scripting interactions on that chain.
Failure mode 4: consistent gas/value ratio.
Chain analysis can look at the ratio of gas spent to value transacted. If a cohort of wallets consistently spends $0.15-0.20 in gas to bridge $10-15 of ETH, and this ratio is tight across all wallets, it suggests a script with a fixed transaction template. Organic users show much higher variance in this ratio because they are making different decisions about urgency, amount, and timing.
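A simple way to quantify how tight this ratio is across a cohort is the coefficient of variation. The sketch below assumes each record already carries a USD gas cost and a USD value; the field names and sample figures are invented for illustration.

```python
from statistics import mean, pstdev

def ratio_cv(txs: list[dict]) -> float:
    """Coefficient of variation of gas cost divided by value transacted."""
    ratios = [tx["gas_cost_usd"] / tx["value_usd"] for tx in txs]
    return pstdev(ratios) / mean(ratios)

# Hypothetical templated cohort: $0.15-0.20 of gas to bridge $10-15.
scripted = [
    {"gas_cost_usd": 0.15, "value_usd": 10},
    {"gas_cost_usd": 0.18, "value_usd": 12},
    {"gas_cost_usd": 0.20, "value_usd": 14},
    {"gas_cost_usd": 0.17, "value_usd": 11},
    {"gas_cost_usd": 0.19, "value_usd": 13},
]
# Assumed organic mix: very different amounts and urgency per transaction.
organic = [
    {"gas_cost_usd": 0.40, "value_usd": 5},
    {"gas_cost_usd": 0.12, "value_usd": 250},
    {"gas_cost_usd": 1.10, "value_usd": 40},
    {"gas_cost_usd": 0.25, "value_usd": 1200},
    {"gas_cost_usd": 0.18, "value_usd": 15},
]
```

A coefficient of variation of a few percent across a cohort points to a fixed template; the assumed organic mix lands above 1.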
Failure mode 5: coordinator address reuse through multicall.
Some operators use multicall contracts to batch interactions across wallets from a single transaction. This is visible on-chain as a single transaction that emits events for many wallets, making the cluster trivially obvious. More subtly, even without multicall, if the same msg.sender (a deployer or coordinator address) appears in the transaction ancestry of many wallets, that is a funding graph cluster.
Counter-strategy context. Operators who have successfully avoided gas fingerprinting in production generally treat each wallet’s gas configuration as a separate concern, sourcing gas estimates from different providers at different times, applying noise drawn from distributions that approximate organic behaviour, and testing their transaction metadata against Dune queries before running at scale. Some practitioners discuss browser-based signing workflows specifically to get consumer wallet gas estimation behaviour. The antidetect and multi-account operations space has written about this from the browser angle, and sites like antidetectreview.org document what the tooling looks like at the infrastructure layer, though the on-chain gas discipline is a separate concern that browser tools do not handle by default.
what we learned in production
After running multi-wallet operations at meaningful scale for long enough, the lesson that took me the longest to internalise is that chain analysis is retrospective and comprehensive. You are not trying to hide in real-time. You are trying to make sure that when a protocol runs its Dune query six months after the activity, your wallets do not surface in the same cluster. This changes how you think about what matters. Timing gaps that feel significant when you are running the script, like waiting a week between wallet creation and first interaction, are almost irrelevant if the wallets share gas parameters, because the clustering query does not care about calendar time. Conversely, getting gas parameters right from day one creates genuine entropy that persists permanently in the transaction history.
The second lesson is that the signal threshold for protocols actually matters. Not every protocol runs rigorous clustering. Some do a coarse funding graph analysis and stop there. Others run full gas parameter analysis with statistical clustering algorithms. Knowing which tier of analysis a given protocol’s team is likely to run, based on their team size, their stated commitment to sybil resistance, and whether they have partnered with a firm like Nansen for distribution analytics, tells you how much obfuscation budget to spend. A small protocol with two engineers is not running zkSync-specific parameter clustering. A well-funded L2 doing a nine-figure token distribution almost certainly is, or they are paying someone to do it for them.
The operational checklist that emerged from this for my own setups: each wallet cohort uses a different RPC provider with rate-limited per-wallet API keys, gas values are sourced fresh per transaction with log-normal noise added to the tip component, gas limits use the actual eth_estimateGas result with a randomised buffer of between 10% and 40% rather than a hardcoded value, and L2-specific parameters are set per-chain from the chain’s own documentation defaults rather than library fallbacks. None of this is technically complex. All of it is easy to miss when you are focused on the application layer.
For related context on how wallet hygiene intersects with broader operational security in airdrop farming, the wallet setup and hygiene guide on this site covers the baseline practices, and the LayerZero airdrop retrospective has specific notes on what the sybil bounty program actually caught. If you are new to thinking about on-chain identity separation as a concept, the on-chain identity separation primer is worth reading first.
references and further reading
- Ethereum Foundation: Gas and Fees - the canonical reference for how gas works at the protocol level, including EIP-1559 mechanics.
- EIP-1559: Fee market change for ETH 1.0 chain - the original specification document, relevant for understanding exactly what fields were added and how the fee burn mechanism works.
- Dune Analytics - the primary platform where most protocol teams and independent researchers run their clustering queries. Searching for “sybil” in the Dune query explorer surfaces dozens of real methodologies being used in production.
- Nansen - wallet labelling and analytics platform used by protocol teams for distribution analysis. Their smart money and entity labelling data feeds into many distribution eligibility reviews.
- multiaccountops.com/blog/ - practitioner-level writing on multi-wallet operations infrastructure, including RPC management and scripting patterns relevant to the gas obfuscation problem.
Written by Xavier Fok
last reviewed by Xavier Fok on 2026-05-19.