RPC provider fingerprinting and how it sells your sybils out

most operators i talk to spend serious money on residential proxies, antidetect browsers, and separate hardware profiles. they rotate IPs obsessively, use different email domains, and stagger their on-chain timing. then they pipe every wallet through the same Infura project ID.

that one mistake undoes most of the work. RPC providers sit between your wallet software and the blockchain, and they see everything your wallet client sends plus the connection metadata around it: IP address, user agent, request timing, and the API key that ties it all together. when you use a single API key across fifty wallets, you have not created fifty separate identities. you have created one identity that controls fifty addresses, and logged that fact in a third-party database outside your control.

the stakes are not hypothetical. major airdrop distributions in 2023 and 2024 used multi-signal clustering approaches that explicitly included off-chain infrastructure signals, not just on-chain behavior. understanding what RPC providers actually collect, and how that data can be used against you, is now table stakes for anyone running at scale.

background and prior art

the JSON-RPC interface for Ethereum nodes has been standardized since the early days of the network. the official Ethereum JSON-RPC API specification defines a set of methods, eth_sendRawTransaction, eth_call, eth_getBalance, and so on, that clients use to read and write chain state. originally, operators were expected to run their own nodes. that is expensive and operationally complex, so managed RPC services filled the gap.
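to make the wire format concrete, here is a minimal sketch of the request body a client builds for one of those methods. the helper name build_rpc_request and the zero address are placeholders for illustration, not part of any library.

```python
import json


def build_rpc_request(method, params, request_id=1):
    """Return the JSON body for a single JSON-RPC 2.0 call."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    )


# placeholder address, used only for illustration
body = build_rpc_request(
    "eth_getBalance",
    ["0x0000000000000000000000000000000000000000", "latest"],
)
print(body)
```

the body itself says nothing about who sent it. the identifying material discussed in this article rides alongside it, in the endpoint URL, the HTTP headers, and the connection.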

Infura launched in 2016 as a ConsenSys-backed project and was fully acquired by ConsenSys in 2019. Alchemy followed in 2019 with a developer-focused product. QuickNode, Ankr, Blast, and others entered afterward. by 2021, a significant fraction of Ethereum traffic was routed through a small number of managed endpoints. the centralization risk this creates for the network is a separate conversation, but the privacy implications for operators are immediate.

EIP-1193, “Ethereum Provider JavaScript API,” standardized the provider object that browser-injected wallets like MetaMask expose to dapps. what this means operationally is that when you use MetaMask connected to Infura (its default), every eth_call a dapp issues through your wallet is forwarded to Infura’s infrastructure with your browser’s user agent attached. the dapp frontend itself may not see that metadata, but Infura does.

the core mechanism

let me walk through the actual data points an RPC provider collects on a per-request basis, and then explain how those points combine into a fingerprint.

API key

every managed RPC endpoint issues you a project ID or API key. your endpoint URL looks something like https://mainnet.infura.io/v3/YOUR_PROJECT_ID. every request to that URL is attributed to that key for billing and rate-limiting purposes. if you use one key across multiple wallets, every eth_sendRawTransaction from wallet A and wallet B is logged against the same project ID. from the provider’s perspective, they were submitted by the same operator.
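a quick way to see the problem in your own configs is to pull the key out of each endpoint URL and count how many distinct keys you are actually using. this sketch assumes the Infura-style layout above, where the key is the last path segment. extract_project_id and the wallet_endpoints mapping are hypothetical names.

```python
from urllib.parse import urlparse


def extract_project_id(endpoint_url):
    """Return the last path segment of the URL, where the key sits in this layout."""
    segments = [s for s in urlparse(endpoint_url).path.split("/") if s]
    return segments[-1] if segments else None


# hypothetical config: two wallets pointed at the same endpoint
wallet_endpoints = {
    "wallet_a": "https://mainnet.infura.io/v3/YOUR_PROJECT_ID",
    "wallet_b": "https://mainnet.infura.io/v3/YOUR_PROJECT_ID",
}

distinct_keys = {extract_project_id(url) for url in wallet_endpoints.values()}
print(len(distinct_keys), "distinct key(s) across", len(wallet_endpoints), "wallets")
```

if the first number is smaller than the second, those wallets share an identifier in the provider’s logs. other providers may place the key in a subdomain or a header instead, so the parsing would need adjusting.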

IP address and ASN

the source IP of the HTTP request is logged. providers need this for rate limiting and abuse detection. even if you are behind a proxy, the proxy’s IP is what lands in the log. if you are using a datacenter proxy, the ASN will identify it as such. if you reuse the same residential exit node across multiple sessions, those sessions are linked.

user agent and library fingerprint

wallet software and web3 libraries can identify themselves through their HTTP headers and through the shape of their traffic. ethers.js v5, ethers.js v6, web3.js, and viem (which wagmi is built on) all have identifiable request patterns. some libraries include version strings in their HTTP headers. MetaMask embeds version identifiers in its RPC calls. if all your wallets run the same version of the same library making requests in the same sequence, the pattern is detectable.

request timing and sequencing

this is the subtle one. when you run a script that iterates through a list of wallets and performs the same operation on each, the request logs show a machine-readable pattern: identical method sequences separated by nearly-identical intervals, all from the same IP, against the same API key. even if the wallets themselves are unrelated on-chain, the off-chain request pattern screams automation.
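one simple way to quantify “nearly-identical intervals” is the coefficient of variation of the gaps between consecutive requests. this is an illustrative metric, not something any provider is known to use, and the sample timestamps are made up.

```python
from statistics import mean, pstdev


def interval_regularity(timestamps):
    """Coefficient of variation of gaps between consecutive requests.

    Values near 0 mean the gaps are almost identical (script-like).
    Returns None when there are too few requests to judge.
    """
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        return None
    return pstdev(gaps) / mean(gaps)


# made-up timestamps in seconds
scripted = [0.0, 2.0, 4.01, 6.0, 8.02]    # a loop with a fixed sleep
irregular = [0.0, 5.0, 47.0, 49.0, 300.0]  # uneven, bursty activity

print(interval_regularity(scripted))
print(interval_regularity(irregular))
```

the scripted series lands close to zero while the irregular one is above one. anyone holding the request log can compute this per API key with a few lines of code.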

the combining step

on its own, a shared IP is a weak signal. a shared API key is stronger. a shared API key plus identical request sequencing is very strong. these signals compound. an RPC provider does not need to publish a clustering methodology for this data to be operationally dangerous. it exists in their logs, and those logs are subject to their privacy policy, law enforcement requests, and whatever commercial data arrangements they have with analytics partners.
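the ordering in that paragraph can be written down as a small decision function. this is a sketch of the reasoning only. the tiers are the ones named above, and treating identical sequencing on its own as weak is an assumption added for completeness.

```python
def link_strength(shared_ip, shared_key, same_sequence):
    """Map the three boolean signals to the rough tiers described in the text."""
    if shared_key and same_sequence:
        return "very strong"
    if shared_key:
        return "strong"
    # assumption: sequencing alone is treated like a shared IP alone
    if shared_ip or same_sequence:
        return "weak"
    return "none"


print(link_strength(shared_ip=True, shared_key=False, same_sequence=False))
print(link_strength(shared_ip=False, shared_key=True, same_sequence=True))
```

note that rotating the IP only flips the first argument. the result stays at “very strong” as long as the key and the sequencing are shared.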

here is a minimal example of what a log entry might look like internally:

timestamp: 2026-03-14T08:42:11.342Z
method: eth_sendRawTransaction
project_id: abc123xyz
source_ip: 185.220.101.47
user_agent: ethers/6.11.1
wallet_derived: 0x4f3eA...

multiply that across fifty wallets with sequential timestamps and one project ID. the clustering writes itself.
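to show how little work that clustering takes, here is a sketch that groups entries shaped like the one above by project_id. the field names follow the example entry. the wallet values and the second project ID are made-up placeholders.

```python
from collections import defaultdict


def cluster_by_project(entries):
    """Group wallet addresses by project_id, keeping only keys shared by 2+ wallets."""
    clusters = defaultdict(set)
    for entry in entries:
        clusters[entry["project_id"]].add(entry["wallet_derived"])
    return {pid: wallets for pid, wallets in clusters.items() if len(wallets) > 1}


# made-up entries for illustration
log_entries = [
    {"project_id": "abc123xyz", "wallet_derived": "0xwallet01"},
    {"project_id": "abc123xyz", "wallet_derived": "0xwallet02"},
    {"project_id": "abc123xyz", "wallet_derived": "0xwallet03"},
    {"project_id": "other456", "wallet_derived": "0xwallet99"},
]

for project_id, wallets in cluster_by_project(log_entries).items():
    print(project_id, sorted(wallets))
```

no IP, user agent, or timing data is needed for this step. the key alone does the linking.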

worked examples

example 1: the shared Infura key

a friend running a mid-scale farming operation in early 2024 was using a single Infura free-tier key across all wallets in a particular farming cluster. the cluster had 40 wallets, each with a separate MetaMask profile on a separate antidetect browser instance with a separate residential proxy exit. on-chain, the wallets had no overlapping transactions or funding paths. they had been funded through separate CEX withdrawal chains with significant delay.

the protocol’s eligibility check excluded the entire cluster. post-mortem analysis pointed to the fact that all 40 wallets had submitted eth_sendRawTransaction calls from the same Infura project ID within a 48-hour window. the provider-side log was the link the on-chain analysis could not find.

the shared Infura free-tier key was the tell. a paid Alchemy Growth plan at $49/month with a separate key per cluster would have been cheaper than losing a 40-wallet allocation.

example 2: viem request signature fingerprinting

viem, the TypeScript library that underlies wagmi and much of the modern dapp stack, has a distinct request pattern. it batches certain calls differently than ethers.js does. for example, viem can be configured to aggregate eth_call reads through a multicall contract, so an ERC-20 balance check can reach the provider in a different shape than the same check sent by an older library.

if you run a cluster of wallets all using the same viem version and the same wrapper script, the RPC provider sees identical method sequences: eth_chainId, eth_getBlockByNumber, eth_call (balance), eth_estimateGas, eth_sendRawTransaction in that order, with timing that reflects the same async/await pattern in your code.
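an ordered method list like that is trivial to turn into a comparable fingerprint. the sketch below hashes the sequence. it is an illustration of the idea under the assumption that only method names and order are compared, not a description of any vendor’s pipeline.

```python
import hashlib


def sequence_fingerprint(methods):
    """Hash an ordered list of RPC method names into a short comparable token."""
    joined = "\n".join(methods)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:16]


script_sequence = [
    "eth_chainId",
    "eth_getBlockByNumber",
    "eth_call",
    "eth_estimateGas",
    "eth_sendRawTransaction",
]

# two wallets driven by the same wrapper script emit the same sequence
wallet_a = sequence_fingerprint(script_sequence)
wallet_b = sequence_fingerprint(list(script_sequence))

# a client that orders its calls differently produces a different token
other_client = sequence_fingerprint(
    ["eth_chainId", "eth_estimateGas", "eth_call", "eth_sendRawTransaction"]
)

print(wallet_a == wallet_b, wallet_a == other_client)
```

matching tokens across wallets do not prove common ownership, since many unrelated users run the same library. they become meaningful when they coincide with a shared key or shared timing.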

this is not hypothetical. blockchain analytics firms like Chainalysis and Nansen have published on the use of infrastructure signals for entity clustering, even if they do not detail the exact RPC-level features they use. the pattern is known.

example 3: QuickNode endpoint reuse across protocol interactions

QuickNode’s paid plans let you create multiple endpoints within one account. some operators create one endpoint per chain but reuse it across all their wallets on that chain. QuickNode’s endpoint analytics dashboard shows per-endpoint request volume and calling patterns. the endpoint itself is the linking identifier.

consider a scenario: you are farming three protocols on Arbitrum, and you reach all three through the same QuickNode Arbitrum endpoint. the protocol teams, or their analytics vendors, could in theory correlate wallet activity against the endpoint’s traffic window to identify which wallets are co-located behind the same infrastructure account. this requires either QuickNode cooperation or a data breach, but the structural vulnerability exists.

the mitigation here is straightforward: one endpoint per wallet cluster, or per protocol. QuickNode’s pricing for dedicated endpoints starts at $49/month per endpoint for their Build plan. for a 50-wallet cluster farming meaningful allocations, that is not a significant cost.

edge cases and failure modes

rotating IPs but keeping the API key

this is the most common mistake. operators treat the IP as the primary identifier and the API key as a billing detail. it is the opposite. the IP rotates. the API key is static. if you rotate residential IPs every session but never change the project ID, you have not made the wallets any harder to link. you have just made the IP-based signal noisy while leaving the API key signal clean.

fix: generate a new API key (or a new provider account) per wallet cluster. some providers allow multiple projects under one billing account. use that. others require separate accounts entirely.

self-hosted nodes with a shared exit

running your own node is the canonical solution. if you run a go-ethereum or erigon node locally, there is no API key and no provider-side logging. but if that node broadcasts every wallet’s transactions through a single network exit point, the IP correlation problem still exists for any counterparty that logs where a transaction first entered the network.

this matters less for pure RPC fingerprinting since the provider is yourself, but it matters for anything that logs the IP of the submitting node, such as mempool monitors, MEV infrastructure, or protocol-level analytics.

library version uniformity

if every wallet in your cluster runs the exact same version of MetaMask or ethers.js, the user agent strings are identical. when those identical strings appear in provider logs alongside the same API key and similar timing, the library version becomes a corroborating signal.

mitigation: vary library versions across clusters, or use a self-hosted RPC relay that strips or normalizes user agent headers before forwarding to the upstream provider.

timing correlation through latency patterns

even with separate API keys, if your wallets all make requests during the same narrow time windows (because your script runs on a cron every hour at :00), and those wallets interact with the same protocol at those times, the timing correlation is exploitable. protocol-side analytics can observe that a set of wallets reliably become active within the same two-minute window, which is a clustering signal independent of RPC provider logs.
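the co-activity check described above needs nothing but wallet addresses and timestamps. here is a minimal sketch using fixed two-minute buckets. the function name, the threshold of three shared windows, and the sample events are all assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations


def coactive_pairs(events, window_seconds=120, min_shared_windows=3):
    """Find wallet pairs that are repeatedly active in the same time bucket.

    events: iterable of (wallet, unix_timestamp) tuples.
    """
    buckets = defaultdict(set)
    for wallet, ts in events:
        buckets[int(ts // window_seconds)].add(wallet)

    shared = defaultdict(int)
    for wallets in buckets.values():
        for pair in combinations(sorted(wallets), 2):
            shared[pair] += 1
    return {pair: n for pair, n in shared.items() if n >= min_shared_windows}


# made-up events: w1 and w2 fire just after each hour, w3 is spread out
events = [
    ("w1", 5), ("w2", 30),
    ("w1", 3610), ("w2", 3640),
    ("w1", 7220), ("w2", 7260),
    ("w3", 1000), ("w3", 5000), ("w3", 9000),
]

print(coactive_pairs(events))
```

fixed buckets miss pairs that straddle a bucket boundary, so a real analysis would more likely use sliding windows. the point is only that a cron-aligned burst shows up with very little effort.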

randomize your execution windows. a uniform distribution across a 4-6 hour window is much harder to cluster than a tight burst. the antidetect browser guides over at antidetectreview.org’s blog go into more depth on timing randomization at the browser session level, which applies to RPC timing as well.

decentralized RPC is not automatically private

services like Pocket Network and Lava Network offer decentralized RPC where requests are routed through a distributed set of node operators rather than a single company. this eliminates the single-point API key correlation problem. but the individual node operators in the network still see the source IP of the request, and depending on the protocol’s relay design, some metadata may be aggregated. decentralized does not mean zero-knowledge. treat these services as better than centralized but not as a complete solution.

what we learned in production

the most reliable change we made was moving to a one-project-ID-per-cluster model with automated provisioning. when you spin up a new wallet cluster, the provisioning script also creates a new RPC provider project and assigns that endpoint to the cluster’s config. this costs marginally more in management overhead but eliminates the static API key as a linking vector.

the second change was adding a lightweight RPC relay layer between the wallets and the upstream provider. the relay is a small proxy running locally (or on a VPS per cluster) that accepts local RPC calls, strips identifying headers, optionally randomizes timing with a small jitter, and forwards to the upstream endpoint. this handles the user agent normalization problem and gives you a single place to rotate upstream providers without touching wallet configurations. there are open-source implementations of this pattern, though most operators build simple nginx or node.js proxies. for a good baseline on the proxy infrastructure side, proxyscraping.org’s blog has useful coverage of proxy chaining setups that translate well to the RPC relay use case.

the third, harder lesson: self-hosted nodes are worth the operational investment at any meaningful scale. a full Ethereum archive node is impractical for most operators, but a full node on the chains that matter most (Arbitrum, Base, Optimism) is achievable. each of those chains has a lighter resource footprint than mainnet. running your own node means your RPC traffic never enters a third-party log. the cost of a dedicated server with enough SSD for a full node on an L2 is roughly $80-150/month depending on provider, which is competitive with running multiple paid RPC plan subscriptions across your clusters. for related context on multi-account infrastructure setups, multiaccountops.com’s blog has operator-level writeups that are worth reading alongside this.

a note on what this does not solve: on-chain clustering by wallet behavior remains a separate problem. RPC provider fingerprinting is one signal in a multi-signal detection system. cleaning up your RPC hygiene removes one layer of exposure but does not address gas funding patterns, on-chain timing correlation, or transaction graph analysis. those are covered separately in the wallet clustering detection methods deep-dive and the airdrop sybil checklist on this site.

the detection surface is wide. treat each signal source as independent. RPC provider logs are one of the easiest to control, which is why it is frustrating to see operators neglect them while spending money on more complex mitigations.

references and further reading

Ethereum JSON-RPC API specification - the canonical reference for the RPC methods your wallet software calls. understanding what methods your toolchain emits is the first step to understanding your fingerprint surface.
EIP-1193: Ethereum Provider JavaScript API - the standard that defines how browser wallets expose RPC to dapps. relevant for understanding the browser-based attack surface.
Alchemy API reference overview - Alchemy’s documentation on their API structure, including rate limiting and project ID scoping. useful for understanding what metadata is captured at the API layer.
QuickNode endpoint statistics documentation - documents what per-endpoint analytics QuickNode exposes to account holders, which gives a sense of what is logged.
Infura privacy policy - ConsenSys’s privacy policy, which governs what Infura collects and how it may be shared. reading provider privacy policies is tedious but operationally relevant.

also relevant: the residential proxy setup guide for web3 operations on this site covers the IP-layer hygiene that complements the RPC-layer hygiene discussed here.

Written by Xavier Fok

disclosure: this article may contain affiliate links. if you buy through them we may earn a commission at no extra cost to you. verdicts are independent of payouts. last reviewed by Xavier Fok on 2026-05-19.