The Graph and Subgraphs: How Ethereum Data Gets Indexed
Mar 10, 2026
Blockchain data is public. Every transaction, every token transfer, every NFT sale - it’s all out there on Ethereum. But here’s the problem: finding it is like searching for one specific grain of sand on a beach the size of Texas. You could write a script that scans every block from day one, but that would take hours, cost thousands in compute, and break every time the chain reorganizes. That’s where The Graph comes in. The Graph is a decentralized protocol that indexes and queries blockchain data, turning raw Ethereum events into structured, searchable APIs. Sometimes called the Google of Web3, it launched in 2020 to solve a fundamental problem: Ethereum data is open, but querying it efficiently is nearly impossible without heavy infrastructure.
What Exactly Is a Subgraph?
A subgraph is the heart of The Graph. Think of it as a custom API you build to pull out exactly the data you care about - like all NFT transfers for a specific collection, or every swap on Uniswap over the last 30 days. Subgraphs aren’t just filters. They’re structured definitions of how to turn raw blockchain events into clean, relational data you can query with GraphQL.
Every subgraph has three core parts:
- A manifest file (in YAML): This tells The Graph which smart contracts to watch, which events to listen for (like Transfer or Mint), and which handler functions to run when those events happen.
- A GraphQL schema: This defines the shape of your data. You say things like “a User has an address, a list of owned NFTs, and a trade count.” The Graph turns this into a database table structure.
- Mapping code (in AssemblyScript): This is where the magic happens. You write functions that take an event - say, someone transferring an NFT - and turn it into a User entity, an NFT entity, and link them together.
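To make the manifest's role concrete, here is a rough sketch of its usual shape. The real manifest is a YAML file; it is mirrored here as a TypeScript object purely for illustration, and trimmed to the fields discussed above. The contract name `ExampleNFT`, the zero address, the start block, and the `handlerFor` helper are all hypothetical.

```typescript
// Illustrative only: the real manifest is YAML (subgraph.yaml). This object
// mirrors a trimmed version of its shape so the moving parts are visible.
interface EventHandler {
  event: string;   // event signature as declared in the contract ABI
  handler: string; // name of the mapping function to run
}

interface DataSource {
  kind: string;
  name: string;
  network: string;
  source: { address: string; abi: string; startBlock: number };
  mapping: {
    language: string;
    entities: string[];
    eventHandlers: EventHandler[];
    file: string;
  };
}

interface Manifest {
  schema: { file: string };
  dataSources: DataSource[];
}

const TRANSFER = "Transfer(indexed address,indexed address,indexed uint256)";

const manifest: Manifest = {
  schema: { file: "./schema.graphql" },
  dataSources: [
    {
      kind: "ethereum",
      name: "ExampleNFT", // hypothetical contract name
      network: "mainnet",
      source: {
        address: "0x0000000000000000000000000000000000000000", // placeholder
        abi: "ExampleNFT",
        startBlock: 18000000, // hypothetical: skip blocks before deployment
      },
      mapping: {
        language: "wasm/assemblyscript",
        entities: ["User", "NFT"],
        eventHandlers: [{ event: TRANSFER, handler: "handleTransfer" }],
        file: "./src/mapping.ts",
      },
    },
  ],
};

// Hypothetical helper: find which handler the manifest assigns to an event.
function handlerFor(m: Manifest, eventSignature: string): string | null {
  for (const ds of m.dataSources) {
    for (const eh of ds.mapping.eventHandlers) {
      if (eh.event === eventSignature) return eh.handler;
    }
  }
  return null;
}
```

Read it top to bottom and the division of labor shows up: `source` says what to watch, `eventHandlers` says what to react to, and `entities` names the schema types the mapping code will write.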
For example, when a Transfer event fires on a CryptoPunks contract, your mapping function might:
- Load the existing NFT record (by ID).
- Check if the sender is a known user - if not, create one.
- Update the NFT’s owner field.
- Record the timestamp and transaction hash.
After that, you can query it like: query { nfts(where: {owner: "0x..."}) { id, uri } } - and get results in milliseconds.
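The handler steps above can be sketched in plain TypeScript. This is a simulation, not real subgraph code: actual mappings are written in AssemblyScript against generated entity classes, while this version uses in-memory maps as a stand-in for the store. The `TransferEvent` shape, the field names, and the `nftsOwnedBy` helper are assumptions made for the example.

```typescript
// In-memory stand-in for the subgraph store. Real mappings use generated
// entity classes instead of plain maps.
interface User { id: string }
interface NFT {
  id: string;
  owner: string;
  updatedAt: number;  // block timestamp of the latest transfer
  lastTxHash: string; // transaction hash of the latest transfer
}
interface TransferEvent {
  from: string;
  to: string;
  tokenId: string;
  timestamp: number;
  txHash: string;
}

const users = new Map<string, User>();
const nfts = new Map<string, NFT>();

// Return the User for an address, creating it on first sight.
function ensureUser(address: string): User {
  let user = users.get(address);
  if (user === undefined) {
    user = { id: address };
    users.set(address, user);
  }
  return user;
}

function handleTransfer(event: TransferEvent): void {
  // 1. Load the existing NFT record by ID, or start a fresh one.
  const nft: NFT = nfts.get(event.tokenId) ?? {
    id: event.tokenId,
    owner: event.from,
    updatedAt: 0,
    lastTxHash: "",
  };
  // 2. Make sure sender and recipient both exist as User entities.
  ensureUser(event.from);
  const recipient = ensureUser(event.to);
  // 3. Update the owner field.
  nft.owner = recipient.id;
  // 4. Record the timestamp and transaction hash, then save.
  nft.updatedAt = event.timestamp;
  nft.lastTxHash = event.txHash;
  nfts.set(nft.id, nft);
}

// Rough equivalent of the GraphQL filter nfts(where: {owner: ...}).
function nftsOwnedBy(owner: string): NFT[] {
  const result: NFT[] = [];
  nfts.forEach((nft) => {
    if (nft.owner === owner) result.push(nft);
  });
  return result;
}
```

Feeding two transfers of the same token through `handleTransfer` leaves one NFT record pointing at the latest owner, which is exactly why the later query is cheap: the work was done once, at indexing time.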
How The Graph Works Behind the Scenes
The Graph doesn’t run on one server. It’s a network of independent node operators called Indexers, who stake Graph Tokens (GRT) to index subgraphs and serve queries, earning fees and rewards in return. These Indexers compete to serve data faster and more reliably. When a dApp like an NFT marketplace needs to show all past sales for a collection, it doesn’t call the smart contract directly. It asks The Graph’s network: “Give me all NFT transfers for contract X since block 18 million.”
Indexers earn money in two ways:
- Query fees: Users pay in GRT to get data.
- Indexing rewards: The protocol distributes new GRT to Indexers who serve high-quality, in-demand subgraphs.
But there’s skin in the game. If an Indexer serves wrong data - say, fake NFT ownership records - they lose part of their staked GRT. This economic incentive keeps the network honest. Curators also play a role: they stake GRT to signal which subgraphs are trustworthy. A stronger curation signal on a subgraph means more traffic and rewards for the Indexers serving it.
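The incentive arithmetic is easier to see with numbers. The sketch below is a deliberately simplified model, not the protocol's actual accounting: the slashing rate, stake, reward pool, and signal values are all made up, and rewards are split across subgraphs purely in proportion to curation signal.

```typescript
// Hypothetical numbers throughout. Real protocol parameters are set by
// governance and the real reward formula has more inputs than this.
interface Indexer { id: string; stake: number }
interface SubgraphSignal { subgraph: string; signal: number }
interface SubgraphReward { subgraph: string; reward: number }

// Slash an Indexer by a rate given in basis points (1 bp = 0.01%).
// Returns the amount of GRT removed from the stake.
function slash(indexer: Indexer, slashBps: number): number {
  const penalty = Math.floor((indexer.stake * slashBps) / 10000);
  indexer.stake -= penalty;
  return penalty;
}

// Split a reward pool across subgraphs in proportion to curation signal.
function splitRewards(pool: number, signals: SubgraphSignal[]): SubgraphReward[] {
  let total = 0;
  for (const s of signals) total += s.signal;
  return signals.map((s) => ({
    subgraph: s.subgraph,
    reward: total === 0 ? 0 : Math.floor((pool * s.signal) / total),
  }));
}

const indexer: Indexer = { id: "indexer-1", stake: 100000 };
const penalty = slash(indexer, 250); // assumed 2.5% slashing rate
const rewards = splitRewards(1000, [
  { subgraph: "popular-dex", signal: 300 },
  { subgraph: "niche-nft", signal: 100 },
]);
```

Under these assumptions the Indexer loses 2,500 GRT for serving bad data, and the heavily signaled subgraph attracts three quarters of the pool. Both levers push in the same direction: index what people want, and index it correctly.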
This is why The Graph is decentralized. No single company controls the data. Alchemy or Infura might offer similar APIs, but they’re centralized. If Alchemy goes down, your dApp breaks. If one Indexer on The Graph fails, others pick up the load.
Why Developers Love (and Hate) The Graph
Let’s talk real-world impact.
A developer at an NFT marketplace told me their frontend used to take 18 seconds to load because it was calling the Ethereum node directly to fetch 10,000+ past sales. Each call was slow. They switched to The Graph. Now it loads in under 1.5 seconds. That’s not a tweak - that’s a product-level upgrade.
Uniswap uses 12 subgraphs. Aave uses 8. Curve uses 5. These aren’t small projects. They’re foundational to DeFi. Without The Graph, building dashboards, analytics tools, or even simple wallet interfaces would be far harder.
But it’s not all smooth sailing.
First, you have to learn AssemblyScript - a TypeScript-like language that compiles to WebAssembly. It’s not JavaScript. It’s stricter. You can’t use Node.js libraries. Some familiar TypeScript features, such as closures and union types, aren’t available, and nullable values have to be handled explicitly. One developer on Reddit spent three days just getting their first subgraph to deploy because of a mismatched entity type.
Second, chain reorganizations happen. Ethereum occasionally restructures its last few blocks. If your mapping doesn’t handle this correctly, you might end up with duplicate or missing data. The Graph’s documentation has guides for this, but it’s not beginner-friendly.
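One common way for an indexing pipeline to cope with reorgs is to remember what each block changed, so those changes can be undone when the block is dropped. The sketch below shows that idea with a toy key-value store. It is an illustration of the technique only, not Graph Node's implementation, and the `ReorgAwareStore` class and its methods are invented for the example. It assumes blocks are processed in increasing order.

```typescript
// A toy store that records, per block, the previous value of every key it
// overwrote, so writes can be rolled back when blocks are reorganized away.
interface UndoEntry { key: string; previous: string | undefined }
interface BlockUndo { block: number; entries: UndoEntry[] }

class ReorgAwareStore {
  private state = new Map<string, string>();
  private undoLog: BlockUndo[] = [];

  // Record a write made while processing `block`.
  set(block: number, key: string, value: string): void {
    let last = this.undoLog[this.undoLog.length - 1];
    if (last === undefined || last.block !== block) {
      last = { block, entries: [] };
      this.undoLog.push(last);
    }
    last.entries.push({ key, previous: this.state.get(key) });
    this.state.set(key, value);
  }

  get(key: string): string | undefined {
    return this.state.get(key);
  }

  // Undo every write made in blocks after `block` (the common ancestor).
  revertTo(block: number): void {
    for (;;) {
      const last = this.undoLog[this.undoLog.length - 1];
      if (last === undefined || last.block <= block) return;
      this.undoLog.pop();
      // Undo newest-first so the oldest recorded value is restored last.
      for (const entry of last.entries.slice().reverse()) {
        if (entry.previous === undefined) this.state.delete(entry.key);
        else this.state.set(entry.key, entry.previous);
      }
    }
  }
}

const store = new ReorgAwareStore();
store.set(100, "nft-1", "0xaaa");
store.set(101, "nft-1", "0xbbb"); // this block later gets reorganized away
store.set(102, "nft-2", "0xccc"); // so does this one
store.revertTo(100);              // roll back to the common ancestor
```

After the rollback, `nft-1` is owned by `0xaaa` again and `nft-2` no longer exists, so replaying the new version of blocks 101 and 102 cannot produce duplicates or leave stale records behind.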
Third, the upfront cost. You’re trading long-term performance for short-term complexity. With a centralized API, you get data in minutes. With The Graph, you spend days building, testing, and deploying a subgraph. But once it’s live? It scales. Automatically.
The Graph in Numbers: Adoption and Scale
As of early 2026, The Graph indexes data across 15 blockchains - but Ethereum still drives 68% of all subgraph deployments. Over 1,100 projects rely on it. That includes:
- DeFi protocols (65% of subgraphs)
- NFT platforms (22%)
- DAO tools and governance dashboards (8%)
- Gaming and identity apps (5%)
The Graph’s token (GRT) has a circulating supply of over 8.5 billion. Daily queries have grown from 850 million in Q1 2023 to over 1.2 billion today. That’s not just traffic - it’s infrastructure.
And the trend is clear: the decentralized web needs this layer. Centralized APIs are fast, but they’re single points of failure. The Graph’s network is slow to build, but it’s resilient. It’s censorship-resistant. It’s owned by its users.
What’s Next for The Graph
The Graph isn’t standing still. In 2023, it launched the “Accelerate Program” - a $25 million fund to support subgraph development on emerging chains like Arbitrum, Polygon, and zkSync. In Q2 2024, it sunset its hosted service. That meant no more free, centralized indexing. Subgraphs had to move to the decentralized network. It was a bold move - one that forced developers to fully embrace decentralization.
Future upgrades include SP1 integration - a zero-knowledge proof system that will let The Graph prove data integrity without revealing the full dataset. Imagine proving you own an NFT without showing your wallet address. That’s the next frontier.
Still, challenges remain. Critics point out that if query volume drops during a bear market, Indexers might not earn enough to cover costs. The token economics are tied to usage. If adoption slows, the incentive structure could strain.
But right now, the data doesn’t lie. The Graph is the backbone of Web3 data. It’s not flashy. It doesn’t make headlines. But every time you check your NFT collection, view a DeFi position, or track a DAO vote - you’re using The Graph.
Is The Graph Right for You?
If you’re building a dApp and need to show historical data - trades, balances, ownership, events - then yes. The Graph is one of the few solutions that scales without you running your own indexing infrastructure. If you’re just experimenting, maybe start with a centralized API like Alchemy. But if you want your project to last, to be trustless, to be truly decentralized - then learning subgraphs isn’t optional. It’s essential.
Start with Subgraph Studio. Build a simple subgraph for a test contract. Watch how events turn into queries. You’ll see why this isn’t just another tool - it’s the plumbing behind the future of blockchain apps.
What is The Graph used for?
The Graph is used to index and query blockchain data efficiently. It allows developers to build decentralized applications (dApps) that can retrieve historical on-chain data - like token transfers, NFT ownership, or DeFi trades - without scanning every block. Instead, it serves structured data through GraphQL APIs, making dApps faster and more reliable.
How do subgraphs work on Ethereum?
Subgraphs on Ethereum are defined by a manifest file, a GraphQL schema, and mapping code written in AssemblyScript. The manifest tells The Graph which smart contracts and events to monitor. The schema defines the data structure. The mapping code processes each event - like a token transfer - and converts it into stored entities (e.g., User, NFT). Indexers then serve these entities via GraphQL queries.
Do I need to know AssemblyScript to use The Graph?
Yes, if you’re building your own subgraph. AssemblyScript is required to write mapping functions that process blockchain events. It’s similar to TypeScript but compiled to WebAssembly. However, if you’re only using existing subgraphs (like Uniswap’s), you don’t need to write code - just query the GraphQL endpoint.
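Querying an existing subgraph comes down to sending a GraphQL document in a JSON POST body. Here is a minimal sketch of building such a request. The endpoint URL is a placeholder, the `nfts`, `owner`, `id`, and `uri` names come from the example query earlier in this article rather than from any real subgraph, and lowercasing the address assumes the subgraph stores addresses as lowercase hex strings.

```typescript
// Hypothetical endpoint: substitute the URL of the subgraph you want to query.
const SUBGRAPH_URL = "https://example.com/subgraphs/name/example/nft";

interface GraphQLRequest {
  url: string;
  method: "POST";
  headers: { "Content-Type": string };
  body: string;
}

// Build a POST request asking for the NFTs held by one owner.
function buildOwnerQuery(owner: string, first: number): GraphQLRequest {
  const query = `
    query NftsByOwner($owner: String!, $first: Int!) {
      nfts(where: { owner: $owner }, first: $first) {
        id
        uri
      }
    }`;
  return {
    url: SUBGRAPH_URL,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // Variables keep user input out of the query string itself.
    body: JSON.stringify({
      query,
      variables: { owner: owner.toLowerCase(), first },
    }),
  };
}

const request = buildOwnerQuery("0xABCDEF", 10);
```

The resulting object can be handed to any HTTP client, for example by passing `request.url` along with the method, headers, and body to `fetch`. The response is JSON with the matching entities under a `data` key.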
What’s the difference between The Graph and Alchemy?
The Graph is decentralized: data is indexed and served by a network of staked node operators, making it censorship-resistant and trustless. Alchemy offers centralized APIs - fast, easy to use, but controlled by one company. If Alchemy goes offline, your app breaks. If one indexer on The Graph fails, others take over. The Graph is slower to set up but more resilient long-term.
Can I build a subgraph for any Ethereum contract?
Yes, as long as you have the contract’s ABI (Application Binary Interface) and it emits events. Even contracts without events can be indexed, but it’s harder - you’d have to rely on call handlers or block handlers, which are slower to index and aren’t supported on every network. The Graph works best with contracts that emit clear, well-documented events like Transfer, Mint, or Burn.
What happens if The Graph network goes down?
It won’t go down like a centralized service. The Graph runs on hundreds of independent Indexers. If one fails, others continue serving data. Even if many go offline, the network self-corrects: subgraphs with high stake and curator signals get prioritized. The protocol’s economic design ensures redundancy. There’s no single point of failure.