Original Title: How I Learned to Stop Worrying & Love Execution Sharding
Video Link: https://www.youtube.com/watch?v=A0OXif6r-Qk
Speaker: Scott Sunarto (@smsunarto) on Research Day
Article Edited by: Justin Zhao (@hiCaptainZ)
Hello everyone, I am Scott (@smsunarto), the founder of Argus Labs (@ArgusLabs_). Today, I want to discuss a topic that we haven't touched on for a while. As roll-ups have become mainstream, our discussions about execution sharding have not been as frequent as those about data sharding. So, let's revisit this somewhat overlooked topic—execution sharding.
This will be a light conversation. I know everyone has been listening to complex concepts all day, so I will try to make this discussion as practical as possible. I have prepared a suitable slide design for this talk.
For those who don't know me, interestingly, I am known as the anime girl on Twitter. I also missed my college graduation ceremony to come here, which made my parents very unhappy. Currently, I am the founder of Argus Labs. We consider ourselves a gaming company rather than an infrastructure or cryptocurrency company. One of my biggest frustrations is that everyone in the crypto gaming space wants to build tools, but no one wants to create content or applications. We need more applications that users can actually use.
Previously, I co-created Dark Forest (@darkforest_eth) with my smart friends Brian Gu (@gubsheep) and Alan Luo (@alanluo_0). Brian is now running 0xPARC (@0xPARC), and he is much smarter than I am.
Today's discussion will focus on execution sharding, but in a different context from the one most people are used to. We usually discuss execution sharding in the context of Layer 1, such as Ethereum sharding or Near sharding. But today, I want to change the context. Let's think about what sharding would look like in a roll-up environment.
The fundamental question here is, why would a gaming company want to build its own roll-up, and what can we learn from World of Warcraft to design roll-ups? Additionally, we will explore how the design space of roll-ups far exceeds the current reality.
To answer these questions, let's go back to 2020 when the idea of Dark Forest was first conceived. We asked ourselves, what if we created a game where every game action is an on-chain transaction? At that time, this premise seemed absurd, and it still does for many today. But it was an interesting hypothesis, so we built it, and Dark Forest was born.
Dark Forest is a fully on-chain space exploration MMORTS game on Ethereum, powered by zkSNARKs. Back in 2020, ZK was not as popular as it is today, and there was almost no documentation available. The only documentation for Circom was Jordi Baylina's (@jbaylina) Google Doc. Despite the challenges, we learned a lot from the process, and Dark Forest is a manifestation of those learnings.
Dark Forest became a larger experiment than we imagined. We have over 10,000 players who have spent trillions of gas, and the game is filled with chaos, with people backstabbing each other on-chain. The most fascinating aspect of Dark Forest and on-chain gaming is its nature as a platform. By having a fully on-chain game, you open the door to a design space for emergent behaviors, allowing people to build smart contracts that interact with the game, as well as alternative clients and game modes, such as Dark Forest Arena and GPU miners.
However, with great power comes great responsibility. When we launched Dark Forest on what is now known as Gnosis Chain (formerly xDai), we ultimately filled the entire block space of the chain. This made the chain essentially unusable for anything else, including DeFi, NFTs, or any other xDAI-related activities.
So, what now? Are we at a dead end? Will fully on-chain games never become a reality? Or are we going to go back to making games that only have JPEGs on-chain and convince people that money grows on trees? The answer is that we let software do the work. Many of us have a very rigid view of blockchains and roll-ups, as if there is not much room for improvement. But I disagree. We can experiment and discover new possibilities.
We asked ourselves a question: if we were to design a blockchain from scratch solely for games, what would it look like? We need high throughput, so we need to scale reads and writes. Most blockchains are designed to handle a lot of writes. Transactions per second (TPS) is a metric that people boast about, but reading is equally important. If you can't read from the blockchain node, how do you know where the players are? This is actually the first bottleneck we discovered in blockchain construction.
Dark Forest encountered a problem where full nodes were heavily utilized, and I/O exploded because we needed to read data from the on-chain state. This led to thousands of dollars in server costs, which the xDAI team generously covered for us. However, this is not ideal in the long run. We need high throughput not only for transactions written per second but also for reads, such as fetching data from the blockchain itself.
We also need a horizontally scalable blockchain to avoid the Noisy Neighbor problem. We don't want a popular game to suddenly start having issues on the blockchain, stopping all work. We also need flexibility and customizability so that we can modify the state machine to suit game design. This includes having a game loop that is self-executing, and so on.
Last but not least, for those unfamiliar with online game architecture, this may be a bit vague, but we need a high tick rate. Ticks are the atomic units of time in the game world. In the context of blockchain, we have blocks as atomic units of time; in games, we have ticks. When you build a fully on-chain game the analogy becomes literal: the block rate of your blockchain is the tick rate of the game itself.
So, what we need is a blockchain that is high throughput, horizontally scalable, flexible, customizable, and has a high tick rate. Such a design is necessary to meet the needs of a blockchain designed from scratch for games.
If you have a higher tick rate or more blocks per second, the game will feel more responsive. Conversely, if your tick rate is low, the game will feel sluggish. One key thing to remember is that if there is a delay in block generation, you will feel a noticeable delay in the game. This is a terrible experience. If you've ever dealt with angry players shouting at their computers because they lost a game, that's an absolutely terrible situation.
Currently, our rollups generate one block per second, which corresponds to one tick. If we want to have cooler games, we need a higher tick rate. For example, Minecraft, a simple pixel art game, runs at 20 ticks per second. We have a long way to go to build games that are as responsive as Minecraft.
One possible solution is to deploy our own rollup. While this superficially seems to solve the problem, it doesn't address the root cause. For example, you would get higher write throughput, but it still wouldn't fully meet the needs of the game. Of course, if your game has a hundred players, that would be sufficient. However, if you want to build a game that requires higher throughput, you will hit hard limits because of how I/O is handled in current constructions.
In terms of reading, you don't really get a performance boost. You still need to rely on indexers. You don't have true horizontal scalability. If you try to launch a new rollup to horizontally scale your game, you will disrupt your existing smart contract ecosystem. Marketplaces deployed by players will not be able to work with other chains you launch to horizontally scale the game. This will raise many issues.
Finally, achieving a high tick rate and blocks per second is still challenging. While we can push for it, we might get two blocks per second, maybe three, but that's really as far as these blockchains can go, because overhead like marshalling and re-marshalling data eats heavily into the available compute cycles.
To address this issue, we looked back to the early 21st century and the late 1990s when online games like MMOs were just emerging. They had a concept called sharding. This is not a new concept; it has existed in the past. The term "sharding" we use in database architecture actually comes from a reference in Ultima Online. They were the first to use the term "sharding" to explain their different servers.
So, how does sharding work in games? It is not a one-size-fits-all solution. It is a tool in the toolbox, and how you adapt it to your game depends on the specific circumstances.

The first sharding construct is what I like to call location-based sharding. A good mental model is to imagine a Cartesian plane divided into four quadrants, each with its own game shard. Every time you want to cross a shard boundary, you send a message to the other shard saying, "Hey, I want to move there," and you are teleported into the new shard, leaving your previous player body behind. By doing this, you distribute the server workload across multiple physical instances instead of forcing one server to do all the computation for the entire game world.

The second construct is more popular nowadays. It is called multi-universe sharding, where you have multiple game instances mirroring each other. You can choose any shard you want to join, and it is load-balanced by default, so no single server becomes overcrowded.
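The quadrant mental model for location-based sharding can be sketched in a few lines of Go. The shard IDs and function names here are purely illustrative, not any real game's API:

```go
package main

import "fmt"

// shardFor maps a world coordinate to one of four quadrant shards,
// following the "Cartesian plane split into quadrants" construct.
func shardFor(x, y int) int {
	switch {
	case x >= 0 && y >= 0:
		return 0 // quadrant I
	case x < 0 && y >= 0:
		return 1 // quadrant II
	case x < 0 && y < 0:
		return 2 // quadrant III
	default:
		return 3 // quadrant IV
	}
}

// crossShard models the hand-off: a move that changes quadrants
// requires transferring the player from one shard to another.
func crossShard(fromX, fromY, toX, toY int) bool {
	return shardFor(fromX, fromY) != shardFor(toX, toY)
}

func main() {
	fmt.Println(shardFor(10, 5))          // 0 (quadrant I)
	fmt.Println(crossShard(10, 5, -3, 5)) // true: quadrant I -> II
}
```

The design choice this illustrates: shard assignment is a pure function of world position, so each server only simulates the entities inside its own region, and cross-shard moves become explicit messages instead of shared state.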
Now, the key question is: how do we bring this concept into rollups? This is why we created the World Engine. The World Engine is our flagship infrastructure, essentially a sharded sequencer designed for games. Compared to many shared sequencer designs we've seen in previous talks, ours is different and better suited to our needs. We optimize for two things: (a) throughput, and (b) keeping the runtime free of locks so that tick rate and block time stay as consistent as possible. It is therefore asynchronous by default, and the sequencer enforces only a partial ordering rather than a total ordering (where every transaction must be sequenced relative to every other).
There are two main components. First, we have the EVM base shard, a vanilla EVM chain where players can deploy smart contracts, compose with the game, create marketplaces with taxes, and so on. It functions like a normal chain—one block per second or so, which is plenty for your typical DeFi and marketplace activities.
The secret ingredient is the game shard, which is essentially a mini-blockchain designed as a high-performance game server. We have a bring-your-own-implementation interface, so you can customize this shard to your preferences. You can build your own shard and plug it into the base shard. You just need to implement a standard set of interfaces, similar to the ABCI interface you may know from Cosmos. Implement a comparable specification and you can bring your own shard into the World Engine stack.
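As a sketch of what a bring-your-own-implementation interface might look like, here is a hypothetical Go interface in the spirit of Cosmos's ABCI lifecycle hooks. The method names (`BeginTick`, `DeliverTx`, `EndTick`) and the toy shard are my own illustration, not the actual World Engine spec:

```go
package main

import "fmt"

// GameShard is a hypothetical shard interface: the stack drives any
// implementation through a small set of per-tick lifecycle hooks,
// loosely analogous to ABCI's BeginBlock/DeliverTx/EndBlock.
type GameShard interface {
	BeginTick(tick uint64)
	DeliverTx(tx []byte) error
	EndTick() (stateSummary []byte)
}

// CounterShard is a toy implementation that just counts the
// transactions delivered in each tick.
type CounterShard struct {
	tick  uint64
	count int
}

func (s *CounterShard) BeginTick(tick uint64) { s.tick, s.count = tick, 0 }

func (s *CounterShard) DeliverTx(tx []byte) error {
	s.count++ // a real shard would apply game logic here
	return nil
}

func (s *CounterShard) EndTick() []byte {
	return []byte(fmt.Sprintf("tick=%d,txs=%d", s.tick, s.count))
}

func main() {
	var shard GameShard = &CounterShard{}
	shard.BeginTick(1)
	shard.DeliverTx([]byte("move north"))
	shard.DeliverTx([]byte("fire laser"))
	fmt.Println(string(shard.EndTick())) // tick=1,txs=2
}
```

The value of an interface like this is that the base shard never needs to know what a game shard does internally; anything that implements the hooks can be sequenced and settled the same way.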
The key is achieving a high tick rate, which we cannot get from existing constructions. This is where I want to introduce Cardinal. Cardinal is the first game shard implementation for the World Engine. It uses an entity-component-system (ECS) architecture with data-oriented design. This allows us to parallelize the game and increase the throughput of game computations. It has a configurable tick rate of up to 20 ticks per second—for the blockchain folks here, that's 20 blocks per second.
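For readers unfamiliar with ECS, here is a minimal Go sketch of the idea: entities are just IDs, components are plain data stored in per-type stores, and a system is a function run once per tick over the components it cares about. This illustrates the general ECS pattern, not Cardinal's actual API:

```go
package main

import "fmt"

// Entity is just an identifier; it owns no data itself.
type Entity int

// Components are plain data, kept separate from behavior.
type Position struct{ X, Y float64 }
type Velocity struct{ DX, DY float64 }

// World holds one store per component type, keyed by entity.
type World struct {
	positions  map[Entity]*Position
	velocities map[Entity]*Velocity
	next       Entity
}

func NewWorld() *World {
	return &World{
		positions:  map[Entity]*Position{},
		velocities: map[Entity]*Velocity{},
	}
}

func (w *World) Spawn(p Position, v Velocity) Entity {
	e := w.next
	w.next++
	w.positions[e] = &p
	w.velocities[e] = &v
	return e
}

// MovementSystem is a "system": pure logic run once per tick over
// every entity that has both a Position and a Velocity.
func MovementSystem(w *World) {
	for e, v := range w.velocities {
		if p, ok := w.positions[e]; ok {
			p.X += v.DX
			p.Y += v.DY
		}
	}
}

func main() {
	w := NewWorld()
	e := w.Spawn(Position{X: 0, Y: 0}, Velocity{DX: 1, DY: 2})
	MovementSystem(w) // one tick
	fmt.Println(w.positions[e].X, w.positions[e].Y) // 1 2
}
```

Because components are plain data and systems touch only the stores they need, independent systems can be scheduled in parallel across a tick, which is where the throughput gain in a data-oriented design comes from.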
We can also geo-locate it to reduce latency. For example, if your sequencer is in the US, someone in Asia might face 300 milliseconds of latency before their transaction even reaches it. This is a huge problem in gaming, because 300 milliseconds is a long time. If you've ever tried to play an FPS game with 200 milliseconds of latency, you know you're basically already dead.
Another point that is very important to us is that it is self-indexing. We no longer need external indexers, and we don't need those frameworks to cache game state. This also lets us build more real-time games without the latency caused by indexers perpetually trying to catch up with the sequencer's blocks.
We also have a plugin system that allows people to parallelize ZK verification, etc. The best part, at least for me, is that you can write your code in Go. No more needing to use Solidity to make your game work. If you've ever tried to build a blockchain game with Solidity, that was a nightmare.
However, the key point of our sharding construct is that you can build anything as a shard. There is basically an infinite design space for what a shard can be.
Suppose you don't like writing your game code in Go—you are free to choose another way. We are developing a Solidity game shard that lets you implement your game in Solidity while retaining many of Cardinal's advantages. You can also create an NFT minting shard with its own mempool and ordering construct, solving the Noisy Neighbor problem that large mints typically cause. You can even create a game identity shard that represents your game account as an NFT, allowing you to trade game identities simply by transferring the NFT instead of sharing private keys.
This is an advanced architecture, and I won't go into too much depth today due to time constraints. The key is that we allow EVM smart contracts to compose with game shards through custom precompiles. We built a wrapper around Geth that lets the two communicate, opening up a lot of design space in both directions. We are asynchronous by default and can interoperate between shards without locking.
Our shared sequencer is unique because it does not use a construct that prioritizes globally ordered atomic bundles. That approach requires a locking mechanism, which blocks the main thread and produces unstable tick rates and block times—meaning lag in games. It also caps the block time of every shard and requires various cryptoeconomic constructs to prevent denial of service. There is also a big problem I haven't seen addressed in many shared sequencer designs: if shards depend on each other, how do you resolve the resulting deadlocks? With an asynchronous design, this is a non-issue: each shard proceeds independently, fires off its messages, and moves on.
In fact, cross-shard atomic bundles are often unnecessary in rollups. For our use case, nothing requires atomic bundles, and we don't think they are something we should design our rollups around. This also unlocks other interesting features. For example, each game shard can use a DA layer separate from the base shard's: the base shard can push data to Ethereum while a game shard pushes data to Celestia (or something like a data availability committee). You can also reduce the hardware requirements for running full nodes, because you can run the base shard's Geth full node on its own without running any game shard nodes, making it easier to integrate with services like Alchemy.
To summarize, I want to be candid here that many people hope their constructs can solve all problems, but we are not. We believe our construct is useful for us, but it may not be suitable for your use case. Assuming our construct can apply to everyone is unrealistic. For us, it meets our needs, providing high throughput, horizontal scalability, flexibility, and high tick rates, but it does not cure cancer. If you need a DeFi protocol that requires synchronous composability, then this construct may not be suitable for you.
Overall, I sincerely believe in the concept of human-centric blockchain architecture. By designing around specific user roles and use cases, you can make better trade-offs instead of trying to solve everyone's problems. The Renaissance has arrived: everyone can design their own rollups to meet their specific needs instead of relying on generic solutions. I think we should embrace the Cambrian explosion. Don't build rollups like one-size-fits-all Layer 1s—otherwise we just end up solving the same problems all over again. Personally, I look forward to seeing more people explore rollup designs tailored to specific use cases. For example, what would a rollup designed specifically for asset exchange look like? Would it be intent-based? What would a rollup designed specifically for on-chain CLOBs (Central Limit Order Books) look like? With that, I will hand the microphone over to MJ. Thank you for having me.
English Version:
https://captainz.xlog.app/Why-does-Argus-Build-FOC-Gaming-INFRA-Using-Sharding