Exploring the Challenges of Building a zkEVM. ELI5 Edition.
Introduction
Zero-knowledge technology isn’t a new concept. In fact, it has already been built and proven effective: most privacy blockchains use it.
However, the difficulty arises when it is applied to smart contracts. That is new and demanding territory; it’s a stretch, but it has to be done.
In this article, we’ll look at the difficulties involved in building a zkEVM.
Building a zkEVM is a challenging task due to the design of the Ethereum Virtual Machine (EVM) and the requirements of zero-knowledge proof systems. The EVM is the backbone of the Ethereum blockchain: it provides the environment in which smart contracts (computer code that facilitates the exchange of money and information) are deployed and executed.
Deficiencies in Stack-based Architecture
One problem is the EVM's stack-based architecture, which can increase the difficulty of proving computation.
A stack is a data structure consisting of homogeneous elements that follows the last in, first out (LIFO) principle: the last piece of data added to the stack is the first one to be removed.
The EVM's stack holds words that are 256 bits wide. This word size lets the EVM handle native hashing and elliptic curve operations, which help keep funds safu.
The EVM executes a program by pushing data onto the stack, performing operations on that data, and then popping the data off the stack when it is no longer needed.
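The push, operate, and pop cycle described above can be sketched as a toy stack machine. This is a minimal illustration, not the real EVM: the instruction names and the `run` function are made up for this example, and only the 256-bit word size is taken from the text.

```python
WORD_MASK = (1 << 256) - 1  # keep every value inside a 256-bit word


def run(program):
    """Execute a list of (op, arg) tuples on a toy stack machine (hypothetical format)."""
    stack = []
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg & WORD_MASK)
        elif op == "ADD":
            a, b = stack.pop(), stack.pop()      # take the two top items
            stack.append((a + b) & WORD_MASK)    # put the result back
        elif op == "MUL":
            a, b = stack.pop(), stack.pop()
            stack.append((a * b) & WORD_MASK)
        elif op == "POP":
            stack.pop()                          # discard data that is no longer needed
        else:
            raise ValueError(f"unknown op: {op}")
    return stack


# (5 + 3) * 2
result = run([("PUSH", 5), ("PUSH", 3), ("ADD", None), ("PUSH", 2), ("MUL", None)])
print(result)  # [16]

# Values wrap around at 2**256, just like a fixed-size word would
wrapped = run([("PUSH", WORD_MASK), ("PUSH", 1), ("ADD", None)])
print(wrapped)  # [0]
```

Notice that every instruction changes the stack, so anyone who wants to prove the program ran correctly has to account for the stack's contents at each step.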
“ELI5, please!!”
An Ethereum Virtual Machine (EVM) is a special computer that is used to run programs called smart contracts on the Ethereum blockchain.
Imagine that you have a stack of cupcakes. You can think of each cupcake as a piece of data.
The EVM is like a robot that can take these cupcakes, put them on the stack, and do different things with them. For example, the EVM can take two cupcakes from the top of the stack, add them together, and put the result back on the stack.
The EVM follows a rule called "last in, first out" (LIFO). This means that the last cupcake you put on the stack will be the first one the EVM takes off. So if you put a cupcake with the number 5 on the stack, and then a cupcake with the number 3 on top of it, the EVM will take off the 3 first and then the 5.
The EVM can do different things with the cupcakes on the stack, but only from a fixed list of specific operations. For example, it can add, subtract, multiply, and divide the numbers on the cupcakes. It can also compare the numbers and do different things depending on whether they are the same or different.
The EVM is very good at following instructions and doing simple math, which makes it perfect for running smart contracts on the Ethereum blockchain.
ZK projects using a Register-based Model over Stack Architecture
To mitigate the stack architecture issue, some zkVMs, such as zkSync's zkEVM and StarkWare's StarkNet, use a register-based model instead.
There are a few key differences between register-based and stack-based architectures that can influence the choice of one over the other:
1. Speed: Register-based architectures generally have faster access to data because registers are small, high-speed memory locations. In contrast, stack-based architectures require more instructions to access data on the stack, which can be slower.
2. Complexity: Register-based architectures tend to be more complex than stack-based architectures because they require more registers to store data and more instructions to manipulate the data in the registers.
3. Code size: Stack-based architectures can often generate smaller code sizes than register-based architectures because they require fewer instructions to manipulate data on the stack.
4. Ease of use: Stack-based architectures are generally easier to use because they have a simple instruction set and do not require the programmer to manage the use of registers.
In general, register-based architectures are more suitable for high-performance computing applications that require fast access to data, while stack-based architectures are more suitable for simplicity and ease of use. The choice of one over the other depends on the specific requirements of the application.
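To make the comparison concrete, here is a rough sketch that computes (5 + 3) * 2 on a toy stack machine and on a toy register machine. Both interpreters, the instruction formats, and the register names `r0` to `r3` are invented for this illustration; the register version also assumes the inputs are already loaded into registers, which flatters its instruction count.

```python
def run_stack(program):
    """Toy stack machine: operands are implicit (always the top of the stack)."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "PUSH":
            stack.append(instr[1])
        elif op == "ADD":
            stack.append(stack.pop() + stack.pop())
        elif op == "MUL":
            stack.append(stack.pop() * stack.pop())
        else:
            raise ValueError(f"unknown op: {op}")
    return stack.pop()


def run_register(program, registers):
    """Toy register machine: every instruction names its destination and sources."""
    regs = dict(registers)
    for op, dst, src1, src2 in program:
        if op == "ADD":
            regs[dst] = regs[src1] + regs[src2]
        elif op == "MUL":
            regs[dst] = regs[src1] * regs[src2]
        else:
            raise ValueError(f"unknown op: {op}")
    return regs


stack_program = [("PUSH", 5), ("PUSH", 3), ("ADD",), ("PUSH", 2), ("MUL",)]
register_program = [("ADD", "r3", "r0", "r1"), ("MUL", "r3", "r3", "r2")]

stack_result = run_stack(stack_program)
register_result = run_register(register_program, {"r0": 5, "r1": 3, "r2": 2})["r3"]

print(stack_result, register_result)               # 16 16
print(len(stack_program), len(register_program))   # 5 2
```

The stack program needs more instructions, but each one is tiny because it never names its operands. The register program needs fewer instructions, but each one carries three register names. That is the trade-off between points 1 and 3 in the list above.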
Opcode Issues
The EVM also uses special opcodes for certain operations, such as error handling and halting execution, which adds complexity to the proving process.
An opcode is a special instruction that tells the Ethereum Virtual Machine (EVM) to do something. The EVM uses opcodes to perform different operations, such as adding two numbers together or comparing two numbers to see if they are the same.
Sometimes, the EVM needs to use special opcodes to do things that are a little more complicated. For example, it might use a special opcode to handle an error, like if something goes wrong while it is running a smart contract. Or it might use a special opcode to stop a smart contract from running if it takes too long or if it is using too many resources.
Using these special opcodes can make it a little harder to prove that the EVM is working correctly. This is because it is more difficult to understand exactly what the EVM is doing when it is using these special opcodes.
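A small sketch can show why these special cases add work. The opcode byte values below (STOP, ADD, PUSH1, REVERT) match the EVM's, but everything else is a simplification made for this example: the `execute` function is hypothetical, the gas prices are illustrative, and the real REVERT also reads return data from memory, which is skipped here.

```python
STOP, ADD, PUSH1, REVERT = 0x00, 0x01, 0x60, 0xFD

# Illustrative gas prices for this sketch only
GAS_COST = {STOP: 0, ADD: 3, PUSH1: 3, REVERT: 0}


def execute(code: bytes, gas: int):
    """Run toy bytecode and return (outcome, stack)."""
    stack, pc = [], 0
    while pc < len(code):
        op = code[pc]
        if op not in GAS_COST:
            return "invalid opcode", stack      # error path 1
        if gas < GAS_COST[op]:
            return "out of gas", stack          # error path 2: too many resources used
        gas -= GAS_COST[op]
        if op == STOP:
            return "stopped", stack
        if op == REVERT:
            return "reverted", stack            # error path 3: the contract gave up
        if op == PUSH1:
            # the byte after PUSH1 is the value to push (0 if the code ends early)
            stack.append(code[pc + 1] if pc + 1 < len(code) else 0)
            pc += 2
            continue
        if op == ADD:
            if len(stack) < 2:
                return "stack underflow", stack  # error path 4
            stack.append((stack.pop() + stack.pop()) % 2**256)
        pc += 1
    return "stopped", stack


program = bytes([PUSH1, 5, PUSH1, 3, ADD, STOP])
print(execute(program, gas=100))  # ('stopped', [8])
print(execute(program, gas=5))    # ('out of gas', [5])
print(execute(bytes([PUSH1, 5, REVERT]), gas=100))  # ('reverted', [5])
```

Even this tiny machine has four different ways to end badly. A proof has to show not only that the happy path was followed correctly, but also that each of these exits was taken only when it should have been.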
Keccak Hashing Functions and the Merkle Patricia Trie Disadvantage
The EVM's storage layout, which relies on Keccak hashing functions and a Merkle Patricia Trie, also poses challenges for zk proof construction due to the high proving overhead.
Some zkVMs, such as zkSync, have replaced the KECCAK256 function to reduce this overhead.
The Ethereum Virtual Machine (EVM) uses a special way to store data called a Merkle Patricia Trie. This is a kind of tree structure that is used to organize data in a way that is fast to search and easy to update.
The EVM also uses something called Keccak hashing functions to help keep track of the data in the Merkle Patricia Trie. A hashing function is a way to turn data of any size into a short, fixed-size string of characters called a hash. The EVM uses Keccak hashing functions to turn data in the Merkle Patricia Trie into hashes.
Sometimes, it can be a little tricky to prove that the EVM is working correctly when it is using the Merkle Patricia Trie and Keccak hashing functions. This is because it can take a lot of work to check all of the hashes and make sure that they are correct.
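To get a feel for how quickly the hashes pile up, here is a simplified sketch. It makes two loud assumptions: it builds a plain binary Merkle tree rather than Ethereum's Merkle Patricia Trie, and it uses Python's `hashlib.sha3_256` as a stand-in, which is the standardized SHA3-256 and not the Keccak-256 variant Ethereum uses (the padding differs). The `merkle_root` helper and the account names are made up for the example.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Stand-in hash: SHA3-256, NOT Ethereum's Keccak-256
    return hashlib.sha3_256(data).digest()


def merkle_root(leaves):
    """Return (root, number_of_hash_calls) for a simple binary Merkle tree."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [h(leaf) for leaf in leaves]
    calls = len(level)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        # hash each pair of neighbours to build the next level up
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        calls += len(level)
    return level[0], calls


leaves = [f"account-{i}".encode() for i in range(8)]
root, calls = merkle_root(leaves)
print(len(root), calls)  # 32 15

# Changing a single leaf changes the root
tampered_root, _ = merkle_root([b"account-tampered"] + leaves[1:])
print(root != tampered_root)  # True
```

Eight pieces of data already need 15 hash calls. On a normal computer that is instant, but inside a zero-knowledge proof every one of those calls has to be proven step by step, which is where the overhead comes from.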
ZKP Generation & Hardware
Generating zero-knowledge proofs is a resource-intensive process that requires specialized hardware and significant investment in time, money, and effort.
Conclusion
Despite these challenges, recent advances in zero-knowledge technology have made it possible to mitigate some of these issues and have led to renewed interest in developing zkEVM solutions.