Solidity Storage Variables with Ethers.js | by insurgent | Jul, 2022

Use the Ethers.js library to access strings, dynamic arrays, mappings, structs, and byte-packed variables


*Scroll to the second half of the article for Ethers.js code examples, or visit this GitHub repository to view the full code and test files.

Data on the Ethereum Virtual Machine (EVM) is organized using a Modified Merkle Patricia Trie data structure. Each block header references the roots of three tries: the [global] state trie, the transaction trie, and the receipt trie. The state trie maps every address to its account data (nonce, ETH balance, storage root, and code hash), for both EOAs (externally owned accounts) and smart contracts. Smart contract data is stored in a fourth kind of trie, the storage trie: each contract has its own, and its root hash is recorded in the contract's account entry in the state trie.

Smart contract data in the storage trie represents the persistent state of contracts and can only be changed by transactions that update the global state. Within a Solidity smart contract, state variables are kept in storage, which is persistent. Variables held in memory are temporary and are erased between external function calls. Constant variables, which cannot be modified, do not occupy storage slots and are therefore cheaper to read.

Smart contracts in the Ethereum Virtual Machine (EVM) each have their own permanent storage space that contains 32-byte slots in a mapping of key-value pairs (key and value are both 32-bytes).

Fixed-Size Variables of 32-bytes

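Variables that fill a whole 32-byte slot, such as uint256 and int256, are assigned an individual storage slot in the order they are declared in the smart contract. A string is technically dynamically sized, but it is also assigned a single slot in declaration order; when the string is 31 bytes or shorter, its data and its length are stored together in that slot. In the StorageLayoutOne contract, the constant variable hello does not have a storage slot because it cannot be modified and its value is compiled into the contract code. Variables numOne, goodbye, and num use storage slots 0x0, 0x1, and 0x2, respectively.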

contract StorageLayoutOne {
    string constant hello = "hello world"; // no storage
    uint256 numOne = 1; // slot 0x0
    string goodbye = "goodbye world"; // slot 0x1
    int256 num; // slot 0x2
}

Fixed-Size Variables < 32-bytes

Fixed-size variables that are less than 32 bytes will be byte-packed into a single storage slot when possible. In the StorageLayoutTwo contract, variables lock, byteX, bytesY, and bytesZ are all packed into slot 0x0 (1 + 1 + 4 + 16 = 22 bytes). The next variable, bytesA, is stored in slot 0x1 because its 28 bytes cannot fit in the 10 bytes remaining in slot 0x0. Finally, variables bytesB and bytesC are packed into slot 0x2 (16 + 16 = 32 bytes).

contract StorageLayoutTwo {
    bool lock; // slot 0x0
    bytes1 byteX; // slot 0x0 (the `byte` alias for bytes1 was removed in Solidity 0.8.0)
    bytes4 bytesY; // slot 0x0
    bytes16 bytesZ; // slot 0x0
    bytes28 bytesA; // slot 0x1
    bytes16 bytesB; // slot 0x2
    bytes16 bytesC; // slot 0x2
}

Note that the EVM operates on 32-byte words, so using variables that are less than 32 bytes may result in higher gas costs due to additional masking and conversion operations. However, byte-packing can offset this by allowing the Solidity compiler to combine multiple reads or writes of variables within the same storage slot into a single operation. So, it is important to group less-than-32-byte variables together in the most efficient manner to reduce overall gas costs.
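The packing rule can be checked with a small simulation. The sketch below is plain JavaScript and is not part of the original repository; assignSlots is a hypothetical helper that only models consecutive value types of 1 to 32 bytes (structs and arrays, which always start a new slot, are not modeled). Offsets are counted from the low-order end of the slot, which is where Solidity places the first item.

```javascript
// Hypothetical helper: assigns slots to consecutive value-type variables.
// A variable moves to the next slot when it does not fit in the bytes
// remaining in the current one.
function assignSlots(variables) {
  const layout = [];
  let slot = 0;
  let used = 0; // bytes already occupied in the current slot

  for (const { name, size } of variables) {
    if (size < 1 || size > 32) {
      throw new RangeError(`${name}: size must be between 1 and 32 bytes`);
    }
    if (used + size > 32) {
      slot += 1;
      used = 0;
    }
    layout.push({ name, slot, offset: used, size });
    used += size;
  }
  return layout;
}

// The variables of StorageLayoutTwo with their sizes in bytes.
const storageLayoutTwo = assignSlots([
  { name: "lock", size: 1 },
  { name: "byteX", size: 1 },
  { name: "bytesY", size: 4 },
  { name: "bytesZ", size: 16 },
  { name: "bytesA", size: 28 },
  { name: "bytesB", size: 16 },
  { name: "bytesC", size: 16 },
]);

for (const v of storageLayoutTwo) {
  console.log(`${v.name}: slot ${v.slot}, byte offset ${v.offset}`);
}
```

Reordering the input list is a quick way to see how declaration order changes the number of slots used.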

Dynamically-Sized Variables

Dynamically sized variables, such as dynamic arrays and mappings, which may exceed 32 bytes, are placed at collision-resistant storage locations computed with the keccak-256 hash algorithm, which pseudorandomly selects a position within a range of 2²⁵⁶ storage slots. In case you are wondering, 2²⁵⁶ =

115792089237316195423570985008687907853269984665640564039457584007913129639936

Due to this expansive storage space (more available slots than stars in the known universe), the EVM can assign storage locations without reserving any space in advance, since each hashed location is effectively light-years away from any other. The EVM does not keep track of unassigned slots, and querying one simply returns zero.

A dynamic array stores its length in its declared slot, and its items begin at a storage location determined by the keccak-256 hash of that slot number. Subsequent array items are located adjacent to the previous item. When the array items are 16 bytes or less, byte-packing rules apply.

Mappings, in contrast to arrays, disperse their data: the location of each value is found by concatenating the mapping key with the mapping's storage slot (each padded to 32 bytes) and then hashing the result to a unique slot.

In the StorageLayoutThree contract, the slot of arrayOfNums, 0x0 (padded out to 32 bytes), is hashed to find the storage location of the first item, arrayOfNums[0]. For userBalances, the address of a user is concatenated with slot 0x1 and hashed to provide the location of the value.

contract StorageLayoutThree {
    uint[] public arrayOfNums; // slot 0x0 => keccak256(0x0)
    mapping(address => uint256) public userBalances; // slot 0x1 => keccak256(key + 0x1)
}

Furthermore, mappings are often nested inside other mappings and may contain structs. Likewise, arrays can be nested inside other arrays, and it’s also possible to nest mappings inside arrays. When dealing with nested data structures, the data locations can be found with nested keccak-256 hashes. Examples of this are provided in the coding tutorial lower in this article.

Non-Storage Variables

Constants, enum definitions, struct definitions, events, and user-defined errors do not use storage space.

The Ethers.js library provides many useful tools to interact with smart contracts on the Ethereum blockchain, including utilities to directly access storage variables, which will be explained with code examples below. The full code can be found at this GitHub repository, complete with a smart contract example and unit test files.

To install Ethers.js, type the following command into the terminal of your project’s root directory:

npm install ethers

Define Reusable Ethers.js Constants

Here are a few constant definitions to add to the top of our JavaScript file(s) that will help us write cleaner code:

Strings

A 32-byte storage slot can hold a string of up to 31 bytes (31 ASCII characters), because the last byte of the slot stores the string's length. If the string you are accessing is longer than 31 bytes, its slot holds only the length, and the data must be read from multiple contiguous slots starting at the keccak-256 hash of the slot number. For strings of 31 bytes or less, use the getShortStr function, and for longer strings, use the getLongStr function.
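To make the two cases concrete, here is a minimal sketch of the decoding step only, written in plain Node.js. It is not the getShortStr or getLongStr code from the repository; decodeStringSlot and joinLongString are hypothetical helpers that operate on slot values already fetched from the chain. The encoding assumed is the one Solidity documents for strings: a short string is left-aligned with length × 2 in the lowest byte, while a long string's slot holds length × 2 + 1.

```javascript
// Decodes the raw 32-byte value of a slot that holds a Solidity string.
// Returns the text for short strings, or only the byte length for long ones.
function decodeStringSlot(slotHex) {
  const hex = slotHex.replace(/^0x/, "").padStart(64, "0");
  const value = BigInt("0x" + hex);

  if (value & 1n) {
    // Long string: the slot holds length * 2 + 1 and the data starts at
    // keccak256(slot number), continuing in the slots that follow.
    return { kind: "long", byteLength: Number((value - 1n) / 2n) };
  }
  // Short string: the lowest byte holds length * 2, data is left-aligned.
  const byteLength = Number(value & 0xffn) / 2;
  const text = Buffer.from(hex.slice(0, byteLength * 2), "hex").toString("utf8");
  return { kind: "short", byteLength, text };
}

// Joins the data slots of a long string and trims the trailing zero padding.
function joinLongString(dataSlotsHex, byteLength) {
  const hex = dataSlotsHex
    .map((slot) => slot.replace(/^0x/, "").padStart(64, "0"))
    .join("");
  return Buffer.from(hex.slice(0, byteLength * 2), "hex").toString("utf8");
}

// Slot 0x1 of StorageLayoutOne: "goodbye world" (13 bytes), then length * 2 = 0x1a.
const goodbyeSlot = "0x676f6f6462796520776f726c64" + "0".repeat(36) + "1a";
console.log(decodeStringSlot(goodbyeSlot).text); // goodbye world
```

A long string of byteLength bytes occupies Math.ceil(byteLength / 32) data slots, so that is the number of reads needed before calling joinLongString.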

Numbers

256-bit numbers occupy the full storage slot, so they do not require any bit shifting, and should be used as the default integer type unless byte-packing is advantageous.

The getUint256 function will be used in the mappings functions below.
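As an illustration of reading a full-slot number, the following sketch fetches one slot and converts it to a BigInt. It is not the repository's getUint256; readUint256 and toSlotHex are hypothetical names, and the provider is passed in as an argument. The only assumption about the provider is that it exposes getStorage (Ethers.js v6) or getStorageAt (Ethers.js v5), both of which resolve to a hex string.

```javascript
// Converts a slot index (number, bigint, or hex string) to a 32-byte hex key.
function toSlotHex(slot) {
  return "0x" + BigInt(slot).toString(16).padStart(64, "0");
}

// Reads one storage slot and interprets it as an unsigned 256-bit integer.
// `provider` is assumed to be an Ethers.js provider (v5 or v6).
async function readUint256(provider, contractAddress, slot) {
  const key = toSlotHex(slot);
  const raw =
    typeof provider.getStorage === "function"
      ? await provider.getStorage(contractAddress, key) // Ethers.js v6
      : await provider.getStorageAt(contractAddress, key); // Ethers.js v5
  return BigInt(raw);
}
```

For a signed variable such as num in StorageLayoutOne, BigInt.asIntN(256, value) reinterprets the result as a two's-complement int256.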

Maps

Unlike strings and numbers, which only require storage slot and contract address arguments, mappings require an additional key argument. The key plus the storage slot are hashed to find the location of the value that corresponds to the key.

For example, a mapping of [EOA] addresses to [uint256] balances would take the arguments: storage slot, contract address, and EOA address. Both the key and the slot are padded to 32 bytes before they are concatenated. The first two characters of the slot are sliced off to remove the “0x”, since the key already contains the “0x” that indicates a hexadecimal number.
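The concatenation step can be sketched as follows. This is an illustration rather than the repository's code: pad32, mappingPreimage, and mappingValueSlot are hypothetical helpers, and the hash function is passed in so the sketch does not depend on a particular Ethers.js version (ethers.utils.keccak256 in v5, ethers.keccak256 in v6). It covers value-type keys such as addresses and integers only; string and bytes keys are hashed without padding and are not handled here.

```javascript
// Left-pads a hex value (with or without "0x") to 32 bytes, without a prefix.
function pad32(hexValue) {
  const digits = hexValue.replace(/^0x/, "");
  if (digits.length > 64) throw new RangeError("value exceeds 32 bytes");
  return digits.padStart(64, "0");
}

// Builds the 64-byte preimage: pad32(key) followed by pad32(slot).
function mappingPreimage(key, slot) {
  return "0x" + pad32(key) + pad32(BigInt(slot).toString(16));
}

// Returns the slot that holds mapping[key]. `keccak256` is assumed to accept
// and return 0x-prefixed hex strings, as the Ethers.js keccak256 utility does.
function mappingValueSlot(key, slot, keccak256) {
  return keccak256(mappingPreimage(key, slot));
}

// Preimage for userBalances[user] in StorageLayoutThree (mapping at slot 0x1).
// The address below is a placeholder, not a real account.
const user = "0x00000000000000000000000000000000000000ff";
console.log(mappingPreimage(user, 1));
```

The resulting slot can then be read like any other 32-byte value.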

Mappings to Structs

Initialized structs store data similarly to how an array stores data: contiguously, and byte-packed when applicable. So, after finding the hash of the slot + key, the type must be determined, since structs often hold various types of data. For this function, I limited type to string, bytes, or number. Finally, we need to select an attribute in the struct, which is found like an item in an array. The item number corresponds to the order of the struct attributes.

Note that this function does not handle byte-packed structs; it assumes each struct attribute occupies a full storage slot.
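Under that same full-slot assumption, locating an attribute is just an addition on top of the hashed base slot. The helper below, structFieldSlot, is a hypothetical name used for illustration; it takes the base slot (the hash of key + slot) as a hex string and returns the slot of the attribute at the given position.

```javascript
// Returns the slot of the struct attribute at `fieldIndex`, assuming every
// attribute occupies a full 32-byte slot (no byte-packing).
function structFieldSlot(baseSlotHex, fieldIndex) {
  const slot = (BigInt(baseSlotHex) + BigInt(fieldIndex)) % 2n ** 256n;
  return "0x" + slot.toString(16).padStart(64, "0");
}

// Placeholder base slot, standing in for keccak256(key + slot).
const baseSlot = "0x" + "ab".repeat(32);
console.log(structFieldSlot(baseSlot, 2)); // third attribute of the struct
```

BigInt is used because a hashed slot number is far larger than a JavaScript Number can represent exactly.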

Mappings to Nested Mappings in Structs

This last mapping function deals with a mapping inside a struct inside a mapping. Comparing the arguments to the previous example, “type” has been replaced with “nestedKey.” We don’t need a type, since a mapping maps one type to another type, meaning the final value will be of one type. In this case, the final value is assumed to be a uint256.

The new argument, nestedKey, refers to the key of the mapping within the struct that is within another mapping.
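The two hashing steps can be sketched like this. It is an illustration under stated assumptions, not the repository's function: nestedMappingSlot is a hypothetical name, both keys are assumed to be value types (addresses or integers), fieldIndex is the position of the inner mapping among full-slot struct attributes, and the hash function is injected (ethers.utils.keccak256 in v5, ethers.keccak256 in v6).

```javascript
// Left-pads a hex value (with or without "0x") to 32 bytes, without a prefix.
function pad32(hexValue) {
  return hexValue.replace(/^0x/, "").padStart(64, "0");
}

// Slot of outer[key].inner[nestedKey], where `inner` is the attribute at
// position `fieldIndex` of the struct and `slot` is the outer mapping's slot.
function nestedMappingSlot({ slot, key, fieldIndex, nestedKey }, keccak256) {
  // Step 1: base slot of the struct stored at outer[key].
  const structBase = keccak256(
    "0x" + pad32(key) + pad32(BigInt(slot).toString(16))
  );
  // Step 2: slot of the inner mapping inside the struct.
  const innerSlot = (BigInt(structBase) + BigInt(fieldIndex)).toString(16);
  // Step 3: slot of the value stored at inner[nestedKey].
  return keccak256("0x" + pad32(nestedKey) + pad32(innerSlot));
}
```

The value at the returned slot can then be read as a uint256.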

Byte-Packed Slots

Byte-packing occurs when contiguous variables that are less than 32 bytes are packed into a single slot. It is good practice to define all less-than-32-byte variables together at the top of the smart contract in the most efficient byte-packing manner to maximize storage space and save on gas.

It’s easy to byte-pack when writing a Solidity smart contract as long as you can do basic math. However, accessing byte-packed variables is a bit trickier, since it requires bit-shifting through the storage slot to find a specific variable.

JavaScript's Number type can only represent integers exactly up to 2⁵³ − 1 (a little under 7 bytes), which is why this function is recursively called for byte sizes larger than 6 bytes and returns a concatenated result of the full variable.

The getBytePackedVar function will be used in the dynamic array function below.
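For comparison, here is a minimal sketch of the same bit-shifting idea using BigInt, which handles the full 256-bit word without recursion. It is an alternative illustration, not the repository's getBytePackedVar; unpackVar is a hypothetical helper, and offsets are counted in bytes from the low-order (right) end of the slot, where Solidity places the first packed variable.

```javascript
// Extracts `size` bytes that start `offset` bytes from the low-order end
// of a 32-byte slot value, returned as a hex string of exactly `size` bytes.
function unpackVar(slotHex, offset, size) {
  if (offset < 0 || size < 1 || offset + size > 32) {
    throw new RangeError("offset and size must stay within the 32-byte slot");
  }
  const word = BigInt(slotHex);
  const mask = (1n << BigInt(size * 8)) - 1n;
  const value = (word >> BigInt(offset * 8)) & mask;
  return "0x" + value.toString(16).padStart(size * 2, "0");
}

// Example value for slot 0x0 of StorageLayoutTwo, using placeholder contents:
// 10 unused bytes, bytesZ (16), bytesY (4), byteX (1), lock (1).
const packedSlot =
  "0x" + "00".repeat(10) + "11".repeat(16) + "deadbeef" + "ab" + "01";

console.log(unpackVar(packedSlot, 0, 1)); // lock   -> 0x01
console.log(unpackVar(packedSlot, 1, 1)); // byteX  -> 0xab
console.log(unpackVar(packedSlot, 2, 4)); // bytesY -> 0xdeadbeef
```

The offsets match the running byte total of the variables declared before each one in the same slot.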

Dynamic Arrays

The slot of a dynamic array is hashed to find the storage location where all of its items are located contiguously. Once the head of the array is located, subsequent items can be found by slot or bit shifting. If items are 16 bytes or less, the storage slots will be byte-packed.
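The slot and byte offset of an item can be worked out with a little arithmetic once the head is known. In the sketch below, arrayElementLocation is a hypothetical helper, elementSize is the item size in bytes (1 to 32), and the head is passed in as a hex string. The head used in the example is the keccak-256 hash of slot 0x0 padded to 32 bytes, which is where arrayOfNums in StorageLayoutThree begins.

```javascript
// Returns the slot and the byte offset (from the low-order end of the slot)
// of item `index` in a dynamic array whose items start at `headSlotHex`.
function arrayElementLocation(headSlotHex, index, elementSize) {
  if (elementSize < 1 || elementSize > 32) {
    throw new RangeError("elementSize must be between 1 and 32 bytes");
  }
  const perSlot = Math.floor(32 / elementSize); // items packed into one slot
  const slot = BigInt(headSlotHex) + BigInt(Math.floor(index / perSlot));
  return {
    slot: "0x" + slot.toString(16).padStart(64, "0"),
    byteOffset: (index % perSlot) * elementSize,
  };
}

// keccak256 of slot 0x0 (32 zero bytes): the head of arrayOfNums.
const head =
  "0x290decd9548b62a8d60345a988386fc84ba6bc95484008f6362f93160ef3e563";

console.log(arrayElementLocation(head, 5, 32)); // uint256 items: head + 5
console.log(arrayElementLocation(head, 5, 16)); // 16-byte items: head + 2, offset 16
```

Items larger than 16 bytes get one slot each, so the byte offset is always zero for them.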

Understanding Solidity smart contract storage is important for writing gas-efficient, secure, and data-optimized code. Ethers.js provides many useful methods that can be used to access storage variables within a smart contract’s persistent state. Using or modifying the code examples provided above on your own smart contracts will help you better understand the storage level of the EVM.
