50% of the time, it’s mixed-case all the time | Understanding Ethereum Checksums
Have you ever wondered why your Ethereum address is shown in lowercase in some wallets, in uppercase in others, and sometimes as a combination of both?
Now, to make matters more complicated, take this example: try to send ETH on a testnet via MetaMask to this address:
You should be able to send to this wallet without any problem. However, let’s change it up and try to send funds to this address:
See the difference? It’s the same address, but with the last character capitalized. No matter, it should be just another wallet address, right? Aren’t all 40-character hexadecimal strings valid Ethereum addresses? We should be able to transfer to it without any trouble.
At this point, MetaMask throws an error: the input address is invalid. Why is that?
Let’s explore the principle behind checksum addresses. Generally speaking, a checksum is a method for verifying the integrity of a piece of data after it has been transmitted or copied. In our case, we want to ensure that the address we have copied and pasted (or scanned from a QR code) is not malformed in any way.
Take the previous example again. If we modify the address
0xc1912fEE45d61C87Cc5EA59DaE31190FFFFf232d even slightly, the checksum validation in a wallet will almost certainly reject it. This is very helpful in protecting users from human error; the more input validation, the better. Imagine someone sending all their ETH to the wrong address because of a typo!
How does it work?
Let’s take a deeper dive into the Ethereum-specific checksum. Simply put, the checksum is not stored as extra characters; it is encoded in the capitalization of the address itself. We hash the address with a hash function (in our case, Keccak-256) and compare each hexadecimal character of the address with the hash character at the same position. In layman’s terms:
For each hexadecimal character in the address, if the hash character at the same index is greater than 7 (read as a base-16 digit), make the address character uppercase. Digits 0-9 have no uppercase form, so only the letters a-f are affected.
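To see the rule in isolation, here is a minimal sketch in Go that applies it to a hash supplied by the caller. The function name applyCase and the short hash value are made up for illustration; the hash is not a real Keccak-256 digest.

```go
package main

import (
	"fmt"
	"strconv"
)

// applyCase uppercases each letter of addr whose matching hash digit is > 7.
// Both inputs are lowercase hex strings without a "0x" prefix, and hashHex
// must be at least as long as addr.
func applyCase(addr, hashHex string) string {
	out := []byte(addr)
	for i := range out {
		digit, err := strconv.ParseUint(hashHex[i:i+1], 16, 8)
		if err != nil {
			continue // not a hex digit: leave the character untouched
		}
		if digit > 7 && out[i] >= 'a' && out[i] <= 'f' {
			out[i] -= 'a' - 'A'
		}
	}
	return string(out)
}

func main() {
	// "80f3c079" is a made-up stand-in, not a real Keccak-256 digest.
	fmt.Println(applyCase("abcdef12", "80f3c079")) // prints "AbCdEf12"
}
```

Note that the final "2" stays as it is even though its hash digit is 9: there is nothing to capitalize.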
Well, that sounds confusing, so let’s build a checksum encoder from scratch to get a better understanding ourselves! The implementation will be in Go, using only the standard library and no additional packages from go-ethereum, so that we get better clarity without unnecessary abstractions.
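What follows is a minimal sketch of such an encoder, not a definitive implementation. As far as we know, Go’s standard library does not ship the original Keccak-256 padding that Ethereum uses, so the sketch carries its own compact Keccak-f[1600] permutation. All names here (permute, keccak256, toChecksumAddress) are our own choices, and the sketch assumes the hash input is the ASCII text of the lowercase address, which is how EIP-55 defines it.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"fmt"
	"math/bits"
	"strings"
)

// Keccak-f[1600] tables: round constants, rho rotations, pi lane order.
var roundConst = [24]uint64{
	0x0000000000000001, 0x0000000000008082, 0x800000000000808a, 0x8000000080008000,
	0x000000000000808b, 0x0000000080000001, 0x8000000080008081, 0x8000000000008009,
	0x000000000000008a, 0x0000000000000088, 0x0000000080008009, 0x000000008000000a,
	0x000000008000808b, 0x800000000000008b, 0x8000000000008089, 0x8000000000008003,
	0x8000000000008002, 0x8000000000000080, 0x000000000000800a, 0x800000008000000a,
	0x8000000080008081, 0x8000000000008080, 0x0000000080000001, 0x8000000080008008,
}
var rotation = [24]int{1, 3, 6, 10, 15, 21, 28, 36, 45, 55, 2, 14, 27, 41, 56, 8, 25, 43, 62, 18, 39, 61, 20, 44}
var piLane = [24]int{10, 7, 11, 17, 18, 3, 5, 16, 8, 21, 24, 4, 15, 23, 19, 13, 12, 2, 20, 14, 22, 9, 6, 1}

// permute applies the 24-round Keccak-f[1600] permutation to the state.
func permute(st *[25]uint64) {
	var bc [5]uint64
	for round := 0; round < 24; round++ {
		for i := 0; i < 5; i++ { // theta: column parities
			bc[i] = st[i] ^ st[i+5] ^ st[i+10] ^ st[i+15] ^ st[i+20]
		}
		for i := 0; i < 5; i++ {
			t := bc[(i+4)%5] ^ bits.RotateLeft64(bc[(i+1)%5], 1)
			for j := 0; j < 25; j += 5 {
				st[j+i] ^= t
			}
		}
		t := st[1] // rho and pi: rotate lanes while moving them
		for i := 0; i < 24; i++ {
			j := piLane[i]
			st[j], t = bits.RotateLeft64(t, rotation[i]), st[j]
		}
		for j := 0; j < 25; j += 5 { // chi: non-linear row mixing
			copy(bc[:], st[j:j+5])
			for i := 0; i < 5; i++ {
				st[j+i] ^= ^bc[(i+1)%5] & bc[(i+2)%5]
			}
		}
		st[0] ^= roundConst[round] // iota
	}
}

// keccak256 hashes data with the original Keccak padding (0x01 ... 0x80).
func keccak256(data []byte) [32]byte {
	const rate = 136 // bytes absorbed per permutation for a 256-bit digest
	padded := make([]byte, (len(data)/rate+1)*rate)
	copy(padded, data)
	padded[len(data)] ^= 0x01
	padded[len(padded)-1] ^= 0x80
	var st [25]uint64
	for off := 0; off < len(padded); off += rate {
		for i := 0; i < rate/8; i++ {
			st[i] ^= binary.LittleEndian.Uint64(padded[off+8*i:])
		}
		permute(&st)
	}
	var out [32]byte
	for i := 0; i < 4; i++ {
		binary.LittleEndian.PutUint64(out[8*i:], st[i])
	}
	return out
}

// toChecksumAddress returns the mixed-case form of a 40-character hex address.
func toChecksumAddress(addr string) (string, error) {
	lower := strings.ToLower(strings.TrimPrefix(addr, "0x"))
	if _, err := hex.DecodeString(lower); err != nil || len(lower) != 40 {
		return "", fmt.Errorf("not a 40-character hex address: %q", addr)
	}
	hash := keccak256([]byte(lower)) // hash the ASCII hex text, not decoded bytes
	out := []byte(lower)
	for i, c := range out {
		nibble := hash[i/2] >> 4 // even index: high half of the hash byte
		if i%2 == 1 {
			nibble = hash[i/2] & 0x0f // odd index: low half
		}
		if c >= 'a' && nibble > 7 {
			out[i] = c - 'a' + 'A'
		}
	}
	return "0x" + string(out), nil
}

func main() {
	fmt.Println(toChecksumAddress("0xc1912fee45d61c87cc5ea59dae31190fffff232d"))
}
```

The input in main is the all-lowercase form of the address used earlier, so the output should reproduce its mixed-case spelling. In a real project, a maintained Keccak implementation is a safer choice than a hand-rolled one.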
Let’s see what’s happening here:
- first we sanitize the address by removing the ‘0x’ prefix and making the entire address lowercase
- we take the bytes of that lowercase hexadecimal string (its ASCII characters, not the decoded 20-byte address; EIP-55 hashes the text form)
- we apply the Keccak-256 hash to those bytes, which gives us a 256-bit hash, or 64 hexadecimal characters
- now we iterate through each index ‘i’ of the address,
- we check the hash character at the same index ‘i’: if the integer value of that hexadecimal digit is over 7, we write the address character in uppercase to the new address; otherwise we keep it as it is
- join all the results and return them with a ‘0x’ prefix
This is all quite simple, but what sort of benefits do we get?
Given that the hash is deterministic, and that changing any character of the address changes the hash (and with it the expected capitalization) unpredictably, we can validate an address with a function like
isChecksum(): re-encode the input and check that the result matches it exactly. A wallet that handles input validation, such as MetaMask, runs a similar check on its input box to ensure that a checksummed address does not contain a typo. Consider the test cases below as an example:
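As a rough sketch of that check, the Go snippet below re-encodes the input and compares it with what the user typed. To keep the block short and self-contained, the encoder is passed in as a parameter, and knownEncoder is a hypothetical stand-in that only knows the one address used in this article; a real validator would plug in a Keccak-256 based encoder instead.

```go
package main

import (
	"fmt"
	"strings"
)

// encoder is any function that returns the checksummed form of an address.
type encoder func(addr string) (string, error)

// isChecksum reports whether addr is already in its checksummed form,
// judged by re-encoding it and comparing the result character for character.
func isChecksum(addr string, encode encoder) bool {
	want, err := encode(addr)
	return err == nil && want == addr
}

// knownEncoder is a stand-in that only knows one address. A real encoder
// would compute the Keccak-256 based casing instead of looking it up.
func knownEncoder(addr string) (string, error) {
	const checksummed = "0xc1912fEE45d61C87Cc5EA59DaE31190FFFFf232d"
	if !strings.EqualFold(addr, checksummed) {
		return "", fmt.Errorf("unknown address %q", addr)
	}
	return checksummed, nil
}

func main() {
	cases := []string{
		"0xc1912fEE45d61C87Cc5EA59DaE31190FFFFf232d", // correct casing
		"0xc1912fEE45d61C87Cc5EA59DaE31190FFFFf232D", // last character flipped
		"0xc1912fee45d61c87cc5ea59dae31190fffff232d", // all lowercase
	}
	for _, c := range cases {
		fmt.Println(c, isChecksum(c, knownEncoder))
	}
}
```

Only the first case passes. Whether an all-lowercase address should be rejected or accepted as "carries no checksum" is a policy decision for the wallet; this sketch simply reports that it is not in checksummed form.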
TIP: libraries such as web3.js have a built-in function that we can use for free: web3.utils.toChecksumAddress()
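Aside from the functional benefits, this particular type of checksum encoding is nice for compatibility’s sake. Because the checksum lives only in the letter case, any hexadecimal parser that ignores case, Ethereum-aware or not, can still read a checksummed address as an ordinary hex string.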
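A quick way to convince ourselves of this is to feed both spellings of an address to a generic hex decoder. The sketch below uses Go’s encoding/hex package, which accepts upper- and lowercase digits alike; the helper name sameAddress is our own.

```go
package main

import (
	"bytes"
	"encoding/hex"
	"fmt"
	"strings"
)

// sameAddress reports whether two hex addresses decode to the same bytes,
// ignoring letter case and an optional "0x" prefix.
func sameAddress(a, b string) (bool, error) {
	rawA, err := hex.DecodeString(strings.TrimPrefix(a, "0x"))
	if err != nil {
		return false, err
	}
	rawB, err := hex.DecodeString(strings.TrimPrefix(b, "0x"))
	if err != nil {
		return false, err
	}
	return bytes.Equal(rawA, rawB), nil
}

func main() {
	same, err := sameAddress(
		"0xc1912fEE45d61C87Cc5EA59DaE31190FFFFf232d",
		"0xc1912fee45d61c87cc5ea59dae31190fffff232d",
	)
	fmt.Println(same, err) // prints "true <nil>"
}
```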
For more details on the specification, please check out EIP-55, which documents how the checksum works.
The code examples above are provided on GitHub.
When building a client application that validates recipient addresses, always use checksums! Your users will thank you.