Building a Blockchain in Python

TK Kaushik Jegannathan 23 Jun, 2022 • 6 min read

This article was published as a part of the Data Science Blogathon.

Introduction to Blockchain

Web3 is the latest buzzword in the world of technology. It revolves around the concept of a decentralized web, primarily built using blockchain. Blockchain has been around for a while now and came into the limelight because of Bitcoin. Many people confuse the terms Bitcoin and blockchain and consider them to be the same, but Bitcoin is only one application of blockchain technology, albeit the best-known one. A blockchain is a type of decentralized database that stores data in blocks, each fingerprinted with a cryptographic hashing algorithm, so that the stored data is persistent and any tampering can be detected. Note that hashing is not encryption: a hash is a one-way fingerprint of the data, not a way to hide it. Any type of data can be stored in a blockchain, but it is mostly used to store transaction details and acts as a digital ledger. The decentralized architecture of both Web3 and blockchain ensures that the data is not owned by a single person or entity, which has been the downside of Web2.

Each block in a blockchain contains a hash value that uniquely identifies it. Fingerprinting is the concept used for linking blocks on a blockchain: whenever a new block is added to the end of the chain, the hash of the block that is currently last is included in the data used to compute the new block’s hash. Changing an earlier block therefore changes its hash and breaks the link to every block after it, which is what makes tampering detectable.
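To make the linking idea concrete, here is a minimal sketch that uses only hashlib. The helper names fingerprint and chain_hashes, the seed value, and the sample payloads are illustrative assumptions and are not part of the blockchain built later in this article.

```python
import hashlib


def fingerprint(previous_hash: str, data: str) -> str:
    """Hash a block's data together with the hash of the block before it."""
    return hashlib.sha256((previous_hash + data).encode('utf-8')).hexdigest()


def chain_hashes(payloads, seed='genesis'):
    """Return the hashes produced by chaining the payloads in order."""
    hashes = []
    previous = seed  # stands in for the previous hash of the first block
    for data in payloads:
        previous = fingerprint(previous, data)
        hashes.append(previous)
    return hashes


original = chain_hashes(['a pays b', 'b pays c', 'c pays d'])
tampered = chain_hashes(['a pays b', 'b pays X', 'c pays d'])

# The first block is untouched, so its hash is unchanged.
print(original[0] == tampered[0])  # True
# The edited block and the block after it both end up with different hashes.
print(original[1] == tampered[1], original[2] == tampered[2])  # False False
```

Because each hash feeds into the next one, a change in the middle of the chain shows up in every later hash.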

In this article, we will be building a simple blockchain in Python that will store some text information from users. The blockchain technology used in the industry is far more complex than what we will build here, but this is enough to understand blockchain technology from an implementation perspective.

Each block on our blockchain will have the following properties:

  1. Index – This is an auto-incremented number used to recognize a block in the blockchain.
  2. Sender – The user who created the block on the blockchain.
  3. Timestamp – The time at which the block was created.
  4. Info – The text information stored in the block.
  5. Previous hash – The hash value of the preceding block in the chain. This is useful in verifying the integrity of a blockchain by fingerprinting and linking the blocks in the blockchain.
  6. Nonce – This is a number that helps in creating a hash that meets the difficulty requirement set for the blockchain.
  7. Hash – This is a hash generated using all the above-mentioned properties of the block, including the nonce. This property uniquely identifies a block in a blockchain.
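As a rough illustration, these properties map naturally onto a plain Python dictionary. The values below are made up, the info key is the name used later in this article for the stored text, and the nonce is not mined here, so the resulting hash will generally not meet any difficulty requirement.

```python
import hashlib
from pprint import pprint
from time import time

# Illustrative values only; the sender, text, and previous hash are made up.
block = {
    'index': 0,
    'sender': 'alice',
    'timestamp': time(),
    'info': 'hello blockchain',
    'previous_hash': '0' * 64,  # placeholder standing in for a real hash
    'nonce': 0,                 # not mined here; 0 is just a starting value
}

# The hash is computed over the text form of all the other properties.
block['hash'] = hashlib.sha256(str(block).encode('utf-8')).hexdigest()

pprint(block)
```

The hash is added last, so it never becomes part of its own input.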

The first block of a blockchain is called the genesis block. Because the chain is empty at that point, there is no preceding block whose hash could be used as the previous hash. In such a case, the previous hash is generated from some secret specified by the creator of the blockchain. This ensures that all the blocks in a blockchain have the same structure.

Each blockchain has a difficulty level associated with it. In our blockchain, it specifies the number of leading hexadecimal digits of the hash that must be 0. To satisfy this condition, we have the nonce property, a whole number that is varied until the block’s hash has the required number of leading zeros. SHA256, the hashing algorithm used here and in Bitcoin, is designed so that its output cannot be predicted without actually computing it, so there is no known shortcut for working out the nonce in advance. Trial and error is the only practical way to find it, so we run a loop that tries one nonce after another. The process of guessing the nonce that generates a hash meeting the requirement is called mining; it is computationally expensive and time-consuming, but it is necessary to add a block to the blockchain.

We will set the difficulty level to 4 for our blockchain, so the first 4 characters of each hash must be ‘0000’. Bitcoin does not use a fixed difficulty: the network adjusts it every 2016 blocks so that mining a block takes roughly 10 minutes on average.
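The trial-and-error search can be sketched in a few lines. This is a simplified stand-in, not the code used later in the article: the function name mine and the sample text are assumptions, and the nonce is simply appended to the text before hashing.

```python
import hashlib


def mine(data: str, difficulty: int):
    """Try nonces 0, 1, 2, ... until the SHA256 hex digest starts with `difficulty` zeros."""
    target = '0' * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f'{data}{nonce}'.encode('utf-8')).hexdigest()
        if digest.startswith(target):
            return nonce, digest
        nonce += 1


# Show how the nonce that is found changes as the difficulty goes up.
for difficulty in range(1, 5):
    nonce, digest = mine('hello blockchain', difficulty)
    print(difficulty, nonce, digest[:12])
```

Each additional leading zero makes a matching hash 16 times less likely, so the expected number of attempts grows by a factor of 16 per difficulty level.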

Figure: Blockchain hashing algorithm

Now that we have a basic understanding of the blockchain that we will be building in this article, let’s get our hands dirty and start building.

Importing Modules

There is nothing fancy here; all the modules used in building the blockchain are part of the Python standard library and can be imported directly without installing anything with pip. We will use the hashlib module for SHA256 hashing, the time function from the time module to record the block creation time, and pprint to print the blocks in a readable form.

import hashlib  
from time import time
from pprint import pprint

Building the Blockchain

We will define a class called blockchain with three properties, namely blocks, secret, and difficulty. The blocks property will store all the blocks on the blockchain, the secret variable will be used as the previous hash of the genesis block, and difficulty holds the number of leading zeros required in each hash. We will define three functions, namely create_block, validate_blockchain, and show_blockchain.

The create_block function will be used to create a new block and append it to the blocks property of the blockchain. The properties of each block explained earlier will be implemented here, and the nonce that satisfies the requirement of four leading zeros in each hash will be calculated.

The validate_blockchain function will be used to validate the integrity of the blockchain. It checks the fingerprinting on each block and tells us whether the blockchain is valid: each block should contain the correct hash of the previous block. If there are any discrepancies, it is safe to assume that someone has meddled with the blocks on the blockchain. This check is what makes tampering detectable. Finally, the show_blockchain function will be used to display all the blocks on the blockchain.

class blockchain():
    def __init__(self):
        self.blocks = []       # all blocks, in order
        self.__secret = ''     # previous hash for the genesis block
        self.__difficulty = 4  # required number of leading zeros in every hash
        # mine the secret: find a number i so that the hash has enough leading zeros
        i = 0
        secret_string = '/*SECRET*/'
        while True:
            _hash = hashlib.sha256((secret_string + str(i)).encode('utf-8')).hexdigest()
            if _hash[:self.__difficulty] == '0' * self.__difficulty:
                self.__secret = _hash
                break
            i += 1

    def create_block(self, sender: str, information: str):
        block = {
            'index': len(self.blocks),
            'sender': sender,
            'timestamp': time(),
            'info': information
        }
        if block['index'] == 0:
            block['previous_hash'] = self.__secret  # for genesis block
        else:
            block['previous_hash'] = self.blocks[-1]['hash']
        # guessing the nonce; the hash covers every field set so far, including the nonce
        i = 0
        while True:
            block['nonce'] = i
            _hash = hashlib.sha256(str(block).encode('utf-8')).hexdigest()
            if _hash[:self.__difficulty] == '0' * self.__difficulty:
                block['hash'] = _hash
                break
            i += 1
        self.blocks.append(block)

    def validate_blockchain(self):
        # every block must store the hash of the block before it
        valid = True
        for i in range(len(self.blocks) - 1):
            if self.blocks[i]['hash'] != self.blocks[i + 1]['previous_hash']:
                valid = False
                break
        if valid:
            print('The blockchain is valid...')
        else:
            print('The blockchain is not valid...')

    def show_blockchain(self):
        for block in self.blocks:
            pprint(block)
            print()

Now that we have built the blockchain class, let’s use it to create our blockchain and add some blocks to it. I will add 3 blocks to the blockchain, validate it, and finally print the blocks and look at the output.

Python Code:
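The embedded snippet for this step is not reproduced in this copy of the article. What follows is a reconstruction, not the author’s original code: the variable name b, the senders, and the messages are taken from the tampering code and the output shown later in the article, while the order of the last two calls is an assumption based on the description above. The class is repeated in condensed form only so that the snippet runs on its own.

```python
import hashlib
from pprint import pprint
from time import time


class blockchain:
    """Condensed copy of the class above, repeated so this snippet runs on its own."""

    def __init__(self):
        self.blocks = []
        self.__difficulty = 4
        i = 0
        while True:  # mine the secret used as the genesis block's previous hash
            _hash = hashlib.sha256(('/*SECRET*/' + str(i)).encode('utf-8')).hexdigest()
            if _hash.startswith('0' * self.__difficulty):
                self.__secret = _hash
                break
            i += 1

    def create_block(self, sender, information):
        block = {'index': len(self.blocks), 'sender': sender,
                 'timestamp': time(), 'info': information}
        block['previous_hash'] = self.blocks[-1]['hash'] if self.blocks else self.__secret
        block['nonce'] = 0
        while True:  # guess the nonce
            _hash = hashlib.sha256(str(block).encode('utf-8')).hexdigest()
            if _hash.startswith('0' * self.__difficulty):
                block['hash'] = _hash
                break
            block['nonce'] += 1
        self.blocks.append(block)

    def validate_blockchain(self):
        valid = all(self.blocks[i]['hash'] == self.blocks[i + 1]['previous_hash']
                    for i in range(len(self.blocks) - 1))
        print('The blockchain is valid...' if valid else 'The blockchain is not valid...')

    def show_blockchain(self):
        for block in self.blocks:
            pprint(block)
            print()


b = blockchain()
b.create_block('Ram', 'Python is the best programming language!!')
b.create_block('Vishnu', 'I love cybersecurity')
b.create_block('Sanjay', 'AI is the future')
b.validate_blockchain()
b.show_blockchain()
```

The timestamps, nonces, and hashes will differ from run to run because each block includes the time at which it was created.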

We can see the blocks present on the blockchain, and the validate_blockchain function reports that the blockchain is valid. Now let’s meddle with our blockchain by inserting a new block in between the existing blocks and run the validate_blockchain function to see what it reports.

block = {
    'index': 2,
    'sender': 'Arjun',
    'timestamp': time(),
    'info': 'I am trying to tamper with the blockchain...'
}
block['previous_hash'] = b.blocks[1]['hash']
i = 0
while True:
    block['nonce'] = i
    _hash = hashlib.sha256(str(block).encode('utf-8')).hexdigest()
    if(_hash[:4] == '0'*4):
        block['hash'] = _hash
        break
    i+=1
b.blocks.insert(2, block)
b.show_blockchain()
b.validate_blockchain()

This is the output we get.

{'hash': '0000bfffcda53dc1c98a1fbaeab9b8da4e410bbcc24690fbe648027e3dadbee4',
 'index': 0,
 'info': 'Python is the best programming language!!',
 'nonce': 91976,
 'previous_hash': '000023ae8bc9821a09c780aaec9ac20714cbc4a829506ff765f4c82a302ef439',
 'sender': 'Ram',
 'timestamp': 1654930841.4248617}
{'hash': '00006929e45271c2ac38fb99780388709fa0ef9822c7f84568c22fa90683c15f',
 'index': 1,
 'info': 'I love cybersecurity',
 'nonce': 171415,
 'previous_hash': '0000bfffcda53dc1c98a1fbaeab9b8da4e410bbcc24690fbe648027e3dadbee4',
 'sender': 'Vishnu',
 'timestamp': 1654930842.8172457}
{'hash': '000078a974ba08d2351ec103a5ddb2d66499a639f90f9ae98462b9644d140ca9',
 'index': 2,
 'info': 'I am trying to tamper with the blockchain...',
 'nonce': 24231,
 'previous_hash': '00006929e45271c2ac38fb99780388709fa0ef9822c7f84568c22fa90683c15f',
 'sender': 'Arjun',
 'timestamp': 1654930848.2898204}
{'hash': '0000fe124dad744f17dd9095d61887881b2cbef6809ffd97f9fca1d0db055f2a',
 'index': 2,
 'info': 'AI is the future',
 'nonce': 173881,
 'previous_hash': '00006929e45271c2ac38fb99780388709fa0ef9822c7f84568c22fa90683c15f',
 'sender': 'Sanjay',
 'timestamp': 1654930845.594902}
The blockchain is not valid...

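We can see that the validate_blockchain function reports that the blockchain is not valid. The block that now follows the inserted one still stores the hash of block 1 as its previous hash, which no longer matches the hash of the block directly before it. The fingerprinting is broken, and hence the integrity of our blockchain has been compromised.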
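Note that validate_blockchain only compares the stored hash and previous_hash values of neighbouring blocks, so editing the info of an existing block in place, without touching its hash, would go unnoticed. A stricter check, sketched below as one possible approach rather than as part of the original class, also recomputes every hash. The names block_hash, is_chain_valid, and mine_block are hypothetical, and the demo uses made-up blocks with a difficulty of 2 so that it runs quickly.

```python
import hashlib


def block_hash(block):
    """Recompute a block's hash from every field except the stored hash itself."""
    fields = {k: v for k, v in block.items() if k != 'hash'}
    return hashlib.sha256(str(fields).encode('utf-8')).hexdigest()


def is_chain_valid(blocks, difficulty=4):
    """Check stored hashes, the difficulty prefix, and the links between neighbours."""
    for position, block in enumerate(blocks):
        if block['hash'] != block_hash(block):
            return False  # contents were edited after mining
        if not block['hash'].startswith('0' * difficulty):
            return False  # hash does not meet the difficulty requirement
        if position > 0 and block['previous_hash'] != blocks[position - 1]['hash']:
            return False  # link to the preceding block is broken
    return True


def mine_block(index, info, previous_hash, difficulty):
    """Build a small demo block and search for a nonce that meets the difficulty."""
    block = {'index': index, 'info': info, 'previous_hash': previous_hash, 'nonce': 0}
    while not block_hash(block).startswith('0' * difficulty):
        block['nonce'] += 1
    block['hash'] = block_hash(block)
    return block


chain = []
previous = '0' * 64  # placeholder previous hash for the first demo block
for index, info in enumerate(['first', 'second', 'third']):
    chain.append(mine_block(index, info, previous, difficulty=2))
    previous = chain[-1]['hash']

print(is_chain_valid(chain, difficulty=2))  # True
chain[1]['info'] = 'edited'                 # tamper in place; the links are untouched
print(is_chain_valid(chain, difficulty=2))  # False
```

The recomputation relies on the fields being in the same order as when the block was mined. That holds for the blocks created by the class in this article, because the hash key is always added last. The sketch does not verify the previous hash of the first block against the secret.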

Conclusion on Blockchain using Python

In this article, we discussed the following:

  • What is blockchain and how to build a blockchain using Python?
  • Properties of blockchain
  • Fingerprinting in blockchain
  • Difficulty level and nonce in blockchain
  • Building our own blockchain
  • Tampering with the blocks
  • Checking for the integrity of the tampered blockchain

To continue this project further, the blockchain can be hosted and deployed as a REST API server on the cloud that users can call to store information on the blockchain. For the sake of simplicity, our blockchain is not distributed. If you are interested in using blockchain technology for your database, feel free to look at BigchainDB, which is a decentralized blockchain database. It provides support for both Python and Node.js. Alternatively, GunDB is a popular graph-based decentralized database engine that has recently been used in Web3 applications.

That’s it for this article (building blockchain using Python). Hope you enjoyed reading this article and learned something new. Thanks for reading and happy learning!

 The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.
