Unit1 - Subjective Questions
CSC203 • Practice Questions with Detailed Answers
Explain the need for Distributed Record Keeping compared to traditional centralized databases.
Distributed record keeping addresses several critical limitations of centralized databases. The key needs include:
- Elimination of Single Point of Failure: In centralized systems, if the main server goes down, the entire system halts. Distributed systems replicate data across nodes, ensuring high availability.
- Trust and Transparency: Centralized systems require trust in a single authority (intermediary). Distributed ledgers allow multiple parties who do not trust each other to agree on the state of the data.
- Censorship Resistance: It is difficult for a single entity or government to shut down or alter records in a truly distributed network.
- Immutability: Once recorded, data in a distributed blockchain is extremely difficult to alter without the consensus of the network, preventing fraud.
Define the Byzantine Generals Problem and explain its significance in the context of distributed systems.
Definition: The Byzantine Generals Problem is a logical dilemma that illustrates how a group of independent entities can agree on a specific action (consensus) even if some of them are unreliable or malicious (Byzantine faults).
The Scenario:
- A group of generals surrounds a city.
- They must attack or retreat simultaneously to win.
- They communicate via messengers.
- Some generals (traitors) may send conflicting messages to confuse the loyalists.
Significance in Blockchain:
It models the challenge of reaching consensus in a decentralized network where nodes (generals) might fail or act maliciously. A system is considered Byzantine Fault Tolerant (BFT) if it can function correctly provided that less than of the nodes are malicious (i.e., , where is total nodes and is faulty nodes).
What are Hash Pointers? How do they differ from standard pointers used in data structures?
Hash Pointers are a fundamental building block of blockchain technology.
Definition: A hash pointer is a data structure that points to where some information is stored together with a cryptographic hash of the information. It is represented as a tuple: .
Difference from Standard Pointers:
- Standard Pointer: Only tells you where the data is (memory address). It offers no guarantee that the data hasn't changed.
- Hash Pointer: Tells you where the data is AND allows you to verify that the data has not been tampered with. If the data at the address changes, the hash derived from it will not match the hash stored in the pointer.
This property allows blockchains to be 'tamper-evident'.
Discuss the Scalability Problems associated with traditional Consensus Algorithms like Proof of Work (PoW).
Consensus algorithms ensure agreement, but they often face the Scalability Trilemma (balancing Security, Decentralization, and Scalability). Major problems include:
- Low Throughput: Bitcoin (using PoW) handles approximately 7 transactions per second (TPS), whereas Visa handles thousands. The consensus mechanism requires time for block propagation and validation.
- High Latency: Confirmation times are long. In Bitcoin, a block is mined every 10 minutes, and typically 6 confirmations (1 hour) are required for finality.
- Storage Bloat: As the ledger grows, every node must store the entire history, increasing storage requirements.
- Network Overhead: Propagating every transaction and block to every node creates a bottleneck as the network size () increases, leading to or communication complexity depending on the protocol.
Describe Satoshi Nakamoto's core concept for a blockchain-based cryptocurrency. How did it solve the Double Spending problem?
Satoshi Nakamoto introduced Bitcoin in 2008 as a peer-to-peer electronic cash system. The core concept combined cryptography, distributed networks, and game theory.
Core Concepts:
- Decentralized Ledger: A public record of all transactions.
- Proof of Work (PoW): A mechanism to make adding blocks costly, preventing spam and rewriting history.
- Longest Chain Rule: Nodes always accept the longest chain of blocks as the valid truth.
Solving Double Spending:
Double spending occurs when a user spends the same digital token twice. Nakamoto solved this by:
- Publicly announcing all transactions to the network.
- Timestamping transactions into blocks.
- Requiring computational work (PoW) to finalize a block.
- If an attacker tries to double spend, they would have to redo the work for that block and all subsequent blocks faster than the rest of the network combined ( attack), which is economically infeasible.
Explain the concept of Merkle Trees and their utility in Blockchain technology.
A Merkle Tree is a binary tree in which every leaf node is labelled with the cryptographic hash of a data block (e.g., a transaction), and every non-leaf node is labelled with the hash of the labels of its child nodes.
Structure:
If we have transactions and :
- Compute and .
- The parent node is .
Utility in Blockchain:
- Efficiency: It allows the network to summarize all transactions in a block into a single Merkle Root, which is stored in the block header.
- Lightweight Verification (SPV): A user can verify if a specific transaction is included in a block by downloading only the block headers and a small 'Merkle path' (logarithmic complexity, ), rather than the entire blockchain.
What are the three essential properties of Cryptographic Hash Functions used in Blockchain?
To be useful in blockchain (like SHA-256), hash functions must satisfy three specific properties:
-
Collision Resistance:
It should be computationally infeasible to find two different inputs and such that . This ensures data integrity. -
Pre-image Resistance (Hiding):
Given a hash output , it should be computationally infeasible to find any input such that . This secures the original data (one-way function). -
Avalanche Effect (Puzzle Friendliness):
A small change in the input (even a single bit) should produce a completely different, unpredictable output hash. This property is crucial for Proof of Work algorithms.
Distinguish between Crash Faults and Byzantine Faults in distributed computing.
These terms describe how nodes in a distributed system can fail.
| Feature | Crash Faults | Byzantine Faults |
|---|---|---|
| Behavior | The node simply stops working or disconnects. It does not send bad data. | The node may behave arbitrarily. It can send conflicting information to different nodes, delay messages, or act maliciously. |
| Difficulty | Easier to handle. The system only needs to detect the silence/absence of the node. | Harder to handle. The system must verify the correctness of data despite active attempts to deceive it. |
| Protocols | Handled by protocols like Paxos or Raft. | Handled by BFT protocols (e.g., PBFT) or PoW. |
| Threshold | Can tolerate up to failure (). | Typically tolerates less than failure (). |
Derive or Explain the condition for Proof of Work mining mathematically.
In Proof of Work (PoW), miners compete to find a value called a nonce that, when hashed with the block header, produces a hash below a specific target value.
Variables:
- : Cryptographic Hash Function (e.g., SHA-256).
- : The content of the block header (transactions, timestamp, previous hash).
- : The nonce (random number).
- : The Target value (determined by network difficulty).
The Condition:
Miners must find such that:
Since the output of is uniformly distributed, the probability of finding such a nonce is . By adjusting , the network ensures the average time to find a block remains constant (e.g., 10 minutes for Bitcoin). Lowering increases difficulty.
Explain the role of Digital Signatures in Blockchain. Which two properties must a digital signature scheme satisfy?
Role in Blockchain:
Digital signatures mimic physical signatures on digital documents. In cryptocurrencies, they are used to prove ownership of assets. When a user sends a transaction, they sign it with their Private Key. The network validates it using the user's Public Key.
Required Properties:
- Unforgeability: It must be computationally infeasible for anyone to generate a valid signature for a message without knowing the private key.
- Non-repudiation: The signer cannot deny having signed the message later, as the signature is mathematically bound to both the message and their public identity.
Standard used: Elliptic Curve Digital Signature Algorithm (ECDSA).
What technologies were borrowed or integrated to create Blockchain technology?
Blockchain was not a single invention but an integration of existing technologies:
- Hash Pointers & Merkle Trees (1979): Ralph Merkle invented these for verifying data integrity efficiently.
- Cryptography (Hash Functions & Digital Signatures): Borrowed from public-key cryptography (RSA, Elliptic Curves) for security and identity.
- Consensus Algorithms (BFT): Concepts from distributed computing (1980s) regarding how to reach agreement in faulty networks.
- Linked Timestamping (1991): Haber and Stornetta proposed linking document hashes to certify when they were created.
- Proof of Work (1997/2002): Originally 'Hashcash' by Adam Back, designed to prevent email spam, repurposed by Nakamoto for mining.
- P2P Networks: File sharing architectures like BitTorrent.
Explain the Longest Chain Rule and how it resolves forks in a blockchain.
The Problem (Forks):
Due to network latency, two miners might find a valid block at roughly the same time. Different parts of the network might receive different blocks first, creating a temporary split (fork) in the chain.
The Longest Chain Rule:
- When a node observes two conflicting chains, it implicitly trusts the chain that has the most cumulative computational work (usually the longest one).
- Miners will try to extend the longest chain they know of.
- Eventually, one chain will grow longer than the other. The blocks in the shorter chain become 'orphaned' or 'stale' blocks and are discarded.
This mechanism ensures that the network eventually converges on a single history of truth (eventual consistency).
Discuss the Scalability Trilemma in the context of distributed ledgers.
The Scalability Trilemma, proposed by Vitalik Buterin, states that it is difficult for a blockchain to simultaneously achieve all three of the following properties:
- Decentralization: The system relies on a large number of participants without a central authority. (Benefit: Censorship resistance).
- Security: The system can resist attacks from a large percentage of malicious power ( attacks).
- Scalability: The system can handle a high throughput of transactions (TPS) and low latency.
Trade-off:
- Bitcoin/Ethereum focus on Decentralization and Security, resulting in low Scalability.
- High-speed chains (like Solana or EOS) often compromise partially on Decentralization (fewer validators) to achieve Scalability.
Describe the Structure of a Block in a typical blockchain.
A block typically consists of two main parts: the Header and the Body.
1. Block Header: Contains metadata. Key fields include:
- Version: Block version number.
- Previous Block Hash: A hash pointer to the previous block (links the chain).
- Merkle Root: A single hash representing all transactions in the block.
- Timestamp: Current time.
- Difficulty Target: The value the hash must be below.
- Nonce: The variable used in Proof of Work.
2. Block Body:
- Transaction Counter: Number of transactions.
- Transactions: The actual list of transaction data (inputs and outputs).
The block header is what is hashed during mining, not the entire large body.
What is a Sybil Attack and how does Blockchain prevent it?
Sybil Attack Definition:
An attack where a single adversary controls multiple fake identities (nodes) on a peer-to-peer network to gain disproportionate influence (e.g., to outvote honest nodes in consensus).
Prevention in Blockchain:
Blockchain prevents this not by checking ID cards (which requires centralization), but by making identity creation expensive.
- Proof of Work (PoW): Influence is tied to computational power (CPU/Electricity). To create a 'Sybil' army, the attacker must buy physical hardware, which is cost-prohibitive.
- Proof of Stake (PoS): Influence is tied to financial ownership (coins held). An attacker must buy a massive amount of the currency to attack the network.
Define Consensus in distributed systems and list the properties a consensus algorithm must satisfy.
Definition: Consensus is the process by which a network of unreliable nodes reaches an agreement on a specific data value or state of the ledger.
Key Properties:
- Agreement: All non-faulty nodes must agree on the same value.
- Validity: If all non-faulty nodes propose the same value , then the agreed-upon value must be (prevents agreeing on garbage).
- Termination (Liveness): All non-faulty nodes eventually decide on a value (the algorithm doesn't run forever).
- Integrity: Every node decides at most one value.
How does the blockchain achieve Immutability using Hash Pointers? Explain with a diagrammatic concept.
Blockchain achieves immutability through the linked-list structure using hash pointers.
Mechanism:
- Block contains the hash of Block .
- Block contains the hash of Block , and so on, back to the Genesis Block.
Tamper Evidence:
If an attacker modifies data in Block :
- The hash of Block changes.
- Block contains the old hash of , so it is now invalid.
- To fix , the attacker must re-hash it, which changes 's hash.
- This cascades all the way to the tip of the chain.
Because of Proof of Work, redoing all this hashing for the entire chain is computationally impossible for an attacker, rendering the history immutable.
What is Digital Cash? Briefly trace the history of Digital Cash before Bitcoin.
Digital Cash: A system that allows the secure storage and exchange of value electronically, ideally preserving the anonymity and bearer-instrument nature of physical cash.
History before Bitcoin:
- 1983 (David Chaum): Invented Blind Signatures, allowing for untraceable payments. Founded DigiCash (1990). It failed because it was centralized (banks had to sign the cash).
- 1997 (Adam Back): Created Hashcash, a PoW system for spam control (later used in Bitcoin mining).
- 1998 (Wei Dai): Proposed b-money, a distributed crypto-currency concept involving PoW, but lacked a consensus detail.
- 1998 (Nick Szabo): Proposed Bit Gold, which used puzzle solutions as value. It struggled with the double-spending problem.
Bitcoin succeeded by combining these ideas and solving the double-spending problem without a central server.
Explain the 51% Attack concept. Why is it considered a major threat to Blockchain integrity?
Definition: A 51% attack occurs when a single miner or mining pool controls more than of the network's total hashing power (hash rate).
Capabilities of the Attacker:
- Double Spending: They can spend coins, privately mine a secret chain where they didn't spend them, and then release the secret chain. Since it has more hash power, it becomes the longest chain, reversing the original transaction.
- Censorship: They can refuse to include specific transactions in blocks.
Why it's a threat: It undermines the core value proposition of blockchain: trustlessness and immutability. If a single entity rules the chain, it becomes a centralized database.
Compare Permissioned (Private) and Permissionless (Public) Blockchains.
| Feature | Permissionless (Public) | Permissioned (Private/Consortium) |
|---|---|---|
| Access | Open to anyone. Anyone can join, read, write, and mine. | Restricted. Access is granted by a central authority or consortium. |
| Trust | Trustless. Relies on math and game theory (PoW/PoS). | Semi-trusted. Participants are known and vetted. |
| Efficiency | Lower. High redundancy and energy consumption. | Higher. Faster consensus (e.g., PBFT) and high throughput. |
| Immutability | Very high. Almost impossible to tamper with. | Lower. The controlling consortium could collude to revert changes. |
| Examples | Bitcoin, Ethereum. | Hyperledger Fabric, Corda. |