Unit3 - Subjective Questions
CSC203 • Practice Questions with Detailed Answers
Explain the concept of Atomic Broadcast and its relationship with the Consensus problem in distributed systems.
Atomic Broadcast (also known as Total Order Broadcast) is a communication primitive in distributed systems that ensures messages are delivered to all participants reliably and in the same order.
It satisfies two main properties:
- Reliability: If a correct node delivers a message, all correct nodes deliver that message.
- Total Order: If two correct nodes deliver two messages and , they deliver them in the same order (e.g., if Node A delivers before , Node B must also deliver before ).
Relationship with Consensus:
Atomic Broadcast and Consensus are equivalent in asynchronous systems with crash failures.
- To solve Consensus using Atomic Broadcast: Nodes broadcast their proposed values. The first value delivered by the atomic broadcast channel is accepted as the decided value.
- To solve Atomic Broadcast using Consensus: Nodes use consensus to agree on the next message to be delivered in the sequence.
Define the Consensus problem and list the three fundamental properties a consensus algorithm must satisfy.
The Consensus problem requires a set of processes (or nodes) to agree on a single data value. Some processes may fail or be unreliable, so consensus protocols must be fault-tolerant.
A consensus algorithm must satisfy the following three properties:
- Termination (Liveness): Every correct (non-faulty) process eventually decides on a value.
- Agreement (Safety): All correct processes decide on the same value. If process decides and process decides , then .
- Validity (Integrity): If a correct process decides , then was proposed by some process. This prevents the algorithm from deciding on a trivial or arbitrary value.
Differentiate between Crash Faults and Byzantine Faults in the context of distributed computing models.
The distinction lies in the behavior of the faulty nodes:
1. Crash Fault Model (Fail-Stop):
- Behavior: A node halts (stops working) prematurely but does not do anything malicious before stopping.
- Visibility: Once it stops, it sends no further messages. Other nodes may detect this via timeouts.
- Complexity: Easier to solve. Consensus is possible with faults given nodes (in synchronous settings).
2. Byzantine Fault Model:
- Behavior: A node behaves arbitrarily. It may send conflicting messages to different peers, stay silent, delay messages, or collude with other faulty nodes to break the system.
- Visibility: Hard to detect because the node mimics correct behavior while injecting errors.
- Complexity: Much harder to solve. Requires nodes to tolerate Byzantine faults.
Describe the Byzantine Generals Problem. How does it illustrate the challenges of achieving consensus in a decentralized network?
The Byzantine Generals Problem is a logical dilemma that illustrates how a group of distributed processors can achieve consensus despite the presence of unreliable or malicious actors.
The Scenario:
- A group of generals surrounds a city. They must attack or retreat simultaneously; otherwise, they will fail.
- They communicate only via messengers.
- One or more generals may be traitors (Byzantine faults) who try to confuse the others by sending conflicting orders (e.g., telling General A to "Attack" and General B to "Retreat").
The Challenge:
Ideally, all loyal generals should agree on the same plan. However, because traitors can lie about what others said, a general cannot trust a single message blindly.
Relevance to Blockchain:
In a decentralized network like Bitcoin, nodes (generals) must agree on the state of the ledger (the plan). Malicious actors (traitors) may try to spend money twice or fork the chain. A Byzantine Fault Tolerant (BFT) system ensures the network reaches consensus provided that less than of the participants are malicious.
Explain the concept of Collision Resistance in cryptographic hash functions. Why is it impossible to have a perfectly collision-free hash function?
Collision Resistance is a property of a cryptographic hash function whereby it is computationally infeasible to find two distinct inputs, and , such that they hash to the same output.
Why perfect collision freedom is impossible:
This is due to the Pigeonhole Principle.
- A hash function maps an input space of arbitrary size (infinite potential inputs) to an output space of fixed size (e.g., 256 bits).
- Because the number of possible inputs exceeds the number of possible outputs, collisions must exist mathematically.
- However, for a secure hash function (like SHA-256), finding such a collision is so statistically improbable (requiring astronomical computational power) that it is considered "collision resistant" for all practical purposes.
What does it mean for a hash function to be "Puzzle Friendly"? Explain its significance in the context of Bitcoin mining.
Puzzle Friendliness is a specific property of hash functions useful for Proof-of-Work systems.
Definition:
A hash function is puzzle-friendly if, for every possible n-bit output value , if is chosen from a distribution with high min-entropy (is very random), it is computationally infeasible to find an input such that significantly faster than operations.
Significance in Mining:
- In Bitcoin, the "puzzle" is to find a nonce such that the hash of the block header (containing the previous hash and other data acting as ) falls below a certain target.
- Because the hash function is puzzle-friendly, there are no shortcuts or strategies to predict the input that will produce the desired output.
- Miners must use brute force (try random numbers), ensuring that the probability of solving the puzzle is proportional only to the computational power they contribute.
List and explain the three core properties required for a secure Cryptographic Hash Function.
A secure cryptographic hash function must satisfy the following three properties:
-
Pre-image Resistance (One-wayness):
Given a hash output , it should be computationally infeasible to find any input such that . This ensures that data cannot be reverse-engineered from its hash. -
Second Pre-image Resistance (Weak Collision Resistance):
Given a specific input , it should be infeasible to find a different input such that . This ensures that a specific piece of data cannot be substituted with another while keeping the same hash. -
Collision Resistance (Strong Collision Resistance):
It should be infeasible to find any two distinct inputs and such that . This is the strongest property and implies the second pre-image resistance.
Describe the functioning of Digital Signatures using three algorithms: (Gen, Sign, Verify).
Digital signature schemes consist of three probabilistic polynomial-time algorithms:
-
Key Generation ():
- Input: Security parameter (key size).
- Output: A pair of keys .
- is the Secret Key (kept private) used for signing.
- is the Public Key (distributed openly) used for verification.
-
Signing ():
- Input: The secret key and a message .
- Output: A signature .
- Mathematical representation: .
-
Verification ():
- Input: The public key , the message , and the signature .
- Output: A boolean value (True/False).
- Mathematical representation: .
- It returns True if was essentially generated by on message . Otherwise, it returns False.
Discuss the properties of Unforgeability and Non-repudiation in Digital Signatures.
1. Unforgeability (Existential Unforgeability):
This is the primary security guarantee. It implies that an adversary, even with access to the public key and a set of valid message-signature pairs chosen by them, cannot generate a valid signature for a new message that hasn't been signed before. In simple terms, only the holder of the private key can create a valid signature.
2. Non-repudiation:
Non-repudiation ensures that the signer cannot deny having signed the message later. Since the signature can only be generated by the private key (which is assumed to be known only to the owner) and verified by the public key, the existence of a valid signature is mathematical proof that the owner of the private key authorized the message.
What is the Avalanche Effect in hash functions?
The Avalanche Effect is a desirable property of cryptographic algorithms, specifically hash functions and block ciphers.
It states that a very small change in the input data (e.g., flipping a single bit) should cause a drastic, random-looking change in the output (hash digest). Ideally, flipping one bit in the input should change approximately 50% of the bits in the output.
Importance:
- It prevents attackers from predicting how the hash will change based on input modifications.
- It ensures good randomization, which is crucial for passing statistical tests and resisting cryptanalysis.
Define Verifiable Random Functions (VRF). How do they differ from standard pseudo-random number generators?
A Verifiable Random Function (VRF) is a cryptographic primitive that maps an input to a pseudo-random output , while also producing a proof that the output was generated correctly using a specific private key.
Mathematically, a VRF involves:
Difference from Standard PRNGs:
- Verifiability: In a standard PRNG, the observer cannot verify why a specific random number was generated or if it was manipulated. In a VRF, anyone with the corresponding Public Key can use the proof to verify that corresponds to input and was generated by the owner of :
- Unpredictability: Like a PRNG, the output looks random, but unlike a PRNG, it is tied mathematically to a private key, ensuring only the key holder can compute it, but anyone can check it.
Explain the role of Verifiable Random Functions (VRFs) in Blockchain consensus mechanisms like Algorand.
VRFs are used in blockchain consensus (specifically Proof-of-Stake variations like Algorand) to perform Cryptographic Sortition (secret, random selection of leaders).
The Mechanism:
- Secret Selection: Each user privately computes a VRF on a seed (e.g., block hash) using their private key. If the output value falls below a certain threshold (weighted by their stake), they are selected as a block proposer or committee member.
- Unpredictability: Because the private key is needed to compute the VRF, no one knows who is selected until the winner reveals their credential.
- Verification: When the user publishes the block, they include the VRF output and proof. Other nodes can verify using the user's public key that the user was indeed legitimately selected to propose the block.
This prevents Denial of Service (DoS) attacks because the adversary doesn't know who to attack until the block is already proposed.
What are Zero-Knowledge Systems (Zero-Knowledge Proofs)? Explain with the help of the "Ali Baba's Cave" analogy.
Zero-Knowledge Proofs (ZKPs) are cryptographic protocols where a "Prover" convinces a "Verifier" that a statement is true without revealing any information about the statement itself (other than the fact that it is true).
Ali Baba's Cave Analogy:
- Setup: A cave has a circular path with an entrance and a magic door at the back blocking the loop. The door opens only with a secret password.
- The Actor: Peggy (Prover) wants to prove to Victor (Verifier) she knows the password without saying it.
- The Process:
- Victor waits outside. Peggy enters and chooses path A or B randomly.
- Victor walks to the entrance and shouts to Peggy to come out via path A or B (randomly chosen).
- If Peggy knows the password, she can open the door and come out the correct side regardless of where she started.
- If she doesn't know the password, she only has a 50% chance of being on the right side.
- Conclusion: By repeating this multiple times, if Peggy always succeeds, the probability she is guessing approaches zero. Victor is convinced she knows the password, but he never heard the password himself.
List and explain the three fundamental properties of a Zero-Knowledge Proof.
For a protocol to be considered a Zero-Knowledge Proof, it must satisfy:
-
Completeness:
If the statement is true and both the Prover and Verifier follow the protocol correctly, the Verifier will definitely accept the proof. (A honest prover convinces a honest verifier). -
Soundness:
If the statement is false, no cheating Prover can convince the honest Verifier that it is true, except with some negligible probability. (You cannot prove a lie). -
Zero-Knowledge:
If the statement is true, no cheating Verifier learns anything other than the fact that the statement is true. They learn nothing about the secret witness (input) used to generate the proof.
Distinguish between Interactive and Non-Interactive Zero-Knowledge Proofs.
Interactive Zero-Knowledge Proofs:
- Process: Requires a back-and-forth communication (multiple rounds) between the Prover and the Verifier.
- Constraint: The Prover and Verifier must be online simultaneously.
- Example: The basic Ali Baba cave scenario (Victor shouts, Peggy responds repeatedly).
Non-Interactive Zero-Knowledge Proofs (NIZK):
- Process: The Prover generates a single proof message and sends it to the Verifier (or publishes it to a blockchain).
- Mechanism: Often uses the Fiat-Shamir heuristic to replace the random challenges of the verifier with a hash of the protocol's state, simulating the interaction.
- Constraint: Requires a "Common Reference String" (CRS) or trusted setup phase.
- Example: zk-SNARKs (used in Zcash). This is preferred for blockchains as verification can happen anytime without the prover being online.
Describe the requirements for a Byzantine Fault Tolerant (BFT) algorithm in terms of the number of nodes required.
In a distributed system with total nodes, where represents the number of Byzantine (malicious) nodes:
-
Requirement: A consensus algorithm can guarantee safety and liveness only if:
Or conversely, the system can tolerate at most malicious nodes. -
Reasoning:
If the commander sends a value, the nodes must communicate to verify it. If there are traitors, they can lie. To distinguish the truth, the loyal nodes (minimum ) must outnumber the sum of traitors () plus the potential specifically confusing messages generated by traitors to split the vote. The mathematical derivation proves that strict consensus is impossible if or more of the network is Byzantine.
Explain the significance of the Discrete Logarithm Problem in the context of Digital Signatures.
The Discrete Logarithm Problem (DLP) forms the mathematical hardness assumption for many digital signature schemes, such as DSA (Digital Signature Algorithm) and ECDSA (Elliptic Curve DSA, used in Bitcoin).
The Problem:
Given a generator of a finite cyclic group and a value , it is computationally easy to compute given (exponentiation). However, it is computationally infeasible to compute given and (finding the discrete log), provided the group size is large enough.
Significance:
- acts as the Private Key.
- acts as the Public Key.
- Because DLP is hard, an attacker cannot derive the private key from the public key . This ensures that signatures generated by are unforgeable.
How do Hash Pointers ensure data integrity in a blockchain structure?
A Hash Pointer is a data structure that contains two things:
- A pointer to the location of some information (address).
- A cryptographic hash of that information.
Ensuring Integrity:
- In a blockchain, each block contains the hash of the previous block (a hash pointer to the previous block).
- If an attacker modifies data in Block , the hash of Block changes.
- Consequently, the hash pointer stored in Block (which should match Block 's hash) becomes invalid.
- To hide the tampering, the attacker would have to modify Block as well, which changes its hash, requiring a change in Block , and so on up to the "head" of the chain.
- Because of the Collision Resistance and Avalanche Effect of hash functions, any tamper is immediately detectable by checking the hash chain.
Discuss the FLP Impossibility Result briefly and its implication for consensus in asynchronous systems.
The FLP Impossibility Result (named after Fischer, Lynch, and Paterson, 1985) is a fundamental theorem in distributed computing.
Statement:
In an asynchronous distributed system (where there are no upper bounds on message delivery times), it is impossible to achieve a deterministic consensus protocol that guarantees all three properties: Safety, Liveness (Termination), and Fault Tolerance, if even a single node can crash.
Implications:
- It proves that perfect consensus is impossible in purely asynchronous networks with faults.
- Solution in Blockchain: Blockchains circumvent this by relaxing the "asynchronous" assumption (assuming partial synchrony) or by relaxing "deterministic termination" (using probabilistic consensus like Proof of Work, where consensus is never 100% final but becomes exponentially probable over time).
Compare Symmetric Message Authentication Codes (MAC) and Digital Signatures.
1. Key Usage:
- MAC: Uses Symmetric Cryptography. The sender and receiver share the same secret key ().
- Digital Signatures: Uses Asymmetric Cryptography. The sender has a Private Key (), and the receiver uses a Public Key ().
2. Non-Repudiation:
- MAC: Does not provide non-repudiation. Since both parties have the key, the receiver could have forged the message themselves. They cannot prove to a third party who sent it.
- Digital Signatures: Provides non-repudiation. Only the sender has the private key, so only the sender could have created the signature. This is verifiable by any third party.
3. Performance:
- MAC: generally faster to compute (based on hashing or block ciphers).
- Digital Signatures: Computationally more intensive (based on modular exponentiation or elliptic curves).