Unit 2 - Subjective Questions

INT312 • Practice Questions with Detailed Answers

1

Explain the core architecture of Hadoop and its primary components.

2

Describe the architecture of the Hadoop Distributed File System (HDFS).

3

Differentiate between NameNode and DataNode in HDFS.

4

Explain the role and functioning of the Secondary NameNode.

5

Describe the data read operation in HDFS with a step-by-step workflow.

6

Explain the data write operation in HDFS.

7

What is Rack Awareness in Hadoop? Why is it important?

8

Explain the architecture of YARN (Yet Another Resource Negotiator).

9

Differentiate between Hadoop 1.x and Hadoop 2.x architecture.

10

Describe the MapReduce programming model and its core phases.

11

Explain the significance of the Shuffle and Sort phase in MapReduce.

12

Discuss the concept of Block Size and Replication in HDFS.

13

How does Hadoop achieve High Availability (HA)? Explain the HA architecture.

14

What are the different execution modes of Hadoop? Describe each briefly.

15

Explain the roles of JobTracker and TaskTracker in Hadoop 1.x.

16

Discuss the function of the ApplicationMaster in YARN.

17

What is a 'Heartbeat' in Hadoop, and why is it important?

18

Explain how fault tolerance is achieved in HDFS.

19

Describe the concept of Data Locality in Hadoop and explain how HDFS and MapReduce/YARN interact to achieve it.

20

Briefly describe the purpose of ZooKeeper, Hive, and Pig in the Hadoop ecosystem and how each relates to the core architecture.