Unit 3 - Subjective Questions

INT396 • Practice Questions with Detailed Answers

1

Define Hierarchical Clustering and differentiate between Agglomerative and Divisive approaches.

2

Explain the four common linkage methods used in hierarchical clustering: Single, Complete, Average, and Ward's linkage.

3

What is a dendrogram? Explain how it is interpreted to determine the optimal number of clusters.

4

Provide a detailed comparison between k-Means, Hierarchical, and DBSCAN clustering algorithms.

5

Explain the fundamental concepts of ε-neighborhood and MinPts in the context of DBSCAN.

6

Define and distinguish between Core points, Border points, and Noise points in DBSCAN.

7

Formulate the mathematical expressions for Single Linkage and Complete Linkage. Why is Single Linkage prone to the chaining effect?

8

Explain Ward's Linkage method in agglomerative hierarchical clustering. How does its objective function differ from Single and Complete linkage?

9

Describe the step-by-step algorithm for Agglomerative Hierarchical Clustering.

10

Discuss the main advantages and limitations of Hierarchical Clustering.

11

Provide a step-by-step explanation of how the DBSCAN algorithm works.

12

What are the advantages and disadvantages of using DBSCAN for clustering?

13

Explain the concepts of 'Directly Density-Reachable', 'Density-Reachable', and 'Density-Connected' in DBSCAN.

14

How does Average Linkage strike a balance between Single and Complete Linkage in Hierarchical Clustering?

15

Why is DBSCAN often preferred over k-Means when dealing with datasets that have non-linear or arbitrary shapes?

16

Explain the Divisive Hierarchical Clustering process and why it is computationally more expensive than Agglomerative Clustering.

17

Derive or provide the mathematical representations of cluster distances for Single, Complete, and Average linkage.

18

In a dendrogram, what does the vertical height of a horizontal line connecting two branches represent? Explain with an example.

19

What happens to the DBSCAN clustering output if the value of ε is chosen too large or too small?

20

Contrast how k-Means and DBSCAN handle outliers and noise in a dataset.