1 $Which of the following best describes the agglomerative approach to hierarchical clustering?$

Hierarchical clustering: Agglomerative vs. Divisive Easy

A.

A bottom-up approach that starts with each data point as its own cluster.

B.

A centroid-based approach that requires specifying the number of clusters $k$ in advance.

C.

A top-down approach that starts with all data points in a single cluster.

D.

A density-based approach that identifies core points.

2 $Divisive hierarchical clustering operates by:$

Hierarchical clustering: Agglomerative vs. Divisive Easy

A.

Merging the closest pairs of clusters sequentially.

B.

Assigning points to the nearest centroid.

C.

Identifying dense regions separated by sparse regions.

D.

Starting with all points in one cluster and recursively splitting them.

3 $In the initial step of agglomerative hierarchical clustering on a dataset with $N$ points, how many clusters are there?$

Hierarchical clustering: Agglomerative vs. Divisive Easy

A.

$1$

B.

C.

D.

4 $Which linkage method defines the distance between two clusters as the shortest distance between any two points in the clusters?$

Linkage methods (single, complete, average, Ward) Easy

A.

Complete linkage

B.

Single linkage

C.

Average linkage

D.

Ward's method

5 $Complete linkage is based on which of the following metrics?$

Linkage methods (single, complete, average, Ward) Easy

A.

The increase in within-cluster variance.

B.

The minimum distance between points in two clusters.

C.

The average distance of all pairs of points.

D.

The maximum distance between any two points in the clusters.

6 $Which linkage method aims to minimize the total within-cluster variance when merging two clusters?$

Linkage methods (single, complete, average, Ward) Easy

A.

Average linkage

B.

Ward's method

C.

Single linkage

D.

Complete linkage

7 $What is the primary characteristic of average linkage in hierarchical clustering?$

Linkage methods (single, complete, average, Ward) Easy

A.

It explicitly minimizes the sum of squared errors.

B.

It uses the furthest pair of points.

C.

It calculates the average of distances between all pairs of points across two clusters.

D.

It uses the closest pair of points.

8 $Which linkage method is most prone to the 'chaining' effect, where clusters end up being long and uncompact?$

Linkage methods (single, complete, average, Ward) Easy

A.

Single linkage

B.

Average linkage

C.

Ward's method

D.

Complete linkage
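The three distance-based linkage definitions in the questions above can be compared on a toy example. This is a minimal sketch in pure Python; the function name `linkage_distances` and the two small clusters are made up for illustration and do not come from the quiz.

```python
from itertools import product
from math import dist


def linkage_distances(cluster_a, cluster_b):
    """Return single, complete, and average linkage distances between two clusters."""
    # All cross-cluster pairwise Euclidean distances (|A| * |B| of them).
    pairwise = [dist(p, q) for p, q in product(cluster_a, cluster_b)]
    return {
        "single": min(pairwise),                   # closest pair
        "complete": max(pairwise),                 # farthest pair
        "average": sum(pairwise) / len(pairwise),  # mean over all pairs
    }


# Hypothetical clusters on the x-axis.
cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(4.0, 0.0), (6.0, 0.0)]
result = linkage_distances(cluster_a, cluster_b)
print(result)
```

Ward's method is left out of this sketch because it is driven by the increase in within-cluster variance rather than by a summary of pairwise distances.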

9 $What is a dendrogram?$

Dendrogram interpretation Easy

A.

A mathematical formula for distance calculation.

B.

A scatter plot showing density distributions.

C.

A tree-like diagram that records the sequence of merges or splits in hierarchical clustering.

D.

A bar chart of cluster sizes.

10 $How can you determine the final clusters from a dendrogram?$

Dendrogram interpretation Easy

A.

By looking at the x-axis labels only.

B.

By drawing a horizontal line to cut the dendrogram at a desired height.

C.

By calculating the area under the curve.

D.

By measuring the thickness of the lines.

11 $In a standard dendrogram, what does the vertical height (y-axis) of a merge point represent?$

Dendrogram interpretation Easy

A.

The distance or dissimilarity between the two clusters being merged.

B.

The density of the resulting cluster.

C.

The total variance of the dataset.

D.

The number of data points in the merged cluster.

12 $What does the acronym DBSCAN stand for?$

Density-based clustering: DBSCAN fundamentals Easy

A.

Density-Based Spatial Clustering of Applications with Noise

B.

Distance-Based Spatial Clustering of Applications with Noise

C.

Data-Based Sequential Clustering and Normalization

D.

Distribution-Based Spatial Clustering Algorithm for Nodes

13 $Which of the following is a major advantage of DBSCAN?$

Density-based clustering: DBSCAN fundamentals Easy

A.

It forces every single point into a cluster.

B.

It requires the user to specify the number of clusters in advance.

C.

It only works well with linearly separable data.

D.

It can discover clusters of arbitrary shapes.

14 $In DBSCAN, what defines a 'core point'?$

$\varepsilon$-neighborhood, MinPts, Noise and border points Easy

A.

A point that is the furthest from the center of a cluster.

B.

A point that has at least MinPts points within its $\varepsilon$-neighborhood.

C.

A point that belongs to multiple clusters.

D.

A point that has zero neighbors.

15 $How does DBSCAN classify a 'border point'?$

$\varepsilon$-neighborhood, MinPts, Noise and border points Easy

A.

It has fewer than MinPts points within its $\varepsilon$-neighborhood, but falls within the neighborhood of a core point.

B.

It does not belong to any $\varepsilon$-neighborhood.

C.

It is randomly assigned to a cluster boundary.

D.

It has more than MinPts points within its $\varepsilon$-neighborhood.

16 $What is a 'noise point' in the context of DBSCAN?$

$\varepsilon$-neighborhood, MinPts, Noise and border points Easy

A.

A point that is the exact center of a cluster.

B.

A point that connects two distinct clusters.

C.

A point that is neither a core point nor a border point.

D.

A point that satisfies the MinPts condition perfectly.

17 $In DBSCAN, what does the parameter $\varepsilon$ (epsilon) represent?$

$\varepsilon$-neighborhood, MinPts, Noise and border points Easy

A.

The maximum distance (radius) used to define the neighborhood of a point.

B.

The threshold for hierarchical cluster merging.

C.

The minimum number of points required to form a cluster.

D.

The total number of clusters to form.
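The core, border, and noise definitions used in the preceding DBSCAN questions can be written down directly. This is a minimal pure-Python sketch, not a full DBSCAN implementation: it only labels point types and does not build clusters. The function name `classify_points`, the sample points, and the convention that a point counts itself toward MinPts are assumptions made for illustration.

```python
from math import dist


def classify_points(points, eps, min_pts):
    """Label each point 'core', 'border', or 'noise' using the DBSCAN definitions."""
    # eps-neighborhood of every point; the point itself is included (assumed convention).
    neighbors = [
        [j for j, q in enumerate(points) if dist(p, q) <= eps]
        for p in points
    ]
    is_core = [len(nbrs) >= min_pts for nbrs in neighbors]

    labels = []
    for i, nbrs in enumerate(neighbors):
        if is_core[i]:
            labels.append("core")
        elif any(is_core[j] for j in nbrs):
            # Not dense enough itself, but inside some core point's neighborhood.
            labels.append("border")
        else:
            labels.append("noise")
    return labels


# Hypothetical data: a short dense run, one trailing point, one far-away point.
points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (1.9, 0.0), (10.0, 10.0)]
labels = classify_points(points, eps=1.0, min_pts=3)
print(labels)
```

The brute-force neighbor search is quadratic in the number of points; it is used here only to keep the definitions visible.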

18 $Unlike Hierarchical clustering and DBSCAN, what must be provided as an explicit input parameter to the standard k-Means algorithm?$

Comparison: k-Means vs. Hierarchical vs. DBSCAN Easy

A.

The minimum number of points (MinPts)

B.

The number of clusters $k$

C.

The radius $\varepsilon$

D.

The linkage criteria

19 $Which clustering algorithm inherently identifies outliers and explicitly leaves them unclustered?$

Comparison: k-Means vs. Hierarchical vs. DBSCAN Easy

A.

DBSCAN

B.

Agglomerative Hierarchical

C.

k-Means

D.

Ward's Hierarchical

20 $Which clustering method generates a tree-like hierarchy of clusters that does not require an initial assumption about the number of clusters?$

Comparison: k-Means vs. Hierarchical vs. DBSCAN Easy

A.

k-Means

B.

K-Medoids

C.

DBSCAN

D.

Hierarchical clustering

21 $In the context of hierarchical clustering on a dataset with $N$ points, which of the following accurately describes the initial and final states of the Divisive approach?$

Hierarchical clustering: Agglomerative vs. Divisive Medium

A.

Initial state: $N$ clusters; Final state: $N$ clusters containing 1 point each

B.

Initial state: 1 cluster containing $N$ points; Final state: $N$ clusters containing 1 point each

C.

Initial state: $N$ clusters containing 1 point each; Final state: 1 cluster containing $N$ points

D.

Initial state: 1 cluster containing $N$ points; Final state: clusters based on density

22 $Which linkage method defines the distance between two clusters as the maximum distance between any single point in the first cluster and any single point in the second cluster, tending to produce tightly bound, spherical clusters?$

Linkage methods (single, complete, average, Ward) Medium

A.

Complete linkage

B.

Single linkage

C.

Average linkage

D.

Ward's linkage

23 $An analyst notices that their hierarchical clustering results suffer from the "chaining effect," where clusters are stretched out into long, thin bands. Which linkage method was most likely used?$

Linkage methods (single, complete, average, Ward) Medium

A.

Single linkage

B.

Ward's method

C.

Average linkage

D.

Complete linkage

24 $Unlike single, complete, and average linkage, which rely strictly on pairwise distances between points, Ward's method decides which clusters to merge by minimizing what metric?$

Linkage methods (single, complete, average, Ward) Medium

A.

The median distance between all cluster points

B.

The within-cluster sum of squared errors (WCSS)

C.

The between-cluster sum of squared errors

D.

The maximum distance between cluster centroids

25 $When interpreting a dendrogram generated by agglomerative clustering, what does the vertical height (on the y-axis) of a horizontal merge line represent?$

Dendrogram interpretation Medium

A.

The density of the newly formed cluster

B.

The variance explained by the merging of the two clusters

C.

The number of data points in the newly formed cluster

D.

The distance or dissimilarity threshold at which the two clusters were merged

26 $If you draw a horizontal cut line across a dendrogram at a specific height $h$, how is the number of resulting clusters determined?$

Dendrogram interpretation Medium

A.

By calculating the ratio of total height to $h$

B.

By counting the number of vertical lines that intersect the horizontal cut line

C.

By summing the data points below height $h$

D.

By counting the number of horizontal merge lines exactly at height $h$
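Cutting a dendrogram at a given height can be illustrated without plotting. This is a minimal sketch, assuming 1-D data and a naive single-linkage merge loop; the names `closest_pair`, `single_linkage_merges`, and `clusters_at_cut` are hypothetical. Each merge at or below the cut height removes one cluster, so the count matches the number of vertical lines a horizontal cut would cross.

```python
from itertools import combinations


def closest_pair(clusters):
    """Return (distance, i, j) for the two clusters with the smallest single-linkage distance."""
    best = None
    for i, j in combinations(range(len(clusters)), 2):
        d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
        if best is None or d < best[0]:
            best = (d, i, j)
    return best


def single_linkage_merges(values):
    """Naive agglomerative clustering of 1-D values; returns the merge heights in order."""
    clusters = [[v] for v in values]
    heights = []
    while len(clusters) > 1:
        d, i, j = closest_pair(clusters)
        clusters[i] = clusters[i] + clusters[j]  # merge j into i (i < j always holds)
        del clusters[j]
        heights.append(d)
    return heights


def clusters_at_cut(heights, h):
    """Number of clusters obtained by cutting the dendrogram at height h."""
    merges_below = sum(1 for m in heights if m <= h)
    return len(heights) + 1 - merges_below


heights = single_linkage_merges([1.0, 2.0, 9.0, 10.0, 25.0])
print(heights, clusters_at_cut(heights, 3.0))
```

The counting shortcut in `clusters_at_cut` relies on merge heights never decreasing, which holds for single linkage but not for linkages that can produce inversions.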

27 $In the DBSCAN algorithm, what is the primary condition for a data point to be classified as a "core point"?$

Density-based clustering: DBSCAN fundamentals Medium

A.

It must be exactly in the geometric center of a cluster.

B.

It must have a distance less than $\varepsilon$ to all other points in the dataset.

C.

It must have at least MinPts points (including itself) within its $\varepsilon$-neighborhood.

D.

It must be reachable from another core point and have fewer than MinPts neighbors.

28 $Consider a point $p$ in a dataset analyzed via DBSCAN. Point $p$ has fewer than MinPts points in its $\varepsilon$-neighborhood, but it falls within the $\varepsilon$-neighborhood of a core point $q$. How will DBSCAN classify point $p$?$

Noise and border points Medium

A.

As an outlier

B.

As a noise point

C.

As a border point

D.

As a core point

29 $If DBSCAN finishes running and a point is neither a core point nor reachable from any core point in the dataset, what is its final designation?$

Noise and border points Medium

A.

Noise point

B.

Border point

C.

Centroid

D.

Isolated core point

30 $If you decrease the value of $\varepsilon$ (epsilon) while keeping MinPts constant in DBSCAN, what is the most likely effect on the clustering output?$

$\varepsilon$-neighborhood, MinPts Medium

A.

Clusters will merge together into larger, single clusters.

B.

Fewer points will be classified as noise.

C.

More points will be classified as noise, and existing clusters may split.

D.

The number of core points will increase.

31 $You are given a dataset containing two concentric ring-shaped clusters. Which clustering algorithm is best suited to correctly identify these two non-linear clusters?$

Comparison: k-Means vs. Hierarchical vs. DBSCAN Medium

A.

Agglomerative clustering with Ward's linkage

B.

Divisive clustering with complete linkage

C.

k-Means clustering

D.

DBSCAN

32 $Which of the following statements accurately compares k-Means and DBSCAN regarding their handling of outliers?$

Comparison: k-Means vs. Hierarchical vs. DBSCAN Medium

A.

Both algorithms assign a specific "noise" label to outliers.

B.

Both algorithms are highly robust to outliers because they use median distances.

C.

DBSCAN incorporates outliers into the nearest border points, whereas k-Means drops them from the dataset.

D.

k-Means forces all points into clusters, shifting centroids due to outliers, whereas DBSCAN ignores isolated points as noise.

33 $When comparing the time complexity of clustering algorithms for a large dataset of $N$ points, which of the following is generally true?$

Comparison: k-Means vs. Hierarchical vs. DBSCAN Medium

A.

Standard Agglomerative Hierarchical clustering is generally faster than k-Means.

B.

DBSCAN is always the slowest algorithm regardless of indexing.

C.

k-Means is generally faster than Standard Agglomerative Hierarchical clustering.

D.

Divisive clustering is always faster than k-Means because it splits data linearly.

34 $In DBSCAN, what happens to the clustering model if MinPts is set to 1?$

$\varepsilon$-neighborhood, MinPts Medium

A.

The algorithm will behave exactly like k-Means.

B.

Every point will be classified as a core point, and points within $\varepsilon$ of each other will form clusters.

C.

Every point will be classified as noise regardless of the $\varepsilon$ value.

D.

The algorithm will fail to execute due to a division by zero error.

35 $Which of the following is a major drawback of standard Agglomerative Hierarchical Clustering?$

Hierarchical clustering: Agglomerative vs. Divisive Medium

A.

It requires the user to specify the number of clusters before running the algorithm.

B.

Once a merge is performed, it cannot be undone in subsequent steps.

C.

It can only identify spherical clusters.

D.

It randomly initializes centroids, leading to non-deterministic results.

36 $Which algorithm does NOT require the user to explicitly declare the desired number of clusters beforehand, but instead infers the number of clusters from the data's properties?$

Comparison: k-Means vs. Hierarchical vs. DBSCAN Medium

A.

k-Medoids

B.

k-Means

C.

DBSCAN

D.

Spectral Clustering

37 $You observe a dendrogram where the longest vertical branches without any horizontal merges occur between a height of 10 and 25. What does this gap suggest about the dataset?$

Dendrogram interpretation Medium

A.

The algorithm encountered a local minimum between heights 10 and 25.

B.

The data has a high amount of noise points.

C.

The clusters formed above height 25 are completely identical to those below height 10.

D.

Cutting the dendrogram anywhere between 10 and 25 will yield a highly stable and natural clustering partition.

38 $In the formal definition of DBSCAN, what is the relationship between "Directly Density-Reachable" and "Density-Reachable"?$

Density-based clustering: DBSCAN fundamentals Medium

A.

They are synonymous terms describing identical spatial relationships.

B.

Directly density-reachable applies only to noise points, while density-reachable applies to core points.

C.

Density-reachable is the transitive closure of directly density-reachable.

D.

Density-reachable is a symmetric relationship, while directly density-reachable is not.

39 $Suppose you are computing the distance between Cluster A (containing 3 points) and Cluster B (containing 4 points) using Average Linkage. How many pairwise distance calculations are averaged to find the distance between A and B?$

Linkage methods (single, complete, average, Ward) Medium

A.

1

B.

2

C.

12

D.

7

40 $A border point $p$ in DBSCAN is within the $\varepsilon$-neighborhood of two different core points, $q_1$ and $q_2$, which belong to two entirely separate clusters. How does DBSCAN handle the assignment of point $p$?$

Noise and border points Medium

A.

It duplicates $p$, putting one copy in $q_1$'s cluster and one in $q_2$'s cluster.

B.

It assigns $p$ to whichever cluster's core point discovers it first.

C.

It classifies $p$ as a noise point because it cannot break the tie.

D.

It assigns $p$ to both clusters simultaneously, creating an overlapping clustering.

41 $An exact divisive hierarchical clustering algorithm on a dataset of $N$ points requires evaluating all possible two-way splits (bipartitions) of a cluster at each step. What is the worst-case time complexity of creating the first split in this exact divisive approach, assuming no heuristic optimizations like DIANA are used?$

Hierarchical clustering: Agglomerative vs. Divisive Hard

A.

B.

C.

D.
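To get a feel for why the exact first split is expensive, the candidate splits can be counted by brute force. This is a minimal sketch; the function name `count_bipartitions` is hypothetical. It enumerates every way to divide `n` labelled points into two non-empty, unordered groups, which should agree with the closed form $2^{n-1} - 1$.

```python
from itertools import combinations


def count_bipartitions(n):
    """Count ways to split n labelled points into two non-empty, unordered groups."""
    # Pin point 0 to the first group so each unordered split is counted exactly once.
    rest = list(range(1, n))
    count = 0
    # The first group takes point 0 plus any subset of the rest, except all of it
    # (the second group must stay non-empty).
    for size in range(len(rest)):
        count += sum(1 for _ in combinations(rest, size))
    return count


for n in (2, 3, 4, 10):
    print(n, count_bipartitions(n), 2 ** (n - 1) - 1)
```

Each additional point doubles the number of candidate splits (plus one), so exhaustive evaluation quickly becomes impractical.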

42 $Ward's linkage method aims to minimize the total within-cluster variance. When deciding to merge clusters $A$ and $B$ with centroids $\mu_A$ and $\mu_B$, the increase in the Sum of Squared Errors (SSE), denoted as $\Delta(A, B)$, is proportional to the squared Euclidean distance between their centroids. Which of the following defines the exact cost $\Delta(A, B)$?$

Linkage methods (single, complete, average, Ward) Hard

A.

B.

C.

D.
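The merge cost in Ward's method can be checked numerically. This is a minimal sketch that compares the directly computed increase in SSE with the standard identity $\frac{|A|\,|B|}{|A| + |B|}\,\lVert \mu_A - \mu_B \rVert^2$. The cluster names `a` and `b`, the helper names, and the sample points are assumptions for illustration.

```python
def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def sse(points):
    """Sum of squared Euclidean distances from each point to the cluster centroid."""
    mu = centroid(points)
    return sum(sum((x - m) ** 2 for x, m in zip(p, mu)) for p in points)


def ward_cost(a, b):
    """Closed-form SSE increase from merging clusters a and b."""
    mu_a, mu_b = centroid(a), centroid(b)
    squared_gap = sum((x - y) ** 2 for x, y in zip(mu_a, mu_b))
    return len(a) * len(b) / (len(a) + len(b)) * squared_gap


# Hypothetical clusters.
a = [(0.0, 0.0), (2.0, 0.0)]
b = [(5.0, 1.0), (7.0, 1.0), (6.0, 4.0)]

direct = sse(a + b) - sse(a) - sse(b)  # increase computed from the definition
formula = ward_cost(a, b)              # increase from the closed form
print(direct, formula)
```

The size-dependent factor is what distinguishes Ward's cost from plain centroid distance: merging two large clusters is penalized more than merging a large cluster with a small one at the same centroid gap.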

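43 $Consider the Lance-Williams update formula: $d(k, i \cup j) = \alpha_i\, d(k, i) + \alpha_j\, d(k, j) + \beta\, d(i, j) + \gamma\, |d(k, i) - d(k, j)|$. For Complete Linkage, what are the specific values of $\alpha_i$, $\alpha_j$, $\beta$, and $\gamma$?$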

Linkage methods (single, complete, average, Ward) Hard

A.

B.

C.

D.
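The Lance-Williams recurrence, in its usual form $d(k, i \cup j) = \alpha_i d(k,i) + \alpha_j d(k,j) + \beta d(i,j) + \gamma |d(k,i) - d(k,j)|$, can be evaluated directly to see which coefficients reproduce a given linkage. This is a minimal sketch; the function names are hypothetical, and the coefficient sets shown are the standard textbook ones for complete and single linkage.

```python
def lance_williams(d_ki, d_kj, d_ij, alpha_i, alpha_j, beta, gamma):
    """Distance from cluster k to the merged cluster (i union j)."""
    return alpha_i * d_ki + alpha_j * d_kj + beta * d_ij + gamma * abs(d_ki - d_kj)


def complete_update(d_ki, d_kj, d_ij):
    # alpha_i = alpha_j = 1/2, beta = 0, gamma = +1/2 yields max(d_ki, d_kj).
    return lance_williams(d_ki, d_kj, d_ij, 0.5, 0.5, 0.0, 0.5)


def single_update(d_ki, d_kj, d_ij):
    # Same alphas and beta, but gamma = -1/2 yields min(d_ki, d_kj).
    return lance_williams(d_ki, d_kj, d_ij, 0.5, 0.5, 0.0, -0.5)


print(complete_update(3.0, 7.0, 2.0), single_update(3.0, 7.0, 2.0))
```

The identity behind both cases is $\max(a, b) = \tfrac{1}{2}(a + b) + \tfrac{1}{2}|a - b|$, with the sign of the last term flipped for the minimum.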

44 $Single Linkage clustering can be directly derived from a Minimum Spanning Tree (MST) of the data points. If a dataset has $N$ points and all pairwise distances are distinct, how can one obtain the exact $k$ clusters produced by Single Linkage from the MST?$

Linkage methods (single, complete, average, Ward) Hard

A.

By removing all edges in the MST with weights greater than the median edge weight, recursively $k$ times.

B.

It is impossible; Single Linkage handles graph components differently than Kruskal's or Prim's algorithms.

C.

By removing the $k-1$ edges in the MST that have the largest weights.

D.

By finding the $k$ subtrees with the minimum total edge weight.
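The link between single linkage and the minimum spanning tree can be sketched in a few lines. This is a minimal illustration, assuming 1-D points whose MST edge weights are distinct; the function names are hypothetical and the MST is built with a plain Kruskal loop over all pairs. Dropping the heaviest MST edges and reading off the connected components gives the flat clusters.

```python
from itertools import combinations


def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def mst_edges(values):
    """Kruskal's algorithm on the complete graph over 1-D values."""
    parent = list(range(len(values)))
    edges = sorted(
        (abs(values[i] - values[j]), i, j)
        for i, j in combinations(range(len(values)), 2)
    )
    tree = []
    for w, i, j in edges:
        ri, rj = find(parent, i), find(parent, j)
        if ri != rj:  # edge joins two different components, so keep it
            parent[ri] = rj
            tree.append((w, i, j))
    return tree


def single_linkage_from_mst(values, k):
    """Drop the k-1 heaviest MST edges; the remaining components are the clusters."""
    tree = sorted(mst_edges(values))
    kept = tree[: len(tree) - (k - 1)]
    parent = list(range(len(values)))
    for _, i, j in kept:
        parent[find(parent, i)] = find(parent, j)
    groups = {}
    for idx, v in enumerate(values):
        groups.setdefault(find(parent, idx), []).append(v)
    return sorted(sorted(g) for g in groups.values())


clusters = single_linkage_from_mst([1.0, 2.0, 4.0, 9.0, 12.0, 20.0], k=3)
print(clusters)
```

Building the MST from all pairs is quadratic in the number of points; the sketch favors readability over efficiency.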

45 $You calculate the cophenetic correlation coefficient to evaluate an agglomerative clustering output. Which of the following scenarios would theoretically yield a cophenetic correlation coefficient of strictly $1.0$?$

Dendrogram interpretation Hard

A.

The original distance matrix exactly satisfies the ultrametric inequality for all triplets of points.

B.

The data points lie exactly on a one-dimensional straight line.

C.

Ward's linkage is used on a dataset where all variables follow a standard normal distribution.

D.

The original distance matrix strictly satisfies the triangle inequality for all points.

46 $A researcher is analyzing a dendrogram generated by a hierarchical clustering algorithm. They observe 'reversals' (or 'inversions'), where a parent node merges at a lower distance height than its child nodes. Which of the following explains this anomaly?$

Dendrogram interpretation Hard

A.

The researcher used a linkage method that violates the space-conserving Lance-Williams constraints, such as Centroid or Median linkage.

B.

Ward's linkage was used, causing the objective function to contract heavily in the initial merges.

C.

The distance metric used was non-Euclidean, such as Cosine distance or Manhattan distance.

D.

The data contains highly dense clusters intermixed with uniform noise, causing scaling issues on the y-axis.

47 $In a dendrogram produced by Complete Linkage, a very long vertical line segment with no horizontal merges branching off indicates:$

Dendrogram interpretation Hard

A.

A violation of the ultrametric inequality during the agglomeration steps.

B.

A substantial range of distance thresholds (cutting heights) over which the number of clusters and their compositions remain unchanged.

C.

That the dataset contains extreme outliers that were forced to merge at a localized point.

D.

That the clusters merged at the bottom of the vertical line are highly chained and space-contracting.

48 $In the context of DBSCAN, let $G = (V, E)$ be an undirected graph where vertices are data points. An edge exists between $u$ and $v$ if $u \neq v$ and $d(u, v) \le \varepsilon$. How can the resulting DBSCAN clusters be rigorously defined in graph-theoretic terms?$

Density-based clustering: DBSCAN fundamentals, $\varepsilon$-neighborhood, MinPts Hard

A.

They are the connected components of the subgraph induced strictly by the core points, with border points appended to any adjacent core-point component.

B.

They are the strictly bipartite subgraphs where the two sets are core points and border points respectively.

C.

They correspond to all maximal cliques in $G$ of size at least MinPts.

D.

They are the strongly connected components of the entire graph including noise points.

49 $In DBSCAN, density-reachability and density-connectedness are foundational concepts. Which of the following statements strictly holds regarding their mathematical properties?$

Density-based clustering: DBSCAN fundamentals, $\varepsilon$-neighborhood, MinPts Hard

A.

Neither density-reachability nor density-connectedness exhibit symmetry in datasets containing varying densities.

B.

Both density-reachability and density-connectedness are strictly symmetric and transitive equivalence relations.

C.

Density-reachability is transitive but generally asymmetric, whereas density-connectedness is both symmetric and transitive.

D.

Density-reachability is symmetric but not transitive, whereas density-connectedness is transitive but not symmetric.

50 $Assume a dataset in a 10,000-dimensional space where points are distributed uniformly at random. When applying DBSCAN to this dataset, which of the following phenomena severely hinders its effectiveness due to the Curse of Dimensionality?$

Density-based clustering: DBSCAN fundamentals, $\varepsilon$-neighborhood, MinPts Hard

A.

The density-reachability condition becomes symmetric for all points, collapsing the distinction between core and border points.

B.

The algorithm requires computing $O(N^2)$ distance comparisons, exceeding computational limits.

C.

The distance from a point to its nearest neighbor approaches the distance to its farthest neighbor, making it nearly impossible to distinguish dense regions from sparse regions using a fixed $\varepsilon$.

D.

All points automatically become core points regardless of $\varepsilon$ because the volume of the $\varepsilon$-sphere expands to encompass the entire space exponentially fast.
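Distance concentration in high dimensions is easy to observe empirically. This is a minimal sketch, assuming points drawn uniformly from the unit hypercube; the function name, the sample size of 200, the fixed seed, and the choice of 1,000 dimensions (rather than the 10,000 in the question, to keep the run short) are all illustrative assumptions. It reports the ratio of the nearest-neighbor distance to the farthest-neighbor distance for one query point.

```python
import random
from math import dist


def nearest_to_farthest_ratio(dim, n_points=200, seed=0):
    """Ratio of nearest to farthest neighbor distance for one point of a uniform sample."""
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    query = points[0]
    distances = [dist(query, p) for p in points[1:]]
    return min(distances) / max(distances)


low_dim = nearest_to_farthest_ratio(dim=2)
high_dim = nearest_to_farthest_ratio(dim=1000)
print(f"2-D ratio: {low_dim:.3f}, 1000-D ratio: {high_dim:.3f}")
```

A ratio near 1 means every point is roughly equally far from every other point, so no single neighborhood radius separates dense regions from sparse ones.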

51 $Which of the following is an inherent source of non-determinism in the standard DBSCAN algorithm regarding border points?$

Noise and border points Hard

A.

If a border point's distance to a core point is exactly equal to $\varepsilon$, floating-point instability will randomly assign it to the noise category.

B.

The algorithm randomly selects border points to merge density-reachable clusters, leading to different final cluster counts.

C.

Border points frequently fluctuate between being classified as noise and border points depending on the random initialization of the algorithm's KD-tree.

D.

A border point that falls within the $\varepsilon$-neighborhood of core points belonging to two distinct clusters will be assigned to whichever cluster is processed first.

52 $Consider a dataset clustered by DBSCAN with parameters $\varepsilon$ and MinPts. Point $P$ is identified as a border point. If a single noise point located far away from all clusters is removed from the dataset, and DBSCAN is re-run with the identical parameters, which of the following is guaranteed to be true regarding Point $P$?$

Noise and border points Hard

A.

Point $P$ will still be identified as a border point, or it could potentially become a noise point if the distance matrix recalculation shifts the space.

B.

Point $P$ will turn into a noise point, because DBSCAN relies on global average density to determine relative border distances.

C.

Point $P$ will definitely remain a border point, as the removal of a distant noise point cannot affect the $\varepsilon$-neighborhoods of the core points near $P$.

D.

Point $P$ could be upgraded to a core point because the global density of the dataset decreases.

53 $Let $D$ be a dataset. You run DBSCAN with $\text{MinPts} = 1$ and an arbitrary $\varepsilon > 0$. Which of the following correctly describes the resulting classifications of points?$

Noise and border points Hard

A.

The classification will perfectly mirror Single Linkage clustering cut at height $\varepsilon$, resulting in a mix of core and border points.

B.

All points are classified as border points, and there are no core points.

C.

All points are classified as core points; there are absolutely no border points or noise points.

D.

All points are classified as noise points.

54 $Suppose you have a dataset with $N$ dense data points in $\mathbb{R}^d$. You require a clustering algorithm that can theoretically operate within $O(N)$ or $O(N \log N)$ memory and time limits. Which of the following approaches is most feasible without utilizing severe dataset subsampling?$

Comparison: k-Means vs. Hierarchical vs. DBSCAN Hard

A.

Agglomerative Hierarchical Clustering with Single Linkage, using a generic distance matrix.

B.

Agglomerative Hierarchical Clustering with Ward's linkage, using a Lance-Williams matrix update.

C.

DBSCAN, heavily leveraging a spatial index such as an R-tree or KD-tree.

D.

Divisive Hierarchical Clustering using an exact bipartite split algorithm.

55 $You are tasked with clustering a 2D dataset consisting of a dense inner ring and a sparse, expanding outer ring of points (concentric circles of varying densities). Which of the following algorithms and configurations will fundamentally fail to isolate the two rings as distinct clusters?$

Comparison: k-Means vs. Hierarchical vs. DBSCAN Hard

A.

DBSCAN, because a single fixed $\varepsilon$ cannot simultaneously accommodate the dense inner ring (without merging it into noise) and the sparse outer ring (without classifying it as noise).

B.

k-Means with $k = 2$, assuming the centroids are initialized using k-Means++.

C.

OPTICS (Ordering Points To Identify the Clustering Structure), because it strictly inherits DBSCAN's inability to handle varying densities.

D.

Single Linkage Hierarchical Clustering, because it suffers from the chaining effect and cannot separate non-convex shapes.

56 $Inductive clustering models can readily assign out-of-sample data points to existing clusters without re-running the algorithm on the entire combined dataset. Transductive models generally lack this native capability. Categorize k-Means, DBSCAN, and Standard Agglomerative Hierarchical clustering based on these definitions.$

Comparison: k-Means vs. Hierarchical vs. DBSCAN Hard

A.

All three are inherently Transductive; Inductive extensions require supervised learning.

B.

k-Means is Inductive. DBSCAN and Agglomerative Hierarchical are Transductive.

C.

Agglomerative Hierarchical is Inductive. k-Means and DBSCAN are Transductive.

D.

k-Means and DBSCAN are Inductive. Agglomerative Hierarchical is Transductive.

57 $The MacNaughton-Smith algorithm is a divisive hierarchical approach. Unlike the computationally prohibitive exact divisive method, how does it primarily construct the sequence of splitting?$

Hierarchical clustering: Agglomerative vs. Divisive Hard

A.

By calculating the eigenvectors of the unnormalized Laplacian matrix and splitting at the median.

B.

By repeatedly running k-Means with $k = 2$ and picking the cluster with the highest variance to split next.

C.

By starting with all points in one cluster, selecting the point with the highest average dissimilarity to the rest as a 'splinter group', and iteratively moving points to it.

D.

By recursively removing the longest edge in the dataset's Minimum Spanning Tree.

58 $Consider a dataset clustered effectively by k-Means, DBSCAN, and Complete Linkage Hierarchical Clustering using Euclidean distance. If the dataset undergoes an affine transformation where the x-axis is scaled by a constant factor $c \neq 1$, which algorithm's underlying logic will be fundamentally robust to this transformation, assuming parameters are unadjusted?$

Comparison: k-Means vs. Hierarchical vs. DBSCAN Hard

A.

Complete Linkage Hierarchical, as relative maximal distances are preserved.

B.

k-Means, because the centroids simply stretch along the x-axis.

C.

DBSCAN, as density remains topologically equivalent despite the scaling.

D.

None of the algorithms are robust to non-uniform scaling when using standard Euclidean distance.
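The effect of stretching one axis on Euclidean neighbor relations can be seen on three points. This is a minimal sketch; the points, the scale factor of 10, and the helper names are hypothetical choices for illustration. Because k-Means, DBSCAN, and distance-based linkages all consume these distances, a change in who is nearest to whom can propagate into any of them.

```python
from math import dist


def nearest(points, i):
    """Index of the point closest to points[i] under Euclidean distance."""
    others = (j for j in range(len(points)) if j != i)
    return min(others, key=lambda j: dist(points[i], points[j]))


def scale_x(points, factor):
    """Stretch only the x-coordinate of every point."""
    return [(x * factor, y) for x, y in points]


# Point 0 sits at the origin; point 1 is closer along x, point 2 is farther along y.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
scaled = scale_x(points, 10.0)

print(nearest(points, 0), nearest(scaled, 0))
```

Standardizing features before clustering is the usual way to keep an arbitrary choice of units from dominating the distance.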

59 $A data scientist decides to construct a k-distance graph to determine the optimal $\varepsilon$ for DBSCAN, setting $k = \text{MinPts}$. They plot the sorted k-distances of all points in descending order. If the graph exhibits two distinct, sharp 'knees' (inflection points) separated by a long plateau, what topological characteristic does the dataset likely possess?$

Density-based clustering: DBSCAN fundamentals, $\varepsilon$-neighborhood, MinPts Hard

A.

The dataset suffers from the curse of dimensionality, rendering the distance metric useless.

B.

The dataset consists exclusively of noise points randomly scattered around a single core point.

C.

The dataset is essentially uniformly distributed without any cluster structures.

D.

The dataset contains at least two distinct clusters of significantly different densities.

60 $Average Linkage clustering (UPGMA) avoids the chaining effect of Single Linkage and the extreme sensitivity to outliers of Complete Linkage. What is a strict mathematical requirement for the distance metric between points for Average Linkage to produce an ultrametric tree without reversals?$

Linkage methods (single, complete, average, Ward) Hard

A.

The distance metric must satisfy non-negativity ($d(x, y) \ge 0$) and symmetry ($d(x, y) = d(y, x)$), and UPGMA will always guarantee monotonicity.

B.

There is no requirement; UPGMA is space-conserving and inherently monotonic regardless of the strict metric properties of the underlying dissimilarity matrix.

C.

The distance metric must be Euclidean, as Ward's linkage is the only alternative for non-Euclidean spaces.

D.

The distance metric must be identical to the Pearson correlation coefficient.

Unit 3 - Practice Quiz