Unit 3 - Practice Quiz

INT396 60 Questions

1 Which of the following best describes the agglomerative approach to hierarchical clustering?

Hierarchical clustering: Agglomerative vs. Divisive Easy
A. A density-based approach that identifies core points.
B. A bottom-up approach that starts with each data point as its own cluster.
C. A centroid-based approach that requires specifying $k$ in advance.
D. A top-down approach that starts with all data points in a single cluster.

2 Divisive hierarchical clustering operates by:

Hierarchical clustering: Agglomerative vs. Divisive Easy
A. Starting with all points in one cluster and recursively splitting them.
B. Merging the closest pairs of clusters sequentially.
C. Identifying dense regions separated by sparse regions.
D. Assigning points to the nearest centroid.

3 In the initial step of agglomerative hierarchical clustering on a dataset with $n$ points, how many clusters are there?

Hierarchical clustering: Agglomerative vs. Divisive Easy
A. $1$
B.
C.
D.

4 Which linkage method defines the distance between two clusters as the shortest distance between any two points in the clusters?

Linkage methods (single, complete, average, Ward) Easy
A. Ward's method
B. Average linkage
C. Single linkage
D. Complete linkage

5 Complete linkage is based on which of the following metrics?

Linkage methods (single, complete, average, Ward) Easy
A. The minimum distance between points in two clusters.
B. The increase in within-cluster variance.
C. The average distance of all pairs of points.
D. The maximum distance between any two points in the clusters.

6 Which linkage method aims to minimize the total within-cluster variance when merging two clusters?

Linkage methods (single, complete, average, Ward) Easy
A. Average linkage
B. Single linkage
C. Complete linkage
D. Ward's method

7 What is the primary characteristic of average linkage in hierarchical clustering?

Linkage methods (single, complete, average, Ward) Easy
A. It calculates the average of distances between all pairs of points across two clusters.
B. It uses the closest pair of points.
C. It explicitly minimizes the sum of squared errors.
D. It uses the furthest pair of points.
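
The linkage definitions in questions 4 to 7 can be compared side by side with a small sketch. The toy clusters and function names below are hypothetical, chosen only for illustration; the sketch assumes Euclidean distance and brute-force enumeration of every cross-cluster pair.

```python
from itertools import product
from math import dist

def pairwise(a, b):
    """All cross-cluster distances: one per (point in a, point in b) pair."""
    return [dist(p, q) for p, q in product(a, b)]

def single_linkage(a, b):
    # Shortest distance between any point of a and any point of b.
    return min(pairwise(a, b))

def complete_linkage(a, b):
    # Largest distance between any point of a and any point of b.
    return max(pairwise(a, b))

def average_linkage(a, b):
    # Mean over all len(a) * len(b) cross-cluster distances.
    d = pairwise(a, b)
    return sum(d) / len(d)

# Hypothetical toy clusters lying on the x-axis.
cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(3.0, 0.0), (5.0, 0.0), (7.0, 0.0)]

print(single_linkage(cluster_a, cluster_b))    # 2.0
print(complete_linkage(cluster_a, cluster_b))  # 7.0
print(average_linkage(cluster_a, cluster_b))   # 4.5
```

Ward's method is left out here because it is defined through the change in within-cluster variance rather than through a summary of pairwise distances.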

8 Which linkage method is most prone to the 'chaining' effect, where clusters end up being long and uncompact?

Linkage methods (single, complete, average, Ward) Easy
A. Ward's method
B. Average linkage
C. Single linkage
D. Complete linkage

9 What is a dendrogram?

Dendrogram interpretation Easy
A. A bar chart of cluster sizes.
B. A scatter plot showing density distributions.
C. A mathematical formula for distance calculation.
D. A tree-like diagram that records the sequences of merges or splits in hierarchical clustering.

10 How can you determine the final clusters from a dendrogram?

Dendrogram interpretation Easy
A. By calculating the area under the curve.
B. By measuring the thickness of the lines.
C. By looking at the x-axis labels only.
D. By drawing a horizontal line to cut the dendrogram at a desired height.

11 In a standard dendrogram, what does the vertical height (y-axis) of a merge point represent?

Dendrogram interpretation Easy
A. The total variance of the dataset.
B. The density of the resulting cluster.
C. The number of data points in the merged cluster.
D. The distance or dissimilarity between the two clusters being merged.

12 What does the acronym DBSCAN stand for?

Density-based clustering: DBSCAN fundamentals Easy
A. Density-Based Spatial Clustering of Applications with Noise
B. Distribution-Based Spatial Clustering Algorithm for Nodes
C. Distance-Based Spatial Clustering of Applications with Noise
D. Data-Based Sequential Clustering and Normalization

13 Which of the following is a major advantage of DBSCAN?

Density-based clustering: DBSCAN fundamentals Easy
A. It can discover clusters of arbitrary shapes.
B. It requires the user to specify the number of clusters in advance.
C. It forces every single point into a cluster.
D. It only works well with linearly separable data.

14 In DBSCAN, what defines a 'core point'?

e-neighborhood, MinPts, Noise and border points Easy
A. A point that has zero neighbors.
B. A point that belongs to multiple clusters.
C. A point that has at least MinPts within its $\varepsilon$-neighborhood.
D. A point that is the furthest from the center of a cluster.

15 How does DBSCAN classify a 'border point'?

e-neighborhood, MinPts, Noise and border points Easy
A. It does not belong to any $\varepsilon$-neighborhood.
B. It has more than MinPts within its $\varepsilon$-neighborhood.
C. It is randomly assigned to a cluster boundary.
D. It has fewer than MinPts within its $\varepsilon$-neighborhood, but falls within the neighborhood of a core point.

16 What is a 'noise point' in the context of DBSCAN?

e-neighborhood, MinPts, Noise and border points Easy
A. A point that is neither a core point nor a border point.
B. A point that is the exact center of a cluster.
C. A point that connects two distinct clusters.
D. A point that satisfies the MinPts condition perfectly.

17 In DBSCAN, what does the parameter $\varepsilon$ (epsilon) represent?

e-neighborhood, MinPts, Noise and border points Easy
A. The threshold for hierarchical cluster merging.
B. The maximum distance (radius) used to define the neighborhood of a point.
C. The total number of clusters to form.
D. The minimum number of points required to form a cluster.
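
The terms used in questions 14 to 17 can be tried out with a brute-force sketch. It is not a full DBSCAN implementation: it only labels points and does not form clusters. It assumes Euclidean distance and that a point counts as a member of its own neighborhood; the function name and the sample data are hypothetical.

```python
from math import dist

def classify_points(points, eps, min_pts):
    """Label each point 'core', 'border', or 'noise'.

    Assumption: a point's eps-neighborhood includes the point itself.
    """
    # Indices of all points within eps of each point (O(n^2) brute force).
    neighbors = [
        [j for j, q in enumerate(points) if dist(p, q) <= eps]
        for p in points
    ]
    core = {i for i, nb in enumerate(neighbors) if len(nb) >= min_pts}

    labels = []
    for i, nb in enumerate(neighbors):
        if i in core:
            labels.append("core")
        elif any(j in core for j in nb):
            labels.append("border")   # not dense itself, but near a core point
        else:
            labels.append("noise")    # neither core nor near any core point
    return labels

# Hypothetical data: a short chain of points plus one far-away point.
points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (1.9, 0.0), (10.0, 10.0)]
print(classify_points(points, eps=1.0, min_pts=3))
# ['core', 'core', 'core', 'border', 'noise']
```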

18 Unlike Hierarchical clustering and DBSCAN, what must be provided as an explicit input parameter to the standard k-Means algorithm?

Comparison: k-Means vs. Hierarchical vs. DBSCAN Easy
A. The linkage criteria
B. The radius $\varepsilon$
C. The number of clusters $k$
D. The minimum number of points (MinPts)

19 Which clustering algorithm inherently identifies outliers and explicitly leaves them unclustered?

Comparison: k-Means vs. Hierarchical vs. DBSCAN Easy
A. k-Means
B. Ward's Hierarchical
C. DBSCAN
D. Agglomerative Hierarchical

20 Which clustering method generates a tree-like hierarchy of clusters that does not require an initial assumption about the number of clusters?

Comparison: k-Means vs. Hierarchical vs. DBSCAN Easy
A. k-Means
B. Hierarchical clustering
C. K-Medoids
D. DBSCAN

21 In the context of hierarchical clustering on a dataset with $n$ points, which of the following accurately describes the initial and final states of the Divisive approach?

Hierarchical clustering: Agglomerative vs. Divisive Medium
A. Initial state: $n$ clusters containing 1 point each; Final state: 1 cluster containing $n$ points
B. Initial state: $k$ clusters; Final state: $n$ clusters containing 1 point each
C. Initial state: 1 cluster containing $n$ points; Final state: $n$ clusters containing 1 point each
D. Initial state: 1 cluster containing $n$ points; Final state: $k$ clusters based on density

22 Which linkage method defines the distance between two clusters as the maximum distance between any single point in the first cluster and any single point in the second cluster, tending to produce tightly bound, spherical clusters?

Linkage methods (single, complete, average, Ward) Medium
A. Average linkage
B. Single linkage
C. Ward's linkage
D. Complete linkage

23 An analyst notices that their hierarchical clustering results suffer from the "chaining effect," where clusters are stretched out into long, thin bands. Which linkage method was most likely used?

Linkage methods (single, complete, average, Ward) Medium
A. Ward's method
B. Complete linkage
C. Average linkage
D. Single linkage

24 Unlike single, complete, and average linkage which rely strictly on pairwise distances between points, Ward's method decides which clusters to merge by minimizing what metric?

Linkage methods (single, complete, average, Ward) Medium
A. The between-cluster sum of squared errors
B. The maximum distance between cluster centroids
C. The within-cluster sum of squared errors (WCSS)
D. The median distance between all cluster points

25 When interpreting a dendrogram generated by agglomerative clustering, what does the vertical height (on the y-axis) of a horizontal merge line represent?

Dendrogram interpretation Medium
A. The variance explained by the merging of the two clusters
B. The distance or dissimilarity threshold at which the two clusters were merged
C. The number of data points in the newly formed cluster
D. The density of the newly formed cluster

26 If you draw a horizontal cut line across a dendrogram at a specific height $h$, how is the number of resulting clusters determined?

Dendrogram interpretation Medium
A. By summing the data points below height $h$
B. By calculating the ratio of total height to $h$
C. By counting the number of vertical lines that intersect the horizontal cut line
D. By counting the number of horizontal merge lines exactly at height $h$
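
Cutting a dendrogram can also be expressed arithmetically. The sketch below assumes a monotone dendrogram (no inversions) described only by its list of merge heights; the function name and the heights are hypothetical. Every merge that happens at or below the cut joins two clusters into one, so the cluster count drops by one per such merge.

```python
def clusters_at_cut(n_points, merge_heights, cut):
    """Number of clusters left when a monotone dendrogram is cut at `cut`.

    Assumption: merge_heights holds the n_points - 1 merge heights of an
    agglomerative run with no inversions.
    """
    merges_below = sum(1 for h in merge_heights if h <= cut)
    return n_points - merges_below

# Hypothetical merge heights for a 5-point dataset.
merge_heights = [1.0, 1.5, 4.0, 9.0]

print(clusters_at_cut(5, merge_heights, cut=0.5))   # 5
print(clusters_at_cut(5, merge_heights, cut=2.0))   # 3
print(clusters_at_cut(5, merge_heights, cut=5.0))   # 2
print(clusters_at_cut(5, merge_heights, cut=10.0))  # 1
```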

27 In the DBSCAN algorithm, what is the primary condition for a data point to be classified as a "core point"?

Density-based clustering: DBSCAN fundamentals Medium
A. It must have a distance less than $\varepsilon$ to all other points in the dataset.
B. It must be reachable from another core point and have fewer than MinPts neighbors.
C. It must have at least MinPts points (including itself) within its $\varepsilon$-neighborhood.
D. It must be exactly in the geometric center of a cluster.

28 Consider a point $p$ in a dataset analyzed via DBSCAN. Point $p$ has fewer than MinPts points in its $\varepsilon$-neighborhood, but it falls within the $\varepsilon$-neighborhood of a core point $q$. How will DBSCAN classify point $p$?

Noise and border points Medium
A. As a core point
B. As an outlier
C. As a noise point
D. As a border point

29 If DBSCAN finishes running and a point is neither a core point nor reachable from any core point in the dataset, what is its final designation?

Noise and border points Medium
A. Noise point
B. Border point
C. Centroid
D. Isolated core point

30 If you decrease the value of $\varepsilon$ (epsilon) while keeping MinPts constant in DBSCAN, what is the most likely effect on the clustering output?

e-neighborhood, MinPts Medium
A. Clusters will merge together into larger, single clusters.
B. More points will be classified as noise, and existing clusters may split.
C. Fewer points will be classified as noise.
D. The number of core points will increase.

31 You are given a dataset containing two concentric ring-shaped clusters. Which clustering algorithm is best suited to correctly identify these two non-linear clusters?

Comparison: k-Means vs. Hierarchical vs. DBSCAN Medium
A. Divisive clustering with complete linkage
B. Agglomerative clustering with Ward's linkage
C. k-Means clustering
D. DBSCAN

32 Which of the following statements accurately compares k-Means and DBSCAN regarding their handling of outliers?

Comparison: k-Means vs. Hierarchical vs. DBSCAN Medium
A. DBSCAN incorporates outliers into the nearest border points, whereas k-Means drops them from the dataset.
B. Both algorithms are highly robust to outliers because they use median distances.
C. k-Means forces all points into clusters, shifting centroids due to outliers, whereas DBSCAN ignores isolated points as noise.
D. Both algorithms assign a specific "noise" label to outliers.

33 When comparing the time complexity of clustering algorithms for a large dataset of $n$ points, which of the following is generally true?

Comparison: k-Means vs. Hierarchical vs. DBSCAN Medium
A. k-Means is generally faster than Standard Agglomerative Hierarchical clustering.
B. Divisive clustering is always faster than k-Means because it splits data linearly.
C. Standard Agglomerative Hierarchical clustering is generally faster than k-Means.
D. DBSCAN is always the slowest algorithm regardless of indexing.

34 In DBSCAN, what happens to the clustering model if MinPts is set to 1?

e-neighborhood, MinPts Medium
A. Every point will be classified as a core point, and points within $\varepsilon$ of each other will form clusters.
B. Every point will be classified as noise regardless of the $\varepsilon$ value.
C. The algorithm will behave exactly like k-Means.
D. The algorithm will fail to execute due to a division by zero error.

35 Which of the following is a major drawback of standard Agglomerative Hierarchical Clustering?

Hierarchical clustering: Agglomerative vs. Divisive Medium
A. Once a merge is performed, it cannot be undone in subsequent steps.
B. It can only identify spherical clusters.
C. It randomly initializes centroids, leading to non-deterministic results.
D. It requires the user to specify the number of clusters before running the algorithm.

36 Which algorithm does NOT require the user to explicitly declare the desired number of clusters beforehand, but instead infers the number of clusters from the data's properties?

Comparison: k-Means vs. Hierarchical vs. DBSCAN Medium
A. DBSCAN
B. k-Means
C. k-Medoids
D. Spectral Clustering

37 You observe a dendrogram where the longest vertical branches without any horizontal merges occur between a height of 10 and 25. What does this gap suggest about the dataset?

Dendrogram interpretation Medium
A. The algorithm encountered a local minimum between heights 10 and 25.
B. The clusters formed above height 25 are completely identical to those below height 10.
C. The data has a high amount of noise points.
D. Cutting the dendrogram anywhere between 10 and 25 will yield a highly stable and natural clustering partition.

38 In the formal definition of DBSCAN, what is the relationship between "Directly Density-Reachable" and "Density-Reachable"?

Density-based clustering: DBSCAN fundamentals Medium
A. Directly density-reachable applies only to noise points, while density-reachable applies to core points.
B. They are synonymous terms describing identical spatial relationships.
C. Density-reachable is the transitive closure of directly density-reachable.
D. Density-reachable is a symmetric relationship, while directly density-reachable is not.

39 Suppose you are computing the distance between Cluster A (containing 3 points) and Cluster B (containing 4 points) using Average Linkage. How many pairwise distance calculations are averaged to find the distance between A and B?

Linkage methods (single, complete, average, Ward) Medium
A. 7
B. 12
C. 1
D. 2

40 A border point $p$ in DBSCAN is within the $\varepsilon$-neighborhood of two different core points, $q_1$ and $q_2$, which belong to two entirely separate clusters. How does DBSCAN handle the assignment of point $p$?

Noise and border points Medium
A. It assigns $p$ to both clusters simultaneously, creating an overlapping clustering.
B. It assigns $p$ to whichever cluster's core point discovers it first.
C. It classifies $p$ as a noise point because it cannot break the tie.
D. It duplicates $p$, putting one copy in $q_1$'s cluster and one in $q_2$'s cluster.

41 An exact divisive hierarchical clustering algorithm on a dataset of $n$ points requires evaluating all possible bipartite splits of a cluster at each step. What is the worst-case time complexity of creating the first split in this exact divisive approach, assuming no heuristic optimizations like DIANA are used?

Hierarchical clustering: Agglomerative vs. Divisive Hard
A.
B.
C.
D.

42 Ward's linkage method aims to minimize the total within-cluster variance. When deciding to merge clusters $A$ and $B$ with centroids $\mu_A$ and $\mu_B$, the increase in the Sum of Squared Errors (SSE), denoted as $\Delta(A, B)$, is proportional to the squared Euclidean distance between their centroids. Which of the following defines the exact cost $\Delta(A, B)$?

Linkage methods (single, complete, average, Ward) Hard
A.
B.
C.
D.
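
The relationship described in question 42 can be checked numerically. The sketch below uses the standard identity for Ward's merge cost, $\frac{|A||B|}{|A|+|B|}\lVert\mu_A-\mu_B\rVert^2$, and compares it with the directly computed increase in SSE. The notation, function names, and sample clusters are assumptions made for illustration, not taken from the quiz options.

```python
def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def sse(points):
    """Sum of squared Euclidean distances from each point to the centroid."""
    mu = centroid(points)
    return sum(sum((x - m) ** 2 for x, m in zip(p, mu)) for p in points)

def ward_cost(a, b):
    """Closed-form SSE increase: |A||B| / (|A|+|B|) * ||mu_A - mu_B||^2."""
    mu_a, mu_b = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(mu_a, mu_b))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist

# Hypothetical clusters.
a = [(0.0, 0.0), (2.0, 0.0)]
b = [(5.0, 1.0), (7.0, 1.0), (6.0, 4.0)]

# Direct computation: SSE after the merge minus SSE of the two parts.
increase = sse(a + b) - sse(a) - sse(b)
print(abs(increase - ward_cost(a, b)) < 1e-9)  # True
```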

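43 Consider the Lance-Williams update formula: $d(k, i \cup j) = \alpha_i\, d(k, i) + \alpha_j\, d(k, j) + \beta\, d(i, j) + \gamma\, |d(k, i) - d(k, j)|$. For Complete Linkage, what are the specific values of $\alpha_i$, $\alpha_j$, $\beta$, and $\gamma$?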

Linkage methods (single, complete, average, Ward) Hard
A.
B.
C.
D.
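
A quick way to get a feel for the Lance-Williams recurrence is to evaluate it with the textbook coefficients for two linkages. The sketch assumes the common form $d(k, i \cup j) = \alpha_i d(k,i) + \alpha_j d(k,j) + \beta d(i,j) + \gamma |d(k,i) - d(k,j)|$ and the usual coefficient sets for complete and single linkage; the function name and the three distances are hypothetical.

```python
def lance_williams(d_ki, d_kj, d_ij, alpha_i, alpha_j, beta, gamma):
    """Distance from cluster k to the merged cluster (i U j)."""
    return (alpha_i * d_ki + alpha_j * d_kj
            + beta * d_ij + gamma * abs(d_ki - d_kj))

# Hypothetical distances before clusters i and j are merged.
d_ki, d_kj, d_ij = 4.0, 7.0, 3.0

# Complete linkage: alpha_i = alpha_j = 1/2, beta = 0, gamma = +1/2,
# which reduces to max(d_ki, d_kj).
complete = lance_williams(d_ki, d_kj, d_ij, 0.5, 0.5, 0.0, 0.5)

# Single linkage: same alphas and beta, gamma = -1/2,
# which reduces to min(d_ki, d_kj).
single = lance_williams(d_ki, d_kj, d_ij, 0.5, 0.5, 0.0, -0.5)

print(complete, single)  # 7.0 4.0
```

The identity behind both cases is $\tfrac{1}{2}(a + b) \pm \tfrac{1}{2}|a - b| = \max(a, b)$ or $\min(a, b)$.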

44 Single Linkage clustering can be directly derived from a Minimum Spanning Tree (MST) of the data points. If a dataset has $n$ points and all pairwise distances are distinct, how can one obtain the exact $k$ clusters produced by Single Linkage from the MST?

Linkage methods (single, complete, average, Ward) Hard
A. By finding the $k$ subtrees with the minimum total edge weight.
B. By removing all edges in the MST with weights greater than the median edge weight, recursively $k$ times.
C. By removing the $k - 1$ edges in the MST that have the largest weights.
D. It is impossible; Single Linkage handles graph components differently than Kruskal's or Prim's algorithms.
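
The MST connection in question 44 can be explored with a small sketch: build the MST with Kruskal's algorithm, discard the heaviest edges, and read the clusters off the remaining connected components. It assumes Euclidean distance, distinct pairwise distances, and a brute-force edge list; all function names and the sample points are hypothetical.

```python
from itertools import combinations
from math import dist

def find(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def mst_edges(points):
    """Kruskal's algorithm; returns the MST as (weight, i, j) tuples."""
    parent = list(range(len(points)))
    edges = sorted((dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(len(points)), 2))
    tree = []
    for w, i, j in edges:
        ri, rj = find(parent, i), find(parent, j)
        if ri != rj:              # edge joins two different components
            parent[ri] = rj
            tree.append((w, i, j))
    return tree

def single_linkage_clusters(points, k):
    """Drop the k-1 heaviest MST edges; the components left are the clusters."""
    kept = sorted(mst_edges(points))[: len(points) - k]
    parent = list(range(len(points)))
    for _, i, j in kept:
        parent[find(parent, i)] = find(parent, j)
    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(parent, i), []).append(i)
    return sorted(groups.values())

# Hypothetical 1-D points with distinct pairwise distances.
points = [(0.0,), (1.0,), (2.5,), (10.0,), (11.8,), (30.0,)]
print(single_linkage_clusters(points, k=3))  # [[0, 1, 2], [3, 4], [5]]
```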

45 You calculate the cophenetic correlation coefficient to evaluate an agglomerative clustering output. Which of the following scenarios would theoretically yield a cophenetic correlation coefficient of strictly $1.0$?

Dendrogram interpretation Hard
A. The original distance matrix exactly satisfies the ultrametric inequality for all triplets of points.
B. The data points lie exactly on a one-dimensional straight line.
C. The original distance matrix strictly satisfies the triangle inequality for all points.
D. Ward's linkage is used on a dataset where all variables follow a standard normal distribution.

46 A researcher is analyzing a dendrogram generated by a hierarchical clustering algorithm. They observe 'reversals' (or 'inversions'), where a parent node merges at a lower distance height than its child nodes. Which of the following explains this anomaly?

Dendrogram interpretation Hard
A. Ward's linkage was used, causing the objective function to contract heavily in the initial merges.
B. The data contains highly dense clusters intermixed with uniform noise, causing scaling issues on the y-axis.
C. The distance metric used was non-Euclidean, such as Cosine distance or Manhattan distance.
D. The researcher used a linkage method that violates the space-conserving Lance-Williams constraints, such as Centroid or Median linkage.

47 In a dendrogram produced by Complete Linkage, a very long vertical line segment with no horizontal merges branching off indicates:

Dendrogram interpretation Hard
A. That the clusters merged at the bottom of the vertical line are highly chained and space-contracting.
B. A substantial range of distance thresholds (cutting heights) over which the number of clusters and their compositions remain unchanged.
C. A violation of the ultrametric inequality during the agglomeration steps.
D. That the dataset contains extreme outliers that were forced to merge at a localized point.

48 In the context of DBSCAN, let $G$ be an undirected graph where vertices are data points. An edge exists between $p$ and $q$ if $d(p, q) \le \varepsilon$ and at least one of $p$, $q$ is a core point. How can the resulting DBSCAN clusters be rigorously defined in graph-theoretic terms?

Density-based clustering: DBSCAN fundamentals, e-neighborhood, MinPts Hard
A. They are the strongly connected components of the entire graph including noise points.
B. They correspond to all maximal cliques in $G$ of size at least MinPts.
C. They are the strictly bipartite subgraphs where the two sets are core points and border points respectively.
D. They are the connected components of the subgraph induced strictly by the core points, with border points appended to any adjacent core-point component.

49 In DBSCAN, density-reachability and density-connectedness are foundational concepts. Which of the following statements strictly holds regarding their mathematical properties?

Density-based clustering: DBSCAN fundamentals, e-neighborhood, MinPts Hard
A. Both density-reachability and density-connectedness are strictly symmetric and transitive equivalence relations.
B. Density-reachability is symmetric but not transitive, whereas density-connectedness is transitive but not symmetric.
C. Neither density-reachability nor density-connectedness exhibit symmetry in datasets containing varying densities.
D. Density-reachability is transitive but generally asymmetric, whereas density-connectedness is both symmetric and transitive.

50 Assume a dataset in a 10,000-dimensional space where points are distributed uniformly at random. When applying DBSCAN to this dataset, which of the following phenomena severely hinders its effectiveness due to the Curse of Dimensionality?

Density-based clustering: DBSCAN fundamentals, e-neighborhood, MinPts Hard
A. The density-reachability condition becomes symmetric for all points, collapsing the distinction between core and border points.
B. The distance from a point to its nearest neighbor approaches the distance to its farthest neighbor, making it nearly impossible to distinguish dense regions from sparse regions using a fixed $\varepsilon$.
C. The algorithm requires computing distance comparisons, exceeding computational limits.
D. All points automatically become core points regardless of MinPts because the volume of the $\varepsilon$-sphere expands to encompass the entire space exponentially fast.
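
The distance behavior that question 50 refers to can be observed empirically. The sketch below draws uniform random points and compares the nearest and farthest distances from one query point in a low and a high dimension. It uses 1,000 dimensions rather than 10,000 to keep the run short; the function name, sample size, and seed are arbitrary assumptions.

```python
import random
from math import dist

def nearest_to_farthest_ratio(dim, n_points=200, seed=0):
    """min distance / max distance from one query point to all the others.

    A ratio close to 1 means nearest and farthest neighbors are almost
    equally far away, so a fixed radius separates little.
    """
    rng = random.Random(seed)
    pts = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    query, rest = pts[0], pts[1:]
    distances = [dist(query, p) for p in rest]
    return min(distances) / max(distances)

low = nearest_to_farthest_ratio(dim=2)
high = nearest_to_farthest_ratio(dim=1000)
print(f"dim=2: {low:.3f}   dim=1000: {high:.3f}")
```

The exact values depend on the seed, but the ratio in the high-dimensional case should sit much closer to 1 than in the two-dimensional case.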

51 Which of the following is an inherent source of non-determinism in the standard DBSCAN algorithm regarding border points?

Noise and border points Hard
A. The algorithm randomly selects border points to merge density-reachable clusters, leading to different final cluster counts.
B. A border point that falls within the $\varepsilon$-neighborhood of core points belonging to two distinct clusters will be assigned to whichever cluster is processed first.
C. If a border point's distance to a core point is exactly equal to $\varepsilon$, floating-point instability will randomly assign it to the noise category.
D. Border points frequently fluctuate between being classified as noise and border points depending on the random initialization of the algorithm's KD-tree.

52 Consider a dataset clustered by DBSCAN with parameters $\varepsilon$ and MinPts. Point $p$ is identified as a border point. If a single noise point located far away from all clusters is removed from the dataset, and DBSCAN is re-run with the identical parameters, which of the following is guaranteed to be true regarding Point $p$?

Noise and border points Hard
A. Point $p$ could be upgraded to a core point because the global density of the dataset decreases.
B. Point $p$ will still be identified as a border point, or it could potentially become a noise point if the distance matrix recalculation shifts the space.
C. Point $p$ will definitely remain a border point, as the removal of a distant noise point cannot affect the $\varepsilon$-neighborhoods of the core points near $p$.
D. Point $p$ will turn into a noise point, because DBSCAN relies on global average density to determine relative border distances.

53 Let $D$ be a dataset. You run DBSCAN with MinPts $= 1$ and an arbitrary $\varepsilon > 0$. Which of the following correctly describes the resulting classifications of points?

Noise and border points Hard
A. The classification will perfectly mirror Single Linkage clustering cut at height $\varepsilon$, resulting in a mix of core and border points.
B. All points are classified as core points; there are absolutely no border points or noise points.
C. All points are classified as noise points.
D. All points are classified as border points, and there are no core points.

54 Suppose you have a dataset with $n$ dense data points in $\mathbb{R}^d$. You require a clustering algorithm that can theoretically operate within $O(n)$ or $O(n \log n)$ memory and time limits. Which of the following approaches is most feasible without utilizing severe dataset subsampling?

Comparison: k-Means vs. Hierarchical vs. DBSCAN Hard
A. Agglomerative Hierarchical Clustering with Ward's linkage, using a Lance-Williams matrix update.
B. Agglomerative Hierarchical Clustering with Single Linkage, using a generic distance matrix.
C. DBSCAN, heavily leveraging a spatial index such as an R-tree or KD-tree.
D. Divisive Hierarchical Clustering using an exact bipartite split algorithm.

55 You are tasked with clustering a 2D dataset consisting of a dense inner ring and a sparse, expanding outer ring of points (concentric circles of varying densities). Which of the following algorithms and configurations will fundamentally fail to isolate the two rings as distinct clusters?

Comparison: k-Means vs. Hierarchical vs. DBSCAN Hard
A. Single Linkage Hierarchical Clustering, because it suffers from the chaining effect and cannot separate non-convex shapes.
B. OPTICS (Ordering Points To Identify the Clustering Structure), because it strictly inherits DBSCAN's inability to handle varying densities.
C. k-Means with $k = 2$, assuming the centroids are initialized using k-Means++.
D. DBSCAN, because a single fixed $\varepsilon$ cannot simultaneously accommodate the dense inner ring (without merging it into noise) and the sparse outer ring (without classifying it as noise).

56 Inductive clustering models can readily assign out-of-sample data points to existing clusters without re-running the algorithm on the entire combined dataset. Transductive models generally lack this native capability. Categorize k-Means, DBSCAN, and Standard Agglomerative Hierarchical clustering based on these definitions.

Comparison: k-Means vs. Hierarchical vs. DBSCAN Hard
A. All three are inherently Transductive; Inductive extensions require supervised learning.
B. k-Means and DBSCAN are Inductive. Agglomerative Hierarchical is Transductive.
C. k-Means is Inductive. DBSCAN and Agglomerative Hierarchical are Transductive.
D. Agglomerative Hierarchical is Inductive. k-Means and DBSCAN are Transductive.

57 The MacNaughton-Smith algorithm is a divisive hierarchical approach. Unlike the computationally prohibitive exact divisive method, how does it primarily construct the sequence of splitting?

Hierarchical clustering: Agglomerative vs. Divisive Hard
A. By calculating the eigenvectors of the unnormalized Laplacian matrix and splitting at the median.
B. By starting with all points in one cluster, selecting the point with the highest average dissimilarity to the rest as a 'splinter group', and iteratively moving points to it.
C. By repeatedly running k-Means with $k = 2$ and picking the cluster with the highest variance to split next.
D. By recursively removing the longest edge in the dataset's Minimum Spanning Tree.

58 Consider a dataset clustered effectively by k-Means, DBSCAN, and Complete Linkage Hierarchical Clustering using Euclidean distance. If the dataset undergoes an affine transformation where the x-axis is scaled by a factor of $c \neq 1$, which algorithm's underlying logic will be fundamentally robust to this transformation, assuming parameters are unadjusted?

Comparison: k-Means vs. Hierarchical vs. DBSCAN Hard
A. Complete Linkage Hierarchical, as relative maximal distances are preserved.
B. DBSCAN, as density remains topologically equivalent despite the scaling.
C. None of the algorithms are robust to non-uniform scaling when using standard Euclidean distance.
D. k-Means, because the centroids simply stretch along the x-axis.

59 A data scientist decides to construct a k-distance graph to determine the optimal $\varepsilon$ for DBSCAN, setting $k = \text{MinPts}$. They plot the sorted k-distances of all points in descending order. If the graph exhibits two distinct, sharp 'knees' (inflection points) separated by a long plateau, what topological characteristic does the dataset likely possess?

Density-based clustering: DBSCAN fundamentals, e-neighborhood, MinPts Hard
A. The dataset contains at least two distinct clusters of significantly different densities.
B. The dataset is essentially uniformly distributed without any cluster structures.
C. The dataset consists exclusively of noise points randomly scattered around a single core point.
D. The dataset suffers from the curse of dimensionality, rendering the distance metric useless.
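
The k-distance graph mentioned in question 59 is built from one number per point: the distance to its k-th nearest neighbor. The sketch below computes those values by brute force and sorts them in descending order, ready for plotting. It assumes Euclidean distance and that a point is not counted as its own neighbor; the function name and the toy data are hypothetical.

```python
from math import dist

def k_distances(points, k):
    """Distance from each point to its k-th nearest neighbor, sorted descending.

    Assumption: a point is not its own neighbor, so k = 1 is the closest
    other point.
    """
    result = []
    for i, p in enumerate(points):
        others = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(others[k - 1])
    return sorted(result, reverse=True)

# Hypothetical 1-D data: a tight group, a looser group, one isolated point.
points = [(0.0,), (0.1,), (0.2,), (0.3,),
          (5.0,), (6.0,), (7.0,), (8.0,),
          (50.0,)]

for rank, d in enumerate(k_distances(points, k=2)):
    print(rank, round(d, 3))
```

Plotting these values against their rank gives the curve whose knees are read off as candidate $\varepsilon$ values.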

60 Average Linkage clustering (UPGMA) avoids the chaining effect of Single Linkage and the extreme sensitivity to outliers of Complete Linkage. What is a strict mathematical requirement for the distance metric between points for Average Linkage to produce an ultrametric tree without reversals?

Linkage methods (single, complete, average, Ward) Hard
A. The distance metric must be Euclidean, as Ward's linkage is the only alternative for non-Euclidean spaces.
B. The distance metric must be identical to the Pearson correlation coefficient.
C. The distance metric must satisfy $d(x, y) \ge 0$ and $d(x, y) = d(y, x)$, and UPGMA will always guarantee monotonicity.
D. There is no requirement; UPGMA is space-conserving and inherently monotonic regardless of the strict metric properties of the underlying dissimilarity matrix.