1
What is the primary characteristic of Unsupervised Learning?
A. The algorithm uses a feedback loop for rewards
B. The algorithm trains on data without labels
C. The algorithm predicts a continuous numerical value
D. The algorithm trains on labeled data
Correct Answer: The algorithm trains on data without labels
Explanation:
Unsupervised learning deals with input data that does not have corresponding output labels, aiming to find hidden structures.
2
Which of the following is a primary goal of clustering?
A. To reduce the noise in a signal
B. To group similar data points together
C. To classify images into predefined categories
D. To predict future values based on past trends
Correct Answer: To group similar data points together
Explanation:
Clustering aims to partition data such that points within a group are similar to each other and different from points in other groups.
3
In the context of K-Means, what does 'K' represent?
A. The dimension of the features
B. The number of clusters
C. The number of iterations
D. The number of data points
Correct Answer: The number of clusters
Explanation:
K represents the pre-defined number of clusters the algorithm attempts to identify in the dataset.
4
What kind of problem is K-Means designed to solve?
A. Clustering
B. Classification
C. Reinforcement Learning
D. Regression
Correct Answer: Clustering
Explanation:
K-Means is a popular unsupervised algorithm used for clustering analysis.
5
What is a 'centroid' in the K-Means algorithm?
A. An outlier in the dataset
B. The boundary line between clusters
C. The data point furthest from the center
D. The geometric center of a cluster
Correct Answer: The geometric center of a cluster
Explanation:
A centroid represents the mean position of all the data points belonging to a specific cluster.
6
Which distance metric is most commonly used in standard K-Means?
A. Manhattan distance
B. Hamming distance
C. Euclidean distance
D. Cosine similarity
Correct Answer: Euclidean distance
Explanation:
Standard K-Means typically minimizes the within-cluster sum of squared Euclidean distances.
7
What is the first step of the K-Means algorithm?
A. Initialize cluster centroids
B. Assign points to the nearest cluster
C. Calculate the total error
D. Update the centroids
Correct Answer: Initialize cluster centroids
Explanation:
The algorithm begins by selecting K initial centroids, either randomly or using specific heuristics.
8
During the assignment step of K-Means, how is a data point assigned to a cluster?
A. Randomly
B. To the cluster with the highest variance
C. To the cluster with the closest centroid
D. To the cluster with the most points
Correct Answer: To the cluster with the closest centroid
Explanation:
Each data point is assigned to the cluster whose centroid is nearest to it based on the distance metric.
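The assignment step can be sketched as follows. This is a minimal illustration, assuming points and centroids are tuples of floats and the metric is squared Euclidean distance; the helper names `sq_dist` and `assign` are hypothetical and not taken from any library.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
        for p in points
    ]

# Illustrative toy data: two points near each centroid.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
centroids = [(1.0, 1.5), (8.5, 9.0)]
labels = assign(points, centroids)
print(labels)  # [0, 0, 1, 1]
```

Ties are broken in favor of the lower centroid index here, which is one arbitrary but deterministic choice.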
9
What happens during the update step of the K-Means algorithm?
A. Centroids are moved to the mean of their assigned points
B. The number of clusters (K) is increased
C. Points are reassigned to different clusters
D. New data points are added
Correct Answer: Centroids are moved to the mean of their assigned points
Explanation:
After points are assigned, the new position of the centroid is calculated as the average (mean) of all points currently in that cluster.
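A minimal sketch of the update step, assuming each point already carries a cluster label. The function name `update` and the rule that an empty cluster keeps its old centroid are illustrative assumptions, not a description of any particular library.

```python
def update(points, labels, k, old_centroids):
    """Move each centroid to the mean of the points assigned to it.

    Assumption for this sketch: a cluster with no points keeps its old centroid.
    """
    new_centroids = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            n = len(members)
            # Average each coordinate over the cluster's members.
            new_centroids.append(tuple(sum(c) / n for c in zip(*members)))
        else:
            new_centroids.append(old_centroids[j])
    return new_centroids

points = [(1.0, 1.0), (1.0, 3.0), (8.0, 8.0), (10.0, 8.0)]
labels = [0, 0, 1, 1]
print(update(points, labels, 2, [(0.0, 0.0), (5.0, 5.0)]))
# [(1.0, 2.0), (9.0, 8.0)]
```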
10
When does the K-Means algorithm stop iterating?
A. When the training error is zero
B. When the centroids do not change significantly
C. When K is equal to N
D. After exactly 10 iterations
Correct Answer: When the centroids do not change significantly
Explanation:
Convergence is reached when centroids stabilize (don't move) or point assignments stop changing.
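The stopping rule can be seen in a small end-to-end sketch that alternates assignment and update until the labels stop changing. It assumes tuple-valued points, squared Euclidean distance, and caller-supplied starting centroids; the function name `lloyd` and the `max_iter` safety cap are illustrative choices.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def lloyd(points, init_centroids, max_iter=100):
    """Alternate assignment and update until the labels stop changing."""
    centroids = list(init_centroids)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
            for p in points
        ]
        if new_labels == labels:   # assignments unchanged, so centroids are too
            break
        labels = new_labels
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:            # an empty cluster keeps its old centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels

points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
# Deliberately poor start: both initial centroids sit in the same blob.
centroids, labels = lloyd(points, [(1.0, 1.0), (2.0, 1.0)])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Even from this poor start the loop recovers the two blobs on this toy data, but that is a property of well-separated data and not a general guarantee.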
11
What is the optimization objective (cost function) of K-Means?
A. Maximize Inter-cluster distance
B. Minimize Within-Cluster Sum of Squares (WCSS)
C. Maximize the Silhouette score
D. Minimize the number of clusters
Correct Answer: Minimize Within-Cluster Sum of Squares (WCSS)
Explanation:
K-Means tries to minimize the sum of squared distances between data points and their respective cluster centroids, a quantity also called inertia.
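As a small worked example, the cost can be computed directly from labels and centroids. The sketch assumes squared Euclidean distance; the names `sq_dist` and `wcss` are illustrative.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def wcss(points, labels, centroids):
    """Sum, over all points, of the squared distance to the assigned centroid."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))

points = [(1.0, 1.0), (1.0, 3.0), (8.0, 8.0), (10.0, 8.0)]
labels = [0, 0, 1, 1]
centroids = [(1.0, 2.0), (9.0, 8.0)]   # the means of the two groups
print(wcss(points, labels, centroids))  # 4.0
```

Each of the four points lies at squared distance 1 from its centroid, which gives the total of 4.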
12
The objective function of K-Means is non-convex. What does this imply?
A. It requires labeled data
B. It may get stuck in a local minimum
C. It cannot be optimized
D. It always finds the global minimum
Correct Answer: It may get stuck in a local minimum
Explanation:
Because the function is non-convex, the final result depends on the initialization, and it is not guaranteed to find the absolute best clustering.
13
Which of the following is a disadvantage of the K-Means algorithm?
A. It is computationally very expensive for small datasets
B. It works only on labeled data
C. It is sensitive to outliers
D. It cannot handle numerical data
Correct Answer: It is sensitive to outliers
Explanation:
Outliers can significantly shift the mean (centroid), affecting the assignment of other points and distorting the clusters.
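A one-dimensional illustration of the effect, using made-up values: adding a single extreme point drags the mean, and therefore the centroid, far from the bulk of the cluster.

```python
from statistics import mean

cluster = [1.0, 2.0, 3.0]            # illustrative values
with_outlier = cluster + [100.0]     # the same cluster plus one outlier

print(mean(cluster))       # 2.0
print(mean(with_outlier))  # 26.5
```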
14
If you set K equal to the number of data points (N), what will the WCSS be?
A. Zero
B. Maximum possible value
C. Infinity
D. Undefined
Correct Answer: Zero
Explanation:
If every point is its own cluster, the distance between the point and its centroid is zero, resulting in a total WCSS of zero.
15
What is the 'Elbow Method' used for?
A. Handling outliers
B. Determining the optimal number of clusters (K)
C. Speeding up convergence
D. Initializing centroids
Correct Answer: Determining the optimal number of clusters (K)
Explanation:
The Elbow Method plots WCSS against K to find the point where adding more clusters yields diminishing returns.
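The curve behind the Elbow Method can be sketched without any plotting library by computing the best WCSS for each K. This is a simplified illustration, assuming a basic K-Means with random data points as starting centroids and a few restarts per K; the names `kmeans_wcss` and `elbow_curve` and the toy data are hypothetical.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_wcss(points, k, seed, max_iter=100):
    """Run a basic K-Means from one random start and return its final WCSS."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)          # k distinct data points as starts
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[j].append(p)
        new = [
            tuple(sum(c) / len(m) for c in zip(*m)) if m else centroids[i]
            for i, m in enumerate(clusters)
        ]
        if new == centroids:                   # centroids stopped moving
            break
        centroids = new
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)

def elbow_curve(points, max_k, n_init=5):
    """Best WCSS over n_init random starts, for K = 1 .. max_k."""
    return {
        k: min(kmeans_wcss(points, k, seed) for seed in range(n_init))
        for k in range(1, max_k + 1)
    }

# Two well-separated 2x2 blobs (illustrative data).
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0), (9.0, 9.0)]
curve = elbow_curve(points, max_k=8)
for k, cost in curve.items():
    print(k, round(cost, 3))
```

On this toy data the cost falls from 200 at K=1 to 4 at K=2 and reaches 0 at K=8, where every point is its own cluster. The sharp bend at K=2 is the elbow.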
16
In the Elbow Method plot, what is typically on the Y-axis?
A. Inertia or WCSS
B. Time taken
C. Number of clusters (K)
D. Accuracy
Correct Answer: Inertia or WCSS
Explanation:
The Y-axis represents the cost (Within-Cluster Sum of Squares), while the X-axis represents the number of clusters.
17
What is the 'Random Initialization Trap' in K-Means?
A. Choosing K randomly leads to errors
B. Randomly picking centroids can lead to poor local optima
C. The algorithm fails if data is random
D. Random data points cannot be clustered
Correct Answer: Randomly picking centroids can lead to poor local optima
Explanation:
Poor random choices for initial centroids can result in sub-optimal clustering or slower convergence.
18
What is K-Means++?
A. A post-processing step for K-Means
B. A method to choose the optimal K
C. A version of K-Means for supervised learning
D. A smarter initialization technique for K-Means
Correct Answer: A smarter initialization technique for K-Means
Explanation:
K-Means++ chooses initial centroids that tend to be far apart from each other, which usually improves convergence speed and result quality.
19
How does K-Means++ select the first centroid?
A. It chooses the point furthest from the origin
B. It picks the point with the highest variance
C. It picks one data point uniformly at random
D. It calculates the global mean
Correct Answer: It picks one data point uniformly at random
Explanation:
The first centroid is chosen uniformly at random from the data points; each subsequent centroid is sampled with probability proportional to the squared distance from the nearest centroid already chosen.
20
What is the difference between Hard Clustering and Soft Clustering?
A. Hard clustering allows overlapping; Soft does not
B. Hard clustering assigns a point to one cluster; Soft assigns probabilities
C. Hard clustering is faster; Soft is slower
D. Hard clustering uses K-Means; Soft uses Decision Trees
Correct Answer: Hard clustering assigns a point to one cluster; Soft assigns probabilities
Explanation:
In hard clustering, a point belongs to exactly one cluster. In soft clustering, a point has a degree of membership to all clusters.
21
Standard K-Means is an example of which type of clustering?
A. Soft Clustering
B. Density-based Clustering
C. Hard Clustering
D. Hierarchical Clustering
Correct Answer: Hard Clustering
Explanation:
Standard K-Means assigns each point to the specific cluster with the nearest centroid, implying binary membership.
22
Which algorithm is a well-known example of Soft Clustering?
A. Fuzzy C-Means
B. Agglomerative Clustering
C. K-Means
D. DBSCAN
Correct Answer: Fuzzy C-Means
Explanation:
Fuzzy C-Means allows data points to belong to multiple clusters with varying degrees of membership.
23
If a data point has a membership vector [0.7, 0.2, 0.1] for 3 clusters, this is an example of:
A. Soft Clustering
B. Outlier Detection
C. Hard Clustering
D. Regression
Correct Answer: Soft Clustering
Explanation:
The vector indicates probabilities or weights of belonging to different clusters, characteristic of soft clustering.
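The membership vector from this question can be used to contrast the two styles. In this short sketch the variable names are illustrative: soft clustering keeps the whole vector, while a hard label keeps only the index of the largest weight.

```python
membership = [0.7, 0.2, 0.1]   # one point, three clusters (from the question)

# Soft clustering keeps every weight; the weights sum to 1 (up to rounding).
total = sum(membership)

# A hard label keeps only the index of the largest weight.
hard_label = max(range(len(membership)), key=lambda j: membership[j])
print(hard_label)  # 0
```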
24
What shape of clusters does K-Means typically assume?
A. Elongated shapes
B. Spherical or convex
C. Arbitrary shapes
D. Spirals
Correct Answer: Spherical or convex
Explanation:
Because it relies on Euclidean distance and means, K-Means works best on spherical, convex clusters.
25
Why is feature scaling (standardization/normalization) important in K-Means?
A. It is not important
B. To convert categorical data to numerical
C. To prevent features with larger ranges from dominating the distance metric
D. To ensure the algorithm runs faster only
Correct Answer: To prevent features with larger ranges from dominating the distance metric
Explanation:
Since K-Means uses distance, a feature with a range of 0-1000 will overpower a feature with a range of 0-1 if not scaled.
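The effect of scaling on distances can be checked with a small sketch. It assumes z-score standardization and columns with non-zero spread; the helper `standardize` and the sample rows are made up for illustration.

```python
from statistics import mean, pstdev

def standardize(rows):
    """Z-score each column: subtract its mean, divide by its standard deviation.

    Assumes every column has non-zero spread (otherwise this divides by zero).
    """
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) for c in cols]
    return [
        tuple((x - m) / s for x, m, s in zip(row, mus, sigmas))
        for row in rows
    ]

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

# Column 0 spans roughly 0-1000, column 1 spans 0-1 (illustrative values).
rows = [(100.0, 0.1), (900.0, 0.2), (150.0, 0.9), (850.0, 0.8)]
scaled = standardize(rows)

# Before scaling, column 0 decides almost everything.
print(sq_dist(rows[0], rows[2]), sq_dist(rows[0], rows[1]))
# After scaling, both columns contribute on a comparable scale.
print(sq_dist(scaled[0], scaled[2]), sq_dist(scaled[0], scaled[1]))
```

In this example the first row is closest to the third row before scaling, because their large-range values are similar, but closer to the second row after scaling, once the small-range feature is allowed to count.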
26
What is the computational complexity of one iteration of K-Means?
A. O(K N d)
B. O(e^N)
C. O(N^2)
D. O(N * log N)
Correct Answer: O(K N d)
Explanation:
Here K is the number of clusters, N is the number of data points, and d is the number of dimensions. The cost of one iteration is linear in N.
27
In the Elbow method, the 'elbow' point represents:
A. The point where adding another cluster does not significantly reduce WCSS
B. The point of maximum error
C. The point where K equals 1
D. The point where WCSS becomes zero
Correct Answer: The point where adding another cluster does not significantly reduce WCSS
Explanation:
It indicates the optimal trade-off between minimizing error and minimizing model complexity (number of clusters).
28
Which of the following implies that K-Means has converged?
A. The number of clusters decreases
B. WCSS increases
C. The data becomes labeled
D. The assignment of points to clusters remains unchanged
Correct Answer: The assignment of points to clusters remains unchanged
Explanation:
If point assignments don't change, centroids won't change, and the algorithm has reached a stable state.
29
What is 'Inertia' in the context of Scikit-Learn's K-Means implementation?
A. The time taken to run
B. The sum of squared distances of samples to their closest cluster center
C. The number of iterations
D. The distance between cluster centers
Correct Answer: The sum of squared distances of samples to their closest cluster center
Explanation:
Inertia is the specific term used in Scikit-Learn for WCSS (Within-Cluster Sum of Squares).
30
Which strategy is used to mitigate the local optima problem in K-Means?
A. Use Manhattan distance
B. Decrease the learning rate
C. Increase the number of clusters
D. Run the algorithm multiple times with different initializations
Correct Answer: Run the algorithm multiple times with different initializations
Explanation:
Running the algorithm multiple times with different initial centroids (the n_init parameter in Scikit-Learn) and keeping the run with the lowest WCSS reduces the risk of ending in a poor local optimum.
31
Can K-Means handle categorical data directly?
A. Yes, using Hamming distance
B. Yes, it works natively
C. Only if the data is ordinal
D. No, it requires numerical data
Correct Answer: No, it requires numerical data
Explanation:
Standard K-Means relies on means and Euclidean distance, which are undefined for categorical data (though K-Modes exists for that).
32
In K-Means++, how is the probability of selecting the next centroid determined?
A. Proportional to the squared distance from the nearest existing centroid
B. Randomly with uniform distribution
C. Based on the density of the points
D. Inversely proportional to the distance from existing centroids
Correct Answer: Proportional to the squared distance from the nearest existing centroid
Explanation:
This ensures that new centroids are likely to be far away from existing ones, spreading them out.
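The seeding rule can be sketched with the standard library's weighted sampling. This is a simplified illustration, assuming distinct tuple-valued points and K no larger than the number of points; the function name `kmeans_pp_init` and the `seed` parameter are hypothetical.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_pp_init(points, k, seed=0):
    """Pick k starting centroids using the K-Means++ seeding rule."""
    rng = random.Random(seed)
    centroids = [rng.choice(points)]           # first centroid: uniform at random
    while len(centroids) < k:
        # Squared distance from each point to its nearest chosen centroid.
        d2 = [min(sq_dist(p, c) for c in centroids) for p in points]
        # Sample the next centroid with probability proportional to d2.
        centroids.append(rng.choices(points, weights=d2, k=1)[0])
    return centroids

points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
init = kmeans_pp_init(points, k=2)
print(init)
```

A point that is already a centroid has weight zero, so it cannot be picked again. Points far from every chosen centroid carry most of the probability mass, which is what spreads the starting centroids out.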
33
What is a 'Voronoi Diagram' in relation to K-Means?
A. A type of soft clustering
B. A method to initialize K
C. A visualization where regions are defined by the closest centroid
D. A plot of the cost function
Correct Answer: A visualization where regions are defined by the closest centroid
Explanation:
The partitions created by K-Means can be visualized as Voronoi cells, separating the space based on distance to centroids.
34
If the clusters in the data are of very different densities and sizes, K-Means will:
A. Merge the clusters
B. Likely fail to identify the correct clusters
C. Perform perfectly
D. Automatically adjust the metric
Correct Answer: Likely fail to identify the correct clusters
Explanation:
K-Means assumes clusters are roughly spherical and of similar size/density; it struggles with varying densities.
35
Which step ensures K-Means is an unsupervised algorithm?
A. Minimizing WCSS
B. Iterating until convergence
C. Not using target labels for training
D. Calculating the mean
Correct Answer: Not using target labels for training
Explanation:
The defining feature is that it structures the data based on intrinsic properties rather than external labels.
36
In the equation for WCSS, what is being squared?
A. The distance between two centroids
B. The number of iterations
C. The distance between a point and its assigned centroid
D. The number of clusters
Correct Answer: The distance between a point and its assigned centroid
Explanation:
WCSS sums the squared Euclidean distances between points and their cluster centers.
37
Why is it often difficult to pick the optimal K using the Elbow method?
A. The plot is always a straight line
B. The 'elbow' might not be sharp or clear
C. It takes too long to compute
D. It requires labeled data
Correct Answer: The 'elbow' might not be sharp or clear
Explanation:
Sometimes the curve is smooth, making the choice of the 'elbow' point subjective.
38
What is the primary role of the 'Coordinate Descent' concept in K-Means?
A. It is the method used to optimize the objective function
B. It calculates the distance
C. It is used to visualize data
D. It is used for initialization
Correct Answer: It is the method used to optimize the objective function
Explanation:
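K-Means optimizes the cost function by alternating between two steps: the assignment step minimizes the cost over cluster labels with the centroids held fixed, and the update step minimizes it over centroids with the labels held fixed. This alternation is a form of (block) coordinate descent.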
39
If you perform K-Means on a dataset with 2 distinct well-separated blobs but set K=4, what happens?
A. The algorithm crashes
B. It splits the natural blobs into smaller clusters
C. It finds 2 clusters and ignores the other 2
D. It merges the blobs
Correct Answer: It splits the natural blobs into smaller clusters
Explanation:
The algorithm is forced to find 4 clusters, so it will partition the natural blobs to satisfy the requirement.
40
In Soft Clustering, the sum of membership weights for a single data point across all clusters usually equals:
Correct Answer: 1
Explanation:
The weights represent probabilities or proportions, so they must sum to 1 for a given data point.
41
Which of the following is NOT an application of K-Means?
A. Document Clustering
B. Customer Segmentation
C. Image Compression (Color Quantization)
D. Spam Classification (Supervised)
Correct Answer: Spam Classification (Supervised)
Explanation:
Spam classification is typically a supervised learning task (e.g., Naive Bayes, SVM), not clustering.
42
Does K-Means guarantee finding the global optimum for the WCSS?
A. No, it depends on initialization
B. Yes, if K is small
C. Yes, always
D. Only if using Manhattan distance
Correct Answer: No, it depends on initialization
Explanation:
K-Means converges to a local optimum, which is why multiple initializations are often used.
43
The computational cost of the distance calculation step for one point against K centroids is proportional to:
Correct Answer: K
Explanation:
For one point, you must calculate the distance to each of the K centroids.
44
Which component constitutes the 'model' after training K-Means?
A. The Elbow plot
B. The coordinates of the final centroids
C. The list of outliers
D. The original dataset
Correct Answer: The coordinates of the final centroids
Explanation:
The centroids define the clusters; new data can be assigned to clusters based on these centroid locations.
45
What is the relationship between Within-Cluster variance and Between-Cluster variance in a good clustering?
A. Low within-cluster, High between-cluster
B. Low within-cluster, Low between-cluster
C. High within-cluster, High between-cluster
D. High within-cluster, Low between-cluster
Correct Answer: Low within-cluster, High between-cluster
Explanation:
Good clusters have points tight together (low internal variance) and far apart from other clusters (high external variance).
46
Lloyd's Algorithm is another name for:
A. K-Means Algorithm
B. KNN
C. Hierarchical Clustering
D. DBSCAN
Correct Answer: K-Means Algorithm
Explanation:
Standard K-Means is frequently referred to as Lloyd's algorithm.
47
In the context of image segmentation, what does a pixel represent in K-Means?
A. A centroid
B. A data point
C. A cluster
D. A label
Correct Answer: A data point
Explanation:
Each pixel (often represented by RGB values) is treated as a data point to be clustered based on color similarity.
48
Why might one choose a K value slightly different from the Elbow point?
A. To increase computational cost
B. Based on business requirements or downstream tasks
C. Because the Elbow method is always wrong
D. To maximize WCSS
Correct Answer: Based on business requirements or downstream tasks
Explanation:
Domain knowledge (e.g., needing exactly 3 t-shirt sizes: S, M, L) often overrides the purely mathematical suggestion of the Elbow method.
49
If K=1, the centroid location will be:
A. The origin (0,0)
B. A random data point
C. Undefined
D. The mean of the entire dataset
Correct Answer: The mean of the entire dataset
Explanation:
With one cluster, the centroid is the point that minimizes the sum of squared distances to all data points, which is the arithmetic mean of the entire dataset.
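A quick numeric check on made-up data: with K=1, placing the centroid at the mean gives a lower sum of squared distances than placing it at any of the data points. The helper names are illustrative.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def cost(points, center):
    """Sum of squared distances from every point to one candidate center."""
    return sum(sq_dist(p, center) for p in points)

points = [(0.0, 0.0), (2.0, 0.0), (4.0, 3.0)]
n = len(points)
centroid = tuple(sum(c) / n for c in zip(*points))

print(centroid)                # (2.0, 1.0)
print(cost(points, centroid))  # 14.0
print([cost(points, p) for p in points])  # [29.0, 17.0, 38.0]
```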
50
What happens if a cluster becomes empty during K-Means iterations?
A. The empty cluster is usually re-initialized or removed
B. It is ignored and WCSS becomes 0
C. The algorithm stops
D. The K value increases
Correct Answer: The empty cluster is usually re-initialized or removed
Explanation:
Implementations typically handle this by resetting the empty cluster's centroid, for example to a random data point or to the data point farthest from its currently assigned centroid.
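One such re-seeding strategy can be sketched as below. It is an illustrative assumption, not the behavior of any specific library: each empty cluster is moved to the data point farthest from its currently assigned centroid, and that point is relabeled. The function name `fix_empty_clusters` is hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def fix_empty_clusters(points, labels, centroids):
    """Re-seed each empty cluster at the point farthest from its own centroid."""
    labels = list(labels)
    centroids = list(centroids)
    for j in range(len(centroids)):
        if j in labels:
            continue                           # cluster j still has members
        # Index of the point with the largest squared distance to its centroid.
        far = max(range(len(points)),
                  key=lambda i: sq_dist(points[i], centroids[labels[i]]))
        centroids[j] = points[far]
        labels[far] = j
    return labels, centroids

points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
labels = [0, 0, 0]                             # cluster 1 has no members
centroids = [(11.0 / 3.0, 0.0), (50.0, 50.0)]
new_labels, new_centroids = fix_empty_clusters(points, labels, centroids)
print(new_labels)        # [0, 0, 1]
print(new_centroids[1])  # (10.0, 0.0)
```

After re-seeding, the usual assignment and update steps would continue from the repaired state.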