1. Which of the following best defines Unsupervised Learning?
A. Learning where the model predicts a continuous target variable
B. Learning where the model predicts a categorical target variable
C. Learning where the data has no predefined labels or target variables
D. Learning where the model is rewarded or punished based on actions
Correct Answer: Learning where the data has no predefined labels or target variables
Explanation: Unsupervised learning involves training algorithms on data that does not have labels, aiming to find hidden structures or patterns.
2. In K-Means clustering, what does 'K' represent?
A. The number of iterations
B. The number of features in the dataset
C. The number of clusters the algorithm forms
D. The distance metric used
Correct Answer: The number of clusters the algorithm forms
Explanation: K is a hyperparameter that determines the specific number of centroids (and thus clusters) the algorithm attempts to identify.
3. What is the primary objective function that K-Means minimizes?
A. Inter-cluster distance
B. Within-Cluster Sum of Squares (WCSS)
C. The number of outliers
D. The Silhouette coefficient
Correct Answer: Within-Cluster Sum of Squares (WCSS)
Explanation: K-Means aims to minimize the variance within each cluster, calculated as the sum of squared distances between data points and their respective cluster centroids.
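For reference, a minimal NumPy sketch of the WCSS objective; the points and cluster labels below are toy values chosen for illustration:

```python
import numpy as np

# Toy data: 6 points in 2D, already assigned to 2 clusters (labels are illustrative).
X = np.array([[1.0, 2.0], [1.5, 1.8], [1.2, 2.2],
              [8.0, 8.0], [8.3, 7.7], [7.8, 8.4]])
labels = np.array([0, 0, 0, 1, 1, 1])

# WCSS = sum over clusters of squared distances from points to their centroid.
wcss = 0.0
for k in np.unique(labels):
    cluster_points = X[labels == k]
    centroid = cluster_points.mean(axis=0)
    wcss += ((cluster_points - centroid) ** 2).sum()

print(f"WCSS: {wcss:.4f}")
```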
4. Which of the following is the first step in the standard K-Means algorithm?
A. Calculate the centroid of all points
B. Assign points to the nearest cluster
C. Randomly initialize K centroids
D. Calculate the total variance
Correct Answer: Randomly initialize K centroids
Explanation: The algorithm begins by selecting K random points from the dataset (or random coordinates) to serve as initial centroids.
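The full loop, as a rough pure-NumPy sketch (variable names are illustrative, and edge cases such as empty clusters are ignored):

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Plain K-Means: random init, assign, update, repeat until stable."""
    rng = np.random.default_rng(seed)
    # Step 1: randomly pick K data points as the initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Step 2: assign each point to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: move each centroid to the mean of its assigned points.
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # Step 4: stop when the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Example usage on random 2-D data.
labels, centroids = kmeans(np.random.default_rng(1).normal(size=(100, 2)), k=3)
```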
5. The 'Random Initialization Trap' in K-Means refers to:
A. The algorithm running indefinitely without convergence
B. The algorithm selecting the wrong number of K
C. Different initial centroid positions leading to different, suboptimal results
D. The inability to handle categorical data
Correct Answer: Different initial centroid positions leading to different, suboptimal results
Explanation: Because K-Means uses a greedy approach, poor random starting points can cause the algorithm to get stuck in a local minimum rather than the global minimum.
6Which technique is commonly used to mitigate the Random Initialization Trap?
A.K-Means++
B.Agglomerative Clustering
C.Principal Component Analysis
D.Gradient Descent
Correct Answer: K-Means++
Explanation:K-Means++ is an initialization algorithm that selects initial centroids far apart from each other, improving the chances of finding the global optimum.
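In scikit-learn this is the default initializer; a short sketch, assuming scikit-learn is available (make_blobs just generates toy data):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=4, random_state=42)

# init='k-means++' spreads the initial centroids out; n_init repeats the
# whole run several times and keeps the result with the lowest inertia.
km = KMeans(n_clusters=4, init="k-means++", n_init=10, random_state=42).fit(X)
print(km.inertia_)  # final WCSS of the best run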
7. What is the 'Elbow Method' used for in K-Means clustering?
A. Determining the optimal number of clusters
B. Calculating the distance between centroids
C. Visualizing high-dimensional data
D. Stopping the algorithm early
Correct Answer: Determining the optimal number of clusters
Explanation: The Elbow Method plots WCSS against the number of clusters (K). The 'elbow' point represents the K where adding more clusters yields diminishing returns in variance reduction.
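A typical sketch of the method using scikit-learn's inertia_ attribute; the dataset here is synthetic:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt

X, _ = make_blobs(n_samples=300, centers=4, random_state=42)

# Fit K-Means for a range of K and record the WCSS (inertia_) for each.
ks = range(1, 11)
wcss = [KMeans(n_clusters=k, n_init=10, random_state=42).fit(X).inertia_ for k in ks]

plt.plot(ks, wcss, marker="o")
plt.xlabel("Number of clusters (K)")
plt.ylabel("WCSS (inertia)")
plt.title("Elbow Method")
plt.show()
```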
8. In an Elbow Method plot, what variable is typically on the Y-axis?
A. Number of Clusters (K)
B. WCSS (Inertia)
C. Accuracy
D. Computation Time
Correct Answer: WCSS (Inertia)
Explanation: The Y-axis represents the Within-Cluster Sum of Squares (or Inertia), which decreases as the number of clusters increases.
9. When does the K-Means algorithm stop iterating?
A. When the WCSS becomes zero
B. When the centroids no longer move significantly between iterations
C. When every point is in its own cluster
D. When the number of clusters equals the number of data points
Correct Answer: When the centroids no longer move significantly between iterations
Explanation: Convergence is reached when the assignment of points to clusters does not change and centroids remain stable.
10. Which of the following is a limitation of K-Means clustering?
A. It is computationally expensive for small datasets
B. It requires the number of clusters to be specified in advance
C. It can only handle binary data
D. It always finds the global optimum
Correct Answer: It requires the number of clusters to be specified in advance
Explanation: Unlike hierarchical clustering, K-Means requires the user to define K before running the algorithm, and the right value of K is not always known in advance.
11. Agglomerative Hierarchical Clustering is best described as a ______ approach.
A. Top-down
B. Bottom-up
C. Centroid-based
D. Density-based
Correct Answer: Bottom-up
Explanation: Agglomerative clustering starts by treating each data point as a single cluster and then iteratively merges pairs of clusters until one big cluster remains.
12. Divisive Hierarchical Clustering is best described as a ______ approach.
A. Top-down
B. Bottom-up
C. Randomized
D. Grid-based
Correct Answer: Top-down
Explanation: Divisive clustering starts with all points in one cluster and recursively splits them into smaller clusters.
13. What diagram is commonly used to visualize Hierarchical Clustering?
A. Scatter plot
B. Histogram
C. Dendrogram
D. Box plot
Correct Answer: Dendrogram
Explanation: A dendrogram is a tree-like diagram that records the sequences of merges or splits in hierarchical clustering.
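A minimal sketch using SciPy, assuming matplotlib is available for display; the two Gaussian blobs are toy data:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (10, 2)), rng.normal(5, 0.5, (10, 2))])

# linkage() performs agglomerative clustering; dendrogram() draws the merge tree.
Z = linkage(X, method="ward")
dendrogram(Z)
plt.ylabel("Merge distance")
plt.show()
```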
14. In a dendrogram, the vertical axis typically represents:
A. The number of clusters
B. The frequency of data points
C. The Euclidean distance or dissimilarity between clusters
D. The time taken to cluster
Correct Answer: The Euclidean distance or dissimilarity between clusters
Explanation: The height of the U-shaped lines in a dendrogram indicates the distance (dissimilarity) at which clusters were merged.
15. How do you determine the optimal number of clusters using a dendrogram?
A. Count the number of leaves at the bottom
B. Cut the dendrogram at the point with the longest vertical distance without crossing horizontal lines
C. Choose the height where the first merge occurs
D. It is impossible to determine K from a dendrogram
Correct Answer: Cut the dendrogram at the point with the longest vertical distance without crossing horizontal lines
Explanation: A large vertical distance implies that the clusters formed are distinct and far apart; cutting across this gap defines the natural clusters.
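SciPy's fcluster can perform this cut programmatically; a sketch where the threshold t=10.0 is an illustrative height, not a recommended value:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (10, 2)), rng.normal(5, 0.5, (10, 2))])
Z = linkage(X, method="ward")

# Cutting at a chosen height undoes every merge above that distance;
# the remaining subtrees become the flat clusters.
labels = fcluster(Z, t=10.0, criterion="distance")  # t is an illustrative height
print(labels)
```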
16. Which linkage method defines the distance between two clusters as the shortest distance between any single point in one cluster and any single point in the other?
A. Complete Linkage
B. Average Linkage
C. Single Linkage
D. Ward's Method
Correct Answer: Single Linkage
Explanation: Single linkage looks for the nearest neighbors across two clusters (minimum distance).
17. Which linkage method defines the distance between two clusters as the maximum distance between any point in the first cluster and any point in the second?
A. Single Linkage
B. Complete Linkage
C. Centroid Linkage
D. Average Linkage
Correct Answer: Complete Linkage
Explanation: Complete linkage considers the farthest points (maximum distance) to determine the distance between clusters.
18. Average Linkage calculates the distance between clusters by:
A. Using the distance between the centroids of the clusters
B. Taking the average of all pairwise distances between points in the two clusters
C. Taking the median distance of all points
D. Using the minimum distance between points
Correct Answer: Taking the average of all pairwise distances between points in the two clusters
Explanation: Average linkage computes the mean distance between all pairs of items from the two clusters.
19. Centroid Linkage measures the distance between clusters based on:
A. The distance between the geometric centers (means) of the clusters
B. The furthest points in the clusters
C. The closest points in the clusters
D. The sum of squared errors
Correct Answer: The distance between the geometric centers (means) of the clusters
Explanation: Centroid linkage calculates the distance between the centroid vector of cluster A and the centroid vector of cluster B.
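All of these linkage criteria (plus Ward's method, covered later in the quiz) are available in SciPy's linkage function; a small comparison sketch on random toy data:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))

# The same data, merged under different definitions of cluster-to-cluster distance.
for method in ["single", "complete", "average", "centroid", "ward"]:
    Z = linkage(X, method=method)
    print(method, "-> final merge at distance", round(Z[-1, 2], 3))
```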
20. Which linkage method is most notorious for producing the 'chaining' effect (long, stringy clusters)?
A. Complete Linkage
B. Single Linkage
C. Ward's Method
D. Average Linkage
Correct Answer: Single Linkage
Explanation: Because Single Linkage merges based on the closest points, it can merge distinct clusters if a chain of points connects them, often referred to as the chaining phenomenon.
21. What is a primary advantage of Hierarchical Clustering over K-Means?
A. It is computationally faster on large datasets
B. It does not require assuming the number of clusters (K) beforehand
C. It works better with high-dimensional data
D. It always uses Manhattan distance
Correct Answer: It does not require assuming the number of clusters (K) beforehand
Explanation: Hierarchical clustering produces a dendrogram that allows the analyst to choose the number of clusters after inspecting the structure, unlike K-Means where K is an input.
22. Market Basket Analysis is a specific application of which technique?
A. Linear Regression
B. Clustering
C. Association Rule Learning
D. Decision Trees
Correct Answer: Association Rule Learning
Explanation: Market Basket Analysis uses Association Rules to discover patterns in transaction data, such as products frequently bought together.
23. In the rule {Bread} -> {Butter}, {Bread} is the:
A. Consequent
B. Antecedent
C. Lift
D. Support
Correct Answer: Antecedent
Explanation: The 'if' part of the rule (left-hand side) is the antecedent. The 'then' part (right-hand side) is the consequent.
24. What does the metric 'Support' measure in Association Rules?
A. The reliability of the rule
B. The frequency with which an itemset appears in the dataset
C. The ratio of the rule's confidence to the expected confidence
D. The correlation between items
Correct Answer: The frequency with which an itemset appears in the dataset
Explanation: Support(A) is the proportion of transactions in the dataset that contain item A.
25. How is 'Confidence' for the rule A -> B calculated?
A. Support(A & B) / Support(B)
B. Support(A & B) / Support(A)
C. Support(A) / Support(B)
D. Support(B) / Support(A)
Correct Answer: Support(A & B) / Support(A)
Explanation: Confidence measures how often B appears in transactions that contain A. It is the conditional probability P(B|A).
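A hand-rolled sketch of support, confidence, and lift for the rule {Bread} -> {Butter}; the five transactions are made up for illustration:

```python
# Toy transactions, each a set of purchased items.
transactions = [
    {"Bread", "Butter", "Milk"},
    {"Bread", "Butter"},
    {"Bread", "Eggs"},
    {"Milk", "Eggs"},
    {"Bread", "Butter", "Eggs"},
]
n = len(transactions)

support_a = sum("Bread" in t for t in transactions) / n               # P(A)
support_b = sum("Butter" in t for t in transactions) / n              # P(B)
support_ab = sum({"Bread", "Butter"} <= t for t in transactions) / n  # P(A and B)

confidence = support_ab / support_a  # P(B | A)
lift = confidence / support_b        # P(B | A) / P(B)
print(f"support={support_ab:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```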
26. What does a 'Lift' value greater than 1 indicate?
A. The items are independent of each other
B. The items are substitutes (negatively correlated)
C. The presence of the antecedent increases the likelihood of the consequent
D. The rule is invalid
Correct Answer: The presence of the antecedent increases the likelihood of the consequent
Explanation: Lift > 1 implies a positive association; the items are bought together more frequently than would be expected by chance.
27. If the Lift of a rule A -> B is exactly 1, what does this imply?
A. A and B are perfectly correlated
B. A and B are independent
C. A and B are never bought together
D. The confidence is 100%
Correct Answer: A and B are independent
Explanation: A Lift of 1 means the probability of A and B occurring together equals the product of their individual probabilities, implying no relationship.
28. Which algorithm is most commonly associated with mining frequent itemsets for Association Rules?
A. Apriori
B. Random Forest
C. K-Nearest Neighbors
D. Naive Bayes
Correct Answer: Apriori
Explanation: The Apriori algorithm is the standard algorithm used to identify frequent itemsets and generate association rules.
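A short usage sketch, assuming the third-party mlxtend library is installed; its TransactionEncoder, apriori, and association_rules functions are applied to toy transactions:

```python
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

transactions = [["Bread", "Butter"], ["Bread", "Milk"],
                ["Bread", "Butter", "Milk"], ["Milk", "Eggs"]]

# One-hot encode the transactions, then mine frequent itemsets and rules.
te = TransactionEncoder()
df = pd.DataFrame(te.fit_transform(transactions), columns=te.columns_)
frequent = apriori(df, min_support=0.5, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.6)
print(rules[["antecedents", "consequents", "support", "confidence", "lift"]])
```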
29. The Apriori algorithm uses the 'Downward Closure Property'. What does this property state?
A. All supersets of a frequent itemset must be frequent
B. All subsets of a frequent itemset must be frequent
C. If an itemset is infrequent, its subsets must be frequent
D. Support always equals Confidence
Correct Answer: All subsets of a frequent itemset must be frequent
Explanation: This property allows the algorithm to prune the search space. If {A, B} is frequent, then {A} and {B} must also be frequent.
30. Which of the following is NOT a step in the K-Means algorithm?
A. Assignment of points to the nearest centroid
B. Update of centroids to the mean of assigned points
C. Calculation of distance matrix for all pairs of points
D. Initialization of centroids
Correct Answer: Calculation of distance matrix for all pairs of points
Explanation: K-Means calculates distances from points to centroids, not a full pairwise distance matrix between all points (which is done in Hierarchical clustering).
31. What is the main reason to scale features (normalize/standardize) before running K-Means?
A. To make the data normally distributed
B. To prevent features with large magnitudes from dominating distance calculations
C. To convert categorical variables to numeric
D. To increase the number of clusters
Correct Answer: To prevent features with large magnitudes from dominating distance calculations
Explanation: Since K-Means relies on Euclidean distance, a feature ranging from 0-1000 will overpower a feature ranging from 0-1 without scaling.
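A sketch of the usual fix with scikit-learn's StandardScaler; the age and income values are illustrative:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Income (tens of thousands) would dominate age (tens) in raw Euclidean distance.
X = np.array([[25, 30_000], [30, 90_000], [45, 40_000], [50, 95_000]], dtype=float)

X_scaled = StandardScaler().fit_transform(X)  # each feature -> mean 0, std 1
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_scaled)
print(labels)
```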
32. In the context of Association Rules, the 'Consequent' is found on which side of the arrow?
A. Left (IF side)
B. Right (THEN side)
C. Both sides
D. Neither side
Correct Answer: Right (THEN side)
Explanation: In A -> B, B is the consequent (the item implied by the rule).
33. Which metric would you look at to determine if a high-confidence rule is merely a coincidence because the consequent is very popular?
A. Support
B. Confidence
C. Lift
D. Accuracy
Correct Answer: Lift
Explanation: Lift accounts for the popularity of the consequent. If B is very popular, Confidence(A->B) might be high just by chance, but Lift will correct for this.
34. If Support(A) = 0.4 and Support(A, B) = 0.2, what is the Confidence(A -> B)?
Correct Answer: 0.5
Explanation: Confidence(A -> B) = Support(A, B) / Support(A) = 0.2 / 0.4 = 0.5.
35. Which of the following can negatively affect K-Means clustering results?
Correct Answer: Outliers, unscaled features, and redundant features
Explanation: K-Means is affected by outliers (which shift centroids significantly), unscaled features (distance dominance), and redundant features (inflating importance).
36. Which clustering method generates a hierarchy of clusters?
A. Partitioning Clustering (K-Means)
B. Hierarchical Clustering
C. Density-Based Clustering
D. Grid-Based Clustering
Correct Answer: Hierarchical Clustering
Explanation: As the name implies, Hierarchical Clustering builds a hierarchy of clusters, usually visualized by a dendrogram.
37. In the Silhouette Analysis for K-Means, a score close to +1 indicates:
A. The point is overlapping with other clusters
B. The point is assigned to the wrong cluster
C. The point is well-matched to its own cluster and far from neighboring clusters
D. The point is an outlier
Correct Answer: The point is well-matched to its own cluster and far from neighboring clusters
Explanation: The Silhouette score ranges from -1 to +1. A score near +1 means the sample is far away from the neighboring clusters.
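A sketch computing the average silhouette for several values of K with scikit-learn, on synthetic blob data:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=4, random_state=42)

# Average silhouette over all points: closer to +1 means tighter,
# better-separated clusters.
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
    print(k, round(silhouette_score(X, labels), 3))
```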
38. Which of the following scenarios is ideal for K-Means clustering?
A. Clusters are non-spherical and have irregular shapes
B. Clusters are of varying densities
C. Clusters are spherical and distinct
D. Clusters contain a lot of noise and outliers
Correct Answer: Clusters are spherical and distinct
Explanation: K-Means assumes clusters are spherical (globular) and roughly the same size due to its reliance on variance and Euclidean distance.
39. In Association Rule Mining, a 'Frequent Itemset' is an itemset whose support is:
A. Greater than or equal to a minimum support threshold
B. Less than a minimum support threshold
C. Equal to 1
D. Greater than the confidence threshold
Correct Answer: Greater than or equal to a minimum support threshold
Explanation: The first step of Apriori is to filter out itemsets that do not appear frequently enough, defined by the minimum support threshold.
40. Which distance metric is most commonly used in K-Means?
A. Manhattan Distance
B. Euclidean Distance
C. Cosine Similarity
D. Hamming Distance
Correct Answer: Euclidean Distance
Explanation: The standard K-Means algorithm minimizes variance, which corresponds to minimizing the squared Euclidean distance.
41. Which linkage method in Hierarchical Clustering aims to minimize the variance within clusters being merged (similar to K-Means)?
A. Single Linkage
B. Complete Linkage
C. Ward's Method
D. Average Linkage
Correct Answer: Ward's Method
Explanation: Ward's method minimizes the increase in total within-cluster variance when merging two clusters.
42. What is a 'hard' clustering assignment?
A. A data point belongs to multiple clusters with varying probabilities
B. A data point belongs to exactly one cluster
C. The algorithm is hard to implement
D. The clustering is performed on hardware
Correct Answer: A data point belongs to exactly one cluster
Explanation: K-Means and standard Hierarchical clustering perform hard clustering, where a point is assigned exclusively to one specific cluster.
43. In Market Basket Analysis, if {Milk, Bread} -> {Eggs} has a confidence of 0.7, it means:
A. 70% of all transactions contain Eggs
B. 70% of customers buy Milk and Bread
C. 70% of transactions containing Milk and Bread also contain Eggs
D. Eggs are bought 70% more often with Milk and Bread than expected
Correct Answer: 70% of transactions containing Milk and Bread also contain Eggs
Explanation: Confidence is the conditional probability. Given the antecedent occurred, there is a 70% chance the consequent occurred.
44. Which of the following is an application of Clustering?
A. Predicting house prices
B. Customer Segmentation
C. Classifying emails as spam or not spam
D. Predicting credit default
Correct Answer: Customer Segmentation
Explanation: Clustering is widely used to group customers with similar behaviors or characteristics without predefined labels.
45. In the context of K-Means, what is a Centroid?
A. The outlier point in a cluster
B. The arithmetic mean position of all the points in the cluster
C. The point closest to the origin
D. The boundary of the cluster
Correct Answer: The arithmetic mean position of all the points in the cluster
Explanation: The centroid is the geometric center of the cluster, calculated by averaging the coordinates of all points assigned to that cluster.
46. Which of the following is true regarding the computational complexity of Hierarchical Clustering compared to K-Means for large datasets?
A. Hierarchical is generally faster
B. Hierarchical is generally slower and more memory intensive
C. They have the exact same complexity
D. Hierarchical cannot run on large datasets
Correct Answer: Hierarchical is generally slower and more memory intensive
Explanation: Agglomerative clustering usually has a time complexity of O(n^3) or O(n^2 log n), making it much slower than K-Means (roughly O(n·K·I) for n points, K clusters, and I iterations) on large datasets.
47. When performing K-Means, if you initialize centroids to the same location, what happens?
A. The algorithm works perfectly
B. The algorithm converges in one step
C. The algorithm fails to generate distinct clusters
D. It automatically separates them
Correct Answer: The algorithm fails to generate distinct clusters
Explanation: If centroids start at the same point, they will calculate the same updates and move together, failing to separate the data into K clusters.
48. What does a Lift value less than 1 indicate for the rule A -> B?
Correct Answer: A and B are negatively associated
Explanation: Lift < 1 indicates that the occurrence of A makes B less likely to occur (e.g., people buying product A rarely buy product B).
49. What happens to the WCSS as the number of clusters (K) increases towards the total number of data points?
A. It increases
B. It decreases towards zero
C. It remains constant
D. It fluctuates randomly
Correct Answer: It decreases towards zero
Explanation: If every point is its own cluster (K=N), the distance from each point to its centroid is zero, so WCSS becomes zero.
50. Which step ensures that the K-Means algorithm converges?
A. The randomization of initial points
B. The fact that WCSS decreases or stays the same with every iteration
C. The use of the Elbow method
D. The use of Manhattan distance
Correct Answer: The fact that WCSS decreases or stays the same with every iteration
Explanation: The mathematical property of the K-Means update steps guarantees that the objective function (WCSS) is non-increasing, ensuring convergence (though possibly to a local minimum).