Unit 4 - Practice Quiz

INT234

1 Which of the following best defines Unsupervised Learning?

A. Learning where the model predicts a continuous target variable
B. Learning where the model predicts a categorical target variable
C. Learning where the data has no predefined labels or target variables
D. Learning where the model is rewarded or punished based on actions

2 In K-Means clustering, what does 'K' represent?

A. The number of iterations
B. The number of features in the dataset
C. The number of clusters the algorithm forms
D. The distance metric used

3 What is the primary objective function that K-Means minimizes?

A. Inter-cluster distance
B. Within-Cluster Sum of Squares (WCSS)
C. The number of outliers
D. The Silhouette coefficient
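
For reference, the objective from question 3 can be written as follows, where \mu_i is the centroid of cluster C_i (a standard formulation):

    \mathrm{WCSS} = \sum_{i=1}^{K} \sum_{x \in C_i} \lVert x - \mu_i \rVert^2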

4 Which of the following is the first step in the standard K-Means algorithm?

A. Calculate the centroid of all points
B. Assign points to the nearest cluster
C. Randomly initialize K centroids
D. Calculate the total variance

5 The 'Random Initialization Trap' in K-Means refers to:

A. The algorithm running indefinitely without convergence
B. The algorithm selecting the wrong number of K
C. Different initial centroid positions leading to different, suboptimal results
D. The inability to handle categorical data

6 Which technique is commonly used to mitigate the Random Initialization Trap?

A. K-Means++
B. Agglomerative Clustering
C. Principal Component Analysis
D. Gradient Descent
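
A minimal scikit-learn sketch of the k-means++ seeding from question 6, assuming a toy NumPy array X in place of real data (scikit-learn uses this initialization by default):

    import numpy as np
    from sklearn.cluster import KMeans

    X = np.random.rand(200, 2)  # toy data standing in for a real dataset

    # init='k-means++' spreads the initial centroids apart,
    # mitigating the random initialization trap
    km = KMeans(n_clusters=3, init='k-means++', n_init=10, random_state=42).fit(X)
    print(km.cluster_centers_)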

7 What is the 'Elbow Method' used for in K-Means clustering?

A. Determining the optimal number of clusters
B. Calculating the distance between centroids
C. Visualizing high-dimensional data
D. Stopping the algorithm early

8 In an Elbow Method plot, what variable is typically on the Y-axis?

A. Number of Clusters (K)
B. WCSS (Inertia)
C. Accuracy
D. Computation Time
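
A minimal sketch of the elbow plot from questions 7 and 8, again assuming a toy array X; scikit-learn exposes WCSS as the inertia_ attribute:

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.cluster import KMeans

    X = np.random.rand(200, 2)  # toy data

    # fit K-Means for K = 1..10 and record the WCSS of each fit
    wcss = [KMeans(n_clusters=k, n_init=10, random_state=42).fit(X).inertia_
            for k in range(1, 11)]

    plt.plot(range(1, 11), wcss, marker='o')
    plt.xlabel('Number of Clusters (K)')
    plt.ylabel('WCSS (Inertia)')  # the Y-axis from question 8
    plt.show()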

9 When does the K-Means algorithm stop iterating?

A. When the WCSS becomes zero
B. When the centroids no longer move significantly between iterations
C. When every point is in its own cluster
D. When the number of clusters equals the number of data points
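
A from-scratch sketch of the loop behind questions 4, 9, and 30 (a toy implementation that assumes no cluster goes empty, not production code):

    import numpy as np

    def kmeans(X, k, tol=1e-6, max_iter=100, seed=0):
        rng = np.random.default_rng(seed)
        centroids = X[rng.choice(len(X), k, replace=False)]  # 1. random init
        for _ in range(max_iter):
            # 2. assign each point to its nearest centroid
            labels = np.argmin(np.linalg.norm(X[:, None] - centroids, axis=2), axis=1)
            # 3. move each centroid to the mean of its assigned points
            new_centroids = np.array([X[labels == i].mean(axis=0) for i in range(k)])
            # 4. stop once the centroids no longer move significantly
            if np.allclose(centroids, new_centroids, atol=tol):
                break
            centroids = new_centroids
        return centroids, labels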

10 Which of the following is a limitation of K-Means clustering?

A. It is computationally expensive for small datasets
B. It requires the number of clusters to be specified in advance
C. It can only handle binary data
D. It always finds the global optimum

11 Agglomerative Hierarchical Clustering is best described as a ______ approach.

A. Top-down
B. Bottom-up
C. Centroid-based
D. Density-based

12 Divisive Hierarchical Clustering is best described as a ______ approach.

A. Top-down
B. Bottom-up
C. Randomized
D. Grid-based

13 What diagram is commonly used to visualize Hierarchical Clustering?

A. Scatter plot
B. Histogram
C. Dendrogram
D. Box plot

14 In a dendrogram, the vertical axis typically represents:

A. The number of clusters
B. The frequency of data points
C. The Euclidean distance or dissimilarity between clusters
D. The time taken to cluster

15 How do you determine the optimal number of clusters using a dendrogram?

A. Count the number of leaves at the bottom
B. Cut the dendrogram at the point with the longest vertical distance without crossing horizontal lines
C. Choose the height where the first merge occurs
D. It is impossible to determine K from a dendrogram
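
A minimal dendrogram sketch for questions 13 to 15, assuming a small toy array X; SciPy performs the bottom-up merges and draws the tree:

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.cluster.hierarchy import linkage, dendrogram

    X = np.random.rand(30, 2)  # toy data

    Z = linkage(X, method='ward')  # agglomerative merge history
    dendrogram(Z)                  # vertical axis = merge distance (question 14)
    plt.ylabel('Distance')
    plt.show()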

16 Which linkage method defines the distance between two clusters as the shortest distance between any single point in one cluster and any single point in the other?

A. Complete Linkage
B. Average Linkage
C. Single Linkage
D. Ward's Method

17 Which linkage method defines the distance between two clusters as the maximum distance between any point in the first cluster and any point in the second?

A. Single Linkage
B. Complete Linkage
C. Centroid Linkage
D. Average Linkage

18 Average Linkage calculates the distance between clusters by:

A. Using the distance between the centroids of the clusters
B. Taking the average of all pairwise distances between points in the two clusters
C. Taking the median distance of all points
D. Using the minimum distance between points

19 Centroid Linkage measures the distance between clusters based on:

A. The distance between the geometric centers (means) of the clusters
B. The furthest points in the clusters
C. The closest points in the clusters
D. The sum of squared errors

20 Which linkage method is most notorious for producing the 'chaining' effect (long, stringy clusters)?

A. Complete Linkage
B. Single Linkage
C. Ward's Method
D. Average Linkage
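
The linkage criteria in questions 16 to 20 map directly onto scikit-learn's AgglomerativeClustering; a minimal sketch on toy data (note that centroid linkage is not among scikit-learn's options):

    import numpy as np
    from sklearn.cluster import AgglomerativeClustering

    X = np.random.rand(30, 2)  # toy data

    for method in ('single', 'complete', 'average', 'ward'):
        labels = AgglomerativeClustering(n_clusters=2, linkage=method).fit_predict(X)
        print(method, labels[:10])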

21 What is a primary advantage of Hierarchical Clustering over K-Means?

A. It is computationally faster on large datasets
B. It does not require assuming the number of clusters (K) beforehand
C. It works better with high-dimensional data
D. It always uses Manhattan distance

22 Market Basket Analysis is a specific application of which technique?

A. Linear Regression
B. Clustering
C. Association Rule Learning
D. Decision Trees

23 In the rule {Bread} -> {Butter}, {Bread} is the:

A. Consequent
B. Antecedent
C. Lift
D. Support

24 What does the metric 'Support' measure in Association Rules?

A. The reliability of the rule
B. The frequency with which an itemset appears in the dataset
C. The ratio of the rule's confidence to the expected confidence
D. The correlation between items

25 How is 'Confidence' for the rule A -> B calculated?

A. Support(A & B) / Support(B)
B. Support(A & B) / Support(A)
C. Support(A) / Support(B)
D. Support(B) / Support(A)
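
Written out, the formula behind question 25 is:

    \mathrm{confidence}(A \rightarrow B) = \frac{\mathrm{support}(A \cup B)}{\mathrm{support}(A)}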

26 What does a 'Lift' value greater than 1 indicate?

A. The items are independent of each other
B. The items are substitutes (negatively correlated)
C. The presence of the antecedent increases the likelihood of the consequent
D. The rule is invalid

27 If the Lift of a rule A -> B is exactly 1, what does this imply?

A. A and B are perfectly correlated
B. A and B are independent
C. A and B are never bought together
D. The confidence is 100%
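
Lift (questions 26 and 27) rescales confidence by the baseline popularity of the consequent, so a value of 1 marks independence:

    \mathrm{lift}(A \rightarrow B) = \frac{\mathrm{confidence}(A \rightarrow B)}{\mathrm{support}(B)} = \frac{\mathrm{support}(A \cup B)}{\mathrm{support}(A)\,\mathrm{support}(B)}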

28 Which algorithm is most commonly associated with mining frequent itemsets for Association Rules?

A. Apriori
B. Random Forest
C. K-Nearest Neighbors
D. Naive Bayes

29 The Apriori algorithm uses the 'Downward Closure Property'. What does this property state?

A. All supersets of a frequent itemset must be frequent
B. All subsets of a frequent itemset must be frequent
C. If an itemset is infrequent, its subsets must be frequent
D. Support always equals Confidence
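
A minimal Apriori sketch using the third-party mlxtend library (one common choice; the transactions below are made up for illustration):

    import pandas as pd
    from mlxtend.preprocessing import TransactionEncoder
    from mlxtend.frequent_patterns import apriori, association_rules

    transactions = [['bread', 'butter'], ['bread', 'milk'],
                    ['bread', 'butter', 'milk'], ['milk']]

    # one-hot encode the transactions into a boolean DataFrame
    te = TransactionEncoder()
    df = pd.DataFrame(te.fit(transactions).transform(transactions),
                      columns=te.columns_)

    # keep itemsets meeting the minimum support threshold (question 39)
    frequent = apriori(df, min_support=0.5, use_colnames=True)
    rules = association_rules(frequent, metric='confidence', min_threshold=0.6)
    print(rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']])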

30 Which of the following is NOT a step in the K-Means algorithm?

A. Assignment of points to the nearest centroid
B. Update of centroids to the mean of assigned points
C. Calculation of distance matrix for all pairs of points
D. Initialization of centroids

31 What is the main reason to scale features (normalize/standardize) before running K-Means?

A. To make the data normally distributed
B. To prevent features with large magnitudes from dominating distance calculations
C. To convert categorical variables to numeric
D. To increase the number of clusters
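
A minimal sketch of the scaling step from question 31, with made-up features on very different scales:

    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from sklearn.cluster import KMeans

    # income in the tens of thousands vs. age in years:
    # unscaled, income would dominate every Euclidean distance
    X = np.column_stack([np.random.rand(100) * 100000,
                         np.random.rand(100) * 60])

    X_scaled = StandardScaler().fit_transform(X)  # zero mean, unit variance per feature
    labels = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X_scaled)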

32 In the context of Association Rules, the 'Consequent' is found on which side of the arrow?

A. Left (IF side)
B. Right (THEN side)
C. Both sides
D. Neither side

33 Which metric would you look at to determine if a high-confidence rule is merely a coincidence because the consequent is very popular?

A. Support
B. Confidence
C. Lift
D. Accuracy

34 If Support(A) = 0.4 and Support(A, B) = 0.2, what is the Confidence(A -> B)?

A. 0.2
B. 0.5
C. 0.8
D. 2.0
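
Question 34 works through the confidence formula directly:

    \mathrm{confidence}(A \rightarrow B) = \frac{\mathrm{support}(A, B)}{\mathrm{support}(A)} = \frac{0.2}{0.4} = 0.5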

35 K-Means is sensitive to which of the following?

A. Redundant features
B. Outliers
C. Scaling
D. All of the above

36 Which clustering method generates a hierarchy of clusters?

A. Partitioning Clustering (K-Means)
B. Hierarchical Clustering
C. Density-Based Clustering
D. Grid-Based Clustering

37 In the Silhouette Analysis for K-Means, a score close to +1 indicates:

A. The point is overlapping with other clusters
B. The point is assigned to the wrong cluster
C. The point is well-matched to its own cluster and far from neighboring clusters
D. The point is an outlier
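
A minimal sketch of the silhouette score from question 37; scikit-learn averages the per-point coefficients, so values near +1 indicate tight, well-separated clusters:

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import silhouette_score

    X = np.random.rand(200, 2)  # toy data

    labels = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X)
    print(silhouette_score(X, labels))  # in [-1, 1]; higher is better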

38 Which of the following scenarios is ideal for K-Means clustering?

A. Clusters are non-spherical and have irregular shapes
B. Clusters are of varying densities
C. Clusters are spherical and distinct
D. Clusters contain a lot of noise and outliers

39 In Association Rule Mining, a 'Frequent Itemset' is an itemset whose support is:

A. Greater than or equal to a minimum support threshold
B. Less than a minimum support threshold
C. Equal to 1
D. Greater than the confidence threshold

40 Which distance metric is most commonly used in K-Means?

A. Manhattan Distance
B. Euclidean Distance
C. Cosine Similarity
D. Hamming Distance

41 Which linkage method in Hierarchical Clustering aims to minimize the variance within clusters being merged (similar to K-Means)?

A. Single Linkage
B. Complete Linkage
C. Ward's Method
D. Average Linkage

42 What is a 'hard' clustering assignment?

A. A data point belongs to multiple clusters with varying probabilities
B. A data point belongs to exactly one cluster
C. The algorithm is hard to implement
D. The clustering is performed on hardware

43 In Market Basket Analysis, if {Milk, Bread} -> {Eggs} has a confidence of 0.7, it means:

A. 70% of all transactions contain Eggs
B. 70% of customers buy Milk and Bread
C. 70% of transactions containing Milk and Bread also contain Eggs
D. Eggs are bought 70% more often with Milk and Bread than expected

44 Which of the following is an application of Clustering?

A. Predicting house prices
B. Customer Segmentation
C. Classifying emails as spam or not spam
D. Predicting credit default

45 In the context of K-Means, what is a Centroid?

A. The outlier point in a cluster
B. The arithmetic mean position of all the points in the cluster
C. The point closest to the origin
D. The boundary of the cluster

46 Which of the following is true regarding the computational complexity of Hierarchical Clustering compared to K-Means for large datasets?

A. Hierarchical is generally faster
B. Hierarchical is generally slower and more memory intensive
C. They have the exact same complexity
D. Hierarchical cannot run on large datasets

47 When performing K-Means, if you initialize centroids to the same location, what happens?

A. The algorithm works perfectly
B. The algorithm converges in one step
C. The algorithm fails to generate distinct clusters
D. It automatically separates them

48 A Lift value of 0.5 suggests:

A. Positive correlation
B. Independence
C. Negative correlation (Substitutes)
D. Strong rule

49 What happens to the WCSS as the number of clusters (K) increases towards the total number of data points?

A. It increases
B. It decreases towards zero
C. It remains constant
D. It fluctuates randomly

50 Which step ensures that the K-Means algorithm converges?

A. The randomization of initial points
B. The fact that WCSS decreases or stays the same with every iteration
C. The use of the Elbow method
D. The use of Manhattan distance