Unit 6 - Practice Quiz

INT423

1 What is the primary objective of a Recommender System?

A. To classify images
B. To predict user preferences for items they have not yet interacted with
C. To cluster similar users without predicting ratings
D. To reduce the dimensionality of a dataset

2 Which of the following best describes Content-Based Filtering?

A. It recommends items based on the preferences of similar users.
B. It recommends items similar to those a user liked in the past based on item features.
C. It uses matrix factorization to find latent features.
D. It relies solely on demographic data.

3 What is the 'Cold Start' problem in recommender systems?

A. The system overheats due to large data processing.
B. The difficulty in recommending items to a new user or recommending a new item due to lack of history.
C. The time it takes to initialize the recommendation engine.
D. The issue where popular items are recommended too often.

4 Which of the following represents 'Explicit Feedback'?

A. A user clicking on a product link.
B. A user watching a video for 10 minutes.
C. A user giving a movie a 5-star rating.
D. A user adding an item to a cart but not buying it.

5 Which of the following represents 'Implicit Feedback'?

A. Writing a text review.
B. Rating a song 4 out of 5.
C. Clicking on an advertisement.
D. Filling out a preference survey.

6 In Collaborative Filtering, what is the core underlying assumption?

A. Users who agreed in the past will tend to agree in the future.
B. Items with similar descriptions are always rated similarly.
C. Users' preferences change randomly over time.
D. The features of the items are more important than user interactions.

7 What is the purpose of Mean Normalization in collaborative filtering?

A. To increase the range of ratings.
B. To give sensible predictions for users who have not rated any items by falling back to the average rating.
C. To convert binary labels into continuous variables.
D. To remove the effect of item features.

8 In a binary label system (favorites, likes, clicks), how is the target variable usually represented?

A. As a continuous value between 0 and 1.
B. As a discrete set {1, 2, 3, 4, 5}.
C. As 1 for interaction (positive) and 0 for no interaction.
D. As a vector of text keywords.

9 Which similarity measure is commonly used in Content-Based Filtering to compare document vectors?

A. Euclidean Distance
B. Cosine Similarity
C. Manhattan Distance
D. Hamming Distance

10 What is a major disadvantage of Content-Based Filtering?

A. It cannot recommend new items that no user has rated yet.
B. It requires a large number of users to find similarities.
C. It tends to overspecialize and lacks serendipity (surprising recommendations).
D. It cannot handle binary data.

11 User-based Collaborative Filtering involves:

A. Finding items similar to the item the user is viewing.
B. Finding users similar to the target user and recommending what they liked.
C. Finding users who live in the same demographic area.
D. Filtering content based on keywords provided by the user.

12 Item-based Collaborative Filtering involves:

A. Analyzing item descriptions to find keywords.
B. Calculating the similarity between items based on user co-ratings.
C. Clustering users into groups.
D. Using a decision tree to classify items.

13 What is 'Matrix Factorization' in the context of recommender systems?

A. A method to multiply two matrices to get the final ratings.
B. A technique to decompose the user-item interaction matrix into lower-dimensional latent factor matrices.
C. A way to normalize the mean of the matrix.
D. A method to sort the matrix by top-rated items.

14 When using binary labels like 'clicks', which issue is most prominent compared to explicit ratings?

A. Data Scarcity
B. Ambiguity of negative feedback (missing data vs. dislike)
C. High computational cost
D. Lack of user identification

15 Which of the following is a key advantage of Hybrid Recommender Systems?

A. They are computationally cheaper than simple algorithms.
B. They eliminate the need for data collection.
C. They can overcome limitations of individual approaches like the cold start problem.
D. They only require implicit feedback.

16 In a Recommender System, 'Serendipity' refers to:

A. The accuracy of the prediction.
B. The speed of the recommendation engine.
C. The ability to recommend items that are relevant but surprising to the user.
D. The consistency of recommendations over time.

17 If a user has rated Item A (5 stars) and Item B (5 stars), and a second user rated Item A (5 stars), Item-Based CF would likely predict:

A. The second user will dislike Item B.
B. The second user will give Item B a high rating.
C. The second user is a bot.
D. No prediction is possible.

18 Which technique is best suited for a system where users rarely rate items but generate many search queries?

A. User-based Collaborative Filtering
B. Content-Based Filtering
C. Matrix Factorization on explicit ratings
D. Demographic Filtering

19 What is the 'Long Tail' phenomenon in recommender systems?

A. The algorithm takes a long time to converge.
B. A small number of popular items generate most interactions, while many niche items have few interactions.
C. The system requires long user IDs.
D. Recommendations are presented in a long list.

20 How does Mean Normalization help with bias?

A. It removes users who always give 1-star ratings.
B. It adjusts for users who are consistently harsh or generous in their ratings.
C. It ignores the item bias.
D. It converts all ratings to positive integers.
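
For review, here is a minimal sketch of mean normalization, assuming the per-user variant in which each user's average is subtracted from that user's known ratings. The function name `mean_normalize` and the sample data are hypothetical and do not come from the course material.

```python
def mean_normalize(ratings):
    """Subtract each user's mean rating from that user's known ratings.

    `ratings` maps user -> {item: rating}; unrated items are simply absent.
    Returns (centered_ratings, user_means).
    """
    centered, means = {}, {}
    for user, items in ratings.items():
        # A user with no ratings gets a mean of 0.0 (an assumption of this sketch).
        mean = sum(items.values()) / len(items) if items else 0.0
        means[user] = mean
        centered[user] = {item: r - mean for item, r in items.items()}
    return centered, means


# Hypothetical data: two users with the same relative taste but different scales.
ratings = {
    "harsh": {"A": 1, "B": 2, "C": 3},
    "generous": {"A": 3, "B": 4, "C": 5},
}
centered, means = mean_normalize(ratings)
print(means)
print(centered)
```

After centering, both users have identical rating vectors even though their raw scores differ by two stars on every item.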

21 Which algorithm is commonly used to train Matrix Factorization models?

A. K-Means Clustering
B. Alternating Least Squares (ALS)
C. Decision Trees
D. Apriori Algorithm

22 A Weighted Hybrid Recommender System works by:

A. Selecting one algorithm randomly.
B. Combining the scores of different recommendation techniques with specific weights.
C. Running algorithms in a sequence where one refines the other.
D. Using content filtering only when collaborative filtering fails.
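
For review, a weighted combination of recommender scores can be sketched as follows. The function name `weighted_hybrid`, the scores, and the weights are illustrative assumptions, not values from the course.

```python
def weighted_hybrid(score_lists, weights):
    """Combine per-item scores from several recommenders using fixed weights.

    `score_lists` is a list of {item: score} dicts, one per recommender.
    Returns the items sorted from highest to lowest combined score.
    """
    combined = {}
    for scores, weight in zip(score_lists, weights):
        for item, score in scores.items():
            combined[item] = combined.get(item, 0.0) + weight * score
    return sorted(combined, key=combined.get, reverse=True)


# Hypothetical scores from a content-based and a collaborative recommender.
content_scores = {"A": 0.9, "B": 0.2, "C": 0.5}
collab_scores = {"A": 0.1, "B": 0.8, "C": 0.6}
ranking = weighted_hybrid([content_scores, collab_scores], [0.3, 0.7])
print(ranking)
```

With these weights the collaborative scores dominate, so item B ranks first even though the content-based recommender scored it lowest.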

23 What is a 'Switching' Hybrid System?

A. It switches the user interface based on preferences.
B. It swaps the item ID with the user ID.
C. It chooses a recommendation technique based on the current situation (e.g., data availability).
D. It switches between positive and negative ratings.

24 In the context of binary labels, what is 'Confidence' often associated with?

A. The probability that the user is a human.
B. The strength of the interaction (e.g., frequency of clicks or duration of view).
C. The confidence interval of the error.
D. The percentage of items rated.

25 What is the primary input for a Content-Based Filtering algorithm?

A. A User-Item Rating Matrix.
B. Item Profiles (Features) and User Profiles.
C. Social Network Graphs.
D. Demographic data of all users.

26 Collaborative Filtering generally outperforms Content-Based Filtering in which scenario?

A. When items have rich, structured metadata.
B. When identifying cross-genre or complex patterns that are hard to feature-engineer.
C. When there are no user ratings available.
D. When recommending to a brand new user.

27 Which metric is commonly used to evaluate a Recommender System that predicts explicit ratings?

A. Accuracy
B. Root Mean Squared Error (RMSE)
C. F1-Score
D. Jaccard Index

28 What is the 'Grey Sheep' problem?

A. Users whose opinions do not consistently agree or disagree with any group of people.
B. Items that are black and white images.
C. Users who only rate popular items.
D. The problem of duplicate accounts.

29 In a user-item matrix used for CF, what does 'Sparsity' refer to?

A. The matrix has low rank.
B. Most entries in the matrix are empty (unknown ratings).
C. The matrix is small in size.
D. The ratings are all low numbers.

30 Which feature engineering technique is essential for Content-Based filtering of text documents?

A. Pixel normalization
B. TF-IDF (Term Frequency-Inverse Document Frequency)
C. Fourier Transform
D. Min-Max Scaling of IDs

31 Latent factors in Matrix Factorization usually represent:

A. Explicit categories like 'Action' or 'Comedy'.
B. Hidden characteristics inferred from data patterns.
C. The timestamp of the rating.
D. The user's age and location.

32 If a dataset records only 'Purchase' vs 'Non-Purchase', which approach is most appropriate?

A. Explicit Rating CF
B. Implicit Feedback CF
C. Sentiment Analysis
D. Regression Analysis

33 What is the effect of mean-centering rating vectors before computing Cosine Similarity?

A. It ensures all vectors have unit length.
B. It makes the cosine similarity equal to the Pearson Correlation Coefficient.
C. It removes the user ID from the calculation.
D. It speeds up the computation.
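
For review, the relationship between cosine similarity and Pearson correlation can be checked numerically. This sketch assumes both users rated the same set of items. The helper names and the rating vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def center(v):
    """Subtract the vector's own mean from every entry."""
    mean = sum(v) / len(v)
    return [x - mean for x in v]


def pearson(u, v):
    """Pearson correlation computed from covariance and variances."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


# Hypothetical ratings by two users on the same four items.
u = [5, 4, 1, 2]
v = [4, 5, 2, 1]
raw_cosine = cosine(u, v)
centered_cosine = cosine(center(u), center(v))
print(raw_cosine, centered_cosine, pearson(u, v))
```

The raw cosine and the centered cosine differ, while the centered cosine matches the Pearson value up to floating-point rounding.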

34 Which of the following is a limitation of Collaborative Filtering?

A. It requires domain knowledge to engineer features.
B. It cannot recommend items if no one else has rated them (New Item problem).
C. It yields recommendations that are too obvious.
D. It is strictly rule-based.

35 In a Cascade Hybrid System:

A. All recommenders run in parallel.
B. One recommender refines the recommendations given by another.
C. The system cascades into a random selection.
D. The weights of recommenders change dynamically.

36 Which strategy helps in solving the Cold Start problem for a new user?

A. Waiting for the user to rate 100 items.
B. Asking the user to select preferred genres during onboarding.
C. Using Item-Based Collaborative Filtering.
D. Applying Matrix Factorization immediately.

37 What is the primary advantage of Model-Based CF over Memory-Based CF?

A. It is easier to implement.
B. It handles sparsity better and offers faster prediction times.
C. It does not require training.
D. It gives exact results based on neighbors.

38 What does the 'Banana' problem refer to in Recommender Systems?

A. Users buying bananas only once.
B. Recommending items like bananas which are bought frequently but don't indicate distinct taste.
C. The shape of the loss function.
D. A coding error in Python.

39 Which loss function is minimized in standard Matrix Factorization for explicit ratings?

A. Cross-Entropy Loss
B. Squared Error (between actual and predicted rating) + Regularization
C. Hinge Loss
D. Log-Likelihood
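
For review, one common form of a matrix factorization objective for explicit ratings is a sum of squared errors over observed ratings plus an L2 penalty on the factor vectors. The sketch below assumes that form. The name `mf_cost`, the penalty weight `lam`, and the toy factors are illustrative assumptions.

```python
def mf_cost(ratings, user_factors, item_factors, lam):
    """Squared error over observed ratings plus an L2 penalty on all factors.

    `ratings` maps (user, item) -> rating, and only observed pairs are present.
    `user_factors` and `item_factors` map ids to equal-length latent vectors.
    """
    squared_error = 0.0
    for (user, item), rating in ratings.items():
        # Predicted rating is the dot product of the two latent vectors.
        predicted = sum(p * q for p, q in zip(user_factors[user], item_factors[item]))
        squared_error += (rating - predicted) ** 2
    penalty = sum(x * x for vec in user_factors.values() for x in vec)
    penalty += sum(x * x for vec in item_factors.values() for x in vec)
    return squared_error + lam * penalty


# Hypothetical toy example with one observed rating and two latent factors.
ratings = {("u1", "i1"): 5.0}
user_factors = {"u1": [1.0, 2.0]}
item_factors = {"i1": [1.0, 2.0]}
cost = mf_cost(ratings, user_factors, item_factors, lam=0.1)
print(cost)
```

Here the prediction matches the rating exactly, so the whole cost comes from the penalty term.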

40 How does a 'Demographic-based' recommender work?

A. It uses the age, gender, and location of users to find similar groups.
B. It uses satellite imagery.
C. It uses the text content of reviews.
D. It uses only purchase history.

41 Which of the following is an example of a use case for Association Rule Learning in recommendations?

A. Predicting the rating of a movie.
B. Determining if an email is spam.
C. Market Basket Analysis (e.g., 'Frequently Bought Together').
D. Face recognition.

42 Binary cross-entropy is a suitable loss function when:

A. Predicting a star rating from 1 to 5.
B. Predicting the price of a house.
C. Predicting a probability of interaction (click/no-click).
D. Clustering users.
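
For review, binary cross-entropy can be sketched as follows. The function name, the clipping constant `eps`, and the sample labels and probabilities are assumptions made for illustration.

```python
import math


def binary_cross_entropy(labels, probs, eps=1e-12):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        # Clip the probability to avoid taking log(0).
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


# Hypothetical click labels with two sets of predicted probabilities.
labels = [1, 0, 1]
good_loss = binary_cross_entropy(labels, [0.9, 0.1, 0.8])
bad_loss = binary_cross_entropy(labels, [0.1, 0.9, 0.2])
print(good_loss, bad_loss)
```

Predictions that assign high probability to the observed outcome give a lower loss than predictions that assign it low probability.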

43 What is the scalability challenge in User-Based Collaborative Filtering (Memory-Based)?

A. It requires too much hard drive space.
B. Computing similarity between millions of users in real-time is computationally expensive.
C. It cannot handle text data.
D. It only works with binary data.

44 In the context of Mean Normalization, if a user has not rated any movies, what prediction does the algorithm default to?

A. Zero.
B. The average rating of the specific movie by other users.
C. A random number.
D. The maximum possible rating.

45 Collaborative Filtering is distinct from which of the following?

A. Using machine learning.
B. Analyzing the internal attributes or content of the item.
C. Predicting future behavior.
D. Using matrices.

46 Regularization is added to the cost function in Matrix Factorization to:

A. Make the code run faster.
B. Prevent overfitting by penalizing large values in the feature matrices.
C. Increase the number of latent features.
D. Ensure all ratings are positive.

47 Which of these is a binary label?

A. Rating: 4.5 stars
B. View Duration: 120 seconds
C. Favorite: Yes
D. Review Sentiment: Positive (0.8 score)

48 Comparison: Which method requires domain knowledge for feature extraction?

A. Collaborative Filtering
B. Content-Based Filtering
C. Matrix Factorization
D. User-User KNN

49 Precision@k is a metric used to evaluate:

A. The exact numerical rating accuracy.
B. The proportion of recommended items in the top-k set that are relevant.
C. The time taken to generate k recommendations.
D. The number of users who rated k items.
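
For review, Precision@k can be sketched as follows. This sketch assumes the convention of dividing the hit count by k and assumes k is a positive integer. The function name and the sample lists are hypothetical.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that appear in the relevant set."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k


# Hypothetical ranked recommendations and ground-truth relevant items.
recommended = ["a", "b", "c", "d", "e"]
relevant = {"a", "c", "f"}
p_at_3 = precision_at_k(recommended, relevant, k=3)
print(p_at_3)
```

Two of the top three recommendations ("a" and "c") are relevant, so the result is 2/3.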

50 In a Hybrid system, 'Feature Augmentation' refers to:

A. Adding more RAM to the server.
B. Using the output of one recommender as a feature input for another.
C. Increasing the font size of recommendations.
D. Adding random noise to the data.