Unit 6 - Practice Quiz

INT423 50 Questions

1 What is the primary objective of a Recommender System?

A. To cluster similar users without predicting ratings
B. To predict user preferences for items they have not yet interacted with
C. To classify images
D. To reduce the dimensionality of a dataset

2 Which of the following best describes Content-Based Filtering?

A. It recommends items based on the preferences of similar users.
B. It recommends items similar to those a user liked in the past based on item features.
C. It relies solely on demographic data.
D. It uses matrix factorization to find latent features.

3 What is the 'Cold Start' problem in recommender systems?

A. The issue where popular items are recommended too often.
B. The time it takes to initialize the recommendation engine.
C. The system overheats due to large data processing.
D. The difficulty in recommending items to a new user or recommending a new item due to lack of history.

4 Which of the following represents 'Explicit Feedback'?

A. A user watching a video for 10 minutes.
B. A user clicking on a product link.
C. A user adding an item to a cart but not buying it.
D. A user giving a movie a 5-star rating.

5 Which of the following represents 'Implicit Feedback'?

A. Clicking on an advertisement.
B. Writing a text review.
C. Filling out a preference survey.
D. Rating a song 4 out of 5.

6 In Collaborative Filtering, what is the core underlying assumption?

A. Users who agreed in the past will tend to agree in the future.
B. Users' preferences change randomly over time.
C. The features of the items are more important than user interactions.
D. Items with similar descriptions are always rated similarly.

7 What is the purpose of Mean Normalization in collaborative filtering?

A. To convert binary labels into continuous variables.
B. To handle users who have not rated any items by treating missing ratings as the average.
C. To remove the effect of item features.
D. To increase the range of ratings.
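
For reference, mean normalization can be sketched in a few lines of Python. This is a minimal illustration with an invented toy ratings matrix, not a full implementation: missing ratings are marked `None`, each known rating has its item mean subtracted, and a user with no ratings defaults back to the item averages.

```python
# Sketch: per-item mean normalization on a toy ratings matrix.
# Rows = users, columns = items; None marks a missing rating.
ratings = [
    [5, 4, None],
    [3, None, 2],
    [None, 2, 4],
]

# Mean rating per item, ignoring missing entries.
item_means = []
for j in range(3):
    col = [row[j] for row in ratings if row[j] is not None]
    item_means.append(sum(col) / len(col))

# Normalized matrix: subtract the item mean from each known rating.
normalized = [
    [None if r is None else r - m for r, m in zip(row, item_means)]
    for row in ratings
]

# A user with no ratings has learned parameters near zero, so the
# predicted rating (0 + item mean) falls back to the item's average.
new_user_predictions = [0 + m for m in item_means]
print(item_means)  # [4.0, 3.0, 3.0]
```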

8 In a binary label system (favorites, likes, clicks), how is the target variable usually represented?

A. As a continuous value between 0 and 1.
B. As 1 for interaction (positive) and 0 for no interaction.
C. As a vector of text keywords.
D. As a discrete set {1, 2, 3, 4, 5}.

9 Which similarity measure is commonly used in Content-Based Filtering to compare document vectors?

A. Manhattan Distance
B. Hamming Distance
C. Euclidean Distance
D. Cosine Similarity
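
Cosine similarity between two document vectors can be sketched directly from its definition (the vectors below are invented for illustration):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Two toy document vectors pointing in the same direction, one orthogonal.
doc1 = [1.0, 2.0, 0.0]
doc2 = [2.0, 4.0, 0.0]   # same direction as doc1 -> similarity 1.0
doc3 = [0.0, 0.0, 3.0]   # orthogonal to doc1 -> similarity 0.0

print(cosine_similarity(doc1, doc2))  # 1.0 (within floating-point error)
print(cosine_similarity(doc1, doc3))  # 0.0
```

Because it measures direction rather than magnitude, cosine similarity is insensitive to document length, which is why it suits text vectors.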

10 What is a major disadvantage of Content-Based Filtering?

A. It tends to overspecialize and lacks serendipity (surprising recommendations).
B. It cannot handle binary data.
C. It suffers from the cold start problem for new users.
D. It requires a large number of users to find similarities.

11 User-based Collaborative Filtering involves:

A. Finding users who live in the same demographic area.
B. Finding users similar to the target user and recommending what they liked.
C. Filtering content based on keywords provided by the user.
D. Finding items similar to the item the user is viewing.

12 Item-based Collaborative Filtering involves:

A. Using a decision tree to classify items.
B. Clustering users into groups.
C. Analyzing item descriptions to find keywords.
D. Calculating the similarity between items based on user co-ratings.
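
The co-rating idea can be sketched on a toy matrix (data invented for illustration): items rated similarly by the same users come out as similar, items rated oppositely do not.

```python
import math

# Toy user-item ratings. Columns: Item A, Item B, Item C.
R = [
    [5, 5, 1],   # user 1
    [4, 4, 1],   # user 2
    [1, 2, 5],   # user 3
]

def column(j):
    return [row[j] for row in R]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

# A and B are co-rated similarly by the same users -> high similarity;
# A and C are rated oppositely -> low similarity.
sim_ab = cosine(column(0), column(1))
sim_ac = cosine(column(0), column(2))
print(sim_ab > sim_ac)  # True
```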

13 What is 'Matrix Factorization' in the context of recommender systems?

A. A way to normalize the mean of the matrix.
B. A method to sort the matrix by top-rated items.
C. A technique to decompose the user-item interaction matrix into lower-dimensional latent factor matrices.
D. A method to multiply two matrices to get the final ratings.

14 When using binary labels like 'clicks', which issue is most prominent compared to explicit ratings?

A. High computational cost
B. Ambiguity of negative feedback (missing data vs. dislike)
C. Lack of user identification
D. Data Scarcity

15 Which of the following is a key advantage of Hybrid Recommender Systems?

A. They are computationally cheaper than simple algorithms.
B. They can overcome limitations of individual approaches like the cold start problem.
C. They only require implicit feedback.
D. They eliminate the need for data collection.

16 In a Recommender System, 'Serendipity' refers to:

A. The accuracy of the prediction.
B. The consistency of recommendations over time.
C. The ability to recommend items that are relevant but surprising to the user.
D. The speed of the recommendation engine.

17 If a user has rated Item A (5 stars) and Item B (5 stars), and a second user rated Item A (5 stars), Item-Based CF would likely predict:

A. The second user will give Item B a high rating.
B. No prediction is possible.
C. The second user is a bot.
D. The second user will dislike Item B.

18 Which technique is best suited for a system where users rarely rate items but generate many search queries?

A. User-based Collaborative Filtering
B. Content-Based Filtering
C. Matrix Factorization on explicit ratings
D. Demographic Filtering

19 What is the 'Long Tail' phenomenon in recommender systems?

A. The system requires long user IDs.
B. The algorithm takes a long time to converge.
C. A small number of popular items generate most interactions, while many niche items have few interactions.
D. Recommendations are presented in a long list.

20 How does Mean Normalization help with bias?

A. It removes users who always give 1-star ratings.
B. It converts all ratings to positive integers.
C. It adjusts for users who are consistently harsh or generous in their ratings.
D. It ignores the item bias.

21 Which algorithm is commonly used to train Matrix Factorization models?

A. K-Means Clustering
B. Alternating Least Squares (ALS)
C. Decision Trees
D. Apriori Algorithm

22 A Weighted Hybrid Recommender System works by:

A. Combining the scores of different recommendation techniques with specific weights.
B. Using content filtering only when collaborative filtering fails.
C. Running algorithms in a sequence where one refines the other.
D. Selecting one algorithm randomly.
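
A weighted hybrid can be sketched as a fixed linear blend of two recommenders' scores (scores and weights below are invented for illustration; in practice the weights would be tuned on validation data):

```python
# Sketch of a weighted hybrid: blend scores from two recommenders.
content_scores = {"itemA": 0.9, "itemB": 0.2, "itemC": 0.5}
collab_scores  = {"itemA": 0.4, "itemB": 0.8, "itemC": 0.6}

W_CONTENT, W_COLLAB = 0.3, 0.7  # fixed blending weights (assumed values)

hybrid = {
    item: W_CONTENT * content_scores[item] + W_COLLAB * collab_scores[item]
    for item in content_scores
}

# Rank items by the blended score.
ranking = sorted(hybrid, key=hybrid.get, reverse=True)
print(ranking)
```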

23 What is a 'Switching' Hybrid System?

A. It switches the user interface based on preferences.
B. It chooses a recommendation technique based on the current situation (e.g., data availability).
C. It switches between positive and negative ratings.
D. It swaps the item ID with the user ID.

24 In the context of binary labels, what is 'Confidence' often associated with?

A. The probability that the user is a human.
B. The percentage of items rated.
C. The confidence interval of the error.
D. The strength of the interaction (e.g., frequency of clicks or duration of view).

25 What is the primary input for a Content-Based Filtering algorithm?

A. Demographic data of all users.
B. A User-Item Rating Matrix.
C. Social Network Graphs.
D. Item Profiles (Features) and User Profiles.

26 Collaborative Filtering generally outperforms Content-Based Filtering in which scenario?

A. When identifying cross-genre or complex patterns that are hard to feature-engineer.
B. When there are no user ratings available.
C. When items have rich, structured metadata.
D. When recommending to a brand new user.

27 Which metric is commonly used to evaluate a Recommender System utilizing explicit ratings?

A. Accuracy
B. Root Mean Squared Error (RMSE)
C. F1-Score
D. Jaccard Index
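
RMSE on explicit ratings follows directly from its formula (held-out ratings below are invented for illustration):

```python
import math

def rmse(actual, predicted):
    """Root Mean Squared Error between actual and predicted ratings."""
    errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(errors) / len(errors))

# Toy held-out ratings vs. model predictions.
actual    = [5, 3, 4, 1]
predicted = [4, 3, 5, 2]

print(rmse(actual, predicted))  # sqrt(0.75) ~ 0.866
```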

28 What is the 'Grey Sheep' problem?

A. Users who only rate popular items.
B. Items that are black and white images.
C. The problem of duplicate accounts.
D. Users whose opinions do not consistently agree or disagree with any group of people.

29 In a user-item matrix used for CF, what does 'Sparsity' refer to?

A. The matrix is small in size.
B. The matrix has low rank.
C. Most entries in the matrix are empty (unknown ratings).
D. The ratings are all low numbers.

30 Which feature engineering technique is essential for Content-Based filtering of text documents?

A. Min-Max Scaling of IDs
B. Fourier Transform
C. Pixel normalization
D. TF-IDF (Term Frequency-Inverse Document Frequency)
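
A minimal TF-IDF sketch over an invented toy corpus shows why the weighting matters: a word that appears in every document gets an IDF of zero and thus carries no discriminating information.

```python
import math

# Toy corpus (invented for illustration), pre-tokenized by whitespace.
corpus = [
    "action movie with car chases".split(),
    "romantic movie with music".split(),
    "documentary movie about music".split(),
]

def tf(word, doc):
    """Term frequency: share of the document made up by this word."""
    return doc.count(word) / len(doc)

def idf(word):
    """Inverse document frequency: rarer words get larger weights."""
    df = sum(1 for doc in corpus if word in doc)
    return math.log(len(corpus) / df)

# "movie" appears in every document, so its IDF is 0;
# "action" appears in only one and gets a substantial weight.
print(idf("movie"))                             # 0.0
print(tf("action", corpus[0]) * idf("action"))  # > 0
```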

31 Latent factors in Matrix Factorization usually represent:

A. The user's age and location.
B. Hidden characteristics inferred from data patterns.
C. The timestamp of the rating.
D. Explicit categories like 'Action' or 'Comedy'.

32 If a dataset consists only of 'Purchase' vs 'Non-Purchase', which type of filtering is applied?

A. Explicit Rating CF
B. Sentiment Analysis
C. Implicit Feedback CF
D. Regression Analysis

33 What is the effect of applying Mean Normalization (centering) before computing Cosine Similarity?

A. It ensures all vectors have unit length.
B. It transforms the cosine similarity into Pearson Correlation Coefficient.
C. It removes the user ID from the calculation.
D. It speeds up the computation.
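
The equivalence can be checked numerically: take the plain cosine of two mean-centered rating vectors and compare it with the Pearson correlation computed from its covariance definition (toy vectors invented for illustration).

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

def pearson(a, b):
    """Pearson correlation computed directly from the covariance formula."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

# Mean-center two toy rating vectors, then take plain cosine similarity:
# the result equals the Pearson correlation of the original vectors.
u = [5, 3, 4, 4]
v = [3, 1, 2, 3]
mu, mv = sum(u) / len(u), sum(v) / len(v)
centered_cosine = cosine([x - mu for x in u], [y - mv for y in v])
print(centered_cosine, pearson(u, v))  # identical values
```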

34 Which of the following is a limitation of Collaborative Filtering?

A. It cannot recommend items if no one else has rated them (New Item problem).
B. It is strictly rule-based.
C. It yields recommendations that are too obvious.
D. It requires domain knowledge to engineer features.

35 In a Cascade Hybrid System:

A. All recommenders run in parallel.
B. One recommender refines the recommendations given by another.
C. The weights of recommenders change dynamically.
D. The system cascades into a random selection.

36 Which strategy helps in solving the Cold Start problem for a new user?

A. Asking the user to select preferred genres during onboarding.
B. Applying Matrix Factorization immediately.
C. Waiting for the user to rate 100 items.
D. Using Item-Based Collaborative Filtering.

37 What is the primary advantage of Model-Based CF over Memory-Based CF?

A. It does not require training.
B. It is easier to implement.
C. It handles sparsity better and offers faster prediction times.
D. It gives exact results based on neighbors.

38 What does the 'Banana' problem refer to in Recommender Systems?

A. Recommending items like bananas which are bought frequently but don't indicate distinct taste.
B. Users buying bananas only once.
C. A coding error in Python.
D. The shape of the loss function.

39 Which loss function is minimized in standard Matrix Factorization for explicit ratings?

A. Cross-Entropy Loss
B. Log-Likelihood
C. Squared Error (between actual and predicted rating) + Regularization
D. Hinge Loss
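
The squared-error-plus-regularization objective can be sketched with a few SGD updates on a toy problem (all data, dimensions, and hyperparameters invented for illustration; this is a minimal sketch, not a production trainer such as ALS):

```python
import random

random.seed(0)

# Observed (user, item, rating) triples from a toy 3x3 matrix.
observed = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]
K, LR, REG = 2, 0.05, 0.02  # latent dims, learning rate, L2 strength

P = [[random.uniform(-0.1, 0.1) for _ in range(K)] for _ in range(3)]  # users
Q = [[random.uniform(-0.1, 0.1) for _ in range(K)] for _ in range(3)]  # items

def loss():
    """Squared error over observed ratings + L2 penalty on both factors."""
    sq = sum((r - sum(P[u][k] * Q[i][k] for k in range(K))) ** 2
             for u, i, r in observed)
    l2 = REG * (sum(x * x for row in P for x in row) +
                sum(x * x for row in Q for x in row))
    return sq + l2

before = loss()
for _ in range(200):
    for u, i, r in observed:
        err = r - sum(P[u][k] * Q[i][k] for k in range(K))
        for k in range(K):
            pu, qi = P[u][k], Q[i][k]
            P[u][k] += LR * (err * qi - REG * pu)
            Q[i][k] += LR * (err * pu - REG * qi)

print(loss() < before)  # loss decreases after training
```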

40 How does a 'Demographic-based' recommender work?

A. It uses only purchase history.
B. It uses the age, gender, and location of users to find similar groups.
C. It uses satellite imagery.
D. It uses the text content of reviews.

41 Which of the following is an example of a use case for Association Rule Learning in recommendations?

A. Face recognition.
B. Predicting the rating of a movie.
C. Determining if an email is spam.
D. Market Basket Analysis (e.g., 'Frequently Bought Together').

42 Binary cross-entropy is a suitable loss function when:

A. Predicting a probability of interaction (click/no-click).
B. Clustering users.
C. Predicting the price of a house.
D. Predicting a star rating from 1 to 5.
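
Binary cross-entropy over click/no-click labels can be sketched straight from its formula (labels and probabilities invented for illustration):

```python
import math

def binary_cross_entropy(y_true, y_prob):
    """Average of -[y*log(p) + (1-y)*log(1-p)] over all examples."""
    eps = 1e-12  # guard against log(0)
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Toy click labels and predicted click probabilities.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.1, 0.8, 0.3]
print(binary_cross_entropy(y_true, y_prob))
```

Confident, correct probabilities yield a low loss; an uninformative prediction of 0.5 everywhere yields log(2) per example.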

43 What is the scalability challenge in User-Based Collaborative Filtering (Memory-Based)?

A. It cannot handle text data.
B. Computing similarity between millions of users in real-time is computationally expensive.
C. It only works with binary data.
D. It requires too much hard drive space.

44 In the context of Mean Normalization, if a user has not rated any movies, what prediction does the algorithm default to?

A. Zero.
B. The average rating of the specific movie by other users.
C. The maximum possible rating.
D. A random number.

45 Unlike Content-Based Filtering, Collaborative Filtering does not rely on:

A. Using matrices.
B. Analyzing the internal attributes or content of the item.
C. Using machine learning.
D. Predicting future behavior.

46 Regularization is added to the cost function in Matrix Factorization to:

A. Increase the number of latent features.
B. Prevent overfitting by penalizing large values in the feature matrices.
C. Ensure all ratings are positive.
D. Make the code run faster.

47 Which of these is a binary label?

A. Review Sentiment: Positive (0.8 score)
B. View Duration: 120 seconds
C. Rating: 4.5 stars
D. Favorite: Yes

48 Which of the following methods requires domain knowledge for feature extraction?

A. User-User KNN
B. Matrix Factorization
C. Content-Based Filtering
D. Collaborative Filtering

49 Precision@k is a metric used to evaluate:

A. The exact numerical rating accuracy.
B. The number of users who rated k items.
C. The time taken to generate k recommendations.
D. The proportion of recommended items in the top-k set that are relevant.
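
Precision@k can be sketched with a ranked recommendation list and a ground-truth relevant set (both invented for illustration):

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k

# Toy ranked recommendations and ground-truth relevant items.
recommended = ["A", "B", "C", "D", "E"]
relevant = {"A", "C", "E", "F"}

print(precision_at_k(recommended, relevant, 3))  # 2/3: A and C are relevant
```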

50 In a Hybrid system, 'Feature Augmentation' refers to:

A. Increasing the font size of recommendations.
B. Adding more RAM to the server.
C. Adding random noise to the data.
D. Using the output of one recommender as a feature input for another.