Unit 6 - Practice Quiz

INT234 50 Questions

1 Which component of the prediction error results from the model's assumptions being too simple to capture the underlying structure of the data?

A. Bias
B. Variance
C. Noise
D. Irreducible Error

2 A model that fits random noise in the training data rather than the underlying relationship is said to have:

A. Low Variance
B. High Bias
C. High Bias and High Variance
D. High Variance

3 What is the relationship between model complexity and the bias-variance trade-off?

A. As complexity increases, both bias and variance decrease.
B. As complexity increases, bias decreases and variance increases.
C. As complexity increases, bias increases and variance decreases.
D. As complexity increases, both bias and variance increase.

4 Which of the following describes 'Underfitting' in the context of the bias-variance trade-off?

A. Low Bias, Low Variance
B. High Bias, High Variance
C. High Bias, Low Variance
D. Low Bias, High Variance

5 The mathematical decomposition of a model's expected total error consists of:

A. Bias^2 + Variance + Irreducible Error
B. Bias^2 + Variance
C. Bias + Variance + Irreducible Error
D. Bias + Variance

6 Which of the following errors cannot be reduced regardless of how good the model is?

A. Variance Error
B. Bias Error
C. Irreducible Error
D. Systematic Error

7 What is the primary purpose of Cross-Validation?

A. To increase the size of the dataset
B. To eliminate outliers in the data
C. To reduce the dimensionality of the data
D. To assess how the results of a statistical analysis will generalize to an independent data set

8 In K-fold cross-validation, if K equals the number of observations in the dataset (N), the method is known as:

A. Holdout Method
B. Bootstrap
C. Leave-One-Out Cross-Validation (LOOCV)
D. Stratified K-fold

9 Which of the following is a major disadvantage of Leave-One-Out Cross-Validation (LOOCV) compared to K-fold cross-validation (where K=5 or 10)?

A. It is less accurate.
B. It wastes too much training data.
C. It is computationally expensive.
D. It has higher bias.

10 In 5-fold cross-validation, what percentage of the data is used for testing in each iteration?

A. 25%
B. 10%
C. 50%
D. 20%

11 Compared to LOOCV, 10-fold cross-validation typically has:

A. Higher bias and higher variance
B. Lower bias and lower variance
C. Higher bias and lower variance
D. Lower bias and higher variance

12 What is 'Stratified' K-Fold Cross-Validation useful for?

A. Datasets with imbalanced class distributions
B. Time-series data
C. Regression problems with continuous targets
D. Reducing computational time

13 What does 'Bagging' stand for?

A. Binary Aggregating
B. Backward Aggregating
C. Bootstrap Aggregating
D. Boosted Aggregating

14 How does Bagging create different training sets?

A. By selecting only the most difficult instances
B. By splitting the data into K distinct folds
C. By sampling with replacement from the original dataset
D. By sampling without replacement from the original dataset

15 Bagging is particularly effective at reducing which component of error?

A. Bias
B. Noise
C. Variance
D. Computation time

16 In Bagging, how is the final prediction made for a regression problem?

A. Averaging
B. Weighted Voting
C. Selecting the single best model
D. Majority Voting

17 What is the 'Out-of-Bag' (OOB) error in Bagging?

A. The error on the training set
B. The error calculated using data not included in the bootstrap sample
C. The error due to missing values
D. The error calculated using an external validation set

18 Which ensemble method builds models sequentially, where each new model attempts to correct the errors of the previous one?

A. Boosting
B. Random Forests
C. Bagging
D. Cross-Validation

19 Random Forest is an extension of which technique?

A. Boosting
B. K-Means Clustering
C. Bagging
D. Linear Regression

20 In Random Forests, how are features selected for splitting a node?

A. The single best feature from the entire dataset is always chosen.
B. A random subset of features is considered at every split.
C. Features are selected based on user preference.
D. All features are considered at every split.

21 Why are Random Forests generally better than a single Decision Tree?

A. They are faster to train.
B. They reduce overfitting and variance.
C. They are easier to interpret.
D. They provide a linear decision boundary.

22 In the context of Boosting, what is a 'weak learner'?

A. A model with complex architecture
B. A model that performs slightly better than random guessing
C. A model that has 100% accuracy
D. A model with high variance

23 How does AdaBoost (Adaptive Boosting) handle misclassified instances?

A. It decreases their weights.
B. It increases their weights.
C. It keeps their weights constant.
D. It discards them.

24 Which of the following is a key difference between Bagging and Boosting?

A. Bagging uses weighted voting; Boosting uses simple averaging.
B. Bagging trains models in parallel; Boosting trains models sequentially.
C. Bagging increases bias; Boosting increases variance.
D. Bagging uses the whole dataset; Boosting uses a subset.

25 Gradient Boosting improves the model by minimizing:

A. The number of trees
B. The weights of the features
C. A loss function using gradient descent
D. The variance of the data

26 Which algorithm is most likely to overfit if the number of base estimators (iterations) is too large?

A. Boosting
B. Bagging
C. Leave-One-Out CV
D. Random Forest

27 Which hyperparameter in Random Forests controls the number of features to consider when looking for the best split?

A. max_features (mtry)
B. min_samples_leaf
C. n_estimators
D. max_depth

28 If a model has high bias, which ensemble method is most likely to improve performance?

A. Stratified Sampling
B. Bagging
C. Boosting
D. Pruning

29 If a model has high variance, which ensemble method is most likely to improve performance?

A. Gradient Descent
B. Boosting (without regularization)
C. Bagging
D. Linear Regression

30 In K-fold cross-validation, what is the trade-off when increasing K?

A. Bias increases, Variance increases, Computation time increases.
B. Bias decreases, Variance increases, Computation time increases.
C. Bias decreases, Variance decreases, Computation time decreases.
D. Bias increases, Variance decreases, Computation time decreases.

31 What is the typical base learner used in Random Forests?

A. Decision Trees
B. Neural Networks
C. Linear Regression
D. Support Vector Machines

32 Which of the following is NOT a benefit of Random Forests?

A. Provides feature importance estimates
B. Robust to outliers
C. Handles high-dimensional data well
D. Is easily interpretable visually like a single tree

33 When using Bootstrap sampling in Bagging, with each sample the same size as the original dataset, approximately what fraction of the original dataset's unique observations is included in each sample?

A. 100%
B. 63.2%
C. 33%
D. 50%

34 Which Boosting algorithm uses a learning rate parameter to shrink the contribution of each tree?

A. AdaBoost
B. Gradient Boosting
C. Random Forest
D. Bagging

35 In the bias-variance decomposition, if the test error is high and the training error is also high, the model suffers from:

A. Low Bias
B. High Bias
C. Overfitting
D. High Variance

36 Which cross-validation method involves randomly splitting the data into a training set and a test set without distinct 'folds'?

A. Holdout Method
B. Leave-One-Out CV
C. Bootstrap
D. K-Fold CV

37 What is 'Stacking' in the context of ensemble methods?

A. Using a single Deep Neural Network
B. Combining predictions from multiple different models using a meta-model
C. Adding more features to the data
D. Running Cross-Validation multiple times

38 In Random Forests, increasing the number of trees (n_estimators) typically:

A. Stabilizes the error but increases training time
B. Decreases bias significantly
C. Decreases the computational cost
D. Increases overfitting significantly

39 Which of the following describes the 'Stump' often used in AdaBoost?

A. A tree with only one split (depth = 1)
B. A linear regression model
C. A tree with full depth
D. A random forest with 10 trees

40 XGBoost is a popular implementation of which algorithm?

A. Support Vector Machine
B. K-Nearest Neighbors
C. Gradient Boosting
D. Random Forest

41 In K-fold Cross-Validation, the final performance metric is usually calculated by:

A. Averaging the scores of the K folds
B. Taking the best score among the K folds
C. Summing the scores of the K folds
D. Taking the worst score among the K folds

42 What is the primary motivation for using Cross-Validation over a simple Train/Test split?

A. It automatically tunes hyperparameters.
B. It is faster.
C. It uses less data.
D. It provides a less biased estimate of model performance on unseen data.

43 In the context of bias-variance, a very deep Decision Tree without pruning usually exhibits:

A. High Bias, High Variance
B. Low Bias, High Variance
C. Low Bias, Low Variance
D. High Bias, Low Variance

44 Why does Random Forest usually perform better than Bagging with Decision Trees?

A. It uses more trees.
B. It decorrelates the trees by restricting feature selection.
C. It does not use bootstrap sampling.
D. It uses a different loss function.

45 The process of tuning hyperparameters by evaluating a predefined set of candidate values with Cross-Validation is often called:

A. Forward Selection
B. Grid Search
C. Backpropagation
D. Bagging

46 When N is small (small dataset), which Cross-Validation method is preferred to maximize the data used for training?

A. Holdout (50/50 split)
B. 2-Fold CV
C. Leave-One-Out CV
D. Bootstrap

47 Which technique allows its base models to be trained in parallel, independently of one another?

A. Random Forest
B. AdaBoost
C. Gradient Boosting
D. Recurrent Neural Networks

48 What is the 'Learning Rate' in Boosting?

A. A parameter scaling the contribution of each tree to the final prediction
B. The percentage of data used for training
C. The depth of the trees
D. The speed at which the computer processes data

49 Which of the following is true regarding the bias-variance trade-off in K-Nearest Neighbors (KNN)?

A. K does not affect Bias or Variance.
B. Large K results in High Variance.
C. Small K results in Low Bias and High Variance.
D. Small K results in High Bias.

50 If your training error is 1% and your test error is 20%, your model is likely:

A. Underfitting
B. Overfitting
C. Perfectly balanced
D. Experiencing high bias