Unit 6 - Practice Quiz

INT234 50 Questions

1 Which component of the prediction error results from the model's assumptions being too simple to capture the underlying structure of the data?

A. Variance
B. Bias
C. Noise
D. Irreducible Error

2 A model that captures random noise in the training data rather than the intended outputs is said to have:

A. High Bias
B. High Bias and High Variance
C. High Variance
D. Low Variance

3 What is the relationship between model complexity and the bias-variance trade-off?

A. As complexity increases, both bias and variance decrease.
B. As complexity increases, both bias and variance increase.
C. As complexity increases, bias increases and variance decreases.
D. As complexity increases, bias decreases and variance increases.

4 Which of the following describes 'Underfitting' in the context of the bias-variance trade-off?

A. Low Bias, Low Variance
B. Low Bias, High Variance
C. High Bias, High Variance
D. High Bias, Low Variance

5 The mathematical decomposition of a model's total expected error consists of:

A. Bias^2 + Variance + Irreducible Error
B. Bias + Variance
C. Bias + Variance + Irreducible Error
D. Bias^2 + Variance
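
The decomposition asked about above can be written explicitly; a standard form for the expected squared error at a point x is:

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathrm{Bias}[\hat{f}(x)]\big)^2}_{\text{Bias}^2}
  + \underbrace{\mathrm{Var}[\hat{f}(x)]}_{\text{Variance}}
  + \underbrace{\sigma^2}_{\text{Irreducible Error}}
```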

6 Which of the following errors cannot be reduced regardless of how good the model is?

A. Variance Error
B. Systematic Error
C. Irreducible Error
D. Bias Error

7 What is the primary purpose of Cross-Validation?

A. To assess how the results of a statistical analysis will generalize to an independent data set
B. To eliminate outliers in the data
C. To increase the size of the dataset
D. To reduce the dimensionality of the data

8 In K-fold cross-validation, if K equals the number of observations in the dataset (N), this method is known as:

A. Stratified K-fold
B. Bootstrap
C. Leave-One-Out Cross-Validation (LOOCV)
D. Holdout Method
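
A minimal sketch (pure Python, illustrative only) of how LOOCV generates its folds: with K = N, each fold's test set is exactly one observation and the model trains on the remaining N - 1.

```python
# Leave-One-Out CV fold generation: K = N, so there are N folds and
# each test set contains a single observation.
def loocv_folds(n):
    folds = []
    for i in range(n):
        test = [i]
        train = [j for j in range(n) if j != i]
        folds.append((train, test))
    return folds

folds = loocv_folds(5)
# 5 folds; each has 1 test observation and 4 training observations
```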

9 Which of the following is a major disadvantage of Leave-One-Out Cross-Validation (LOOCV) compared to K-fold cross-validation (where K=5 or 10)?

A. It is computationally expensive.
B. It wastes too much training data.
C. It is less accurate.
D. It has higher bias.

10 In 5-fold cross-validation, what percentage of the data is used for testing in each iteration?

A. 50%
B. 10%
C. 25%
D. 20%
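
A quick sketch of the arithmetic behind K-fold test-set sizes (pure Python, illustrative only): with K = 5, each iteration holds out N/5 of the data, i.e. 20%.

```python
# Size of each test fold when N observations are split into K folds.
def kfold_sizes(n, k):
    base, extra = divmod(n, k)
    return [base + (1 if i < extra else 0) for i in range(k)]

sizes = kfold_sizes(100, 5)   # each of the 5 test folds holds 20 observations
share = sizes[0] / 100        # fraction of data used for testing per iteration
```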

11 Compared to LOOCV, 10-fold cross-validation typically has:

A. Higher bias and lower variance
B. Higher bias and higher variance
C. Lower bias and lower variance
D. Lower bias and higher variance

12 What is 'Stratified' K-Fold Cross-Validation useful for?

A. Datasets with imbalanced class distributions
B. Regression problems with continuous targets
C. Time-series data
D. Reducing computational time
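
A minimal sketch of the stratification idea (pure Python, illustrative only, not a library implementation): distribute each class's indices round-robin across the folds so every fold preserves the overall class proportions.

```python
from collections import defaultdict

# Assign indices to folds class by class, round-robin, so each fold
# keeps roughly the same class ratio as the full dataset.
def stratified_folds(labels, k):
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds

labels = [0] * 8 + [1] * 2          # imbalanced: 80% class 0, 20% class 1
folds = stratified_folds(labels, 2)
# each fold receives 4 of class 0 and 1 of class 1, preserving 80/20
```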

13 What does 'Bagging' stand for?

A. Bootstrap Aggregating
B. Binary Aggregating
C. Boosted Aggregating
D. Backward Aggregating

14 How does Bagging create different training sets?

A. By sampling with replacement from the original dataset
B. By selecting only the most difficult instances
C. By splitting the data into K distinct folds
D. By sampling without replacement from the original dataset
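
A small sketch of bootstrap sampling (pure Python, illustrative only): drawing N observations *with replacement* means some originals appear more than once while others are left out entirely.

```python
import random

random.seed(0)                                  # fixed seed for reproducibility
data = list(range(100))
bootstrap = random.choices(data, k=len(data))   # sampling WITH replacement
out_of_bag = set(data) - set(bootstrap)         # observations never drawn
```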

15 Bagging is particularly effective at reducing which component of error?

A. Noise
B. Variance
C. Bias
D. Computation time

16 In Bagging, how is the final prediction made for a regression problem?

A. Majority Voting
B. Averaging
C. Weighted Voting
D. Selecting the single best model
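
A tiny sketch of bagging's aggregation step for regression (illustrative only): the ensemble prediction is the plain average of the base models' predictions for the same input.

```python
# Hypothetical outputs of three base regressors for one input.
predictions = [2.0, 3.0, 4.0]

# Bagging for regression averages the base predictions.
ensemble_prediction = sum(predictions) / len(predictions)
```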

17 What is the 'Out-of-Bag' (OOB) error in Bagging?

A. The error calculated using an external validation set
B. The error on the training set
C. The error calculated using data not included in the bootstrap sample
D. The error due to missing values

18 Which ensemble method builds models sequentially, where each new model attempts to correct the errors of the previous one?

A. Bagging
B. Cross-Validation
C. Boosting
D. Random Forests

19 Random Forest is an extension of which technique?

A. Boosting
B. Bagging
C. K-Means Clustering
D. Linear Regression

20 In Random Forests, how are features selected for splitting a node?

A. All features are considered at every split.
B. The single best feature from the entire dataset is always chosen.
C. A random subset of features is considered at every split.
D. Features are selected based on user preference.

21 Why are Random Forests generally better than a single Decision Tree?

A. They reduce overfitting and variance.
B. They provide a linear decision boundary.
C. They are easier to interpret.
D. They are faster to train.

22 In the context of Boosting, what is a 'weak learner'?

A. A model that has 100% accuracy
B. A model that performs slightly better than random guessing
C. A model with high variance
D. A model with complex architecture

23 How does AdaBoost (Adaptive Boosting) handle misclassified instances?

A. It discards them.
B. It keeps their weights constant.
C. It increases their weights.
D. It decreases their weights.
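
A sketch of the AdaBoost weight update in a simplified form (pure Python, illustrative only; `alpha` is an example value, not computed from an error rate here): misclassified points are up-weighted by e^alpha, correctly classified ones down-weighted by e^-alpha, then the weights are renormalized.

```python
import math

weights = [0.25, 0.25, 0.25, 0.25]
misclassified = [False, False, True, False]
alpha = 0.5                               # example vote weight for the learner

updated = [w * math.exp(alpha if miss else -alpha)
           for w, miss in zip(weights, misclassified)]
total = sum(updated)
weights = [w / total for w in updated]
# the misclassified point now carries more weight than the others
```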

24 Which of the following is a key difference between Bagging and Boosting?

A. Bagging uses weighted voting; Boosting uses simple averaging.
B. Bagging increases bias; Boosting increases variance.
C. Bagging trains models in parallel; Boosting trains models sequentially.
D. Bagging uses the whole dataset; Boosting uses a subset.

25 Gradient Boosting improves the model by minimizing:

A. The weights of the features
B. The number of trees
C. The variance of the data
D. A loss function using gradient descent
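
A minimal sketch of the gradient-boosting idea for squared loss (pure Python, illustrative only): each round fits a "weak learner" to the current residuals (the negative gradient of the loss) and adds a shrunken step, so the loss falls iteration by iteration. Here the weak learner is just the residual mean, which is enough to show the loss decreasing.

```python
y = [1.0, 2.0, 3.0, 10.0]
pred = [0.0] * len(y)
learning_rate = 0.5
losses = []

for _ in range(3):
    residuals = [yi - pi for yi, pi in zip(y, pred)]      # negative gradient
    step = learning_rate * sum(residuals) / len(residuals)  # shrunken learner
    pred = [pi + step for pi in pred]
    losses.append(sum((yi - pi) ** 2 for yi, pi in zip(y, pred)))
# squared loss shrinks on every iteration
```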

26 Which algorithm is most likely to overfit if the number of base estimators (iterations) is too large?

A. Random Forest
B. Bagging
C. Leave-One-Out CV
D. Boosting

27 Which hyperparameter in Random Forests controls the number of features to consider when looking for the best split?

A. max_depth
B. min_samples_leaf
C. max_features (mtry)
D. n_estimators

28 If a model has high bias, which ensemble method is most likely to improve performance?

A. Stratified Sampling
B. Boosting
C. Bagging
D. Pruning

29 If a model has high variance, which ensemble method is most likely to improve performance?

A. Bagging
B. Boosting (without regularization)
C. Gradient Descent
D. Linear Regression

30 In K-fold cross-validation, what is the trade-off when increasing K?

A. Bias decreases, Variance decreases, Computation time decreases.
B. Bias decreases, Variance increases, Computation time increases.
C. Bias increases, Variance decreases, Computation time decreases.
D. Bias increases, Variance increases, Computation time increases.

31 What is the typical base learner used in Random Forests?

A. Support Vector Machines
B. Neural Networks
C. Decision Trees
D. Linear Regression

32 Which of the following is NOT a benefit of Random Forests?

A. Is easily interpretable visually like a single tree
B. Robust to outliers
C. Provides feature importance estimates
D. Handles high-dimensional data well

33 When using Bootstrap sampling in Bagging, approximately what fraction of unique observations from the original dataset are included in each sample?

A. 33%
B. 100%
C. 50%
D. 63.2%
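
The 63.2% figure can be checked directly (pure Python, illustrative only): the probability that a given observation appears in a bootstrap sample of size n is 1 - (1 - 1/n)^n, which approaches 1 - 1/e ≈ 0.632 as n grows.

```python
# Probability that an observation is included in a bootstrap sample of size n.
n = 1000
included = 1 - (1 - 1 / n) ** n   # close to 1 - 1/e for large n
```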

34 Which Boosting algorithm uses a learning rate parameter to shrink the contribution of each tree?

A. Bagging
B. AdaBoost
C. Gradient Boosting
D. Random Forest

35 In the bias-variance decomposition, if the total error is high and the training error is also high, the model suffers from:

A. Low Bias
B. High Variance
C. High Bias
D. Overfitting

36 Which cross-validation method involves randomly splitting the data into a training set and a test set without distinct 'folds'?

A. Bootstrap
B. Leave-One-Out CV
C. Holdout Method
D. K-Fold CV

37 What is 'Stacking' in the context of ensemble methods?

A. Adding more features to the data
B. Using a single Deep Neural Network
C. Combining predictions from multiple different models using a meta-model
D. Running Cross-Validation multiple times

38 In Random Forests, increasing the number of trees (n_estimators) typically:

A. Increases overfitting significantly
B. Decreases bias significantly
C. Stabilizes the error but increases training time
D. Decreases the computational cost

39 Which of the following describes the 'Stump' often used in AdaBoost?

A. A linear regression model
B. A random forest with 10 trees
C. A tree with full depth
D. A tree with only one split (depth = 1)

40 XGBoost is a popular implementation of which algorithm?

A. Gradient Boosting
B. Support Vector Machine
C. Random Forest
D. K-Nearest Neighbors

41 In K-fold Cross-Validation, the final performance metric is usually calculated by:

A. Taking the worst score among the K folds
B. Summing the scores of the K folds
C. Taking the best score among the K folds
D. Averaging the scores of the K folds

42 What is the primary motivation for using Cross-Validation over a simple Train/Test split?

A. It uses less data.
B. It is faster.
C. It automatically tunes hyperparameters.
D. It provides a less biased estimate of model performance on unseen data.

43 In the context of bias-variance, a very deep Decision Tree without pruning usually exhibits:

A. Low Bias, Low Variance
B. Low Bias, High Variance
C. High Bias, Low Variance
D. High Bias, High Variance

44 Why does Random Forest usually perform better than Bagging with Decision Trees?

A. It decorrelates the trees by restricting feature selection.
B. It uses a different loss function.
C. It uses more trees.
D. It does not use bootstrap sampling.

45 The process of tuning hyperparameters using Cross-Validation is often called:

A. Forward Selection
B. Bagging
C. Backpropagation
D. Grid Search
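
A sketch of the grid-search loop (pure Python, illustrative only; `validation_score` is a made-up scoring function standing in for cross-validated performance): evaluate every hyperparameter combination and keep the best one.

```python
from itertools import product

# Hypothetical validation score, peaking at depth=4, n_trees=100.
def validation_score(depth, n_trees):
    return -((depth - 4) ** 2 + (n_trees - 100) ** 2 / 1000)

grid = {"depth": [2, 4, 8], "n_trees": [50, 100, 200]}

# Try every combination in the grid; keep the highest-scoring one.
best = max(product(grid["depth"], grid["n_trees"]),
           key=lambda params: validation_score(*params))
```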

46 When the dataset is small (small N), which Cross-Validation method is preferred to maximize the data used for training?

A. Holdout (50/50 split)
B. 2-Fold CV
C. Bootstrap
D. Leave-One-Out CV

47 Which technique allows for parallel processing during training?

A. Recurrent Neural Networks
B. Random Forest
C. AdaBoost
D. Gradient Boosting

48 What is the 'Learning Rate' in Boosting?

A. A parameter scaling the contribution of each tree to the final prediction
B. The percentage of data used for training
C. The depth of the trees
D. The speed at which the computer processes data
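
A tiny sketch of shrinkage in boosting (illustrative only, with hypothetical per-tree outputs): the learning rate scales each tree's contribution before it is added to the running ensemble prediction.

```python
tree_outputs = [4.0, 2.0, 1.0]   # hypothetical predictions from three trees
learning_rate = 0.1

prediction = 0.0
for out in tree_outputs:
    prediction += learning_rate * out   # shrink each tree's contribution
# prediction is approximately 0.7, versus 7.0 without shrinkage
```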

49 Which of the following is true regarding the bias-variance trade-off in K-Nearest Neighbors (KNN)?

A. Small K results in Low Bias and High Variance.
B. Large K results in High Variance.
C. Small K results in High Bias.
D. K does not affect Bias or Variance.

50 If your training error is 1% and your test error is 20%, your model is likely:

A. Underfitting
B. Perfectly balanced
C. Experiencing high bias
D. Overfitting