Unit 5 - Subjective Questions

CSE274 • Practice Questions with Detailed Answers

1. Define Ensemble Learning and explain the motivation behind using it. How does it address the Bias-Variance Tradeoff?
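As a study aid for the variance-reduction half of this question, here is a stdlib-only sketch (toy numbers, no real models): each "model" is simulated as an independent noisy estimate, and averaging M of them shrinks the variance roughly by a factor of 1/M.

```python
import random

random.seed(42)

def variance(xs):
    """Population variance of a list of numbers."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# A single "model": one noisy estimate of the true value 0.0.
single = [random.gauss(0, 1) for _ in range(20_000)]

# An "ensemble": the average of 10 independent noisy estimates.
averaged = [
    sum(random.gauss(0, 1) for _ in range(10)) / 10
    for _ in range(20_000)
]

# Averaging leaves the bias unchanged but cuts variance ~10x.
print(round(variance(single), 2), round(variance(averaged), 2))
```

Real base learners are correlated, so the reduction is smaller than 1/M in practice — which is exactly why Bagging and Random Forests work to decorrelate them.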

2. Differentiate between Bagging and Boosting with respect to their training mechanisms and objectives.

3. Explain the concept of a Majority Voting Classifier. Distinguish between Hard Voting and Soft Voting.
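The hard/soft distinction can be seen on one sample. A stdlib-only sketch with made-up class probabilities from three hypothetical classifiers, chosen so the two schemes disagree:

```python
from collections import Counter

# Hypothetical predicted probabilities over classes [0, 1]
# for ONE sample, from three classifiers.
probas = [
    [0.45, 0.55],  # classifier A leans weakly to class 1
    [0.40, 0.60],  # classifier B leans weakly to class 1
    [0.90, 0.10],  # classifier C strongly prefers class 0
]

# Hard voting: each classifier casts one vote for its argmax class.
hard_votes = [max(range(2), key=lambda c: p[c]) for p in probas]
hard_prediction = Counter(hard_votes).most_common(1)[0][0]

# Soft voting: average the probabilities first, then take the argmax.
avg = [sum(p[c] for p in probas) / len(probas) for c in range(2)]
soft_prediction = max(range(2), key=lambda c: avg[c])

print(hard_prediction, soft_prediction)  # prints: 1 0
```

Soft voting weighs C's confident 0.90 against A's and B's lukewarm 0.55/0.60, so it flips the decision — the usual argument for preferring soft voting when calibrated probabilities are available.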

4. Describe the working principle of the Random Forest algorithm. How does it introduce randomness to improve upon standard Bagging?

5. What is the Out-of-Bag (OOB) error in Random Forests, and why is it useful?
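The key fact behind OOB error is that each bootstrap resample leaves out roughly a 1/e fraction (about 36.8%) of the training instances, so every tree has a built-in held-out set. A stdlib-only simulation of that fraction:

```python
import random

random.seed(0)
n = 10_000   # hypothetical training-set size
trials = 200

# For each bootstrap resample (n draws with replacement),
# measure the fraction of original indices that were never drawn.
oob_fractions = []
for _ in range(trials):
    drawn = set(random.randrange(n) for _ in range(n))
    oob_fractions.append(1 - len(drawn) / n)

mean_oob = sum(oob_fractions) / trials
print(round(mean_oob, 3))  # close to 1/e ≈ 0.368
```

Because each sample is out-of-bag for about a third of the trees, averaging only those trees' predictions gives a validation-style error estimate without a separate hold-out split.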

6. Explain the AdaBoost (Adaptive Boosting) algorithm. How does it update weights for misclassified instances?
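One round of the classic AdaBoost weight update can be worked through numerically. A stdlib-only sketch with made-up labels and one weak learner's predictions (labels in {-1, +1}):

```python
import math

# Toy labels in {-1, +1} and one weak learner's predictions.
y     = [1, 1, -1, -1, 1]      # true labels
y_hat = [1, -1, -1, -1, -1]    # weak learner is wrong on samples 1 and 4
w     = [0.2] * 5              # uniform initial sample weights

# Weighted error of the weak learner.
err = sum(wi for wi, yi, pi in zip(w, y, y_hat) if yi != pi)

# Learner weight: alpha = 0.5 * ln((1 - err) / err).
alpha = 0.5 * math.log((1 - err) / err)

# Re-weight: misclassified samples are multiplied by e^{alpha},
# correctly classified ones by e^{-alpha}; then normalise to sum 1.
w = [wi * math.exp(-alpha * yi * pi) for wi, yi, pi in zip(w, y, y_hat)]
total = sum(w)
w = [wi / total for wi in w]

print([round(wi, 3) for wi in w])
```

With this alpha, the reweighting has a neat property: after normalisation the misclassified samples carry exactly half the total weight, so the next weak learner is forced to focus on them.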

7. What is the fundamental intuition behind Gradient Boosting Machines (GBM)? How does it differ from AdaBoost?

8. Explain XGBoost (eXtreme Gradient Boosting) and list three key features that make it superior to standard GBM.

9. Compare level-wise and leaf-wise tree growth strategies. Which algorithm uses which strategy?

10. Discuss the two novel techniques introduced by LightGBM: GOSS (Gradient-based One-Side Sampling) and EFB (Exclusive Feature Bundling).

11. What distinguishes CatBoost from other Gradient Boosting frameworks regarding categorical data handling?

12. Explain the concept of Stacking (Stacked Generalization). How is it different from Voting?
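The core contrast — voting combines with a fixed rule, stacking learns the combination from held-out base predictions — fits in a few lines. A stdlib-only sketch with made-up out-of-fold predictions and a deliberately tiny meta-"model" (a grid search over one blending weight, standing in for a real meta-learner):

```python
# Hypothetical out-of-fold predictions from two base regressors.
base1 = [1.0, 2.0, 3.0, 4.0]
base2 = [0.5, 2.5, 2.5, 4.5]
y     = [0.9, 2.2, 2.9, 4.2]

def mse(pred):
    return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)

# Voting/averaging applies a FIXED rule: the plain mean.
avg_pred = [(a + b) / 2 for a, b in zip(base1, base2)]

# Stacking LEARNS the combination on held-out predictions: here a
# toy meta-learner that grid-searches a blending weight w in [0, 1].
best_w = min(
    (w / 100 for w in range(101)),
    key=lambda w: mse([w * a + (1 - w) * b for a, b in zip(base1, base2)]),
)
print(best_w, round(mse(avg_pred), 4))
```

On this toy data the learned blend puts more trust in the first base model (w = 0.7) than the fixed 50/50 average does. In practice the meta-learner is a real model (e.g. linear regression or logistic regression), and the held-out predictions come from cross-validation to avoid leakage.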

13. What are Machine Learning Pipelines? Why are they essential in the context of Hyperparameter Tuning and Cross-Validation?

14. Compare Grid Search and Random Search for hyperparameter tuning. When would you prefer Random Search?
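The standard argument for Random Search (Bergstra and Bengio's observation) is that with the same budget, a grid tries only a few distinct values per parameter, while random sampling tries a fresh value of every parameter at every trial — which matters when only one parameter is truly important. A stdlib-only sketch with a made-up "validation score":

```python
import itertools
import random

random.seed(1)

# Toy objective: a hypothetical validation score that depends
# strongly on the learning rate and only weakly on tree depth.
def score(lr, depth):
    return -(lr - 0.07) ** 2 - 0.001 * (depth - 5) ** 2

budget = 16

# Grid search: a 4 x 4 grid spends 16 trials but only ever
# tests 4 distinct learning-rate values.
lrs = [0.001, 0.01, 0.1, 1.0]
depths = [2, 4, 6, 8]
grid_best = max(score(lr, d) for lr, d in itertools.product(lrs, depths))

# Random search: the same 16 trials test 16 distinct learning
# rates, exploring the important dimension far more finely.
rand_best = max(
    score(random.uniform(0.001, 1.0), random.randint(2, 8))
    for _ in range(budget)
)

print(grid_best, rand_best)
```

Random Search is usually preferred when the search space is high-dimensional, when parameters differ greatly in importance, or when the budget is too small to grid every axis densely.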

15. Explain Bayesian Optimization for hyperparameter tuning. What are the roles of the Surrogate Model and the Acquisition Function?

16. Describe K-Fold Cross-Validation and Stratified K-Fold Cross-Validation. When is Stratified K-Fold necessary?
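Stratification can be made concrete with a simplified stdlib-only sketch (a round-robin deal of each class's indices across folds; real implementations such as scikit-learn's `StratifiedKFold` also shuffle):

```python
from collections import defaultdict

def stratified_kfold(labels, k):
    """Yield (train_idx, test_idx) pairs whose test folds preserve
    the overall class proportions (simplified: no shuffling)."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    # Deal each class's indices round-robin across the k folds.
    for idxs in by_class.values():
        for j, i in enumerate(idxs):
            folds[j % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test

# Imbalanced toy labels: 8 negatives, 4 positives (2:1 ratio).
labels = [0] * 8 + [1] * 4
for train, test in stratified_kfold(labels, 4):
    # Every test fold keeps the 2:1 class ratio of the full data set.
    print(sum(labels[i] for i in test), len(test))
```

Plain K-Fold on the same labels could easily produce a test fold with zero positives, which is why stratification is necessary for imbalanced classification.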

17. How is Ensemble Regression performed? Discuss how Bagging and Boosting are adapted for regression tasks.

18. Derive or explain the XGBoost Objective Function considering the Loss term and the Regularization term.
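For reference, the derivation expected here (in the standard XGBoost notation, where $g_i$ and $h_i$ are the first and second derivatives of the loss with respect to the previous round's prediction):

```latex
\mathcal{L}^{(t)} = \sum_{i=1}^{n} l\!\left(y_i,\; \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t),
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2}\lambda \sum_{j=1}^{T} w_j^2
```

Taking a second-order Taylor expansion of the loss around $\hat{y}_i^{(t-1)}$ and dropping constants:

```latex
\mathcal{L}^{(t)} \approx \sum_{i=1}^{n} \left[ g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^2(x_i) \right] + \Omega(f_t),
\qquad
g_i = \partial_{\hat{y}^{(t-1)}} l(y_i, \hat{y}^{(t-1)}), \quad
h_i = \partial^2_{\hat{y}^{(t-1)}} l(y_i, \hat{y}^{(t-1)})
```

Grouping samples by leaf $j$ (with $G_j = \sum_{i \in I_j} g_i$, $H_j = \sum_{i \in I_j} h_i$) and minimising over the leaf weight gives the closed-form optimum and structure score:

```latex
w_j^{*} = -\frac{G_j}{H_j + \lambda},
\qquad
\mathcal{L}^{(t)} = -\tfrac{1}{2} \sum_{j=1}^{T} \frac{G_j^2}{H_j + \lambda} + \gamma T
```

The structure score is what XGBoost's split-finding actually maximises the reduction of, with $\gamma$ penalising extra leaves and $\lambda$ shrinking leaf weights.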

19. What metrics are commonly used for Model Evaluation in Ensemble Learning for Classification and Regression problems?
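The usual metrics quoted in answers here (accuracy, precision, recall, F1 for classification; MAE and RMSE for regression) can each be computed in a line of stdlib Python. A sketch on made-up toy data:

```python
import math

# Classification toy example.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

accuracy  = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision = tp / (tp + fp)                       # of predicted 1s, how many correct
recall    = tp / (tp + fn)                       # of actual 1s, how many found
f1        = 2 * precision * recall / (precision + recall)

# Regression toy example.
r_true = [3.0, 5.0, 2.0]
r_pred = [2.5, 5.5, 2.0]
mae  = sum(abs(t - p) for t, p in zip(r_true, r_pred)) / len(r_true)
rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(r_true, r_pred)) / len(r_true))

print(accuracy, precision, recall, f1, mae, rmse)
```

ROC-AUC (classification) and R² (regression) are also standard but need the full score/variance picture rather than a one-liner, so they are omitted from the sketch.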

20. How do Tree-based Ensembles (Random Forest, XGBoost) calculate Feature Importance?