Unit 4 - Subjective Questions

INT255 • Practice Questions with Detailed Answers

1. Explain the general formulation of an optimization problem in the context of Machine Learning. What are the key components involved?

2. Discuss the role of loss functions and regularizers in Machine Learning optimization. Provide an example of each.

3. Define a convex set and provide an example. Why are convex sets important in optimization?

4. Define a convex function and explain its properties. Why is convexity desirable in Machine Learning optimization?

5. Give an example of a function that is convex and one that is not convex. Justify your choices.

6. Explain the concept of the gradient of a multivariable function. How is it used in optimization algorithms like Gradient Descent?

7. Define the directional derivative and explain its relationship with the gradient.

8. Given a multivariable function f(x, y), calculate its gradient ∇f(x, y).
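The specific function for this question appears to have been lost in formatting. As a hypothetical illustration, the sketch below assumes f(x, y) = x² + 3xy + y², computes its analytic gradient, and checks it against finite differences:

```python
import numpy as np

# The question's original function is unknown; as a hypothetical
# illustration, assume f(x, y) = x^2 + 3xy + y^2.
def f(v):
    x, y = v
    return x**2 + 3*x*y + y**2

def grad_f(v):
    # Analytic gradient: df/dx = 2x + 3y, df/dy = 3x + 2y
    x, y = v
    return np.array([2*x + 3*y, 3*x + 2*y])

def numerical_grad(func, v, h=1e-6):
    # Central finite differences as a sanity check on the analytic gradient
    g = np.zeros_like(v)
    for i in range(len(v)):
        vp, vm = v.copy(), v.copy()
        vp[i] += h
        vm[i] -= h
        g[i] = (func(vp) - func(vm)) / (2 * h)
    return g

v = np.array([1.0, 2.0])
print(grad_f(v))             # [8. 7.]
print(numerical_grad(f, v))  # approximately [8. 7.]
```

The finite-difference check is a useful habit whenever you derive a gradient by hand.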

9. Describe the basic intuition behind the Gradient Descent algorithm. What is its primary goal?

10. Compare and contrast Batch Gradient Descent (BGD), Stochastic Gradient Descent (SGD), and Mini-batch Gradient Descent (MBGD), highlighting their computational costs, convergence properties, and practical use cases.
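A minimal sketch of the three variants, assuming a small synthetic linear-regression problem (all names here are illustrative): structurally, the only difference is how many samples feed each gradient step.

```python
import numpy as np

# Synthetic noiseless linear-regression data (illustrative setup)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w

def gradient(w, Xb, yb):
    # Gradient of the mean squared error on the batch (Xb, yb)
    return Xb.T @ (Xb @ w - yb) / len(yb)

def train(batch_size, lr=0.1, epochs=200):
    w = np.zeros(3)
    n = len(X)
    for _ in range(epochs):
        idx = rng.permutation(n)          # reshuffle each epoch
        for start in range(0, n, batch_size):
            b = idx[start:start + batch_size]
            w -= lr * gradient(w, X[b], y[b])
    return w

w_bgd = train(batch_size=100)   # BGD: full dataset per update
w_sgd = train(batch_size=1)     # SGD: one sample per update
w_mbgd = train(batch_size=16)   # MBGD: small batches per update
```

All three recover weights close to `true_w` here; the trade-off shows up in updates per epoch (1 for BGD, 100 for SGD) and in the noisiness of each step.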

11. Discuss the advantages and disadvantages of using Stochastic Gradient Descent (SGD) over Batch Gradient Descent (BGD) on large datasets.

12. Explain the role of the learning rate in Gradient Descent algorithms and its impact on convergence. What are the consequences of choosing a learning rate that is too high or too low?
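A minimal numeric sketch of the trade-off, using f(x) = x² (so the gradient is 2x); the three learning rates are illustrative choices:

```python
# Gradient descent on f(x) = x^2, whose gradient is 2x.
def descend(lr, steps=50, x0=1.0):
    x = x0
    for _ in range(steps):
        x -= lr * 2 * x  # each step multiplies x by (1 - 2*lr)
    return x

print(abs(descend(lr=0.4)))    # well-chosen: converges rapidly toward 0
print(abs(descend(lr=0.001)))  # too low: barely moves in 50 steps
print(abs(descend(lr=1.1)))    # too high: |x| grows every step (diverges)
```

Because each step scales x by (1 − 2·lr), anything with |1 − 2·lr| ≥ 1 (here lr ≥ 1) oscillates or diverges instead of converging.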

13. Explain the concept of momentum in optimization algorithms. How does it help overcome local minima and speed up convergence?

14. Describe the update rule for a typical Momentum-based Gradient Descent algorithm, explaining each component.
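One common form of the update rule is v ← γv + η∇f(θ), θ ← θ − v, where γ is the momentum coefficient and η the learning rate. A minimal sketch on an assumed quadratic objective with poorly scaled curvature:

```python
import numpy as np

# Quadratic bowl f(w) = 0.5 * w.T @ A @ w (illustrative, ill-conditioned)
A = np.diag([1.0, 10.0])

def grad(w):
    return A @ w

w = np.array([1.0, 1.0])
v = np.zeros_like(w)       # velocity: exponentially decaying sum of gradients
gamma, lr = 0.9, 0.05      # gamma: momentum coefficient, lr: learning rate

for _ in range(200):
    v = gamma * v + lr * grad(w)   # accumulate gradient into the velocity
    w = w - v                      # step along the accumulated direction

print(w)  # both coordinates driven close to the minimum at [0, 0]
```

The velocity term keeps contributions from past gradients, which damps oscillation across the steep direction while building speed along the shallow one.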

15. Explain the main idea behind the RMSProp optimizer. How does it address the vanishing/exploding gradient problem?
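A minimal RMSProp sketch on an assumed toy objective with very differently scaled gradients: dividing each step by a running root-mean-square of recent gradients equalizes the effective step size per parameter.

```python
import numpy as np

# Toy objective whose two coordinates have wildly different gradient scales
def grad(w):
    return np.array([1.0, 1000.0]) * w

w = np.array([1.0, 1.0])
s = np.zeros_like(w)              # running average of squared gradients
beta, lr, eps = 0.9, 0.01, 1e-8

for _ in range(500):
    g = grad(w)
    s = beta * s + (1 - beta) * g**2      # EMA of g^2 per parameter
    w = w - lr * g / (np.sqrt(s) + eps)   # rescaled (roughly sign-sized) step

print(w)  # both coordinates shrink toward 0 despite the scale mismatch
```

Plain gradient descent with a single learning rate would either diverge on the steep coordinate or crawl on the shallow one; the per-parameter rescaling handles both at once.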

16. Describe the core components and advantages of the Adam optimizer, explaining how it combines ideas from other optimizers.
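A minimal Adam sketch (assumed quadratic objective): it keeps a momentum-like first moment and an RMSProp-like second moment, plus bias correction for their zero initialization.

```python
import numpy as np

# Toy objective with mismatched curvature (illustrative)
def grad(w):
    return np.array([1.0, 100.0]) * w

w = np.array([1.0, 1.0])
m = np.zeros_like(w)   # first moment: EMA of gradients (momentum idea)
v = np.zeros_like(w)   # second moment: EMA of squared gradients (RMSProp idea)
beta1, beta2, lr, eps = 0.9, 0.999, 0.01, 1e-8

for t in range(1, 1001):
    g = grad(w)
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g**2
    m_hat = m / (1 - beta1**t)    # bias correction: m, v start at zero
    v_hat = v / (1 - beta2**t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)

print(w)  # both coordinates driven toward the minimum at [0, 0]
```

The hyperparameter values above (0.9, 0.999, 1e-8) are the commonly used defaults from the Adam paper.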

17. Compare Adam with RMSProp, highlighting their similarities and key differences.

18. Discuss common challenges encountered when performing optimization in large-scale Machine Learning systems.

19. Explain techniques like distributed optimization and data parallelism in the context of large-scale Machine Learning.
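Data parallelism can be simulated in a few lines (no real cluster here; the "workers" are just data shards): each worker computes a gradient on its own shard, and the gradients are averaged — the role an all-reduce plays in a real distributed system — before one synchronized update.

```python
import numpy as np

# Synthetic noiseless linear-regression data (illustrative)
rng = np.random.default_rng(1)
X = rng.normal(size=(80, 2))
y = X @ np.array([2.0, -1.0])

# Four simulated "workers", each holding an equal shard of the data
shards_X = np.array_split(X, 4)
shards_y = np.array_split(y, 4)

w = np.zeros(2)
for _ in range(300):
    grads = [Xs.T @ (Xs @ w - ys) / len(ys)   # per-worker local gradient
             for Xs, ys in zip(shards_X, shards_y)]
    w -= 0.1 * np.mean(grads, axis=0)         # "all-reduce": average and step

print(w)  # close to the true weights [2, -1]
```

With equal shard sizes, the averaged shard gradients equal the full-batch gradient, so the synchronized update matches single-machine BGD exactly; the win is that the expensive gradient computation is split across workers.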

20. How does the choice of optimizer (e.g., SGD, Adam) impact resource utilization and training time in large-scale machine learning applications?