1

Explain the characteristics of sequential text data and discuss why traditional feedforward neural networks struggle to process it effectively.

2

Describe the architecture of a standard Recurrent Neural Network (RNN). Provide the mathematical equations for computing the hidden state and the output at a given time step $t$.

3

Explain the vanishing and exploding gradient problems in traditional RNNs. How do they affect the learning of long-term dependencies in sequence modeling?

4

Detail the architecture of a Long Short-Term Memory (LSTM) network. Explain the role of the forget, input, and output gates along with their respective mathematical equations.

5

Compare and contrast traditional Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks.

6

Explain the architecture of a Gated Recurrent Unit (GRU). Provide the mathematical formulations for its update and reset gates.

7

Compare LSTMs and GRUs. In what scenarios might one be preferred over the other for sequence modeling?

8

What are Bidirectional RNNs? Explain their architecture and how they capture context from both past and future states in sequential text data.

9

Discuss a specific NLP application where Bidirectional RNNs significantly outperform unidirectional RNNs, and explain the reasoning behind this.

10

Describe how sequence models can be applied to text classification tasks. Explain the typical pipeline from raw text to class probabilities.

11

How is sentiment classification formulated as a sequence modeling problem? Discuss the architecture choices for an RNN-based sentiment classifier.

12

What is Teacher Forcing in the context of training sequence models? Explain its advantages and potential drawbacks.

13

Explain the concept of Exposure Bias in sequence training. How is it related to Teacher Forcing, and how can it be mitigated?

14

Describe the Backpropagation Through Time (BPTT) algorithm used in sequence training. What are its computational challenges?

15

Explain Truncated Backpropagation Through Time (TBPTT). How does it solve the challenges of standard BPTT while maintaining the ability to learn sequences?

16

Discuss the evaluation metrics commonly used for text classification tasks. Provide the mathematical formulas for Precision, Recall, and F1-Score.

17

Explain the Perplexity metric used in sequence modeling. How is it related to cross-entropy loss, and what does a lower perplexity indicate?

18

Discuss the difference between Many-to-One and Many-to-Many RNN architectures. Provide examples of NLP tasks that utilize each configuration.

19

Discuss the role of word embeddings when used as input features for Deep Learning sequence models in NLP.

20

Explain the technique of Gradient Clipping in sequence training. Why is it necessary when training recurrent neural networks?

Unit 3 - Subjective Questions