Unit 1 - Subjective Questions

CSE472 • Practice Questions with Detailed Answers

1

Trace the origin and historical evolution of Natural Language Processing (NLP).

2

Explain the concepts of 'Language' and 'Grammar' in the context of Natural Language Processing.

3

Discuss the three linguistic essentials: Morphology, Syntax, and Semantics, providing suitable examples for each.

4

Define Morphology and explain the difference between derivational and inflectional morphology.

5

Distinguish between Syntax and Semantics in NLP.

6

What are the major challenges faced in Natural Language Processing? Discuss at least four challenges.

7

List and describe five major applications of Natural Language Processing.

8

Define Tokenization. Explain the difference between word tokenization and sentence tokenization.

9

Compare and contrast Stemming and Lemmatization. When should one be preferred over the other?

10

Why are stop-word removal and punctuation handling important in NLP preprocessing?

11

Discuss strategies for handling Out-of-Vocabulary (OOV) words in text processing.

12

What is Text Normalization? Outline the common steps involved in normalizing text.

13

Explain the Bag-of-Words (BoW) model. What are its main advantages and limitations?
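
As a possible starting point for an answer, the following minimal sketch builds Bag-of-Words count vectors using only the Python standard library. The function name `bag_of_words` and the two sample sentences are illustrative assumptions, and whitespace splitting stands in for a real tokenizer.

```python
from collections import Counter

def bag_of_words(docs):
    """Return (vocabulary, count vectors) for a list of documents."""
    # Naive tokenization: lowercase and split on whitespace (an assumption).
    tokenized = [doc.lower().split() for doc in docs]
    # The vocabulary is the sorted set of all tokens seen in the corpus.
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        # One dimension per vocabulary word; a missing word counts as 0.
        vectors.append([counts[word] for word in vocab])
    return vocab, vectors

docs = ["the dog bit the man", "the man bit the dog"]
vocab, vectors = bag_of_words(docs)
print(vocab)
print(vectors)
```

The two sentences receive identical vectors although they mean different things, which illustrates the model's main limitation: word order is discarded.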

14

Define N-grams. Give examples of a unigram, bigram, and trigram for the sentence: 'Deep learning is fascinating'.
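
The n-grams for the sentence in this question can be generated with a sliding window, as in the minimal sketch below. The helper name `ngrams` is hypothetical, and whitespace splitting is assumed as the tokenizer.

```python
def ngrams(tokens, n):
    """Return every contiguous window of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "Deep learning is fascinating".split()

unigrams = ngrams(tokens, 1)  # single words
bigrams = ngrams(tokens, 2)   # pairs of adjacent words
trigrams = ngrams(tokens, 3)  # triples of adjacent words

print(unigrams)
print(bigrams)
print(trigrams)
```

A sentence of 4 tokens yields 4 unigrams, 3 bigrams, and 2 trigrams; in general, k tokens yield k - n + 1 n-grams.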

15

Derive and explain the mathematical formulation of TF-IDF. Why is the logarithm used in IDF?
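
One common formulation is sketched below; textbooks and libraries use several variants (for example, smoothed or differently normalized versions), so the course's own definition takes precedence. The symbols are assumptions introduced here: f_{t,d} is the raw count of term t in document d, N is the number of documents in the corpus, and df(t) is the number of documents that contain t.

```latex
\begin{align}
\mathrm{tf}(t, d) &= \frac{f_{t,d}}{\sum_{t'} f_{t',d}} \\
\mathrm{idf}(t) &= \log \frac{N}{\mathrm{df}(t)} \\
\mathrm{tfidf}(t, d) &= \mathrm{tf}(t, d) \cdot \mathrm{idf}(t)
\end{align}
```

Under this form, the logarithm compresses the ratio N / df(t), so very rare terms do not overwhelm the weighting, and a term that appears in every document receives idf = log 1 = 0.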

16

Compare TF-IDF with the standard Bag-of-Words (BoW) representation.

17

Explain the concept of ambiguity in NLP and describe Lexical, Syntactic, and Semantic Ambiguity with examples.

18

How do n-grams solve the context limitation of the Bag-of-Words model?

19

What are Over-stemming and Under-stemming? Explain with examples.

20

Design a complete text preprocessing pipeline for a sentiment analysis task, outlining each step from raw text to TF-IDF vectors.
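
One way to outline such a pipeline is the minimal standard-library sketch below. It is an illustration under stated assumptions, not a reference answer: the stop-word list is a tiny hypothetical sample (it deliberately keeps negations such as "not", which carry sentiment), stemming or lemmatization is omitted, and the weighting uses the unsmoothed form tf * log(N / df).

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"the", "a", "an", "is", "was", "and", "this", "it"}

def preprocess(text):
    """Lowercase, strip non-letters, tokenize, and drop stop words."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # punctuation and digits -> space
    return [tok for tok in text.split() if tok not in STOP_WORDS]

def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of raw documents."""
    tokenized = [preprocess(doc) for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty documents
        vectors.append(
            [(counts[t] / total) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors

reviews = [
    "The movie was great!",
    "The movie was terrible.",
    "Great acting, great plot.",
]
vocab, vectors = tfidf_vectors(reviews)
print(vocab)
for vec in vectors:
    print([round(x, 3) for x in vec])
```

The resulting vectors would then be passed to a classifier; a fuller answer would also cover normalization choices, stemming versus lemmatization, and handling of out-of-vocabulary words at prediction time.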