1. What is the fundamental building block of an Artificial Neural Network, inspired by the biological neuron?
A. Kernel
B. Perceptron
C. Transformer
D. Token
Correct Answer: Perceptron
Explanation: The Perceptron is the simplest type of artificial neural network and serves as the fundamental building block, mimicking the function of a biological neuron by taking inputs, applying weights, and passing the result through an activation function.
2. Mathematically, the output $y$ of a single perceptron with inputs $x_i$, weights $w_i$, bias $b$, and activation function $f$ is represented as:
A. $y = \sum_i w_i x_i$
B. $y = f\left(\sum_i w_i x_i + b\right)$
C. $y = f\left(\sum_i w_i x_i\right) + b$
D. $y = \sum_i w_i x_i + b$
Correct Answer: $y = f\left(\sum_i w_i x_i + b\right)$
Explanation: The perceptron calculates the weighted sum of inputs plus a bias term, and then applies a non-linear activation function to this sum.
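As an illustration of the formula above, here is a minimal NumPy sketch of a single perceptron's forward pass (the values and the step activation are illustrative only):

```python
import numpy as np

def perceptron(x, w, b, activation):
    """Compute y = f(sum_i w_i * x_i + b) for a single perceptron."""
    return activation(np.dot(w, x) + b)

# Illustrative values for a 3-input perceptron with a step activation.
x = np.array([1.0, 0.5, -1.0])   # inputs
w = np.array([0.4, -0.2, 0.1])   # weights
b = 0.05                         # bias
step = lambda z: 1.0 if z > 0 else 0.0

print(perceptron(x, w, b, step))  # -> 1.0, since 0.4 - 0.1 - 0.1 + 0.05 = 0.25 > 0
```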
3. In a Multi-Layer Perceptron (MLP), what is the primary purpose of the activation function?
A. To reset the weights to zero
B. To introduce non-linearity into the network
C. To convert the input data into images
D. To increase the size of the dataset
Correct Answer: To introduce non-linearity into the network
Explanation: Without activation functions, a neural network, regardless of how many layers it has, would collapse into a single linear model. Non-linearity allows the network to learn complex patterns.
4. Which algorithm is commonly used to train neural networks by calculating the gradient of the loss function with respect to the weights?
A. K-Means Clustering
B. Forward Propagation
C. Backpropagation
D. Principal Component Analysis
Correct Answer: Backpropagation
Explanation: Backpropagation (backward propagation of errors) is the standard method for training neural networks. It calculates the gradient of the loss function and updates weights to minimize error.
5. What is the primary advantage of a Convolutional Neural Network (CNN) over a standard MLP for image processing?
A. CNNs do not require activation functions
B. CNNs capture spatial hierarchies and local patterns using shared weights
C. CNNs can only process text data
D. CNNs do not use backpropagation
Correct Answer: CNNs capture spatial hierarchies and local patterns using shared weights
Explanation: CNNs use convolutional layers (filters/kernels) to detect local features like edges and textures, preserving the spatial relationship of pixels, which is crucial for image analysis.
6. In a CNN, what is the function of a Pooling Layer?
A. To increase the dimensionality of the feature map
B. To reduce the spatial dimensions of the feature maps and the number of parameters
C. To change the color of the image
D. To classify the final output
Correct Answer: To reduce the spatial dimensions of the feature maps and the number of parameters
Explanation: Pooling layers (like Max Pooling) downsample the feature maps, reducing the amount of data and parameters, which helps control overfitting and reduces computational load.
7. Recurrent Neural Networks (RNNs) are specifically designed to handle which type of data?
A. Static images
B. Tabular data with no time component
C. Sequential data (e.g., time series, text)
D. Unstructured noise
Correct Answer: Sequential data (e.g., time series, text)
Explanation: RNNs have internal memory (loops) that allows them to process sequences of inputs, making them suitable for language modeling, time series prediction, and speech recognition.
8. What major issue often affects standard RNNs when training on long sequences?
A. Exploding biases
B. Vanishing Gradient Problem
C. Over-tokenization
D. Image saturation
Correct Answer: Vanishing Gradient Problem
Explanation: In standard RNNs, gradients can become extremely small during backpropagation through time, preventing the network from learning dependencies that are far apart in the sequence.
9. Which architecture was introduced to solve the short-term memory limitation of standard RNNs?
A. Perceptron
B. LSTM (Long Short-Term Memory)
C. Random Forest
D. SVM
Correct Answer: LSTM (Long Short-Term Memory)
Explanation: LSTMs introduce gates (input, output, and forget gates) that regulate the flow of information, allowing the network to retain information over longer sequences.
10. What is the core innovation of the Transformer architecture introduced in the paper 'Attention Is All You Need'?
A. Recurrence mechanism
B. Convolutional filters
C. Self-Attention mechanism
D. Gradient boosting
Correct Answer: Self-Attention mechanism
Explanation: Transformers discard recurrence and convolutions entirely, relying on the Self-Attention mechanism to weigh the significance of different words in a sentence relative to one another.
11. In the context of Transformers, what does the Self-Attention mechanism calculate?
A. The probability of the next word based on the previous word only
B. The relevance of each word in a sequence to every other word in the same sequence
C. The pixel intensity of an image
D. The grammatical correctness of a sentence
Correct Answer: The relevance of each word in a sequence to every other word in the same sequence
Explanation: Self-attention allows the model to look at other positions in the input sequence for clues that can help lead to a better encoding for the current word.
12. Which of the following is a key advantage of Transformers over RNNs regarding training?
A. Transformers require less data
B. Transformers process data sequentially, making them slower
C. Transformers allow for parallelization of data processing
D. Transformers cannot handle long sequences
Correct Answer: Transformers allow for parallelization of data processing
Explanation: Unlike RNNs, which process words one by one sequentially, Transformers process the entire sequence simultaneously (in parallel), significantly speeding up training.
13. What is NLP in the context of Artificial Intelligence?
A. Neural Linear Processing
B. Natural Language Processing
C. Network Level Protocol
D. Native Language Programming
Correct Answer: Natural Language Processing
Explanation: Natural Language Processing (NLP) is a branch of AI focused on the interaction between computers and humans using natural language.
14. Which phase of NLP involves breaking down a text paragraph into smaller units like sentences or words?
A. Tokenization
B. Stemming
C. Sentiment Analysis
D. Discourse Analysis
Correct Answer: Tokenization
Explanation: Tokenization is the process of segmenting text into smaller units called tokens (words, subwords, or characters).
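As a quick sketch of word-level tokenization (real systems typically use subword tokenizers such as BPE or WordPiece; the regex here is illustrative):

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())

print(tokenize("Tokenization breaks text into smaller units."))
# -> ['tokenization', 'breaks', 'text', 'into', 'smaller', 'units']
```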
15. In NLP preprocessing, what does Stemming refer to?
A. Removing stop words like 'and' or 'the'
B. Reducing words to their root form by chopping off affixes (e.g., 'running' -> 'run')
C. Converting text to vectors
D. Identifying proper nouns
Correct Answer: Reducing words to their root form by chopping off affixes (e.g., 'running' -> 'run')
Explanation: Stemming is a crude heuristic process that chops off the ends of words to reduce them to a base form, often resulting in non-dictionary words.
16. How does Lemmatization differ from Stemming?
A. Lemmatization is faster but less accurate
B. Lemmatization uses a vocabulary and morphological analysis to return the dictionary base form
C. Lemmatization only works on vowels
D. There is no difference
Correct Answer: Lemmatization uses a vocabulary and morphological analysis to return the dictionary base form
Explanation: Unlike stemming, lemmatization considers the context and converts the word to its meaningful base form (lemma). E.g., 'better' -> 'good'.
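The difference is easy to see in code. A short comparison using NLTK, assuming the nltk package and its WordNet data are installed:

```python
# Assumes: pip install nltk, then nltk.download('wordnet')
from nltk.stem import PorterStemmer, WordNetLemmatizer

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

print(stemmer.stem("studies"))                  # 'studi' -- crude suffix chopping
print(lemmatizer.lemmatize("studies"))          # 'study' -- dictionary base form
print(lemmatizer.lemmatize("better", pos="a"))  # 'good'  -- morphological analysis
```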
17. What are Stop Words in NLP?
A. Words that end a sentence
B. Common words (e.g., 'is', 'the', 'at') often removed during preprocessing because they carry little unique meaning
C. Words that stop the training process
D. Keywords that trigger a chatbot action
Correct Answer: Common words (e.g., 'is', 'the', 'at') often removed during preprocessing because they carry little unique meaning
Explanation: Stop words are high-frequency words that are often filtered out before processing natural language data because they don't add significant semantic value for classification or retrieval.
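A minimal sketch of stop-word removal with a hand-picked stop list (production pipelines usually use a curated list, e.g. NLTK's):

```python
STOP_WORDS = {"is", "the", "at", "a", "an", "of", "and"}

def remove_stop_words(tokens):
    """Filter out high-frequency, low-information words."""
    return [t for t in tokens if t not in STOP_WORDS]

print(remove_stop_words(["the", "cat", "is", "at", "the", "door"]))
# -> ['cat', 'door']
```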
18. What is the definition of Word Embeddings?
A. A dictionary definition of a word
B. A dense vector representation of words where similar words have similar vector values
C. A method to bold words in a document
D. The count of how many times a word appears
Correct Answer: A dense vector representation of words where similar words have similar vector values
Explanation: Word embeddings (like Word2Vec or GloVe) map words to vectors of real numbers in a continuous vector space, capturing semantic relationships.
19. In a vector space model (like Word2Vec), which word's vector is closest to the result of the arithmetic $King - Man + Woman$?
A. Prince
B. Queen
C. Princess
D. Monarch
Correct Answer: Queen
Explanation: This is a classic example of how embeddings capture semantic analogies. Subtracting the 'Man' vector from 'King' and adding 'Woman' results in a vector closest to 'Queen'.
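A toy NumPy sketch of this analogy arithmetic. The 3-dimensional vectors below are made up purely for illustration (real embeddings typically have hundreds of dimensions):

```python
import numpy as np

# Made-up toy embeddings: dim 0 ~ royalty, dim 1 ~ gender, dim 2 ~ age.
emb = {
    "king":   np.array([0.9,  0.8, 0.6]),
    "man":    np.array([0.1,  0.8, 0.5]),
    "woman":  np.array([0.1, -0.8, 0.5]),
    "queen":  np.array([0.9, -0.8, 0.6]),
    "prince": np.array([0.7,  0.8, 0.1]),
}

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

target = emb["king"] - emb["man"] + emb["woman"]
best = max((w for w in emb if w not in {"king", "man", "woman"}),
           key=lambda w: cosine(emb[w], target))
print(best)  # -> 'queen'
```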
20. What is BERT (Bidirectional Encoder Representations from Transformers) primarily used for?
A. Generating images from text
B. Understanding the context of words in a sentence by looking in both directions
C. Playing chess
D. Only predicting the next word in a sequence
Correct Answer: Understanding the context of words in a sentence by looking in both directions
Explanation: BERT is an encoder-only Transformer model designed to pre-train deep bidirectional representations from unlabeled text, making it excellent for understanding context and NLU tasks.
21. Which architecture does the GPT (Generative Pre-trained Transformer) family of models primarily use?
A. Transformer Encoder only
B. Transformer Decoder only
C. CNN-RNN Hybrid
D. Random Forest
Correct Answer: Transformer Decoder only
Explanation: GPT models use a Decoder-only Transformer architecture and are trained autoregressively to predict the next token in a sequence.
22. What is Sentiment Analysis?
A. Translating text from English to French
B. Determining the emotional tone (positive, negative, neutral) behind a text
C. Summarizing a long article
D. Checking for spelling errors
Correct Answer: Determining the emotional tone (positive, negative, neutral) behind a text
Explanation: Sentiment analysis involves using NLP to identify and extract subjective information from source materials to determine the attitude of a writer/speaker.
23. Which NLP task involves converting a long document into a shorter version while retaining key information?
A. Text Classification
B. Text Summarization
C. Named Entity Recognition
D. Machine Translation
Correct Answer: Text Summarization
Explanation: Text Summarization is the process of distilling the most important information from a source (or sources) to produce an abridged version for a particular user (and task).
24. What is Abstractive Summarization?
A. Selecting specific sentences from the original text
B. Generating new sentences to summarize the content, potentially using words not in the original text
C. Removing vowels from the text
D. Highlighting keywords in a PDF
Correct Answer: Generating new sentences to summarize the content, potentially using words not in the original text
Explanation: Unlike extractive summarization (which copies parts of the source), abstractive summarization interprets the text and generates a summary in its own words, similar to how a human would.
25. In the context of chatbots, what is an Intent?
A. The specific detail provided by the user (e.g., a date or city)
B. The user's goal or purpose behind a specific input (e.g., 'Book a flight')
C. The database used to store logs
D. The programming language of the bot
Correct Answer: The user's goal or purpose behind a specific input (e.g., 'Book a flight')
Explanation: Intent classification is a key component of NLU in chatbots, determining what the user wants to achieve with their message.
26. In the context of chatbots, what is an Entity?
A. The background color of the chat interface
B. Specific pieces of information inside the user's input (e.g., 'New York', 'Tomorrow')
C. The machine learning model used
D. The sentiment of the user
Correct Answer: Specific pieces of information inside the user's input (e.g., 'New York', 'Tomorrow')
Explanation: Entities are parameters or variables extracted from the user's utterance that are required to fulfill the intent (e.g., in 'Book a flight to Paris', 'Paris' is a location entity).
27. Which concept explains why Deep Learning performs better than traditional Machine Learning on unstructured data like images and text?
Correct Answer: Automatic Feature Extraction (Representation Learning)
Explanation: Deep learning models (like CNNs and Transformers) automatically learn the feature representations from raw data, whereas traditional ML often relies on manual feature engineering.
28. What is the Softmax function typically used for in the output layer of a neural network?
A. Binary classification (0 or 1)
B. Multi-class classification (generating a probability distribution over classes)
C. Regression (predicting a continuous value)
D. Reducing the image size
Correct Answer: Multi-class classification (generating a probability distribution over classes)
Explanation: Softmax normalizes the output of a network to a probability distribution over predicted output classes, ensuring they sum to 1.
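A minimal NumPy implementation of softmax (the logits are illustrative; subtracting the max first is a standard numerical-stability trick):

```python
import numpy as np

def softmax(z):
    """Map raw scores (logits) to a probability distribution."""
    e = np.exp(z - np.max(z))  # subtract max for numerical stability
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
print(probs)        # -> approximately [0.659 0.242 0.099]
print(probs.sum())  # -> 1.0
```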
29. Which of the following describes Named Entity Recognition (NER)?
A. Classifying an email as spam or ham
B. Identifying and classifying key information in text into categories like names, organizations, locations, etc.
C. Translating a sentence to Spanish
D. Converting speech to text
Correct Answer: Identifying and classifying key information in text into categories like names, organizations, locations, etc.
Explanation: NER is a sub-task of information extraction that seeks to locate and classify named entities mentioned in unstructured text into pre-defined categories.
30. What is Fine-tuning in the context of Large Language Models (LLMs)?
A. Training a model from scratch with random weights
B. Taking a pre-trained model and training it further on a specific dataset for a specific task
C. Adjusting the monitor brightness
D. Reducing the vocabulary size
Correct Answer: Taking a pre-trained model and training it further on a specific dataset for a specific task
Explanation: Fine-tuning involves taking a model that has already learned general language patterns (pre-training) and updating its weights slightly using a smaller, task-specific dataset.
31. What is the role of Positional Encoding in Transformer models?
A. It encrypts the data for security
B. It injects information about the position of tokens in the sequence since Transformers process data in parallel
C. It determines the language of the text
D. It converts images to text
Correct Answer: It injects information about the position of tokens in the sequence since Transformers process data in parallel
Explanation: Since Transformers (unlike RNNs) do not process data sequentially, they have no inherent sense of order. Positional encodings are added to embeddings to give the model information about the relative or absolute position of tokens.
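A sketch of the sinusoidal positional encoding from the original Transformer paper, where $PE_{(pos, 2i)} = \sin(pos / 10000^{2i/d})$ and $PE_{(pos, 2i+1)} = \cos(pos / 10000^{2i/d})$ (the dimensions below are illustrative):

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Each position gets a unique pattern of sine/cosine waves."""
    pos = np.arange(seq_len)[:, None]       # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]    # (1, d_model // 2)
    angles = pos / (10000 ** (2 * i / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)  # even dimensions
    pe[:, 1::2] = np.cos(angles)  # odd dimensions
    return pe

pe = positional_encoding(seq_len=50, d_model=16)
print(pe.shape)  # (50, 16) -- added element-wise to the token embeddings
```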
32. Which term describes the phenomenon where an LLM generates incorrect or nonsensical information confidently?
A. Backpropagation
B. Hallucination
C. Tokenization
D. Regularization
Correct Answer: Hallucination
Explanation: Hallucination in AI refers to the generation of outputs that may sound plausible but are factually incorrect or unrelated to the input context.
33. What is the Bag of Words (BoW) model?
A. A model that keeps the exact order of words
B. A representation of text that describes the occurrence of words within a document, disregarding grammar and order
C. A deep learning model for image recognition
D. A technique to remove stop words
Correct Answer: A representation of text that describes the occurrence of words within a document, disregarding grammar and order
Explanation: BoW creates a vocabulary of all unique words and represents documents as vectors of word counts, ignoring the sequence or structure of the sentence.
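A minimal Bag of Words sketch in plain Python (the two documents are illustrative):

```python
from collections import Counter

docs = ["the cat sat on the mat", "the dog sat"]
vocab = sorted(set(" ".join(docs).split()))  # shared vocabulary

def bow_vector(doc):
    """Represent a document as word counts; word order is discarded."""
    counts = Counter(doc.split())
    return [counts[w] for w in vocab]

print(vocab)                # ['cat', 'dog', 'mat', 'on', 'sat', 'the']
print(bow_vector(docs[0]))  # [1, 0, 1, 1, 1, 2]
print(bow_vector(docs[1]))  # [0, 1, 0, 0, 1, 1]
```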
34. In the attention formula $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V$, what do Q, K, and V stand for?
A. Quantity, Kernel, Value
B. Query, Key, Value
C. Question, Key, Vector
D. Query, Kernel, Variance
Correct Answer: Query, Key, Value
Explanation: These vectors are abstractions inspired by retrieval systems. The Query is what you are looking for, the Key is what you match against, and the Value is the information extracted.
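A minimal NumPy sketch of this scaled dot-product attention formula (single head, no masking; the random matrices stand in for the projected Q, K, and V):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # relevance of each key to each query
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax over keys
    return weights @ V                              # weighted sum of values

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(3, 4)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 4)
```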
35. Which loss function is most commonly used for binary classification tasks in neural networks?
A. Mean Squared Error (MSE)
B. Binary Cross-Entropy
C. Categorical Cross-Entropy
D. Hinge Loss
Correct Answer: Binary Cross-Entropy
Explanation: Binary Cross-Entropy (or Log Loss) measures the performance of a classification model whose output is a probability value between 0 and 1.
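A minimal NumPy version of binary cross-entropy (the labels and predictions are illustrative; eps guards against log(0)):

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean log loss over predicted probabilities in (0, 1)."""
    y_pred = np.clip(y_pred, eps, 1 - eps)  # avoid log(0)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

y_true = np.array([1, 0, 1, 1])
y_pred = np.array([0.9, 0.1, 0.8, 0.4])  # predicted probabilities
print(binary_cross_entropy(y_true, y_pred))  # -> ~0.337
```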
36. What is the purpose of the Rectified Linear Unit (ReLU) activation function?
A. It outputs 1 for positive inputs and 0 for negative inputs
B. It outputs the input directly if it is positive; otherwise, it outputs zero
C. It squashes inputs between -1 and 1
D. It converts inputs to a probability distribution
Correct Answer: It outputs the input directly if it is positive; otherwise, it outputs zero
Explanation: ReLU is defined as $f(x) = \max(0, x)$. It is computationally efficient and helps mitigate the vanishing gradient problem compared to Sigmoid or Tanh.
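In NumPy terms, a one-line sketch:

```python
import numpy as np

def relu(x):
    """f(x) = max(0, x), applied element-wise."""
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))  # -> [0.  0.  0.  1.5 3. ]
```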
37. Which NLP application is responsible for converting spoken language into written text?
Correct Answer: Automatic Speech Recognition (ASR)
Explanation: ASR (often called Speech-to-Text) is the technology that allows human beings to use their voice to speak with a computer interface in a way that resembles normal human conversation.
38. What is the main limitation of One-Hot Encoding for words?
A. It is too complex to implement
B. It results in high-dimensional sparse vectors and captures no semantic relationship between words
C. It can only handle numbers
D. It requires a GPU to run
Correct Answer: It results in high-dimensional sparse vectors and captures no semantic relationship between words
Explanation: One-hot encoding produces vectors whose length equals the vocabulary size (very large and sparse), and all vectors are mutually orthogonal, so the distance between any two words is identical, ignoring meaning.
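A small NumPy demonstration of both problems, using a toy 3-word vocabulary (illustrative):

```python
import numpy as np

vocab = ["cat", "dog", "mat"]
one_hot = np.eye(len(vocab))  # one sparse, orthogonal vector per word

# Every pair of distinct words is equally far apart -- no notion of similarity.
for i in range(len(vocab)):
    for j in range(i + 1, len(vocab)):
        dist = np.linalg.norm(one_hot[i] - one_hot[j])
        print(vocab[i], vocab[j], dist)  # always sqrt(2) ~ 1.414
```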
39. In a Neural Network, what is an Epoch?
A. A single step of gradient descent
B. One complete pass of the entire training dataset through the network
C. The number of layers in the network
D. The initial learning rate
Correct Answer: One complete pass of the entire training dataset through the network
Explanation: An epoch occurs when every sample in the training dataset has had an opportunity to update the internal model parameters.
40. Which technique is used to prevent Overfitting in Neural Networks by randomly ignoring some neurons during training?
A. Pooling
B. Dropout
C. Padding
D. Flattening
Correct Answer: Dropout
Explanation: Dropout is a regularization technique where randomly selected neurons are ignored (dropped out) during training, which forces the network to learn more robust features.
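A sketch of "inverted" dropout, the variant most frameworks use (probabilities and shapes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, p=0.5, training=True):
    """Zero each unit with probability p during training; scale survivors
    by 1/(1 - p) so expected activations match at inference time."""
    if not training:
        return activations
    mask = rng.random(activations.shape) >= p
    return activations * mask / (1 - p)

print(dropout(np.ones(8), p=0.5))  # roughly half zeros, the rest scaled to 2.0
```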
41. What type of neural network is generally best suited for Time Series Forecasting?
A. CNN
B. RNN / LSTM
C. Perceptron
D. Decision Tree
Correct Answer: RNN / LSTM
Explanation: Because time series data is sequential and current values depend on previous values, RNNs and LSTMs are the standard architecture choice.
42. What is the definition of Transfer Learning?
A. Transferring data from one hard drive to another
B. Using a model trained on one task as a starting point for a model on a second, related task
C. Converting a CNN to an RNN
D. Learning without any data
Correct Answer: Using a model trained on one task as a starting point for a model on a second, related task
Explanation: Transfer learning leverages knowledge (weights) gained from solving one problem and applies it to a different but related problem, often reducing training time and data requirements.
43. In NLP, what is Part-of-Speech (POS) Tagging?
A. Identifying if a sentence is a question or statement
B. Assigning a grammatical category (noun, verb, adjective, etc.) to each word in a text
C. Translating the text
D. Removing HTML tags from text
Correct Answer: Assigning a grammatical category (noun, verb, adjective, etc.) to each word in a text
Explanation: POS tagging involves labeling words with their syntactic roles, which is essential for understanding sentence structure and meaning.
44. Which of the following is a key component of a Digital Assistant (like Siri or Alexa)?
A. Wake word detection
B. Visual Basic scripting
C. Manual data entry
D. CSS styling
Correct Answer: Wake word detection
Explanation: Digital assistants constantly listen for a specific 'wake word' (e.g., 'Hey Siri') to activate the main processing pipeline.
45. What is the Encoder's role in a standard Encoder-Decoder Transformer architecture (like for Translation)?
A. To generate the final text in the target language
B. To process the input sequence and create a contextualized representation
C. To classify images
D. To remove noise from audio
Correct Answer: To process the input sequence and create a contextualized representation
Explanation: The Encoder takes the input sequence (e.g., an English sentence) and processes it into a rich vector representation that the Decoder then uses to generate the output (e.g., a French sentence).
46. What is Zero-shot Learning in the context of models like GPT-3?
A. The model requires 0 minutes to train
B. The ability of the model to perform a task without having seen any specific examples of that task during training
C. The model has 0 layers
D. The model predicts 0 for all outputs
Correct Answer: The ability of the model to perform a task without having seen any specific examples of that task during training
Explanation: Zero-shot learning refers to a model's ability to handle tasks it wasn't explicitly trained to do, relying on its general understanding of language.
47. What does Masked Language Modeling (MLM) involve in BERT training?
A. Removing all vowels from the text
B. Hiding (masking) some percentage of the input tokens at random and predicting those masked tokens
C. Masking the output layer
D. Ignoring the last word of every sentence
Correct Answer: Hiding (masking) some percentage of the input tokens at random and predicting those masked tokens
Explanation: MLM is the pre-training objective of BERT where the model learns bidirectional context by trying to guess words that have been artificially hidden in the input.
48. In a neural network, what are Weights?
A. The input data values
B. Learnable parameters that determine the strength of the connection between neurons
C. The number of neurons in a layer
D. The final output classes
Correct Answer: Learnable parameters that determine the strength of the connection between neurons
Explanation: Weights are the primary parameters adjusted during training. They transform the input data as it passes through the network layers.
49. What is a Corpus in NLP?
A. A type of neural network layer
B. A large and structured set of texts (dataset) used for training NLP models
C. A coding error
D. The core processing unit of a computer
Correct Answer: A large and structured set of texts (dataset) used for training NLP models
Explanation: A corpus (plural: corpora) is the collection of text data (e.g., a Wikipedia dump or a collection of books) used to train or test NLP models.
50. Which of the following is an example of an Extractive approach to Question Answering?
A. Generating a new paragraph explaining the answer
B. Identifying the specific start and end indices of the answer span within a provided text context
C. Translating the question to another language
D. Searching a database for a keyword
Correct Answer: Identifying the specific start and end indices of the answer span within a provided text context
Explanation: Extractive QA models (like BERT fine-tuned on SQuAD) look at a context passage and highlight the exact segment of text that answers the user's question.