Explanation:Clustering is an unsupervised task where the goal is to group similar data points without pre-existing labels.
5. Reinforcement Learning is primarily based on:
A.Minimizing the reconstruction error.
B.Maximizing a cumulative reward signal.
C.Predicting a continuous value.
D.Classifying data into fixed categories.
Correct Answer: Maximizing a cumulative reward signal.
Explanation:Reinforcement learning involves an agent interacting with an environment to maximize total rewards over time.
6. What distinguishes a Parametric Model from a Non-parametric Model?
A.Parametric models cannot be used for regression.
B.Parametric models assume a fixed number of parameters independent of the sample size.
C.Non-parametric models do not use any parameters.
D.Parametric models are always slower to train.
Correct Answer: Parametric models assume a fixed number of parameters independent of the sample size.
Explanation:Parametric models (like Linear Regression) assume a functional form and have a fixed number of parameters. Non-parametric models (like KNN) grow in complexity with the data size.
7. Which of the following is an example of a Non-parametric algorithm?
A.Linear Regression
B.Logistic Regression
C.K-Nearest Neighbors (KNN)
D.Linear Discriminant Analysis
Correct Answer: K-Nearest Neighbors (KNN)
Explanation:KNN is non-parametric because it does not make strong assumptions about the underlying data distribution and the 'parameters' are effectively the training data itself.
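To make the point about the training data acting as the "parameters" concrete, here is a minimal KNN sketch in NumPy. The function name `knn_predict`, the toy data, and the choice of k=3 with Euclidean distance are illustrative assumptions, not part of the quiz.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    # There is no training step: the stored points themselves are the model.
    distances = np.linalg.norm(X_train - x, axis=1)  # Euclidean distance to each point
    nearest = np.argsort(distances)[:k]              # indices of the k closest points
    votes = np.bincount(y_train[nearest])            # count labels among the neighbors
    return int(np.argmax(votes))

# Hypothetical toy data: two well-separated clusters.
X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
y_train = np.array([0, 0, 0, 1, 1, 1])

pred_near_origin = knn_predict(X_train, y_train, np.array([0.2, 0.1]))
pred_far_corner = knn_predict(X_train, y_train, np.array([5.5, 5.2]))
print(pred_near_origin, pred_far_corner)  # 0 1
```

Adding more training rows makes the stored model larger, which is the sense in which KNN grows with the data.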
8. A Discriminative Model attempts to learn:
A.The joint probability distribution
B.The conditional probability distribution
C.The marginal probability
D.The distribution of the input data only
Correct Answer: The conditional probability distribution
Explanation:Discriminative models focus on modeling the boundary between classes, or on predicting the label $Y$ given the input $X$ directly, i.e., the conditional distribution $P(Y \mid X)$.
9. Which of the following is a Generative Model?
A.Support Vector Machine (SVM)
B.Logistic Regression
C.Naive Bayes
D.Decision Tree
Correct Answer: Naive Bayes
Explanation:Naive Bayes is a generative model because it models how the data is generated (the class-conditional likelihood $P(X \mid Y)$ and the class prior $P(Y)$) and then infers the class using Bayes' theorem.
10. In the standard Machine Learning Workflow, which step typically follows 'Data Preprocessing'?
A.Model Deployment
B.Data Collection
C.Model Training/Selection
D.Problem Definition
Correct Answer: Model Training/Selection
Explanation:After data is collected and preprocessed (cleaned/transformed), the next step is usually selecting an algorithm and training the model.
11. Why do we split data into Training and Testing sets?
A.To double the size of the dataset.
B.To ensure the model has enough data to learn.
C.To evaluate how well the model generalizes to unseen data.
D.To fix syntax errors in the code.
Correct Answer: To evaluate how well the model generalizes to unseen data.
Explanation:The test set acts as a proxy for future, unseen data to ensure the model hasn't just memorized the training examples.
12. Which of the following is a typical split ratio for Train/Test data?
A.10% Train, 90% Test
B.50% Train, 50% Test
C.80% Train, 20% Test
D.100% Train, 0% Test
Correct Answer: 80% Train, 20% Test
Explanation:While it varies, an 80/20 or 70/30 split is standard practice: it leaves sufficient data for learning while holding out enough to evaluate the model.
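An 80/20 split can be sketched with NumPy alone. The helper name `split_train_test`, the seed, and the toy arrays below are assumptions made for illustration; libraries such as scikit-learn provide their own utilities for this.

```python
import numpy as np

def split_train_test(X, y, test_fraction=0.2, seed=0):
    """Shuffle the row indices once, then hold out `test_fraction` of them."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(X))
    n_test = int(round(len(X) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

X = np.arange(20).reshape(10, 2)  # 10 samples, 2 features
y = np.arange(10)                 # one label per sample
X_train, X_test, y_train, y_test = split_train_test(X, y)
print(X_train.shape, X_test.shape)  # (8, 2) (2, 2)
```

Shuffling before splitting matters when the rows are stored in a meaningful order, for example sorted by label.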
13. When a model performs very well on training data but effectively 'guesses' on test data (high training accuracy, low test accuracy), it is suffering from:
A.Underfitting
B.Overfitting
C.Optimal convergence
D.High Bias
Correct Answer: Overfitting
Explanation:Overfitting occurs when a model learns the noise and details of the training data to the extent that it negatively impacts the performance on new data.
14. Which concept is associated with High Bias?
A.Overfitting
B.Underfitting
C.High Variance
D.Complex decision boundaries
Correct Answer: Underfitting
Explanation:High bias implies the model makes strong assumptions about the data (simplification), often leading to underfitting where it fails to capture the underlying trend.
15. The Bias-Variance Trade-off suggests that:
A.Ideally, we want high bias and high variance.
B.Increasing model complexity typically decreases bias but increases variance.
C.Increasing model complexity typically increases bias and decreases variance.
D.Bias and Variance are unrelated concepts.
Correct Answer: Increasing model complexity typically decreases bias but increases variance.
Explanation:Simple models have high bias/low variance. Complex models have low bias/high variance. The goal is to find the optimal balance.
16. In a Regression problem, which metric is commonly used to evaluate performance?
A.Accuracy
B.F1-Score
C.Mean Squared Error (MSE)
D.Confusion Matrix
Correct Answer: Mean Squared Error (MSE)
Explanation:MSE measures the average squared difference between the predicted values and the actual values, which makes it suitable for continuous outputs.
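As a small worked example of the definition above, MSE can be computed directly in NumPy. The values in `y_true` and `y_pred` are made up for illustration.

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.0, 7.0])  # actual values
y_pred = np.array([2.0, 5.0, 4.0, 8.0])  # model predictions (hypothetical)

# Errors are 1, 0, -2, -1; their squares sum to 6, and 6 / 4 = 1.5.
mse = np.mean((y_true - y_pred) ** 2)
print(mse)  # 1.5
```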
17. What is calculated using the formula $\frac{TP + TN}{TP + TN + FP + FN}$?
A.Precision
B.Recall
C.Accuracy
D.F1 Score
Correct Answer: Accuracy
Explanation:This is the formula for Accuracy: the ratio of correctly predicted observations to the total observations.
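The ratio of correct predictions to all predictions can be checked by counting the four confusion-matrix cells. The label arrays below are invented for the example, with 1 as the positive class.

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

tp = int(np.sum((y_pred == 1) & (y_true == 1)))  # true positives
tn = int(np.sum((y_pred == 0) & (y_true == 0)))  # true negatives
fp = int(np.sum((y_pred == 1) & (y_true == 0)))  # false positives
fn = int(np.sum((y_pred == 0) & (y_true == 1)))  # false negatives

accuracy = (tp + tn) / (tp + tn + fp + fn)
print(tp, tn, fp, fn, accuracy)  # 3 3 1 1 0.75
```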
18. Which command is used to import Numpy in Python?
A.import numpy as np
B.include numpy
C.using package numpy
D.from numpy import *
Correct Answer: import numpy as np
Explanation:The standard convention in the Python data science community is to alias numpy as np.
19. What is the output of the following code?
    import numpy as np
    arr = np.array([1, 2, 3])
    print(arr.ndim)
A.3
B.1
C.(3,)
D.Array(3)
Correct Answer: 1
Explanation:ndim returns the number of dimensions (axes) of the array. Since it is a 1D list converted to an array, the dimension is 1.
20. How do you create a 3x3 identity matrix in Numpy?
A.np.identity(3, 3)
B.np.eye(3)
C.np.array([3, 3], type='identity')
D.np.ones((3, 3))
Correct Answer: np.eye(3)
Explanation:np.eye(N) creates an N x N identity matrix (ones on the diagonal, zeros elsewhere).
21. Which Numpy function creates an array of evenly spaced values within a given interval?
A.np.space()
B.np.arrange()
C.np.linspace()
D.np.interval()
Correct Answer: np.linspace()
Explanation:np.linspace(start, stop, num) generates num evenly spaced samples, calculated over the interval [start, stop].
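A short comparison may help here: np.linspace takes the number of samples and includes the stop value by default, while np.arange takes a step size and excludes the stop value. The interval and step below are arbitrary choices.

```python
import numpy as np

by_count = np.linspace(0.0, 1.0, 5)   # 5 samples, endpoint included
by_step = np.arange(0.0, 1.0, 0.25)   # step of 0.25, stop excluded

print(by_count)  # [0.   0.25 0.5  0.75 1.  ]
print(by_step)   # [0.   0.25 0.5  0.75]
```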
22. Consider the code: a = np.array([1, 2, 3]) and b = np.array([2, 2, 2]). What is a * b?
A.[2, 4, 6]
B.12
C.[3, 4, 5]
D.Error
Correct Answer: [2, 4, 6]
Explanation:In Numpy, the * operator performs element-wise multiplication.
23. What is the requirement for Broadcasting two arrays in Numpy?
A.They must have exactly the same shape.
B.They must have the same number of dimensions.
C.For each dimension, the sizes must be equal or one of them must be 1.
D.Both arrays must be 1-dimensional.
Correct Answer: For each dimension, the sizes must be equal or one of them must be 1.
Explanation:This is the Broadcasting Rule. Numpy compares shapes element-wise starting from the trailing dimensions.
24. If A is an array of shape (4, 1) and B is an array of shape (3,), what is the resulting shape of A + B?
A.(4, 1)
B.(4, 3)
C.(7,)
D.Error: shapes are incompatible
Correct Answer: (4, 3)
Explanation:B is treated as (1, 3) for broadcasting. The dimensions are compatible: (4, 1) and (1, 3) result in (4, 3).
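The (4, 1) plus (3,) case can be reproduced directly. The array contents below are arbitrary; only the shapes matter.

```python
import numpy as np

A = np.arange(4).reshape(4, 1)   # shape (4, 1): values 0..3 as a column
B = np.array([10, 20, 30])       # shape (3,): treated as (1, 3) when broadcasting

C = A + B                        # each row of C is B plus one element of A
print(C.shape)  # (4, 3)
print(C[1])     # [11 21 31]
```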
25. Which function performs Matrix Multiplication in Numpy?
A.np.mult()
B.np.dot()
C.np.matrix_multiply()
D.np.cross()
Correct Answer: np.dot()
Explanation:For 2-D arrays, np.dot(a, b) performs matrix multiplication. np.matmul(a, b) and the @ operator give the same result for 2-D arrays.
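The difference between element-wise and matrix multiplication is easy to see on two small 2x2 arrays. The matrices below are arbitrary examples.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

elementwise = A * B        # multiplies matching entries
matrix_dot = np.dot(A, B)  # rows of A times columns of B
matrix_at = A @ B          # same result as np.dot for 2-D arrays

print(elementwise.tolist())  # [[5, 12], [21, 32]]
print(matrix_dot.tolist())   # [[19, 22], [43, 50]]
```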
26. To convert a 2D array of shape (3, 4) into a 1D array of shape (12,), which method is used?
A.arr.transpose()
B.arr.reshape(12)
C.arr.flatten()
D.Both B and C
Correct Answer: Both B and C
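Explanation:Both reshape (called as reshape(12) or reshape(-1)) and flatten() convert the matrix into a 1D array. The difference is that flatten() always returns a copy, while reshape returns a view of the original data when possible.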
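A quick check of the options above, as a sketch: all three calls give shape (12,), but they differ in whether the result shares memory with the original array. The variable names are illustrative.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

flat_explicit = arr.reshape(12)   # explicit target length
flat_inferred = arr.reshape(-1)   # -1 lets NumPy infer the length
flat_copy = arr.flatten()         # always a new array

print(flat_explicit.shape, flat_inferred.shape, flat_copy.shape)  # (12,) (12,) (12,)

# For this contiguous array, reshape returns a view; flatten returns a copy.
view_shares = np.shares_memory(arr, flat_inferred)
copy_shares = np.shares_memory(arr, flat_copy)
print(view_shares, copy_shares)  # True False
```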
27. What does np.random.rand(2, 3) return?
A.A 2x3 array of random integers.
B.A 2x3 array of random floats sampled from a normal distribution.
C.A 2x3 array of random floats sampled from a uniform distribution over [0, 1).
D.A single random number.
Correct Answer: A 2x3 array of random floats sampled from a uniform distribution over [0, 1).
Explanation:np.random.rand generates random values in a given shape from a uniform distribution over [0, 1).
28. Which Numpy attribute is used to find the number of rows and columns of a matrix?
A..size
B..shape
C..dim
D..length
Correct Answer: .shape
Explanation:arr.shape returns a tuple representing the dimensions of the array (e.g., (rows, cols)).
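The attributes .shape, .ndim, and .size are easy to mix up, so a side-by-side look may help. The arrays below are arbitrary examples.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])
vector = np.array([1, 2, 3])

print(matrix.shape)  # (2, 3): 2 rows, 3 columns
print(matrix.ndim)   # 2: number of axes
print(matrix.size)   # 6: total number of elements
print(vector.shape)  # (3,)
print(vector.ndim)   # 1
```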
29. Which of the following is an application of Machine Learning?
A.Recommendation Systems (e.g., Netflix)
B.SQL Database Management
C.Network routing protocols (Traditional)
D.Compiling Java code
Correct Answer: Recommendation Systems (e.g., Netflix)
Explanation:Recommendation systems analyze user behavior patterns to predict preferences, a classic ML application.
30. In the context of evaluating a classification model, what is a False Positive?
A.The model correctly predicts the positive class.
B.The model correctly predicts the negative class.
C.The model incorrectly predicts the positive class (predicts Yes when actual is No).
D.The model incorrectly predicts the negative class (predicts No when actual is Yes).
Correct Answer: The model incorrectly predicts the positive class (predicts Yes when actual is No).
Explanation:A False Positive (Type I error) occurs when the model predicts the positive class erroneously.
31. How can you ensure reproducibility of random numbers in Numpy?
A.By running the code multiple times.
B.By using np.random.seed(value).
C.By using np.random.secure().
D.It is impossible to reproduce random numbers.
Correct Answer: By using np.random.seed(value).
Explanation:Setting the seed initializes the pseudo-random number generator to a fixed state, ensuring the same sequence of numbers is generated.
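Reseeding with the same value replays the same sequence, which the sketch below checks. The seed 42 is an arbitrary choice. NumPy also offers np.random.default_rng(seed), which creates an independent generator object instead of changing global state; it is shown alongside for comparison.

```python
import numpy as np

np.random.seed(42)
first = np.random.rand(2, 3)
np.random.seed(42)               # reset the global generator to the same state
second = np.random.rand(2, 3)

same_sequence = np.array_equal(first, second)
in_unit_interval = bool(np.all((first >= 0) & (first < 1)))
print(same_sequence, in_unit_interval)  # True True

# Generator objects seeded identically also agree with each other.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
generators_agree = np.array_equal(rng_a.random(3), rng_b.random(3))
print(generators_agree)  # True
```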
32. What happens if you try to modify the shape of an array using .reshape() such that the total number of elements changes?
A.It pads the array with zeros.
B.It truncates the array.
C.It raises a ValueError.
D.It creates a new array with random values.
Correct Answer: It raises a ValueError.
Explanation:Reshaping must preserve the total number of elements. For example, a 10-element array cannot be reshaped into a 3x3 matrix (9 elements).
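The failure mode can be observed by catching the exception. This is a small sketch; the flag name `raised` is just for the example.

```python
import numpy as np

arr = np.arange(10)

raised = False
try:
    arr.reshape(3, 3)        # 10 elements cannot fill 9 slots
except ValueError:
    raised = True

valid = arr.reshape(2, 5)    # 2 * 5 == 10, so this works
print(raised, valid.shape)   # True (2, 5)
```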
33. Which of the following creates a Numpy array filled with zeros?
A.np.empty((2,2))
B.np.zeros((2,2))
C.np.null((2,2))
D.np.O((2,2))
Correct Answer: np.zeros((2,2))
Explanation:np.zeros() creates an array filled with 0s.
34. If x = np.array([[1, 2], [3, 4]]), what is x.T?
A.The inverse of the matrix.
B.The transpose of the matrix.
C.The determinant.
D.The flattened array.
Correct Answer: The transpose of the matrix.
Explanation:.T is the accessor for the transpose of the array (swapping rows and columns).
35. Which metric is best suited for a classification problem with imbalanced classes?
A.Accuracy
B.Mean Squared Error
C.F1-Score
D.R-squared
Correct Answer: F1-Score
Explanation:Accuracy can be misleading on imbalanced datasets. F1-Score (harmonic mean of Precision and Recall) provides a better balance.
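The gap between accuracy and F1 shows up clearly on a made-up imbalanced dataset: 5 positives out of 100 samples, and a hypothetical model that finds only one of them.

```python
import numpy as np

y_true = np.array([1] * 5 + [0] * 95)  # 5 positives, 95 negatives
y_pred = np.zeros(100, dtype=int)
y_pred[0] = 1                          # the model flags just one true positive

tp = int(np.sum((y_pred == 1) & (y_true == 1)))
fp = int(np.sum((y_pred == 1) & (y_true == 0)))
fn = int(np.sum((y_pred == 0) & (y_true == 1)))

accuracy = float(np.mean(y_true == y_pred))
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(accuracy)      # 0.96
print(round(f1, 3))  # 0.333
```

Accuracy looks strong because the negatives dominate, while F1 exposes that four of the five positives were missed.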
36. Which step helps in handling missing values in a dataset?
A.Feature Scaling
B.Data Imputation
C.One-hot encoding
D.Cross-validation
Correct Answer: Data Imputation
Explanation:Data imputation involves filling missing data with estimated values (like mean, median, or mode).
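Mean imputation, one of the strategies mentioned above, can be sketched in NumPy by treating NaN as the missing-value marker. The column values are invented for the example.

```python
import numpy as np

column = np.array([1.0, np.nan, 3.0, np.nan, 5.0])

column_mean = np.nanmean(column)  # mean over the non-missing entries: 3.0
filled = np.where(np.isnan(column), column_mean, column)

print(filled)  # [1. 3. 3. 3. 5.]
```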
37. What is the purpose of Cross-Validation?
A.To deploy the model.
B.To assess model performance more reliably by using multiple train-test splits.
C.To mix two different datasets.
D.To speed up training.
Correct Answer: To assess model performance more reliably by using multiple train-test splits.
Explanation:Cross-validation (like k-fold) reduces the variance associated with a single trial of train/test split.
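The "multiple train-test splits" idea can be sketched as a k-fold index generator. The function name `k_fold_indices`, the seed, and the sizes below are assumptions for illustration; a real workflow would train and score a model inside the loop.

```python
import numpy as np

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

splits = list(k_fold_indices(n_samples=10, k=5))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))  # 8 2 on every fold
```

Averaging the score over the k folds gives an estimate that depends less on any single split.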
38. Which of the following is NOT a core component of a Confusion Matrix?
A.True Positive
B.False Negative
C.Mean Absolute Error
D.True Negative
Correct Answer: Mean Absolute Error
Explanation:Mean Absolute Error is a regression metric. Confusion matrices consist of TP, TN, FP, and FN.
39. Numpy operations are often faster than standard Python lists because:
A.They utilize Vectorization.
B.They run on the cloud.
C.They ignore data types.
D.They are interpreted line by line.
Correct Answer: They utilize Vectorization.
Explanation:Vectorization allows Numpy to apply operations to whole arrays at once using optimized C-level code, avoiding explicit Python loops.
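The two styles can be compared on the same task, squaring 1,000 integers. The sketch below only checks that the results match; actual speedups depend on array size and hardware, so no timings are claimed.

```python
import numpy as np

values = np.arange(1_000)

# Explicit Python loop over a list: one interpreted multiplication per element.
squares_loop = [v * v for v in values.tolist()]

# Vectorized: a single expression applied to the whole array in compiled code.
squares_vec = values ** 2

results_match = np.array_equal(squares_vec, squares_loop)
print(results_match)  # True
```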
40. What is the range of values returned by np.random.randn()?
A.[0, 1]
B.[-1, 1]
C.$(-\infty, \infty)$, following a Standard Normal Distribution
D.All positive integers
Correct Answer: $(-\infty, \infty)$, following a Standard Normal Distribution
Explanation:randn samples from the 'standard normal' distribution (mean 0, variance 1).
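A quick empirical check, as a sketch: with many samples, the mean should be near 0 and the standard deviation near 1, and values are not confined to [0, 1] or [-1, 1]. The seed and sample count are arbitrary.

```python
import numpy as np

np.random.seed(0)
samples = np.random.randn(100_000)

sample_mean = samples.mean()   # close to 0
sample_std = samples.std()     # close to 1
has_below_minus_one = bool(samples.min() < -1)
has_above_one = bool(samples.max() > 1)

print(has_below_minus_one, has_above_one)  # True True
```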
41. In the equation $Y = f(X) + \epsilon$, what does $\epsilon$ represent?
A.The slope
B.The bias term
C.Irreducible Error / Noise
D.The feature vector
Correct Answer: Irreducible Error / Noise
Explanation:In statistical learning, $\epsilon$ represents the random error term or noise that cannot be reduced by the model.
42. Semi-supervised learning is a combination of:
A.Supervised and Reinforcement Learning
B.Small amount of labeled data and large amount of unlabeled data
C.Clustering and Regression
D.Parametric and Non-parametric models
Correct Answer: Small amount of labeled data and large amount of unlabeled data
Explanation:Semi-supervised learning falls between supervised and unsupervised learning, utilizing a mix of labeled and unlabeled data.
43. What does np.argmax(array) return?
A.The maximum value in the array.
B.The index of the maximum value.
C.The sorted array.
D.The mean of the array.
Correct Answer: The index of the maximum value.
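Explanation:With no axis argument, np.argmax returns the index of the maximum value in the flattened array. With an axis argument, it returns the indices of the maximum values along that axis.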
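The effect of the axis argument is easiest to see on a small 2-D array. The values in `scores` are arbitrary.

```python
import numpy as np

scores = np.array([[1, 9, 3],
                   [7, 2, 8]])

flat_index = int(np.argmax(scores))   # index into the flattened array
per_column = np.argmax(scores, axis=0)  # row index of the max in each column
per_row = np.argmax(scores, axis=1)     # column index of the max in each row

print(flat_index)           # 1
print(per_column.tolist())  # [1, 0, 1]
print(per_row.tolist())     # [1, 2]
```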
44. Which of the following best describes Regularization?
A.Adding more features to the model.
B.Techniques used to prevent overfitting by penalizing complex models.
C.Cleaning the data using regular expressions.
D.Increasing the learning rate.
Correct Answer: Techniques used to prevent overfitting by penalizing complex models.
Explanation:Regularization adds a penalty term to the loss function (e.g., L1 or L2) to discourage the model from becoming too complex.
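As one concrete instance of a penalty term, the sketch below adds an L2 penalty to a mean squared error loss for a linear model. The function name `l2_penalized_loss`, the strength `lam`, and the toy data are assumptions for illustration; other loss functions and penalty scalings are common.

```python
import numpy as np

def l2_penalized_loss(w, X, y, lam):
    """Mean squared error plus lam times the squared L2 norm of the weights."""
    residuals = X @ w - y
    return float(np.mean(residuals ** 2) + lam * np.sum(w ** 2))

X = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([1.0, 2.0])
w = np.array([1.0, 2.0])  # fits the toy data exactly, so the MSE part is 0

unpenalized = l2_penalized_loss(w, X, y, lam=0.0)
penalized = l2_penalized_loss(w, X, y, lam=0.1)  # 0.1 * (1 + 4)
print(unpenalized, penalized)  # 0.0 0.5
```

Even a perfect fit pays a cost proportional to the size of its weights, which is what pushes the model toward simpler solutions.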
45. If arr = np.array([10, 20, 30, 40]), what does arr[1:3] return?
A.array([20, 30])
B.array([10, 20])
C.array([20, 30, 40])
D.array([30])
Correct Answer: array([20, 30])
Explanation:Slicing is inclusive of the start index and exclusive of the stop index. Index 1 is 20 and index 2 is 30; index 3 (the value 40) is excluded.
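A few related slices on the same array may help fix the start-inclusive, stop-exclusive rule. The last line also shows that a basic slice is a view onto the original data.

```python
import numpy as np

arr = np.array([10, 20, 30, 40])

middle = arr[1:3]      # indices 1 and 2
first_two = arr[:2]    # omitted start defaults to 0
last_two = arr[-2:]    # negative indices count from the end

print(middle.tolist(), first_two.tolist(), last_two.tolist())
# [20, 30] [10, 20] [30, 40]

slice_is_view = np.shares_memory(arr, middle)
print(slice_is_view)  # True
```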
46. Data Science is considered an interdisciplinary field that includes:
A.Only Computer Science
B.Only Statistics
C.Statistics, Computer Science, and Domain Expertise
D.Only Web Development
Correct Answer: Statistics, Computer Science, and Domain Expertise
Explanation:Data Science is the intersection of hacking skills (CS), math/stats knowledge, and substantive domain expertise.
47. Which Numpy function calculates the standard deviation?
A.np.mean()
B.np.var()
C.np.std()
D.np.dev()
Correct Answer: np.std()
Explanation:np.std() computes the standard deviation of the given data along the specified axis.
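One detail worth seeing in practice: by default np.std divides by N (population standard deviation), and passing ddof=1 divides by N - 1 (sample standard deviation). The data values are a made-up example with mean 5.

```python
import numpy as np

data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Squared deviations from the mean sum to 32; 32 / 8 = 4, so the std is 2.
population_std = np.std(data)
sample_std = np.std(data, ddof=1)  # sqrt(32 / 7)

print(population_std)  # 2.0

table = np.array([[1.0, 2.0], [3.0, 6.0]])
column_std = np.std(table, axis=0)  # one value per column
print(column_std.tolist())  # [1.0, 2.0]
```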
48. In Machine Learning, a feature vector refers to:
A.The output label.
B.An n-dimensional vector of numerical features that represent some object.
C.The error rate of the model.
D.The Python list containing the library imports.
Correct Answer: An n-dimensional vector of numerical features that represent some object.
Explanation:A feature vector is the input representation of an instance used by the algorithm.
49. Which type of learning is used for a chess-playing engine that learns by playing millions of games against itself?
A.Supervised Learning
B.Unsupervised Learning
C.Reinforcement Learning
D.Semi-supervised Learning
Correct Answer: Reinforcement Learning
Explanation:The engine learns through trial and error, receiving a 'reward' (win) or 'penalty' (loss).
50. What is the result of np.arange(5)?
A.[1, 2, 3, 4, 5]
B.[0, 1, 2, 3, 4]
C.[0, 1, 2, 3, 4, 5]
D.[5]
Correct Answer: [0, 1, 2, 3, 4]
Explanation:np.arange(stop) generates values from 0 up to (but not including) stop.