1. What does the "state space" in AI problem solving represent?
Problem formulation and state space search
Easy
A.The cost of the solution path
B.The set of all possible states reachable from the initial state
C.The algorithm used to solve the problem
D.Only the final goal state
Correct Answer: The set of all possible states reachable from the initial state
Explanation:
The state space is a fundamental concept in AI, representing the complete set of configurations or states a problem can be in, connected by actions or transitions.
2. Which of the following is a key characteristic of uninformed search algorithms?
Introduction to uninformed search
Easy
A.They have no information about the cost or distance to the goal
B.They use problem-specific knowledge to find a solution
C.They are also known as heuristic search algorithms
D.They always find the optimal solution
Correct Answer: They have no information about the cost or distance to the goal
Explanation:
Uninformed (or blind) search algorithms explore the search space systematically without any extra information about the problem, such as how far a state is from the goal. They only know the problem definition.
3. What is a heuristic function, denoted as h(n), used for in a search algorithm?
Heuristic search
Easy
A.To estimate the cost of the cheapest path from node n to a goal state
B.To guarantee that the search will find a solution
C.To store the visited nodes in the search tree
D.To calculate the exact cost of the path from the start node to node n
Correct Answer: To estimate the cost of the cheapest path from node n to a goal state
Explanation:
A heuristic function h(n) provides an educated guess or an estimate of the cost to reach the goal from a given state n. It guides informed search algorithms towards more promising states.
4. The A* search algorithm evaluates nodes by combining which two functions?
Core search algorithms: A* search
Easy
A.g(n) and h(n)
B.The depth of the node and the branching factor
C.h(n) only
D.g(n) only
Correct Answer: g(n) and h(n)
Explanation:
A* search uses an evaluation function f(n) = g(n) + h(n), where g(n) is the actual cost from the start node to node n, and h(n) is the estimated cost from node n to the goal.
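To make the evaluation function concrete, here is a minimal Python sketch of A* ordering its frontier by f(n) = g(n) + h(n). The three-node graph, its costs, and the zero heuristic are illustrative assumptions, not part of the question.

```python
import heapq

def a_star(start, goal, neighbors, h):
    """A* search: always expand the frontier node with the lowest
    f(n) = g(n) + h(n); returns (path cost, path) or None."""
    frontier = [(h(start), 0, start, [start])]  # (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        for nxt, step in neighbors(node):
            g2 = g + step
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2
                heapq.heappush(frontier, (g2 + h(nxt), g2, nxt, path + [nxt]))
    return None

# Hypothetical toy graph: S -> A -> G (cost 1 + 1) beats S -> G (cost 3).
graph = {"S": [("A", 1), ("G", 3)], "A": [("G", 1)], "G": []}
cost, path = a_star("S", "G", lambda n: graph[n], lambda n: 0)
print(cost, path)  # 2 ['S', 'A', 'G']
```

With h(n) = 0 everywhere, the sketch behaves like Uniform-Cost Search; a nonzero admissible heuristic would simply reorder the frontier.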
5. In a Constraint Satisfaction Problem (CSP), what is the primary goal?
Constraint satisfaction
Easy
A.To find an assignment of values to variables that satisfies all given constraints
B.To find a state that maximizes a certain objective function
C.To find the shortest path between two nodes
D.To explore the entire state space
Correct Answer: To find an assignment of values to variables that satisfies all given constraints
Explanation:
A CSP is defined by a set of variables, a domain of values for each variable, and a set of constraints. The solution is an assignment of values that violates no constraints.
6. What is the fundamental idea behind Gradient Descent?
Basics of optimization: gradient-based methods and metaheuristics
Easy
A.To take iterative steps in the opposite direction of the function's gradient
B.To explore all possible solutions to find the absolute best one
C.To randomly jump to different points in the search space
D.To divide the problem into smaller, simpler subproblems
Correct Answer: To take iterative steps in the opposite direction of the function's gradient
Explanation:
Gradient Descent is an optimization algorithm that aims to find a local minimum of a function by repeatedly moving in the direction of the steepest descent, which is the negative of the gradient.
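A minimal sketch of that update rule, using an assumed toy objective f(x) = (x - 3)^2 whose gradient is 2(x - 3):

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly step opposite the gradient to approach a local minimum."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)  # move in the direction of steepest descent
    return x

# Minimize f(x) = (x - 3)^2; the minimum is at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_min, 4))  # 3.0
```

The learning rate lr and step count here are arbitrary illustrative choices; on non-convex functions the same loop stops at whichever local minimum the slope leads to.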
7. In reinforcement learning, what does an "agent" receive from the "environment" as feedback for its actions?
Introduction to reinforcement learning for sequential decision problems
Easy
A.A new set of rules
B.A reward or a penalty
C.The source code of the environment
D.A heuristic function
Correct Answer: A reward or a penalty
Explanation:
The core loop of reinforcement learning involves an agent taking an action in an environment, and the environment responding with a new state and a reward signal (positive or negative) that indicates the quality of the action.
8. Which of the following is a standard metric for evaluating the performance of a search algorithm?
AI problem design: Solution metrics
Easy
A.Completeness (does it always find a solution if one exists?)
B.The developer's experience level
C.The number of lines of code
D.The programming language used
Correct Answer: Completeness (does it always find a solution if one exists?)
Explanation:
Key metrics for search algorithms include completeness, optimality (does it find the best solution?), time complexity, and space complexity.
9. Which component of a problem formulation defines the starting point for the search agent?
Problem formulation and state space search
Easy
A.The action function
B.The goal test
C.The path cost
D.The initial state
Correct Answer: The initial state
Explanation:
A well-defined problem has four essential components: an initial state, a set of possible actions, a transition model (or action function), and a goal test. The initial state is where the agent begins.
10. Which uninformed search algorithm explores the deepest node in the current frontier of the search tree first?
Introduction to uninformed search
Easy
A.Breadth-First Search (BFS)
B.Depth-First Search (DFS)
C.Uniform-Cost Search
D.A* Search
Correct Answer: Depth-First Search (DFS)
Explanation:
Depth-First Search (DFS) always expands the deepest node in the current frontier. It explores as far as possible along each branch before backtracking.
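The "deepest node first" behaviour falls out of using a LIFO stack for the frontier. A minimal sketch, on an assumed toy graph:

```python
def dfs(start, goal, neighbors):
    """Depth-first search: the stack's pop() always returns the most
    recently added (deepest) node, so each branch is followed as far
    as possible before backtracking."""
    stack = [(start, [start])]
    visited = set()
    while stack:
        node, path = stack.pop()
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for nxt in neighbors(node):
            if nxt not in visited:
                stack.append((nxt, path + [nxt]))
    return None

# Hypothetical toy graph for illustration.
graph = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
print(dfs("A", "D", lambda n: graph[n]))  # ['A', 'B', 'D']
```

Swapping the stack for a FIFO queue would turn this same loop into Breadth-First Search.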
11. Greedy Best-First Search is a type of Best-First Search that expands the node that is...?
Core search algorithms: Best-first search
Easy
A.Chosen randomly from the frontier
B.The deepest in the search tree
C.Closest to the initial state
D.Estimated to be closest to the goal state
Correct Answer: Estimated to be closest to the goal state
Explanation:
Greedy Best-First Search uses only a heuristic function h(n) to estimate the distance to the goal and always expands the node that appears to be closest, ignoring the cost already incurred to reach that node.
12. What does it mean for a heuristic to be "admissible"?
Heuristic search
Easy
A.It always calculates the exact cost to the goal
B.It is very complex to compute
C.It never overestimates the true cost to reach the goal
D.It guarantees the search will be fast
Correct Answer: It never overestimates the true cost to reach the goal
Explanation:
An admissible heuristic is an optimistic one; its estimated cost is always less than or equal to the actual cost of the cheapest path from node n to the goal. Admissibility is a crucial property for A* to guarantee optimality.
13. Which of the following is a classic example of a Constraint Satisfaction Problem (CSP)?
Constraint satisfaction
Easy
A.Sorting a list of numbers
B.The 8-puzzle problem
C.Finding the shortest path in a graph
D.Map coloring
Correct Answer: Map coloring
Explanation:
The map coloring problem, where no two adjacent regions can have the same color, is a classic CSP. The variables are the regions, the domains are the available colors, and the constraints are that adjacent regions must have different colors.
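A minimal backtracking sketch of this CSP; the three-region map and color set are illustrative assumptions:

```python
def color_map(regions, adjacent, colors):
    """Backtracking search for map coloring: assign each region a color
    that differs from every already-colored neighbor, undoing the
    assignment and trying the next color on failure."""
    assignment = {}

    def backtrack(i):
        if i == len(regions):
            return True  # every region colored consistently
        region = regions[i]
        for c in colors:
            if all(assignment.get(nb) != c for nb in adjacent[region]):
                assignment[region] = c
                if backtrack(i + 1):
                    return True
                del assignment[region]  # undo and try the next color
        return False

    return assignment if backtrack(0) else None

# Hypothetical map: three mutually adjacent regions need three colors.
adjacent = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
print(color_map(["A", "B", "C"], adjacent, ["red", "green", "blue"]))
```

The variables are the regions, the domain is the color list, and the `all(...)` check enforces the "adjacent regions differ" constraints.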
14. What is a "metaheuristic"?
Basics of optimization: gradient-based methods and metaheuristics
Easy
A.A specific, low-level algorithm for a single problem
B.A method that guarantees finding the globally optimal solution
C.A high-level strategy that guides other heuristics to find a good solution
D.A mathematical proof of an algorithm's correctness
Correct Answer: A high-level strategy that guides other heuristics to find a good solution
Explanation:
Metaheuristics like Genetic Algorithms or Simulated Annealing are general-purpose optimization strategies that orchestrate a search process, often for problems where finding an exact, optimal solution is too difficult.
15. In the context of search algorithms, what does "space complexity" generally measure?
AI problem design: Complexity
Easy
A.The time it takes a programmer to write the code
B.The number of nodes generated during the search
C.The length of the solution path found
D.The amount of memory the algorithm requires
Correct Answer: The amount of memory the algorithm requires
Explanation:
Space complexity refers to the maximum number of nodes stored in memory at any one time during the search. It's a critical measure of an algorithm's resource requirements.
16. For a search problem like finding the best route on a map, what is the most fundamental data required to define the problem?
AI problem design: Data requirements
Easy
A.User reviews of different locations
B.The history of all previous trips taken
C.The current weather conditions
D.A graph of locations (nodes) and roads (edges) with costs
Correct Answer: A graph of locations (nodes) and roads (edges) with costs
Explanation:
To formulate a route-finding problem, the essential data is the state space, which is represented by a graph of locations and the connections between them, including costs like distance or travel time.
17. If the heuristic function h(n) in an A* search is always zero, which algorithm does A* search effectively become?
Core search algorithms: A* search
Easy
A.Greedy Best-First Search
B.Hill Climbing
C.Depth-First Search
D.Uniform-Cost Search
Correct Answer: Uniform-Cost Search
Explanation:
If h(n) = 0, the evaluation function becomes f(n) = g(n). The search will then prioritize expanding the node with the lowest path cost from the start, which is the exact behavior of Uniform-Cost Search.
18. The main goal of a reinforcement learning agent is typically to learn a...?
Introduction to reinforcement learning for sequential decision problems
Easy
A.Policy that maximizes cumulative reward
B.Heuristic function for A* search
C.Map of the environment
D.Set of constraints for a CSP
Correct Answer: Policy that maximizes cumulative reward
Explanation:
A policy is a strategy or mapping from states to actions. The agent learns a policy that tells it what action to take in any given state to maximize its total expected reward over the long run.
19. What does "optimality" mean as a metric for a search algorithm?
AI problem design: Solution metrics
Easy
A.The algorithm uses the least amount of memory
B.The algorithm is guaranteed to find the best possible solution
C.The algorithm finds a solution very quickly
D.The algorithm can solve any type of problem
Correct Answer: The algorithm is guaranteed to find the best possible solution
Explanation:
An optimal algorithm is one that, if a solution exists, will find the one with the highest quality according to the problem's cost function (e.g., the one with the lowest path cost).
20. What is the "goal test" in a search problem?
Problem formulation and state space search
Easy
A.A function that determines if a given state is a goal state
B.A heuristic that estimates the distance to the goal
C.The final step in the search algorithm
D.The cost of reaching the goal
Correct Answer: A function that determines if a given state is a goal state
Explanation:
The goal test is a critical part of problem formulation. It's a function that takes a state as input and returns true if it is a solution to the problem, and false otherwise, telling the search when to stop.
21. If you are formulating the 8-puzzle problem and change the state representation from a 3x3 grid of integers to a single string of 9 characters, how does this primarily affect the problem formulation?
Problem formulation and state space search
Medium
A.It alters the set of possible actions (e.g., 'move blank left').
B.It changes the goal test, as the target configuration is now a string.
C.It fundamentally changes the size of the abstract state space.
D.It is an implementation detail that does not change the abstract problem formulation (states, actions, goal test).
Correct Answer: It is an implementation detail that does not change the abstract problem formulation (states, actions, goal test).
Explanation:
The abstract state space (the set of all reachable board configurations) remains the same regardless of how a state is represented in code. The conceptual actions (moving the blank tile) and the goal configuration are also unchanged. Changing the data structure from a 2D array to a string is a choice of implementation, not a change to the formal definition of the problem itself.
22. For a state space graph with a very large (or infinite) depth but a solution at a shallow depth d, why is Breadth-First Search (BFS) often preferred over Depth-First Search (DFS)?
Introduction to uninformed search
Medium
A.DFS is not complete and might get stuck in an infinite branch, never finding the shallow solution.
B.DFS is not optimal, whereas BFS always finds the best solution.
C.BFS is faster because it explores fewer nodes.
D.BFS has better space complexity in this scenario.
Correct Answer: DFS is not complete and might get stuck in an infinite branch, never finding the shallow solution.
Explanation:
In a graph with infinite paths, Depth-First Search is not complete. It may follow a deep or infinite path and never backtrack to find a shallow solution on another branch. Breadth-First Search explores layer by layer, guaranteeing that it will find the shallowest solution first and is therefore complete.
23. In A* search, the evaluation function is f(n) = g(n) + h(n). If we use an admissible heuristic where h(n) = 0 for all nodes n, which search algorithm does A* effectively become?
Core search algorithms: Best-first search, A* search, Heuristic search
Medium
A.Greedy Best-First Search
B.Uniform Cost Search (UCS)
C.Breadth-First Search (BFS)
D.Depth-First Search (DFS)
Correct Answer: Uniform Cost Search (UCS)
Explanation:
If the heuristic is always 0, the A* evaluation function becomes f(n) = g(n). The search will then prioritize expanding the node with the lowest path cost from the start, which is the exact definition of Uniform Cost Search. A* is a generalization of UCS.
24. An A* search is being performed. The path to node N has a cost g(N) = 10. The heuristic estimate from N to the goal is h(N) = 4. A child of N, node M, is generated with a step cost of 5. The heuristic for M is h(M) = 6. What is the f-value for node M?
Core search algorithms: Best-first search, A* search, Heuristic search
Medium
A.14
B.21
C.16
D.19
Correct Answer: 21
Explanation:
The A* evaluation function is f(n) = g(n) + h(n). To calculate f(M), we need g(M) and h(M).
The cost to reach M, g(M), is the cost to reach its parent N plus the step cost from N to M: g(M) = g(N) + 5 = 10 + 5 = 15.
The heuristic for M, h(M), is given as 6.
Therefore, f(M) = g(M) + h(M) = 15 + 6 = 21.
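The same arithmetic as a one-function sketch, plugging in the question's numbers (parent cost 10, step cost 5, child heuristic 6):

```python
def f_value(g_parent, step_cost, h_child):
    """f(child) = g(child) + h(child), where g(child) = g(parent) + step cost."""
    g_child = g_parent + step_cost
    return g_child + h_child

print(f_value(g_parent=10, step_cost=5, h_child=6))  # 21
```

Note that the parent's own heuristic plays no role: h(N) affected when N was expanded, but f(M) depends only on g(M) and h(M).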
25. In a Constraint Satisfaction Problem (CSP), what is the primary benefit of using the Minimum Remaining Values (MRV) heuristic for variable selection?
Constraint satisfaction
Medium
A.It tends to choose a variable that is most likely to cause a failure early, thereby pruning large portions of the search tree.
B.It guarantees that a solution will be found without any backtracking.
C.It always selects the variable with the simplest constraints first.
D.It ensures the solution found is optimal according to some cost function.
Correct Answer: It tends to choose a variable that is most likely to cause a failure early, thereby pruning large portions of the search tree.
Explanation:
The Minimum Remaining Values (MRV) heuristic advises choosing the variable with the fewest legal values remaining in its domain. This is a 'fail-first' strategy. By picking the most constrained variable, if there is a path that will lead to a failure (a dead end with no legal values), this heuristic makes it more likely to discover that failure sooner, allowing the algorithm to backtrack and prune that entire subtree from the search space.
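Variable selection under MRV reduces to a single `min` over domain sizes. A minimal sketch, with illustrative domains assumed to have already been pruned:

```python
def select_mrv_variable(domains, assignment):
    """Minimum Remaining Values: among unassigned variables, pick the one
    with the fewest legal values left in its domain (fail-first)."""
    unassigned = [v for v in domains if v not in assignment]
    return min(unassigned, key=lambda v: len(domains[v]))

# Hypothetical domains after some pruning: Y is the most constrained.
domains = {"X": {1, 2, 3}, "Y": {2}, "Z": {1, 3}}
print(select_mrv_variable(domains, assignment={}))  # Y
```

Choosing Y first means that if Y's single remaining value leads to a dead end, the failure surfaces immediately rather than deep in the subtree.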
26. A key challenge in using gradient-based optimization methods like Gradient Descent is getting stuck in a local minimum. Which of the following techniques is specifically designed to help escape local minima?
Basics of optimization: gradient-based methods and metaheuristics
Medium
A.Using a larger batch size
B.Simulated Annealing
C.Decreasing the learning rate
D.Normalizing the input features
Correct Answer: Simulated Annealing
Explanation:
Simulated Annealing is a metaheuristic optimization algorithm that can escape local minima. It does this by occasionally accepting moves to worse solutions, with the probability of acceptance decreasing over time (as the 'temperature' parameter cools). Gradient-based methods, by their nature, follow the slope downwards and will stop at the first minimum they find, whether it's local or global.
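The escape mechanism is the acceptance rule: worse moves are accepted with probability exp(-delta / T). A minimal sketch of just that rule (the surrounding annealing loop and cooling schedule are omitted):

```python
import math
import random

def anneal_step(current_cost, candidate_cost, temperature):
    """Simulated annealing acceptance rule (minimization): always accept
    improvements, and accept a worse candidate with probability
    exp(-delta / T), which shrinks as the temperature cools."""
    delta = candidate_cost - current_cost
    if delta <= 0:
        return True  # downhill move: always accept
    return random.random() < math.exp(-delta / temperature)
```

At high temperature the exponential is near 1, so uphill moves (the ones that climb out of a local minimum) are frequently accepted; as T approaches 0 the rule degenerates into pure greedy descent.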
27. You are building an email spam detector. The dataset is highly imbalanced, with 99% of emails being non-spam. Why is 'accuracy' a poor metric for evaluating your model's performance in this case?
AI problem design: Complexity, Data requirements, Solution metrics
Medium
A.Accuracy is too computationally expensive to calculate for large datasets.
B.Accuracy does not account for the speed of the classification.
C.Accuracy cannot be used for binary classification problems.
D.A naive model that classifies all emails as non-spam would achieve 99% accuracy but be useless.
Correct Answer: A naive model that classifies all emails as non-spam would achieve 99% accuracy but be useless.
Explanation:
In imbalanced datasets, accuracy can be highly misleading. A simple model that always predicts the majority class (in this case, 'not spam') would achieve an accuracy of 99%. However, it would fail to identify any spam emails at all, making it completely ineffective for its intended purpose. Metrics like Precision, Recall, and F1-score are much more informative in such scenarios.
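A quick numerical sketch of the failure mode, using a hypothetical 1,000-email dataset matching the question's 99/1 split:

```python
# 1,000 emails: 990 non-spam (label 0), 10 spam (label 1).
labels = [0] * 990 + [1] * 10
predictions = [0] * 1000  # naive model: always predict "not spam"

accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
recall = sum(p == 1 and y == 1 for p, y in zip(predictions, labels)) / 10
print(accuracy, recall)  # 0.99 0.0
```

The naive model scores 99% accuracy while catching zero spam: its recall on the spam class is 0, which is exactly what precision/recall-style metrics expose and accuracy hides.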
28. In a Markov Decision Process (MDP), what does the 'Markov Property' imply about the states?
Introduction to reinforcement learning for sequential decision problems
Medium
A.Every state must be reachable from every other state.
B.The future is independent of the past given the present; the current state captures all relevant information.
C.The reward for reaching a state is independent of the action taken.
D.The number of possible actions is the same in every state.
Correct Answer: The future is independent of the past given the present; the current state captures all relevant information.
Explanation:
The Markov Property is the fundamental assumption of an MDP. It states that the effects of an action taken in a state s depend only on s, and not on the prior history of actions and states. In other words, the current state contains all the information needed to decide the future, and the history of how the agent got to s is irrelevant for future transitions.
29. What is the primary risk of using Greedy Best-First Search for a route-finding problem?
Core search algorithms: Best-first search, A* search, Heuristic search
Medium
A.It often finds a path that is significantly longer (and thus suboptimal) than the best possible path.
B.It has a much higher space complexity than A* search.
C.It is not guaranteed to find a solution even if one exists.
D.It cannot be used if the graph contains cycles.
Correct Answer: It often finds a path that is significantly longer (and thus suboptimal) than the best possible path.
Explanation:
Greedy Best-First Search expands the node that it estimates to be closest to the goal, based solely on the heuristic h(n). It completely ignores the cost already incurred to reach that node, g(n). This can lead it down a path that seems promising initially but turns out to be very long and costly overall. It is therefore neither optimal in terms of path cost nor, in general, complete (it can get stuck in loops).
30. What is the main difference between applying Forward Checking and maintaining Arc Consistency (e.g., using the AC-3 algorithm) in a CSP search?
Constraint satisfaction
Medium
A.Arc Consistency guarantees finding a solution, while Forward Checking does not.
B.Forward Checking can solve any CSP without backtracking, unlike Arc Consistency.
C.Forward Checking only checks constraints between the current variable and its neighbors, while Arc Consistency propagates constraints further.
D.Arc Consistency is applied before the search begins, while Forward Checking is used during the search.
Correct Answer: Forward Checking only checks constraints between the current variable and its neighbors, while Arc Consistency propagates constraints further.
Explanation:
When a variable X is assigned a value, Forward Checking checks each unassigned variable Y connected to X and deletes values from Y's domain that are inconsistent. Arc Consistency is more powerful; it propagates constraints. After a value is removed from Y's domain, it will then check all neighbors of Y to see if their domains are now inconsistent, and so on, until no more values can be removed.
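Both techniques rest on the same arc-revision step; they differ in how far its results propagate. A minimal sketch of that shared step, with an assumed not-equal constraint:

```python
def revise(domains, xi, xj, constraint):
    """Remove each value in xi's domain that has no consistent partner
    left in xj's domain. Forward Checking runs this once per neighbor of
    the just-assigned variable; AC-3 re-queues arcs so that every removal
    propagates onward until no domain changes."""
    removed = False
    for v in list(domains[xi]):
        if not any(constraint(v, w) for w in domains[xj]):
            domains[xi].discard(v)
            removed = True
    return removed

# Hypothetical example: once Y is fixed to {1}, revising the arc X -> Y
# under a not-equal constraint prunes 1 from X's domain.
domains = {"X": {1, 2}, "Y": {1}}
revise(domains, "X", "Y", lambda a, b: a != b)
print(domains["X"])  # {2}
```

In full AC-3, a `True` return would put every arc pointing into X back on the work queue, which is precisely the extra propagation Forward Checking skips.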
31. Under what condition is Uniform Cost Search (UCS) guaranteed to be optimal?
Introduction to uninformed search
Medium
A.When all path costs are non-negative, and step costs are greater than some small positive constant ε
B.When all step costs are equal.
C.When the state space is a tree.
D.When the heuristic used is admissible.
Correct Answer: When all path costs are non-negative, and step costs are greater than some small positive constant ε
Explanation:
UCS finds the optimal (lowest cost) path by always expanding the node with the lowest path cost g(n). This strategy works as long as path costs never decrease along a path, which is guaranteed if all step costs are non-negative. The strict positivity (every step cost ≥ ε > 0) ensures the algorithm terminates by preventing infinite loops with zero-cost cycles. UCS does not use a heuristic.
32. In a Genetic Algorithm, what is the combined purpose of the 'selection' and 'mutation' operators?
Basics of optimization: gradient-based methods and metaheuristics
Medium
A.Selection diversifies the population, while mutation ensures convergence.
B.Selection drives the population toward better solutions, while mutation introduces diversity to prevent premature convergence.
C.Selection creates new solutions from old ones, while mutation removes bad solutions.
D.Both operators work to reduce the size of the population in each generation.
Correct Answer: Selection drives the population toward better solutions, while mutation introduces diversity to prevent premature convergence.
Explanation:
Selection (like roulette wheel or tournament selection) ensures that fitter individuals are more likely to be chosen for reproduction, pushing the population's average fitness up (exploitation). Mutation introduces random changes into the individuals' genetic code, creating new, unexplored possibilities and preventing the algorithm from getting stuck in a local optimum by maintaining genetic diversity (exploration).
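A minimal sketch of the two operators side by side; the bit-string individuals and sum-of-ones fitness are illustrative assumptions:

```python
import random

def tournament_select(population, fitness, k=3):
    """Selection (exploitation): the fittest of k random entrants wins,
    so fitter individuals are more likely to reproduce."""
    return max(random.sample(population, k), key=fitness)

def mutate(bits, rate=0.05):
    """Mutation (exploration): flip each bit with a small probability,
    injecting diversity the parents did not contain."""
    return [b ^ 1 if random.random() < rate else b for b in bits]

# Toy fitness: number of 1-bits in the individual.
population = [[0, 0, 0], [1, 0, 1], [1, 1, 1]]
parent = tournament_select(population, fitness=sum, k=3)
print(parent)  # [1, 1, 1] -- here k equals the population size
```

Raising the tournament size k strengthens selection pressure (more exploitation); raising the mutation rate strengthens exploration, which is the trade-off the question describes.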
33. An AI model for image recognition is very complex, using a deep neural network with 150 layers. What is a likely consequence of this high complexity when training on a small dataset of only 1,000 images?
AI problem design: Complexity, Data requirements, Solution metrics
Medium
A.The model will be unable to learn any patterns from the data.
B.The model will likely achieve high training accuracy but perform poorly on new, unseen images (overfitting).
C.The model will generalize well to new data because of its high capacity to learn.
D.The model will train very quickly due to its advanced architecture.
Correct Answer: The model will likely achieve high training accuracy but perform poorly on new, unseen images (overfitting).
Explanation:
A highly complex model has a high capacity to learn, meaning it can fit very intricate patterns. When the dataset is small, the model has enough capacity to essentially 'memorize' the training examples, including their noise and random fluctuations. This leads to high performance on the data it has seen (training accuracy) but a failure to learn the underlying general pattern, resulting in poor performance on new data (poor generalization). This phenomenon is known as overfitting.
34. An RL agent is learning to navigate a maze. For the first 100 trials, it always chooses a random available move. After that, it always chooses the move that it currently estimates has the highest value. This strategy is an example of what?
Introduction to reinforcement learning for sequential decision problems
Medium
A.A purely exploitative strategy.
B.A separation of the exploration and exploitation phases.
C.An Epsilon-Greedy policy.
D.Q-learning.
Correct Answer: A separation of the exploration and exploitation phases.
Explanation:
This strategy dedicates an initial period exclusively to exploration (taking random actions to learn about the environment) and then switches to a purely exploitative phase (always choosing the best-known action). While simple, it's a valid way to handle the trade-off. An Epsilon-Greedy policy would mix exploration and exploitation throughout the entire learning process, not separate them into distinct phases.
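A minimal sketch contrasting the two policies; the q_values table and the 100-trial cutoff are illustrative assumptions:

```python
import random

def two_phase_action(trial, q_values, n_explore=100):
    """Phase-separated strategy: pure exploration for the first n_explore
    trials, then pure exploitation of the current value estimates."""
    actions = list(q_values)
    if trial < n_explore:
        return random.choice(actions)      # exploration phase: random move
    return max(actions, key=q_values.get)  # exploitation phase: best-known move

def epsilon_greedy_action(q_values, epsilon=0.1):
    """Epsilon-greedy instead mixes the two on every trial:
    explore with probability epsilon, otherwise exploit."""
    actions = list(q_values)
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=q_values.get)

q_values = {"left": 0.2, "right": 0.9}
print(two_phase_action(trial=150, q_values=q_values))  # right
```

The structural difference is visible in the conditionals: the first branches on the trial number (distinct phases), the second on a coin flip (interleaved exploration).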
35. When defining a search problem, what is the main consequence of having a more informed (i.e., more constrained and specific) goal test?
Problem formulation and state space search
Medium
A.It guarantees that any solution found is optimal.
B.It always increases the branching factor of the state space.
C.It makes the problem unsolvable by uninformed search algorithms.
D.It may significantly reduce the size of the effective state space to be searched.
Correct Answer: It may significantly reduce the size of the effective state space to be searched.
Explanation:
A more specific goal test prunes branches of the search tree earlier. For example, if the goal is 'get to city C', the search might be large. If the goal is 'get to city C with less than $100 in cost and in under 2 hours', many paths that would have been explored in the first problem will be immediately discarded because they violate the new constraints, effectively reducing the portion of the state space that needs to be searched.
36. If an A* search uses an inadmissible heuristic that sometimes overestimates the true cost to the goal, what is the primary guarantee that is lost?
Core search algorithms: Best-first search, A* search, Heuristic search
Medium
A.The guarantee of optimal efficiency (expanding the minimum number of nodes).
B.The guarantee of completeness (finding a solution if one exists).
C.The guarantee that the algorithm will terminate.
D.The guarantee of finding the optimal (least-cost) solution.
Correct Answer: The guarantee of finding the optimal (least-cost) solution.
Explanation:
Admissibility (never overestimating the cost) is the crucial property that guarantees A*'s optimality. If a heuristic overestimates the cost from a node N on an optimal path, N's f-value might become higher than the f-value of a node on a suboptimal path. This could cause the algorithm to commit to the suboptimal path and find a goal state there first, returning a solution that is not the least-cost one.
37. What is the primary motivation for using optimization methods like Stochastic Gradient Descent (SGD) instead of standard (batch) Gradient Descent in large-scale machine learning?
Basics of optimization: gradient-based methods and metaheuristics
Medium
A.SGD does not require setting a learning rate.
B.SGD provides a much more accurate estimate of the gradient at each step.
C.SGD is computationally much cheaper per update, as it uses only one or a few examples instead of the entire dataset.
D.SGD is guaranteed to converge to the global minimum, whereas batch GD is not.
Correct Answer: SGD is computationally much cheaper per update, as it uses only one or a few examples instead of the entire dataset.
Explanation:
In large-scale ML, datasets can contain millions or billions of examples. Calculating the true gradient requires summing over all of them, which is computationally prohibitive for each weight update. SGD (or mini-batch SGD) approximates the gradient using a single example or a small batch. While this estimate is noisy, it is far cheaper to compute, allowing for much faster and more frequent updates, which often leads to faster convergence in practice.
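A minimal sketch of the per-example update that makes SGD cheap; the toy dataset and the model y = w*x are illustrative assumptions:

```python
import random

def sgd(data, grad_one, w0, lr=0.01, epochs=50):
    """Stochastic gradient descent: update the weight after each single
    example, rather than summing the gradient over the whole dataset
    before every step (as batch gradient descent would)."""
    w = w0
    for _ in range(epochs):
        random.shuffle(data)
        for x, y in data:
            w = w - lr * grad_one(w, x, y)  # cheap, noisy per-example step
    return w

# Fit y = w*x by least squares; the per-example gradient of (w*x - y)^2
# with respect to w is 2*(w*x - y)*x. The true w for this toy data is 2.
data = [(x, 2 * x) for x in range(1, 6)]
w = sgd(data, lambda w, x, y: 2 * (w * x - y) * x, w0=0.0)
print(round(w, 3))
```

Batch gradient descent would compute five gradients per update here; on a dataset of millions of examples that per-update cost is exactly what SGD's single-example (or mini-batch) approximation avoids.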
38. For a map-coloring CSP, which variable ordering heuristic would be most effective for a backtracking search?
Constraint satisfaction
Medium
A.Degree Heuristic: Choose the country that borders the most other uncolored countries.
B.Random Selection: Choose the next country to color at random.
C.Least Constraining Value: Choose the country that can be colored with the most possible colors.
D.Alphabetical Order: Choose the country that comes first alphabetically.
Correct Answer: Degree Heuristic: Choose the country that borders the most other uncolored countries.
Explanation:
The Degree Heuristic advises choosing the variable that is involved in the largest number of constraints on other unassigned variables. In map coloring, this means picking the country that borders the most neighbors. Like MRV, this is a 'fail-first' strategy. By coloring the most constraining region first, it helps to quickly prune the search space and detect dead ends early.
39. When designing an AI system, what does the term 'solution metric' refer to?
AI problem design: Complexity, Data requirements, Solution metrics
Medium
A.A quantifiable measure used to evaluate how well the AI's output achieves the problem's goal.
B.The amount of data required to train the AI model.
C.The computational complexity (time and space) of the AI system.
D.The algorithm used to find the solution, such as A* or Gradient Descent.
Correct Answer: A quantifiable measure used to evaluate how well the AI's output achieves the problem's goal.
Explanation:
A solution metric is a performance measure that defines what a 'good' solution looks like. It is how you score the output of your AI system. Examples include accuracy for classification, mean squared error for regression, path cost for a search problem, or cumulative reward in reinforcement learning. Choosing the right metric is critical as it dictates how the system is optimized and evaluated.
40. What is the fundamental difference between model-based and model-free reinforcement learning?
Introduction to reinforcement learning for sequential decision problems
Medium
A.Model-free RL is always more sample-efficient than model-based RL.
B.Model-free RL uses a policy, while model-based RL uses a value function.
C.Model-based RL attempts to learn the environment's dynamics (transition and reward functions), while model-free RL learns a value function or policy directly from experience.
D.Model-based RL works only in discrete state spaces, while model-free works in continuous ones.
Correct Answer: Model-based RL attempts to learn the environment's dynamics (transition and reward functions), while model-free RL learns a value function or policy directly from experience.
Explanation:
This is the core distinction. A model-based agent first learns a model of 'how the world works' (e.g., the transition probabilities P(s' | s, a) and the reward function R(s, a)). It can then use this model to 'plan' by simulating future outcomes. A model-free agent, like one using Q-learning, skips building an explicit model and learns directly what to do (a policy) or how good it is to be in a state/take an action (a value function) through trial and error.
41. In an A* search, you are using a heuristic that is admissible but not consistent. What is the most significant consequence of this property for the algorithm's behavior on a general graph (not a tree)?
Core search algorithms: A* search
Hard
A.The algorithm's time complexity degrades to be equivalent to that of Dijkstra's algorithm.
B.The algorithm is no longer guaranteed to find the optimal solution.
C.The algorithm might re-expand a node after having already found an optimal path to it, if a new, cheaper path is discovered.
D.The algorithm can no longer use a closed set (visited list) to prune nodes, leading to exponential complexity on all graphs.
Correct Answer: The algorithm might re-expand a node after having already found an optimal path to it, if a new, cheaper path is discovered.
Explanation:
Consistency (or the triangle inequality, h(n) ≤ c(n, n') + h(n') for every successor n' of n) is a stronger condition than admissibility. When a heuristic is consistent, A* is guaranteed to find the optimal path to any node the first time it is expanded. If the heuristic is only admissible, it's possible for the algorithm to find a suboptimal path to a node n first. Later, it might discover a new, cheaper path to n, requiring n to be re-added to the frontier and potentially re-expanded. This is why some A* implementations need to handle re-opening nodes from the closed set. Admissibility alone is sufficient to guarantee that an optimal solution is found at the end, but consistency ensures efficiency by preventing re-expansions of optimal paths.
Incorrect! Try again.
42Consider a Constraint Satisfaction Problem being solved with backtracking search. If you apply Arc Consistency (AC-3) as a pre-processing step, and then during the search you use Forward Checking, what is the primary relationship between these two constraint propagation methods in this context?
Constraint satisfaction
Hard
A.Forward Checking will not prune any additional values from the domains of unassigned variables immediately after the first variable assignment, because AC-3 has already made every arc consistent.
B.The combination is redundant; AC-3 makes Forward Checking's work during the search unnecessary for the entire search tree.
C.Forward Checking propagates constraints to all other unassigned variables, whereas AC-3 only considered pairs of variables, making Forward Checking strictly more powerful.
D.AC-3 guarantees that the first variable assignment will not lead to an immediate dead-end that Forward Checking would have detected.
Correct Answer: Forward Checking will not prune any additional values from the domains of unassigned variables immediately after the first variable assignment, because AC-3 has already made every arc consistent.
Explanation:
AC-3 enforces arc consistency for all pairs of variables in the entire CSP. This means for any arc (X_i, X_j), every value in the domain of X_i has at least one supporting value in the domain of X_j. Forward Checking, after assigning a value to a variable X, checks all unassigned variables that share a constraint with X and removes from their domains any values inconsistent with the assignment. Because AC-3 was run beforehand, the initial assignment to the first variable (say X_1) will not cause any immediate domain wipeouts for its neighbors. Any value in a neighbor's domain already has support, and the first assignment simply selects one of those supports. The real power of Forward Checking is seen after the second assignment and beyond, where new inconsistencies can arise that AC-3 (as a one-off pre-processor) didn't account for.
Incorrect! Try again.
43An agent is using Q-Learning to navigate a grid world. It is in state s, takes action a, and observes reward r and next state s'. According to its ε-greedy policy, the next action it actually takes from s' is a random exploratory action, a'. How does the Q-value update for (s, a) incorporate this information?
Introduction to reinforcement learning for sequential decision problems
Hard
A.The update uses a weighted average of Q(s', a') and max_a Q(s', a), weighted by ε.
B.The update uses the value Q(s', a') as the bootstrap estimate for the next state's value.
C.The update uses max_a Q(s', a), ignoring the random action a' that was actually taken.
D.The update is skipped for this step because an exploratory action was taken, which does not reflect the optimal policy.
Correct Answer: The update uses max_a Q(s', a), ignoring the random action a' that was actually taken.
Explanation:
This question targets the core difference between on-policy (SARSA) and off-policy (Q-Learning) methods. Q-Learning is off-policy, meaning it updates its action-value function using a policy that is different from the one it uses to generate behavior. The behavior policy is ε-greedy, but the update policy is purely greedy. The Q-learning update rule is: Q(s, a) ← Q(s, a) + α [r + γ max_a Q(s', a) − Q(s, a)]. The term max_a Q(s', a) represents the estimated optimal future value from state s', regardless of which action is actually taken next. SARSA, an on-policy algorithm, would have used Q(s', a') in its update, thus using the value of the action that was actually taken.
Incorrect! Try again.
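The off-policy/on-policy distinction above can be made concrete in a few lines of Python. This is a minimal sketch: the Q-table, reward, and hyperparameters (α, γ) are invented purely for illustration.

```python
# Sketch of the two update rules with made-up Q-values.
alpha, gamma = 0.5, 0.9

# Q-table: Q[state][action]
Q = {
    "s":  {"a": 0.0},
    "s'": {"left": 1.0, "right": 4.0},  # max is 4.0 (action "right")
}

r = 1.0              # observed reward for taking a in s
a_explore = "left"   # the epsilon-greedy exploratory action actually taken in s'

# Q-learning (off-policy): bootstrap with the max over actions in s',
# ignoring the exploratory action that was actually taken.
q_learning_target = r + gamma * max(Q["s'"].values())   # 1 + 0.9*4 = 4.6

# SARSA (on-policy): bootstrap with the action actually taken in s'.
sarsa_target = r + gamma * Q["s'"][a_explore]           # 1 + 0.9*1 = 1.9

q_after_qlearning = Q["s"]["a"] + alpha * (q_learning_target - Q["s"]["a"])
q_after_sarsa = Q["s"]["a"] + alpha * (sarsa_target - Q["s"]["a"])

print(q_after_qlearning)  # 2.3
print(q_after_sarsa)      # 0.95
```

The two targets differ only in which action value of s' they bootstrap from, which is exactly the point of the question.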
44You are optimizing a high-dimensional, non-convex function with many saddle points using a gradient-based method. Which of the following algorithm variants is most likely to escape these saddle points most efficiently?
Basics of optimization: gradient-based methods and metaheuristics
Hard
A.Standard (Batch) Gradient Descent
B.Gradient Descent with Momentum
C.Newton's method without any regularization
D.Gradient Descent with a very small, fixed learning rate
Correct Answer: Gradient Descent with Momentum
Explanation:
Saddle points are regions where the gradient is close to zero, but they are not local minima. Standard Gradient Descent can get stuck or slow down dramatically in these regions. Gradient Descent with Momentum accumulates a velocity vector from past gradients. When it enters a saddle point region, even if the current gradient is small, the accumulated momentum helps it 'overshoot' the saddle point and continue descending on the other side. A small learning rate would make it even more likely to get stuck. Newton's method is attracted to points where the gradient is zero, which includes saddle points, and can get stuck there as well.
Incorrect! Try again.
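A minimal sketch of the saddle-point behavior described above, using f(x, y) = x² − y², which has a saddle at the origin. The learning rate, momentum coefficient, and starting point are illustrative choices.

```python
# Starting almost exactly on the saddle (tiny y), momentum accumulates
# velocity along -y and escapes faster than plain gradient descent.
def grad(x, y):
    return 2.0 * x, -2.0 * y  # gradient of f(x, y) = x^2 - y^2

lr, beta, steps = 0.1, 0.9, 30

# Plain gradient descent
x, y = 1.0, 1e-6
for _ in range(steps):
    gx, gy = grad(x, y)
    x -= lr * gx
    y -= lr * gy
y_plain = abs(y)

# Gradient descent with momentum
x, y, vx, vy = 1.0, 1e-6, 0.0, 0.0
for _ in range(steps):
    gx, gy = grad(x, y)
    vx = beta * vx + gx   # accumulate velocity from past gradients
    vy = beta * vy + gy
    x -= lr * vx
    y -= lr * vy
y_momentum = abs(y)

print(y_momentum > y_plain)  # momentum has travelled further from the saddle
```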
45Consider the problem of finding the shortest path for a knight on an n×n chessboard. If we formulate this as a state space search problem, what is the most accurate description of the structure of the state space graph?
Problem formulation and state space search
Hard
A.An undirected graph where some nodes have a degree as low as 2 and others as high as 8.
B.A strongly connected component, as every square is reachable from every other square.
C.A directed acyclic graph (DAG) because the path cost is always increasing.
D.A tree with a branching factor varying between 2 and 8.
Correct Answer: An undirected graph where some nodes have a degree as low as 2 and others as high as 8.
Explanation:
The state space consists of nodes representing the squares on the board. An edge exists between two nodes if a knight can move between them. Since a knight's move is reversible (if you can go from A to B, you can go from B to A), the graph is undirected. It is not a tree because there are multiple paths between squares (cycles exist). It is not a DAG for the same reason. While it is true that all squares are reachable from any starting square (for n ≥ 5), this describes a property of the graph (connectivity), not its fundamental structure. The most precise description is an undirected graph where the degree of each node (the number of possible moves) varies with its position: a corner square has 2 moves, an edge square might have 3 or 4, and a central square has 8. This variance is a key feature of the graph structure.
Incorrect! Try again.
46You are designing an AI for an autonomous drone that must inspect 100 specific points of interest (POIs) on a large structure within a limited battery life. The drone gets a high reward for each POI successfully inspected, but a massive penalty if the battery dies before it returns to the charging station. The time taken to travel between POIs is non-uniform. Which problem formulation and corresponding solution metric is most appropriate?
AI problem design: Solution metrics
Hard
A.Formulate as a search for the highest-reward POI and use a greedy algorithm to maximize immediate reward.
B.Formulate as an Orienteering Problem and use the total reward collected as the primary metric, subject to the battery constraint.
C.Formulate as a Traveling Salesperson Problem (TSP) and use the total path length as the metric to minimize.
D.Formulate as a Vehicle Routing Problem (VRP) and use the number of drones required as the metric.
Correct Answer: Formulate as an Orienteering Problem and use the total reward collected as the primary metric, subject to the battery constraint.
Explanation:
This problem is a classic example of the Orienteering Problem. The goal is not necessarily to visit all points (like TSP) but to collect the maximum possible reward (by visiting a subset of valuable POIs) within a strict budget or constraint (the battery life, which translates to a maximum path cost/time). A TSP formulation is incorrect because it assumes all 100 points must be visited, which might be impossible. A simple greedy approach is unlikely to be optimal, as it might lead the drone far away for a high-reward POI, leaving no battery to return. VRP is irrelevant as there is only one 'vehicle' (drone).
Incorrect! Try again.
47You are using A* search with an inadmissible heuristic h'(n) that consistently overestimates the true cost by a constant factor w > 1, i.e., h'(n) = w · h(n) for some admissible h. Which of the following statements about the solution found by this A* variant is guaranteed to be true?
Core search algorithms: A* search
Hard
A.The cost of the solution found will be no more than w times the cost of the optimal solution.
B.The algorithm will find the optimal solution, but it will be much slower.
C.The algorithm will behave identically to a Greedy Best-First Search.
D.The algorithm is not guaranteed to find a solution, even if one exists.
Correct Answer: The cost of the solution found will be no more than w times the cost of the optimal solution.
Explanation:
Using an inadmissible heuristic of the form h'(n) = w · h(n) for w > 1 (with h admissible) turns the A* algorithm into a variant known as Weighted A*. This algorithm is no longer guaranteed to be optimal. However, it has bounded sub-optimality: the cost of the solution found is guaranteed to be at most w times the cost of the optimal solution. This provides a trade-off: you sacrifice guaranteed optimality for often much faster search speeds, as the search becomes more 'greedy' and directed towards the goal.
Incorrect! Try again.
48Consider a state space graph that is a balanced binary tree of depth d. If the single goal node is located at the rightmost leaf of the tree, how does the number of nodes expanded by Iterative Deepening Search (IDS) compare to Depth-First Search (DFS)?
Introduction to uninformed search
Hard
A.IDS expands approximately twice as many nodes as DFS.
B.DFS and IDS expand the exact same number of nodes.
C.DFS expands significantly fewer nodes as it finds the goal almost immediately.
D.IDS expands approximately d times more nodes than DFS.
Correct Answer: IDS expands approximately twice as many nodes as DFS.
Explanation:
This is a worst-case scenario for both algorithms. A standard DFS would traverse the entire left subtree before moving to the right, expanding almost all nodes. IDS would do the same within each iteration. Let's analyze the nodes generated by IDS. The last iteration (depth limit d) expands the entire tree. The second-to-last iteration (depth limit d−1) expands the entire tree up to that depth, and so on. The total number of nodes generated by IDS is the sum over all depth limits, which is asymptotically dominated by the deepest iterations. For branching factor b, IDS generates approximately b/(b−1) times the number of nodes generated by BFS; with b = 2, that is about twice as many. In this specific worst case, DFS explores the entire tree, similar to BFS. Therefore, IDS will expand roughly twice the number of nodes as this worst-case DFS. A common misconception is that IDS is much less efficient, but in a tree with a reasonable branching factor, the cost is dominated by the last level, making it asymptotically similar to BFS in time but superior in space.
Incorrect! Try again.
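The node-count arithmetic can be checked directly. This sketch assumes a full binary tree where each depth-limited pass of IDS generates every node up to its limit:

```python
# Node counts for a full binary tree of depth d (branching factor b = 2).
def nodes_in_tree(depth):
    return 2 ** (depth + 1) - 1   # nodes in a full binary tree of this depth

def ids_nodes(d):
    # IDS re-runs a depth-limited search for limits 0..d
    return sum(nodes_in_tree(limit) for limit in range(d + 1))

d = 20
bfs = nodes_in_tree(d)       # what BFS (or a full-tree DFS) generates once
ids = ids_nodes(d)
print(round(ids / bfs, 3))   # approaches b/(b-1) = 2 as d grows
```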
49In a backtracking search for a CSP, you are deciding between the Minimum Remaining Values (MRV) and Degree Heuristic for variable ordering. In which scenario would the Degree Heuristic be a more impactful choice than MRV?
Constraint satisfaction
Hard
A.On a problem where the constraint graph is a simple chain (A-B-C-D).
B.In the middle of the search, when one variable has only one value left in its domain.
C.At the very beginning of the search, when all variables have the same number of legal values in their domains.
D.When the goal is to find all possible solutions, not just the first one.
Correct Answer: At the very beginning of the search, when all variables have the same number of legal values in their domains.
Explanation:
The MRV heuristic chooses the variable with the fewest remaining legal values. The Degree Heuristic chooses the variable involved in the largest number of constraints on other unassigned variables. At the start of the search, all domains are full, so MRV has no basis for making a choice (it sees a tie between all variables). In this situation, the Degree Heuristic is a powerful tie-breaker. By choosing a variable with a high degree, we are attempting to prune the search space more effectively early on. For example, assigning a value to a highly connected variable will constrain its many neighbors, hopefully detecting failures earlier. Once the search is underway, MRV often becomes more effective as domain sizes shrink.
Incorrect! Try again.
50You are using Simulated Annealing to solve an optimization problem. The cooling schedule is defined by T_{k+1} = α · T_k, where T_k is the temperature at iteration k. What is the most likely consequence of choosing a value of α very close to 1 (e.g., 0.9999)?
Basics of optimization: gradient-based methods and metaheuristics
Hard
A.The algorithm's performance will be highly sensitive to the initial state.
B.The algorithm will perform a very long and thorough search of the solution space, but may be computationally expensive.
C.The algorithm will fail to converge as the probability of accepting worse states will remain high indefinitely.
D.The algorithm will converge very quickly to the nearest local minimum, behaving like greedy hill-climbing.
Correct Answer: The algorithm will perform a very long and thorough search of the solution space, but may be computationally expensive.
Explanation:
An α value very close to 1 means the temperature decreases very slowly (a slow annealing schedule). This allows the algorithm to spend a long time at higher temperatures, where it has a high probability of accepting worse moves, enabling it to explore the search space broadly and escape local optima. As the temperature eventually cools, it transitions to exploitation and settles into a good minimum. This slow cooling increases the probability of finding the global optimum but comes at the cost of a much longer runtime. A value of α close to 0 would cause rapid cooling, making the algorithm behave like greedy hill-climbing.
Incorrect! Try again.
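A small sketch of how α controls the schedule. The starting temperature, the cost increase Δ of a "worse" move, and the iteration count are illustrative choices:

```python
import math

def temperature(t0, alpha, k):
    return t0 * alpha ** k          # T_k = alpha^k * T_0

def accept_prob(delta, T):
    return math.exp(-delta / T)     # Metropolis acceptance for a worse move

t0, delta, k = 100.0, 1.0, 10_000
slow = temperature(t0, 0.9999, k)   # still warm after 10,000 iterations
fast = temperature(t0, 0.9, k)      # numerically frozen almost immediately

print(round(slow, 2))                       # ~36.79: exploration continues
print(round(accept_prob(delta, slow), 3))   # worse moves still accepted often
print(fast)                                 # effectively zero
```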
51In the context of a Markov Decision Process (MDP), what does the Bellman Optimality Equation for V*(s) fundamentally express?
Introduction to reinforcement learning for sequential decision problems
Hard
A.The optimal policy is to always choose the action that leads to the state with the highest immediate reward.
B.The transition probability from state s to s' is independent of all previous states and actions.
C.The optimal value of a state is the expected immediate reward plus the discounted optimal value of the best successor state, maximized over all possible actions.
D.The value of a state is the sum of all possible future rewards the agent can receive from that state.
Correct Answer: The optimal value of a state is the expected immediate reward plus the discounted optimal value of the best successor state, maximized over all possible actions.
Explanation:
The Bellman Optimality Equation for the optimal value function is: V*(s) = max_a Σ_{s'} P(s'|s,a) [R(s,a,s') + γ V*(s')]. This equation states that the value of being in state s under the optimal policy is found by choosing the action a that maximizes the expected return. The expected return for a given action is the sum over all possible next states s' of: the probability of transitioning to s' (P(s'|s,a)) times the sum of the immediate reward (R(s,a,s')) and the discounted value of that next state under the optimal policy (γ V*(s')). The other options are incorrect: one describes the Markov property, another is a purely greedy (and suboptimal) policy, and another omits the crucial discounting and expectation components.
Incorrect! Try again.
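The Bellman optimality backup can be applied iteratively, which is exactly value iteration. The tiny two-state MDP below (states, actions, transitions, rewards) is invented purely for illustration:

```python
# Value iteration: repeatedly apply the Bellman optimality backup.
gamma = 0.9

# P[s][a] = list of (prob, next_state, reward) outcomes
P = {
    "s0": {
        "stay": [(1.0, "s0", 0.0)],
        "go":   [(0.8, "s1", 5.0), (0.2, "s0", 0.0)],
    },
    "s1": {
        "stay": [(1.0, "s1", 1.0)],
    },
}

V = {s: 0.0 for s in P}
for _ in range(1000):
    V = {
        s: max(
            sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
            for outcomes in P[s].values()
        )
        for s in P
    }

print(round(V["s1"], 2))  # 1 / (1 - 0.9) = 10.0
print(round(V["s0"], 2))
```

Here V("s1") solves V = 1 + γV exactly, giving 10, and V("s0") is the fixed point of the "go" branch, 11.2 / 0.82.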
52If the state in a search problem is represented by a permutation of n items, and the actions are swaps of any two adjacent items, what is the diameter of the state space graph (i.e., the longest shortest path between any two states)?
AI problem design: Complexity
Hard
A.n − 1
B.n log n
C.n(n−1)/2
D.2^n − 1
Correct Answer: n(n−1)/2
Explanation:
The problem of sorting a permutation using adjacent swaps is equivalent to Bubble Sort. The number of swaps required to transform one permutation into another equals the number of inversions between them. The maximum number of inversions in a permutation of n items occurs when it is in reverse sorted order, which has n(n−1)/2 inversions. Since each adjacent swap can reduce the number of inversions by at most one, this gives a lower bound. It is also an upper bound, as Bubble Sort can sort any permutation in this many swaps. Therefore, the longest shortest path (diameter) between the identity permutation and the reverse permutation is n(n−1)/2, which is in Θ(n²).
Incorrect! Try again.
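The inversion-counting argument can be verified with a short script (the naive O(n²) count is fine at this size):

```python
# Minimum adjacent swaps between two permutations = number of inversions,
# maximized by the fully reversed permutation.
def inversions(perm):
    return sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )

n = 6
reverse = list(range(n, 0, -1))           # worst case: fully reversed
print(inversions(reverse))                # n*(n-1)/2 = 15
print(inversions(list(range(1, n + 1))))  # identity permutation: 0
```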
53Suppose that on a given problem, A* search with an admissible heuristic h finds the optimal path to the goal, with cost 50, while Greedy Best-First Search (GBFS) using the same heuristic finds a path with cost 100. Which of the following statements must be true?
Core search algorithms: Best-first search
Hard
A.GBFS expanded at least one node where its path cost was greater than 50.
B.The initial heuristic value at the start node, h(start), must have been greater than 50.
C.GBFS must have expanded a node that was not on the optimal path found by A*.
D.A* search expanded strictly more nodes than GBFS.
Correct Answer: GBFS must have expanded a node that was not on the optimal path found by A*.
Explanation:
A* with an admissible heuristic is guaranteed to find the optimal path, which has a cost of 50. GBFS is guided solely by the heuristic h(n) and ignores the path cost g(n). For GBFS to find a suboptimal path of cost 100, it must have been 'lured' by promising heuristic values down a path that turned out to be expensive. This means it must have chosen to expand a node that was not on the true optimal path; otherwise it would have followed the optimal path. The other options are not necessarily true. GBFS could find the suboptimal path without any node's path cost exceeding 50 until the very end. The initial heuristic value is an admissible estimate of the optimal cost (50), so it must be less than or equal to 50, not greater. And A* could potentially expand fewer nodes if the heuristic is very accurate and guides it directly to the goal, while GBFS gets lost in a large section of the state space that looks cheap according to the heuristic but is expensive in reality.
Incorrect! Try again.
54An AI system for diagnosing rare diseases from patient symptoms shows 99.9% accuracy during testing. However, the prevalence of any single rare disease is 1 in 10,000. Why is accuracy a deeply misleading metric in this context, and what would be a more appropriate set of metrics to evaluate its real-world performance?
AI problem design: Data requirements
Hard
A.Because a model that always predicts 'no rare disease' would achieve 99.99% accuracy, yet be useless. Better metrics would be Precision, Recall, and the F1-Score.
B.Because the dataset is too small to achieve statistical significance, and the model is likely overfitting. The best metric is validation loss.
C.Because accuracy doesn't account for the computational cost of the model. A better metric is accuracy per FLOP.
D.Because the problem is sequential, and accuracy doesn't capture the temporal nature of diagnostics. A better metric is discounted cumulative reward.
Correct Answer: Because a model that always predicts 'no rare disease' would achieve 99.99% accuracy, yet be useless. Better metrics would be Precision, Recall, and the F1-Score.
Explanation:
This is a classic case of a severe class imbalance problem. When the positive class (having a rare disease) is extremely rare, a naive classifier can achieve very high accuracy by simply always predicting the majority negative class. This 'accuracy paradox' makes accuracy a useless metric. The critical goal is to correctly identify the few positive cases. Precision (of the positive predictions, how many are correct?) and Recall (of all actual positive cases, how many did the model find?) are far more informative. The F1-Score, which is the harmonic mean of Precision and Recall, provides a single score that balances this trade-off, giving a much better assessment of the model's utility.
Incorrect! Try again.
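A quick numerical illustration of the accuracy paradox, assuming a hypothetical screening population with the stated 1-in-10,000 prevalence:

```python
# An "always predict healthy" classifier on a severely imbalanced population.
n = 1_000_000
positives = n // 10_000          # 100 actual rare-disease cases

# Always-negative classifier: no true or false positives,
# every actual case is a false negative.
tp, fp, fn = 0, 0, positives
tn = n - positives

accuracy = (tp + tn) / n
recall = tp / (tp + fn)                            # cases actually found
precision = tp / (tp + fp) if (tp + fp) else 0.0   # correctness of positive calls

print(accuracy)   # 0.9999 -- looks excellent
print(recall)     # 0.0    -- finds no sick patients at all
```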
55You are formulating the Rubik's Cube problem for an AI solver. Which of the following state representations would be most problematic for use with an efficient search algorithm like A* and a pattern database heuristic?
Problem formulation and state space search
Hard
A.A 3D array representing the cube, with each of the 26 outer cubies having an ID and an orientation vector.
B.An image of the cube from a single, fixed perspective.
C.A flat list of 54 color labels, one for each facelet, in a fixed order.
D.A string of 20 characters representing the core properties (position and orientation) of the 8 corner and 12 edge pieces.
Correct Answer: An image of the cube from a single, fixed perspective.
Explanation:
An effective state representation must be canonical and complete. An image from a single perspective is highly problematic because it's ambiguous (you can't see the other faces) and not canonical (multiple cube states can look identical from one angle, and the same state can produce different images depending on lighting and global orientation). This makes it nearly impossible to check for visited states (a critical part of A*) or to compute a consistent heuristic. The other options, while different, are all complete and canonical representations of the cube's permutation state, allowing for effective generation of successor states and heuristic calculation.
Incorrect! Try again.
56In a graph with many cycles and non-uniform positive edge costs, what is a primary advantage of Uniform Cost Search (UCS) over a standard Breadth-First Search (BFS) implementation that stops upon finding the first path to the goal?
Introduction to uninformed search
Hard
A.UCS has a better space complexity, as it does not need to store all nodes at the current depth.
B.UCS is guaranteed to find the optimal (least-cost) path, whereas BFS is only guaranteed to find the shallowest path (least number of edges).
C.UCS can handle negative edge costs, while BFS cannot.
D.UCS expands fewer nodes on average because it is a form of informed search.
Correct Answer: UCS is guaranteed to find the optimal (least-cost) path, whereas BFS is only guaranteed to find the shallowest path (least number of edges).
Explanation:
The defining characteristic of UCS (Dijkstra's algorithm) is that it explores the state space in order of increasing path cost g(n). This guarantees that the first time it reaches the goal node, it has found the path with the minimum possible cost. BFS explores in order of increasing depth (number of edges). In a graph with non-uniform costs, the shallowest path is often not the cheapest one. For example, a path A->B->Goal with costs (10, 10) is shallower than A->C->D->Goal with costs (1, 1, 1), but the latter is far cheaper. BFS would find the first path and stop, while UCS would find the second. UCS is an uninformed search (it has no heuristic), and its space complexity is generally equal to or worse than that of BFS, as its frontier can be large. Neither can handle negative edge costs without modification.
Incorrect! Try again.
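The example in the explanation can be run directly. This sketch implements an early-stopping BFS and a UCS over that toy graph:

```python
import heapq
from collections import deque

# Shallow path A->B->Goal costs 20; deeper path A->C->D->Goal costs 3.
graph = {
    "A": [("B", 10), ("C", 1)],
    "B": [("Goal", 10)],
    "C": [("D", 1)],
    "D": [("Goal", 1)],
}

def bfs_cost(start, goal):
    # Stops at the first (shallowest) path found, summing its edge costs.
    queue = deque([(start, 0)])
    visited = {start}
    while queue:
        node, cost = queue.popleft()
        if node == goal:
            return cost
        for nxt, c in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, cost + c))

def ucs_cost(start, goal):
    # Expands nodes in order of increasing path cost g(n).
    frontier = [(0, start)]
    expanded = set()
    while frontier:
        cost, node = heapq.heappop(frontier)
        if node == goal:
            return cost
        if node in expanded:
            continue
        expanded.add(node)
        for nxt, c in graph.get(node, []):
            heapq.heappush(frontier, (cost + c, nxt))

bfs_result = bfs_cost("A", "Goal")
ucs_result = ucs_cost("A", "Goal")
print(bfs_result)  # 20 (shallowest path)
print(ucs_result)  # 3  (cheapest path)
```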
57Consider the design of a heuristic for the 15-puzzle. Let h1(n) be the number of misplaced tiles. Let h2(n) be the sum of the Manhattan distances of each tile from its goal position. Let h3(n) = h1(n) + h2(n). Let h4(n) = max(h1(n), h2(n)). Which of these heuristics is both admissible and dominates the others that are also admissible?
Heuristic search
Hard
A.h1(n)
B.h2(n)
C.h3(n)
D.h4(n)
Correct Answer: h2(n)
Explanation:
A heuristic h_a dominates h_b if h_a(n) ≥ h_b(n) for all states n. An admissible heuristic never overestimates the true cost.
1. h1 (misplaced tiles) is admissible because each misplaced tile must be moved at least once.
2. h2 (Manhattan distance) is admissible because each tile must move at least its Manhattan distance, and the calculation does not assume moves interfere with each other (moving one tile doesn't move another closer).
3. h3 = h1 + h2 is not admissible. Consider a tile that is one square from its goal: a single move fixes it, so the true cost is 1, but h3 would estimate 1 + 1 = 2.
4. h4 = max(h1, h2). Since every misplaced tile is at Manhattan distance at least 1 from its goal, h2 is always greater than or equal to h1, so h4 = h2.
Comparing the admissible heuristics h1 and h2, h2 provides a more accurate (and always greater or equal) estimate. Therefore, h2 is admissible and dominates h1.
Incorrect! Try again.
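Both heuristics are a few lines of Python. The sample state below uses the 3×3 (8-puzzle) variant for brevity; the 15-puzzle version is identical with width 4. The state itself is an arbitrary illustration, one move from the goal:

```python
# 0 marks the blank square.
goal  = (1, 2, 3, 4, 5, 6, 7, 8, 0)
state = (1, 2, 3, 4, 5, 6, 7, 0, 8)   # one move from the goal

def h1(state):
    # Misplaced tiles (the blank does not count).
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def h2(state, width=3):
    # Sum of Manhattan distances of each tile from its goal square.
    total = 0
    for idx, tile in enumerate(state):
        if tile == 0:
            continue
        g = goal.index(tile)
        total += abs(idx // width - g // width) + abs(idx % width - g % width)
    return total

print(h1(state), h2(state))   # 1 1  (the true cost is also 1)
print(h1(state) + h2(state))  # 2 -- overestimates the true cost of 1
```

The last line shows concretely why adding the two heuristics breaks admissibility.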
58In Reinforcement Learning, what is the fundamental challenge addressed by using eligibility traces (as in TD(λ) or Q(λ))?
Introduction to reinforcement learning for sequential decision problems
Hard
A.The curse of dimensionality: handling state spaces that are too large to store in a lookup table.
B.The exploration-exploitation trade-off: deciding whether to try a new action or stick with the current best one.
C.The non-stationarity of the environment: adapting when the transition probabilities or reward function change over time.
D.The temporal credit assignment problem: efficiently propagating a delayed reward back to the sequence of state-action pairs that were responsible for it.
Correct Answer: The temporal credit assignment problem: efficiently propagating a delayed reward back to the sequence of state-action pairs that were responsible for it.
Explanation:
The credit assignment problem is a core challenge in RL, especially for tasks with delayed rewards (e.g., winning a game of chess). When a reward is received, how do we know which of the many preceding actions were crucial for that success? Standard one-step TD methods (like Q-learning) propagate this reward information back only one step at a time, which can be very slow. Eligibility traces provide a mechanism to bridge this temporal gap. They maintain a short-term memory of recently visited state-action pairs, and when a TD error occurs (e.g., a reward is received), this error is used to update all of the preceding pairs in the trace, weighted by how long ago they occurred. This allows for more efficient propagation of credit (or blame) through time.
Incorrect! Try again.
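A minimal sketch of accumulating eligibility traces on a three-state chain with a single delayed reward. The values of α, γ, and λ, and the chain itself, are illustrative choices:

```python
# One episode: s0 -> s1 -> s2 -> terminal, reward 1.0 on the final step.
alpha, gamma, lam_default = 0.5, 1.0, 0.9
episode = [("s0", 0.0, "s1"), ("s1", 0.0, "s2"), ("s2", 1.0, None)]

def run_episode(lam):
    V = {"s0": 0.0, "s1": 0.0, "s2": 0.0}
    trace = {s: 0.0 for s in V}
    for s, r, s_next in episode:
        v_next = V[s_next] if s_next is not None else 0.0
        delta = r + gamma * v_next - V[s]   # TD error
        trace[s] += 1.0                     # accumulating trace
        for state in V:                     # credit every traced state
            V[state] += alpha * delta * trace[state]
            trace[state] *= gamma * lam     # decay eligibility over time
    return V

td0 = run_episode(lam=0.0)   # plain TD(0): only s2 moves this episode
tdl = run_episode(lam=0.9)   # TD(lambda): credit flows back to s0 too

print(td0["s0"], td0["s2"])  # 0.0 0.5
print(tdl["s0"] > 0.0)       # True
```

With λ = 0 the delayed reward only updates the last state; with λ = 0.9 the same single TD error also credits s0 and s1 in the same episode.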
59When applying Stochastic Gradient Descent (SGD) to train a deep neural network (a non-convex optimization problem), why might a learning rate schedule that starts high and gradually anneals to a lower value be more effective than using a small, fixed learning rate?
Basics of optimization: gradient-based methods and metaheuristics
Hard
A.A high learning rate ensures that the weights are properly initialized before training begins, preventing vanishing gradients.
B.The learning rate must be annealed to satisfy the mathematical convergence criteria for convex optimization problems.
C.The initial high learning rate and noise from mini-batches help the optimizer escape poor local minima and saddle points early in training, while the later low learning rate allows it to converge finely within a good basin of attraction.
D.A small, fixed learning rate causes the loss to oscillate wildly, preventing convergence, whereas an annealed rate dampens these oscillations.
Correct Answer: The initial high learning rate and noise from mini-batches help the optimizer escape poor local minima and saddle points early in training, while the later low learning rate allows it to converge finely within a good basin of attraction.
Explanation:
In complex, non-convex landscapes like those of neural networks, training involves two phases: exploration and exploitation. A high initial learning rate encourages exploration; the large, noisy steps (due to both the rate and the stochastic nature of the gradients) can help the optimizer jump out of sharp, suboptimal local minima or move past saddle points. As training progresses, the model is hopefully in a 'good' region of the loss landscape. At this point, the learning rate is decreased (annealed) to allow the optimizer to take smaller steps and carefully converge to the bottom of this wide, promising basin without overshooting it. A small fixed learning rate might get permanently stuck in the first poor local minimum it encounters.
Incorrect! Try again.
60You are performing A* search on a graph where all edge costs are integers greater than or equal to a positive constant c. The heuristic function h is consistent. What can you conclude about the values of f(n) = g(n) + h(n) for the sequence of nodes expanded by the algorithm?
Core search algorithms: A* search
Hard
A.The sequence of f-values will be random, depending on the graph topology.
B.The sequence of f-values is guaranteed to be equal to the true cost of the path to the goal.
C.The sequence of f-values for the expanded nodes is non-decreasing.
D.The sequence of f-values is strictly increasing.
Correct Answer: The sequence of f-values for the expanded nodes is non-decreasing.
Explanation:
This is a key property of A* when used with a consistent heuristic. A* always expands the node with the lowest f-value on the frontier. Let n be the node just expanded and n' any successor of n. The cost to get to n' through n is g(n') = g(n) + c(n, n'). Because the heuristic is consistent, h(n) ≤ c(n, n') + h(n'). Adding g(n) to both sides gives g(n) + h(n) ≤ g(n) + c(n, n') + h(n'), which simplifies to f(n) ≤ f(n'). This means that the f-value along any path is non-decreasing. Since A* always picks the node with the minimum f-value to expand, the sequence of f-values of the nodes it expands must also be non-decreasing. It is not strictly increasing because multiple nodes can have the same f-value.
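The non-decreasing property can be observed empirically. The small graph and heuristic below are invented for illustration, but the heuristic is consistent by construction (it never drops by more than the edge cost along any edge):

```python
import heapq

graph = {
    "S": [("A", 2), ("B", 5)],
    "A": [("B", 2), ("G", 6)],
    "B": [("G", 3)],
    "G": [],
}
h = {"S": 6, "A": 5, "B": 3, "G": 0}  # consistent: h(n) <= c(n,n') + h(n')

def astar_f_sequence(start, goal):
    # Records f(n) = g(n) + h(n) each time a node is expanded.
    frontier = [(h[start], 0, start)]
    closed = set()
    f_seq = []
    while frontier:
        f, g, node = heapq.heappop(frontier)
        if node in closed:
            continue
        closed.add(node)
        f_seq.append(f)
        if node == goal:
            return f_seq
        for nxt, c in graph[node]:
            if nxt not in closed:
                heapq.heappush(frontier, (g + c + h[nxt], g + c, nxt))
    return f_seq

seq = astar_f_sequence("S", "G")
print(seq)                                        # [6, 7, 7, 7]
print(all(a <= b for a, b in zip(seq, seq[1:])))  # True: f never decreases
```

Note the repeated 7s: the sequence is non-decreasing but not strictly increasing, exactly as the correct answer states.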