1. In association rule mining, what does the 'Support' metric measure?
Support
Easy
A.The ratio of independent occurrences of two itemsets.
B.The frequency or proportion of transactions that contain a specific itemset.
C.The probability that a rule is incorrect.
D.The error rate of the generated association rules.
Correct Answer: The frequency or proportion of transactions that contain a specific itemset.
Explanation:
Support indicates how frequently an itemset appears in the dataset. It is calculated as the number of transactions containing the itemset divided by the total number of transactions.
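The calculation described above can be sketched in a few lines. This is a minimal illustration, not a library API: the `support` helper and the toy baskets are hypothetical and invented only for this example.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

# Hypothetical toy baskets, invented for illustration only.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
]

support_bread = support({"bread"}, transactions)                    # 3 of 4 baskets
support_bread_butter = support({"bread", "butter"}, transactions)   # 2 of 4 baskets
```

Adding an item to an itemset can only keep the support the same or lower it, which is why the second value cannot exceed the first.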
2. How is the Confidence of an association rule $X \rightarrow Y$ mathematically defined?
Confidence
Easy
A.
B.
C.
D.
Correct Answer: $\text{Confidence}(X \rightarrow Y) = \frac{\text{Support}(X \cup Y)}{\text{Support}(X)}$
Explanation:
Confidence measures the likelihood of seeing item Y in a transaction given that it already contains item X. It is the ratio of the support of the combined itemset to the support of the antecedent (X).
3. What does a high confidence score for the rule {Bread} $\rightarrow$ {Butter} indicate?
Confidence
Easy
A.Bread and Butter are the most popular items in the store.
B.Customers who buy Bread are highly likely to also buy Butter.
C.Customers who buy Butter are highly likely to buy Bread.
D.Bread and Butter are always bought independently.
Correct Answer: Customers who buy Bread are highly likely to also buy Butter.
Explanation:
Confidence measures conditional probability. A high confidence for {Bread} $\rightarrow$ {Butter} means that when Bread is purchased, Butter is frequently purchased in the same transaction.
4. What does a Lift value greater than 1 indicate for an association rule $X \rightarrow Y$?
Lift
Easy
A.X and Y are completely independent of each other.
B.The rule is invalid and should be discarded.
C.A positive correlation, meaning X and Y appear together more often than expected by chance.
D.A negative correlation, meaning X and Y appear together less often than expected.
Correct Answer: A positive correlation, meaning X and Y appear together more often than expected by chance.
Explanation:
Lift measures how much more likely X and Y are to be bought together compared to if they were independent. A value > 1 signifies positive dependence or correlation.
5. Which metric evaluates the ratio of the observed support of $X \cup Y$ to the expected support if X and Y were independent?
Lift
Easy
A.Support
B.Confidence
C.Lift
D.Conviction
Correct Answer: Lift
Explanation:
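Lift is mathematically defined as $\text{Lift}(X \rightarrow Y) = \frac{\text{Support}(X \cup Y)}{\text{Support}(X) \times \text{Support}(Y)}$. The denominator represents the expected support under independence.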
6. What is the primary purpose of the 'Conviction' metric in association rule mining?
Conviction
Easy
A.To determine the speed of the Apriori algorithm.
B.To measure the ratio of the expected frequency that X occurs without Y (if independent) to the observed frequency of X occurring without Y.
C.To compress the dataset into a tree structure.
D.To measure the absolute frequency of an itemset in the database.
Correct Answer: To measure the ratio of the expected frequency that X occurs without Y (if independent) to the observed frequency of X occurring without Y.
Explanation:
Conviction compares the probability that X appears without Y if they were independent against the actual empirical frequency of X appearing without Y. It evaluates how much the consequent depends on the antecedent.
7. What is the core idea behind the Apriori principle?
Apriori Algorithm
Easy
A.Infrequent itemsets can have frequent supersets.
B.If an itemset is frequent, then all of its subsets must also be frequent.
C.Association rules can only contain two items.
D.If an itemset is frequent, then all of its supersets must also be frequent.
Correct Answer: If an itemset is frequent, then all of its subsets must also be frequent.
Explanation:
The Apriori principle states that any subset of a frequent itemset must also be frequent. This anti-monotone property is used to drastically reduce the search space.
8. What happens during the 'candidate generation' step of the Apriori algorithm?
Apriori Algorithm
Easy
A.Anomalies in the transaction records are removed.
B.The dataset is clustered into groups based on similarity.
C.Larger itemsets (candidates) are created by joining smaller frequent itemsets from the previous pass.
D.A tree data structure is built to avoid scanning the database.
Correct Answer: Larger itemsets (candidates) are created by joining smaller frequent itemsets from the previous pass.
Explanation:
Apriori builds itemsets iteratively. Candidate generation involves taking the frequent itemsets of size $k-1$ and joining them to form potential frequent itemsets of size $k$.
9. Which property allows the Apriori algorithm to prune the search space efficiently?
Apriori Algorithm
Easy
A.Triangle inequality
B.Curse of dimensionality
C.Anti-monotone property of support
D.Monotonicity of confidence
Correct Answer: Anti-monotone property of support
Explanation:
The anti-monotone property of support (the support of an itemset never exceeds the support of its subsets) allows Apriori to safely ignore any supersets of an infrequent itemset.
10. What is a major advantage of the FP-Growth algorithm over the Apriori algorithm?
FP-Growth algorithm
Easy
A.It generates association rules without calculating support.
B.It requires multiple scans of the database for every itemset size.
C.It does not require candidate generation.
D.It is only applicable to continuous numerical data.
Correct Answer: It does not require candidate generation.
Explanation:
FP-Growth overcomes the main bottleneck of Apriori by bypassing the costly candidate generation process. It uses a compact data structure to extract frequent patterns directly.
11. Which data structure is uniquely utilized by the FP-Growth algorithm to compress database transactions?
FP-Growth algorithm
Easy
A.B-Tree
B.FP-Tree (Frequent Pattern Tree)
C.KD-Tree
D.Binary Search Tree
Correct Answer: FP-Tree (Frequent Pattern Tree)
Explanation:
The FP-Growth algorithm compresses the transaction database into a highly condensed structure called an FP-Tree (Frequent Pattern Tree), which retains itemset association information.
12. What is the primary business goal of Market Basket Analysis?
Market Basket Analysis
Easy
A.To classify customers based on their age and income.
B.To predict the exact total price of a customer's basket.
C.To identify combinations of products that are frequently bought together.
D.To detect credit card fraud at checkout.
Correct Answer: To identify combinations of products that are frequently bought together.
Explanation:
Market Basket Analysis helps retailers understand customer purchasing behavior by discovering groups of items that are commonly bought together, which informs store layout and cross-selling strategies.
13. Which of the following scenarios is a classic application of Market Basket Analysis?
Market Basket Analysis
Easy
A.Predicting future housing prices based on square footage.
B.Placing diapers and beer near each other in a supermarket based on past purchasing data.
C.Classifying images of cats and dogs.
D.Filtering spam emails out of an inbox.
Correct Answer: Placing diapers and beer near each other in a supermarket based on past purchasing data.
Explanation:
The 'diapers and beer' anecdote is the most famous example of Market Basket Analysis, illustrating how retailers place co-occurring items nearby to boost sales.
14. What is the main difference between anomaly detection and novelty detection?
Anomaly Detection: Anomaly vs. novelty detection
Easy
A.Novelty detection finds frequent items, while anomaly detection finds association rules.
B.There is no difference; they are exactly the same concept in all contexts.
C.Novelty detection is supervised learning, while anomaly detection is reinforcement learning.
D.Novelty detection assumes the training data has no outliers, while anomaly detection assumes training data may contain outliers.
Correct Answer: Novelty detection assumes the training data has no outliers, while anomaly detection assumes training data may contain outliers.
Explanation:
In novelty detection, the model is typically trained on a 'clean' dataset representing normal behavior to detect new (novel) patterns. Anomaly detection algorithms are designed to handle training sets that are already polluted by outliers.
15. In the context of machine learning, how is an 'anomaly' defined?
Anomaly Detection: Anomaly vs. novelty detection
Easy
A.The most frequently occurring data point in a dataset.
B.A data point that differs significantly from the majority of other observations.
C.A missing value in a dataset.
D.The average or mean value of a particular feature.
Correct Answer: A data point that differs significantly from the majority of other observations.
Explanation:
An anomaly (or outlier) is an observation that deviates so much from other observations that it arouses suspicion that it was generated by a different mechanism.
16. How does the Isolation Forest algorithm identify anomalies?
Isolation Forest
Easy
A.Anomalies require fewer random splits (shorter path lengths) to be isolated in a tree.
B.It isolates data points based on their frequency in an FP-Tree.
C.It calculates the Euclidean distance to the nearest neighbor.
D.Anomalies require more random splits (longer path lengths) to be isolated.
Correct Answer: Anomalies require fewer random splits (shorter path lengths) to be isolated in a tree.
Explanation:
Because anomalies are few and different, they are easier to separate from the rest of the data. Thus, in a random tree structure, they end up closer to the root (shorter path length).
17. The Isolation Forest algorithm is fundamentally based on which type of machine learning structure?
Isolation Forest
Easy
A.Artificial Neural Networks
B.K-Means Clustering
C.Support Vector Machines
D.Decision Trees
Correct Answer: Decision Trees
Explanation:
Isolation Forest builds an ensemble of random Decision Trees (specifically, Isolation Trees) to partition the data space and isolate anomalies.
18. What core concept does the Local Outlier Factor (LOF) algorithm use to find anomalies?
Local Outlier Factor (LOF)
Easy
A.It counts the absolute number of points in a given radius globally.
B.It compares the local density of a data point to the local densities of its neighbors.
C.It calculates the length of paths in a forest of random trees.
D.It calculates the support and confidence of items in transactions.
Correct Answer: It compares the local density of a data point to the local densities of its neighbors.
Explanation:
LOF is a density-based algorithm. An anomaly is identified if a point has a much lower density than its surrounding neighbors, indicating it is relatively isolated.
19. If a data point has an LOF score significantly greater than 1, what does this typically imply?
Local Outlier Factor (LOF)
Easy
A.The point is the exact center of a cluster.
B.The point is perfectly normal (an inlier).
C.The point is part of a frequent itemset.
D.The point is likely an outlier or anomaly.
Correct Answer: The point is likely an outlier or anomaly.
Explanation:
An LOF score near 1 means the point has a similar density to its neighbors (normal). A score significantly greater than 1 means its density is lower than its neighbors, marking it as an outlier.
20. Why is unsupervised anomaly detection particularly useful in cybersecurity (e.g., intrusion detection)?
Applications in Cybersecurity and Fraud Detection
Easy
A.It automatically repairs all software vulnerabilities.
B.It can detect new, previously unknown attacks (zero-day threats) without needing labeled examples of those attacks.
C.It uses labeled data to perfectly classify known viruses.
D.It encrypts all incoming and outgoing network traffic.
Correct Answer: It can detect new, previously unknown attacks (zero-day threats) without needing labeled examples of those attacks.
Explanation:
Because unsupervised anomaly detection models normal behavior and flags anything highly unusual, it is capable of catching entirely new attack types (zero-day attacks) that do not yet have known signatures.
21. In a dataset of 200 transactions, itemset $X$ appears 60 times, itemset $Y$ appears 80 times, and both $X$ and $Y$ appear together in 30 transactions. What is the support of the rule $X \rightarrow Y$?
Support
Medium
A.0.50
B.0.15
C.0.30
D.0.375
Correct Answer: 0.15
Explanation:
Support is calculated as the proportion of total transactions that contain both items. $\text{Support}(X \rightarrow Y) = \frac{30}{200} = 0.15$.
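The same counts also determine the rule's confidence and lift. The sketch below uses exact fractions to avoid rounding; `rule_metrics` is a hypothetical helper, and treating the itemset that appears 60 times as the antecedent is an assumption made only for this illustration.

```python
from fractions import Fraction

def rule_metrics(n_total, n_x, n_y, n_xy):
    """Support, confidence and lift of X -> Y from raw transaction counts."""
    support_x = Fraction(n_x, n_total)
    support_y = Fraction(n_y, n_total)
    support_xy = Fraction(n_xy, n_total)
    confidence = support_xy / support_x            # P(Y | X)
    lift = support_xy / (support_x * support_y)    # observed vs. expected co-occurrence
    return support_xy, confidence, lift

# Counts from the question: 200 transactions, 60 with X, 80 with Y, 30 with both.
support_xy, confidence, lift = rule_metrics(200, 60, 80, 30)
```

Under that assumption the support is 3/20 (0.15), the confidence is 1/2, and the lift is 5/4, so the two itemsets co-occur somewhat more often than independence would predict.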
22. Suppose the support of itemset $X$ is 0.20, the support of itemset $Y$ is 0.40, and the support of $X \cup Y$ is 0.15. What is the confidence of the association rule $X \rightarrow Y$?
Confidence
Medium
A.0.60
B.0.15
C.0.375
D.0.75
Correct Answer: 0.75
Explanation:
Confidence is the probability of seeing $Y$ given $X$, computed as $\frac{\text{Support}(X \cup Y)}{\text{Support}(X)}$. Here, $\frac{0.15}{0.20} = 0.75$.
23. For the association rule $X \rightarrow Y$, the confidence is 0.8 and the support of $Y$ in the entire dataset is 0.4. What is the Lift of this rule, and what does it indicate?
Lift
Medium
A.Lift = 1.2; $X$ and $Y$ are independent.
B.Lift = 2.0; $X$ and $Y$ are negatively correlated.
C.Lift = 0.5; $X$ and $Y$ are negatively correlated.
D.Lift = 2.0; $X$ and $Y$ are positively correlated.
Correct Answer: Lift = 2.0; $X$ and $Y$ are positively correlated.
Explanation:
Lift = $\frac{\text{Confidence}(X \rightarrow Y)}{\text{Support}(Y)} = \frac{0.8}{0.4} = 2.0$. A Lift greater than 1 indicates a positive correlation between the occurrence of $X$ and $Y$.
24. A lift value of 0.6 for the rule $X \rightarrow Y$ implies what about the relationship between items $X$ and $Y$?
Lift
Medium
A.They are negatively correlated, suggesting they may act as substitutes.
B.$Y$ always appears whenever $X$ appears in a transaction.
C.They are completely independent of each other.
D.The presence of $X$ increases the likelihood of $Y$ occurring.
Correct Answer: They are negatively correlated, suggesting they may act as substitutes.
Explanation:
A Lift value less than 1 indicates a negative correlation; the occurrence of one item makes the occurrence of the other less likely. They may be substitute products.
25. The rule $X \rightarrow Y$ has a confidence of 0.75. If the overall probability of $Y$ occurring in the dataset is 0.5, what is the conviction of this rule?
Conviction
Medium
A.2.0
B.1.5
C.0.5
D.1.0
Correct Answer: 2.0
Explanation:
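Conviction is calculated as $\text{Conviction}(X \rightarrow Y) = \frac{1 - \text{Support}(Y)}{1 - \text{Confidence}(X \rightarrow Y)}$. Substituting the values gives $\frac{1 - 0.5}{1 - 0.75} = \frac{0.5}{0.25} = 2.0$.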
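The arithmetic can be checked with a small sketch. It assumes the usual definition of conviction, (1 - Support(Y)) / (1 - Confidence(X -> Y)); the `conviction` helper is hypothetical, and returning infinity for a rule with confidence 1 is a convention chosen here, not something the quiz states.

```python
import math

def conviction(support_y, confidence_xy):
    """Conviction of X -> Y = (1 - Support(Y)) / (1 - Confidence(X -> Y))."""
    if confidence_xy == 1.0:
        return math.inf  # the rule never fails, so conviction is unbounded
    return (1.0 - support_y) / (1.0 - confidence_xy)

quiz_value = conviction(0.5, 0.75)    # values from the question above
independent = conviction(0.5, 0.5)    # confidence equal to Support(Y)
```

A conviction of 1 corresponds to independence, and values above 1 mean the rule fails less often than it would if X and Y were unrelated.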
26. The Apriori algorithm relies heavily on the anti-monotone property of support. Which of the following best describes this property?
Apriori Algorithm
Medium
A.If an itemset is frequent, all its supersets must also be frequent.
B.If an itemset is infrequent, all its supersets must be infrequent.
C.Confidence is always strictly greater than or equal to support.
D.The support of an itemset is always equal to the sum of the supports of its subsets.
Correct Answer: If an itemset is infrequent, all its supersets must be infrequent.
Explanation:
The anti-monotone property (or downward closure property) states that adding items to an itemset can never increase its support. Thus, if a subset is infrequent, any superset containing it is also guaranteed to be infrequent.
27. During the candidate generation step of the Apriori algorithm, if a candidate 3-itemset is $\{A, B, C\}$, what condition must be met for it to survive the pruning step?
Apriori Algorithm
Medium
A.It must have a confidence greater than its support.
B.At least one of its 2-item subsets must be in the set of frequent 2-itemsets ($L_2$).
C.Only the subset $\{A, B\}$ needs to be present in $L_2$.
D.All of its 2-item subsets ($\{A, B\}$, $\{A, C\}$, $\{B, C\}$) must be present in the set of frequent 2-itemsets ($L_2$).
Correct Answer: All of its 2-item subsets ($\{A, B\}$, $\{A, C\}$, $\{B, C\}$) must be present in the set of frequent 2-itemsets ($L_2$).
Explanation:
To avoid calculating support for unnecessary itemsets, Apriori prunes any candidate $k$-itemset if any of its $(k-1)$-item subsets is not frequent.
28. How does the FP-Growth algorithm primarily improve upon the efficiency of the Apriori algorithm?
FP-Growth algorithm
Medium
A.It processes the database using an SQL JOIN instead of traditional looping.
B.It uses confidence thresholds instead of support to drastically reduce the search space.
C.It builds a compact tree structure and extracts frequent itemsets without explicit candidate generation.
D.It generates candidates using a join step without a computationally expensive pruning phase.
Correct Answer: It builds a compact tree structure and extracts frequent itemsets without explicit candidate generation.
Explanation:
FP-Growth compresses the dataset into an FP-Tree and recursively mines it, entirely avoiding the costly candidate generation and testing phase that slows down Apriori.
29. When constructing an FP-Tree, the items within each scanned transaction are sorted before insertion. What is the standard sorting order used?
FP-Growth algorithm
Medium
A.The order in which they physically appear in the original database.
B.Increasing order of their global support frequencies.
C.Decreasing order of their global support frequencies.
D.Alphabetical or lexicographical order.
Correct Answer: Decreasing order of their global support frequencies.
Explanation:
Sorting items in decreasing order of frequency ensures that the most frequent items are placed closer to the root of the tree, which maximizes path sharing and keeps the FP-Tree as compact as possible.
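The effect of prefix sharing can be seen by counting tree nodes. The following is a minimal sketch of FP-Tree construction only (no header table and no mining step); `FPNode`, `build_fp_tree`, and both toy datasets are hypothetical names and data invented for illustration.

```python
from collections import Counter

class FPNode:
    def __init__(self, item):
        self.item = item
        self.count = 0
        self.children = {}

def build_fp_tree(transactions, min_count=1):
    """Build an FP-tree; return (root, number of non-root nodes)."""
    freq = Counter(item for t in transactions for item in set(t))
    root, n_nodes = FPNode(None), 0
    for t in transactions:
        # Keep frequent items, most frequent first (ties broken by name).
        items = sorted((i for i in set(t) if freq[i] >= min_count),
                       key=lambda i: (-freq[i], i))
        node = root
        for item in items:
            if item not in node.children:
                node.children[item] = FPNode(item)
                n_nodes += 1
            node = node.children[item]
            node.count += 1
    return root, n_nodes

# Overlapping baskets share prefixes; disjoint baskets cannot.
shared = [["a", "b", "c"], ["a", "b"], ["a", "b", "c"], ["a", "c"]]
disjoint = [["a", "b"], ["c", "d"], ["e", "f"], ["g", "h"]]
_, shared_nodes = build_fp_tree(shared)
_, disjoint_nodes = build_fp_tree(disjoint)
```

The overlapping baskets hold ten item occurrences but need only four nodes, while the disjoint baskets need one node per item occurrence (eight), so the tree offers no compression in that case.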
30. In Market Basket Analysis, if the rule {Milk} $\rightarrow$ {Bread} has high support but a lift close to 1.0, what does this indicate to a retailer?
Market Basket Analysis
Medium
A.Customers almost never buy both items together in a single trip.
B.Milk and Bread are highly dependent on each other; a discount on one drives the other.
C.The co-occurrence of Milk and Bread is largely due to their high individual frequencies, not a strong specific association.
D.Buying Milk actively prevents the customer from buying Bread.
Correct Answer: The co-occurrence of Milk and Bread is largely due to their high individual frequencies, not a strong specific association.
Explanation:
A Lift of 1 implies that the occurrence of the two items is independent. High support merely means both items are very popular overall.
31. When designing promotional strategies using Market Basket Analysis, a high confidence for the rule {Flashlight} $\rightarrow$ {Batteries} suggests that:
Market Basket Analysis
Medium
A.Batteries are the most frequently purchased item in the store.
B.Discounting Batteries will definitively cause a massive spike in Flashlight sales.
C.Flashlights and Batteries should be placed far apart to encourage impulse buying.
D.Most transactions that contain a Flashlight also contain Batteries.
Correct Answer: Most transactions that contain a Flashlight also contain Batteries.
Explanation:
High confidence means that the conditional probability $P(\text{Batteries} \mid \text{Flashlight})$ is high. It tells us about the proportion of Flashlight buyers who also buy Batteries.
32. Which of the following best distinguishes novelty detection from standard anomaly detection?
Anomaly Detection: Anomaly vs. novelty detection
Medium
B.Novelty detection looks exclusively for single-point outliers, while anomaly detection looks exclusively for contextual outliers.
C.Anomaly detection only works with labeled data, while novelty detection is purely an unsupervised approach.
D.Novelty detection assumes the training data is clean and identifies new, unseen observations that differ from it, whereas anomaly detection expects outliers within the training data itself.
Correct Answer: Novelty detection assumes the training data is clean and identifies new, unseen observations that differ from it, whereas anomaly detection expects outliers within the training data itself.
Explanation:
Novelty detection is typically applied when the training set is unpolluted by outliers. Standard anomaly detection deals with data that already contains outliers and aims to fit the dense regions, ignoring the existing anomalies.
33. A machine learning system is trained entirely on normal, benign server network traffic. Once deployed, it monitors real-time traffic and flags any unseen pattern as a potential intrusion. This paradigm is best described as:
Anomaly Detection: Anomaly vs. novelty detection
Medium
A.Contextual anomaly detection
B.Novelty detection
C.Density-based clustering
D.Association rule mining
Correct Answer: Novelty detection
Explanation:
Training on purely 'normal' data to establish a baseline and then detecting deviations in new data is the defining characteristic of novelty detection.
34. In an Isolation Forest model, how are anomalies predominantly identified?
Isolation Forest
Medium
A.By calculating the highest local density of data points around them.
B.By measuring their minimal Euclidean distance to the nearest cluster centroid.
C.By having the shortest average path lengths from the root to the leaf in the isolation trees.
D.By having the longest path lengths from the root to the leaf in the isolation trees.
Correct Answer: By having the shortest average path lengths from the root to the leaf in the isolation trees.
Explanation:
Isolation Forest isolates observations by randomly selecting a feature and a split value. Anomalies are few and different, so they get isolated quickly, resulting in shorter path lengths in the trees.
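The random-split idea can be illustrated with a deliberately simplified sketch. It is not a full Isolation Forest: it works on a single feature, follows only the branch containing the query point, and uses no subsampling or height limit. All names and the toy data are hypothetical.

```python
import random

def path_length(point, data, rng, depth=0):
    """Depth at which `point` is isolated by random splits on 1-D `data`."""
    if len(data) <= 1:
        return depth
    lo, hi = min(data), max(data)
    if lo == hi:
        return depth
    split = rng.uniform(lo, hi)
    # Follow only the side of the split that contains `point`.
    if point < split:
        side = [v for v in data if v < split]
    else:
        side = [v for v in data if v >= split]
    return path_length(point, side, rng, depth + 1)

def mean_path_length(point, data, n_trees=200, seed=0):
    """Average isolation depth of `point` over `n_trees` random trees."""
    rng = random.Random(seed)
    return sum(path_length(point, data, rng) for _ in range(n_trees)) / n_trees

# Hypothetical 1-D data: a tight group (10..20) plus one far-away value.
data = list(range(10, 21)) + [100]
outlier_depth = mean_path_length(100, data)
inlier_depth = mean_path_length(15, data)
```

Most random split values fall in the wide gap between 20 and 100, so the value 100 is usually cut off at the first split, whereas 15 always needs at least two splits to be separated from its neighbours.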
35. Why is the Isolation Forest algorithm particularly efficient and well-suited for high-dimensional datasets compared to nearest-neighbor approaches?
Isolation Forest
Medium
A.It requires computing exact distance metrics for all pairs of points.
B.It relies on random feature selection and split values, completely avoiding expensive distance calculations.
C.It explicitly projects all data into a two-dimensional space before finding anomalies.
D.It uses Principal Component Analysis (PCA) internally to reduce dimensions before building the trees.
Correct Answer: It relies on random feature selection and split values, completely avoiding expensive distance calculations.
Explanation:
Because Isolation Forest constructs trees via random splits, it does not need to compute pairwise distances between data points, making it highly scalable with respect to both volume and dimensions.
36. In the Isolation Forest algorithm, the anomaly score is calculated using $s(x, n) = 2^{-\frac{E(h(x))}{c(n)}}$. If a point has an expected path length $E(h(x))$ exactly equal to the average path length $c(n)$, what does its anomaly score evaluate to, and what does it signify?
Isolation Forest
Medium
A.; the point is a mathematical outlier.
B.$s = 0.5$; the point does not have any distinct anomaly characteristics.
C.; the point is a strong inlier.
D.; the point is definitely an anomaly.
Correct Answer: $s = 0.5$; the point does not have any distinct anomaly characteristics.
Explanation:
If $E(h(x)) = c(n)$, then $s = 2^{-1} = 0.5$. A score of 0.5 means the observation's path length is essentially average, so it is considered a normal point, not an anomaly.
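The score can also be evaluated numerically. The sketch below assumes the standard Isolation Forest definitions, $s = 2^{-E(h(x))/c(n)}$ with $c(n) = 2H(n-1) - 2(n-1)/n$ and the harmonic number approximated as $H(i) \approx \ln(i) + 0.5772$; the function names and the sample size of 256 are illustrative choices.

```python
import math

EULER_GAMMA = 0.5772156649015329

def c(n):
    """Average path length of an unsuccessful BST search over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + EULER_GAMMA   # approximation of H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n

def anomaly_score(expected_path_length, n):
    """s = 2 ** (-E(h(x)) / c(n)); values near 1 suggest an anomaly."""
    return 2.0 ** (-expected_path_length / c(n))

n = 256
average = anomaly_score(c(n), n)      # E(h(x)) equal to c(n)
short_path = anomaly_score(1.0, n)    # isolated almost immediately
```

With an average path length the exponent is exactly -1, giving 0.5, while a point isolated after a single split scores above 0.9.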
37. The Local Outlier Factor (LOF) algorithm determines whether a data point is an outlier by primarily evaluating:
Local Outlier Factor (LOF)
Medium
A.The point's absolute Euclidean distance to the global mean of the dataset.
B.The sum of squared errors between the point and its assigned cluster centroid.
C.The point's local density compared to the local densities of its $k$-nearest neighbors.
D.The point's isolation depth within a set of randomly constructed binary trees.
Correct Answer: The point's local density compared to the local densities of its $k$-nearest neighbors.
Explanation:
LOF is a density-based method. It compares the local density of an observation to the local densities of its neighbors. A substantially lower density signifies a local outlier.
38. What is the typical interpretation when a data point yields an LOF score significantly greater than 1?
Local Outlier Factor (LOF)
Medium
A.The point represents the exact centroid of a well-defined cluster.
B.The point has a lower local density than its neighbors, strongly suggesting it is an outlier.
C.The point has an identical local density to its neighbors.
D.The point is located in a significantly denser region than its neighbors (a strong inlier).
Correct Answer: The point has a lower local density than its neighbors, strongly suggesting it is an outlier.
Explanation:
An LOF score around 1 indicates a density similar to that of the neighbors. An LOF score $> 1$ means the point lies in a less dense region than its neighbors, making it an outlier. An LOF score $< 1$ means it is in a denser region.
39. When utilizing unsupervised anomaly detection algorithms (like Isolation Forest) for credit card fraud detection, what is a primary operational challenge?
Applications in Cybersecurity and Fraud Detection
Medium
A.The algorithm typically memorizes fraudulent transactions since they dominate the dataset.
B.The models require thousands of explicitly labeled fraudulent examples to initialize the random trees.
C.Unsupervised models are strictly incapable of processing numerical transaction amounts.
D.Rare but completely legitimate customer behaviors might be flagged, leading to a high false-positive rate.
Correct Answer: Rare but completely legitimate customer behaviors might be flagged, leading to a high false-positive rate.
Explanation:
Because unsupervised anomaly detection flags any statistically unusual event, legitimate but uncommon transactions (like buying an expensive item while traveling) are often flagged, increasing false positives.
40. In detecting unauthorized access attempts on a corporate network, why might the Local Outlier Factor (LOF) method be preferred over a global distance-based statistical method?
Applications in Cybersecurity and Fraud Detection
Medium
A.LOF can identify localized anomalies (e.g., weird traffic relative only to the HR department), which global methods might gloss over as normal overall traffic.
B.Global methods process data too slowly to be used in real-time cybersecurity environments.
C.LOF utilizes an isolation tree structure that naturally maps directly to standard IP address subnets.
D.LOF requires strictly labeled training data, ensuring a higher precision in identifying specific known malware.
Correct Answer: LOF can identify localized anomalies (e.g., weird traffic relative only to the HR department), which global methods might gloss over as normal overall traffic.
Explanation:
A local anomaly might be perfectly normal in the global context of the network but highly unusual for a specific local group (like HR). LOF evaluates density relative to neighbors, successfully catching these local deviations.
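41. Given two itemsets $X$ and $Y$ with $\text{Support}(X) \le \text{Support}(Y) = 0.8$, what is the maximum theoretically possible value of $\text{Lift}(X \rightarrow Y)$, and what does it imply about the itemsets?
Lift
Hard
A.1.00; $X$ and $Y$ are mutually exclusive.
B.2.00; $X$ and $Y$ always occur together.
C.1.25; $X$ is a perfect subset of $Y$ in the transaction database.
D.1.25; $X$ and $Y$ are perfectly independent.
Correct Answer: 1.25; $X$ is a perfect subset of $Y$ in the transaction database.
Explanation:
Lift is defined as $\frac{\text{Support}(X \cup Y)}{\text{Support}(X) \times \text{Support}(Y)}$. The maximum possible value for $\text{Support}(X \cup Y)$ is $\min(\text{Support}(X), \text{Support}(Y)) = \text{Support}(X)$. Thus, maximum Lift = $\frac{\text{Support}(X)}{\text{Support}(X) \times \text{Support}(Y)} = \frac{1}{0.8} = 1.25$. This maximum occurs when all transactions containing $X$ also contain $Y$, meaning the transactions containing $X$ are a subset of those containing $Y$.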
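The upper bound on lift can be checked numerically. Because the joint support can never exceed the smaller of the two individual supports, the bound reduces to one over the larger support. The supports below are hypothetical values chosen so that the larger one is 4/5 (0.8); `max_lift` is an illustrative helper.

```python
from fractions import Fraction

def max_lift(support_x, support_y):
    """Upper bound on Lift(X -> Y): the joint support is at most the smaller support."""
    return min(support_x, support_y) / (support_x * support_y)

# Hypothetical supports; only the larger one (4/5) determines the bound.
bound_a = max_lift(Fraction(2, 5), Fraction(4, 5))
bound_b = max_lift(Fraction(1, 2), Fraction(4, 5))
```

Both calls give 5/4 (1.25), which shows that the smaller support cancels out of the bound.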
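42. Assume the confidence of the rule $X \rightarrow Y$ is $c_1$ and the confidence of the rule $X \rightarrow Z$ is $c_2$. Which of the following represents the tightest mathematically guaranteed lower bound for the confidence of the rule $X \rightarrow \{Y, Z\}$?
Confidence
Hard
A.
B.
C.
D.
Correct Answer: $\max(0, c_1 + c_2 - 1)$
Explanation:
Treat $Y$ and $Z$ as the events that a transaction contains $Y$ or $Z$, respectively. By probability axioms conditioned on $X$, $P(Y \cup Z \mid X) = P(Y \mid X) + P(Z \mid X) - P(Y \cap Z \mid X)$. Since $P(Y \cup Z \mid X) \le 1$, the lower bound is $P(Y \cap Z \mid X) \ge c_1 + c_2 - 1$. Because probabilities cannot be negative, the tightest bound is $\max(0, c_1 + c_2 - 1)$, known as the Fréchet inequality.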
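That this lower bound is attainable can be shown with a small constructed dataset. The baskets below are hypothetical and arranged so that the two consequents overlap as little as their counts allow; `confidence` is an illustrative helper using exact fractions.

```python
from fractions import Fraction

def confidence(antecedent, consequent, transactions):
    """Confidence of antecedent -> consequent as an exact fraction."""
    with_antecedent = [t for t in transactions if antecedent <= t]
    with_both = [t for t in with_antecedent if consequent <= t]
    return Fraction(len(with_both), len(with_antecedent))

# Hypothetical baskets: all 10 contain X; Y is in the first 8, Z in the last 7,
# so Y and Z overlap in as few baskets as the counts allow (5).
transactions = [{"X"} | ({"Y"} if i < 8 else set()) | ({"Z"} if i >= 3 else set())
                for i in range(10)]

c1 = confidence({"X"}, {"Y"}, transactions)
c2 = confidence({"X"}, {"Z"}, transactions)
c_joint = confidence({"X"}, {"Y", "Z"}, transactions)
lower_bound = max(Fraction(0), c1 + c2 - 1)
```

Here the individual confidences are 4/5 and 7/10, and the joint confidence equals the bound of 1/2 exactly, so no larger guaranteed lower bound is possible.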
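43. Consider two mutually exclusive itemsets $X$ and $Y$ in a dataset where $\text{Support}(X) > 0$ and $\text{Support}(Y) > 0$. What is the conviction of the rule $X \rightarrow Y$?
Conviction
Hard
A.
B.
C.
D.
Correct Answer: $1 - \text{Support}(Y)$
Explanation:
Conviction is defined as $\frac{1 - \text{Support}(Y)}{1 - \text{Confidence}(X \rightarrow Y)}$. If $X$ and $Y$ are mutually exclusive, they never occur together, so $\text{Confidence}(X \rightarrow Y) = 0$. Plugging this into the formula gives $\frac{1 - \text{Support}(Y)}{1 - 0} = 1 - \text{Support}(Y)$.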
44. During the candidate generation step ($L_2 \rightarrow C_3$) of the Apriori algorithm, let $L_2 = \{\{A,B\}, \{A,C\}, \{A,D\}, \{B,C\}, \{C,D\}\}$. Assuming lexicographic ordering, how many candidate 3-itemsets ($C_3$) are generated before the pruning step, and how many remain after the pruning step?
Apriori Algorithm
Hard
A.Generated: 3, Remaining: 2
B.Generated: 4, Remaining: 2
C.Generated: 4, Remaining: 1
D.Generated: 3, Remaining: 1
Correct Answer: Generated: 3, Remaining: 2
Explanation:
To generate $C_3$, we join pairs in $L_2$ that share the first item. Matches: {A,B} & {A,C} $\rightarrow$ {A,B,C}; {A,B} & {A,D} $\rightarrow$ {A,B,D}; {A,C} & {A,D} $\rightarrow$ {A,C,D}. So 3 candidates are generated. Pruning checks whether all subsets of size 2 exist in $L_2$. For {A,B,C}: {A,B}, {A,C}, and {B,C} all exist (keep). For {A,B,D}: {B,D} is missing (prune). For {A,C,D}: {A,C}, {A,D}, and {C,D} all exist (keep). So 2 candidates remain.
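The join and prune steps for this set of frequent 2-itemsets ({A,B}, {A,C}, {A,D}, {B,C}, {C,D}) can be checked mechanically. The following is a minimal sketch of Apriori-style candidate generation; `apriori_gen` is a hypothetical helper name, and the code assumes lexicographically sorted itemsets.

```python
from itertools import combinations

def apriori_gen(frequent_prev):
    """Return (joined, pruned) candidates of size k from frequent (k-1)-itemsets."""
    prev = sorted(tuple(sorted(s)) for s in frequent_prev)
    prev_set = set(prev)
    k_minus_1 = len(prev[0])
    joined = []
    for a, b in combinations(prev, 2):
        # Join step: merge two itemsets that agree on all but their last item.
        if a[:-1] == b[:-1]:
            joined.append(a + (b[-1],))
    # Prune step: keep a candidate only if every (k-1)-subset is frequent.
    pruned = [cand for cand in joined
              if all(sub in prev_set for sub in combinations(cand, k_minus_1))]
    return joined, pruned

L2 = [{"A", "B"}, {"A", "C"}, {"A", "D"}, {"B", "C"}, {"C", "D"}]
joined, pruned = apriori_gen(L2)
```

The join produces {A,B,C}, {A,B,D}, and {A,C,D}; the prune step drops {A,B,D} because {B,D} is not frequent, leaving two candidates.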
45. Which of the following scenarios describes the worst-case space complexity for the FP-Tree constructed by the FP-Growth algorithm?
B.Every transaction contains exactly the same items.
C.Every transaction contains entirely distinct items, sharing no common prefixes.
D.The dataset follows a strict power-law distribution for item frequencies.
Correct Answer: Every transaction contains entirely distinct items, sharing no common prefixes.
Explanation:
The FP-Tree achieves compression by sharing common prefixes among transactions. In the worst-case scenario where transactions share absolutely no items (or no common prefixes after sorting by frequency), the FP-Tree cannot compress the data and branches out maximally, effectively resulting in a space complexity equivalent to the original uncompressed database.
46. In the Local Outlier Factor (LOF) algorithm, what happens to the LOF score of a point if it is placed perfectly inside a uniform, highly dense cluster where the distance between all adjacent points is a microscopic constant $\epsilon$?
Local Outlier Factor (LOF)
Hard
A.The LOF score approaches 0, identifying it as an extreme inlier.
B.The LOF score approaches $\infty$ due to division by zero in reachability distance.
C.The LOF score approaches 1, as its local reachability density matches its neighbors' densities.
D.The LOF score oscillates unpredictably depending on the value of $\epsilon$.
Correct Answer: The LOF score approaches 1, as its local reachability density matches its neighbors' densities.
Explanation:
LOF is the ratio of the average local reachability density (lrd) of a point's $k$-nearest neighbors to its own lrd. In a perfectly uniform cluster (regardless of how dense it is), every point has the exact same lrd. Therefore, the ratio (LOF score) evaluates to 1, accurately reflecting that the point is a standard inlier relative to its local neighborhood.
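The ratio can be computed directly on a small example. This is a simplified sketch rather than a reference implementation: it uses 1-D points, takes exactly k neighbours (ties are not expanded), and assumes all points are distinct. The `lof_scores` function and the data are hypothetical.

```python
def lof_scores(points, k):
    """Local Outlier Factor for 1-D points, using exactly k nearest neighbours."""
    n = len(points)
    dist = [[abs(p - q) for q in points] for p in points]
    # Indices of the k nearest neighbours of each point (excluding itself).
    neighbours = [sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
                  for i in range(n)]
    k_dist = [dist[i][neighbours[i][-1]] for i in range(n)]

    def lrd(i):
        # reach-dist(i, j) = max(k-distance(j), d(i, j))
        reach = [max(k_dist[j], dist[i][j]) for j in neighbours[i]]
        return len(reach) / sum(reach)

    lrds = [lrd(i) for i in range(n)]
    return [sum(lrds[j] for j in neighbours[i]) / (k * lrds[i]) for i in range(n)]

# Hypothetical data: an evenly spaced cluster 0..9 plus one distant point.
points = list(range(10)) + [30]
scores = lof_scores(points, k=2)
interior_score = scores[5]
outlier_score = scores[-1]
```

The interior point at 5 has the same local reachability density as its neighbours and scores 1, while the distant point at 30 scores well above 10. Rescaling all coordinates by a constant spacing leaves the scores unchanged, since LOF is a ratio of densities.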
47. The anomaly score in an Isolation Forest is given by $s(x, n) = 2^{-\frac{E(h(x))}{c(n)}}$. If a specific data point $x$ evaluates to a score of $s = 0.5$ across a large number of trees, what is the correct interpretation regarding $x$?
Isolation Forest
Hard
A.Point $x$ is highly anomalous, as its expected path length is half the maximum possible depth.
B.Point $x$ is isolated perfectly at the root node in half of the trees.
C.Point $x$ is a novelty introduced during inference, completely unrepresented in the training space.
D.Point $x$ is definitively a normal instance, as its expected path length matches the average path length of an unsuccessful search in a BST.
Correct Answer: Point $x$ is definitively a normal instance, as its expected path length matches the average path length of an unsuccessful search in a BST.
Explanation:
If $s = 0.5$, then $E(h(x)) = c(n)$. Here, $c(n)$ is the average path length of an unsuccessful search in a Binary Search Tree (BST) built from $n$ instances. This indicates that the point is separated at an average depth, indistinguishable from the bulk of normal data, meaning it is a typical normal instance.
48A data scientist is modeling network traffic to flag intrusions. The training data is known to be 'polluted' with a small fraction of unidentified malicious packets. Which algorithmic approach and paradigm is mathematically most robust for this specific training condition?
Anomaly vs. novelty detection
Hard
A.Novelty Detection using a standard One-Class SVM with a hard margin.
B.Supervised Binary Classification using highly imbalanced Random Forests.
C.Novelty Detection using an Isolation Forest trained on clean baseline data.
D.Anomaly Detection using an Isolation Forest or a soft-margin One-Class SVM (e.g., using a $\nu$ hyperparameter).
Correct Answer: Anomaly Detection using an Isolation Forest or a soft-margin One-Class SVM (e.g., using a $\nu$ hyperparameter).
Explanation:
The dataset is 'polluted', meaning the anomalies are already mixed into the training set. This requires Anomaly Detection (which seeks outliers in polluted data) rather than strict Novelty Detection (which assumes a pristine, clean training set to model 'normal' behavior). Isolation Forests and soft-margin One-Class SVMs (which allow for a contamination parameter $\nu$) are explicitly designed to handle polluted training data.
49Fraudsters often attempt an 'evasion attack' against LOF-based fraud detection systems by creating dense 'Sybil' clusters—a tightly knit group of fraudulent accounts that exhibit identical, highly correlated transaction behaviors. How does this attack mathematically exploit the Local Outlier Factor (LOF)?
Applications in Cybersecurity and Fraud Detection
Hard
A.The attack creates an infinite loop in the LOF neighborhood search, causing a denial-of-service in the fraud detection system.
B.By forming a dense cluster, their $k$-distances shrink to near zero, making their local reachability density equal to that of their neighbors, yielding an LOF $\approx 1$.
C.The attack artificially inflates the LOF score of the honest users, causing them to be banned instead.
D.By increasing their collective distance from the origin, they force LOF to normalize their scores to 0.
Correct Answer: By forming a dense cluster, their $k$-distances shrink to near zero, making their local reachability density equal to that of their neighbors, yielding an LOF $\approx 1$.
Explanation:
LOF detects anomalies based on local density relative to neighbors. If fraudsters create a dense, isolated cluster (a 'Sybil' cluster) with a population greater than $k$, each fraudulent account will only have other fraudulent accounts as its $k$-nearest neighbors. Since they are closely packed, their local densities will be very high and mutually similar, resulting in an LOF score around 1, thereby evading detection.
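The evasion effect can be reproduced on a toy dataset. The following is a minimal brute-force LOF sketch, not a production implementation: the coordinates, the value $k = 3$, and the helper name `lof_scores` are all hypothetical, and ties beyond the $k$-th neighbor are ignored for simplicity.

```python
import math


def lof_scores(points, k):
    """Local Outlier Factor for every point (brute force, Euclidean distance)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    neighbors, k_distance = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbors.append(order[:k])               # the k nearest neighbors of point i
        k_distance.append(dist[i][order[k - 1]])  # distance to the k-th neighbor
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))       # inverse of the mean reachability distance
    return [sum(lrd[j] for j in neighbors[i]) / (k * lrd[i]) for i in range(n)]


# Hypothetical accounts embedded in a 2-D behavior space.
honest = [(float(x), float(y)) for x in range(6) for y in range(6)]
sybil = [(20 + 0.01 * x, 20 + 0.01 * y) for x in range(3) for y in range(3)]
lone_fraudster = [(10.0, 10.0)]

k = 3  # smaller than the Sybil population of 9
scores = lof_scores(honest + sybil + lone_fraudster, k)
sybil_scores = scores[len(honest):len(honest) + len(sybil)]
lone_score = scores[-1]

print("max Sybil LOF:", round(max(sybil_scores), 2))
print("lone fraudster LOF:", round(lone_score, 2))
```

In this toy setup the nine Sybil accounts only see each other as neighbors, so their scores stay close to 1, while the single isolated fraudster receives a much larger score.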
50In Market Basket Analysis, a retailer wants to optimize cross-selling campaigns by choosing between two rules: Rule 1 (Lift = 3.0, Support = 2%) and Rule 2 (Lift = 1.5, Support = 10%). If the sole objective is to maximize the absolute number of additional expected transactions driven by the rule (above random chance), which rule should be selected and why?
Market Basket Analysis
Hard
A.Rule 2, because the product of Lift and Support is higher.
B.Rule 2, because it has a higher support, meaning it covers more total transactions regardless of Lift.
C.Rule 1, because a higher Lift guarantees a higher conditional probability of purchase.
D.Rule 2, because its Leverage, $\text{Support} \times (1 - 1/\text{Lift})$, is higher.
Correct Answer: Rule 2, because its Leverage, $\text{Support} \times (1 - 1/\text{Lift})$, is higher.
Explanation:
The absolute number of additional expected transactions is measured by Leverage, $P(X \cap Y) - P(X)P(Y)$, not by Support or Lift alone. Since Support is $P(X \cap Y)$ and Lift is $\frac{P(X \cap Y)}{P(X)P(Y)}$, the independence term is $P(X)P(Y) = \text{Support}/\text{Lift}$, so $\text{Leverage} = \text{Support} \times (1 - 1/\text{Lift})$. Rule 1 gives $0.02 \times (1 - 1/3) \approx 0.013$ and Rule 2 gives $0.10 \times (1 - 1/1.5) \approx 0.033$. Rule 2 therefore adds about 3.3 co-purchases per 100 transactions above chance, versus about 1.3 for Rule 1.
51A dataset contains 1,000,000 transactions, out of which 990,000 are 'null transactions' that contain neither item A nor item B. If we remove all 990,000 null transactions to create a filtered dataset, which of the following metrics for the rule $A \rightarrow B$ remains perfectly invariant?
Support
Hard
A.Leverage of the rule
B.Support of the rule
C.Lift of the rule
D.Confidence of the rule
Correct Answer: Confidence of the rule
Explanation:
Confidence is defined as $\frac{\text{Support}(A \cup B)}{\text{Support}(A)} = \frac{\text{Count}(A \cup B)/N}{\text{Count}(A)/N}$. Removing null transactions (which contain neither A nor B) decreases the total number of transactions ($N$), which scales both the numerator and the denominator by exactly the same factor, so it cancels out. Metrics like Support ($\frac{\text{Count}(A \cup B)}{N}$) and Lift are sensitive to changes in $N$ and will change.
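The invariance can be verified with exact arithmetic. In the sketch below only the totals (1,000,000 transactions, 990,000 of them null) come from the question; the counts for A, B, and both together are hypothetical values chosen so that they add up to the 10,000 non-null transactions.

```python
from fractions import Fraction


def rule_metrics(n_total, n_a, n_b, n_ab):
    """Support, confidence and lift of A -> B from raw transaction counts."""
    support = Fraction(n_ab, n_total)
    confidence = Fraction(n_ab, n_a)
    lift = support / (Fraction(n_a, n_total) * Fraction(n_b, n_total))
    return support, confidence, lift


# Hypothetical counts: 5,000 A-only + 4,000 B-only + 1,000 with both = 10,000.
n_a, n_b, n_ab = 6_000, 5_000, 1_000
full = rule_metrics(1_000_000, n_a, n_b, n_ab)   # null transactions included
filtered = rule_metrics(10_000, n_a, n_b, n_ab)  # null transactions removed

for name, before, after in zip(("support", "confidence", "lift"), full, filtered):
    print(f"{name}: {float(before):.4f} -> {float(after):.4f}")
```

With these counts the confidence is unchanged, the support grows a hundredfold, and the lift drops from well above 1 to below 1, which shows how strongly null transactions can distort lift.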
52An Isolation Forest algorithm struggles to isolate a specific point anomaly in a dataset with 50 dimensions. Investigation reveals the anomaly is easily linearly separable but hidden within a complex linear combination of 10 heavily correlated features. Why does the standard Isolation Forest fail here?
Isolation Forest
Hard
A.The expected path length diverges to infinity when features are heavily correlated.
B.The sub-sampling parameter is too low to capture 50 dimensions simultaneously.
C.Isolation Forests calculate Euclidean distances, which suffer from the curse of dimensionality.
D.Isolation Forest relies on strict axis-parallel splits, making it blind to anomalies defined solely by non-axis-aligned linear combinations.
Correct Answer: Isolation Forest relies on strict axis-parallel splits, making it blind to anomalies defined solely by non-axis-aligned linear combinations.
Explanation:
Standard Isolation Forests construct trees by randomly selecting a feature and a split value. This results in axis-parallel hyperplanes. If an anomaly is only separable via a diagonal or oblique combination of correlated features, axis-parallel splits will struggle to efficiently isolate it, often requiring deep, complex splits that mimic an inlier's path length.
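A two-feature toy case makes the limitation concrete. This sketch deliberately shrinks the scenario from ten correlated features to two, and the helper `isolated_by_one_cut` and all coordinates are hypothetical. It only checks whether a single threshold on one feature can put the anomaly alone on one side.

```python
def isolated_by_one_cut(inlier_values, anomaly_value):
    """True if one threshold on this feature leaves the anomaly alone on one side."""
    return anomaly_value < min(inlier_values) or anomaly_value > max(inlier_values)


inliers = [(i / 10, i / 10) for i in range(11)]  # perfectly correlated: y = x
anomaly = (0.7, 0.3)                             # off the diagonal, inside the bounding box

cut_on_x = isolated_by_one_cut([x for x, _ in inliers], anomaly[0])
cut_on_y = isolated_by_one_cut([y for _, y in inliers], anomaly[1])
cut_on_diff = isolated_by_one_cut([x - y for x, y in inliers], anomaly[0] - anomaly[1])

print(cut_on_x, cut_on_y, cut_on_diff)  # False False True
```

Neither original axis offers a one-cut isolation because the anomaly's coordinates look ordinary on each feature alone. The oblique feature $x - y$ separates it immediately. An axis-parallel tree can still isolate the point eventually, but only with several successive cuts, which lengthens its path.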
53The Apriori property (anti-monotonicity) states that if an itemset is frequent, all its subsets must be frequent. Suppose a researcher modifies the definition of a transaction such that it includes 'negative items' (e.g., a transaction contains $\neg A$ if item A was deliberately not bought). Does the Apriori property still mathematically hold for mining itemsets containing both positive and negative items?
Apriori Algorithm
Hard
A.Yes, but only if the minimum support threshold is set higher than $0.5$ to account for the high frequency of negative items.
B.No, the introduction of negative items breaks the subset probability axiom, making anti-monotonicity fail.
C.No, because $P(\neg A)$ is inversely proportional to $P(A)$, causing the support counts to oscillate.
D.Yes, because a negative item can simply be treated as a unique, distinct item, and the subset probability bounds still fundamentally apply.
Correct Answer: Yes, because a negative item can simply be treated as a unique, distinct item, and the subset probability bounds still fundamentally apply.
Explanation:
The Apriori property relies strictly on the fact that the probability of an intersection of events is never greater than the probability of the intersection of any subset of those events. Whether an event represents 'purchasing an item' or 'not purchasing an item', treating $\neg A$ as just another item preserves this subset frequency relationship. Thus, anti-monotonicity is preserved.
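A brute-force check illustrates why nothing breaks. The sketch below uses a hypothetical three-item catalog and five made-up baskets, encodes each missing item as an explicit `not_<item>` token, and counts every case where an itemset would be more frequent than one of its subsets.

```python
from itertools import combinations

catalog = ["A", "B", "C"]
baskets = [{"A", "B"}, {"A"}, {"B", "C"}, {"A", "B", "C"}, set()]

# Add an explicit negative item for every catalog item that was not bought.
augmented = [b | {f"not_{item}" for item in catalog if item not in b} for b in baskets]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


all_items = sorted(set().union(*augmented))
violations = 0
for size in range(2, 4):
    for itemset in combinations(all_items, size):
        superset_support = support(set(itemset), augmented)
        for subset in combinations(itemset, size - 1):
            if superset_support > support(set(subset), augmented):
                violations += 1

print("anti-monotonicity violations:", violations)
print("support of {A, not_A}:", support({"A", "not_A"}, augmented))
```

No violation is found, and contradictory itemsets such as `{A, not_A}` simply have zero support, so ordinary pruning removes them.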
54During the recursive mining of the FP-Tree, the algorithm constructs 'Conditional Pattern Bases'. For a heavily skewed dataset where a single frequent item appears in 95% of all transactions, where will that item typically be located in the conditional pattern bases of other items?
FP-Growth algorithm
Hard
A.It will not appear in any conditional pattern bases, as its frequency causes it to be pruned early.
B.It will frequently appear as a single-node path near the root of the conditional trees, leading to highly efficient compression.
C.It will force the algorithm to fall back to an Apriori-like candidate generation to handle the skewness.
D.It will appear frequently at the leaves of the conditional trees, causing high memory overhead.
Correct Answer: It will frequently appear as a single-node path near the root of the conditional trees, leading to highly efficient compression.
Explanation:
FP-Growth orders items by decreasing support frequency. An item appearing in 95% of transactions will be placed at or very near the root of the global FP-Tree. Consequently, when constructing conditional pattern bases for less frequent items (which trace paths from the leaves up to the root), that item will consistently appear near the top of these paths, enabling massive prefix sharing and compression.
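The prefix paths can be listed without building the full tree. The sketch below is a simplification under stated assumptions: the baskets are hypothetical (with `milk` in 19 of 20 transactions), and instead of an actual FP-Tree it sorts each transaction by descending support and collects the items that precede the target, which are the same prefix paths the tree would store.

```python
from collections import Counter

# Hypothetical baskets: "milk" appears in 19 of 20 transactions (95%).
transactions = (
    [["milk", "eggs"]] * 8
    + [["milk", "tea"]] * 6
    + [["milk", "eggs", "tea"]] * 5
    + [["tea"]]
)

counts = Counter(item for t in transactions for item in t)
# Rank 0 is the most frequent item; ties are broken alphabetically.
rank = {item: i for i, item in enumerate(sorted(counts, key=lambda it: (-counts[it], it)))}


def conditional_pattern_base(target):
    """Prefix paths preceding `target` after sorting by descending support."""
    base = Counter()
    for t in transactions:
        ordered = sorted(t, key=rank.__getitem__)
        if target in ordered:
            prefix = tuple(ordered[:ordered.index(target)])
            if prefix:
                base[prefix] += 1
    return base


for item in ("eggs", "tea"):
    print(item, dict(conditional_pattern_base(item)))
```

Every prefix path starts with the dominant item, so the conditional trees share a single node for it directly under the root.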
55A fraud detection system evaluates credit card transactions. Let a rule be $X \rightarrow \text{Fraud}$. Given the extreme class imbalance (Fraud is $0.01\%$ of all transactions), which of the following statements about rule evaluation metrics is mathematically true?
Applications in Cybersecurity and Fraud Detection
Hard
A.Conviction will approach zero, rendering it useless for asymmetric rules.
B.Leverage will be mathematically maximized because $P(\text{Fraud})$ is near zero.
C.Lift will likely be extremely high even if the rule generates many false positives, suffering from the Base Rate Fallacy.
D.Confidence is an unbiased estimator for fraud likelihood, unaffected by class imbalance.
Correct Answer: Lift will likely be extremely high even if the rule generates many false positives, suffering from the Base Rate Fallacy.
Explanation:
Lift evaluates $\frac{P(\text{Fraud} \mid X)}{P(\text{Fraud})}$. Because $P(\text{Fraud})$ (the base rate) is exceptionally tiny ($0.0001$), even a marginal increase in the conditional probability due to the antecedent $X$ will result in a massive Lift score (e.g., if confidence is just $2\%$, Lift is $200$). This makes Lift heavily susceptible to the Base Rate Fallacy in highly imbalanced datasets, potentially misleading analysts despite high false positive rates.
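The arithmetic behind the example is short enough to check exactly. This sketch uses the base rate of $0.0001$ and a confidence of $0.02$ (the value implied by a Lift of $200$ at that base rate); the antecedent name $X$ and the variable names are labels assumed for illustration.

```python
from fractions import Fraction

base_rate = Fraction(1, 10_000)  # P(Fraud) = 0.0001
confidence = Fraction(2, 100)    # P(Fraud | X) = 0.02

lift = confidence / base_rate
false_positive_share = 1 - confidence  # flagged transactions that are not fraud

print("lift:", lift)
print("share of flagged transactions that are legitimate:", float(false_positive_share))
```

A Lift of 200 sits alongside a rule for which 98% of the flagged transactions are legitimate, which is the Base Rate Fallacy in numbers.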
56Let $X$ and $Y$ be itemsets. If $\text{Support}(X \cup Y) = \text{Support}(X)$, which of the following logical equivalences is definitively true regarding the transaction database?
Support
Hard
A.Every transaction that contains $Y$ must also contain $X$.
B.$X$ and $Y$ are identical itemsets.
C.Every transaction that contains $X$ must also contain $Y$.
D.The dataset contains no null transactions.
Correct Answer: Every transaction that contains $X$ must also contain $Y$.
Explanation:
$\text{Support}(X \cup Y)$ measures the transactions containing both $X$ and $Y$. If $\text{Support}(X \cup Y) = \text{Support}(X)$, the set of transactions containing both $X$ and $Y$ is precisely the set of transactions containing $X$. Mathematically, the set of transactions with $X$ is a subset of the transactions with $Y$, meaning every transaction containing $X$ inevitably contains $Y$ (equivalently, $\text{Confidence}(X \rightarrow Y) = 1$).
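The equivalence can be checked with plain sets of transaction IDs. The tiny database and the names `X`, `Y`, and `transactions_containing` below are hypothetical and serve only to illustrate the subset argument.

```python
def transactions_containing(itemset, db):
    """IDs of the transactions that include every item in `itemset`."""
    return {tid for tid, basket in db.items() if itemset <= basket}


db = {
    1: {"x", "y", "z"},
    2: {"x", "y"},
    3: {"y"},
    4: {"z"},
}
X, Y = {"x"}, {"y"}
t_x = transactions_containing(X, db)
t_y = transactions_containing(Y, db)
t_xy = transactions_containing(X | Y, db)

print(len(t_xy) == len(t_x))  # True: the two supports are equal
print(t_x <= t_y)             # True: every transaction with X also has Y
print(t_y <= t_x)             # False: transaction 3 has Y without X
```

The last line shows why the converse statement is not implied: equal supports constrain only the transactions that contain the first itemset.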
57In the context of the exact mathematical objective functions, how does a One-Class SVM (used for Novelty Detection) fundamentally differ from standard PCA-based reconstruction error (used for Anomaly Detection)?
Anomaly vs. novelty detection
Hard
A.One-Class SVM maximizes the margin separating data from the origin, whereas PCA minimizes orthogonal projection loss.
B.One-Class SVM minimizes a volume enclosing the training data origin, whereas PCA minimizes orthogonal projection loss.
C.One-Class SVM relies on Gaussian distributions, whereas PCA is non-parametric.
D.There is no difference; both project data onto a lower-dimensional hyperplane optimized for variance.
Correct Answer: One-Class SVM maximizes the margin separating data from the origin, whereas PCA minimizes orthogonal projection loss.
Explanation:
A One-Class SVM maps data into a higher-dimensional feature space (via the kernel trick) and attempts to separate the data from the origin with the maximum possible margin, effectively defining a bounding frontier. Conversely, PCA models normal behavior by finding axes of maximum variance and flagging anomalies based on high orthogonal reconstruction error when projecting back.
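For reference, the two objectives can be written side by side. This is a sketch assuming the common $\nu$-formulation of the One-Class SVM and PCA on centered data with $q$ retained components; the symbols $w$, $\rho$, $\xi_i$, $\Phi$, $W$, and $q$ are notation introduced here rather than taken from the question.

$$
\text{One-Class SVM:}\quad \min_{w,\,\xi,\,\rho}\ \frac{1}{2}\lVert w \rVert^2 + \frac{1}{\nu n}\sum_{i=1}^{n}\xi_i - \rho
\quad \text{s.t.}\quad \langle w, \Phi(x_i) \rangle \ge \rho - \xi_i,\ \ \xi_i \ge 0
$$

$$
\text{PCA:}\quad \min_{W \in \mathbb{R}^{d \times q},\ W^\top W = I}\ \sum_{i=1}^{n} \lVert x_i - W W^\top x_i \rVert^2
$$

In the first problem the separating hyperplane lies at distance $\rho / \lVert w \rVert$ from the origin in feature space, so the objective pushes that margin outward. In the second, the per-point term $\lVert x - W W^\top x \rVert^2$ is the reconstruction error used as the anomaly score.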
58Consider the edge case in calculating the Local Outlier Factor (LOF) where a dataset contains identical duplicate instances. If there are $m$ perfectly identical points and the neighborhood parameter $k$ is set such that $k < m$, what critical computational failure occurs natively in the standard LOF definition?
Local Outlier Factor (LOF)
Hard
A.The Reachability Distance reduces to the Manhattan distance, ignoring local density structures.
B.The points form an infinite loop during the nearest-neighbor search, halting the algorithm.
C.The algorithm identifies all points as outliers because their LOF score becomes infinite.
D.The $k$-distance of these identical points evaluates to zero, causing a division by zero when calculating the Local Reachability Density (lrd).
Correct Answer: The $k$-distance of these identical points evaluates to zero, causing a division by zero when calculating the Local Reachability Density (lrd).
Explanation:
If there are more than $k$ identical points, the $k$-th nearest neighbor of any of these points is at a distance of 0. Thus, the $k$-distance is 0, and the reachability distance to all neighbors is 0. Because Local Reachability Density (lrd) is defined as the inverse of the average reachability distance, it forces a division by zero. Robust implementations handle this by adding a small $\epsilon$ or deduplicating data.
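The failure and the epsilon workaround can both be seen in a few lines. This is a minimal sketch, assuming $k = 3$ and four hypothetical duplicate points; the function name and the `eps` parameter are illustrative and do not correspond to any particular library's API.

```python
import math


def local_reachability_density(reach_distances, eps=0.0):
    """Inverse of the mean reachability distance, with an optional epsilon guard."""
    mean_reach = sum(reach_distances) / len(reach_distances)
    return 1.0 / (mean_reach + eps)


k = 3
duplicates = [(2.0, 5.0)] * (k + 1)  # k + 1 identical points, so k is below their count

p = duplicates[0]
neighbor_dists = sorted(math.dist(p, q) for q in duplicates[1:])[:k]
k_distance = neighbor_dists[-1]  # identical for every duplicate
reach = [max(k_distance, d) for d in neighbor_dists]

failed = False
try:
    local_reachability_density(reach)
except ZeroDivisionError as err:
    failed = True
    print("unguarded lrd fails:", err)

guarded = local_reachability_density(reach, eps=1e-10)
print("guarded lrd:", guarded)
```

The guarded value is finite but very large, which reflects that exact duplicates form an extremely dense neighborhood rather than an undefined one.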
59Which of the following conditions proves that $\text{Lift}(X \rightarrow Y)$ is mathematically symmetric (i.e., $\text{Lift}(X \rightarrow Y) = \text{Lift}(Y \rightarrow X)$)?
Lift
Hard
A.It is only true when .
B.It is only true when .
C.It is only true when and are statistically independent.
D.It is always true by the definition of the formula.
Correct Answer: It is always true by the definition of the formula.
Explanation:
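By definition, Lift is calculated as $\frac{P(X \cap Y)}{P(X)P(Y)}$. Since intersection is commutative ($P(X \cap Y) = P(Y \cap X)$) and scalar multiplication is commutative ($P(X)P(Y) = P(Y)P(X)$), the resulting formula is inherently symmetric for any itemsets $X$ and $Y$. Thus, $\text{Lift}(X \rightarrow Y) = \text{Lift}(Y \rightarrow X)$ is always mathematically true.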
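60A supermarket analyzes baskets containing {Bread, Butter, Jam}. They observe the rule $\{\text{Bread}\} \rightarrow \{\text{Jam}\}$ has $\text{Lift} = 1.5$. However, the rule $\{\text{Bread}, \text{Butter}\} \rightarrow \{\text{Jam}\}$ has a lower Confidence and $\text{Lift} = 1.2$. According to Market Basket Analysis principles, what does this mathematically reveal about the relationship?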
Market Basket Analysis
Hard
A.Bread and Butter are mutually exclusive items.
B.The dataset suffers from the base-rate fallacy regarding the purchase of Jam.
C.The itemsets violate the Apriori anti-monotonicity property, indicating a calculation error.
D.Butter is an 'anti-catalyst' for buying Jam when Bread is already in the basket.
Correct Answer: Butter is an 'anti-catalyst' for buying Jam when Bread is already in the basket.
Explanation:
Lift drops from $1.5$ to $1.2$ when Butter is added to the antecedent {Bread}. Because both rules share the consequent {Jam}, each Lift is the rule's Confidence divided by the same $P(\text{Jam})$, so Confidence falls by the same factor ($1.2 / 1.5 = 0.8$). This indicates a negative interaction: the addition of Butter actually decreases the conditional probability of buying Jam. In association rule terminology, Butter acts as a suppressor or anti-catalyst for Jam in the presence of Bread.