Unit 4 - Practice Quiz

INT345 60 Questions

1 What is a local feature in the context of computer vision?

Overview of feature detection Easy
A. A completely flat, uniform region in an image
B. A metadata tag describing the image content
C. An image pattern that is easily distinguishable from its immediate neighborhood
D. The overall color histogram of the entire image

2 What type of image structure is the Harris detector primarily designed to find?

Harris corner detector Easy
A. Corners
B. Smooth gradients
C. Flat uniform regions
D. Straight horizontal lines

3 What does the acronym SIFT stand for in computer vision?

scale invariant feature transform Easy
A. Scale-Invariant Feature Transform
B. Spatial Intensity Feature Transform
C. Speeded-up Invariant Feature Test
D. Standard Image Feature Tool

4 The SURF algorithm was primarily designed as a faster alternative to which popular feature detector?

speeded up robust features Easy
A. HOG
B. RANSAC
C. BRIEF
D. SIFT

5 What is the primary advantage of using binary feature descriptors over floating-point descriptors?

binary feature detectors Easy
A. They are highly efficient to compute and match
B. They provide a continuously varying gradient map
C. They capture more color information
D. They are entirely unaffected by extreme lighting changes

6 How does the FAST algorithm typically identify a corner?

FAST Easy
A. By calculating the second derivative of the entire image
B. By comparing a central pixel's intensity to a circular ring of surrounding pixels
C. By analyzing the frequency domain using a Fourier transform
D. By building a histogram of oriented gradients

7 BRIEF creates an image descriptor by performing what specific operation?

BRIEF Easy
A. Computing gradient histograms
B. Calculating complex matrix eigenvalues
C. Extracting scale-space extrema using Difference of Gaussians
D. Simple binary intensity comparisons between random pixel pairs

8 ORB was introduced as a free and efficient alternative to SIFT and SURF. What two algorithms does it combine?

ORB Easy
A. FAST and BRIEF
B. Harris and HOG
C. RANSAC and SIFT
D. K-D tree and LSH

9 What does HOG stand for in computer vision?

HOG Easy
A. Highly Optimized Graphics
B. Heuristic Object Generator
C. Histogram of Oriented Gradients
D. Hashing of Grayscale

10 Which of the following is a common real-world application of feature detection and description?

applications of descriptors Easy
A. Typing text into a document
B. Adjusting monitor brightness
C. Converting images from JPEG to PNG
D. Image panorama stitching

11 What is the primary goal of feature matching?

Overview of feature matching Easy
A. To compress an image size
B. To establish correspondences between features in two or more images
C. To apply color filters to an image
D. To detect human faces only

12 Which similarity measure is computationally best suited for comparing binary feature descriptors like ORB and BRIEF?

similarity measures Easy
A. Mahalanobis distance
B. Cosine similarity
C. Hamming distance
D. Euclidean distance

13 How does a Brute Force matcher find the best match for a feature from a first image?

brute force matching Easy
A. It randomly guesses the best match to save time
B. It skips matching and outputs the identity matrix
C. It compares the feature against every single feature in the second image
D. It only compares the feature with features of the same color

14 In the context of feature matching, what is a K-D tree primarily used for?

K-D tree Easy
A. Calculating the focal length of a camera
B. Speeding up nearest neighbor searches in high-dimensional spaces
C. Storing pixel colors in a 2D array
D. Drawing edges on a black and white image

15 What is the core idea behind Locality-Sensitive Hashing (LSH)?

Locality-Sensitive Hashing (LSH) Easy
A. Hashing dissimilar items into the exact same bucket
B. Encrypting image features so they cannot be read
C. Hashing similar input items into the same buckets with high probability
D. Sorting features alphabetically

16 What does RANSAC stand for?

RANSAC for robust matching Easy
A. Random Sample Consensus
B. Robust Analysis of Scale and Corners
C. Randomized System Algorithm Code
D. Rapid Sampling and Calculation

17 In the context of feature matching, what is the main purpose of the RANSAC algorithm?

RANSAC for robust matching Easy
A. To extract more features from an image
B. To filter out incorrect matches (outliers) and find a robust mathematical model
C. To increase the brightness of the image
D. To compress the feature descriptors

18 Which of the following properties makes SIFT highly reliable for matching images taken from different distances and angles?

scale invariant feature transform Easy
A. It converts the image to purely black and white
B. It is invariant to uniform image scaling and rotation
C. It only detects vertical lines
D. It relies entirely on color histograms

19 Which distance metric is most commonly used to match continuous floating-point descriptors like SIFT and SURF?

similarity measures Easy
A. Jaccard index
B. Hamming distance
C. Euclidean distance ($L_2$ norm)
D. Levenshtein distance

20 HOG descriptors were traditionally and most famously popularized for which specific computer vision task?

HOG Easy
A. Color correction
B. Pedestrian detection
C. Image compression
D. Lens distortion removal

21 Which of the following characteristics is most crucial for a local feature detector to ensure that the same physical point in a scene is extracted across images taken from different viewpoints?

Overview of feature detection Medium
A. Sparsity
B. Repeatability
C. Low contrast
D. High dimensionality

22 In the Harris corner detector, the structure tensor (second-moment matrix) is computed. Let and be its eigenvalues. What does it indicate if and ?

Harris corner detector Medium
A. An isolated point of noise
B. An edge
C. A corner
D. A flat region

23 The Harris response function is given by $R = \det(M) - k\,(\operatorname{trace}(M))^2$, where $M$ is the structure tensor. Why is this specific formula used instead of directly computing the eigenvalues?

Harris corner detector Medium
A. To avoid the computationally expensive eigenvalue decomposition
B. To normalize the image gradients against illumination changes
C. To achieve scale invariance
D. To enforce rotation invariance in the descriptor
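
Worked sketch for question 23: a minimal Python illustration of the commonly used Harris response, R = det(M) - k * trace(M)^2, evaluated from the three distinct entries of a 2x2 structure tensor. The function name `harris_response`, its argument names, and the default k = 0.05 are assumptions made for illustration; none of them come from the quiz.

```python
def harris_response(sxx, syy, sxy, k=0.05):
    """Harris response for the 2x2 structure tensor [[sxx, sxy], [sxy, syy]].

    sxx, syy, sxy are the (window-summed) products of image gradients.
    k is an empirical constant; 0.05 is an assumed, typical value.
    """
    det = sxx * syy - sxy * sxy   # determinant of the 2x2 matrix
    trace = sxx + syy             # trace of the 2x2 matrix
    return det - k * trace * trace


# Example call with hypothetical tensor entries.
response = harris_response(10.0, 10.0, 0.0)
```

In practice the three inputs would be accumulated over a window around each pixel, and the response would be thresholded and passed through non-maximum suppression.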

24 How does the Scale Invariant Feature Transform (SIFT) algorithm initially identify potential keypoints across different scales?

scale invariant feature transform Medium
A. By computing the Harris response at multiple image resolutions
B. By searching for local maxima in the Histogram of Oriented Gradients (HOG)
C. By finding local extrema in a Difference of Gaussians (DoG) pyramid
D. By finding local extrema in a Laplacian of Gaussian (LoG) pyramid

25 To achieve rotation invariance, SIFT assigns a dominant orientation to each keypoint. How is this dominant orientation determined?

scale invariant feature transform Medium
A. By calculating the direction of the principal eigenvector of the Harris matrix
B. By aligning the keypoint patch with the absolute horizontal axis of the image
C. By analyzing the peak of a histogram of local image gradient orientations around the keypoint
D. By computing the phase of the Fourier transform of the keypoint patch

26 SURF accelerates the feature detection process compared to SIFT. What mathematical tool does SURF heavily rely on to speed up its filtering operations?

speeded up robust features Medium
A. Singular Value Decomposition (SVD)
B. Integral images
C. Discrete Cosine Transform (DCT)
D. Fast Fourier Transform (FFT)

27 The FAST detector examines a circle of 16 pixels around a candidate keypoint. To quickly reject non-corners, which pixels are typically tested first?

FAST Medium
A. All odd-numbered pixels
B. All even-numbered pixels
C. Pixels 1, 2, 3, and 4
D. Pixels 1, 5, 9, and 13

28 How is a BRIEF descriptor formed for a given keypoint patch?

BRIEF Medium
A. By creating a histogram of local gradient orientations
B. By performing simple binary intensity tests between random pairs of pixels
C. By calculating the Haar wavelet responses in horizontal and vertical directions
D. By computing the eigenvectors of the local covariance matrix

29 Standard FAST keypoints lack orientation, and standard BRIEF descriptors are not rotationally invariant. How does ORB modify FAST to compute a keypoint orientation?

ORB Medium
A. It aligns the patch to the strongest edge detected nearby
B. It fits an ellipse to the keypoint patch and uses the major axis
C. It computes a gradient histogram similar to SIFT
D. It calculates the intensity centroid of the keypoint patch

30 In the HOG (Histogram of Oriented Gradients) descriptor, what is the primary purpose of applying block normalization across overlapping blocks?

HOG Medium
A. To make the descriptor invariant to rotation
B. To reduce the dimensionality of the descriptor
C. To provide invariance to local changes in illumination and contrast
D. To ensure the descriptor is invariant to scale changes

31 When building an object recognition system for a heavily cluttered scene, why are local features (like SIFT) preferred over global features (like global color histograms)?

applications of descriptors Medium
A. Local features are highly robust to partial occlusions
B. Local features are faster to extract than global features
C. Local features require significantly less memory to store
D. Local features inherently capture the absolute color of the object better

32 Which distance metric is computationally optimal for matching binary descriptors such as BRIEF, ORB, or BRISK?

similarity measures Medium
A. Euclidean distance ($L_2$ norm)
B. Manhattan distance ($L_1$ norm)
C. Cosine similarity
D. Hamming distance

33 What is the primary objective of Lowe's ratio test during the feature matching phase?

Overview of feature matching Medium
A. To ensure features are matched across the same scale space
B. To normalize the descriptor lengths before computing Euclidean distances
C. To reject ambiguous matches by comparing the nearest neighbor distance to the second nearest neighbor distance
D. To filter out features that do not have a strong gradient magnitude

34 Given an image $A$ with $N$ descriptors and an image $B$ with $M$ descriptors, what is the time complexity of brute force matching if we want to find the best match in $B$ for every descriptor in $A$?

brute force matching Medium
A.
B.
C.
D.

35 While K-D trees are often used to accelerate nearest neighbor searches in feature matching, under what condition does a K-D tree's performance typically degrade to that of a linear (brute force) search?

K-D tree Medium
A. When the descriptors contain negative values
B. When the dimensionality of the descriptors is very high (e.g., > 100)
C. When the number of descriptors is very small
D. When the descriptors are highly clustered

36 What is the core principle behind Locality-Sensitive Hashing (LSH) for approximate nearest neighbor search?

Locality-Sensitive Hashing (LSH) Medium
A. It ensures that dissimilar items are mapped to the same hash bucket to save memory
B. It partitions the space deterministically using the median values of each dimension
C. It uses random projections so that similar descriptors fall into the same hash bucket with high probability
D. It uses cryptographic hash functions to secure descriptor data

37 In RANSAC, let $w$ be the probability that a selected match is an inlier. If a model requires $n$ matches to be computed, what does the expression $1 - (1 - w^n)^k$ represent?

RANSAC for robust matching Medium
A. The expected number of inliers found after $k$ iterations
B. The variance of the error in the final model fit
C. The probability that RANSAC finds a correct model after $k$ iterations
D. The probability of selecting at least one outlier in $k$ iterations

38 When using RANSAC to compute a homography matrix between two images, what is the minimum number of point correspondences required to instantiate a model hypothesis in a single iteration?

RANSAC for robust matching Medium
A. 8
B. 2
C. 4
D. 3

39 What happens to the required number of RANSAC iterations if the proportion of outliers in your feature matches increases dramatically, assuming you want to maintain a 99% confidence of finding a good model?

RANSAC for robust matching Medium
A. The required iterations increase exponentially
B. The required iterations remain the same, but the model fitting takes longer
C. The required iterations decrease because outliers are easier to reject
D. The required iterations increase linearly

40 Binary descriptors are primarily favored in real-time computer vision applications (like mobile AR) because they are memory efficient and fast to match. Which of the following is NOT a binary descriptor?

binary feature detectors Medium
A. SURF
B. BRISK
C. BRIEF
D. ORB

41 The Harris corner detector computes a corner response function . If the eigenvalues and of the second-moment matrix satisfy (where is very large), how does the Harris corner response behave, and what does it indicate structurally?

Harris corner detector Hard
A. , indicating a flat region
B. , indicating an edge
C. , indicating an isolated point
D. , indicating a corner

42 In the SIFT algorithm, extrema are detected in the Difference of Gaussians (DoG) scale space. During sub-pixel localization, the Taylor expansion of the DoG function is used. If the calculated offset has a component larger than $0.5$ in any dimension, what action is taken?

scale invariant feature transform Hard
A. The local contrast threshold is increased to filter out noise.
B. The offset is clamped to exactly 0.5 to prevent divergence.
C. The extremum is rejected as an unstable edge.
D. The extremum is moved to the adjacent sample point and the localization is recomputed.

43 SURF achieves computational efficiency by using box filters to approximate the Hessian matrix. How does SURF evaluate these box filters in $O(1)$ time, independent of the filter scale?

speeded up robust features Hard
A. By utilizing an Integral Image representation
B. By precomputing the Fourier transform of the image
C. By downsampling the original image progressively
D. By applying 1D Gaussian smoothing separably

44 The FAST feature detector evaluates a circle of 16 pixels around a candidate pixel $p$. To accelerate the rejection of non-corners, FAST initially examines a specific subset of pixels. Which subset is typically checked first to quickly reject a non-corner candidate?

FAST Hard
A. Pixels 2, 6, 10, and 14
B. Pixels 1, 5, 9, and 13
C. Pixels 1, 3, 5, and 7
D. All even-numbered pixels around the circle

45 BRIEF descriptors are highly sensitive to in-plane rotation. How does the ORB algorithm modify the BRIEF descriptor extraction to achieve rotation invariance (steerable BRIEF)?

ORB Hard
A. By calculating the intensity centroid of the patch and steering the sampling pattern accordingly
B. By computing the eigenvectors of the second moment matrix to align the patch
C. By extracting the dominant gradient orientation using a SIFT-like histogram
D. By replacing random point pairs with symmetric circular patterns

46 In HOG descriptor computation, block normalization is a critical step for contrast invariance. If L2-Hys normalization is applied to a block vector $v$, which of the following best describes the process?

HOG Hard
A. Apply a low-pass filter to $v$, compute the L2-norm, and scale by a factor of 2.
B. Normalize using L1-norm, compute the square root of each element, and divide by the maximum value.
C. Normalize using L2-norm, clip values at a threshold (e.g., 0.2), and renormalize using L2-norm.
D. Subtract the mean of $v$, divide by the standard deviation, and apply a sigmoid function.

47 When using RANSAC to estimate a model from a set of putative point matches containing an unknown outlier ratio $\epsilon$, the number of iterations $N$ required to ensure a probability $p$ of picking at least one outlier-free sample of size $s$ is given by:

RANSAC for robust matching Hard
A.
B.
C.
D.

48 In the context of approximate nearest neighbor search using LSH for binary descriptors like ORB, which hash function family is commonly utilized to preserve the Hamming distance?

Locality-Sensitive Hashing (LSH) Hard
A. MinHash using Jaccard similarity over the non-zero gradients
B. Random projection hashing where each bit samples a specific index of the binary descriptor
C. E2LSH (Exact Euclidean LSH) using $p$-stable distributions
D. SimHash operating on the TF-IDF weights of visual words

49 Why does the standard K-D tree approach degrade to linear search when matching high-dimensional descriptors like the 128-dimensional SIFT?

K-D tree Hard
A. The Euclidean distance metric loses its triangular inequality property in dimensions greater than 20.
B. The median-finding algorithm during tree construction fails to partition data evenly in high dimensions.
C. The tree depth exceeds the maximum stack limit during recursive traversal, forcing a linear scan.
D. The volume of the search sphere intersects an exponentially increasing number of hyperplane boundaries, requiring backtracking through almost all nodes.

50 When matching two sets of feature descriptors using the Sum of Squared Differences (SSD) versus Normalized Cross-Correlation (NCC), which of the following conditions strictly favors NCC over SSD for robust matching?

similarity measures Hard
A. The feature descriptors are strictly binary strings extracted via BRIEF.
B. The images have undergone an affine change in illumination intensity ($I' = aI + b$).
C. The images contain high levels of zero-mean Gaussian noise.
D. The images have extreme scale variations and require scale-space extrema detection.

51 A common strategy to eliminate ambiguous feature matches is Lowe's ratio test. If the nearest neighbor distance is $d_1$ and the second nearest neighbor distance is $d_2$, the match is kept if $d_1 / d_2 < \tau$ for a chosen threshold $\tau$. What underlying assumption justifies this test in complex scenes?

Overview of feature matching Hard
A. The descriptor space forms a convex hull where the first and second neighbors always lie on opposite sides of the hyperplane.
B. The scale and rotation invariance of SIFT guarantees that $d_1$ is always an affine transformation of $d_2$.
C. Correct matches are significantly closer to the query than any incorrect match, whereas incorrect matches have many neighbors at similar distances due to background clutter.
D. Outliers in the dataset always cluster together at a distance exactly twice that of the true inlier.
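
Worked sketch for question 51: the ratio test described in the question can be illustrated with a small pure-Python matcher. This is only a sketch under stated assumptions: the names `euclidean` and `ratio_test_matches` are hypothetical, descriptors are plain lists of floats, and the default threshold of 0.75 is an assumed value rather than one given by the quiz.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ratio_test_matches(query, train, ratio=0.75):
    """Return (query_index, train_index) pairs that pass the ratio test.

    For each query descriptor, find its two nearest train descriptors and
    keep the nearest one only if d1 < ratio * d2.
    """
    matches = []
    for qi, q in enumerate(query):
        # Sort all train descriptors by distance to this query descriptor.
        dists = sorted((euclidean(q, t), ti) for ti, t in enumerate(train))
        if len(dists) < 2:
            continue  # the test needs a second neighbor to compare against
        (d1, best), (d2, _) = dists[0], dists[1]
        if d1 < ratio * d2:
            matches.append((qi, best))
    return matches


# Hypothetical 2-D descriptors: the first query has one clearly closest
# neighbor, the second is equally close to two train descriptors.
query = [[0.0, 0.0], [5.0, 5.0]]
train = [[0.1, 0.0], [4.0, 4.0], [6.0, 6.0]]
kept = ratio_test_matches(query, train)
```

The exhaustive sort keeps the sketch short; a real matcher would typically request only the two nearest neighbors from a brute force or approximate index.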

52 ORB uses rBRIEF (rotation-aware BRIEF) for its descriptor. To reduce the correlation among the binary tests and maximize variance, ORB applies a specific learning step during its design. What method is used to select the optimal pairs of pixels for the binary tests?

ORB Hard
A. Support Vector Machines (SVM) to classify stable versus unstable bit pairs across the scale pyramid.
B. K-means clustering on the intensity differences of random Gaussian patches.
C. A greedy search over all possible pixel pairs maximizing variance and minimizing absolute correlation.
D. Principal Component Analysis (PCA) to project the binary strings into a lower-dimensional orthogonal space.

53 Which mathematical concept provides the theoretical foundation for treating feature detection as finding locations where the signal changes significantly in multiple directions?

Overview of feature detection Hard
A. The Radon transform integrated over a full rotation
B. The Laplacian of Gaussian (LoG) zero-crossings
C. The Fourier transform phase spectrum evaluated at high frequencies
D. The auto-correlation function and the eigenvalues of the local structure tensor

54 In bag-of-visual-words (BoVW) image classification, visual vocabularies are created by clustering feature descriptors. If an image has multiple repetitive structures yielding many identical descriptors, how does the Term Frequency-Inverse Document Frequency (TF-IDF) weighting scheme handle these specific visual words?

applications of descriptors Hard
A. It assigns them a constant weight equal to the median frequency of the vocabulary.
B. It heavily penalizes them if they also appear frequently across the entire corpus of training images.
C. It entirely filters them out before the support vector machine classification stage.
D. It boosts their weight proportionally to the square of their occurrences to emphasize repetitive textures.

55 SIFT handles rotation invariance by computing a dominant gradient orientation for each keypoint. If a local region has multiple peaks in the orientation histogram that are within 80% of the highest peak, how does SIFT process this?

scale invariant feature transform Hard
A. It rejects the keypoint as being too ambiguous and prone to mismatches.
B. It increases the smoothing of the histogram until only a single dominant peak remains.
C. It computes the weighted average of all peaks to form a single orientation vector.
D. It creates multiple keypoints at the same location and scale, but with different orientations.

56 In PROSAC (Progressive Sample Consensus), an extension of RANSAC, how is the sampling strategy modified to achieve faster convergence in feature matching?

RANSAC for robust matching Hard
A. It incrementally adds dimensions to the descriptor space during the consensus phase.
B. It selects samples based on their spatial dispersion to maximize geometric constraints.
C. It progressively decreases the threshold for what constitutes an inlier during iterations.
D. It samples from a progressively expanding subset of matches ordered by a quality metric, such as Lowe's ratio score.

57 FAST uses a machine learning approach to generalize and speed up corner detection. How is the decision tree formulated in the machine learning version of FAST?

FAST Hard
A. A deep neural network applies 1D convolutions around the circular perimeter to detect edge patterns.
B. ID3 algorithm is used to select the pixel in the 16-pixel ring that yields the most information gain for classifying corner vs. non-corner.
C. A Random Forest is trained on the continuous intensity gradients of the central pixel.
D. A Support Vector Machine maps the 16-pixel ring into a high-dimensional space to find the optimal separating hyperplane.

58 In the formulation of the Difference of Gaussians (DoG) function $D(x, y, \sigma) = (G(x, y, k\sigma) - G(x, y, \sigma)) * I(x, y)$, the DoG provides an approximation to the scale-normalized Laplacian of Gaussian (LoG). What is the theoretical relationship between the DoG and the scale-normalized LoG $\sigma^2 \nabla^2 G$?

scale invariant feature transform Hard
A.
B.
C.
D.

59 Binary descriptors like BRISK rely on a specific sampling pattern around the keypoint. Unlike BRIEF, which uses random point pairs, what defines the spatial sampling pattern of BRISK?

binary feature detectors Hard
A. A cross-shaped sampling mask that aligns with the dominant gradient orientation of the image patch.
B. A dense rectangular grid of overlapping blocks similar to the HOG cell structure.
C. A set of concentric rings with evenly spaced sampling points, where points are smoothed with Gaussian kernels proportional to their distance from the center.
D. A spiral pattern defined by the Fibonacci sequence to ensure equal area sampling.

60 When applying Brute Force matching between image A ($N$ features) and image B ($M$ features) using cross-check (mutual consistency) validation, a match between feature $a_i$ and feature $b_j$ is retained only if:

brute force matching Hard
A. $b_j$ is the absolute nearest neighbor to $a_i$ in B, and $a_i$ is the absolute nearest neighbor to $b_j$ in A.
B. The distance ratio is below a threshold for both directions.
C. The distance between $a_i$ and $b_j$ is less than the median distance of all possible pairs.
D. The descriptor vector of $a_i$ and the descriptor vector of $b_j$ are linearly dependent.