Unit1 - Subjective Questions
INT345 • Practice Questions with Detailed Answers
What is Digital Image Processing? Explain the fundamental steps involved in a typical image processing system.
Digital Image Processing refers to the use of computer algorithms to perform processing on digital images. It allows a much wider range of algorithms to be applied to the input data and can avoid problems such as the build-up of noise and signal distortion during processing.
The fundamental steps in a typical image processing system are:
- Image Acquisition: The first step involves capturing the image using a sensor (like a camera) and converting it into a digital format.
- Image Enhancement: The process of manipulating an image so that the result is more suitable than the original for a specific application (e.g., adjusting contrast or brightness).
- Image Restoration: Improving the appearance of an image, typically based on mathematical or probabilistic models of image degradation.
- Morphological Processing: Tools for extracting image components that are useful in the representation and description of shape.
- Segmentation: Partitioning an image into its constituent parts or objects.
- Representation and Description: Transforming raw segmented data into a form suitable for subsequent computer processing and extracting features that yield quantitative information.
- Object Recognition: Assigning a label (e.g., "vehicle") to an object based on its descriptors.
Distinguish between Image Processing and Computer Vision.
While both fields deal with visual data, they have distinct goals:
- Definition:
- Image Processing focuses on transforming an image to enhance it or prepare it for further analysis. The input and output are both images.
- Computer Vision aims to emulate human vision. It focuses on understanding and interpreting the content of an image. The input is an image, and the output is an interpretation, decision, or specific information.
- Objective:
- Image Processing: To improve the visual appearance of an image to a human observer or to extract some features.
- Computer Vision: To make a machine "see" and understand the visual world (e.g., object detection, scene recognition).
- Examples:
- Image Processing: Contrast enhancement, noise removal, image sharpening, compression.
- Computer Vision: Facial recognition, autonomous driving visual systems, medical anomaly detection.
- Complexity: Image Processing is often considered a subset or a lower-level precursor to the higher-level cognitive tasks performed in Computer Vision.
Explain the hierarchy of low-level, mid-level, and high-level processes in computer vision and image processing.
Computer vision and image processing tasks are often categorized into three hierarchical levels:
- Low-Level Processes: These involve primitive operations where both the input and output are images. Examples include noise reduction, contrast enhancement, and image sharpening. The focus is purely on pixel-level manipulation without any understanding of the image content.
- Mid-Level Processes: These operations involve extracting attributes or features from an image. The input is generally an image, but the output is a set of attributes or features (e.g., edges, contours, or object identities). Examples include image segmentation (partitioning an image into regions) and object edge detection.
- High-Level Processes: This involves "making sense" of an ensemble of recognized objects, akin to human cognition. The input is typically the features/attributes from the mid-level, and the output is a semantic understanding of the scene. Examples include scene analysis, autonomous navigation, and answering questions about the image content.
Discuss the prominent applications of Computer Vision in the healthcare and automotive industries.
Healthcare Industry:
- Medical Image Analysis: Computer vision algorithms are used to analyze X-rays, MRIs, and CT scans to detect abnormalities like tumors, fractures, or diseases at an early stage.
- Surgical Assistance: Vision systems guide robotic arms during minimally invasive surgeries, providing high-precision 3D tracking of organs and tools.
- Health Monitoring: Analyzing patient movements and vitals using cameras to ensure safety (e.g., detecting falls in elderly patients).
Automotive Industry:
- Autonomous Driving: Self-driving cars rely heavily on computer vision to detect pedestrians, other vehicles, traffic signs, and lane markings using arrays of cameras and LiDAR.
- Advanced Driver Assistance Systems (ADAS): Features like automatic emergency braking, adaptive cruise control, and blind-spot monitoring use real-time visual data processing.
- Driver Alertness Monitoring: In-cabin cameras analyze driver eye movement and facial expressions to detect drowsiness or distraction, issuing warnings to prevent accidents.
Describe how Image Processing and Computer Vision are utilized in agriculture and security systems.
Agriculture:
- Crop Monitoring: Drones equipped with cameras capture multispectral images of fields. Image processing evaluates crop health, identifying areas needing water or fertilizer.
- Weed Detection and Spraying: Computer vision models differentiate between crops and weeds, guiding smart tractors to apply herbicides only where necessary, reducing chemical use.
- Yield Estimation: Vision systems count fruits or vegetables on plants to estimate the overall harvest yield before picking.
Security Systems:
- Facial Recognition: Used for access control in secure facilities or identifying suspects in crowded areas by matching faces against a database.
- Intrusion Detection: Smart cameras analyze video feeds in real-time to detect unauthorized human or vehicle movement in restricted zones, minimizing false alarms from animals or weather.
- License Plate Recognition (ANPR): Automated systems read and log vehicle license plates for parking management, toll collection, and law enforcement.
Explain the characteristics of JPEG, PNG, and TIFF image file formats. When is it appropriate to use each?
1. JPEG (Joint Photographic Experts Group):
- Characteristics: Uses lossy compression, meaning some data is permanently lost to reduce file size. It supports 24-bit color (millions of colors) but does not support transparency.
- When to use: Ideal for photographs, complex images with gradients, and web graphics where small file sizes are prioritized over perfect quality.
2. PNG (Portable Network Graphics):
- Characteristics: Uses lossless compression, retaining all original image data. It supports both 8-bit (indexed color) and 24-bit color, and importantly, supports an alpha channel for transparency.
- When to use: Best for web graphics requiring transparency (logos, icons), images with text, and crisp lines where compression artifacts (like those in JPEG) would be noticeable.
3. TIFF (Tagged Image File Format):
- Characteristics: Typically uses lossless compression (or no compression). Supports multiple layers, pages, and extremely high color depths.
- When to use: The standard for professional photography, print publishing, and high-quality archiving, where maximum quality and detail are strictly required, and file size is not a concern.
What is the difference between lossy and lossless image compression? Give examples of file formats for each.
Lossless Compression:
- Definition: A compression algorithm that reduces file size without losing any information. The original image can be perfectly reconstructed from the compressed data.
- Mechanism: It works by identifying and eliminating statistical redundancy in the image data.
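- Formats: PNG (Portable Network Graphics), TIFF (when using LZW or ZIP compression), and GIF (lossless only within its 256-color palette). BMP (Bitmap) also preserves all data, but usually because it stores pixels uncompressed rather than because it compresses them losslessly.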
- Use Case: Medical imaging, line art, text, and archival storage where absolute fidelity is required.
Lossy Compression:
- Definition: A compression algorithm that reduces file size by permanently discarding some data, usually the data deemed least important to human visual perception.
- Mechanism: It approximates the original data, resulting in a significantly smaller file size but a loss of quality (often resulting in artifacts).
- Formats: JPEG, WebP (lossy mode), HEIC.
- Use Case: Web photography, digital cameras, and general image sharing where smaller file sizes are critical and a slight loss in quality is acceptable.
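The difference between the two families can be checked on a toy row of pixels. The sketch below is illustrative only. It uses Python's standard `zlib` module (DEFLATE, the lossless coding that PNG builds on) for the lossless round trip, and a hypothetical `quantize` helper as a stand-in for the lossy step. Real lossy codecs such as JPEG quantize transform coefficients rather than raw pixels, so the lossy half is a simplifying assumption.

```python
import zlib

# A toy row of 8-bit pixels with long runs (statistical redundancy).
pixels = bytes([10] * 60 + [11] * 4 + [200] * 60 + [203] * 4)

# Lossless: compress, then decompress, and get every byte back.
compressed = zlib.compress(pixels)
restored = zlib.decompress(compressed)


def quantize(data, step):
    """Hypothetical lossy step: snap each value down to a multiple of `step`."""
    return bytes((value // step) * step for value in data)


# Lossy: small differences (10 vs 11, 200 vs 203) are discarded for good.
lossy = quantize(pixels, 16)

print(len(pixels), len(compressed), restored == pixels)
print(sorted(set(pixels)), sorted(set(lossy)))
```

After quantization the four distinct gray levels collapse to two, and nothing in `lossy` allows the original values to be recovered. That is the defining property of lossy compression.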
Define contrast enhancement. Explain the concept of linear contrast stretching with mathematical equations.
Contrast Enhancement is a spatial domain image processing technique used to improve the visibility of features in an image by increasing the dynamic range of the gray levels in the image being processed.
Linear Contrast Stretching:
It is a piece-wise linear transformation that stretches the range of intensity values present in the image so that it spans a desired range of values, e.g., the full range of the pixel datatype (like $0$ to $255$ for 8-bit images).
Let $r$ be the intensity of the input image and $s$ be the intensity of the output image. Let the minimum and maximum intensity values in the original image be $r_{\min}$ and $r_{\max}$, respectively.
If we want to stretch this to the full dynamic range $[0, L-1]$ (where $L - 1 = 255$ for an 8-bit image), the transformation function is given by:
$$s = \frac{r - r_{\min}}{r_{\max} - r_{\min}} \times (L - 1)$$
- Step 1: Subtracting $r_{\min}$ shifts the minimum value to $0$.
- Step 2: Dividing by $(r_{\max} - r_{\min})$ normalizes the range to $[0, 1]$.
- Step 3: Multiplying by $(L - 1)$ scales it to the maximum possible gray level.
This process significantly improves the visual contrast of low-contrast images.
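The three steps (shift by the minimum, normalize by the range, scale to the top gray level) can be sketched in a few lines. This is a minimal illustration, not a reference implementation. It assumes the image is a list of rows of integer gray levels, and the function name `stretch_contrast` and the `levels` parameter are hypothetical.

```python
def stretch_contrast(image, levels=256):
    """Linearly map [r_min, r_max] of `image` onto [0, levels - 1]."""
    flat = [r for row in image for r in row]
    r_min, r_max = min(flat), max(flat)
    if r_max == r_min:
        # A constant image has no range to stretch; return an unchanged copy.
        return [row[:] for row in image]
    scale = (levels - 1) / (r_max - r_min)
    # Shift by r_min, then scale so that r_max lands on levels - 1.
    return [[round((r - r_min) * scale) for r in row] for row in image]


low_contrast = [[100, 110], [120, 150]]
print(stretch_contrast(low_contrast))  # [[0, 51], [102, 255]]
```

The input occupies only the range 100 to 150. After stretching, the same four pixels span the full 0 to 255 range.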
Explain Power-Law (Gamma) transformations and their role in image enhancement.
Power-Law (Gamma) Transformation is a non-linear operation used for contrast enhancement, defined by the mathematical expression:
$$s = c \, r^{\gamma}$$
Where:
- $r$ is the input pixel intensity (typically normalized to the range $[0, 1]$).
- $s$ is the output pixel intensity.
- $c$ and $\gamma$ (gamma) are positive constants.
Role in Image Enhancement:
- $\gamma < 1$: The transformation maps a narrow range of dark input values into a wider range of output values, while compressing the bright values. This is used to enhance (lighten) images that are too dark (underexposed).
- $\gamma > 1$: The transformation maps a narrow range of bright input values into a wider range of output values, while compressing the dark values. This is used to enhance (darken) images that are washed out or too bright (overexposed).
- $\gamma = 1$: Reduces to the identity transformation (linear mapping) when $c = 1$.
Gamma correction is also widely used in displaying images accurately on monitors, as cathode ray tubes (CRTs) and some LCDs have a non-linear intensity response.
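The effect of the exponent is easy to see numerically. The following is a small sketch of the power-law mapping (output = c times input raised to gamma, on intensities normalized to 0..1). It assumes 8-bit gray levels stored as a list of rows, and the name `gamma_transform` and the clipping step are illustrative choices.

```python
def gamma_transform(image, gamma, c=1.0, levels=256):
    """Apply s = c * r**gamma to intensities normalized to [0, 1]."""
    max_level = levels - 1
    result = []
    for row in image:
        new_row = []
        for value in row:
            r = value / max_level          # normalize to [0, 1]
            s = c * (r ** gamma)           # power-law mapping
            s = min(max(s, 0.0), 1.0)      # clip in case c pushes s past 1
            new_row.append(round(s * max_level))
        result.append(new_row)
    return result


ramp = [[0, 64, 128, 255]]
print(gamma_transform(ramp, 0.5))  # gamma < 1 lifts the dark values
print(gamma_transform(ramp, 2.0))  # gamma > 1 pushes mid-tones down
print(gamma_transform(ramp, 1.0))  # gamma = 1 (with c = 1) leaves the ramp unchanged
```

Black (0) and white (255) stay fixed for any gamma when `c = 1`. Only the values in between move.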
What is a histogram in the context of image processing? Explain the procedure for Histogram Equalization.
The histogram of a digital image with intensity levels in the range $[0, L-1]$ is a discrete function $h(r_k) = n_k$, where $r_k$ is the $k$-th intensity value and $n_k$ is the number of pixels in the image with intensity $r_k$.
Histogram Equalization is a technique used to improve contrast in images by spreading out the most frequent intensity values. It flattens the histogram, stretching the dynamic range of the image.
Procedure:
- Calculate the Probability Density Function (PDF): Find the probability of occurrence of each gray level $r_k$:
$$p_r(r_k) = \frac{n_k}{n}$$
where $n$ is the total number of pixels in the image, and $k = 0, 1, \dots, L-1$.
- Calculate the Cumulative Distribution Function (CDF): Compute the cumulative probability for each gray level:
$$c(r_k) = \sum_{j=0}^{k} p_r(r_j)$$
- Multiply by Maximum Gray Level: Scale the CDF values to the maximum gray level $L - 1$:
$$s_k = (L - 1) \, c(r_k)$$
- Round off to Nearest Integer: Since gray levels must be integers, round the mapped values to the nearest integer.
- Map New Values: Replace the original gray levels $r_k$ in the image with the new equalized gray levels $s_k$.
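The procedure can be traced on a tiny 3-bit image (8 gray levels). The sketch below follows the same sequence: count pixels per level, accumulate, scale by the top gray level, round, and remap. The function name `equalize_histogram` and the list-of-rows image layout are assumptions made for illustration.

```python
def equalize_histogram(image, levels=256):
    """Return a histogram-equalized copy of `image` (a list of rows of ints)."""
    flat = [r for row in image for r in row]
    n = len(flat)

    # Histogram: number of pixels at each gray level.
    hist = [0] * levels
    for r in flat:
        hist[r] += 1

    # Running sum of counts divided by n is the CDF; scale it and round.
    mapping = []
    cumulative = 0
    for k in range(levels):
        cumulative += hist[k]
        mapping.append(round((levels - 1) * cumulative / n))

    # Replace every gray level by its equalized value.
    return [[mapping[r] for r in row] for row in image]


dark = [[0, 0, 1, 1], [1, 2, 2, 3]]
print(equalize_histogram(dark, levels=8))  # [[2, 2, 4, 4], [4, 6, 6, 7]]
```

Here the CDF values are 2/8, 5/8, 7/8 and 8/8, so levels 0 to 3 map to 2, 4, 6 and 7. The image that originally used only the darkest half of the scale now reaches the top gray level.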
Differentiate between Histogram Equalization and Histogram Specification (Matching).
Histogram Equalization:
- Objective: To automatically enhance contrast by transforming the image such that its output histogram is approximately uniform (flat).
- Control: It provides no user control over the shape of the final histogram. The algorithm strictly attempts to flatten it.
- Application: Useful for general contrast enhancement when a globally uniform distribution of gray levels is desired.
Histogram Specification (Matching):
- Objective: To transform an image so that its histogram matches a specific, user-defined target histogram.
- Control: Gives complete control to the user to specify the desired shape of the histogram based on the specific requirements of the application.
- Application: Useful in situations where histogram equalization produces washed-out images, or when standardizing multiple images to have the same lighting characteristics (e.g., matching a newly captured image to a reference image).
- Process: Involves computing the equalization transformation of the original image, computing the equalization transformation of the target histogram, and then mapping each equalized original value through the inverse of the target's transformation.
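One common way to realize this mapping in discrete form is to compare the two cumulative distributions directly: for each source level, pick the smallest target level whose CDF is at least as large. The sketch below takes that approach as an assumption. The helper names are hypothetical, and the CDFs are compared through integer cross-multiplication to avoid floating-point ties.

```python
def cumulative_counts(hist):
    """Running sum of a histogram (an unnormalized CDF)."""
    totals, running = [], 0
    for count in hist:
        running += count
        totals.append(running)
    return totals


def match_histogram(image, target_hist):
    """Remap `image` so its histogram approximates `target_hist`."""
    levels = len(target_hist)
    src_hist = [0] * levels
    for row in image:
        for r in row:
            src_hist[r] += 1

    src_cum = cumulative_counts(src_hist)
    tgt_cum = cumulative_counts(target_hist)
    src_total, tgt_total = src_cum[-1], tgt_cum[-1]

    mapping, z = [], 0
    for k in range(levels):
        # Advance z while target CDF(z) < source CDF(k), compared exactly.
        while z < levels - 1 and tgt_cum[z] * src_total < src_cum[k] * tgt_total:
            z += 1
        mapping.append(z)
    return [[mapping[r] for r in row] for row in image]


image = [[0, 0], [1, 3]]
target = [0, 0, 2, 2]  # desired: half the pixels at level 2, half at level 3
print(match_histogram(image, target))  # [[2, 2], [3, 3]]
```

Because both CDFs are non-decreasing, the pointer `z` only moves forward, so the whole mapping is built in a single pass over the gray levels.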
Explain the concept of local histogram processing and why it is preferred over global processing in certain scenarios.
Local Histogram Processing:
Instead of calculating a single histogram for the entire image (global), local histogram processing involves defining a neighborhood (e.g., a $3 \times 3$ or $5 \times 5$ window) and moving its center from pixel to pixel. At each location, the histogram of the points in the neighborhood is computed, and a histogram equalization or specification transformation function is obtained.
Why it is preferred over Global Processing:
- Preserving Local Detail: Global histogram equalization is based on the intensity distribution of the entire image. If an image has large areas of dark backgrounds and a small object of interest that is also dark, global equalization might not enhance the small object sufficiently. Local processing enhances contrast based on local context, bringing out hidden details in small areas.
- Handling Variable Illumination: When an image suffers from non-uniform illumination (e.g., a shadow cast over half the image), global equalization fails because the statistics of the bright and dark halves are completely different. Local histogram processing adapts to the varying illumination by only considering the immediate neighborhood of a pixel.
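A direct, unoptimized sketch of local equalization is shown below. For each pixel it evaluates the local CDF at the center value only, which is all that is needed to remap that one pixel. The function name `local_equalize`, the `radius` parameter, and the choice to clip the window at the image border are assumptions, and practical implementations usually update the histogram incrementally instead of rebuilding it.

```python
def local_equalize(image, radius=1, levels=256):
    """Equalize each pixel using only its (2*radius + 1)^2 neighborhood."""
    rows, cols = len(image), len(image[0])
    result = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            # Neighborhood clipped at the borders (an assumed border policy).
            window = [image[j][i]
                      for j in range(max(0, y - radius), min(rows, y + radius + 1))
                      for i in range(max(0, x - radius), min(cols, x + radius + 1))]
            center = image[y][x]
            # Local CDF at the center value: share of neighbors <= center.
            rank = sum(1 for value in window if value <= center)
            result[y][x] = round((levels - 1) * rank / len(window))
    return result


# A dark region whose texture differs by only 2 gray levels.
faint = [[10, 12, 10], [12, 10, 12], [10, 12, 10]]
print(local_equalize(faint))
```

In this example the center pixel (value 10) maps to 142 while its neighbors with value 12 map to 255, so a two-level difference in a dark area becomes clearly visible.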
Define image noise. What are the common sources of noise in digital images?
Image Noise is defined as random, unwanted variation in brightness or color information in an image. It degrades image quality and represents a random error in the image signal.
Common Sources of Noise:
- Image Acquisition (Sensors): The primary source of noise. Digital camera sensors are affected by environmental conditions (like high temperature, which increases electron agitation, causing thermal noise) and low light conditions (leading to amplification of the signal and thereby noise).
- Transmission Errors: Noise can be introduced during the transmission of image data over a noisy channel (e.g., wireless networks, deep-space communication), often leading to impulse (salt-and-pepper) noise due to bit errors.
- Digitization/Quantization: Converting an analog image into a digital format introduces quantization noise, which is the error between the actual analog value and the closest quantized digital value.
- Interference: Electromagnetic interference during the analog-to-digital conversion process can also introduce structured noise into the image.
Explain Gaussian noise and Salt-and-Pepper (Impulse) noise in digital images. Provide their Probability Density Functions (PDF).
1. Gaussian Noise (Normal Noise):
- Explanation: It is a statistical noise model in which the intensity variations follow a normal (bell-curve) distribution. It arises during acquisition, for example from electronic circuit noise and from sensor noise caused by poor illumination or high temperature.
- PDF: The probability density function of a Gaussian random variable is given by:
$$p(z) = \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{(z - \mu)^2}{2\sigma^2}}$$
Where $z$ represents the gray level, $\mu$ is the mean, and $\sigma$ is the standard deviation.
2. Salt-and-Pepper Noise (Impulse Noise):
- Explanation: This noise manifests as randomly scattered white (salt) and black (pepper) pixels over the image. It is typically caused by malfunctioning camera sensor cells, memory cell errors, or transmission channel errors.
- PDF: The probability density function is discrete and given by:
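$$p(z) = \begin{cases} P_a & \text{for } z = a \\ P_b & \text{for } z = b \\ 0 & \text{otherwise} \end{cases}$$
If $b > a$, intensity $b$ appears as a light dot (salt) and $a$ appears as a dark dot (pepper). If $P_a = P_b = 0$, the noise is zero.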
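Both models are simple to simulate, which is useful when testing filters. The sketch below is one possible way to do it with Python's standard `random` module. It assumes 8-bit images stored as lists of rows, pepper at level 0 and salt at level 255, and clipping of Gaussian-corrupted values to the valid range. The function and parameter names are hypothetical.

```python
import random


def add_gaussian_noise(image, mean=0.0, sigma=10.0, levels=256, rng=random):
    """Add a normally distributed offset to every pixel, then clip."""
    return [[min(levels - 1, max(0, round(value + rng.gauss(mean, sigma))))
             for value in row] for row in image]


def add_salt_and_pepper(image, p_pepper=0.05, p_salt=0.05, levels=256, rng=random):
    """Replace pixels by 0 with probability p_pepper, by levels - 1 with p_salt."""
    result = []
    for row in image:
        new_row = []
        for value in row:
            u = rng.random()
            if u < p_pepper:
                new_row.append(0)               # pepper
            elif u < p_pepper + p_salt:
                new_row.append(levels - 1)      # salt
            else:
                new_row.append(value)           # pixel left untouched
        result.append(new_row)
    return result


rng = random.Random(0)  # fixed seed so the experiment is repeatable
flat = [[128] * 8 for _ in range(8)]
print(add_gaussian_noise(flat, sigma=15.0, rng=rng)[0])
print(add_salt_and_pepper(flat, rng=rng)[0])
```

Gaussian noise perturbs every pixel by a small amount, while impulse noise leaves most pixels intact and drives a few to the extremes. This difference is why the two call for different filters.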
Describe Uniform noise and Exponential noise with their respective mathematical models.
1. Uniform Noise:
- Description: The intensity of the noise is uniformly distributed over a specific range. It is often used to simulate quantization noise (the error introduced when converting analog signals to digital values).
- Mathematical Model (PDF):
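$$p(z) = \begin{cases} \dfrac{1}{b - a} & \text{if } a \le z \le b \\ 0 & \text{otherwise} \end{cases}$$
The mean is $\mu = \frac{a + b}{2}$ and the variance is $\sigma^2 = \frac{(b - a)^2}{12}$.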
2. Exponential Noise:
- Description: This type of noise has an exponential distribution. It frequently occurs in laser imaging (speckle noise) and synthetic aperture radar (SAR) images.
- Mathematical Model (PDF):
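$$p(z) = \begin{cases} a e^{-az} & \text{for } z \ge 0 \\ 0 & \text{for } z < 0 \end{cases}$$
where $a > 0$. The mean is $\mu = \frac{1}{a}$ and the variance is $\sigma^2 = \frac{1}{a^2}$.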
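The textbook moments of these two models, mean (a + b)/2 and variance (b - a)^2/12 for uniform noise on [a, b], and mean 1/a and variance 1/a^2 for exponential noise with rate a, can be sanity-checked by sampling. The sketch below uses `random.uniform` and `random.expovariate` from the standard library. The helper names and the parameter values are arbitrary illustrative choices.

```python
import random


def mean_and_variance(samples):
    """Sample mean and (population) variance of a list of numbers."""
    mean = sum(samples) / len(samples)
    variance = sum((z - mean) ** 2 for z in samples) / len(samples)
    return mean, variance


def sample_uniform_noise(a, b, count, rng):
    """Draw `count` values uniformly distributed on [a, b]."""
    return [rng.uniform(a, b) for _ in range(count)]


def sample_exponential_noise(a, count, rng):
    """Draw `count` values with density a * exp(-a * z) for z >= 0."""
    return [rng.expovariate(a) for _ in range(count)]


rng = random.Random(42)
print(mean_and_variance(sample_uniform_noise(10, 30, 20_000, rng)))
print(mean_and_variance(sample_exponential_noise(0.5, 20_000, rng)))
```

With these parameters the estimates should land close to the theoretical values: mean 20 and variance about 33.3 for the uniform case, and mean 2 and variance 4 for the exponential case.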
What is spatial domain filtering? Differentiate between linear and non-linear spatial filters with examples.
Spatial Domain Filtering involves directly manipulating the pixels of an image. A neighborhood operation is performed on a pixel and its immediate neighbors to compute a new value for that pixel. This is usually done using a mask (or kernel), which is convolved or correlated with the image.
Differentiation:
Linear Spatial Filters:
- Mechanism: The output pixel is a linear combination (sum of products) of the pixels in the neighborhood and the coefficients of the filter mask.
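- Characteristics: The output obeys superposition, so the filter is fully described by its mask. The smoothing masks listed below reduce noise but also blur sharp edges and fine details; other linear masks, such as the Laplacian, sharpen instead.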
- Examples:
- Mean Filter (Average Filter): Replaces the center pixel with the average of all pixels in the neighborhood.
- Gaussian Filter: A weighted average filter that gives more importance to the central pixel, resulting in smoother blurring.
Non-Linear Spatial Filters:
- Mechanism: The output pixel is determined by a non-linear mathematical operation (like ranking, max, min, or median) applied to the pixels within the neighborhood.
- Characteristics: Excellent for removing specific types of noise without severely blurring edges.
- Examples:
- Median Filter: Replaces the center pixel with the median value of the neighborhood. (Great for salt-and-pepper noise).
- Max/Min Filter: Replaces the center pixel with the maximum or minimum value in the neighborhood.
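What "linear" means in practice is that filtering a sum of two images gives the sum of the filtered images. The short check below, written for a single $3 \times 3$ neighborhood flattened into a list, shows that the mean satisfies this property and the median does not. The function names are hypothetical and the pixel values are chosen only to make the arithmetic easy to follow.

```python
from statistics import median


def mean_response(window):
    """Linear filter response: equal-weight sum of products over the window."""
    return sum(window) / len(window)


def median_response(window):
    """Non-linear filter response: middle value after ranking."""
    return median(window)


a = [0, 0, 0, 0, 0, 0, 9, 9, 9]   # neighborhood from image A
b = [9, 9, 0, 0, 0, 0, 0, 0, 0]   # same neighborhood from image B
both = [p + q for p, q in zip(a, b)]

# Mean: filtering the sum equals the sum of the filtered parts (5.0 and 5.0).
print(mean_response(both), mean_response(a) + mean_response(b))
# Median: 9 for the sum, but 0 + 0 for the parts.
print(median_response(both), median_response(a) + median_response(b))
```

The median of the combined neighborhood is 9 because five of its nine values are 9, yet each part alone has a median of 0. No fixed set of mask coefficients can reproduce that behavior.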
Explain the working principle of the Median filter. Why is it particularly effective against Salt-and-Pepper noise?
Working Principle of the Median Filter:
The median filter is a non-linear spatial filter. It slides a window (e.g., $3 \times 3$) over the image. At each position, it reads all the pixel values within the window, sorts them in ascending or descending order, and selects the middle (median) value. The original center pixel's value is then replaced by this median value.
Why it is effective against Salt-and-Pepper noise:
- Salt-and-pepper noise consists of extreme pixel values (either maximum white or maximum black).
- When the values in a neighborhood containing noise are sorted, the extreme noise values (the salt and the pepper) will inevitably end up at the very beginning or the very end of the sorted list.
- Because the median filter picks the middle value of the sorted list, it bypasses these extreme noise values and selects a value representative of the actual image data, provided that fewer than half of the pixels in the window are corrupted.
- Unlike the mean (average) filter, which gets skewed by extreme values and causes blurring, the median filter effectively removes the impulse noise while preserving sharp edges.
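The contrast between the two filters shows up clearly on a small patch with one salt pixel and one pepper pixel. The sketch below applies a $3 \times 3$ window with either reducer. The helper name `filter3x3` and the decision to leave border pixels unchanged are assumptions made to keep the example short.

```python
from statistics import median


def filter3x3(image, reducer):
    """Apply `reducer` to every 3x3 neighborhood; borders are copied as-is."""
    rows, cols = len(image), len(image[0])
    result = [row[:] for row in image]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            window = [image[j][i]
                      for j in range(y - 1, y + 2)
                      for i in range(x - 1, x + 2)]
            result[y][x] = reducer(window)
    return result


noisy = [
    [50, 50, 50, 50],
    [50, 255, 50, 50],   # 255 is a salt pixel
    [50, 50, 0, 50],     # 0 is a pepper pixel
    [50, 50, 50, 50],
]

median_out = filter3x3(noisy, median)
mean_out = filter3x3(noisy, lambda w: round(sum(w) / len(w)))

print(median_out[1], median_out[2])  # [50, 50, 50, 50] [50, 50, 50, 50]
print(mean_out[1], mean_out[2])      # [50, 67, 67, 50] [50, 67, 67, 50]
```

The median restores the flat background exactly, because the sorted window is one 0, seven 50s and one 255. The mean instead smears the two impulses into all four interior pixels, raising them to 67.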
Describe the application of the Laplacian operator for image sharpening. Write the typical mask used.
Application of the Laplacian Operator:
The Laplacian is a 2D isotropic measure of the 2nd spatial derivative of an image. Because it highlights regions of rapid intensity change, it is extensively used for edge detection and image sharpening.
To sharpen an image, the Laplacian highlights the fine details and edges. However, because it is a derivative filter, it produces an image with grayish edge lines and a dark background. To get a sharpened image, the Laplacian output is subtracted from (or added to, depending on the center mask value) the original image. This adds the high-frequency edge information back into the original image, making edges crisper.
Formula for sharpening:
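$$g(x, y) = f(x, y) + c \left[ \nabla^2 f(x, y) \right]$$
where $f(x, y)$ is the original image, $\nabla^2 f(x, y)$ is the Laplacian, and $c$ is a constant (usually $1$ or $-1$).
Typical Masks:
A standard Laplacian mask (where $c = -1$ for sharpening) is:
$$\begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{bmatrix}$$
Another variation including diagonals:
$$\begin{bmatrix} 1 & 1 & 1 \\ 1 & -8 & 1 \\ 1 & 1 & 1 \end{bmatrix}$$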
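A minimal sketch of Laplacian sharpening follows. It assumes the 4-neighbor mask with center coefficient $-4$, which means the Laplacian response is subtracted from the original (c = -1). Border pixels are copied unchanged and results are clipped to the 8-bit range. These choices, along with the names `LAPLACIAN_4` and `sharpen`, are illustrative rather than prescribed by the text.

```python
# 4-neighbor Laplacian mask with a negative center coefficient.
LAPLACIAN_4 = [
    [0, 1, 0],
    [1, -4, 1],
    [0, 1, 0],
]


def sharpen(image, c=-1, levels=256):
    """Return image + c * Laplacian(image), clipped to [0, levels - 1]."""
    rows, cols = len(image), len(image[0])
    result = [row[:] for row in image]  # border pixels are left as-is
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            response = sum(LAPLACIAN_4[j][i] * image[y - 1 + j][x - 1 + i]
                           for j in range(3) for i in range(3))
            result[y][x] = min(levels - 1, max(0, image[y][x] + c * response))
    return result


# A vertical step edge from gray level 10 to gray level 100.
edge = [[10, 10, 100, 100] for _ in range(4)]
print(sharpen(edge)[1])  # [10, 0, 190, 100]
```

On the dark side of the edge the Laplacian is $+90$, and on the bright side it is $-90$. Subtracting it pushes the two sides further apart (0 versus 190 instead of 10 versus 100), which is the overshoot that makes edges look crisper. Flat regions have a zero response and are left untouched.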
Explain the fundamental concept of frequency domain filtering. Outline the basic steps involved in filtering an image in the frequency domain.
Concept of Frequency Domain Filtering:
Frequency domain filtering is based on the Fourier Transform. An image is transformed from the spatial domain (pixels) into the frequency domain (sinusoids). In the frequency domain, low frequencies correspond to smooth regions and global shapes, while high frequencies correspond to sharp details, edges, and noise. Filtering is achieved by multiplying the frequency representation of the image by a filter transfer function, which selectively attenuates or amplifies specific frequency bands.
Basic Steps:
- Preprocessing: Obtain the input image $f(x, y)$ of size $M \times N$. Pad the image to size $P \times Q$ (where $P = 2M$ and $Q = 2N$) to prevent wraparound error during circular convolution.
- Centering: Multiply the padded image by $(-1)^{x+y}$ to center its Fourier transform.
- Fourier Transform: Compute the Discrete Fourier Transform (DFT), $F(u, v)$, of the image.
- Filter Construction: Construct a real, symmetric filter transfer function $H(u, v)$ of size $P \times Q$ with its center at $(P/2, Q/2)$.
- Multiplication: Multiply the transform by the filter, element by element: $G(u, v) = H(u, v) F(u, v)$.
- Inverse Fourier Transform: Compute the Inverse DFT of $G(u, v)$ to get back to the spatial domain.
- Post-processing: Extract the real part of the inverse transform, multiply by $(-1)^{x+y}$ to undo the centering, and crop the image back to its original $M \times N$ size.
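The whole pipeline fits in a short script when the image is tiny. The sketch below uses a naive DFT written with the standard `cmath` module, which is far too slow for real images but keeps every stage visible. Zero padding to twice the image size, the example filter `ideal_lowpass`, and all function names are assumptions made for illustration.

```python
import cmath
import math


def dft2(data, inverse=False):
    """Naive 2-D DFT (or inverse DFT) of a list-of-rows array."""
    rows, cols = len(data), len(data[0])
    sign = 1 if inverse else -1
    out = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0j
            for x in range(rows):
                for y in range(cols):
                    angle = sign * 2 * math.pi * (u * x / rows + v * y / cols)
                    total += data[x][y] * cmath.exp(1j * angle)
            out[u][v] = total / (rows * cols) if inverse else total
    return out


def ideal_lowpass(rows, cols, cutoff):
    """Example transfer function: 1 within `cutoff` of the center, else 0."""
    return [[1.0 if math.hypot(u - rows / 2, v - cols / 2) <= cutoff else 0.0
             for v in range(cols)] for u in range(rows)]


def filter_in_frequency_domain(image, make_filter):
    m, n = len(image), len(image[0])
    p, q = 2 * m, 2 * n
    # Preprocessing and centering: zero-pad, then multiply by (-1)^(x+y).
    padded = [[(image[x][y] if x < m and y < n else 0) * (-1) ** (x + y)
               for y in range(q)] for x in range(p)]
    spectrum = dft2(padded)                      # Fourier transform
    transfer = make_filter(p, q)                 # filter construction
    product = [[transfer[u][v] * spectrum[u][v]  # element-wise multiplication
                for v in range(q)] for u in range(p)]
    restored = dft2(product, inverse=True)       # inverse Fourier transform
    # Post-processing: real part, undo centering, crop to the original size.
    return [[restored[x][y].real * (-1) ** (x + y) for y in range(n)]
            for x in range(m)]


image = [[8, 8, 8, 8] for _ in range(4)]
all_pass = filter_in_frequency_domain(image, lambda p, q: ideal_lowpass(p, q, 100))
dc_only = filter_in_frequency_domain(image, lambda p, q: ideal_lowpass(p, q, 0))
print([round(value, 6) for value in all_pass[0]])
print([round(value, 6) for value in dc_only[0]])
```

With a cutoff large enough to pass every frequency, the image comes back unchanged up to rounding error. With only the center (DC) term kept, every output pixel equals the average of the padded image, which is 128/64 = 2.0 here.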
Differentiate between Ideal Lowpass Filters (ILPF) and Butterworth Lowpass Filters (BLPF). Discuss the ringing effect.
Ideal Lowpass Filter (ILPF):
- Definition: An ILPF perfectly cuts off all high-frequency components beyond a specified cutoff frequency $D_0$, while allowing all low frequencies to pass unchanged.
- Profile: It has a sharp, rectangular transition between the passband (values = 1) and the stopband (values = 0).
- Ringing Effect: Because of the sharp discontinuity in the frequency domain, the inverse Fourier transform (its spatial representation) resembles a sinc function, which oscillates. When convolved with the image, this causes a severe "ringing" artifact (visible ripples around sharp edges).
Butterworth Lowpass Filter (BLPF):
- Definition: A BLPF uses a smooth transfer function that lacks a sharp discontinuity. The transition between passband and stopband is controlled by the filter's order $n$.
- Profile: Formula: $H(u, v) = \frac{1}{1 + [D(u, v) / D_0]^{2n}}$, where $D(u, v)$ is the distance from the center of the frequency rectangle. It falls off smoothly toward zero, passing through $0.5$ at $D(u, v) = D_0$.
- Ringing Effect: For low orders (like $n = 1$ or $2$), the transition is smooth enough that ringing is absent or generally imperceptible. As the order approaches infinity, the BLPF approaches the ideal filter, and ringing becomes apparent. Hence, a low-order BLPF is preferred over the ILPF as it minimizes unwanted visual artifacts.
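Evaluating both transfer functions along a radial line makes the difference in their transitions concrete. The sketch below takes the distance from the center of the frequency rectangle as its input and uses the standard Butterworth form, 1 / (1 + (d / cutoff)^(2 * order)). The function names and the sample distances are arbitrary illustrative choices.

```python
def ideal_lowpass(d, cutoff):
    """ILPF: pass everything up to the cutoff, block everything beyond it."""
    return 1.0 if d <= cutoff else 0.0


def butterworth_lowpass(d, cutoff, order):
    """BLPF: smooth roll-off whose steepness grows with `order`."""
    return 1.0 / (1.0 + (d / cutoff) ** (2 * order))


cutoff = 10.0
for d in (5.0, 10.0, 20.0):
    butterworth = [round(butterworth_lowpass(d, cutoff, n), 4) for n in (1, 2, 20)]
    print(d, ideal_lowpass(d, cutoff), butterworth)
```

At the cutoff itself the Butterworth response is 0.5 for every order. For order 1 it is still 0.8 at half the cutoff and 0.2 at twice the cutoff, a gentle slope. At order 20 the values on either side are already almost 1 and 0, which is the ideal filter's abrupt step and the reason high orders bring ringing back.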