Unit 2 - Notes
INT250
Unit 2: Understanding Hard Disks and File Systems
1. Different Types of Disk Drives
Digital forensics requires a deep understanding of physical storage media, as the physical mechanism of data storage dictates how data is recovered, wiped, or analyzed.
A. Hard Disk Drives (HDD)
HDDs are electromechanical data storage devices that store digital data using magnetic storage.
- Components:
- Platters: Circular disks coated with magnetic material. Data is stored on both sides.
- Spindle: The motor that spins the platters (typically at 5400, 7200, or 10,000 RPM).
- Read/Write Heads: Magnetic heads that float on an air cushion (air bearing) nanometers above the platter surface.
- Actuator Arm: Moves the heads across the platters to access different tracks.
- Data Storage Physics: Data is recorded by magnetizing ferromagnetic material directionally to represent binary bits (0s and 1s).
- Forensic Implication: Deleted data remains physically on the disk until overwritten. Magnetic remanence refers to the residual magnetization left behind after data has been overwritten; recovering data from it is largely theoretical on modern drives.
B. Solid State Drives (SSD)
SSDs use integrated circuit assemblies to store data persistently, typically using flash memory. They have no moving parts.
- Components:
- NAND Flash: Non-volatile storage memory cells.
- Controller: An embedded processor that manages data storage, error correction, and wear leveling.
- DRAM Cache: Used for buffering data.
- Key Concepts:
- Wear Leveling: The controller distributes writes across memory cells to prevent premature failure of specific blocks. This scatters data fragments physically, complicating physical extraction.
- TRIM: An OS command that tells the SSD which data blocks are no longer in use. Forensic Implication: When a file is deleted on a TRIM-enabled SSD, the controller may erase the affected cells soon afterward during garbage collection, which often makes file recovery impossible.
C. Hybrid Drives (SSHD)
- Combines a small amount of fast NAND flash memory with a traditional HDD.
- The firmware learns which files are most frequently accessed and caches them on the flash memory for speed.
- Forensic Implication: Artifacts may exist in the NAND cache that are not currently present or are different from the data on the spinning platter.
2. Logical Structure of a Disk
While the physical structure involves platters and chips, the logical structure involves how the Operating System (OS) addresses that space.
A. Addressing Methods
- CHS (Cylinder-Head-Sector): Old method. Addressed data by physical geometry.
- LBA (Logical Block Addressing): Modern method. The drive controller presents the storage as a linear sequence of blocks (usually 512 bytes or 4 KB), numbered 0 to N. The OS does not need to know the physical geometry.
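The relationship between the two addressing methods can be illustrated with the conventional CHS-to-LBA translation. This is a minimal sketch; the function name and the default geometry (16 heads, 63 sectors per track) are assumptions chosen for the example and do not come from these notes.

```python
def chs_to_lba(cylinder: int, head: int, sector: int,
               heads_per_cylinder: int = 16,
               sectors_per_track: int = 63) -> int:
    """Translate a CHS address into a linear block address.

    CHS sectors are numbered from 1, while cylinders, heads,
    and LBAs are numbered from 0.
    """
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector must be in 1..sectors_per_track")
    if not 0 <= head < heads_per_cylinder:
        raise ValueError("head out of range for this geometry")
    if cylinder < 0:
        raise ValueError("cylinder must be non-negative")
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


# The first sector of the disk (C=0, H=0, S=1) maps to LBA 0.
print(chs_to_lba(0, 0, 1))
# Moving to the next head skips one full track of sectors.
print(chs_to_lba(0, 1, 1))
```

With LBA the OS only ever sees the single integer on the right-hand side; the geometry arguments exist only inside the translation.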
B. Geometry Definitions
- Tracks: Concentric circles on a platter.
- Cylinders: The aggregate of all tracks at the same position across all platters.
- Sectors: The smallest addressable unit of storage (traditionally 512 bytes).
- Clusters (Allocation Units): A group of sectors managed by the file system. A file system reads/writes in clusters, not sectors.
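- Note: If a file is 1 KB and the cluster size is 4 KB, the remaining 3 KB of the cluster is not used by that file. This unused remainder is called Slack Space, and it may still hold remnants of older data.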
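The slack space arithmetic from the note above can be written out as a small calculation. This is an illustrative sketch; the function name and the 4096-byte default cluster size are assumptions for the example.

```python
def cluster_usage(file_size: int, cluster_size: int = 4096):
    """Return (clusters_allocated, slack_bytes) for a file of file_size bytes."""
    if file_size < 0 or cluster_size <= 0:
        raise ValueError("sizes must be non-negative and cluster_size positive")
    # Round up: a partially filled cluster is still allocated in full.
    clusters = -(-file_size // cluster_size)
    slack = clusters * cluster_size - file_size
    return clusters, slack


# A 1 KB file in a 4 KB cluster leaves 3 KB of slack.
clusters, slack = cluster_usage(1024, 4096)
print(clusters, slack)
```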
C. Partitioning Schemes
- MBR (Master Boot Record):
- Located at the first sector (LBA 0).
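- Supports a maximum disk size of about 2 TB (32-bit sector addresses with 512-byte sectors).
- Supports only 4 primary partitions (one of which may be an extended partition holding logical partitions).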
- GPT (GUID Partition Table):
- Part of the UEFI standard.
- Supports virtually unlimited disk size (64-bit sector addresses, in the Zettabyte range).
- Stores a backup copy of the header and partition table at the end of the disk for redundancy.
- Identifies partitions via a Globally Unique Identifier (GUID).
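To make the MBR layout concrete, the sketch below parses the four 16-byte partition entries that start at byte offset 446 of LBA 0 and checks the 0x55AA boot signature at offset 510. The helper names and the synthetic demo sector are assumptions for illustration; a real examination would read the first sector of a forensic image instead.

```python
import struct

SECTOR_SIZE = 512             # assumed logical sector size
PARTITION_TABLE_OFFSET = 446  # boot code occupies bytes 0..445
ENTRY_SIZE = 16


def parse_mbr(sector: bytes) -> list:
    """Return the non-empty primary partition entries of an MBR sector."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("an MBR occupies one full 512-byte sector")
    if sector[510:512] != b"\x55\xaa":
        raise ValueError("missing 0x55AA boot signature")

    partitions = []
    for index in range(4):  # exactly four primary slots
        start = PARTITION_TABLE_OFFSET + index * ENTRY_SIZE
        entry = sector[start:start + ENTRY_SIZE]
        # status, (3 CHS bytes skipped), type, (3 CHS bytes skipped),
        # starting LBA, sector count; multi-byte fields are little-endian.
        status, ptype, lba_start, sector_count = struct.unpack("<B3xB3xII", entry)
        if ptype == 0x00:  # type 0 marks an unused slot
            continue
        partitions.append({
            "slot": index,
            "active": status == 0x80,
            "type": ptype,
            "lba_start": lba_start,
            "sector_count": sector_count,
        })
    return partitions


def build_demo_mbr() -> bytes:
    """Build a synthetic MBR with one active partition (demo data only)."""
    sector = bytearray(SECTOR_SIZE)
    sector[446:462] = struct.pack("<B3xB3xII", 0x80, 0x07, 2048, 204800)
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


for part in parse_mbr(build_demo_mbr()):
    print(part)

# 32-bit sector counts with 512-byte sectors give the 2 TB ceiling.
print(2**32 * SECTOR_SIZE)
```

The 32-bit `lba_start` and `sector_count` fields are where the MBR size limit comes from; GPT widens these fields to 64 bits.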
3. Booting Process of Windows and Linux
Understanding the boot process helps forensic analysts identify bootkits, rootkits, and startup configurations.
A. The Boot Phases (General)
- Power On: PSU supplies power.
- POST (Power-On Self-Test): Hardware check.
- BIOS/UEFI Handover: The firmware looks for a boot device.
B. Windows Boot Process (BIOS/MBR Legacy)
- BIOS loads the MBR code.
- MBR finds the Active Partition and loads its Volume Boot Record (VBR).
- VBR loads BOOTMGR (Windows Boot Manager).
- BOOTMGR reads Boot Configuration Data (BCD).
- BOOTMGR runs winload.exe.
- Winload loads NTOSKRNL.EXE (Kernel) and hal.dll.
- SMSS.exe (Session Manager) starts, creating the user environment.
C. Windows Boot Process (UEFI/GPT)
- UEFI firmware reads the EFI System Partition (ESP).
- Loads \EFI\Microsoft\Boot\bootmgfw.efi.
- Proceeds to load Windows Kernel.
D. Linux Boot Process
- BIOS/UEFI loads the Bootloader (usually GRUB2).
- GRUB2 allows user selection and loads the Linux Kernel (vmlinuz) and the initrd (Initial RAM Disk).
- Kernel initializes hardware and mounts the root file system.
- Init System starts (the first process, PID 1).
- Old: SysVinit (sequentially runs scripts).
- Modern: Systemd (parallel service startup).
4. Various File Systems of Windows and Linux
The file system determines how data is indexed, stored, and retrieved.
A. Windows File Systems
- FAT (File Allocation Table) - FAT12/FAT16/FAT32:
- Structure: Reserved Area, FAT Area (linked list of clusters), Root Directory, Data Area.
- Limitations: FAT32 has a max file size of 4GB. No built-in security permissions.
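- Forensics: When a file is deleted, the first byte of its directory entry (the first character of the filename) is replaced with 0xE5 (sigma), and the file's FAT entries are zeroed out. The file content itself stays in the Data Area until it is overwritten.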
- exFAT (Extended FAT): Optimized for flash drives; removes the 4GB file limit.
- NTFS (New Technology File System): Standard for modern Windows.
- MFT (Master File Table): The heart of NTFS. Every file (including the MFT itself) is a record in the MFT.
- Attributes: Everything is an attribute (File Name, Standard Information, Data). Small files are stored entirely within the MFT entry (Resident Data).
- Journaling: Keeps a log ($LogFile) of changes to maintain integrity. Excellent for forensic timeline reconstruction.
- ADS (Alternate Data Streams): Allows hiding data behind a file (e.g., file.txt:hidden.exe).
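The FAT deletion marker described above can be detected by walking the 32-byte directory entries and checking the first byte. The sketch below is a simplified illustration, not a full FAT parser: the helper names and the demo entries are assumptions, long-file-name entries are skipped, and the field offsets follow the standard short (8.3) directory entry layout.

```python
import struct

ENTRY_SIZE = 32
DELETED_MARKER = 0xE5
LFN_ATTRIBUTE = 0x0F


def scan_fat_directory(raw: bytes) -> list:
    """List short-name entries in a raw FAT directory, flagging deleted ones."""
    entries = []
    for offset in range(0, len(raw) - ENTRY_SIZE + 1, ENTRY_SIZE):
        entry = raw[offset:offset + ENTRY_SIZE]
        if entry[0] == 0x00:            # 0x00 means no further entries
            break
        if entry[11] == LFN_ATTRIBUTE:  # skip long-file-name entries
            continue
        deleted = entry[0] == DELETED_MARKER
        name = entry[0:8]
        if deleted:
            name = b"?" + name[1:]      # the first character is lost
        base = name.decode("ascii", errors="replace").rstrip()
        ext = entry[8:11].decode("ascii", errors="replace").rstrip()
        (cluster_hi,) = struct.unpack_from("<H", entry, 20)
        cluster_lo, size = struct.unpack_from("<HI", entry, 26)
        entries.append({
            "name": f"{base}.{ext}" if ext else base,
            "deleted": deleted,
            "first_cluster": (cluster_hi << 16) | cluster_lo,
            "size": size,
        })
    return entries


def make_entry(name83: bytes, first_cluster: int, size: int) -> bytes:
    """Build a synthetic 8.3 directory entry (demo data only)."""
    entry = bytearray(ENTRY_SIZE)
    entry[0:11] = name83
    entry[11] = 0x20  # archive attribute
    struct.pack_into("<H", entry, 20, first_cluster >> 16)
    struct.pack_into("<HI", entry, 26, first_cluster & 0xFFFF, size)
    return bytes(entry)


demo = (make_entry(b"REPORT  TXT", 5, 100)
        + make_entry(b"\xe5ECRET  DOC", 9, 2048)
        + bytes(ENTRY_SIZE))
for item in scan_fat_directory(demo):
    print(item)
```

Because the starting cluster and size survive in the directory entry, a deleted file can often be recovered when its clusters were contiguous and have not been reused.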
B. Linux File Systems
- ext (Extended File System) - ext2/ext3/ext4:
- Inodes (Index Nodes): The fundamental metadata structure. Contains owner, permissions, timestamps, and pointers to data blocks. The filename is not in the inode; it is in the directory entry.
- Superblock: Contains global file system info (block size, total blocks).
- Journaling: ext3 and ext4 are journaling file systems; ext2 has no journal.
- XFS: High-performance 64-bit journaling file system, common in enterprise Linux (RHEL).
- Btrfs (B-tree FS): Focuses on fault tolerance, repair, and snapshotting.
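The point that the filename lives in the directory entry rather than in the inode can be observed directly: renaming a file within a directory changes the directory entry but leaves the inode number unchanged. The sketch below assumes a typical POSIX file system for the temporary directory; the function name and the file names are illustrative only.

```python
import os
import tempfile


def inode_before_and_after_rename():
    """Create a file, rename it, and return its inode number before and after."""
    with tempfile.TemporaryDirectory() as workdir:
        old_path = os.path.join(workdir, "original.txt")
        new_path = os.path.join(workdir, "renamed.txt")
        with open(old_path, "w") as handle:
            handle.write("evidence")
        before = os.stat(old_path).st_ino   # inode number via the old name
        os.rename(old_path, new_path)       # only the directory entry changes
        after = os.stat(new_path).st_ino    # same inode via the new name
        return before, after


before, after = inode_before_and_after_rename()
print(before == after)
```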
5. Examining File System Using Autopsy
Autopsy is an open-source digital forensics platform and a GUI for The Sleuth Kit (TSK).
A. Key Functions
- Case Management: Creates a structured environment to hold disk images and analysis results.
- Ingest Modules: Automated scripts that run against the data source:
- Hash Lookup: Identifies known bad (malware) or known good (OS files) using MD5/SHA1.
- Keyword Search: Indexes text for searching.
- Recent Activity: Extracts registry data, browser history, and recent documents.
- File Type Identification: Identifies files by magic numbers/signatures, not just extensions.
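The idea behind the Hash Lookup module can be sketched with Python's standard hashlib: compute a digest of each file and compare it against sets of known hashes. This is not Autopsy's implementation; the function names and the in-memory hash sets are assumptions for illustration, and real hash sets are loaded from reference databases.

```python
import hashlib


def file_digests(data: bytes) -> dict:
    """Return the MD5 and SHA-1 hex digests of a file's content."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
    }


def classify(data: bytes, known_bad: set, known_good: set) -> str:
    """Label content as 'known bad', 'known good', or 'unknown' by MD5."""
    md5 = file_digests(data)["md5"]
    if md5 in known_bad:
        return "known bad"
    if md5 in known_good:
        return "known good"
    return "unknown"


# Demo hash sets built from sample content (hypothetical data).
malware_sample = b"pretend this is malware"
os_file = b"pretend this is a system file"
known_bad = {hashlib.md5(malware_sample).hexdigest()}
known_good = {hashlib.md5(os_file).hexdigest()}

print(classify(malware_sample, known_bad, known_good))
print(classify(os_file, known_bad, known_good))
print(classify(b"user document", known_bad, known_good))
```

Filtering out known good files this way reduces the amount of data an analyst has to review manually.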
B. Analysis Views
- Tree View: Navigates the file hierarchy (including deleted files, often marked with a red 'X').
- Result Viewer: Displays detailed lists of artifacts (e.g., "Web History", "Installed Programs").
- Content Viewer:
- Hex: View raw binary.
- Text: View ASCII/Unicode strings.
- Metadata: View inode/MFT data and timestamps (MAC times: Modified, Accessed, Changed/Created).
6. Understanding Storage Systems
Forensic analysts often encounter storage configurations more complex than a single hard drive.
A. RAID (Redundant Array of Independent Disks)
Combines multiple physical disks into a single logical unit.
- RAID 0 (Striping): Data is split across drives. Fast, but no redundancy. If one drive fails, all data is lost.
- RAID 1 (Mirroring): Data is duplicated on two drives. High redundancy.
- RAID 5 (Striping with Parity): Requires min. 3 drives. Distributed parity allows recovery if one drive fails.
- Forensic Implication: To analyze a RAID array, the analyst must acquire all disks and reconstruct the array virtually in software (such as EnCase or X-Ways), supplying the correct stripe size and disk order.
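RAID 5 recovery rests on XOR parity: the parity block is the XOR of the data blocks in a stripe, so any single missing block equals the XOR of the remaining blocks. The sketch below shows this for one stripe with two data blocks; the function name and the sample data are assumptions for illustration, and real arrays also rotate the parity position across disks.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized blocks together byte by byte."""
    if not blocks:
        raise ValueError("at least one block is required")
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for index, byte in enumerate(block):
            result[index] ^= byte
    return bytes(result)


# One stripe on a three-disk array: two data blocks plus parity.
disk0 = b"EVID"
disk1 = b"ENCE"
parity = xor_blocks(disk0, disk1)

# Simulate losing disk1 and rebuilding it from the survivors.
rebuilt = xor_blocks(disk0, parity)
print(rebuilt)
```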
B. NAS (Network Attached Storage)
- A dedicated file storage appliance connected to a network.
- Operates at the File Level (NFS, SMB/CIFS).
- Appears as a mapped network drive.
C. SAN (Storage Area Network)
- A dedicated high-speed network that provides block-level network access to storage.
- Operates at the Block Level (iSCSI, Fibre Channel).
- Appears to the server as a local disk, not a network drive.
7. Understanding Encoding Standards and Hex Editors
A. Encoding Standards
Computers store data as bits; encoding maps these bits to human-readable characters.
- ASCII: 7-bit encoding. Represents English characters (0-127).
- Unicode: Handles global languages.
- UTF-8: Variable width (1 to 4 bytes). Standard for the web.
- UTF-16: Uses 2 or 4 bytes. Common in Windows environments.
- Endianness: The order of bytes.
- Big Endian: Most significant byte first (e.g., network traffic).
- Little Endian: Least significant byte first (e.g., Intel/Windows). 0x1234 is stored as 34 12.
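The byte layouts above can be checked with a few lines of Python. This sketch uses the standard struct module and str.encode; the helper name hex_bytes is an assumption for the example.

```python
import struct


def hex_bytes(data: bytes) -> str:
    """Format bytes the way a hex editor shows them."""
    return " ".join(f"{byte:02X}" for byte in data)


value = 0x1234
print(hex_bytes(struct.pack("<H", value)))  # little endian byte order
print(hex_bytes(struct.pack(">H", value)))  # big endian byte order

# The same characters occupy different bytes under different encodings.
euro = "\u20ac"  # the euro sign, outside the ASCII range
print(hex_bytes("A".encode("utf-8")))
print(hex_bytes("A".encode("utf-16-le")))
print(hex_bytes(euro.encode("utf-8")))
print(hex_bytes(euro.encode("utf-16-le")))
```

When searching a Windows image for text, the keyword may therefore need to be searched in both its ASCII/UTF-8 and UTF-16 forms.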
B. Hex Editors
Hex editors are tools that allow a user to view and edit the raw binary data of a file (e.g., HxD, WinHex).
- Hexadecimal Notation: Base-16 system (0-9, A-F). Used because it represents binary data more concisely (1 Hex digit = 4 bits; 2 Hex digits = 1 Byte).
- File Signatures (Magic Numbers): The first few bytes of a file that identify its format, regardless of the file extension.
- JPEG: FF D8 FF
- PNG: 89 50 4E 47
- ZIP/docx/xlsx: 50 4B 03 04 (PK..)
- PDF: 25 50 44 46 (%PDF)
- Forensic Use:
- Carving: Recovering files based on headers/footers when the file system table is corrupted.
- Verification: Checking if a user changed a file extension to hide data (e.g., renaming bomb_plans.doc to holiday.jpg). A hex editor reveals the true header.
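The verification step can be automated by comparing a file's leading bytes against the signatures listed above. This is a minimal sketch covering only those four signatures; the function names, the extension table, and the demo headers are assumptions for illustration, and real tools use much larger signature databases.

```python
import os
from typing import Optional

# Signatures taken from the list in these notes.
SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG": "png",
    b"PK\x03\x04": "zip",
    b"%PDF": "pdf",
}

# Which signature family each extension is expected to carry.
EXTENSION_TYPES = {
    ".jpg": "jpeg", ".jpeg": "jpeg",
    ".png": "png",
    ".zip": "zip", ".docx": "zip", ".xlsx": "zip",
    ".pdf": "pdf",
}


def identify(header: bytes) -> Optional[str]:
    """Return the format whose magic number starts the header, if any."""
    for magic, file_type in SIGNATURES.items():
        if header.startswith(magic):
            return file_type
    return None


def extension_mismatch(filename: str, header: bytes) -> bool:
    """True when a known extension does not match the detected signature."""
    extension = os.path.splitext(filename)[1].lower()
    expected = EXTENSION_TYPES.get(extension)
    if expected is None:
        return False  # extension not covered by this small table
    return identify(header) != expected


# A PDF renamed to look like a photo is flagged; a real JPEG is not.
print(extension_mismatch("holiday.jpg", b"%PDF-1.7\n"))
print(extension_mismatch("photo.jpg", b"\xff\xd8\xff\xe0\x00\x10JFIF"))
```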