Unit 6 - Notes
CSE306
Unit 6: TRANSPORT LAYER, APPLICATION LAYER & RECENT TRENDS
1. Transport Layer Services
The Transport Layer is the fourth layer of the OSI model. Its primary function is to provide process-to-process communication, ensuring that data transferred between two devices reaches the specific application process (e.g., a browser tab or an email client) running on those devices.
Core Services
- Process-to-Process Delivery (Port Addressing):
  - The Data Link layer handles node-to-node delivery (MAC addresses).
  - The Network layer handles host-to-host delivery (IP addresses).
  - The Transport layer handles delivery to a specific process using Port Numbers (16-bit addresses).
  - Well-known ports: 0–1023 (e.g., HTTP: 80, HTTPS: 443).
- Multiplexing and Demultiplexing:
  - Multiplexing (Sender side): Gathering data chunks from different sockets, encapsulating each with a header, and passing segments to the network layer.
  - Demultiplexing (Receiver side): Delivering received segments to the correct socket based on port numbers.
- Segmentation and Reassembly:
  - The transport layer divides the stream of data received from the application layer into manageable data units called Segments.
  - Each segment contains a sequence number to allow the receiver to reassemble the message correctly and identify lost packets.
- Flow Control:
  - Prevents the sender from overwhelming the receiver if the receiver is processing data slower than the sender is transmitting.
  - TCP uses a sliding window mechanism for this.
- Error Control:
  - Ensures the entire message arrives without error (damage, loss, or duplication).
  - Mechanisms include checksums and Automatic Repeat Request (ARQ) retransmission strategies.
- Connection Control:
  - Connection-Oriented: Establishes a connection (handshake) before sending data (e.g., TCP).
  - Connectionless: Sends data without setup; no guarantee of delivery (e.g., UDP).
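The segmentation and reassembly service above can be illustrated with a small sketch. This is a toy model, not how any real TCP stack is written: the function names `segment` and `reassemble` and the 8-byte segment size are assumptions made for the example, and the sequence number is simply the byte offset of each chunk.

```python
import random

def segment(data: bytes, mss: int) -> list[tuple[int, bytes]]:
    """Split data into (sequence_number, payload) pairs of at most mss bytes."""
    return [(offset, data[offset:offset + mss]) for offset in range(0, len(data), mss)]

def reassemble(segments: list[tuple[int, bytes]]) -> bytes:
    """Order payloads by sequence number and join them, ignoring duplicates."""
    unique = dict(segments)  # a repeated sequence number collapses to one entry
    return b"".join(unique[seq] for seq in sorted(unique))

message = b"transport layer segments arrive out of order"
segments = segment(message, mss=8)

arrived = segments + [segments[0]]  # simulate one duplicated segment
random.shuffle(arrived)             # simulate out-of-order arrival
restored = reassemble(arrived)
```

Because every segment carries its own sequence number, the receiver can rebuild the original byte stream regardless of arrival order. A gap in the sequence numbers would indicate a lost segment.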
2. TCP (Transmission Control Protocol)
TCP is a connection-oriented, reliable transport protocol. It provides a full-duplex, byte-stream service.
TCP Header Format
The TCP header is variable in length (minimum 20 bytes, maximum 60 bytes with options).
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |       Destination Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Sequence Number                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Acknowledgment Number                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Data |           |U|A|P|R|S|F|                               |
| Offset| Reserved  |R|C|S|S|Y|I|            Window             |
|       |           |G|K|H|T|N|N|                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Checksum            |         Urgent Pointer        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Options                    |    Padding    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Field Descriptions:
- Source/Destination Port (16 bits each): Identifies the sending and receiving applications.
- Sequence Number (32 bits): If the SYN flag is set, this is the initial sequence number. Otherwise, it is the accumulated sequence number of the first data byte in this segment.
- Acknowledgment Number (32 bits): If the ACK flag is set, this is the next sequence number that the sender of this segment expects to receive from the other side.
- Data Offset (HLEN) (4 bits): Indicates the size of the TCP header in 32-bit words. Minimum value is 5 (5 * 4 = 20 bytes).
- Reserved (6 bits): For future use (set to 0).
- Control Flags (6 bits):
  - URG: Urgent pointer field is significant.
  - ACK: Acknowledgment field is significant.
  - PSH: Push function (asks receiver to push data to application immediately).
  - RST: Reset the connection (abort).
  - SYN: Synchronize sequence numbers (initiate connection).
  - FIN: No more data from sender (terminate connection).
- Window Size (16 bits): Used for flow control. Specifies the number of bytes the sender of this segment is currently willing to accept (Receive Window).
- Checksum (16 bits): Used for error detection (covers header, data, and a pseudo-header).
- Urgent Pointer (16 bits): Points to the end of urgent data if URG flag is set.
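The field layout above can be decoded with Python's `struct` module. The following is a minimal sketch that assumes the classic layout shown in these notes (4-bit data offset, 6 reserved bits, 6 flags); the function name `parse_tcp_header` and the sample values are made up for illustration, and options are not parsed.

```python
import struct

FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]  # bit 0 (lowest) is FIN

def parse_tcp_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (network byte order)."""
    (src, dst, seq, ack, offset_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", raw[:20])
    data_offset = offset_flags >> 12   # top 4 bits: header length in 32-bit words
    flag_bits = offset_flags & 0x3F    # low 6 bits: URG ACK PSH RST SYN FIN
    flags = [name for bit, name in enumerate(FLAG_NAMES) if flag_bits & (1 << bit)]
    return {
        "src_port": src, "dst_port": dst,
        "seq": seq, "ack": ack,
        "header_bytes": data_offset * 4,
        "flags": flags,
        "window": window, "checksum": checksum, "urgent": urgent,
    }

# A hand-built SYN segment: data offset 5 (20 bytes), only the SYN bit set.
sample = struct.pack("!HHIIHHHH", 50000, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
parsed = parse_tcp_header(sample)
```

Here `parsed["header_bytes"]` is the data offset multiplied by 4, matching the rule that a minimum offset of 5 means a 20-byte header.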
TCP Handshaking Operation
1. Connection Establishment (Three-Way Handshake)
Used to initialize sequence numbers and allocate buffers.
- SYN (Client → Server):
  - The client sets the SYN flag to 1.
  - It generates a random Initial Sequence Number (ISN), say x.
  - State: Client enters SYN-SENT.
- SYN + ACK (Server → Client):
  - The server receives the SYN and sets the SYN and ACK flags to 1.
  - It acknowledges the client's ISN by sending Ack = x + 1.
  - It generates its own ISN, say y.
  - State: Server enters SYN-RCVD.
- ACK (Client → Server):
  - The client sets the ACK flag to 1.
  - It acknowledges the server's ISN by sending Ack = y + 1.
  - Sequence number is now x + 1.
  - State: Connection Established.
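The sequence and acknowledgment arithmetic of the three steps can be traced with a short sketch. It only models the numbers, not real sockets; the function name `three_way_handshake` and the dictionary layout are assumptions for the example. Sequence numbers are 32-bit, so the additions wrap around modulo 2^32.

```python
SEQ_SPACE = 2 ** 32  # sequence numbers are 32-bit and wrap around

def three_way_handshake(client_isn: int, server_isn: int) -> list[dict]:
    """Return the three segments of the handshake for the given ISNs."""
    syn = {"from": "client", "flags": {"SYN"},
           "seq": client_isn, "ack": None}
    syn_ack = {"from": "server", "flags": {"SYN", "ACK"},
               "seq": server_isn, "ack": (client_isn + 1) % SEQ_SPACE}
    ack = {"from": "client", "flags": {"ACK"},
           "seq": (client_isn + 1) % SEQ_SPACE,
           "ack": (server_isn + 1) % SEQ_SPACE}
    return [syn, syn_ack, ack]

# Example with x = 100 and y = 300.
for seg in three_way_handshake(client_isn=100, server_isn=300):
    print(seg["from"], sorted(seg["flags"]), "seq =", seg["seq"], "ack =", seg["ack"])
```

With x = 100 and y = 300, the server acknowledges 101 and the client acknowledges 301, because the SYN flag itself consumes one sequence number.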
2. Connection Termination (Four-Way Handshake)
Either side can initiate termination.
- FIN (Client → Server): Client sends a segment with the FIN flag set. (Client enters FIN-WAIT-1).
- ACK (Server → Client): Server acknowledges the FIN. (Server enters CLOSE-WAIT; Client enters FIN-WAIT-2). The connection is now "half-closed" (Server can still send data, but Client cannot).
- FIN (Server → Client): Once the server is done sending data, it sends its own FIN. (Server enters LAST-ACK).
- ACK (Client → Server): Client acknowledges the server's FIN. (Client enters TIME-WAIT to ensure the ACK reaches the server; Server closes).
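The state changes during termination can be written as a small transition table. This sketch covers only the ordinary four-step close described above (simultaneous close and resets are left out), and the event labels such as "send FIN" are names chosen for this example.

```python
# (current state, event) -> next state, for the ordinary four-step close only.
TRANSITIONS = {
    # Side that closes first (the client in the steps above)
    ("ESTABLISHED", "send FIN"): "FIN-WAIT-1",
    ("FIN-WAIT-1", "recv ACK"): "FIN-WAIT-2",
    ("FIN-WAIT-2", "recv FIN"): "TIME-WAIT",   # the final ACK is sent here
    ("TIME-WAIT", "timeout"): "CLOSED",
    # Side that closes second (the server in the steps above)
    ("ESTABLISHED", "recv FIN"): "CLOSE-WAIT",  # the FIN is acknowledged here
    ("CLOSE-WAIT", "send FIN"): "LAST-ACK",
    ("LAST-ACK", "recv ACK"): "CLOSED",
}

def run(state: str, events: list[str]) -> str:
    """Apply events in order and return the final state."""
    for event in events:
        state = TRANSITIONS[(state, event)]
    return state

client_final = run("ESTABLISHED", ["send FIN", "recv ACK", "recv FIN", "timeout"])
server_final = run("ESTABLISHED", ["recv FIN", "send FIN", "recv ACK"])
```

Both sides end in CLOSED, but only the side that sent the first FIN passes through TIME-WAIT.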
3. UDP (User Datagram Protocol)
UDP is a connectionless, unreliable transport protocol. It adds minimal addressing and checksumming to IP but provides no guarantees of delivery, ordering, or duplicate protection.
UDP Characteristics
- No Connection Setup: No handshaking delay (lower latency).
- Small Header: Only 8 bytes (lower overhead compared to TCP's minimum of 20 bytes).
- No Flow/Congestion Control: UDP pumps data as fast as desired (useful for real-time apps).
UDP Header Format (8 Bytes)
 0      7 8     15 16    23 24    31
+--------+--------+--------+--------+
|     Source      |   Destination   |
|      Port       |      Port       |
+--------+--------+--------+--------+
|                 |                 |
|     Length      |    Checksum     |
+--------+--------+--------+--------+
- Source Port (16 bits): Optional. If used, indicates the port of the sending process.
- Destination Port (16 bits): Required. Indicates the port of the receiving process.
- Length (16 bits): The length of the user datagram (header + data) in bytes. Minimum value is 8.
- Checksum (16 bits): Used to detect errors over the header and data. Optional in IPv4, mandatory in IPv6.
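The four header fields can be packed and unpacked with `struct`. This is a minimal sketch: the function names and port numbers are illustrative, and the checksum is left at 0, which in IPv4 means "not computed" (a real IPv6 sender would have to compute it over a pseudo-header, the UDP header, and the data).

```python
import struct

UDP_HEADER = struct.Struct("!HHHH")  # source port, destination port, length, checksum

def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prepend an 8-byte UDP header to the payload."""
    length = UDP_HEADER.size + len(payload)  # header + data, in bytes
    checksum = 0                             # 0 = no checksum (allowed in IPv4 only)
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload

def parse_udp_header(datagram: bytes) -> dict:
    """Decode the first 8 bytes of a UDP datagram."""
    src, dst, length, checksum = UDP_HEADER.unpack(datagram[:UDP_HEADER.size])
    return {"src_port": src, "dst_port": dst, "length": length, "checksum": checksum}

datagram = build_udp_datagram(5353, 53, b"hello")
header = parse_udp_header(datagram)
```

A datagram with an empty payload has a Length of exactly 8, which is the minimum value noted above.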
4. Domain Name System (DNS)
DNS is the "phonebook" of the Internet. It translates human-friendly domain names (e.g., www.google.com) into IP addresses (e.g., 142.250.190.46).
Hierarchy
DNS uses a distributed, hierarchical database:
- Root DNS Servers: The top of the tree (represented by a dot "."). There are 13 logical root servers (named a through m under root-servers.net), each operated as many physical machines worldwide using anycast.
- Top-Level Domain (TLD) Servers: Handle domains like .com, .org, .edu, .uk.
- Authoritative DNS Servers: Servers that actually hold the records for a specific domain (e.g., Google's DNS servers holding records for google.com).
How Resolution Works
- Recursive Query: The client asks the Local DNS Server (Resolver) "What is the IP of X?" and expects the final answer. The resolver does the legwork.
- Iterative Query: The resolver asks a Root/TLD server, which replies "I don't know, but here is the IP of the server that might know." The resolver then queries that next server.
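The iterative part of resolution can be simulated without any network access. In this sketch the server names ("root", "tld-com", "ns-example"), the referral table, and the address 192.0.2.10 (from a documentation-only range) are all invented for the example; a real resolver speaks the DNS wire protocol and caches results.

```python
# Toy model: each "server" either answers a name or refers to a more specific server.
SERVERS = {
    "root":       {"referrals": {"com": "tld-com"}},
    "tld-com":    {"referrals": {"example.com": "ns-example"}},
    "ns-example": {"answers": {"www.example.com": "192.0.2.10"}},
}

def resolve_iteratively(name: str) -> tuple[str, list[str]]:
    """Follow referrals from the root; return the address and the servers asked."""
    server = "root"
    asked = []
    while True:
        asked.append(server)
        zone = SERVERS[server]
        if name in zone.get("answers", {}):
            return zone["answers"][name], asked
        for suffix, next_server in zone.get("referrals", {}).items():
            if name == suffix or name.endswith("." + suffix):
                server = next_server  # "I don't know, ask this server instead"
                break
        else:
            raise LookupError(f"{server} has no referral for {name}")

address, asked = resolve_iteratively("www.example.com")
```

The list `asked` records the order root, TLD, authoritative, which is the legwork the local resolver performs on behalf of a client that sent a recursive query.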
Common DNS Records
- A Record: Maps a hostname to an IPv4 address.
- AAAA Record: Maps a hostname to an IPv6 address.
- CNAME (Canonical Name): Maps an alias to a true domain name (e.g., www.example.com -> example.com).
- MX (Mail Exchange): Specifies the mail server responsible for accepting email messages on behalf of a domain.
- NS (Name Server): Specifies the authoritative name servers for the domain.
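A lookup that follows CNAME aliases shows how the record types fit together. The zone data below is hypothetical (the addresses come from documentation-only ranges), and the function name `lookup` and the hop limit are choices made for this sketch.

```python
# Hypothetical zone data keyed by (name, record type).
RECORDS = {
    ("example.com", "A"): ["192.0.2.10"],
    ("example.com", "AAAA"): ["2001:db8::10"],
    ("www.example.com", "CNAME"): ["example.com"],
    ("example.com", "MX"): ["10 mail.example.com"],
    ("example.com", "NS"): ["ns1.example.com"],
}

def lookup(name: str, rtype: str, max_hops: int = 8) -> list[str]:
    """Return records of the given type, following CNAME aliases if needed."""
    for _ in range(max_hops):
        if (name, rtype) in RECORDS:
            return RECORDS[(name, rtype)]
        alias = RECORDS.get((name, "CNAME"))
        if alias is None:
            return []        # no such record and no alias to follow
        name = alias[0]      # continue the lookup at the canonical name
    raise RuntimeError("CNAME chain too long")

www_ipv4 = lookup("www.example.com", "A")
mail_hosts = lookup("example.com", "MX")
```

Asking for the A record of the alias www.example.com returns the address stored under the canonical name example.com.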
5. E-Mail (Electronic Mail)
Email architecture relies on three major components and specific protocols for sending and retrieving messages.
Architecture Components
- User Agent (UA): The software users interact with (e.g., Outlook, Gmail web interface). Used to compose and read mail.
- Mail Transfer Agent (MTA): Server software responsible for moving mail from the sender's server to the recipient's server.
- Message Access Agent (MAA): Pulls the email from the server to the client's inbox.
Protocols
- SMTP (Simple Mail Transfer Protocol):
  - Port: 25 (or 587 for submission).
  - Function: Push protocol. Used by the sender's UA to send mail to the sender's MTA, and by the sender's MTA to transfer mail to the recipient's MTA.
  - Text-based, command-response protocol (e.g., HELO, MAIL FROM, RCPT TO, DATA).
- POP3 (Post Office Protocol v3):
  - Port: 110.
  - Function: Pull protocol. Downloads email from the server to the local device.
  - Behavior: Usually deletes the mail from the server after downloading (poor for multi-device access).
- IMAP (Internet Message Access Protocol):
  - Port: 143.
  - Function: Pull protocol.
  - Behavior: Syncs email. Keeps mail on the server and organizes it in folders. The client manipulates the view of the server. Good for multi-device access.
- MIME (Multipurpose Internet Mail Extensions): Extends SMTP (which originally only supported 7-bit ASCII text) to allow non-text attachments (images, video, audio).
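MIME's role can be seen by building a message with Python's standard `email` package. In this sketch the addresses, subject, and file name are placeholders, and the attachment is just the 8-byte PNG signature rather than a real image. Nothing is sent over the network.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "alice@example.com"   # placeholder addresses
msg["To"] = "bob@example.com"
msg["Subject"] = "Lab report"
msg.set_content("The report is attached.")  # plain-text body

# Adding binary data turns the message into a multipart MIME message.
png_signature = b"\x89PNG\r\n\x1a\n"
msg.add_attachment(png_signature, maintype="image", subtype="png", filename="plot.png")

attachment = next(msg.iter_attachments())
wire_format = msg.as_string()  # the text that would follow SMTP's DATA command
```

Printing `wire_format` shows that the binary attachment travels as base64 text inside a `multipart/mixed` container, which is how non-text content fits through a text-based protocol. To actually submit the message, `smtplib.SMTP` with `send_message` could be used against a real mail server, which this example does not assume.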
6. FTP (File Transfer Protocol)
FTP is a standard network protocol used for the transfer of computer files between a client and server on a computer network.
Architecture
FTP uses a Client-Server model and is unique because it establishes two separate TCP connections:
- Control Connection (Port 21):
  - Used for sending commands (e.g., login, change directory, list files) and receiving responses.
  - Remains open for the entire session ("stateful").
- Data Connection (Port 20 in active mode):
  - Used strictly for transferring the actual file data.
  - Opened and closed for each file transfer.
  - The server uses port 20 only in active mode; in passive mode the data connection goes to a port chosen by the server.
Modes of Operation
- Active Mode: The client opens a port and tells the server. The Server initiates the data connection back to the client. (Often blocked by client-side firewalls).
- Passive Mode: The server opens a random port and tells the client. The Client initiates the data connection to the server. (Firewall friendly).
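In both modes the address and port for the data connection are exchanged over the control connection as six decimal numbers: four for the IPv4 address and two for the port, where port = p1 × 256 + p2. The sketch below decodes a passive-mode reply and formats an active-mode PORT command; the function names and the sample addresses are assumptions for the example.

```python
import re

def parse_pasv_reply(reply: str) -> tuple[str, int]:
    """Extract (host, port) from a '227 Entering Passive Mode (...)' reply."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError(f"not a passive-mode reply: {reply!r}")
    h1, h2, h3, h4, p1, p2 = (int(group) for group in match.groups())
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2

def format_port_command(host: str, port: int) -> str:
    """Build the PORT command a client sends in active mode."""
    fields = host.split(".") + [str(port // 256), str(port % 256)]
    return "PORT " + ",".join(fields)

server_endpoint = parse_pasv_reply("227 Entering Passive Mode (192,0,2,5,195,80)")
port_command = format_port_command("192.0.2.7", 50001)
```

In passive mode the client connects to `server_endpoint`; in active mode the server connects back to the address the client announced in `port_command`.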
7. Software Defined Wide Area Network (SD-WAN)
SD-WAN is a virtual WAN architecture that allows enterprises to leverage any combination of transport services (MPLS, LTE/5G, Broadband) to securely connect users to applications.
Key Concepts
- Decoupling Control and Data Planes:
  - Unlike traditional routers where control logic is in every device, SD-WAN centralizes the control logic (Control Plane) in a software controller.
  - The hardware devices (Data Plane) simply forward traffic based on policies from the controller.
- Transport Independence:
  - Can overlay on top of any physical connection (MPLS, Internet, 4G). It creates a virtual overlay network.
- Application-Aware Routing:
  - Intelligent path selection.
  - Example: Critical VoIP traffic might be routed via MPLS, while bulk file backups are routed via cheap broadband. If the MPLS line fails or degrades (jitter/latency), the software automatically switches VoIP to broadband.
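The VoIP example above can be sketched as a policy check over measured link quality. Everything here is hypothetical: the thresholds, the preference order, and the names `Link`, `POLICIES`, and `select_path` are invented for illustration and do not correspond to any vendor's SD-WAN API.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Link:
    name: str
    latency_ms: float
    jitter_ms: float
    loss_pct: float
    up: bool = True

# Hypothetical per-application limits and preferred transport order.
POLICIES = {
    "voip":   {"max_latency_ms": 150, "max_jitter_ms": 30, "max_loss_pct": 1.0},
    "backup": {"max_latency_ms": 1000, "max_jitter_ms": 500, "max_loss_pct": 5.0},
}
PREFERRED = {"voip": ["mpls", "broadband"], "backup": ["broadband", "mpls"]}

def meets_policy(link: Link, policy: dict) -> bool:
    """True if the link is up and within every limit of the policy."""
    return (link.up
            and link.latency_ms <= policy["max_latency_ms"]
            and link.jitter_ms <= policy["max_jitter_ms"]
            and link.loss_pct <= policy["max_loss_pct"])

def select_path(app: str, links: list[Link]) -> Optional[str]:
    """Return the first preferred link that satisfies the application's policy."""
    by_name = {link.name: link for link in links}
    for name in PREFERRED[app]:
        link = by_name.get(name)
        if link is not None and meets_policy(link, POLICIES[app]):
            return name
    return None

healthy = [Link("mpls", 20, 2, 0.0), Link("broadband", 40, 10, 0.2)]
degraded = [Link("mpls", 200, 80, 0.0), Link("broadband", 40, 10, 0.2)]
```

With the `healthy` measurements VoIP stays on MPLS, and with the `degraded` ones it falls back to broadband. In a real deployment the controller would push such policies to the edge devices, which apply them to live measurements.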
Benefits
- Cost Reduction: Reduces reliance on expensive MPLS lines by utilizing cheaper broadband.
- Performance: Optimizes traffic flow for cloud applications (Office 365, Salesforce) by allowing direct internet breakout rather than backhauling traffic to a data center.
- Agility: Zero-touch provisioning allows new branch offices to be set up quickly.
8. Content Delivery Network (CDN)
A CDN is a geographically distributed group of servers which work together to provide fast delivery of Internet content.
The Problem
If a server is in New York and a user is in Tokyo, data packets must travel a long distance (high latency). If thousands of users access the server simultaneously, the server may crash (bottleneck).
The Solution (CDN Architecture)
- Origin Server: The original source where the web application lives.
- Edge Servers (Points of Presence - PoPs): Servers located physically closer to end-users (e.g., inside ISPs or internet exchange points).
How it Works
- Caching: When a user requests content (like a Netflix video or a large image), the CDN stores a copy of that content on the Edge Server closest to the user.
- Request Routing: Subsequent requests from other users in that region are served directly from the Edge Server, not the Origin.
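The caching and request-routing steps can be modeled with two small classes. This is a toy sketch: the class names `Origin` and `EdgeServer`, the path, and the "HIT"/"MISS" labels are assumptions for the example, and real CDNs also handle expiry, invalidation, and choosing which edge serves which user.

```python
class Origin:
    """The original source of the content; counts how often it is asked."""
    def __init__(self, content: dict[str, bytes]):
        self.content = content
        self.requests = 0

    def fetch(self, path: str) -> bytes:
        self.requests += 1
        return self.content[path]

class EdgeServer:
    """An edge server that keeps a copy of anything it has fetched once."""
    def __init__(self, origin: Origin):
        self.origin = origin
        self.cache: dict[str, bytes] = {}

    def get(self, path: str) -> tuple[bytes, str]:
        if path in self.cache:
            return self.cache[path], "HIT"   # served locally, origin not contacted
        body = self.origin.fetch(path)       # first request goes back to the origin
        self.cache[path] = body
        return body, "MISS"

origin = Origin({"/video/intro.mp4": b"...video bytes..."})
tokyo_edge = EdgeServer(origin)
first = tokyo_edge.get("/video/intro.mp4")   # first user in the region
second = tokyo_edge.get("/video/intro.mp4")  # a later user in the same region
```

Only the first request reaches the origin; the later request is answered from the edge copy, which is where the latency and bandwidth savings come from.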
Benefits
- Reduced Latency: Content travels a shorter physical distance.
- Higher Availability/Scalability: CDNs can absorb massive spikes in traffic (e.g., DDoS attacks or viral events) without crashing the origin server.
- Bandwidth Savings: Offloads traffic from the origin server, reducing hosting costs.