Unit 2 - Notes

CSE205

Unit 2: Linked Lists

1. Introduction to Linked Lists

A Linked List is a linear data structure where elements are not stored at contiguous memory locations. Instead, the elements are linked using pointers. It forms a series of connected nodes, where each node stores the data and the address of the next node.

Structure of a Node

A node typically consists of two parts:

  1. Data Field: Stores the actual value or information.
  2. Link (or Next) Field: Stores the memory address (reference) of the next node in the sequence.

Comparison with Arrays

Feature Arrays Linked Lists
Memory Contiguous blocks. Fixed size (Static). Non-contiguous. Dynamic size.
Insertion/Deletion Expensive (requires shifting elements). Efficient (requires changing pointers).
Access Random access (O(1)) via index. Sequential access (O(n)).
Memory Utilization inefficient if array is not full. Efficient, but extra memory required for pointers.

2. Memory Representation and Allocation

Memory Representation

In memory, nodes are scattered. The logical order is maintained by the pointers.

  • Let INFO[K] represent the data of node K.
  • Let LINK[K] represent the address of the next node.
  • START (or HEAD) is a pointer variable that stores the address of the first node.
  • The last node's link field contains NULL (indicating the end of the list).

Visualization:

TEXT
[START] -> [Data|Address] -> [Data|Address] -> [Data|NULL]
Address:      1000              2050             3010

Memory Allocation

Linked lists utilize Dynamic Memory Allocation. Memory is reserved at runtime from the heap.

In C/C++:

  • Allocation (malloc): Allocates a block of memory of the specified size.
    C
        struct Node* newNode = (struct Node*)malloc(sizeof(struct Node));
        
  • Deallocation (free): Releases the memory back to the heap to prevent memory leaks.
    C
        free(ptr);
        
  • Structure Definition:
    C
        struct Node {
            int data;
            struct Node *next;
        };
        

3. Traversal

Traversal involves visiting every node in the list exactly once to perform a specific operation (e.g., printing, counting, searching).

Algorithm:

  1. Initialize a pointer PTR to START.
  2. Repeat steps 3 and 4 while PTR is not NULL.
  3. Process PTR->data (e.g., print it).
  4. Move PTR to the next node: PTR = PTR->next.
  5. Stop.

Time Complexity: O(n)


4. Insertion

Insertion allows adding new elements to the list. There are three primary scenarios.

A. Insertion at the Beginning

  1. Create a new node NEW_NODE.
  2. Set NEW_NODE->next = START.
  3. Set START = NEW_NODE.
    • Complexity: O(1)

B. Insertion at the End

  1. Create NEW_NODE and set NEW_NODE->next = NULL.
  2. If the list is empty, set START = NEW_NODE.
  3. Else, traverse to the last node (PTR).
  4. Set PTR->next = NEW_NODE.
    • Complexity: O(n) (unless a 'Tail' pointer is maintained).

C. Insertion after a Specific Node (or at specific position)

  1. Traverse to the desired node LOC.
  2. Create NEW_NODE.
  3. Set NEW_NODE->next = LOC->next.
  4. Set LOC->next = NEW_NODE.

5. Deletion

Deletion involves removing a node and freeing its memory.

A. Deletion from the Beginning

  1. If START is NULL, Underflow.
  2. Set TEMP = START.
  3. Set START = START->next.
  4. Free TEMP.

B. Deletion from the End

  1. Traverse to the second-to-last node (PREV), so that PREV->next is the last node.
  2. Set TEMP = PREV->next.
  3. Set PREV->next = NULL.
  4. Free TEMP.

C. Deletion of a Specific Node

To delete a node with value X:

  1. Maintain two pointers: PTR (current) and PREV (previous).
  2. Traverse until PTR->data == X.
  3. Set PREV->next = PTR->next.
  4. Free PTR.

6. Header Linked Lists

A Header Linked List contains a special node at the beginning called the Header Node.

  • The header node usually does not contain actual data.
  • It may contain metadata (e.g., number of nodes in the list).
  • The START pointer always points to the header node.

Types of Header Linked Lists

1. Grounded Header Linked List

In this type, the last node of the list contains NULL in its link field.

  • Structure: [Header Node] -> [Node A] -> [Node B] -> NULL
  • Traversal condition: Loop continues until PTR == NULL.

2. Circular Header Linked List

In this type, the last node points back to the Header Node instead of NULL.

  • Structure: [Header Node] -> [Node A] -> [Node B] --(link)--> [Header Node]
  • Traversal condition: Loop continues until PTR == START (or Header).
  • Advantage: It is easier to implement operations because the null pointer is removed; you never fall off the end of the list.

7. Two-Way (Doubly) Linked Lists

A Two-Way Linked List allows traversal in both directions (forward and backward).

Memory Representation

Each node contains three fields:

  1. PREV (Back Pointer): Address of the previous node.
  2. INFO (Data): The actual data.
  3. NEXT (Forward Pointer): Address of the next node.

Structure Definition:

C
struct Node {
    struct Node *prev;
    int data;
    struct Node *next;
};

Operations on Two-Way Linked Lists

A. Traversal

  • Forward: Start at HEAD, move using PTR = PTR->next.
  • Backward: Start at TAIL (or traverse to end), move using PTR = PTR->prev.

B. Insertion (After a specific node LOC)

Inserting NEW_NODE between LOC and LOC->next.

Steps:

  1. Create NEW_NODE and assign data.
  2. Set NEW_NODE->next = LOC->next. (Connect new node to successor)
  3. Set NEW_NODE->prev = LOC. (Connect new node to predecessor)
  4. If LOC->next is not NULL, set LOC->next->prev = NEW_NODE. (Link successor back to new node)
  5. Set LOC->next = NEW_NODE. (Link predecessor forward to new node)

C. Deletion (Specific node LOC)

Removing node LOC from the list.

Steps:

  1. Adjust Predecessor:
    LOC->prev->next = LOC->next
  2. Adjust Successor:
    If LOC->next is not NULL, set LOC->next->prev = LOC->prev.
  3. Free Memory:
    free(LOC)

Advantages of Doubly Linked Lists

  1. Can iterate in both directions.
  2. Deletion is more efficient if the pointer to the node to be deleted is given (because we have immediate access to the previous node via prev, strictly O(1) after finding the node).

Disadvantages

  1. Requires more memory per node (extra pointer).
  2. Insertion and deletion require more pointer manipulations (handling both next and prev).