INT108

Unit 6: Files and Exceptions; Regular Expressions

Part 1: File Handling

File handling lets a program read from and write to persistent storage. In Python, files are opened with the built-in open() function, which returns a file object.

1. Text Files

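A text file stores data as a sequence of characters. In text mode, Python decodes the bytes on disk into strings using an encoding. The default encoding depends on the platform (it is often UTF-8), so pass encoding='utf-8' to open() when the encoding matters.

Extension: Typically .txt, .py, .csv, etc.
Line Endings: Each line ends with the newline character \n. In text mode, Python translates platform-specific line endings (such as \r\n on Windows) to \n when reading.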

2. Opening a File

To perform any operation on a file, it must first be opened.
Syntax: file_object = open("filename", "mode")

Common Modes:

'r': Read (default). Opens for reading. Raises FileNotFoundError if the file does not exist.
'w': Write. Opens for writing. Creates the file if it does not exist; otherwise truncates it (deletes its existing content).
'a': Append. Opens for writing. The pointer is placed at the end of the file, so existing content is kept. Creates the file if it does not exist.
'r+': Read and write. The file must already exist; its content is not truncated.
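
The difference between these modes is easiest to see by applying them to the same file. The following is a minimal sketch, not part of the original notes: it works inside a temporary directory, and the file name modes_demo.txt is a made-up example.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "modes_demo.txt")  # hypothetical file name

    # 'w' creates the file (or empties it if it already exists).
    with open(path, "w") as f:
        f.write("first\n")

    # 'a' keeps the existing content and adds to the end.
    with open(path, "a") as f:
        f.write("second\n")

    # 'r+' allows reading and writing without truncating the file.
    with open(path, "r+") as f:
        after_append = f.read()

    # Opening with 'w' again discards everything written so far.
    with open(path, "w") as f:
        f.write("replaced\n")

    with open(path, "r") as f:
        after_rewrite = f.read()

print(repr(after_append))   # 'first\nsecond\n'
print(repr(after_rewrite))  # 'replaced\n'
```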

Best Practice (The with statement):
Using the with statement closes the file automatically when the block ends, even if an exception is raised inside it.

PYTHON

with open('example.txt', 'w') as file:
    file.write("Hello World")
# File is automatically closed here
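
To check the "even if exceptions occur" claim, the sketch below raises an error on purpose inside a with block and then inspects the file object's closed attribute. The RuntimeError is simulated purely for the demonstration, and the file lives in a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")

    try:
        with open(path, "w") as file:
            file.write("Hello World")
            raise RuntimeError("simulated failure")  # deliberate error for the demo
    except RuntimeError:
        pass

    # The with block has already closed the file, despite the error.
    closed_after_error = file.closed

print(closed_after_error)  # True
```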

3. Writing to a File

To write to a text file, we use the write() or writelines() methods.

write(string): Writes a single string to the file.
writelines(list_of_strings): Writes a list of strings to the file. Note: It does not automatically add newlines between strings.

PYTHON

lines = ["Line 1\n", "Line 2\n", "Line 3\n"]

with open('data.txt', 'w') as f:
    f.write("Header Line\n")
    f.writelines(lines)
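
Because writelines() adds no separators, forgetting the \n on each item runs the strings together. A small sketch of the difference, using an assumed list called words and a temporary file:

```python
import os
import tempfile

words = ["alpha", "beta", "gamma"]  # note: no trailing "\n" on any item

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "words.txt")

    # Written as-is, the strings are joined with nothing between them.
    with open(path, "w") as f:
        f.writelines(words)
    with open(path) as f:
        run_together = f.read()

    # Adding "\n" to each item puts one word on each line.
    with open(path, "w") as f:
        f.writelines(word + "\n" for word in words)
    with open(path) as f:
        one_per_line = f.read()

print(repr(run_together))  # 'alphabetagamma'
print(repr(one_per_line))  # 'alpha\nbeta\ngamma\n'
```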

4. Writing Variables

The write() method only accepts strings. To write integers, floats, or other objects, you must convert them to strings first using str() or f-strings.

PYTHON

score = 95
name = "Alice"

with open('results.txt', 'w') as f:
    # Incorrect: f.write(score) -> TypeError
    
    # Correct
    f.write(name + " scored " + str(score))
    # Or using f-string
    f.write(f"\n{name}: {score}")
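
The conversion works in both directions: numbers written as text come back as strings and must be converted again when read. The sketch below is illustrative only; the scores list and the file name are assumptions, not part of the notes.

```python
import os
import tempfile

scores = [95, 88, 72]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "scores.txt")

    with open(path, "w") as f:
        try:
            f.write(scores[0])      # passing an int raises TypeError
        except TypeError:
            wrote_int = False
        else:
            wrote_int = True
        for score in scores:
            f.write(f"{score}\n")   # convert each number to text

    with open(path) as f:
        # int() ignores the trailing newline on each line.
        restored = [int(line) for line in f]

print(wrote_int)  # False
print(restored)   # [95, 88, 72]
```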

5. Reading from a File

Python provides three main methods to read data:

read(size): With no argument, reads the rest of the file as a single string. If size is given, reads at most that many characters (in text mode) or bytes (in binary mode).
readline(): Reads a single line, up to and including the \n.
readlines(): Reads all remaining lines and returns them as a list of strings.

PYTHON

with open('data.txt', 'r') as f:
    content = f.read()  # Reads whole file
    print(content)

with open('data.txt', 'r') as f:
    for line in f:      # Memory efficient iteration
        print(line.strip())
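
All three methods read from the current position in the file, so calling them in sequence consumes the file piece by piece. A short sketch under that assumption, using a temporary file with made-up content:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.txt")
    with open(path, "w") as f:
        f.write("Header Line\nLine 1\nLine 2\n")

    with open(path) as f:
        first_six = f.read(6)        # at most 6 characters
        rest_of_line = f.readline()  # continues from the current position
        remaining = f.readlines()    # everything left, as a list of lines
        at_end = f.read()            # nothing left to read

print(repr(first_six))     # 'Header'
print(repr(rest_of_line))  # ' Line\n'
print(remaining)           # ['Line 1\n', 'Line 2\n']
print(repr(at_end))        # ''
```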

6. Directories

To manage files, we often need to interact with directories (folders). This is handled by the os module.

os.getcwd(): Get Current Working Directory.
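os.mkdir('folder_name'): Create a new directory. Raises FileExistsError if it already exists.
os.listdir(): List all files and folders in the current directory (or in the path passed as an argument).
os.path.join(): Join path components using the correct separator for the operating system.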

PYTHON

import os

current_dir = os.getcwd()
print(f"Current directory: {current_dir}")

# Create a directory if it doesn't exist
if not os.path.exists("new_folder"):
    os.mkdir("new_folder")
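
The remaining functions from the list, os.path.join() and os.listdir(), can be sketched the same way. The folder and file names below are hypothetical, and everything happens inside a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    # Build the path from parts instead of hard-coding "/" or "\\".
    reports_dir = os.path.join(base, "reports")  # hypothetical folder name
    os.mkdir(reports_dir)

    file_path = os.path.join(reports_dir, "summary.txt")
    with open(file_path, "w") as f:
        f.write("ok\n")

    top_level = os.listdir(base)              # entries directly inside base
    inside_reports = os.listdir(reports_dir)  # entries inside the new folder
    is_dir = os.path.isdir(reports_dir)

print(top_level)       # ['reports']
print(inside_reports)  # ['summary.txt']
print(is_dir)          # True
```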

7. Pickling

Pickling is the process of converting a Python object hierarchy (like a dictionary or list) into a byte stream (serialization). Unpickling is the inverse operation. This is used to save complex data structures to a file. Only unpickle data from sources you trust, because loading a pickle can execute arbitrary code.

Module: pickle
Mode: Must use binary modes ('wb' for write binary, 'rb' for read binary).

PYTHON

import pickle

data = {'name': 'John', 'age': 30, 'scores': [80, 90, 100]}

# Pickling (Saving)
with open('data.pickle', 'wb') as f:
    pickle.dump(data, f)

# Unpickling (Loading)
with open('data.pickle', 'rb') as f:
    loaded_data = pickle.load(f)

print(loaded_data['scores']) # Output: [80, 90, 100]
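
The byte stream can also be inspected without a file. The sketch below uses pickle.dumps() and pickle.loads(), the in-memory counterparts of dump() and load(), to show that the result is a bytes object and that unpickling builds an equal but separate copy.

```python
import pickle

data = {'name': 'John', 'age': 30, 'scores': [80, 90, 100]}

payload = pickle.dumps(data)      # serialize to a bytes object in memory
restored = pickle.loads(payload)  # rebuild the object from those bytes

print(type(payload).__name__)  # bytes
print(restored == data)        # True: same content
print(restored is data)        # False: a new, independent object
```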

Part 2: Exception Handling

Exceptions are events that disrupt the normal flow of a program's execution. If an exception is not handled, the program stops and prints a traceback.

1. The `try-except` Block

Code that might fail is placed inside the try block. If it raises an exception of a matching type, the rest of the try block is skipped and control transfers to the except block.

Syntax:

PYTHON

try:
    # Code that might raise an exception
    pass
except ExceptionType:  # placeholder: use a real type such as ValueError
    # Code to run if that exception occurs
    pass

2. Handling `ZeroDivisionError`

This exception is raised when code attempts to divide a number by zero, including with the // and % operators.

PYTHON

try:
    numerator = 10
    denominator = 0
    result = numerator / denominator
    print(result)
except ZeroDivisionError:
    print("Error: You cannot divide by zero.")
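
The order in which statements run can be traced by recording each step in a list. This is only an illustrative sketch; the steps list and its labels are invented for the demonstration.

```python
steps = []

try:
    steps.append("before")
    result = 10 / 0            # raises ZeroDivisionError
    steps.append("after")      # skipped: control has already left the try block
except ZeroDivisionError as error:
    steps.append("handled")
    message = str(error)       # the text carried by the exception

# The program keeps running after the except block.
steps.append("continues")

print(steps)  # ['before', 'handled', 'continues']
```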

3. Handling `FileNotFoundError`

This error occurs when trying to open a file for reading that does not exist.

PYTHON

filename = "non_existent_file.txt"

try:
    with open(filename, 'r') as f:
        content = f.read()
except FileNotFoundError:
    print(f"Sorry, the file {filename} does not exist.")

4. The `else` Block

The else block is optional. It runs only if no exceptions were raised in the try block. It is useful for code that should only execute if the try block succeeded.

PYTHON

try:
    num = int(input("Enter a number: "))
except ValueError:
    print("That is not a number!")
else:
    # This runs only if the input was successfully converted to int
    print(f"You entered {num}. Great job!")
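
The same pattern combines naturally with file handling: keep only the open() call in the try block and put the work that depends on it in else. The sketch below assumes a hypothetical helper named count_lines and uses a temporary directory.

```python
import os
import tempfile

def count_lines(path):
    """Return the number of lines in path, or None if the file is missing."""
    try:
        f = open(path)
    except FileNotFoundError:
        return None
    else:
        # Only reached when open() succeeded.
        with f:
            return sum(1 for _ in f)

with tempfile.TemporaryDirectory() as tmp:
    existing = os.path.join(tmp, "notes.txt")
    with open(existing, "w") as f:
        f.write("a\nb\nc\n")

    found = count_lines(existing)
    missing = count_lines(os.path.join(tmp, "missing.txt"))

print(found)    # 3
print(missing)  # None
```

Keeping the try block this small means a FileNotFoundError handler cannot accidentally hide errors raised by the later processing code.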

Part 3: Regular Expressions (Regex)

1. Concept of Regular Expressions

A Regular Expression (RegEx) is a sequence of characters that forms a search pattern. It is used for string searching and manipulation (validation, finding substrings, replacing text).

Module: re

2. Various Types of Regular Expressions

Regex relies on Metacharacters (characters with special meaning) and Special Sequences.

Common Metacharacters:

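.: Matches any single character except newline.
^: Matches at the start of the string.
$: Matches at the end of the string.
*: Zero or more occurrences of the preceding element.
+: One or more occurrences of the preceding element.
?: Zero or one occurrence of the preceding element.
[]: A set of characters; matches any one of them (e.g., [a-z]).
|: Either/Or.
\: Escape character; also introduces special sequences.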
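
Each metacharacter can be tried on a short string with re.findall(), which returns every match as a list. The sample strings below are made up for illustration.

```python
import re

sample = "ct cot coot"

zero_or_more = re.findall(r"co*t", sample)  # 'o' repeated 0 or more times
one_or_more = re.findall(r"co+t", sample)   # 'o' repeated 1 or more times
zero_or_one = re.findall(r"co?t", sample)   # 'o' present at most once
any_char = re.findall(r"c.t", sample)       # exactly one character between c and t

vowels = re.findall(r"[aeiou]", "regex")            # any one character from the set
either = re.findall(r"cat|dog", "cat, dog, bird")   # one alternative or the other
at_start = re.findall(r"^Hello", "Hello World")     # only at the beginning
at_end = re.findall(r"World$", "Hello World")       # only at the end
literal_dots = re.findall(r"\.", "a.b.c")           # escaped dot matches a real '.'

print(zero_or_more)  # ['ct', 'cot', 'coot']
print(one_or_more)   # ['cot', 'coot']
print(zero_or_one)   # ['ct', 'cot']
print(any_char)      # ['cot']
print(vowels)        # ['e', 'e']
print(either)        # ['cat', 'dog']
print(at_start)      # ['Hello']
print(at_end)        # ['World']
print(literal_dots)  # ['.', '.']
```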

Special Sequences:

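\d: Matches any digit (0-9).
\D: Matches any non-digit.
\w: Matches any word character: letters (a-z, A-Z), digits (0-9), and the underscore (_). For str patterns this also includes Unicode letters and digits.
\s: Matches whitespace (space, tab, newline).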
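
The special sequences behave the same way when combined with + or used alone. A brief sketch with invented sample strings:

```python
import re

digits = re.findall(r"\d+", "Room 101, floor 3")   # runs of digits
non_digits = re.findall(r"\D+", "a1b22c")          # runs of anything else
words = re.findall(r"\w+", "user_1 logged-in")     # '-' is not a word character
whitespace = re.findall(r"\s", "a b\tc\n")         # space, tab, newline

print(digits)      # ['101', '3']
print(non_digits)  # ['a', 'b', 'c']
print(words)       # ['user_1', 'logged', 'in']
print(whitespace)  # [' ', '\t', '\n']
```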

3. Using the `match()` Function

The re.match() function checks for a match only at the beginning of the string. If the pattern occurs only later in the string, match() returns None.

Returns: A match object if successful, None otherwise.

PYTHON

import re

pattern = r"Python"
text1 = "Python is fun"
text2 = "I love Python"

# Case 1
match1 = re.match(pattern, text1)
if match1:
    print("Match found at start!") # This prints

# Case 2
match2 = re.match(pattern, text2)
if match2:
    print("Match found!")
else:
    print("No match at the start.") # This prints

4. Web Scraping by using Regular Expressions

Web scraping involves extracting data from HTML content. While dedicated libraries like BeautifulSoup are common, Regular Expressions are powerful tools for finding specific patterns (like email addresses or hyperlinks) within raw HTML text.

Example Scenario: Extracting all email addresses from a snippet of HTML source code.

PYTHON

import re

html_content = """
<html>
<head><title>Contact Us</title></head>
<body>
    <p>Please support us at support@example.com.</p>
    <p>For sales inquiries, contact sales-team@business.org or admin@site.net.</p>
</body>
</html>
"""

# Regex breakdown:
# [\w\.-]+   : Matches word chars, dots, or dashes (username)
# @          : Matches the @ symbol
# [\w\.-]+   : Matches the domain name
# \.         : Matches the dot before the extension
# [a-zA-Z]+  : Matches the domain extension (com, org, etc.)
email_pattern = r"[\w\.-]+@[\w\.-]+\.[a-zA-Z]+"

# re.findall() returns a list of all non-overlapping matches
emails = re.findall(email_pattern, html_content)

print("Emails found:")
for email in emails:
    print(email)

Output:

TEXT

Emails found:
support@example.com
sales-team@business.org
admin@site.net
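
Hyperlinks can be pulled out in a similar way. The following is a rough sketch, assuming simple anchor tags whose href values are in double quotes; the HTML snippet and the pattern are illustrative, and real pages with varied markup are better handled by a parser such as BeautifulSoup.

```python
import re

# Made-up HTML used only for this example.
html_content = """
<ul>
    <li><a href="https://example.com/docs">Docs</a></li>
    <li><a class="nav" href="https://example.org/blog">Blog</a></li>
</ul>
"""

# <a\s        : start of an anchor tag followed by whitespace
# [^>]*?      : any other attributes before href (as few characters as possible)
# href="(...)": capture the URL between the double quotes
# [^>]*>      : the rest of the opening tag
# (.*?)</a>   : capture the link text up to the closing tag
link_pattern = r'<a\s[^>]*?href="([^"]+)"[^>]*>(.*?)</a>'

# With two capture groups, findall() returns a list of (url, text) tuples.
links = re.findall(link_pattern, html_content)

for url, label in links:
    print(f"{label}: {url}")
```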
