Unit 5 - Notes

INT222

Unit 5: Introduction to PostgreSQL

1. Introduction to PostgreSQL Database

1.1 Overview

PostgreSQL (pronounced "post-gress-Q-L"), often simply referred to as Postgres, is a powerful, open-source object-relational database management system (ORDBMS). It has earned a strong reputation for reliability, feature robustness, and performance.

Unlike standard Relational Database Management Systems (RDBMS) like MySQL, PostgreSQL is Object-Relational. This means it supports features typically associated with object-oriented languages, such as table inheritance and function overloading, while adhering to standard SQL relational concepts.

1.2 Key Features and Capabilities

  • Open Source: It is free to use and modify under the PostgreSQL License (similar to MIT/BSD).
  • ACID Compliance: PostgreSQL is fully ACID (Atomicity, Consistency, Isolation, Durability) compliant, ensuring data validity and reliability even in the event of errors or power failures.
  • Extensibility: Users can define their own data types, custom functions, operators, and procedural languages (PL/pgSQL, PL/Python, etc.).
  • Concurrency: It uses Multi-Version Concurrency Control (MVCC). This allows readers to access data without being blocked by writers, and vice versa, significantly improving performance in high-traffic environments.
  • JSON Support: It offers advanced support for JSON and JSONB (Binary JSON), allowing it to function as a NoSQL database when necessary.
  • Standards Compliance: It adheres closely to the ANSI-SQL standards, making skills transferable to other systems.

1.3 Architecture

PostgreSQL uses a Client/Server model:

  1. Server Process (postgres): Manages database files, accepts connections from client applications, and performs database actions on behalf of the clients.
  2. Client Application: An application (web server, GUI tool, or command-line interface) that sends requests to the server.

2. PostgreSQL Installation

2.1 Prerequisites

  • Hardware: Minimal requirements (runs on almost any modern hardware).
  • OS: Cross-platform (Windows, Linux, macOS, Unix).
  • Port: Defaults to port 5432.

2.2 Installation on Windows

  1. Download: Visit the official EnterpriseDB (EDB) website and download the Windows installer.
  2. Run Installer: Execute the .exe file.
  3. Select Components: Ensure PostgreSQL Server, pgAdmin 4 (GUI tool), and Command Line Tools are checked.
  4. Data Directory: Choose where data will be stored.
  5. Superuser Password: You will be prompted to set a password for the default superuser, named postgres. Memorize this password.
  6. Port: Leave as default 5432.
  7. Locale: Select the default locale.

2.3 Installation on Linux (Ubuntu/Debian)

The preferred method is using the apt package manager.

BASH
# Update package index
sudo apt update

# Install postgresql and contrib (additional utilities)
sudo apt install postgresql postgresql-contrib

# Verify status
sudo systemctl status postgresql

2.4 Post-Installation Setup

After installation, a default user named postgres is created.

Accessing the Command Line Interface (psql):

  • Windows: Search for "SQL Shell (psql)" in the start menu.
  • Linux: Switch to the postgres user and run psql.
    BASH
        sudo -i -u postgres
        psql
        

pgAdmin 4:
Most installations include pgAdmin 4, a web-based GUI for managing PostgreSQL databases. It allows you to visualize tables, run queries, and manage users without using the command line.


3. Basic SQL Commands

Structured Query Language (SQL) is used to communicate with the database. In PostgreSQL, SQL commands are not case-sensitive, but keywords are usually written in UPPERCASE for readability.

3.1 Common Data Types

Before creating tables, it is essential to understand the data types available:

  • SERIAL: Auto-incrementing integer (commonly used for Primary Keys).
  • INTEGER / INT: Standard integer numbers.
  • VARCHAR(n): Variable-length character string with a limit of n.
  • TEXT: Variable-length character string with unlimited length.
  • BOOLEAN: True/False values.
  • DATE: Date only (YYYY-MM-DD).
  • TIMESTAMP: Date and time.
  • NUMERIC(p,s): Exact numbers with specified precision (useful for currency).

3.2 Database Management Commands

Create a Database:

SQL
CREATE DATABASE my_company;

List all Databases (psql command):

SQL
\l

Connect to a Database (psql command):

SQL
\c my_company

Drop (Delete) a Database:

SQL
DROP DATABASE my_company;

3.3 Table Management Commands

Create Table Syntax:

SQL
CREATE TABLE table_name (
    column1_name data_type constraint,
    column2_name data_type constraint,
    ...
);

Example: Creating an Employees Table

SQL
CREATE TABLE employees (
    id SERIAL PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) UNIQUE,
    department VARCHAR(50),
    salary NUMERIC(10, 2),
    hire_date DATE DEFAULT CURRENT_DATE
);

  • PRIMARY KEY: Uniquely identifies each record (implies NOT NULL and UNIQUE).
  • NOT NULL: Ensures the column cannot be empty.
  • UNIQUE: Ensures all values in the column are different.
  • DEFAULT: Provides a fallback value if none is specified.

Describe Table (psql command):

SQL
\d employees

Delete a Table:

SQL
DROP TABLE employees;


4. CRUD Operations

CRUD stands for Create, Read, Update, and Delete. These are the four fundamental operations implemented in persistent storage applications.

4.1 CREATE (Inserting Data)

The INSERT statement is used to add new rows to a table.

Syntax:

SQL
INSERT INTO table_name (column1, column2, ...)
VALUES (value1, value2, ...);

Example 1: Single Row Insertion

SQL
INSERT INTO employees (first_name, last_name, email, department, salary)
VALUES ('John', 'Doe', 'john.doe@example.com', 'IT', 75000.00);

Note: We did not insert id because it is type SERIAL (auto-increment), and we didn't insert hire_date because it has a DEFAULT value.

Example 2: Multiple Row Insertion

SQL
INSERT INTO employees (first_name, last_name, email, department, salary)
VALUES 
    ('Alice', 'Smith', 'alice@example.com', 'HR', 60000.00),
    ('Bob', 'Johnson', 'bob@example.com', 'IT', 80000.00);

Returning Data after Insert:
PostgreSQL allows you to return the generated data immediately.

SQL
INSERT INTO employees (first_name, last_name, email) 
VALUES ('Jane', 'Doe', 'jane@example.com') 
RETURNING id, hire_date;

4.2 READ (Selecting Data)

The SELECT statement is used to fetch data from the database.

Select All Columns:

SQL
SELECT * FROM employees;

Select Specific Columns:

SQL
SELECT first_name, last_name, salary FROM employees;

Filtering with WHERE:

SQL
SELECT * FROM employees 
WHERE department = 'IT';

Logical Operators (AND, OR):

SQL
SELECT * FROM employees 
WHERE department = 'IT' AND salary > 78000;

Sorting Results (ORDER BY):

SQL
SELECT * FROM employees 
ORDER BY salary DESC; -- Descending order (High to Low)

Limiting Results (LIMIT/OFFSET):
Useful for pagination.

SQL
SELECT * FROM employees 
LIMIT 5 OFFSET 0; -- Get first 5 rows

Pattern Matching (LIKE):

  • %: Matches any sequence of characters.
  • _: Matches any single character.

SQL
-- Find employees whose last name starts with 'S'
SELECT * FROM employees 
WHERE last_name LIKE 'S%';

4.3 UPDATE (Modifying Data)

The UPDATE statement modifies existing records.

Important: Always use a WHERE clause. If you omit it, ALL records in the table will be updated.

Syntax:

SQL
UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;

Example:
Give 'John Doe' a raise and change his department.

SQL
UPDATE employees
SET salary = 82000.00, department = 'Senior IT'
WHERE email = 'john.doe@example.com';

4.4 DELETE (Removing Data)

The DELETE statement removes rows from a table.

Important: Like UPDATE, omitting the WHERE clause will wipe the entire table.

Syntax:

SQL
DELETE FROM table_name
WHERE condition;

Example:
Remove an employee by their unique ID.

SQL
DELETE FROM employees
WHERE id = 2;

Delete All Rows (but keep table structure):

SQL
DELETE FROM employees;
-- OR --
TRUNCATE TABLE employees; -- TRUNCATE is faster for emptying tables