Unit 5 - Notes
INT222
Unit 5: Introduction to PostgreSQL
1. Introduction to PostgreSQL Database
1.1 Overview
PostgreSQL (pronounced "post-gress-Q-L"), often simply referred to as Postgres, is a powerful, open-source object-relational database management system (ORDBMS). It has earned a strong reputation for reliability, feature robustness, and performance.
Unlike standard Relational Database Management Systems (RDBMS) like MySQL, PostgreSQL is Object-Relational. This means it supports features typically associated with object-oriented languages, such as table inheritance and function overloading, while adhering to standard SQL relational concepts.
1.2 Key Features and Capabilities
- Open Source: It is free to use and modify under the PostgreSQL License (similar to MIT/BSD).
- ACID Compliance: PostgreSQL is fully ACID (Atomicity, Consistency, Isolation, Durability) compliant, ensuring data validity and reliability even in the event of errors or power failures.
- Extensibility: Users can define their own data types, custom functions, operators, and procedural languages (PL/pgSQL, PL/Python, etc.).
- Concurrency: It uses Multi-Version Concurrency Control (MVCC). This allows readers to access data without being blocked by writers, and vice versa, significantly improving performance in high-traffic environments.
- JSON Support: It offers advanced support for JSON and JSONB (Binary JSON), allowing it to function as a NoSQL database when necessary.
- Standards Compliance: It adheres closely to the ANSI-SQL standards, making skills transferable to other systems.
1.3 Architecture
PostgreSQL uses a Client/Server model:
- Server Process (
postgres): Manages database files, accepts connections from client applications, and performs database actions on behalf of the clients. - Client Application: An application (web server, GUI tool, or command-line interface) that sends requests to the server.
2. PostgreSQL Installation
2.1 Prerequisites
- Hardware: Minimal requirements (runs on almost any modern hardware).
- OS: Cross-platform (Windows, Linux, macOS, Unix).
- Port: Defaults to port
5432.
2.2 Installation on Windows
- Download: Visit the official EnterpriseDB (EDB) website and download the Windows installer.
- Run Installer: Execute the
.exefile. - Select Components: Ensure PostgreSQL Server, pgAdmin 4 (GUI tool), and Command Line Tools are checked.
- Data Directory: Choose where data will be stored.
- Superuser Password: You will be prompted to set a password for the default superuser, named
postgres. Memorize this password. - Port: Leave as default
5432. - Locale: Select the default locale.
2.3 Installation on Linux (Ubuntu/Debian)
The preferred method is using the apt package manager.
# Update package index
sudo apt update
# Install postgresql and contrib (additional utilities)
sudo apt install postgresql postgresql-contrib
# Verify status
sudo systemctl status postgresql
2.4 Post-Installation Setup
After installation, a default user named postgres is created.
Accessing the Command Line Interface (psql):
- Windows: Search for "SQL Shell (psql)" in the start menu.
- Linux: Switch to the postgres user and run psql.
BASHsudo -i -u postgres psql
pgAdmin 4:
Most installations include pgAdmin 4, a web-based GUI for managing PostgreSQL databases. It allows you to visualize tables, run queries, and manage users without using the command line.
3. Basic SQL Commands
Structured Query Language (SQL) is used to communicate with the database. In PostgreSQL, SQL commands are not case-sensitive, but keywords are usually written in UPPERCASE for readability.
3.1 Common Data Types
Before creating tables, it is essential to understand the data types available:
- SERIAL: Auto-incrementing integer (commonly used for Primary Keys).
- INTEGER / INT: Standard integer numbers.
- VARCHAR(n): Variable-length character string with a limit of
n. - TEXT: Variable-length character string with unlimited length.
- BOOLEAN: True/False values.
- DATE: Date only (YYYY-MM-DD).
- TIMESTAMP: Date and time.
- NUMERIC(p,s): Exact numbers with specified precision (useful for currency).
3.2 Database Management Commands
Create a Database:
CREATE DATABASE my_company;
List all Databases (psql command):
\l
Connect to a Database (psql command):
\c my_company
Drop (Delete) a Database:
DROP DATABASE my_company;
3.3 Table Management Commands
Create Table Syntax:
CREATE TABLE table_name (
column1_name data_type constraint,
column2_name data_type constraint,
...
);
Example: Creating an Employees Table
CREATE TABLE employees (
id SERIAL PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(100) UNIQUE,
department VARCHAR(50),
salary NUMERIC(10, 2),
hire_date DATE DEFAULT CURRENT_DATE
);
- PRIMARY KEY: Uniquely identifies each record (implies NOT NULL and UNIQUE).
- NOT NULL: Ensures the column cannot be empty.
- UNIQUE: Ensures all values in the column are different.
- DEFAULT: Provides a fallback value if none is specified.
Describe Table (psql command):
\d employees
Delete a Table:
DROP TABLE employees;
4. CRUD Operations
CRUD stands for Create, Read, Update, and Delete. These are the four fundamental operations implemented in persistent storage applications.
4.1 CREATE (Inserting Data)
The INSERT statement is used to add new rows to a table.
Syntax:
INSERT INTO table_name (column1, column2, ...)
VALUES (value1, value2, ...);
Example 1: Single Row Insertion
INSERT INTO employees (first_name, last_name, email, department, salary)
VALUES ('John', 'Doe', 'john.doe@example.com', 'IT', 75000.00);
Note: We did not insert
id because it is type SERIAL (auto-increment), and we didn't insert hire_date because it has a DEFAULT value.
Example 2: Multiple Row Insertion
INSERT INTO employees (first_name, last_name, email, department, salary)
VALUES
('Alice', 'Smith', 'alice@example.com', 'HR', 60000.00),
('Bob', 'Johnson', 'bob@example.com', 'IT', 80000.00);
Returning Data after Insert:
PostgreSQL allows you to return the generated data immediately.
INSERT INTO employees (first_name, last_name, email)
VALUES ('Jane', 'Doe', 'jane@example.com')
RETURNING id, hire_date;
4.2 READ (Selecting Data)
The SELECT statement is used to fetch data from the database.
Select All Columns:
SELECT * FROM employees;
Select Specific Columns:
SELECT first_name, last_name, salary FROM employees;
Filtering with WHERE:
SELECT * FROM employees
WHERE department = 'IT';
Logical Operators (AND, OR):
SELECT * FROM employees
WHERE department = 'IT' AND salary > 78000;
Sorting Results (ORDER BY):
SELECT * FROM employees
ORDER BY salary DESC; -- Descending order (High to Low)
Limiting Results (LIMIT/OFFSET):
Useful for pagination.
SELECT * FROM employees
LIMIT 5 OFFSET 0; -- Get first 5 rows
Pattern Matching (LIKE):
%: Matches any sequence of characters._: Matches any single character.
-- Find employees whose last name starts with 'S'
SELECT * FROM employees
WHERE last_name LIKE 'S%';
4.3 UPDATE (Modifying Data)
The UPDATE statement modifies existing records.
Important: Always use a WHERE clause. If you omit it, ALL records in the table will be updated.
Syntax:
UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;
Example:
Give 'John Doe' a raise and change his department.
UPDATE employees
SET salary = 82000.00, department = 'Senior IT'
WHERE email = 'john.doe@example.com';
4.4 DELETE (Removing Data)
The DELETE statement removes rows from a table.
Important: Like UPDATE, omitting the WHERE clause will wipe the entire table.
Syntax:
DELETE FROM table_name
WHERE condition;
Example:
Remove an employee by their unique ID.
DELETE FROM employees
WHERE id = 2;
Delete All Rows (but keep table structure):
DELETE FROM employees;
-- OR --
TRUNCATE TABLE employees; -- TRUNCATE is faster for emptying tables