Unit 2 - Notes

INT306 6 min read

Unit 2: Relational query language

Relational query languages are used to interact with Relational Database Management Systems (RDBMS). Structured Query Language (SQL) is the standard language for managing and manipulating relational databases. SQL is divided into several sub-languages based on the nature of the operations they perform.


1. SQL Sub-Languages: DDL, DML, DCL, and TCL

Data Definition Language (DDL)

DDL consists of SQL commands that can be used to define the database schema. It deals with descriptions of the database schema and is used to create and modify the structure of database objects in the database.

  • CREATE: Used to create the database or its objects (like table, index, function, views, store procedure, and triggers).
    SQL
        CREATE TABLE Students (
            StudentID INT,
            Name VARCHAR(100),
            Age INT
        );
        
  • ALTER: Used to alter the structure of the database (e.g., adding a new column, modifying an existing column, dropping a column).
    SQL
        ALTER TABLE Students ADD Email VARCHAR(255);
        ALTER TABLE Students DROP COLUMN Age;
        
  • DROP: Used to delete objects from the database entirely. This removes the table structure along with its data.
    SQL
        DROP TABLE Students;
        
  • TRUNCATE: Used to remove all records from a table, including all spaces allocated for the records are removed. The table structure remains intact.
    SQL
        TRUNCATE TABLE Students;
        
  • RENAME: Used to rename an object existing in the database.
    SQL
        RENAME TABLE Students TO CollegeStudents;
        

Data Manipulation Language (DML)

DML consists of SQL commands that deal with the manipulation of data present in the database. This includes most of the SQL statements used in day-to-day operations. (Note: SELECT is technically a Data Query Language (DQL) command, but is widely categorized under DML).

  • SELECT: Used to retrieve data from the database.
    SQL
        SELECT Name, Age FROM Students;
        
  • INSERT: Used to insert new data/records into a table.
    SQL
        INSERT INTO Students (StudentID, Name, Age) VALUES (1, 'Alice', 20);
        
  • UPDATE: Used to modify existing data within a table. Always use a WHERE clause to avoid updating all rows.
    SQL
        UPDATE Students SET Age = 21 WHERE StudentID = 1;
        
  • DELETE: Used to delete records from a database table. Always use a WHERE clause to avoid deleting all rows.
    SQL
        DELETE FROM Students WHERE StudentID = 1;
        

Data Control Language (DCL)

DCL includes commands that primarily deal with the rights, permissions, and other controls of the database system.

  • GRANT: Gives users access privileges to the database.
    SQL
        GRANT SELECT, INSERT ON Students TO 'username';
        
  • REVOKE: Withdraws the user's access privileges given by the GRANT command.
    SQL
        REVOKE INSERT ON Students FROM 'username';
        

Transaction Control Language (TCL)

TCL commands deal with the transaction within the database. A transaction is a single logical unit of work that contains one or more SQL statements.

  • COMMIT: Commits a transaction, saving all changes permanently to the database.
    SQL
        COMMIT;
        
  • ROLLBACK: Rolls back a transaction in case of any error occurs, reverting the database to its last committed state.
    SQL
        ROLLBACK;
        
  • SAVEPOINT: Sets a savepoint within a transaction to which you can later roll back without rolling back the entire transaction.
    SQL
        SAVEPOINT SP1;
        ROLLBACK TO SP1;
        

2. Database Keys

Keys are crucial in a relational database. They are used to uniquely identify any record or row of data from the table. They are also used to establish and identify relationships between tables.

  • Super Key: A single key or a combination of multiple keys that can uniquely identify a row in a table. All candidate keys are super keys, but not all super keys are candidate keys.
  • Candidate Key: A minimal super key. It is an attribute or set of attributes that can uniquely identify a tuple. A table can have multiple candidate keys. (e.g., Email or SSN or EmployeeID).
  • Primary Key: The candidate key selected by the database designer to uniquely identify the tuples in a table. There can be only one Primary Key per table. It cannot contain NULL values.
  • Alternate Key: The candidate keys that are not selected as the primary key are known as alternate keys.
  • Foreign Key: An attribute or a collection of attributes in one table that refers to the Primary Key in another table. It is used to link two tables together and enforce referential integrity.
  • Composite Key: A primary key composed of multiple columns used to identify a record uniquely. Used when a single column is not sufficient to uniquely identify a row.

3. Integrity Constraints

Integrity constraints are a set of rules used to maintain the quality of information. They ensure that the data insertion, updating, and other processes are performed in such a way that data integrity is not affected.

  • NOT NULL Constraint: Ensures that a column cannot have a NULL value.
  • UNIQUE Constraint: Ensures that all values in a column are different. (Unlike Primary Key, a Unique constraint can accept a single NULL value in some SQL dialects).
  • PRIMARY KEY Constraint: Uniquely identifies each record in a database table. (Implies NOT NULL and UNIQUE).
  • FOREIGN KEY Constraint: Prevents actions that would destroy links between tables. Ensures referential integrity.
  • CHECK Constraint: Ensures that all values in a column satisfy a specific condition.
  • DEFAULT Constraint: Sets a default value for a column when no value is specified.

Example combining constraints:

SQL
CREATE TABLE Employees (
    EmpID INT PRIMARY KEY,                       -- Primary Key
    FirstName VARCHAR(50) NOT NULL,              -- Not Null
    Email VARCHAR(100) UNIQUE,                   -- Unique
    Age INT CHECK (Age >= 18),                   -- Check
    DepartmentID INT,
    HireDate DATE DEFAULT CURRENT_DATE,          -- Default
    FOREIGN KEY (DepartmentID) REFERENCES Departments(DeptID) -- Foreign Key
);


4. SQL Basic Operations

SQL provides a robust set of operations to filter, sort, group, and aggregate data.

Filtering Data (WHERE Clause)

The WHERE clause is used to filter records and extract only those that fulfill a specified condition.

  • Operators: =, <, >, <=, >=, <> (Not equal).
  • Logical Operators: AND, OR, NOT.
  • Pattern Matching (LIKE): Used with wildcards (% for zero or more characters, _ for a single character).
  • List Matching (IN): Allows specifying multiple values in a WHERE clause.
  • Range Matching (BETWEEN): Selects values within a given range (inclusive).

SQL
SELECT * FROM Employees 
WHERE Age BETWEEN 25 AND 40 AND DepartmentID IN (1, 3);

SELECT * FROM Employees 
WHERE FirstName LIKE 'A%'; -- Starts with 'A'

Sorting Data (ORDER BY)

The ORDER BY keyword is used to sort the result-set in ascending or descending order. Sorts in ascending order by default (ASC).

SQL
SELECT * FROM Employees 
ORDER BY Age DESC, FirstName ASC;

Aggregate Functions

SQL provides aggregate functions to perform a calculation on a set of values and return a single value.

  • COUNT(): Returns the number of rows.
  • SUM(): Returns the total sum of a numeric column.
  • AVG(): Returns the average value of a numeric column.
  • MIN(): Returns the smallest value.
  • MAX(): Returns the largest value.

SQL
SELECT COUNT(EmpID), AVG(Age) FROM Employees;

Grouping Data (GROUP BY and HAVING)

  • GROUP BY: Groups rows that have the same values into summary rows (often used with aggregate functions).
  • HAVING: Added to SQL because the WHERE keyword cannot be used with aggregate functions. It filters groups created by the GROUP BY clause.

SQL
SELECT DepartmentID, COUNT(EmpID) as EmployeeCount, AVG(Age) as AverageAge
FROM Employees
GROUP BY DepartmentID
HAVING COUNT(EmpID) > 5;

(This query groups employees by their department, calculates the employee count and average age, but only displays departments that have more than 5 employees.)