Unit 2 - Notes
Unit 2: Relational query language
Relational query languages are used to interact with Relational Database Management Systems (RDBMS). Structured Query Language (SQL) is the standard language for managing and manipulating relational databases. SQL is divided into several sub-languages based on the nature of the operations they perform.
1. SQL Sub-Languages: DDL, DML, DCL, and TCL
Data Definition Language (DDL)
DDL consists of SQL commands that can be used to define the database schema. It deals with descriptions of the database schema and is used to create and modify the structure of database objects in the database.
- CREATE: Used to create the database or its objects (like table, index, function, views, store procedure, and triggers).
SQLCREATE TABLE Students ( StudentID INT, Name VARCHAR(100), Age INT ); - ALTER: Used to alter the structure of the database (e.g., adding a new column, modifying an existing column, dropping a column).
SQLALTER TABLE Students ADD Email VARCHAR(255); ALTER TABLE Students DROP COLUMN Age; - DROP: Used to delete objects from the database entirely. This removes the table structure along with its data.
SQLDROP TABLE Students; - TRUNCATE: Used to remove all records from a table, including all spaces allocated for the records are removed. The table structure remains intact.
SQLTRUNCATE TABLE Students; - RENAME: Used to rename an object existing in the database.
SQLRENAME TABLE Students TO CollegeStudents;
Data Manipulation Language (DML)
DML consists of SQL commands that deal with the manipulation of data present in the database. This includes most of the SQL statements used in day-to-day operations. (Note: SELECT is technically a Data Query Language (DQL) command, but is widely categorized under DML).
- SELECT: Used to retrieve data from the database.
SQLSELECT Name, Age FROM Students; - INSERT: Used to insert new data/records into a table.
SQLINSERT INTO Students (StudentID, Name, Age) VALUES (1, 'Alice', 20); - UPDATE: Used to modify existing data within a table. Always use a WHERE clause to avoid updating all rows.
SQLUPDATE Students SET Age = 21 WHERE StudentID = 1; - DELETE: Used to delete records from a database table. Always use a WHERE clause to avoid deleting all rows.
SQLDELETE FROM Students WHERE StudentID = 1;
Data Control Language (DCL)
DCL includes commands that primarily deal with the rights, permissions, and other controls of the database system.
- GRANT: Gives users access privileges to the database.
SQLGRANT SELECT, INSERT ON Students TO 'username'; - REVOKE: Withdraws the user's access privileges given by the GRANT command.
SQLREVOKE INSERT ON Students FROM 'username';
Transaction Control Language (TCL)
TCL commands deal with the transaction within the database. A transaction is a single logical unit of work that contains one or more SQL statements.
- COMMIT: Commits a transaction, saving all changes permanently to the database.
SQLCOMMIT; - ROLLBACK: Rolls back a transaction in case of any error occurs, reverting the database to its last committed state.
SQLROLLBACK; - SAVEPOINT: Sets a savepoint within a transaction to which you can later roll back without rolling back the entire transaction.
SQLSAVEPOINT SP1; ROLLBACK TO SP1;
2. Database Keys
Keys are crucial in a relational database. They are used to uniquely identify any record or row of data from the table. They are also used to establish and identify relationships between tables.
- Super Key: A single key or a combination of multiple keys that can uniquely identify a row in a table. All candidate keys are super keys, but not all super keys are candidate keys.
- Candidate Key: A minimal super key. It is an attribute or set of attributes that can uniquely identify a tuple. A table can have multiple candidate keys. (e.g.,
EmailorSSNorEmployeeID). - Primary Key: The candidate key selected by the database designer to uniquely identify the tuples in a table. There can be only one Primary Key per table. It cannot contain NULL values.
- Alternate Key: The candidate keys that are not selected as the primary key are known as alternate keys.
- Foreign Key: An attribute or a collection of attributes in one table that refers to the Primary Key in another table. It is used to link two tables together and enforce referential integrity.
- Composite Key: A primary key composed of multiple columns used to identify a record uniquely. Used when a single column is not sufficient to uniquely identify a row.
3. Integrity Constraints
Integrity constraints are a set of rules used to maintain the quality of information. They ensure that the data insertion, updating, and other processes are performed in such a way that data integrity is not affected.
- NOT NULL Constraint: Ensures that a column cannot have a NULL value.
- UNIQUE Constraint: Ensures that all values in a column are different. (Unlike Primary Key, a Unique constraint can accept a single NULL value in some SQL dialects).
- PRIMARY KEY Constraint: Uniquely identifies each record in a database table. (Implies NOT NULL and UNIQUE).
- FOREIGN KEY Constraint: Prevents actions that would destroy links between tables. Ensures referential integrity.
- CHECK Constraint: Ensures that all values in a column satisfy a specific condition.
- DEFAULT Constraint: Sets a default value for a column when no value is specified.
Example combining constraints:
CREATE TABLE Employees (
EmpID INT PRIMARY KEY, -- Primary Key
FirstName VARCHAR(50) NOT NULL, -- Not Null
Email VARCHAR(100) UNIQUE, -- Unique
Age INT CHECK (Age >= 18), -- Check
DepartmentID INT,
HireDate DATE DEFAULT CURRENT_DATE, -- Default
FOREIGN KEY (DepartmentID) REFERENCES Departments(DeptID) -- Foreign Key
);
4. SQL Basic Operations
SQL provides a robust set of operations to filter, sort, group, and aggregate data.
Filtering Data (WHERE Clause)
The WHERE clause is used to filter records and extract only those that fulfill a specified condition.
- Operators:
=,<,>,<=,>=,<>(Not equal). - Logical Operators:
AND,OR,NOT. - Pattern Matching (
LIKE): Used with wildcards (%for zero or more characters,_for a single character). - List Matching (
IN): Allows specifying multiple values in aWHEREclause. - Range Matching (
BETWEEN): Selects values within a given range (inclusive).
SELECT * FROM Employees
WHERE Age BETWEEN 25 AND 40 AND DepartmentID IN (1, 3);
SELECT * FROM Employees
WHERE FirstName LIKE 'A%'; -- Starts with 'A'
Sorting Data (ORDER BY)
The ORDER BY keyword is used to sort the result-set in ascending or descending order. Sorts in ascending order by default (ASC).
SELECT * FROM Employees
ORDER BY Age DESC, FirstName ASC;
Aggregate Functions
SQL provides aggregate functions to perform a calculation on a set of values and return a single value.
COUNT(): Returns the number of rows.SUM(): Returns the total sum of a numeric column.AVG(): Returns the average value of a numeric column.MIN(): Returns the smallest value.MAX(): Returns the largest value.
SELECT COUNT(EmpID), AVG(Age) FROM Employees;
Grouping Data (GROUP BY and HAVING)
- GROUP BY: Groups rows that have the same values into summary rows (often used with aggregate functions).
- HAVING: Added to SQL because the
WHEREkeyword cannot be used with aggregate functions. It filters groups created by theGROUP BYclause.
SELECT DepartmentID, COUNT(EmpID) as EmployeeCount, AVG(Age) as AverageAge
FROM Employees
GROUP BY DepartmentID
HAVING COUNT(EmpID) > 5;
(This query groups employees by their department, calculates the employee count and average age, but only displays departments that have more than 5 employees.)