Unit3 - Subjective Questions
CSE357 • Practice Questions with Detailed Answers
Differentiate between a File Processing System and a Database Management System (DBMS).
The differences between a File Processing System and a DBMS are as follows:
- Data Redundancy and Inconsistency:
- File System: Data is often duplicated in different files, leading to redundancy and inconsistency.
- DBMS: Controls redundancy through normalization and ensures data consistency.
- Data Access:
- File System: Accessing data requires writing specific programs; it is difficult to access data efficiently.
- DBMS: Provides sophisticated techniques (like SQL) to store and retrieve data efficiently.
- Data Integrity:
- File System: Integrity constraints are difficult to implement and maintain.
- DBMS: Enforces integrity constraints (e.g., Primary Keys, Foreign Keys) automatically.
- Security:
- File System: Security is generally weak; it is hard to restrict user access to specific data.
- DBMS: Provides a robust security subsystem with user authentication and authorization.
- Concurrency:
- File System: Does not support concurrent access well, leading to race conditions.
- DBMS: Handles concurrent access and transactions effectively using locking and isolation protocols.
Define RDBMS. What are the key characteristics that distinguish an RDBMS from a standard DBMS?
RDBMS (Relational Database Management System) is a type of DBMS based on the relational model introduced by E.F. Codd. In an RDBMS, data is represented in terms of tuples (rows) grouped into relations (tables).
Key Characteristics:
- Tabular Structure: Data is stored in tables (relations) consisting of rows and columns.
- Schema Independence: Physical data storage is independent of the logical data structure.
- Key Constraints: It utilizes Primary Keys for unique identification and Foreign Keys for establishing relationships between tables.
- Normalization: It supports normalization to reduce redundancy.
- SQL Support: It uses Structured Query Language (SQL) for data definition and manipulation.
- ACID Properties: It strictly follows Atomicity, Consistency, Isolation, and Durability for transactions.
Explain the terms Table, Field, and Record in the context of a Relational Database with an example.
In a Relational Database:
-
Table (Relation):
- A table is a collection of related data held in a structured format within a database. It consists of columns and rows.
- Example: A
Studenttable containing details of all students.
-
Field (Attribute/Column):
- A field represents a specific category of data within a table. It defines the type of data (integer, text, date) that can be stored.
- Example: In the
Studenttable,Student_ID,Name, andGradeare fields.
-
Record (Tuple/Row):
- A record is a single entry in a table. It contains specific data values for each field representing a single entity.
- Example: A row containing
{101, "Alice", "A"}is a record.
Visual Representation:
| Student_ID (Field) | Name (Field) | Grade (Field) | |
|---|---|---|---|
| 101 | Alice | A | <-- Record |
| 102 | Bob | B |
Categorize SQL commands into DDL, DML, and DCL. Give two examples for each.
SQL commands are categorized based on their functionality:
-
DDL (Data Definition Language):
- These commands are used to define or modify the structure of the database objects (schema).
- Examples:
CREATE: Creates a new table or database.ALTER: Modifies the structure of an existing table.DROP: Deletes a table or database.
-
DML (Data Manipulation Language):
- These commands are used for managing and manipulating data within the database tables.
- Examples:
INSERT: Adds new records to a table.UPDATE: Modifies existing records.SELECT: Retrieves data from the database.
-
DCL (Data Control Language):
- These commands deal with the rights, permissions, and other controls of the database system.
- Examples:
GRANT: Gives user's access privileges to the database.REVOKE: Withdraws user's access privileges.
Explain the concept of Database Normalization. Discuss the anomalies that may occur in an unnormalized database.
Database Normalization is the process of organizing data in a database. This involves creating tables and establishing relationships between those tables according to rules designed both to protect the data and to make the database more flexible by eliminating redundancy and inconsistent dependency.
Anomalies in Unnormalized Databases:
-
Insertion Anomaly:
- Occurs when certain attributes cannot be inserted into the database without the presence of other attributes.
- Example: In a combined
Student-Coursetable, we cannot add a new course unless a student enrolls in it.
-
Deletion Anomaly:
- Occurs when deleting a record causes the unintended loss of other data.
- Example: If a student is the only one enrolled in a specific course, deleting the student record might delete the course details entirely.
-
Update Anomaly:
- Occurs when data is redundant/duplicated. Updating a value in one location requires updates in all locations to maintain consistency.
- Example: If a professor's address changes, and it is stored in every student record they teach, failing to update all records leaves the database in an inconsistent state.
Describe the First Normal Form (1NF), Second Normal Form (2NF), and Third Normal Form (3NF) with necessary conditions.
1. First Normal Form (1NF):
- Condition: A relation is in 1NF if every attribute contains only atomic (indivisible) values. There should be no repeating groups or arrays.
- Goal: Eliminate duplicate columns from the same table and create separate tables for each group of related data.
2. Second Normal Form (2NF):
- Condition: A relation is in 2NF if it is in 1NF AND has no Partial Dependency.
- Explanation: No non-prime attribute (attribute not part of the candidate key) should be dependent on a proper subset of any candidate key. This applies to tables with composite primary keys.
- Goal:
3. Third Normal Form (3NF):
- Condition: A relation is in 3NF if it is in 2NF AND has no Transitive Dependency.
- Explanation: No non-prime attribute should be determined by another non-prime attribute.
- Rule: For a functional dependency , either:
- is a Super Key, OR
- is a Prime Attribute.
- Goal: Remove dependencies where and (implies determines via ).
What are Database Keys? Differentiate between Primary Key, Candidate Key, and Super Key.
Database Keys are attributes or sets of attributes that uniquely identify a record in a table or establish relationships between tables.
Differentiation:
-
Super Key:
- A set of one or more attributes that, taken collectively, allows us to identify uniquely a tuple in the relation.
- Example: In a table with ID and Name,
{ID},{ID, Name},{ID, Phone}are all Super Keys if ID is unique.
-
Candidate Key:
- A minimal Super Key. It is a Super Key for which no proper subset is a Super Key.
- It has no redundant attributes.
- Example: If
{ID}identifies a record, then{ID}is a Candidate Key.{ID, Name}is not, because{ID}alone suffices.
-
Primary Key:
- One specific Candidate Key chosen by the database designer to uniquely identify tuples within the table.
- It cannot contain
NULLvalues. - Example: From Candidate Keys
{Student_ID}and{Email},{Student_ID}might be chosen as the Primary Key.
Relationship:
Explain the ACID properties in the context of Transaction Management.
ACID properties ensure that database transactions are processed reliably. They are:
-
A - Atomicity:
- Definition: This property states that a transaction must be treated as an atomic unit. It implies "All or Nothing."
- Explanation: Either all operations of the transaction are reflected in the database, or none are. If a transaction fails halfway, the database rolls back to the state before the transaction started.
-
C - Consistency:
- Definition: The database must remain in a consistent state before and after the transaction.
- Explanation: The integrity constraints must be maintained so that the database does not violate any rules. (e.g., The total sum of money transferred between accounts remains constant).
-
I - Isolation:
- Definition: Multiple transactions occurring simultaneously must not affect each other's execution.
- Explanation: The intermediate state of a transaction should be invisible to other concurrent transactions. The result should be the same as if the transactions were executed serially.
-
D - Durability:
- Definition: Once a transaction is committed, the changes are permanent.
- Explanation: Even in the event of a system failure (crash or power loss) immediately after the commit, the data remains saved.
What is a Foreign Key? How does it enforce Referential Integrity?
Foreign Key:
A Foreign Key is a field (or collection of fields) in one table, that refers to the Primary Key in another table. It is used to link two tables together.
Referential Integrity:
Referential Integrity is a database concept that ensures relationships between tables remain consistent. The Foreign Key enforces this by:
- Valid Reference: Ensuring that a value entered in the Foreign Key column actually exists in the referenced Primary Key column of the parent table (or is NULL).
- Preventing Orphan Records: It prevents the deletion of a record from the parent table if there are related records in the child table (unless
CASCADE DELETEis set).
Example:
- Parent Table:
Departments(DeptID [PK], Name) - Child Table:
Employees(EmpID, Name, DeptID [FK]) - Enforcement: You cannot assign an employee to a
DeptID(e.g., 50) ifDeptID50 does not exist in theDepartmentstable.
Draw and explain the State Transition Diagram of a transaction in DBMS.
A transaction in a DBMS goes through the following states:
-
Active:
- The initial state. The transaction stays in this state while it is executing read or write operations.
-
Partially Committed:
- After the final operation is executed, but before the changes are permanently saved to the database on disk.
-
Failed:
- The state entered if a normal check discovers that the transaction can no longer proceed (due to hardware failure or logical error).
-
Aborted:
- After the transaction has failed and the database has been rolled back to the state prior to the start of the transaction. The transaction effectively ceases to exist.
-
Committed:
- The state after the transaction has successfully completed and changes are permanently recorded in the database.
Flow:
Explain the difference between DELETE, DROP, and TRUNCATE commands in SQL.
| Feature | DELETE | TRUNCATE | DROP |
|---|---|---|---|
| Type | DML (Data Manipulation Language) | DDL (Data Definition Language) | DDL (Data Definition Language) |
| Function | Deletes specific rows based on a condition (WHERE clause). | Removes all rows from a table. | Removes the entire table structure and data from the database. |
| Space | Does not free the space occupied by the table. | Frees the space occupied by the data. | Frees all space. |
| Rollback | Can be rolled back (if using transactions). | Cannot be rolled back (in many SQL implementations) as it is not transaction-safe log-wise. | Cannot be rolled back. |
| Performance | Slower (logs each row deletion). | Faster (minimal logging). | Fast. |
| Syntax | DELETE FROM table WHERE condition; |
TRUNCATE TABLE table; |
DROP TABLE table; |
What are SQL Joins? Explain Inner Join, Left Outer Join, Right Outer Join, and Full Outer Join.
SQL Joins are used to combine rows from two or more tables, based on a related column between them.
-
Inner Join:
- Returns records that have matching values in both tables.
- Venn Diagram: Intersection of Set A and Set B.
SELECT * FROM A INNER JOIN B ON A.id = B.id;
-
Left (Outer) Join:
- Returns all records from the left table (A), and the matched records from the right table (B). If there is no match, the result is NULL on the right side.
SELECT * FROM A LEFT JOIN B ON A.id = B.id;
-
Right (Outer) Join:
- Returns all records from the right table (B), and the matched records from the left table (A). If there is no match, the result is NULL on the left side.
SELECT * FROM A RIGHT JOIN B ON A.id = B.id;
-
Full (Outer) Join:
- Returns all records when there is a match in either left or right table records. It combines the result of both Left and Right joins.
SELECT * FROM A FULL OUTER JOIN B ON A.id = B.id;
Define Functional Dependency. Explain Trivial and Non-Trivial Functional Dependencies.
Functional Dependency (FD):
It is a relationship between attributes in a table. It is denoted as . This means that the value of attribute set uniquely determines the value of attribute set . Here, is the determinant and is the dependent.
-
Trivial Functional Dependency:
- A dependency is trivial if is a subset of ().
- Example: If we have
{Student_ID, Name}, then{Student_ID, Name} \rightarrow Student_IDis trivial becauseStudent_IDis part of the determinant.
-
Non-Trivial Functional Dependency:
- A dependency is non-trivial if is NOT a subset of .
- Example:
Student_IDName.Nameis not a subset ofStudent_ID. This conveys actual information.
Discuss Boyce-Codd Normal Form (BCNF). How is it different from 3NF?
Boyce-Codd Normal Form (BCNF) is an advanced version of 3NF, often called 3.5NF. A table is in BCNF if it is in 3NF and satisfies a stricter condition for functional dependencies.
Condition:
For every non-trivial functional dependency , must be a Super Key.
Difference from 3NF:
- 3NF Rule: is allowed if is a Super Key OR if is a Prime Attribute (part of a candidate key).
- BCNF Rule: is allowed ONLY if is a Super Key.
Implication: BCNF removes anomalies in tables where a prime attribute depends on a non-prime attribute (which 3NF allows).
What is Concurrency Control? Why is it needed in a DBMS? Explain with the "Lost Update Problem".
Concurrency Control is the procedure in DBMS for managing simultaneous operations without conflicting with each other. It ensures that database transactions are performed concurrently without violating data integrity.
Need for Concurrency Control:
Without it, simultaneous transactions can lead to data inconsistency. Common problems include the Lost Update, Dirty Read, and Unrepeatable Read.
The Lost Update Problem:
Occurs when two transactions update the same data item, but one update overwrites the other.
- Transaction T1 reads .
- Transaction T2 reads .
- T1 updates to and writes it.
- T2 updates to and writes it.
- Result: The value is 80. The addition of 50 by T1 is "lost" because T2 overwrote the value based on the old read. Correct value should be 130.
Explain the Two-Phase Locking (2PL) protocol.
Two-Phase Locking (2PL) is a concurrency control method that guarantees serializability. It ensures that if transactions lock data items, they follow a specific protocol divided into two phases:
-
Growing Phase:
- In this phase, a transaction may obtain locks (Shared or Exclusive) but may not release any lock.
- The number of locks held by the transaction increases.
-
Locked Point:
- The point where the transaction has acquired all necessary locks.
-
Shrinking Phase:
- In this phase, a transaction may release locks but may not obtain any new locks.
- Once a lock is released, the transaction enters the shrinking phase and cannot acquire locks again.
Benefit: 2PL ensures conflict serializability.
Drawback: It does not prevent deadlocks.
What is a Deadlock in DBMS? Describe methods to handle deadlocks.
Deadlock:
A deadlock is a situation where two or more transactions are waiting indefinitely for one another to give up locks.
- Example: T1 holds Lock A and waits for Lock B. T2 holds Lock B and waits for Lock A. Neither can proceed.
Methods to Handle Deadlocks:
-
Deadlock Prevention:
- Ensures the system never enters a deadlock state.
- Wait-Die Scheme: Older transaction waits for younger; younger dies if it requests resource held by older.
- Wound-Wait Scheme: Older transaction wounds (rolls back) younger transaction to get resource.
-
Deadlock Detection and Recovery:
- The system allows deadlocks to occur but periodically checks for them.
- Wait-for Graph: A cycle in the wait-for graph indicates a deadlock.
- Recovery: Once detected, the system selects a Victim transaction to roll back to break the cycle.
Write a SQL query to create a table named Employee with the following fields: EmpID (Integer, Primary Key), Name (VarChar), Salary (Decimal), and JoinDate (Date). Also, write a query to increase the salary of all employees by 10%.
1. Create Table Query:
sql
CREATE TABLE Employee (
EmpID INT PRIMARY KEY,
Name VARCHAR(100),
Salary DECIMAL(10, 2),
JoinDate DATE
);
2. Update Salary Query:
sql
UPDATE Employee
SET Salary = Salary * 1.10;
Explain the concept of Serializability in transaction management.
Serializability is the concept used to determine if a concurrent schedule of transactions produces the same result as if those transactions were executed serially (one after the other).
- Serial Schedule: Transactions are executed one by one (no interleaving). This is always consistent but inefficient.
- Concurrent Schedule: Operations of transactions are interleaved.
Goal:
A concurrent schedule is considered "correct" or Serializable if it is equivalent to some serial schedule.
Types:
- Conflict Serializability: Determined by checking if the order of conflicting operations (Read/Write on same data item by different transactions) can be swapped without changing the result. If a schedule has no cycles in its precedence graph, it is conflict serializable.
- View Serializability: A less strict condition based on the views (Read operations) of the transactions.
Discuss the Log-Based Recovery technique (Deferred vs. Immediate Update).
Log-Based Recovery relies on maintaining a log file (journal) that records every operation performed by transactions.
-
Deferred Database Modification (Deferred Update):
- Updates are not written to the database on disk until the transaction reaches its commit point.
- Log: Records
[Start, T1],[Write, T1, X, NewVal],[Commit, T1]. - Recovery: If a crash occurs before commit, no Undo is needed (changes weren't written). If after commit, Redo is performed using the log.
-
Immediate Database Modification (Immediate Update):
- Updates are applied to the database as they occur (even before commit), but a log record is written before the database update (Write-Ahead Logging).
- Log: Records both Old Value and New Value.
- Recovery: If a crash occurs, uncommitted transactions are Undone (rolled back using Old Value). Committed transactions are Redone (using New Value) to ensure durability.