1Who is widely recognized as the 'Father of Data Warehousing'?
A.E.F. Codd
B.Peter Chen
C.Bill Inmon
D.Ralph Kimball
Correct Answer: Bill Inmon
Explanation:
Bill Inmon is credited with coining the term and defining the first specific principles of the Data Warehouse.
Incorrect! Try again.
2A Data Warehouse is defined as a subject-oriented, integrated, time-variant, and ____ collection of data.
A.Volatile
B.Non-volatile
C.Transient
D.Virtual
Correct Answer: Non-volatile
Explanation:
The standard definition involves four key characteristics, one of which is non-volatile, meaning data is not updated or deleted once entered.
Incorrect! Try again.
3Which characteristic of a Data Warehouse implies that data is organized around major entities such as customer, product, and sales?
A.Time-variant
B.Subject-oriented
C.Integrated
D.Non-volatile
Correct Answer: Subject-oriented
Explanation:
Subject-oriented means the data is categorized by specific business subjects rather than by application functions.
Incorrect! Try again.
4In the context of Data Warehousing, what does 'Integrated' mean?
A.Data is collected from various sources and standardized
B.Data is accessible via the web
C.Data is encrypted for security
D.Data is stored in a single file
Correct Answer: Data is collected from various sources and standardized
Explanation:
Integration involves ensuring consistent naming conventions, formatting, and encoding when combining data from multiple heterogeneous sources.
Incorrect! Try again.
5What is the primary purpose of the 'Staging Area' in Data Warehouse architecture?
A.To store meta-data only
B.To provide query access to end-users
C.To store data permanently
D.To clean and transform data before loading it into the warehouse
Correct Answer: To clean and transform data before loading it into the warehouse
Explanation:
The staging area is a temporary location where data is processed (ETL) before being moved into the actual data warehouse.
Incorrect! Try again.
6Which type of data represents the 'Time-variant' nature of a Data Warehouse?
A.Historical data spanning a specific period
B.Current real-time data only
C.Future predictive data
D.Temporary session data
Correct Answer: Historical data spanning a specific period
Explanation:
Time-variant means every data unit in the warehouse is associated with a specific point in time, allowing for historical analysis.
Incorrect! Try again.
7What is a 'Data Mart'?
A.A complete enterprise-wide data warehouse
B.A tool for data mining
C.A backup storage device
D.A subset of a data warehouse focused on a specific department or function
Correct Answer: A subset of a data warehouse focused on a specific department or function
Explanation:
A Data Mart is a scaled-down version of a data warehouse usually designed for a specific business line or team (e.g., Marketing or Sales).
Incorrect! Try again.
8Which architecture approach advocates building the Enterprise Data Warehouse (EDW) first and then creating Data Marts?
A.Spiral Approach
B.Agile Approach
C.Bottom-up Approach
D.Top-down Approach
Correct Answer: Top-down Approach
Explanation:
The Top-down approach, associated with Bill Inmon, suggests building a centralized EDW first and then deriving data marts from it.
Incorrect! Try again.
9The 'Bottom-up' approach to Data Warehousing, which starts with Data Marts, is associated with:
A.Ralph Kimball
B.Bill Inmon
C.Larry Ellison
D.Edgar Codd
Correct Answer: Ralph Kimball
Explanation:
Ralph Kimball is the proponent of the Dimensional Modeling (Bottom-up) approach, which connects data marts via a data bus.
Incorrect! Try again.
10What does OLAP stand for?
A.Overlay Analytical Platform
B.Offline Analysis Protocol
C.Online Advanced Programming
D.Online Analytical Processing
Correct Answer: Online Analytical Processing
Explanation:
OLAP stands for Online Analytical Processing, which allows users to analyze information from multiple database systems at the same time.
Incorrect! Try again.
11Which table in a schema contains the quantitative data (numbers/metrics) for analysis?
A.Dimension Table
B.Index Table
C.Fact Table
D.Meta Table
Correct Answer: Fact Table
Explanation:
A Fact Table contains the measurements, metrics, or facts of a business process (e.g., sales amount, quantity sold).
Incorrect! Try again.
12Which table contains descriptive attributes used for query filtering and labeling?
A.Measure Table
B.Dimension Table
C.Fact Table
D.Summary Table
Correct Answer: Dimension Table
Explanation:
Dimension tables contain textual descriptors of the business, such as product names, customer addresses, or dates.
Incorrect! Try again.
13In a Star Schema, how are tables arranged?
A.Multiple fact tables sharing dimension tables
B.A central dimension table connected to multiple fact tables
C.Normalized tables linked in a chain
D.A central fact table connected to multiple dimension tables
Correct Answer: A central fact table connected to multiple dimension tables
Explanation:
A Star Schema consists of a central fact table surrounded by dimension tables, resembling a star shape.
Incorrect! Try again.
14What distinguishes a Snowflake Schema from a Star Schema?
A.The Dimension tables are normalized
B.The Fact table is normalized
C.It does not support SQL
D.It uses only one table
Correct Answer: The Dimension tables are normalized
Explanation:
In a Snowflake Schema, dimension tables are normalized (split into multiple related tables) to reduce redundancy, unlike the denormalized dimensions in a Star Schema.
Incorrect! Try again.
15A 'Fact Constellation' schema is also known as:
A.Galaxy Schema
B.Snowflake Schema
C.Star Schema
D.Bus Schema
Correct Answer: Galaxy Schema
Explanation:
A Fact Constellation, or Galaxy Schema, contains more than one fact table, sharing dimension tables between them.
Incorrect! Try again.
16What is 'Meta-data' in the context of a Data Warehouse?
A.The actual business data
B.Data about data
C.Deleted data
D.Backup data
Correct Answer: Data about data
Explanation:
Meta-data provides information about the structure, source, and meaning of the data stored in the warehouse.
Incorrect! Try again.
17Which OLAP operation allows moving from summary data to detailed data?
A.Drill-down
B.Dice
C.Roll-up
D.Slice
Correct Answer: Drill-down
Explanation:
Drill-down is the operation of navigating from less detailed data (summary) to more detailed data.
Incorrect! Try again.
18Which OLAP operation performs aggregation on a data cube (e.g., climbing up the hierarchy)?
A.Slice
B.Roll-up
C.Drill-down
D.Pivot
Correct Answer: Roll-up
Explanation:
Roll-up summarizes data by climbing up the concept hierarchy or by dimension reduction.
Incorrect! Try again.
19Selecting a single value for one dimension to create a subset of the cube is called:
A.Slicing
B.Pivoting
C.Drilling
D.Dicing
Correct Answer: Slicing
Explanation:
Slicing performs a selection on one dimension of the given cube, resulting in a sub-cube.
Incorrect! Try again.
20Which type of OLAP stores data in a multidimensional array-based storage engine?
A.ROLAP
B.DOLAP
C.HOLAP
D.MOLAP
Correct Answer: MOLAP
Explanation:
MOLAP (Multidimensional OLAP) stores data in optimized multidimensional array storage rather than in a relational database.
ROLAP relies on relational databases to store data and allows OLAP queries via SQL.
Incorrect! Try again.
22HOLAP attempts to combine the best features of:
A.ROLAP and MOLAP
B.OLTP and OLAP
C.Data Mining and Data Warehousing
D.Star and Snowflake Schemas
Correct Answer: ROLAP and MOLAP
Explanation:
Hybrid OLAP (HOLAP) combines the scalability of ROLAP with the speed of computation of MOLAP.
Incorrect! Try again.
23Which of the following is NOT a typical application of a Data Warehouse?
A.Decision Support Systems (DSS)
B.Transaction Processing (OLTP)
C.Trend Analysis
D.Reporting and Forecasting
Correct Answer: Transaction Processing (OLTP)
Explanation:
Data Warehouses are designed for OLAP (Analysis), not OLTP (day-to-day transactional processing like ATM withdrawals).
Incorrect! Try again.
24What is the acronym ETL short for?
A.Enter, Transform, List
B.Evaluate, Transfer, Load
C.Extract, Test, Lock
D.Extract, Transform, Load
Correct Answer: Extract, Transform, Load
Explanation:
ETL is the process of extracting data from source systems, transforming it to fit the warehouse schema, and loading it into the warehouse.
Incorrect! Try again.
25In the context of ETL, 'Transformation' includes:
A.Backing up data to tape
B.Installing database software
C.Cleaning, filtering, and aggregating data
D.Giving users read access
Correct Answer: Cleaning, filtering, and aggregating data
Explanation:
Transformation involves changing data into a proper format, cleaning errors, and aggregating it for storage.
Incorrect! Try again.
26What is an ODS in Data Warehousing architecture?
A.Online Decision System
B.Operational Data Store
C.Offline Data Storage
D.Object Data Source
Correct Answer: Operational Data Store
Explanation:
An Operational Data Store (ODS) is a database designed to integrate data from multiple sources for additional operations on the data, often used for near real-time reporting.
Incorrect! Try again.
27Which utility is primarily used to refresh a data warehouse with new data?
A.Index Creator
B.Query Manager
C.Security Manager
D.Data Loading Utility
Correct Answer: Data Loading Utility
Explanation:
Data loading utilities are used to insert new data into the warehouse during the ETL process.
Incorrect! Try again.
28What is the 'Grain' of a fact table?
A.The level of detail represented by a single row
B.The number of columns
C.The size of the table on disk
D.The type of database used
Correct Answer: The level of detail represented by a single row
Explanation:
Granularity or 'Grain' refers to the level of detail of the data stored (e.g., daily sales vs. monthly sales).
Incorrect! Try again.
29Which schema results in the highest performance for queries due to fewer joins?
A.Hierarchical Schema
B.Star Schema
C.Snowflake Schema
D.Normalized Schema
Correct Answer: Star Schema
Explanation:
The Star Schema uses denormalized dimension tables, requiring fewer joins than a Snowflake schema, usually resulting in faster query performance.
Incorrect! Try again.
30What type of metadata allows tools to understand how to navigate the data (e.g., mapping source to target)?
A.Business Metadata
B.Social Metadata
C.Administrative Metadata
D.Technical/Operational Metadata
Correct Answer: Technical/Operational Metadata
Explanation:
Technical metadata includes details about data lineage, mapping, transformation rules, and storage structures.
Incorrect! Try again.
31A 'Dependent Data Mart' is sourced from:
A.External flat files only
B.User input
C.Directly from operational systems
D.The Enterprise Data Warehouse
Correct Answer: The Enterprise Data Warehouse
Explanation:
Dependent data marts draw their data from a central Enterprise Data Warehouse, ensuring consistency.
Incorrect! Try again.
32An 'Independent Data Mart' is sourced from:
A.Another Data Mart
B.The backup server
C.Directly from operational source systems
D.The Enterprise Data Warehouse
Correct Answer: Directly from operational source systems
Explanation:
Independent data marts are built directly from source systems without a central DW, often leading to 'islands of data'.
Incorrect! Try again.
33Which of the following is a security concern in Data Warehousing?
A.Ensuring data is not time-variant
B.Reducing data redundancy
C.Controlling access to sensitive aggregated data
D.Keeping data volatile
Correct Answer: Controlling access to sensitive aggregated data
Explanation:
Security in DW involves ensuring that only authorized users can access specific aggregated data or detailed sensitive information.
Incorrect! Try again.
34What is the purpose of a 'Surrogate Key' in a dimension table?
A.To store the date
B.To replace the natural primary key with an internal system integer
C.To encrypt the data
D.To link to the internet
Correct Answer: To replace the natural primary key with an internal system integer
Explanation:
Surrogate keys are system-generated unique identifiers used to handle slowly changing dimensions and decouple the warehouse from source system key changes.
Incorrect! Try again.
35In the Three-Tier Data Warehouse Architecture, what constitutes the bottom tier?
A.Front-end tools
B.Data Warehouse Database Server
C.Web Interface
D.OLAP Server
Correct Answer: Data Warehouse Database Server
Explanation:
The bottom tier is the database server that houses the actual data warehouse.
Incorrect! Try again.
36In the Three-Tier Data Warehouse Architecture, what is the middle tier?
A.Data Mining Tools
B.OLAP Server
C.Reporting Tools
D.Relational Database
Correct Answer: OLAP Server
Explanation:
The middle tier is typically an OLAP server (either ROLAP or MOLAP) that provides the analytical view of the data.
Incorrect! Try again.
37Which of the following is a benefit of Data Warehousing?
A.Increases transactional processing speed
B.Enhances business intelligence and decision making
C.Reduces storage requirements
D.Replaces the need for backups
Correct Answer: Enhances business intelligence and decision making
Explanation:
The primary benefit is providing consolidated, historical data that supports better business analysis and decision-making.
Incorrect! Try again.
38What is 'Data Scrubbing'?
A.Cleaning data to remove inconsistencies and errors
B.Deleting data permanently
C.Compressing data
D.Merging data marts
Correct Answer: Cleaning data to remove inconsistencies and errors
Explanation:
Data scrubbing (or cleansing) involves detecting and correcting (or removing) corrupt or inaccurate records.
Incorrect! Try again.
39The operation of rotating the data axes in an OLAP cube to view data from different perspectives is called:
A.Dice
B.Slice
C.Drill-up
D.Pivot
Correct Answer: Pivot
Explanation:
Pivoting (or rotating) changes the dimensional orientation of the report or page display.
Incorrect! Try again.
40Which backup strategy involves copying the entire data warehouse?
A.Differential Backup
B.Partial Backup
C.Full Backup
D.Incremental Backup
Correct Answer: Full Backup
Explanation:
A Full Backup copies every piece of data in the system, regardless of whether it has changed.
Incorrect! Try again.
41Why is backup crucial for a Data Warehouse despite it being a read-mostly system?
A.To increase query speed
B.Data is volatile
C.To recover from hardware failure or data corruption
D.To allow write operations
Correct Answer: To recover from hardware failure or data corruption
Explanation:
Even though DWs are read-heavy, rebuilding a massive warehouse from sources can take days or weeks; backups ensure quick recovery.
Incorrect! Try again.
42A 'Virtual Data Warehouse' is:
A.A simulation of a database
B.An empty database
C.A cloud-based physical storage
D.A set of views over operational databases
Correct Answer: A set of views over operational databases
Explanation:
A Virtual Data Warehouse allows users to query operational systems directly via a middleware view, without physically storing aggregated data separately.
Incorrect! Try again.
43Which tool category allows end-users to create reports and graphs from the Data Warehouse?
A.Database Management Systems
B.ETL Tools
C.Operating Systems
D.Data Access/Reporting Tools
Correct Answer: Data Access/Reporting Tools
Explanation:
These tools (like Tableau, PowerBI, Cognos) reside in the top tier and provide the interface for users to visualize data.
Incorrect! Try again.
44In a Fact Table, a 'Measure' that can be summed across all dimensions is called:
A.Semi-additive
B.Non-additive
C.Subtractive
D.Additive
Correct Answer: Additive
Explanation:
Additive measures (e.g., Sales Revenue) can be summed up across any dimension (time, geography, product).
Incorrect! Try again.
45A 'Semi-additive' measure is one that:
A.Cannot be summed at all
B.Can be summed across all dimensions
C.Can be summed across some dimensions but not others
D.Is always text
Correct Answer: Can be summed across some dimensions but not others
Explanation:
An example is 'Inventory Balance', which can be summed across products but not across time (adding inventory of Jan + Feb is meaningless).
Incorrect! Try again.
46Which utility is used to monitor the performance of the Data Warehouse?
A.System/Performance Monitor
B.Load Manager
C.Warehouse Manager
D.Query Manager
Correct Answer: System/Performance Monitor
Explanation:
System monitors track usage, query response times, and resource utilization.
Incorrect! Try again.
47The define-measure-analyze-improve-control cycle is often supported by data from:
A.Network Routers
B.Data Warehouses
C.Text Editors
D.Web Browsers
Correct Answer: Data Warehouses
Explanation:
Data Warehouses provide the metrics and historical data required for process improvement methodologies like Six Sigma.
Incorrect! Try again.
48What is the specific challenge of 'Data Marting' when done independently?
A.It is too expensive
B.It is too slow
C.It requires too much storage
D.It creates inconsistent 'islands of information'
Correct Answer: It creates inconsistent 'islands of information'
Explanation:
Independent data marts often use different definitions for the same metrics, leading to inconsistent data across the enterprise.
Incorrect! Try again.
49Which of the following best describes 'Business Metadata'?
A.Programming code for ETL
B.Physical database schema details
C.Log files of server activity
D.Terms, definitions, and ownership info for business users
Correct Answer: Terms, definitions, and ownership info for business users
Explanation:
Business metadata translates technical data into business terminology (e.g., definition of 'Gross Profit').
Incorrect! Try again.
50In the context of Informatica and DW, what is a 'Repository'?
A.The hardware where data is stored
B.The operational source system
C.The final report generated
D.A central database storing metadata and transformation rules
Correct Answer: A central database storing metadata and transformation rules
Explanation:
The Repository is the heart of ETL tools (like Informatica), storing the logic, mappings, and metadata for the data warehousing process.