1. What is the primary definition of Data Integration?
A. The process of deleting duplicate data from a database
B. The process of combining data from different sources into a unified view
C. The process of backing up data to the cloud
D. The process of encrypting data for security
Correct Answer: The process of combining data from different sources into a unified view
Explanation: Data integration involves combining data residing in different sources and providing users with a unified view of them.
2. Which of the following is a key driver/need for Data Integration in an enterprise?
A. To increase the size of the hard drives used
B. To isolate departmental data silos
C. To facilitate business intelligence and decision-making
D. To decrease the speed of the network
Correct Answer: To facilitate business intelligence and decision-making
Explanation: Data integration allows organizations to analyze combined data, leading to better business intelligence and informed decision-making.
3. In the context of data integration, what does 'Heterogeneity' refer to?
A. Data that is exactly the same across all systems
B. Differences in hardware, operating systems, and data models across sources
C. The speed at which data is transferred
D. The security protocols used for data
Correct Answer: Differences in hardware, operating systems, and data models across sources
Explanation: Heterogeneity refers to the diversity in data representation, storage, and semantics across different source systems.
4. What does ETL stand for?
A. Extract, Transform, Load
B. Enter, Transfer, Load
C. Execute, Transmit, Log
D. Extract, Test, Lock
Correct Answer: Extract, Transform, Load
Explanation: ETL stands for Extract, Transform, Load, a traditional approach to data integration.
5. In the ELT approach, where does the transformation of data primarily occur?
A. In the source system
B. In a separate staging server before loading
C. In the target data warehouse
D. In the middleware application
Correct Answer: In the target data warehouse
Explanation: In ELT (Extract, Load, Transform), data is loaded into the target system first, and the transformation happens there using the target's processing power.
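The ETL flow from the questions above can be sketched in miniature. This is an illustrative Python example; the source rows and the transformation rule (dropping records with missing names and standardizing case) are made-up assumptions, not a real pipeline.

```python
# Minimal ETL sketch: Extract -> Transform -> Load.
# Source rows and business rules here are hypothetical examples.

def extract():
    # Extract: pull raw rows from a source system (hard-coded here).
    return [
        {"id": 1, "name": "alice", "amount": 100},
        {"id": 2, "name": None, "amount": 50},   # dirty record
        {"id": 3, "name": "carol", "amount": 75},
    ]

def transform(rows):
    # Transform: apply business rules before loading --
    # drop records with missing names, standardize case.
    return [
        {**row, "name": row["name"].upper()}
        for row in rows
        if row["name"] is not None
    ]

def load(rows, warehouse):
    # Load: write the cleaned rows into the target store.
    warehouse.extend(rows)

warehouse = []
load(transform(extract()), warehouse)
print(warehouse)  # two clean rows: ALICE and CAROL
```

In ELT, by contrast, `load()` would run before `transform()`, and the transformation would execute inside the target warehouse using its own processing power.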
6. Which integration approach provides a real-time, unified view of data without physically moving the data to a central repository?
A. Data Warehousing
B. Data Virtualization (Federation)
C. Manual Data Entry
D. Tape Backup
Correct Answer: Data Virtualization (Federation)
Explanation: Data virtualization or federation leaves data in source systems and retrieves it on-the-fly to create a virtual unified view.
7. What is 'Semantic Heterogeneity' in data integration?
A. Using different operating systems
B. Using different network cables
C. Differences in the meaning or interpretation of data (e.g., synonyms, homonyms)
D. Differences in file compression formats
Correct Answer: Differences in the meaning or interpretation of data (e.g., synonyms, homonyms)
Explanation: Semantic heterogeneity occurs when there is a disagreement about the meaning, interpretation, or intended use of the same or related data.
8. Which of the following is an advantage of a Data Warehouse?
A. It is optimized for transaction processing
B. It provides a volatile, changing view of data
C. It supports historical analysis and reporting
D. It slows down analytical queries
Correct Answer: It supports historical analysis and reporting
Explanation: A Data Warehouse is designed to store historical data to support analysis, reporting, and decision-making.
9. What creates a 'Single Version of the Truth' (SVOT)?
A. Keeping data in silos
B. Effective Data Integration
C. Using Excel spreadsheets
D. Avoiding data cleaning
Correct Answer: Effective Data Integration
Explanation: Effective data integration resolves inconsistencies across sources to provide a single, accurate reference point for data, known as the Single Version of the Truth.
10. Which technology is commonly used for Application-based Integration?
Correct Answer: Enterprise Application Integration (EAI)
Explanation: EAI uses middleware to enable different applications to share data and business processes.
11. What is a Data Warehouse?
A. A system for recording daily business transactions
B. A subject-oriented, integrated, time-variant, and non-volatile collection of data
C. A temporary cache for web browsers
D. A physical storage room for hard drives
Correct Answer: A subject-oriented, integrated, time-variant, and non-volatile collection of data
Explanation: This is the standard definition of a Data Warehouse provided by Bill Inmon.
12. In the context of a Data Warehouse, what does 'Non-volatile' mean?
A. Data disappears when power is lost
B. Data is updated in real-time by users
C. Data is not changed once loaded; it is read-only for analysis
D. Data is highly explosive
Correct Answer: Data is not changed once loaded; it is read-only for analysis
Explanation: Non-volatile means that once data is entered into the warehouse, it is not updated or deleted row-by-row; it is kept for historical comparison.
13. What is the primary difference between OLTP and OLAP?
A. OLTP is for analysis; OLAP is for transactions
B. OLTP is for transactions; OLAP is for analysis
C. OLTP is slower than OLAP
D. There is no difference
Correct Answer: OLTP is for transactions; OLAP is for analysis
Explanation: OLTP (Online Transaction Processing) manages day-to-day operations, while OLAP (Online Analytical Processing) is used for analysis and reporting.
14. What is a Data Mart?
A. A market where data is bought and sold
B. A subset of a data warehouse oriented to a specific business line or team
C. A type of database virus
D. The hardware component of a database
Correct Answer: A subset of a data warehouse oriented to a specific business line or team
Explanation: A Data Mart is a focused version of a data warehouse designed for a specific department (e.g., Sales, HR).
15. Which type of Data Mart draws data directly from operational sources without a central Data Warehouse?
A. Dependent Data Mart
B. Independent Data Mart
C. Hybrid Data Mart
D. Integrated Data Mart
Correct Answer: Independent Data Mart
Explanation: An Independent Data Mart is built directly from source systems without utilizing a central enterprise data warehouse.
16. What is a 'Dependent Data Mart'?
A. A mart sourced from the central Data Warehouse
B. A mart that relies on manual data entry
C. A mart that cannot function without internet
D. A mart sourced directly from flat files
Correct Answer: A mart sourced from the central Data Warehouse
Explanation: Dependent Data Marts receive their data from the Enterprise Data Warehouse, ensuring consistency across the organization.
17. What is the role of a Staging Area in data integration?
A. To permanently store data
B. To visualize data for the end-user
C. To hold data temporarily for processing and cleaning before loading into the warehouse
D. To run the operating system
Correct Answer: To hold data temporarily for processing and cleaning before loading into the warehouse
Explanation: The staging area is an intermediate storage area used for data processing (cleaning, transformation) during the ETL process.
18. Which dimension of Data Quality refers to the data correctly representing the real-world object or event?
A. Timeliness
B. Accuracy
C. Volume
D. Security
Correct Answer: Accuracy
Explanation: Accuracy measures the degree to which data correctly describes the 'real world' object or event being described.
19. Data Completeness refers to:
A. Whether all required data is present
B. Whether the data is encrypted
C. Whether the data is stored in the cloud
D. Whether the data format is JSON
Correct Answer: Whether all required data is present
Explanation: Completeness ensures that there are no missing values or records that are required for the business process.
20. Which process involves examining data to understand its structure, content, and quality?
A. Data Profiling
B. Data Encryption
C. Data Compression
D. Data Transmission
Correct Answer: Data Profiling
Explanation: Data profiling is the process of reviewing source data to understand structure, content, and quality anomalies before integration.
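Data profiling as described above can be sketched with a few summary statistics per column. This is a minimal illustration with a made-up dataset; real profiling tools compute many more metrics (patterns, ranges, key candidates).

```python
from collections import Counter

def profile(rows, column):
    # Data profiling sketch: summarize one column's content and quality --
    # row count, missing values, distinct values, and the mix of types.
    values = [row.get(column) for row in rows]
    non_null = [v for v in values if v is not None]
    return {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "types": dict(Counter(type(v).__name__ for v in non_null)),
    }

# Hypothetical source data with typical quality anomalies:
# inconsistent casing, a missing value, and a wrong type.
rows = [
    {"city": "Cairo"}, {"city": "cairo"}, {"city": None}, {"city": 10},
]
print(profile(rows, "city"))
# {'count': 4, 'nulls': 1, 'distinct': 3, 'types': {'str': 2, 'int': 1}}
```

A profile like this flags exactly the anomalies (nulls, mixed types, near-duplicate values) that the cleansing step must then resolve.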
21. What is Data Cleansing (Scrubbing)?
A. Wiping the hard drive clean
B. The process of detecting and correcting corrupt or inaccurate records
C. Removing old hardware
D. Organizing cables in the server room
Correct Answer: The process of detecting and correcting corrupt or inaccurate records
Explanation: Data cleansing involves identifying incorrect, incomplete, or irrelevant parts of the data and replacing, modifying, or deleting the dirty data.
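A tiny cleansing sketch, assuming a hypothetical record layout and an assumed mapping of inconsistent gender encodings (e.g., m/f vs. 0/1) to one standard; both the field names and the 0/1 interpretation are illustrative, not a standard.

```python
# Data cleansing sketch: detect and correct inconsistent records.
# GENDER_CODES is an assumed rule book; in practice such mappings
# come from the organization's data standards.
GENDER_CODES = {"m": "M", "male": "M", "0": "M",
                "f": "F", "female": "F", "1": "F"}

def cleanse(record):
    fixed = dict(record)
    # Standardize inconsistent gender encodings to one convention.
    raw = str(record.get("gender", "")).strip().lower()
    fixed["gender"] = GENDER_CODES.get(raw)  # None if unrecognized
    # Trim stray whitespace from name fields.
    if isinstance(record.get("name"), str):
        fixed["name"] = record["name"].strip()
    return fixed

print(cleanse({"name": " Alice ", "gender": "female"}))
# {'name': 'Alice', 'gender': 'F'}
```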
22. What is 'Metadata'?
A. Data about data
B. Big data
C. Mobile data
D. Encrypted data
Correct Answer: Data about data
Explanation: Metadata provides information about other data, such as its source, format, owner, and lineage.
23. Which approach to Data Warehousing is known as the 'Top-Down' approach?
A. Kimball Approach
B. Inmon Approach
C. Agile Approach
D. Waterfall Approach
Correct Answer: Inmon Approach
Explanation: Bill Inmon advocates a Top-Down approach: build the Enterprise Data Warehouse first, then build Data Marts from it.
24. Which approach to Data Warehousing is known as the 'Bottom-Up' approach?
A. Kimball Approach
B. Inmon Approach
C. Spiral Approach
D. Cloud Approach
Correct Answer: Kimball Approach
Explanation: Ralph Kimball advocates a Bottom-Up approach: start by building dimensional Data Marts that can be conformed together.
25. What is Change Data Capture (CDC)?
A. Taking a photo of the database
B. Identifying and capturing only the data that has changed since the last extraction
C. Changing the database password
D. Capturing data from the internet
Correct Answer: Identifying and capturing only the data that has changed since the last extraction
Explanation: CDC optimizes integration by only moving data that has been inserted, updated, or deleted, rather than moving the whole database every time.
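One common CDC technique is a high-water mark: remember the latest modification timestamp from the previous run and extract only newer rows. The sketch below assumes a hypothetical `modified_at` column and uses integers as stand-in timestamps; log-based CDC tools work differently but achieve the same effect.

```python
# Change Data Capture sketch using a high-water-mark timestamp.

def extract_changes(rows, last_run):
    # Keep only rows modified after the previous extraction.
    return [r for r in rows if r["modified_at"] > last_run]

source = [
    {"id": 1, "modified_at": 100},
    {"id": 2, "modified_at": 205},
    {"id": 3, "modified_at": 310},
]

last_run = 200                      # high-water mark from previous run
changes = extract_changes(source, last_run)
print([r["id"] for r in changes])   # [2, 3] -- only the changed rows move

# Advance the mark so the next run skips these rows.
last_run = max(r["modified_at"] for r in source)
```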
26. Which of the following is NOT a benefit of high Data Quality?
A. Improved customer relations
B. Increased operational costs
C. Better decision making
D. Regulatory compliance
Correct Answer: Increased operational costs
Explanation: High data quality typically reduces operational costs (less rework, fewer errors). Increased costs are a result of poor data quality.
27. What is 'Deduplication'?
A. Creating a backup copy
B. Removing duplicate copies of repeating data
C. Doubling the storage capacity
D. Splitting data into two tables
Correct Answer: Removing duplicate copies of repeating data
Explanation: Deduplication is a data quality technique to eliminate redundant copies of data.
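A minimal deduplication sketch: collapse records that share a business key (here a normalized email address, an assumption for illustration) and keep one survivor per key. The "keep the most recently updated copy" survivorship rule is also an assumed choice.

```python
# Deduplication sketch: one surviving record per normalized key.

def deduplicate(rows, key="email"):
    best = {}
    for row in rows:
        k = row[key].strip().lower()          # normalize the match key
        if k not in best or row["updated"] > best[k]["updated"]:
            best[k] = row                     # survivorship: keep newest
    return list(best.values())

rows = [
    {"email": "a@x.com", "name": "Ann",  "updated": 1},
    {"email": "A@X.COM", "name": "Anne", "updated": 2},  # duplicate of Ann
    {"email": "b@x.com", "name": "Bob",  "updated": 1},
]
print(deduplicate(rows))
# two records survive: 'Anne' (the newer duplicate) and 'Bob'
```

Real deduplication often also needs fuzzy matching (typos, nicknames), which this exact-key sketch deliberately leaves out.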
28. The 'Time-variant' characteristic of a Data Warehouse implies that:
A. The warehouse only operates during business hours
B. Data is stored with a time element (historical perspective)
C. Queries must be run within a specific time limit
D. The system clock is synchronized
Correct Answer: Data is stored with a time element (historical perspective)
Explanation: Time-variant means every unit of data in the warehouse is relevant to a specific moment in time, allowing historical trend analysis.
29. Which technology allows for data integration via standard XML-based messages over the web?
A. Web Services (SOAP/REST)
B. FTP
C. Floppy Disks
D. Direct Memory Access
Correct Answer: Web Services (SOAP/REST)
Explanation: Web services provide a standard means of interoperating between different software applications running on a variety of platforms using XML/JSON.
30. What is a major disadvantage of 'Manual Data Integration'?
A. High cost of software tools
B. Prone to human error and difficult to scale
C. Requires complex installation
D. Too fast for humans to track
Correct Answer: Prone to human error and difficult to scale
Explanation: Manual integration (copy-pasting or manual entry) is slow, unscalable, and very likely to introduce errors.
31. In Data Integration, what is a 'Connector' or 'Adapter'?
A. A physical cable
B. A software component that allows the integration tool to communicate with specific data sources
C. A user who connects systems
D. A power supply unit
Correct Answer: A software component that allows the integration tool to communicate with specific data sources
Explanation: Connectors/Adapters are drivers or software modules that enable connection to specific databases (e.g., Oracle, SAP, Salesforce).
32. Which of the following refers to 'Data Consistency'?
A. Data is stored on a consistent hardware platform
B. Data values are the same across all systems and copies
C. Data is consistently backed up
D. Data is accessed consistently every day
Correct Answer: Data values are the same across all systems and copies
Explanation: Consistency means that data across the enterprise is synchronized and does not contradict itself.
33. What is 'Latency' in the context of data integration?
A. The size of the data
B. The time delay between data generation and its availability for use
C. The cost of the integration tool
D. The number of users
Correct Answer: The time delay between data generation and its availability for use
Explanation: Latency refers to the lag time. Low latency implies near real-time integration; high latency implies batch processing.
34. What is 'Subject-Oriented' in the context of Data Warehousing?
A. Organized around applications (e.g., Payroll app)
B. Organized around major entities like Customer, Product, Sales
C. Organized around file types
D. Organized around storage media
Correct Answer: Organized around major entities like Customer, Product, Sales
Explanation: Subject-oriented means the data is grouped by business subjects (Customer, Product) rather than by the software application that generated it.
35. Which schema is most commonly associated with Data Marts and Warehouses?
A. Star Schema
B. XML Schema
C. Network Schema
D. Hierarchical Schema
Correct Answer: Star Schema
Explanation: The Star Schema is the simplest style of data mart schema, consisting of one or more fact tables referencing dimension tables.
36. What is a 'Fact Table'?
Correct Answer: A table containing quantitative measurements (numbers/metrics)
Explanation: Fact tables contain the metrics, measurements, or facts of a business process (e.g., sales amount, quantity sold).
37. What is a 'Dimension Table'?
A. A table containing measurements
B. A table containing descriptive attributes (context for facts)
C. A table for temporary calculations
D. A table for metadata only
Correct Answer: A table containing descriptive attributes (context for facts)
Explanation: Dimension tables contain descriptive attributes (e.g., Customer Name, Product Category) that provide context to the facts.
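The star schema, fact table, and dimension table from the last few questions fit together as below. The table and column names (`fact_sales`, `dim_product`) are illustrative, but the shape is the standard one: quantitative measures in the fact table, descriptive context in the dimension, joined on a key.

```python
import sqlite3

# Star schema sketch: a sales fact table joined to a product dimension.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY,
                              category    TEXT);       -- descriptive attribute
    CREATE TABLE fact_sales  (product_key INTEGER,     -- FK to the dimension
                              amount      REAL);       -- quantitative measure
    INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Games');
    INSERT INTO fact_sales  VALUES (1, 10.0), (1, 5.0), (2, 20.0);
""")

# A typical OLAP query: aggregate the facts, grouped by a dimension attribute.
rows = con.execute("""
    SELECT d.category, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product d ON d.product_key = f.product_key
    GROUP BY d.category
""").fetchall()
print(dict(rows))   # {'Books': 15.0, 'Games': 20.0}
```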
38. Batch processing in data integration means:
A. Processing data one record at a time instantly
B. Processing data in large groups at scheduled intervals
C. Processing data manually
D. Processing data via email
Correct Answer: Processing data in large groups at scheduled intervals
Explanation: Batch processing collects data over a period and processes it all at once (e.g., nightly loads).
39. Which of the following is a symptom of poor Data Quality?
A. Reports are trusted by executives
B. Marketing mail sent to the wrong addresses
C. Fast query performance
D. Seamless system integration
Correct Answer: Marketing mail sent to the wrong addresses
Explanation: Sending mail to the wrong address is a classic example of inaccurate or outdated data.
40. What is 'Granularity' in a Data Warehouse?
A. The level of detail of the data
B. The texture of the hard drive
C. The cost of the storage
D. The security level
Correct Answer: The level of detail of the data
Explanation: Granularity refers to the level of detail or summary of the data (e.g., daily sales vs. monthly sales).
41. EII stands for:
A. Enterprise Information Integration
B. Electronic Internet Interface
C. Early Information Input
D. Enterprise Internal Internet
Correct Answer: Enterprise Information Integration
Explanation: EII is the ability to support a unified view of data and information across the enterprise (often via virtualization).
42. Which of the following is considered 'Unstructured Data'?
A. Rows in a SQL database
B. An Excel spreadsheet with headers
C. Emails, videos, and social media posts
D. A CSV file
Correct Answer: Emails, videos, and social media posts
Explanation: Unstructured data does not have a predefined data model or schema, unlike relational database tables.
43. Informatica PowerCenter is primarily used for:
A. Graphic Design
B. Data Integration / ETL
C. Operating System Management
D. Word Processing
Correct Answer: Data Integration / ETL
Explanation: Informatica PowerCenter is a leading enterprise ETL tool used for data integration.
44. What is the relationship between Data Governance and Data Quality?
A. They are unrelated
B. Data Governance provides the policies and roles to ensure Data Quality
C. Data Quality eliminates the need for Governance
D. Governance reduces Data Quality
Correct Answer: Data Governance provides the policies and roles to ensure Data Quality
Explanation: Governance sets the rules, responsibilities, and standards, while Data Quality is the execution and measurement of those standards.
45. Why is a Data Mart often faster to query than a Data Warehouse?
A. It uses better hardware
B. It holds less data and is optimized for specific queries
C. It does not use indexes
D. It is connected directly to the CPU
Correct Answer: It holds less data and is optimized for specific queries
Explanation: Data Marts contain a subset of data relevant to a specific domain, resulting in smaller volume and optimized structures for that domain's queries.
46. The 'Integrated' characteristic of a Data Warehouse means:
A. It is built on a single chip
B. Data from various sources is converted to a standard format/naming convention
C. It includes email integration
D. It is integrated with the printer network
Correct Answer: Data from various sources is converted to a standard format/naming convention
Explanation: Integration ensures that encoding inconsistencies (e.g., m/f vs. 0/1) are resolved so data is uniform.
47. Which comes first in the standard ETL process?
A. Load
B. Transform
C. Extract
D. Analyze
Correct Answer: Extract
Explanation: The process order is Extract (from source), Transform (clean/format), then Load (into target).
48. What is the purpose of a 'Surrogate Key' in a Data Warehouse?
A. To encrypt the data
B. To replace the natural primary key with a unique internal system identifier
C. To unlock the server room
D. To link to the internet
Correct Answer: To replace the natural primary key with a unique internal system identifier
Explanation: Surrogate keys are system-generated unique keys used in the warehouse to decouple it from changes in source system keys (natural keys).
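Surrogate key assignment can be sketched as a lookup table plus a sequence: the warehouse mints its own integer key the first time a natural key appears and reuses it afterwards. The `CUST-…` natural-key format below is a hypothetical example.

```python
from itertools import count

# Surrogate key sketch: warehouse-generated keys replacing natural keys.
key_seq = count(start=1)   # warehouse-side key generator (a sequence)
key_map = {}               # natural key -> surrogate key lookup

def surrogate_key(natural_key):
    # Reuse the surrogate if this natural key was seen before;
    # otherwise mint a new one. This decouples the warehouse from
    # source-system key formats and changes.
    if natural_key not in key_map:
        key_map[natural_key] = next(key_seq)
    return key_map[natural_key]

print(surrogate_key("CUST-0042"))   # 1
print(surrogate_key("CUST-0099"))   # 2
print(surrogate_key("CUST-0042"))   # 1  (same customer, same key)
```

If the source system later renumbers its customers, only `key_map` needs remapping; every fact row in the warehouse keeps its stable integer key.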
49. What is 'Data Transformation'?
A. Moving data from A to B
B. Converting data from source format to destination format (e.g., calculation, filtering)
C. Deleting data
D. Archiving data
Correct Answer: Converting data from source format to destination format (e.g., calculation, filtering)
Explanation: Transformation involves applying business rules, cleaning, aggregating, or reformatting data.
50. Which of the following is a 'Target System' in an ETL flow?
A. The operational database where transactions happen
B. The Data Warehouse where data is loaded
C. The flat file containing raw logs
D. The legacy mainframe system
Correct Answer: The Data Warehouse where data is loaded
Explanation: In an ETL flow, the source is where data comes from, and the target (or destination) is where data is loaded, typically the Data Warehouse.