Unit 2 - Practice Quiz

INT323 50 Questions

1 What is the primary definition of Data Integration?

A. The process of encrypting data for security
B. The process of backing up data to the cloud
C. The process of combining data from different sources into a unified view
D. The process of deleting duplicate data from a database

2 Which of the following is a key driver of Data Integration in an enterprise?

A. To facilitate business intelligence and decision-making
B. To decrease the speed of the network
C. To isolate departmental data silos
D. To increase the size of the hard drives used

3 In the context of data integration, what does 'Heterogeneity' refer to?

A. The speed at which data is transferred
B. Data that is exactly the same across all systems
C. The security protocols used for data
D. Differences in hardware, operating systems, and data models across sources

4 What does ETL stand for?

A. Enter, Transfer, Load
B. Extract, Test, Lock
C. Extract, Transform, Load
D. Execute, Transmit, Log

5 In the ELT approach, where does the transformation of data primarily occur?

A. In a separate staging server before loading
B. In the middleware application
C. In the source system
D. In the target data warehouse

6 Which integration approach provides a real-time, unified view of data without physically moving the data to a central repository?

A. Data Virtualization (Federation)
B. Data Warehousing
C. Tape Backup
D. Manual Data Entry

7 What is 'Semantic Heterogeneity' in data integration?

A. Using different operating systems
B. Differences in file compression formats
C. Differences in the meaning or interpretation of data (e.g., synonyms, homonyms)
D. Using different network cables

8 Which of the following is an advantage of a Data Warehouse?

A. It slows down analytical queries
B. It is optimized for transaction processing
C. It supports historical analysis and reporting
D. It provides a volatile, changing view of data

9 What creates a 'Single Version of the Truth' (SVOT)?

A. Using Excel spreadsheets
B. Avoiding data cleaning
C. Keeping data in silos
D. Effective Data Integration

10 Which technology is commonly used for Application-based Integration?

A. CD-ROMs
B. Standalone Firewalls
C. Printers
D. Enterprise Application Integration (EAI)

11 What is a Data Warehouse?

A. A subject-oriented, integrated, time-variant, and non-volatile collection of data
B. A physical storage room for hard drives
C. A temporary cache for web browsers
D. A system for recording daily business transactions

12 In the context of a Data Warehouse, what does 'Non-volatile' mean?

A. Data is highly explosive
B. Data disappears when power is lost
C. Data is not changed once loaded; it is read-only for analysis
D. Data is updated in real-time by users

13 What is the primary difference between OLTP and OLAP?

A. OLTP is slower than OLAP
B. OLTP is for analysis; OLAP is for transactions
C. OLTP is for transactions; OLAP is for analysis
D. There is no difference

14 What is a Data Mart?

A. The hardware component of a database
B. A subset of a data warehouse oriented to a specific business line or team
C. A market where data is bought and sold
D. A type of database virus

15 Which type of Data Mart draws data directly from operational sources without a central Data Warehouse?

A. Independent Data Mart
B. Integrated Data Mart
C. Dependent Data Mart
D. Hybrid Data Mart

16 What is a 'Dependent Data Mart'?

A. A mart sourced directly from flat files
B. A mart that cannot function without internet
C. A mart that relies on manual data entry
D. A mart sourced from the central Data Warehouse

17 What is the role of a Staging Area in data integration?

A. To hold data temporarily for processing and cleaning before loading into the warehouse
B. To visualize data for the end-user
C. To run the operating system
D. To permanently store data

18 Which dimension of Data Quality refers to how correctly the data represents the real-world object or event?

A. Volume
B. Timeliness
C. Security
D. Accuracy

19 Data Completeness refers to:

A. Whether the data is stored in the cloud
B. Whether the data is encrypted
C. Whether the data format is JSON
D. Whether all required data is present

20 Which process involves examining data to understand its structure, content, and quality?

A. Data Transmission
B. Data Encryption
C. Data Profiling
D. Data Compression

21 What is Data Cleansing (Scrubbing)?

A. Removing old hardware
B. The process of detecting and correcting corrupt or inaccurate records
C. Wiping the hard drive clean
D. Organizing cables in the server room

22 What is 'Metadata'?

A. Big data
B. Encrypted data
C. Data about data
D. Mobile data

23 Which approach to Data Warehousing is known as the 'Top-Down' approach?

A. Inmon Approach
B. Agile Approach
C. Waterfall Approach
D. Kimball Approach

24 Which approach to Data Warehousing is known as the 'Bottom-Up' approach?

A. Inmon Approach
B. Cloud Approach
C. Kimball Approach
D. Spiral Approach

25 What is Change Data Capture (CDC)?

A. Changing the database password
B. Identifying and capturing only the data that has changed since the last extraction
C. Capturing data from the internet
D. Taking a photo of the database

26 Which of the following is NOT a benefit of high Data Quality?

A. Improved customer relations
B. Increased operational costs
C. Regulatory compliance
D. Better decision making

27 What is 'Deduplication'?

A. Creating a backup copy
B. Removing duplicate copies of repeating data
C. Doubling the storage capacity
D. Splitting data into two tables

28 The 'Time-variant' characteristic of a Data Warehouse implies that:

A. Queries must be run within a specific time limit
B. The warehouse only operates during business hours
C. Data is stored with a time element (historical perspective)
D. The system clock is synchronized

29 Which technology allows for data integration via standard messages (typically XML or JSON) exchanged over the web?

A. FTP
B. Direct Memory Access
C. Floppy Disks
D. Web Services (SOAP/REST)

30 What is a major disadvantage of 'Manual Data Integration'?

A. Too fast for humans to track
B. Prone to human error and difficult to scale
C. High cost of software tools
D. Requires complex installation

31 In Data Integration, what is a 'Connector' or 'Adapter'?

A. A power supply unit
B. A software component that allows the integration tool to communicate with specific data sources
C. A physical cable
D. A user who connects systems

32 Which of the following refers to 'Data Consistency'?

A. Data is consistently backed up
B. Data values are the same across all systems and copies
C. Data is accessed consistently every day
D. Data is stored on a consistent hardware platform

33 What is 'Latency' in the context of data integration?

A. The time delay between data generation and its availability for use
B. The number of users
C. The cost of the integration tool
D. The size of the data

34 What is 'Subject-Oriented' in the context of Data Warehousing?

A. Organized around storage media
B. Organized around applications (e.g., Payroll app)
C. Organized around major entities like Customer, Product, Sales
D. Organized around file types

35 Which schema is most commonly associated with Data Marts and Warehouses?

A. Star Schema
B. Hierarchical Schema
C. Network Schema
D. XML Schema

36 What is a 'Fact Table' in a Data Warehouse?

A. A table of users
B. A table of system logs
C. A table containing descriptive attributes (text)
D. A table containing quantitative measurements (numbers/metrics)

37 What is a 'Dimension Table'?

A. A table containing measurements
B. A table for metadata only
C. A table containing descriptive attributes (context for facts)
D. A table for temporary calculations

38 Batch processing in data integration means:

A. Processing data in large groups at scheduled intervals
B. Processing data one record at a time instantly
C. Processing data manually
D. Processing data via email

39 Which of the following is a symptom of poor Data Quality?

A. Fast query performance
B. Seamless system integration
C. Marketing mail sent to the wrong addresses
D. Reports are trusted by executives

40 What is 'Granularity' in a Data Warehouse?

A. The level of detail of the data
B. The cost of the storage
C. The texture of the hard drive
D. The security level

41 EII stands for:

A. Enterprise Information Integration
B. Early Information Input
C. Electronic Internet Interface
D. Enterprise Internal Internet

42 Which of the following is considered 'Unstructured Data'?

A. Emails, videos, and social media posts
B. A CSV file
C. An Excel spreadsheet with headers
D. Rows in a SQL database

43 Informatica PowerCenter is primarily used for:

A. Data Integration / ETL
B. Graphic Design
C. Operating System Management
D. Word Processing

44 What is the relationship between Data Governance and Data Quality?

A. Data Governance provides the policies and roles to ensure Data Quality
B. They are unrelated
C. Governance reduces Data Quality
D. Data Quality eliminates the need for Governance

45 Why is a Data Mart often faster to query than a Data Warehouse?

A. It is connected directly to the CPU
B. It does not use indexes
C. It uses better hardware
D. It holds less data and is optimized for specific queries

46 The 'Integrated' characteristic of a Data Warehouse means:

A. Data from various sources is converted to a standard format/naming convention
B. It includes email integration
C. It is integrated with the printer network
D. It is built on a single chip

47 Which comes first in the standard ETL process?

A. Extract
B. Transform
C. Load
D. Analyze

48 What is the purpose of a 'Surrogate Key' in a Data Warehouse?

A. To link to the internet
B. To unlock the server room
C. To provide a unique, system-generated identifier that is used in place of the natural (business) key
D. To encrypt the data

49 What is 'Data Transformation'?

A. Deleting data
B. Moving data from A to B
C. Archiving data
D. Converting data from source format to destination format (e.g., calculation, filtering)

50 Which of the following is a 'Target System' in an ETL flow?

A. The Data Warehouse where data is loaded
B. The flat file containing raw logs
C. The legacy mainframe system
D. The operational database where transactions happen