Unit 1 - Subjective Questions
CSE121 • Practice Questions with Detailed Answers
Define Data Science and explain why there is a growing need for it in the modern industry.
Definition:
Data Science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data. It combines expertise from statistics, mathematics, computer science, and domain knowledge.
Need for Data Science:
- Data Explosion: With the advent of the internet and IoT, data is generated at an exponential rate. Traditional tools cannot handle this volume.
- Decision Making: Companies need data-driven insights to make informed strategic decisions rather than relying on intuition.
- Unstructured Data: More than 80% of today's data is unstructured (images, videos, emails). Data Science provides the tools to process this.
- Predictive Analytics: It allows businesses to predict future trends (e.g., stock market changes, customer churn) using historical data.
What is Big Data? Describe its characteristics using the 3Vs model.
Big Data refers to datasets that are so voluminous, fast-moving, or complex that traditional data processing software is inadequate to deal with them.
The 3Vs of Big Data:
- Volume: Refers to the sheer size of the data generated. Sources include social media, sensors, and transactions. Size units range from Terabytes to Zettabytes.
- Velocity: Refers to the speed at which data is generated and processed. Real-time data processing (e.g., stock trading, fraud detection) requires high velocity.
- Variety: Refers to the different types of data. This includes:
- Structured: Database tables (SQL).
- Semi-structured: XML, JSON.
- Unstructured: Audio, video, text files.
Explain the Data Science Lifecycle in detail with a relevant Use Case.
The Data Science Lifecycle consists of the following phases:
- Discovery: Understanding the problem statement, business requirements, and available resources.
- Data Preparation: Cleaning, transforming, and conditioning raw data. This involves handling missing values and outliers.
- Model Planning: Determining the methods and techniques to draw relationships between variables. Tools like R or Python are selected here.
- Model Building: Developing datasets for training and testing. Executing the model using algorithms (e.g., Regression, Clustering).
- Operationalize: Delivering final reports, code, and technical documents. Deploying the model into a production environment.
- Communicate Results: Presenting the findings to stakeholders to verify if the project goal was met.
Use Case: Churn Prediction in Telecom
- Discovery: Goal is to identify customers likely to switch to a competitor.
- Prep: Cleaning call logs and billing history.
- Model: Using Logistic Regression to predict probability.
- Result: Offering discounts to high-risk customers to retain them.
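As an illustrative sketch of the Model step above: a trained logistic regression reduces to a weighted sum of features passed through a sigmoid. The feature names, weights, and bias below are invented for illustration, not taken from a real telecom model:

```python
import math

def churn_probability(features, weights, bias):
    # Logistic regression scoring: p = sigmoid(bias + w . x)
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1 / (1 + math.exp(-z))

# Hypothetical model: features = [complaints_last_month, months_left_on_contract]
weights = [0.8, -0.3]
bias = -1.0

p = churn_probability([4, 2], weights, bias)  # many complaints, short contract
# p is about 0.83, so this customer would be flagged for a retention offer
```

Customers whose predicted probability crosses a chosen threshold (say 0.5) are the "high-risk" group that receives the retention discount.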
Discuss the significant challenges associated with implementing Big Data solutions.
Implementing Big Data solutions comes with several challenges:
- Data Quality: Dealing with dirty, inconsistent, or missing data requires significant effort in data cleaning.
- Storage and Processing: The sheer volume requires scalable storage (like HDFS) and processing power, which can be expensive and complex to manage.
- Security and Privacy: Protecting sensitive user data (PII) against breaches is critical, especially with regulations like GDPR.
- Skill Shortage: There is a gap in the availability of skilled data scientists and engineers proficient in tools like Hadoop and Spark.
- Data Integration: Combining data from disparate sources (e.g., social media vs. legacy SQL databases) is technically difficult.
Explain the role of Apache Hadoop in Big Data and list its core components.
Apache Hadoop is an open-source framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines.
Core Components:
- HDFS (Hadoop Distributed File System): The storage layer. It splits files into blocks and distributes them across nodes for redundancy and fault tolerance.
- MapReduce: The processing layer. It processes data in two phases: Map (filtering/sorting) and Reduce (summary operations).
- YARN (Yet Another Resource Negotiator): Handles resource management and job scheduling.
- Hadoop Common: Common utilities and libraries that support other modules.
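The MapReduce idea can be sketched in a few lines of plain Python. This toy word count mimics the Map, shuffle, and Reduce phases in a single process; real MapReduce runs each phase distributed across the cluster's nodes:

```python
from collections import defaultdict

def map_phase(lines):
    # Map: emit (key, value) pairs -- here, (word, 1) for every word
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle: group all values by key (done by the framework in Hadoop)
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: summarize each group -- here, sum the counts
    return {key: sum(values) for key, values in groups.items()}

lines = ["Hadoop stores data", "Hadoop processes data"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts["hadoop"], counts["data"])  # 2 2
```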
How is Tableau utilized in the field of Data Science?
Tableau is a leading Data Visualization and Business Intelligence (BI) tool.
- Visual Analytics: It converts raw data into interactive dashboards, graphs, and maps without requiring deep programming knowledge.
- Exploratory Data Analysis (EDA): Data scientists use Tableau to quickly spot trends, outliers, and patterns in the discovery phase.
- Communication: It helps in the 'Communicate Results' phase of the lifecycle by presenting complex insights to non-technical stakeholders in an understandable format.
- Connectivity: It connects easily to various data sources, including Excel, SQL databases, and cloud services.
Compare Structured, Semi-Structured, and Unstructured data with examples.
| Feature | Structured Data | Semi-Structured Data | Unstructured Data |
|---|---|---|---|
| Definition | Highly organized, fixed format. | Contains tags/markers but no rigid schema. | No specific format or organization. |
| Storage | Relational Databases (RDBMS). | XML/JSON files, NoSQL databases. | Data Lakes, Flat files. |
| Ease of Search | Very easy (with indexing). | Moderate. | Difficult, requires parsing. |
| Examples | SQL tables, Excel spreadsheets. | JSON, XML, CSV, HTML. | Images, Audio, Video, Emails, PDF. |
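To make the comparison concrete, the sketch below reads one semi-structured JSON record with Python's standard library; the field names are invented for illustration. The tags make fields addressable by name even though records need not share a rigid schema:

```python
import json

# Semi-structured: self-describing tags, but no fixed schema --
# a second record could add or omit fields without breaking anything.
record = '{"id": 1, "name": "Asha", "interests": ["cricket", "music"]}'
data = json.loads(record)

print(data["name"])            # fields are reachable by tag: Asha
print(len(data["interests"]))  # nested values are allowed: 2
```

Unstructured data (an image or audio file) offers no such tags, which is why it needs specialized parsing or ML models before it can be queried.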
Why is the R Language popular in Data Science?
R is a programming language and software environment specifically designed for statistical computing and graphics.
Reasons for Popularity:
- Statistical Analysis: It has a vast ecosystem of packages (like dplyr, ggplot2) for complex statistical modeling and testing.
- Visualization: R provides advanced graphical capabilities for creating high-quality plots.
- Open Source: It is free to use and has a massive community support base.
- Machine Learning: R supports various ML algorithms (linear regression, decision trees, clustering).
- Data Wrangling: It is highly efficient in cleaning and manipulating datasets.
Discuss the relationship between Cloud Computing and Big Data. What are the benefits of using the Cloud for Big Data?
Big Data and Cloud Computing are complementary technologies. Cloud computing provides the infrastructure required to store and process Big Data.
Benefits:
- Scalability: Cloud providers (AWS, Azure, Google Cloud) allow instant scaling of storage and processing power (Scale-up or Scale-out) based on data volume.
- Cost-Effectiveness: The Pay-as-you-go model eliminates the need for huge upfront capital investment in physical servers.
- Maintenance: The cloud provider manages hardware maintenance, allowing data scientists to focus on analysis.
- Accessibility: Data can be accessed from anywhere, facilitating remote collaboration.
- Tool Integration: Cloud platforms offer built-in Big Data tools (e.g., Amazon EMR, Google BigQuery).
Differentiate between the job roles of a Data Scientist and a Data Engineer.
Data Scientist:
- Focus: Analyzing data to find insights, building predictive models, and decision-making.
- Skills: Statistics, Machine Learning, R/Python, Visualization (Tableau), Communication.
- Goal: "What does this data tell us about the future?"
Data Engineer:
- Focus: Building and maintaining the architecture (pipelines) that allows data to be collected and stored.
- Skills: SQL, NoSQL, Hadoop, Spark, ETL (Extract, Transform, Load) processes, Cloud infrastructure.
- Goal: "How do I get this data to the Data Scientist reliably?"
Describe the core skills required for a professional to succeed in the field of Big Data.
A successful Big Data professional requires a mix of technical and soft skills:
- Programming: Proficiency in languages like Python, R, Java, or Scala.
- Database Knowledge: Mastery of SQL (for structured data) and NoSQL (MongoDB, Cassandra for unstructured data).
- Big Data Frameworks: Understanding of Hadoop Ecosystem (HDFS, Hive, Pig) and Apache Spark.
- Mathematical Skills: Linear algebra, calculus, and probability/statistics.
- Data Mining & ML: Knowledge of algorithms for classification, regression, and clustering.
- Problem Solving: Ability to translate business challenges into technical data solutions.
Is Microsoft Excel still relevant in the era of Big Data? Justify your answer.
Yes, Excel remains relevant, though its role has shifted.
- Relevance:
- Quick Analysis: For smaller subsets of data, Excel is faster for ad-hoc analysis than writing code.
- Ubiquity: It is the universal language of business; most stakeholders understand Excel spreadsheets.
- Data Entry/Cleaning: It is often the first step in viewing raw CSV files.
- Limitations: It cannot handle Big Data volumes (the worksheet row limit is 1,048,576 rows) and lacks the processing power for complex ML algorithms.
- Conclusion: It is a complementary tool for summary and presentation, but not a replacement for Big Data processing tools like Hadoop.
Elaborate on the applications of Data Science in the Healthcare sector.
Data Science has revolutionized healthcare in several ways:
- Medical Image Analysis: Algorithms can detect tumors or anomalies in X-rays and MRIs with high accuracy.
- Drug Discovery: Simulating how drugs interact with biological proteins to speed up the development of new medicines.
- Predictive Medicine: Analyzing patient history to predict disease outbreaks or individual health deterioration (e.g., diabetes risk).
- Virtual Assistants: AI-powered bots providing basic medical support and appointment scheduling.
- Genomics: Analyzing genetic data to provide personalized treatment plans.
What is Veracity and Value in the context of Big Data (expanding the 3Vs to 5Vs)?
While Volume, Velocity, and Variety are the primary characteristics, Veracity and Value are crucial additions:
- Veracity: Refers to the trustworthiness or quality of the data. Big Data often contains noise, biases, and abnormalities. If data is not accurate (low veracity), the insights derived will be flawed.
- Value: Refers to the business worth derived from the data. Having petabytes of data is useless unless it can be turned into actionable insights that generate revenue, save costs, or improve customer experience. Value is the ultimate goal of Big Data analytics.
Discuss the Data Preparation phase of the Data Science Lifecycle. Why is it considered the most time-consuming phase?
Data Preparation (or Data Wrangling) involves converting raw data into a clean, usable format.
Steps involved:
- Data Cleaning: Removing duplicates, correcting errors, and fixing inconsistent formatting.
- Handling Missing Data: Deciding whether to drop rows or impute values (e.g., replacing nulls with the column mean).
- Transformation: Normalization or scaling features to a standard range.
Why it is time-consuming:
It often consumes 60-80% of a project's time because real-world data is messy. Algorithms require high-quality input ("Garbage In, Garbage Out"), so meticulous cleaning is essential to ensure model accuracy.
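The cleaning steps listed above (mean imputation, then scaling) can be sketched in plain Python; a real pipeline would typically use a library such as pandas, and mean imputation is only one of several reasonable choices:

```python
def prepare(values):
    # Impute: replace missing entries (None) with the mean of observed values
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    filled = [mean if v is None else v for v in values]
    # Transform: min-max scaling so every value lands in the range [0, 1]
    lo, hi = min(filled), max(filled)
    return [(v - lo) / (hi - lo) for v in filled]

print(prepare([10, None, 30, 20]))  # [0.0, 0.5, 1.0, 0.5]
```

Even this toy column needed three distinct decisions (what counts as missing, how to fill it, how to scale), which hints at why the phase dominates project time.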
Explain the concept of HDFS (Hadoop Distributed File System) architecture.
HDFS is a master/slave architecture designed to store large files across multiple machines.
- NameNode (Master): The centerpiece of HDFS. It manages the file system namespace and metadata (which blocks make up which file and where they are located). It does not store actual data.
- DataNodes (Slaves): These are the worker nodes that store the actual data blocks. They perform read/write requests from the file system clients.
- Block Storage: Files are split into large blocks (default 128MB) and replicated (default 3x) across different DataNodes to prevent data loss if a node fails.
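The block arithmetic implied above is easy to check. This small helper (written for illustration, not part of any Hadoop API) computes how many blocks a file occupies and how many copies the cluster stores under the default 128 MB block size and 3x replication:

```python
import math

def hdfs_block_count(file_size_mb, block_mb=128, replication=3):
    # Files are split into fixed-size blocks; each block is stored
    # `replication` times on different DataNodes for fault tolerance.
    blocks = math.ceil(file_size_mb / block_mb)
    return blocks, blocks * replication

print(hdfs_block_count(500))  # a 500 MB file -> (4, 12): 4 blocks, 12 stored copies
```

The NameNode keeps only the metadata mapping (file -> blocks -> DataNodes); the 12 physical copies live on the DataNodes themselves.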
How is Data Science applied in the E-commerce and Retail industry? Provide examples.
Retailers use Data Science to optimize operations and enhance customer experience:
- Recommendation Engines: Amazon and Netflix use collaborative filtering algorithms to suggest products based on user history (a user bought item X, item Y is similar to X, so recommend Y to that user).
- Market Basket Analysis: Analyzing purchase patterns to find items frequently bought together (e.g., Bread and Butter) to optimize store layout.
- Price Optimization: Dynamic pricing algorithms adjust prices in real-time based on demand, competitor prices, and inventory.
- Inventory Management: Predicting demand to prevent stockouts or overstocking scenarios.
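A minimal sketch of Market Basket Analysis: counting how often each pair of items co-occurs across baskets is the first step toward association rules like "Bread and Butter". The baskets below are toy data:

```python
from collections import Counter
from itertools import combinations

def pair_counts(baskets):
    # For every unordered item pair, count how many baskets contain both
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

baskets = [["bread", "butter", "milk"], ["bread", "butter"], ["milk", "eggs"]]
counts = pair_counts(baskets)
print(counts[("bread", "butter")])  # 2 -- a candidate for shelf placement together
```

Production systems refine this raw co-occurrence count into support, confidence, and lift scores before acting on it.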
What are the key differences between Traditional Data and Big Data?
Traditional Data:
- Volume: Small to Medium (Gigabytes).
- Format: Mostly Structured (Relational databases).
- Source: Centralized, internal sources (ERP, CRM).
- Processing: Centralized server processing.
Big Data:
- Volume: Massive (Terabytes to Petabytes).
- Format: Structured, Semi-structured, and Unstructured.
- Source: Distributed, external sources (Social media, IoT, Sensors).
- Processing: Distributed processing (Clusters, Hadoop).
Define the role of a Data Architect.
A Data Architect is responsible for visualizing and designing an organization's enterprise data management framework.
- Responsibilities:
- Designing data models and database structures.
- Defining data standards and principles.
- Ensuring the security and stability of the data architecture.
- Collaborating with Data Engineers to implement the underlying infrastructure.
- They focus on the blueprint of how data is stored, consumed, and integrated across the organization.
Why is Machine Learning often integrated with Data Science?
Data Science is the broad field of extracting insights, while Machine Learning (ML) is a tool used within that field to automate predictive analysis.
- Automation: ML algorithms can automatically learn patterns from data without being explicitly programmed for every rule.
- Prediction: While basic data analysis explains what happened, ML allows Data Scientists to predict what will happen.
- Complexity: ML can handle high-dimensional data that is too complex for human analysis or simple statistical formulas.
- Example: A Data Scientist uses ML to build a spam filter that adapts to new types of spam emails automatically.
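The spam-filter example can be sketched as a tiny Naive Bayes classifier in plain Python. The training sentences here are made up, and a production filter would use a library such as scikit-learn with far more data; the point is that the rules are learned from labelled examples rather than hand-written:

```python
import math
from collections import Counter

def train(examples):
    """examples: list of (text, label) pairs, label is 'spam' or 'ham'."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        n = sum(word_counts[label].values())
        vocab = len(word_counts[label]) + 1
        score = math.log(label_counts[label] / total)  # log prior
        for word in text.lower().split():
            # Laplace smoothing so an unseen word never zeroes out a class
            score += math.log((word_counts[label][word] + 1) / (n + vocab))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

wc, lc = train([("win money now", "spam"), ("free money offer", "spam"),
                ("meeting at noon", "ham"), ("lunch at noon today", "ham")])
print(classify("free money", wc, lc))     # spam
print(classify("lunch at noon", wc, lc))  # ham
```

Retraining on new labelled emails updates the word counts, which is exactly how the filter "adapts to new types of spam" without anyone rewriting rules.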