Unit 5 - Notes

CSE121

Unit 5: Cloud Computing

1. Introduction to Cloud Computing

Definition

Cloud computing is the on-demand delivery of IT resources—including servers, storage, databases, networking, software, analytics, and intelligence—over the Internet ("the cloud") to offer faster innovation, flexible resources, and economies of scale.

Instead of buying, owning, and maintaining physical data centers and servers, organizations access technology services on an as-needed basis from a cloud provider like Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform (GCP).

Key Characteristics (NIST Definition, SP 800-145)

  1. On-Demand Self-Service: Users can provision computing capabilities (server time, network storage) automatically without requiring human interaction with the service provider.
  2. Broad Network Access: Capabilities are available over the network and accessed through standard mechanisms (mobile phones, tablets, laptops, workstations).
  3. Resource Pooling: The provider’s computing resources are pooled to serve multiple consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned according to demand.
  4. Rapid Elasticity: Capabilities can be elastically provisioned and released to scale rapidly outward and inward commensurate with demand.
  5. Measured Service: Cloud systems automatically control and optimize resource use by leveraging a metering capability (pay-per-use model).
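Rapid elasticity and measured service can be made concrete with a small sketch. The function names, the load percentages, and the hourly rate below are illustrative assumptions, not any provider's API or pricing. The scaling rule is a simple proportional one: grow or shrink the pool so that per-instance load moves toward a target.

```python
import math


def desired_instances(current: int, observed_load_pct: int, target_load_pct: int,
                      min_instances: int = 1, max_instances: int = 10) -> int:
    """Rapid elasticity: size the pool so per-instance load approaches the target."""
    if current < 1 or target_load_pct <= 0:
        raise ValueError("current must be >= 1 and target_load_pct must be > 0")
    wanted = math.ceil(current * observed_load_pct / target_load_pct)
    # Clamp to the configured bounds so the pool never scales to zero or without limit.
    return max(min_instances, min(max_instances, wanted))


def metered_cost(instance_hours: float, rate_per_hour: float) -> float:
    """Measured service: the bill is metered usage multiplied by the unit rate."""
    return round(instance_hours * rate_per_hour, 2)


# Load is twice the target, so the pool scales out from 2 to 4 instances.
scaled_out = desired_instances(current=2, observed_load_pct=90, target_load_pct=45)
# Load falls well below the target, so the pool scales back in to 1 instance.
scaled_in = desired_instances(current=4, observed_load_pct=10, target_load_pct=45)
```

Because billing follows metered usage, scaling back in after a peak directly lowers the bill, which is the practical link between characteristics 4 and 5.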

2. Uses of Cloud Computing in Application Services

Cloud computing has revolutionized how applications are built, deployed, and consumed.

Core Application Uses

  • File Storage and Backup: Storing files remotely to access them from anywhere (e.g., Google Drive, Dropbox, OneDrive). It facilitates disaster recovery and data redundancy.
  • Big Data Analytics: Processing vast amounts of data using scalable cloud processors to derive business intelligence (e.g., consumer behavior analysis).
  • Software Testing and Development: Developers can quickly spin up environments to test code and tear them down immediately after, reducing the cost of physical test labs.
  • Communication and Collaboration: Hosting email servers (Microsoft Exchange Online), video conferencing (Zoom), and collaborative documents (Google Workspace).
  • Streaming Services: Content Delivery Networks (CDNs) in the cloud allow for low-latency streaming of video and audio (e.g., Netflix, Spotify).
  • Disaster Recovery: Replicating systems to a cloud environment to ensure business continuity during a physical site failure.

3. Platform Deployments & Types of Cloud Model Implementations

These are often referred to as Deployment Models. They define who has access to the cloud infrastructure and how it is managed.

A. Public Cloud

  • Description: The cloud infrastructure is provisioned for open use by the general public. It is owned, managed, and operated by a third-party cloud provider.
  • Examples: AWS, Microsoft Azure, Google Cloud.
  • Pros: Cost-effective (no hardware investment), high scalability, no hardware maintenance for the customer.
  • Cons: Less control over security, potential compliance issues for sensitive industries.

B. Private Cloud

  • Description: The cloud infrastructure is provisioned for exclusive use by a single organization comprising multiple consumers (e.g., business units). It may be owned, managed, and operated by the organization, a third party, or some combination of them, and it may exist on or off-premises.
  • Examples: VMware vSphere, OpenStack (hosted internally).
  • Pros: High security, control over data sovereignty, customizable.
  • Cons: Expensive (hardware and maintenance costs), harder to scale than a public cloud.

C. Hybrid Cloud

  • Description: A composition of two or more distinct cloud infrastructures (private, community, or public) that remain unique entities but are bound together by technology that enables data and application portability.
  • Use Case: An organization keeps sensitive customer data on a Private Cloud (on-premises) but uses the Public Cloud for running app logic or "bursting" during high traffic.
  • Pros: Flexibility, optimization of existing infrastructure, security compliance.
  • Cons: Complex network configuration and management.
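The hybrid use case above can be sketched as a placement rule: sensitive workloads must stay on the private cloud, and everything else runs privately while capacity lasts, then "bursts" to the public cloud. The Workload class, the place function, and the capacity units are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    handles_sensitive_data: bool
    cpu_units: int


def place(workloads, private_capacity):
    """Return a mapping of workload name to 'private' or 'public'."""
    placement = {}
    remaining = private_capacity
    # Sensitive workloads are placed first so they are guaranteed private capacity.
    for w in sorted(workloads, key=lambda w: not w.handles_sensitive_data):
        if w.handles_sensitive_data:
            if w.cpu_units > remaining:
                raise RuntimeError(f"no private capacity for sensitive workload {w.name}")
            placement[w.name] = "private"
            remaining -= w.cpu_units
        elif w.cpu_units <= remaining:
            placement[w.name] = "private"
            remaining -= w.cpu_units
        else:
            # Cloud bursting: overflow goes to the public cloud.
            placement[w.name] = "public"
    return placement


placement = place(
    [
        Workload("web-frontend", handles_sensitive_data=False, cpu_units=6),
        Workload("customer-db", handles_sensitive_data=True, cpu_units=4),
        Workload("report-batch", handles_sensitive_data=False, cpu_units=3),
    ],
    private_capacity=8,
)
```

Here customer-db and report-batch fit on the private cloud, while web-frontend exceeds the remaining capacity and bursts to the public cloud.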

D. Community Cloud

  • Description: The infrastructure is shared by several organizations and supports a specific community that has shared concerns (e.g., mission, security requirements, policy, and compliance considerations).
  • Examples: Government clouds, healthcare consortiums.

4. Types of Cloud Services (Service Models)

The "Stack" of cloud computing typically falls into three main categories, known as the SPI Model (Software, Platform, Infrastructure).

IaaS: Infrastructure as a Service

  • Definition: Provides the fundamental building blocks for cloud IT. It offers access to networking features, computers (virtual or on dedicated hardware), and data storage space.
  • User manages: OS, Middleware, Runtime, Data, Applications.
  • Provider manages: Virtualization, Servers, Storage, Networking.
  • Target User: System Administrators, Network Architects.
  • Examples: Amazon EC2, Google Compute Engine, Microsoft Azure Virtual Machines.

PaaS: Platform as a Service

  • Definition: Removes the need for organizations to manage the underlying infrastructure (usually hardware and operating systems) and allows you to focus on the deployment and management of your applications.
  • User manages: Data, Applications.
  • Provider manages: OS, Middleware, Runtime, Virtualization, Servers, Storage, Networking.
  • Target User: Developers.
  • Examples: Google App Engine, AWS Elastic Beanstalk, Heroku, Azure App Service (Web Apps).

SaaS: Software as a Service

  • Definition: Provides a completed product that is run and managed by the service provider. Users typically access the application via a web browser or a thin client app.
  • User manages: Nothing (except application settings/configuration).
  • Provider manages: Everything (Full Stack).
  • Target User: End Users.
  • Examples: Gmail, Salesforce, Dropbox, Zoom, Microsoft Office 365.
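The "User manages" and "Provider manages" lines for the three service models can be collected into one lookup table. This is a minimal sketch of that split; the names LAYERS, USER_MANAGED, and who_manages are illustrative.

```python
# Stack layers, listed from the top (closest to the user) to the bottom.
LAYERS = ["Applications", "Data", "Runtime", "Middleware", "OS",
          "Virtualization", "Servers", "Storage", "Networking"]

# Layers the customer is responsible for in each service model.
USER_MANAGED = {
    "IaaS": {"Applications", "Data", "Runtime", "Middleware", "OS"},
    "PaaS": {"Applications", "Data"},
    "SaaS": set(),  # only application settings, which are not a stack layer
}


def who_manages(model: str, layer: str) -> str:
    """Return 'user' or 'provider' for a layer under a given service model."""
    if model not in USER_MANAGED:
        raise ValueError(f"unknown service model: {model}")
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer}")
    return "user" if layer in USER_MANAGED[model] else "provider"


def provider_managed(model: str) -> list:
    """List the layers the provider manages, in stack order."""
    return [layer for layer in LAYERS if who_manages(model, layer) == "provider"]
```

For example, who_manages("IaaS", "OS") returns "user", while who_manages("PaaS", "OS") returns "provider". Moving from IaaS to SaaS shifts layers from the user to the provider.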

5. Virtualization

Virtualization is the fundamental technology that powers cloud computing. It allows a single physical instance of a resource or an application to be shared among multiple customers and organizations.

How it Works

Virtualization separates the operating system from the underlying hardware. A software layer called a hypervisor sits between the physical hardware and the guests, allowing multiple Virtual Machines (VMs), each with its own guest operating system, to run simultaneously on the same hardware.
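As a rough illustration of that idea, the toy model below treats a hypervisor as a bookkeeper that hands out slices of one physical host to several VMs. It is a sketch under simplifying assumptions: the class and method names are hypothetical, and it does not overcommit resources, which real hypervisors may do.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualMachine:
    name: str
    guest_os: str
    cpu_cores: int
    memory_gb: int


@dataclass
class Hypervisor:
    host_cpu_cores: int
    host_memory_gb: int
    vms: dict = field(default_factory=dict)  # VM name -> VirtualMachine

    def free_resources(self):
        """Return (free CPU cores, free memory in GB) on the physical host."""
        used_cpu = sum(vm.cpu_cores for vm in self.vms.values())
        used_mem = sum(vm.memory_gb for vm in self.vms.values())
        return self.host_cpu_cores - used_cpu, self.host_memory_gb - used_mem

    def start_vm(self, vm: VirtualMachine) -> bool:
        """Start a VM if the host still has room for it; otherwise refuse."""
        if vm.name in self.vms:
            raise ValueError(f"VM {vm.name} is already running")
        free_cpu, free_mem = self.free_resources()
        if vm.cpu_cores > free_cpu or vm.memory_gb > free_mem:
            return False
        self.vms[vm.name] = vm
        return True

    def stop_vm(self, name: str) -> None:
        """Stop a VM and return its resources to the pool."""
        self.vms.pop(name, None)


host = Hypervisor(host_cpu_cores=16, host_memory_gb=64)
started_linux = host.start_vm(VirtualMachine("vm-linux", "Linux", 4, 16))
started_windows = host.start_vm(VirtualMachine("vm-windows", "Windows", 8, 32))
# Only 4 cores remain, so a third VM asking for 8 cores is refused.
started_third = host.start_vm(VirtualMachine("vm-extra", "Linux", 8, 8))
```

Two different guest operating systems share one host here, and the third request is refused because the host has run out of CPU cores.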

Types of Hypervisors

  1. Type 1 (Bare Metal): Installs directly on the hardware. Highly efficient. (e.g., VMware ESXi, Microsoft Hyper-V).
  2. Type 2 (Hosted): Installs on top of an existing Operating System. Typically used for development and testing on a personal machine. (e.g., Oracle VirtualBox, VMware Workstation).

Types of Virtualization

  • Server Virtualization: Partitioning a physical server into smaller virtual servers.
  • Storage Virtualization: Pooling physical storage from multiple network storage devices into what appears to be a single storage device managed from a central console.
  • Network Virtualization: Reproducing a physical network in software, commonly implemented with Software-Defined Networking (SDN).
  • Desktop Virtualization (VDI): Hosting a desktop environment on a central server.

6. Data Analytics in Cloud

Cloud computing provides the processing power and storage scale required for modern data analytics.

Key Concepts

  • Data Lakes: Storing vast amounts of raw data in its native format, whether structured, semi-structured, or unstructured, in the cloud (e.g., Amazon S3, Azure Data Lake Storage).
  • Data Warehousing: Storing structured data optimized for queries and reporting (e.g., Amazon Redshift, Google BigQuery, Snowflake).
  • Cloud Processing: Using scalable compute clusters (like Apache Spark on Hadoop) to process data without buying dedicated high-performance hardware.

Benefits

  • Scalability: Analytics workloads are often "bursty." Cloud allows scaling up resources during heavy processing and scaling down afterward.
  • Cost: Separation of compute and storage reduces costs compared to traditional on-premises appliances, because each can be scaled and billed independently.
  • Machine Learning Integration: Cloud providers offer pre-built ML models and APIs (vision, speech, text) that integrate directly with stored data.
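The scalability and cost points can be illustrated with simple arithmetic. The sketch below compares a cluster sized for peak load and kept running all month with a setup where storage is billed continuously but compute is billed only for the hours it is used. All rates and sizes are placeholder numbers chosen for easy arithmetic, not real provider prices, and the function names are hypothetical.

```python
HOURS_IN_MONTH = 720  # assumes a 30-day month


def coupled_monthly_cost(peak_nodes: int, node_rate_per_hour: int) -> int:
    """Compute and storage bundled: peak-sized nodes run for the whole month."""
    return peak_nodes * node_rate_per_hour * HOURS_IN_MONTH


def decoupled_monthly_cost(stored_tb: int, storage_rate_per_tb: int,
                           peak_nodes: int, node_rate_per_hour: int,
                           busy_hours: int) -> int:
    """Storage is billed all month; compute is billed only while jobs run."""
    storage = stored_tb * storage_rate_per_tb
    compute = peak_nodes * node_rate_per_hour * busy_hours
    return storage + compute


# Placeholder workload: 10 nodes at peak, busy 60 hours a month, 50 TB stored.
always_on = coupled_monthly_cost(peak_nodes=10, node_rate_per_hour=2)
bursty = decoupled_monthly_cost(stored_tb=50, storage_rate_per_tb=20,
                                peak_nodes=10, node_rate_per_hour=2,
                                busy_hours=60)
```

With these placeholder figures the always-on cluster costs 14400 units per month and the bursty setup costs 2200. The gap narrows as busy_hours approaches the full month, so the saving depends on how bursty the workload really is.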

7. Tools and Techniques for Implementing Cloud Computing

Implementing and managing cloud resources requires specific modern tools, often centered around automation.

Infrastructure as Code (IaC)

Managing and provisioning computer data centers through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools.

  • Tools: Terraform (provider agnostic) and AWS CloudFormation for provisioning; Ansible, Chef, and Puppet for configuration management.
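The core idea behind declarative IaC tools can be sketched as "compare the desired state with the actual state, then apply the difference". The code below is a conceptual toy, not how Terraform or CloudFormation is implemented; the plan and apply functions and the resource dictionaries are assumptions made for illustration.

```python
def plan(desired: dict, actual: dict) -> list:
    """Compute the actions needed to turn the actual state into the desired state."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


def apply(desired: dict, actual: dict) -> dict:
    """Carry out the plan against the (simulated) infrastructure."""
    for action, name in plan(desired, actual):
        if action == "delete":
            del actual[name]
        else:  # create or update
            actual[name] = dict(desired[name])
    return actual


# The definition file, expressed here as a plain dictionary.
desired_state = {
    "web": {"size": "small", "count": 2},
    "db": {"size": "large", "count": 1},
}
# What currently exists in the simulated environment.
actual_state = {
    "web": {"size": "small", "count": 1},
    "cache": {"size": "small", "count": 1},
}

first_plan = plan(desired_state, actual_state)
apply(desired_state, actual_state)
second_plan = plan(desired_state, actual_state)
```

The first plan creates db, updates web, and deletes cache. The second plan is empty, which shows the idempotence that makes definition files safe to re-apply.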

Containerization

A lightweight form of virtualization. Instead of virtualizing the hardware (like a VM), containers virtualize the Operating System: they share the host kernel while running as isolated processes. They package code and dependencies together.

  • Tools: Docker (the de facto standard for creating containers).

Orchestration

Managing the lifecycles of containers, especially in large, dynamic environments.

  • Tools: Kubernetes (K8s) - The industry standard for container orchestration; Docker Swarm.
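Lifecycle management in an orchestrator is often described as a reconciliation loop: compare the desired number of containers with what is actually running, then replace failures and add or remove instances. The sketch below simulates that loop in memory. It does not call Kubernetes or Docker, and the Deployment class here is a hypothetical stand-in.

```python
import itertools


class Deployment:
    """In-memory stand-in for a set of containers kept at a desired replica count."""

    def __init__(self, image: str, replicas: int):
        self.image = image
        self.replicas = replicas
        self.running = {}  # container id -> "healthy" or "failed"
        self._ids = itertools.count(1)

    def reconcile(self) -> None:
        """Make the running set match the desired replica count."""
        # Remove containers that have failed.
        for cid in [c for c, state in self.running.items() if state == "failed"]:
            del self.running[cid]
        # Start replacements until the desired count is reached.
        while len(self.running) < self.replicas:
            self.running[f"{self.image}-{next(self._ids)}"] = "healthy"
        # Stop the most recently started containers if there are too many.
        while len(self.running) > self.replicas:
            self.running.popitem()


web = Deployment(image="web", replicas=3)
web.reconcile()                      # starts web-1, web-2, web-3
web.running["web-2"] = "failed"      # simulate a crash
web.reconcile()                      # replaces the failed container with web-4
after_failure = sorted(web.running)
web.replicas = 1                     # scale down
web.reconcile()
after_scale_down = sorted(web.running)
```

The operator only changes the desired state (replicas); the loop decides which containers to start or stop.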

CI/CD (Continuous Integration/Continuous Delivery or Deployment)

Pipelines that automate the testing and deployment of code to the cloud.

  • Tools: Jenkins, GitLab CI, GitHub Actions, AWS CodePipeline.
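A pipeline can be thought of as an ordered list of stages where a failure stops everything after it, so untested code never reaches deployment. The sketch below models that behavior in plain Python; it is not the configuration format of Jenkins, GitLab CI, or any other tool, and the stage names are illustrative.

```python
def run_pipeline(stages):
    """Run (name, step) pairs in order; stop at the first failing stage."""
    results = []
    for name, step in stages:
        try:
            ok = bool(step())
        except Exception:
            ok = False  # a crashing stage counts as a failure
        results.append((name, "passed" if ok else "failed"))
        if not ok:
            break  # later stages, including deploy, are skipped
    return results


deployed = []  # records which releases actually reached deployment


def deploy():
    deployed.append("v1")
    return True


good_run = run_pipeline([
    ("build", lambda: True),
    ("test", lambda: True),
    ("deploy", deploy),
])

bad_run = run_pipeline([
    ("build", lambda: True),
    ("test", lambda: False),  # simulated test failure
    ("deploy", deploy),
])
```

In the second run the deploy stage never executes, so deployed still contains only the release from the first run.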

Monitoring and Logging

Tracking the health and performance of cloud resources.

  • Tools: Prometheus, Grafana, ELK Stack (Elasticsearch, Logstash, Kibana), AWS CloudWatch.
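A common monitoring technique is to alert on a rolling average rather than on single readings, so that one brief spike does not page anyone. The sketch below is a generic illustration; the ThresholdAlert class is hypothetical and unrelated to the Prometheus or CloudWatch APIs.

```python
from collections import deque


class ThresholdAlert:
    """Fires when the average of the last `window` samples exceeds `threshold`."""

    def __init__(self, threshold: float, window: int):
        self.threshold = threshold
        self.samples = deque(maxlen=window)  # old samples drop off automatically

    def observe(self, value: float) -> bool:
        self.samples.append(value)
        window_full = len(self.samples) == self.samples.maxlen
        average = sum(self.samples) / len(self.samples)
        # Stay quiet until enough samples exist to form a full window.
        return window_full and average > self.threshold


cpu_alert = ThresholdAlert(threshold=80, window=3)
readings = [50, 85, 90, 95, 60]  # CPU utilisation in percent
fired = [cpu_alert.observe(r) for r in readings]
```

The alert stays silent for the first three readings (the window average is 75 once full) and fires on the last two, where the averages are 90 and about 81.7.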

8. Job Roles and Skillset for Cloud Computing

As companies migrate to the cloud, the demand for specialized roles has surged.

Key Job Roles

  1. Cloud Architect: Designs the cloud environment and strategy. Decisions include selecting the cloud provider, designing the network topology, and ensuring high availability.
  2. Cloud Engineer: Implements and maintains the cloud infrastructure. Handles migration, maintenance, and troubleshooting.
  3. DevOps Engineer: Bridges the gap between development and operations. Focuses on automation, CI/CD pipelines, and IaC.
  4. Cloud Security Specialist: Ensures the cloud environment is secure against threats, manages identity and access (IAM), and ensures regulatory compliance.
  5. Data Engineer: Builds pipelines to move data into cloud storage and prepares it for analysis.

Required Skillset (Hard Skills)

  • Cloud Provider Proficiency: Deep knowledge of at least one major platform (AWS, Azure, or GCP).
  • Linux/Operating Systems: Most cloud backends run on Linux; command-line proficiency is essential.
  • Networking: Understanding of DNS, TCP/IP, VPNs, Firewalls, and Load Balancing.
  • Programming/Scripting: Python, Go, or Bash for automation and scripting.
  • Database Management: Knowledge of SQL (e.g., Amazon RDS) and NoSQL (e.g., DynamoDB, MongoDB) databases.
  • APIs and Web Services: Understanding REST, SOAP, and how services communicate.
  • Security: Understanding of IAM (Identity and Access Management), encryption, and security groups.

Required Skillset (Soft Skills)

  • Problem Solving: Troubleshooting complex distributed systems.
  • Adaptability: Cloud technologies change rapidly; continuous learning is required.
  • Business Acumen: Understanding cost management (FinOps) to prevent cloud bill shock.