Unit 1 - Notes

INT364

Unit 1: Cloud Fundamentals and Security

1. Cloud Architecting and AWS Global Infrastructure

Cloud Architecting

Cloud architecting refers to the practice of evaluating trade-offs and making technical decisions to build and deploy systems on a cloud platform. It involves designing solutions that are scalable, reliable, secure, and cost-efficient. Unlike on-premises infrastructure, cloud architecting allows for disposable resources, loose coupling, and treating infrastructure as code.

AWS Global Infrastructure

The physical infrastructure of AWS is broken down into specific geographic and logical categorizations.

1. Regions

A Region is a separate geographic area. Each Region has multiple, isolated locations known as Availability Zones.

  • Independence: Regions are completely independent of one another (except for billing). This ensures that a disaster in one region does not affect another.
  • Data Sovereignty: AWS does not move or replicate customer data out of a Region unless the customer explicitly configures or requests it.
  • Selection Factors:
    • Compliance: Legal requirements regarding where data must reside.
    • Latency: Proximity to the end-user base to reduce delay.
    • Pricing: Services cost different amounts in different regions.
    • Service Availability: Not all AWS services are available in all regions immediately.
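
The selection factors above can be treated as hard filters followed by a tie-breaker. The sketch below is one possible way to express that, not an AWS tool: the `RegionInfo` type, the `choose_region` function, the service name "newservice", and all latency and price figures are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionInfo:
    name: str
    jurisdiction: str        # where the data legally resides
    latency_ms: float        # round-trip time measured from the user base
    relative_price: float    # 1.0 = baseline price
    services: frozenset      # services offered in this Region


# Illustrative values only; these are not real AWS measurements or prices.
CANDIDATES = [
    RegionInfo("us-east-1", "US", 210.0, 1.00, frozenset({"ec2", "s3", "newservice"})),
    RegionInfo("eu-central-1", "EU", 35.0, 1.12, frozenset({"ec2", "s3"})),
    RegionInfo("eu-west-1", "EU", 48.0, 1.05, frozenset({"ec2", "s3", "newservice"})),
]


def choose_region(candidates, allowed_jurisdictions, required_services, max_latency_ms):
    """Apply the hard constraints, then pick the cheapest Region that survives."""
    eligible = [
        r for r in candidates
        if r.jurisdiction in allowed_jurisdictions     # compliance
        and required_services <= r.services            # service availability
        and r.latency_ms <= max_latency_ms             # latency
    ]
    if not eligible:
        return None
    # Pricing decides among eligible Regions; lower latency breaks ties.
    return min(eligible, key=lambda r: (r.relative_price, r.latency_ms))


best = choose_region(CANDIDATES, {"EU"}, {"ec2", "s3"}, max_latency_ms=100)
print(best.name)
```

Compliance is applied first on purpose: a Region that violates a legal requirement is not a candidate no matter how cheap or fast it is.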

2. Availability Zones (AZs)

An Availability Zone is one or more discrete data centers with redundant power, networking, and connectivity in an AWS Region.

  • Isolation: AZs are physically separated by a meaningful distance (many kilometers, while staying within about 100 km / 60 miles of each other) to protect against local disasters (fires, floods) but remain close enough for low-latency replication (milliseconds).
  • Interconnectivity: AZs within a region are connected via high-bandwidth, low-latency networking.
  • Usage: Architects use multiple AZs to achieve High Availability (HA) and Fault Tolerance. If one AZ fails, traffic can be routed to another.
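
The failover idea in the Usage point can be sketched as a toy router that only sends traffic to targets in healthy AZs. This is a simplified illustration, not how Elastic Load Balancing is implemented; the target IDs and the `route` function are hypothetical.

```python
from itertools import cycle


def healthy_targets(targets, health):
    """Return the targets whose AZ currently passes health checks."""
    return [t for t in targets if health.get(t["az"], False)]


def route(requests, targets, health):
    """Distribute requests round-robin across healthy targets only."""
    pool = healthy_targets(targets, health)
    if not pool:
        raise RuntimeError("no healthy Availability Zone left")
    rr = cycle(pool)
    return {req: next(rr)["id"] for req in requests}


# Hypothetical deployment: one web server in each of two AZs.
TARGETS = [
    {"id": "web-a", "az": "us-east-1a"},
    {"id": "web-b", "az": "us-east-1b"},
]
REQUESTS = ["r1", "r2", "r3", "r4"]

normal = route(REQUESTS, TARGETS, {"us-east-1a": True, "us-east-1b": True})
failover = route(REQUESTS, TARGETS, {"us-east-1a": False, "us-east-1b": True})
print(normal)
print(failover)
```

With both AZs healthy the requests alternate between the two servers; when us-east-1a is marked unhealthy, every request lands on web-b.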

3. Edge Locations

Edge Locations are AWS sites, separate from Regions and AZs, that cache content close to end-users.

  • CloudFront: Primarily used by Amazon CloudFront (Content Delivery Network - CDN) to deliver content to end-users with lower latency.
  • Count: There are significantly more Edge Locations than Regions or AZs.
  • Function: They cache static content (images, video) closer to the user to reduce the load on the origin server.
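
How caching reduces origin load can be shown with a very small simulation. The `EdgeCache` class and `origin_server` function are hypothetical stand-ins; a real CDN also applies expiry times (TTLs) and invalidation, which this sketch leaves out.

```python
class EdgeCache:
    """Toy edge location: serve from cache, fall back to the origin on a miss."""

    def __init__(self, origin):
        self.origin = origin        # callable: path -> content
        self.store = {}
        self.origin_fetches = 0

    def get(self, path):
        if path not in self.store:  # cache miss: one trip to the origin
            self.store[path] = self.origin(path)
            self.origin_fetches += 1
        return self.store[path]     # cache hit: served at the edge


def origin_server(path):
    """Stand-in for the origin (for example an S3 bucket or web server)."""
    return f"<bytes of {path}>"


edge = EdgeCache(origin_server)
for _ in range(1000):
    edge.get("/img/logo.png")

print(edge.origin_fetches)  # 1: the origin served the object once for 1000 requests
```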

2. AWS Well-Architected Framework

The Well-Architected Framework provides a consistent approach for customers and partners to evaluate architectures and implement designs that will scale over time. It consists of six pillars.

The Six Pillars

  1. Operational Excellence:

    • Focuses on running and monitoring systems to deliver business value and continually improving processes and procedures.
    • Key topics: Automating changes, responding to events, and defining standards to manage daily operations (Infrastructure as Code).
  2. Security:

    • Focuses on protecting information and systems.
    • Key topics: Confidentiality and integrity of data, identifying and managing who can do what (IAM), protecting systems, and establishing controls to detect security events.
  3. Reliability:

    • Focuses on ensuring a workload performs its intended function correctly and consistently when it’s expected to.
    • Key topics: Distributed system design, recovery planning, and how to handle change (Self-healing systems).
  4. Performance Efficiency:

    • Focuses on using IT and computing resources efficiently.
    • Key topics: Selecting the right resource types and sizes based on workload requirements, monitoring performance, and making informed decisions to maintain efficiency as business needs evolve.
  5. Cost Optimization:

    • Focuses on avoiding unnecessary costs.
    • Key topics: Understanding and controlling where money is being spent, selecting the most appropriate and right number of resource types, and analyzing spend over time.
  6. Sustainability:

    • Focuses on minimizing the environmental impacts of running cloud workloads.
    • Key topics: Shared responsibility model for sustainability, understanding impact, and maximizing utilization to minimize required resources.

General Design Principles

  • Stop guessing capacity: Use auto-scaling rather than fixed server provisioning.
  • Test systems at production scale: Create duplicate environments on demand, complete testing, and decommission them.
  • Automate to make architectural experimentation easier: Use CloudFormation.
  • Allow for evolutionary architectures: decoupled systems allow components to change independently.
  • Drive architectures using data: Collect logs and metrics to inform choices.
  • Improve through game days: Simulate events (stress tests) to test procedures.
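
To make "stop guessing capacity" concrete, here is a minimal sketch of a proportional scaling rule: capacity is adjusted so that a metric such as average CPU moves back toward a target. It is similar in spirit to target-tracking scaling, but the `desired_capacity` function and its numbers are assumptions for illustration; real scaling policies also involve cooldowns and warm-up periods.

```python
import math


def desired_capacity(current_instances, current_metric, target_metric, min_size, max_size):
    """Scale the fleet in proportion to how far the metric is from its target."""
    raw = math.ceil(current_instances * current_metric / target_metric)
    # Never go below the minimum or above the maximum group size.
    return max(min_size, min(max_size, raw))


# 4 instances at 90% CPU with a 60% target: scale out.
print(desired_capacity(4, 90.0, 60.0, min_size=2, max_size=10))   # 6
# 4 instances at 15% CPU: scale in, but not below the minimum.
print(desired_capacity(4, 15.0, 60.0, min_size=2, max_size=10))   # 2
```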

3. Identity and Access Management (IAM)

AWS IAM is the "front door" to the AWS cloud. It manages Authentication (who are you?) and Authorization (what can you do?). IAM is a global service; it is not bound to a specific region.

Core Components

  1. Users:

    • Entities that represent a person or service.
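    • Root User: The identity created together with the AWS account; it signs in with the account's email address. It has unrestricted access. Best Practice: Enable MFA, avoid creating root access keys (or lock away any that exist), and never use the root user for daily tasks.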
    • IAM User: Created for individuals within the organization. They have no permissions by default.
  2. Groups:

    • Collections of IAM users.
    • Permissions attached to a group apply to all users within that group.
    • Example: A "Developers" group having EC2 access.
  3. Roles:

    • An identity that carries a set of permissions but is not associated with a specific person.
    • Roles are assumed by trusted entities (like an EC2 instance, a Lambda function, or a user from another account) for temporary access.
    • They rely on temporary security tokens, not long-term passwords or access keys.
  4. Credentials:

    • Console Password: For web interface access.
    • Access Keys (Access Key ID & Secret Access Key): For programmatic access (CLI, SDKs).

4. IAM Policies and Authorization Principles

IAM Policies are JSON documents that define permissions. When an IAM principal (user/role) makes a request, AWS evaluates these policies to allow or deny the request.

Policy Structure (JSON)

A standard policy contains the following elements:

  • Version: Policy language version (usually "2012-10-17").
  • Statement: The main container for the permission details.
    • Sid: Statement ID (optional).
    • Effect: Allow or Deny.
    • Principal: The user/account/role allowed access (usually in Resource-based policies).
    • Action: The specific API call (e.g., s3:ListBucket).
    • Resource: The specific AWS object (e.g., a specific S3 bucket ARN).
    • Condition: Optional constraints (e.g., only from a specific IP address).

Example Policy:

JSON
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::example-bucket"
    }
  ]
}
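
Policies are often generated rather than typed by hand. The sketch below builds a similar identity-based policy as a Python dictionary and adds the optional Sid and Condition elements described above. The function name, the bucket name, and the IP range (a documentation range) are hypothetical.

```python
import json


def list_bucket_policy(bucket_name, source_cidr):
    """Build an identity-based policy allowing ListBucket from one IP range."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListBucketFromOffice",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket_name}",
                # Condition: only requests coming from this CIDR block match.
                "Condition": {"IpAddress": {"aws:SourceIp": source_cidr}},
            }
        ],
    }


policy = list_bucket_policy("example-bucket", "203.0.113.0/24")
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

There is no Principal element here because an identity-based policy applies to whichever user, group, or role it is attached to.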

Authorization Principles

  1. Implicit Deny: By default, all requests are denied.
  2. Explicit Allow: An administrator must attach a policy with "Allow" for a user to perform an action.
  3. Explicit Deny: If any policy associated with the user contains a "Deny" statement for that action, the request is denied, regardless of any "Allow" statements. Explicit Deny trumps Explicit Allow.
  4. Least Privilege: Granting only the permissions required to perform a task and no more. This limits the blast radius if credentials are compromised.
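
The first three principles can be sketched as a small evaluation function. This is a deliberately simplified model (it ignores conditions, resource-based policies, and boundaries, and it matches names case-sensitively); the `evaluate` function and the example policies are hypothetical.

```python
from fnmatch import fnmatchcase


def _matches(patterns, value):
    """True if value matches any pattern; '*' acts as a wildcard."""
    if isinstance(patterns, str):
        patterns = [patterns]
    return any(fnmatchcase(value, p) for p in patterns)


def evaluate(policies, action, resource):
    """Return 'ExplicitDeny', 'Allow', or 'ImplicitDeny' for one request."""
    allowed = False
    for policy in policies:
        for stmt in policy["Statement"]:
            if _matches(stmt["Action"], action) and _matches(stmt["Resource"], resource):
                if stmt["Effect"] == "Deny":
                    return "ExplicitDeny"   # a matching Deny always wins
                allowed = True              # remember the Allow, keep looking for a Deny
    return "Allow" if allowed else "ImplicitDeny"


allow_s3 = {"Statement": [{"Effect": "Allow", "Action": "s3:*", "Resource": "*"}]}
deny_delete = {"Statement": [{
    "Effect": "Deny",
    "Action": "s3:DeleteObject",
    "Resource": "arn:aws:s3:::prod-data/*",
}]}
policies = [allow_s3, deny_delete]

print(evaluate(policies, "s3:GetObject", "arn:aws:s3:::prod-data/a.csv"))     # Allow
print(evaluate(policies, "s3:DeleteObject", "arn:aws:s3:::prod-data/a.csv"))  # ExplicitDeny
print(evaluate(policies, "ec2:RunInstances", "*"))                            # ImplicitDeny
```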

Types of Policies

  • Identity-based Policies: Attached to Users, Groups, or Roles.
  • Resource-based Policies: Attached directly to the resource (e.g., S3 Bucket Policy).
  • Permissions Boundaries: Advanced feature to set the maximum permissions an IAM entity can have.

5. Federated Users and Managing Access Across AWS Accounts

Managing thousands of IAM users individually is inefficient. Federation allows users to use existing identities (corporate credentials, social logins) to access AWS.

Identity Federation

Federation enables you to manage users in a central Identity Provider (IdP) and grant them access to AWS resources without creating IAM users for them.

  • SAML 2.0 Federation: Used for enterprise integration (e.g., Active Directory, Okta). The user authenticates with their corporate IdP, which passes a SAML assertion to AWS STS (Security Token Service) to generate temporary credentials.
  • Web Identity Federation: Used for mobile apps or web apps whose users authenticate via Amazon, Facebook, Google, or other OpenID Connect (OIDC) providers. Amazon Cognito is often used to facilitate this.

AWS IAM Identity Center (formerly AWS SSO)

IAM Identity Center is the recommended service for managing human access to AWS. It centrally manages SSO access to multiple AWS accounts and business applications, and it can use its own directory, Active Directory, or an external IdP as the identity source.

Cross-Account Access

There are scenarios where User A in Account 1 needs to access Resource B in Account 2 (e.g., Dev account accessing Prod S3 bucket).

  1. Trust: User A does not get a username/password for Account 2. Instead, an IAM Role is created in Account 2 whose trust policy names Account 1 as a trusted principal.
  2. AssumeRole: User A (who must also be permitted in Account 1 to assume that role) calls the STS AssumeRole API.
  3. Temporary credentials: User A receives temporary credentials to act with the permissions defined in the Role in Account 2.
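
The three steps can be sketched as follows. The trust policy uses the real policy grammar, but everything else is a toy stand-in: the `assume_role` function only imitates what STS does, the account IDs are hypothetical, and the returned credentials are random strings.

```python
import secrets
from datetime import datetime, timedelta, timezone

DEV_ACCOUNT = "111111111111"    # hypothetical Account 1
PROD_ACCOUNT = "222222222222"   # hypothetical Account 2, which owns the role

# Trust policy attached to the role in Account 2: it states who may assume the role.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": f"arn:aws:iam::{DEV_ACCOUNT}:root"},
        "Action": "sts:AssumeRole",
    }],
}


def assume_role(caller_arn, trust_policy, duration_seconds=3600):
    """Toy stand-in for STS AssumeRole: check trust, then mint expiring credentials."""
    caller_account = caller_arn.split(":")[4]   # account ID is the fifth ARN field
    trusted = any(
        s["Effect"] == "Allow"
        and s["Action"] == "sts:AssumeRole"
        and s["Principal"]["AWS"] == f"arn:aws:iam::{caller_account}:root"
        for s in trust_policy["Statement"]
    )
    if not trusted:
        raise PermissionError(f"{caller_arn} is not trusted by this role")
    return {
        "AccessKeyId": "ASIA" + secrets.token_hex(8).upper(),   # fake value
        "SecretAccessKey": secrets.token_urlsafe(30),           # fake value
        "SessionToken": secrets.token_urlsafe(64),              # fake value
        "Expiration": datetime.now(timezone.utc) + timedelta(seconds=duration_seconds),
    }


creds = assume_role(f"arn:aws:iam::{DEV_ACCOUNT}:user/user-a", TRUST_POLICY)
print(creds["Expiration"])
```

The key point is the Expiration field: unlike an IAM user's access keys, these credentials stop working on their own.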

AWS Organizations

A service to consolidate multiple AWS accounts into an organization that you create and centrally manage.

  • Consolidated Billing: Combine usage across accounts to get volume discounts.
  • Service Control Policies (SCPs): These are organizational guardrails. An SCP acts as a filter on permissions. Even if an IAM Admin in a child account grants full admin access, if the SCP denies access to S3, the user cannot access S3.
    • Note: SCPs never grant permissions; they only filter/limit them. They also do not restrict the organization's management account.
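
The guardrail behaviour can be sketched as "allowed by IAM and not blocked by the SCP". This is a simplified model that assumes the SCP allows everything except an explicit deny list; the `is_permitted` function and the pattern lists are hypothetical.

```python
from fnmatch import fnmatchcase


def is_permitted(action, iam_allow_patterns, scp_deny_patterns):
    """An action needs an IAM allow AND must not be denied by the SCP."""
    granted = any(fnmatchcase(action, p) for p in iam_allow_patterns)
    blocked = any(fnmatchcase(action, p) for p in scp_deny_patterns)
    return granted and not blocked


ADMIN = ["*"]          # IAM policy in the child account: full admin access
NO_POLICY = []         # a user with no IAM permissions at all
SCP_DENY = ["s3:*"]    # organization guardrail: no S3

print(is_permitted("ec2:RunInstances", ADMIN, SCP_DENY))      # True
print(is_permitted("s3:GetObject", ADMIN, SCP_DENY))          # False: filtered by the SCP
print(is_permitted("ec2:RunInstances", NO_POLICY, SCP_DENY))  # False: the SCP grants nothing
```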

6. Encrypting Data at Rest

Encryption at rest protects data stored on disk (AWS infrastructure) from unauthorized access, ensuring confidentiality even if physical drives are stolen.

AWS Key Management Service (AWS KMS)

KMS is the primary service for managing encryption keys in AWS.

  • KMS Keys (formerly customer master keys, CMKs): The logical representation of a master key.
  • Symmetric Keys: The same key is used for encryption and decryption. Used by most AWS services (S3, EBS, RDS).
  • Asymmetric Keys: A public/private key pair (RSA/ECC). Used for signing and verification or encryption outside of AWS.

Envelope Encryption

AWS KMS does not encrypt large data files directly: a single KMS Encrypt call accepts at most 4 KB of plaintext. Larger data is protected with Envelope Encryption, which also performs better.

  1. Data Key: A unique key is generated to encrypt the actual data.
  2. Encryption: The data is encrypted with the Data Key.
  3. Envelope: The Data Key itself is encrypted using the root KMS Key.
  4. Storage: The encrypted data and the encrypted Data Key are stored together.
  5. Performance: This reduces network load on KMS, as only the small keys are sent over the network, not the massive data files.
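
The envelope pattern above can be sketched in plain Python. Important assumption: the cipher here is a toy XOR keystream built from SHA-256 so the example needs no extra libraries. It is NOT secure and is not what KMS uses; in AWS the data key would come from the KMS GenerateDataKey call and the data would be encrypted with a real cipher such as AES-256. All function names are hypothetical.

```python
import hashlib
import secrets


def _keystream_xor(key, nonce, data):
    """Toy cipher (NOT secure): XOR data with SHA-256(key + nonce + counter) blocks."""
    out = bytearray()
    for i in range(0, len(data), 32):
        counter = (i // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + nonce + counter).digest()
        out.extend(b ^ p for b, p in zip(data[i:i + 32], pad))
    return bytes(out)


def envelope_encrypt(master_key, plaintext):
    data_key = secrets.token_bytes(32)        # 1. unique data key
    data_nonce = secrets.token_bytes(16)
    key_nonce = secrets.token_bytes(16)
    ciphertext = _keystream_xor(data_key, data_nonce, plaintext)   # 2. encrypt the data
    wrapped_key = _keystream_xor(master_key, key_nonce, data_key)  # 3. encrypt the data key
    # 4. Store encrypted data and encrypted data key together.
    #    The plaintext data key is not kept anywhere.
    return {
        "ciphertext": ciphertext,
        "data_nonce": data_nonce,
        "wrapped_key": wrapped_key,
        "key_nonce": key_nonce,
    }


def envelope_decrypt(master_key, envelope):
    # Unwrap the small data key first, then decrypt the large payload locally.
    data_key = _keystream_xor(master_key, envelope["key_nonce"], envelope["wrapped_key"])
    return _keystream_xor(data_key, envelope["data_nonce"], envelope["ciphertext"])


master_key = secrets.token_bytes(32)          # stands in for the KMS key
message = b"quarterly-report " * 1000
envelope = envelope_encrypt(master_key, message)
print(len(envelope["wrapped_key"]), len(envelope["ciphertext"]))
```

Only the 32-byte data key is ever processed with the master key, which is the performance point made in step 5: the large payload never has to travel to the key service.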

Service Integration Examples

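  • EBS (Elastic Block Store): When creating a volume, selecting the encryption option makes EBS use a KMS key to transparently encrypt data at rest on the volume, data moving between the volume and the instance, and all snapshots created from the volume.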
  • S3 (Simple Storage Service):
    • SSE-S3: AWS manages keys fully (AES-256).
    • SSE-KMS: AWS uses KMS keys (provides audit trail via CloudTrail).
    • SSE-C: Customer manages the keys, AWS performs the encryption.
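
On the wire, the three S3 options differ mainly in the request headers sent with an upload. The sketch below only builds those headers and does not call AWS; the `sse_headers` helper and the key alias are hypothetical, and the header names are the ones used by the S3 REST API to the best of my knowledge, so verify them against the S3 documentation before relying on them.

```python
import base64
import hashlib


def sse_headers(mode, kms_key_id=None, customer_key=None):
    """Return the encryption-related request headers for an S3 upload."""
    if mode == "SSE-S3":
        # S3 creates and manages the key.
        return {"x-amz-server-side-encryption": "AES256"}
    if mode == "SSE-KMS":
        headers = {"x-amz-server-side-encryption": "aws:kms"}
        if kms_key_id:  # optional: name a specific KMS key
            headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
        return headers
    if mode == "SSE-C":
        # The customer supplies the key with every request; S3 does not store it.
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C needs a 256-bit (32-byte) key")
        key_md5 = hashlib.md5(customer_key).digest()   # integrity check for the key
        return {
            "x-amz-server-side-encryption-customer-algorithm": "AES256",
            "x-amz-server-side-encryption-customer-key": base64.b64encode(customer_key).decode(),
            "x-amz-server-side-encryption-customer-key-MD5": base64.b64encode(key_md5).decode(),
        }
    raise ValueError(f"unknown mode: {mode}")


print(sse_headers("SSE-S3"))
print(sse_headers("SSE-KMS", kms_key_id="alias/example-key"))
```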

7. AWS Security Services

AWS divides security responsibility between the cloud provider and the customer (Shared Responsibility Model). AWS provides specific tools to secure the customer's side.

Infrastructure Protection

  • AWS Shield: Managed DDoS (Distributed Denial of Service) protection.
    • Shield Standard: Free, always on. Protects against common L3/L4 attacks.
    • Shield Advanced: Paid service. Protects against larger and more sophisticated attacks, offers 24/7 access to the Shield Response Team (SRT, formerly the DDoS Response Team), and provides cost protection against DDoS-related usage spikes.
  • AWS WAF (Web Application Firewall): Protects web applications from common exploits (SQL injection, Cross-Site Scripting). It operates at Layer 7 (Application Layer) and attaches to CloudFront, ALB, or API Gateway.
  • AWS Firewall Manager: Central management of WAF rules across all accounts in an AWS Organization.

Threat Detection and Monitoring

  • Amazon GuardDuty: An intelligent threat detection service. It analyzes logs (CloudTrail, VPC Flow Logs, DNS Logs) using Machine Learning to detect anomalies (e.g., cryptocurrency mining, unauthorized access from unusual locations).
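  • Amazon Inspector: An automated vulnerability management service. It scans EC2 instances for software vulnerabilities (CVEs) and unintended network exposure, and it also covers container images in Amazon ECR and Lambda functions. The original Inspector Classic required its own agent on each instance; the current version relies on the AWS Systems Manager (SSM) agent or agentless scanning.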

Data Protection

  • Amazon Macie: Uses Machine Learning and pattern matching to automatically discover and classify sensitive data (PII like credit card numbers, SSNs) in Amazon S3, helping to protect it and prevent data leaks.
  • AWS Secrets Manager: Helps protect secrets needed to access applications (database credentials, API keys). It enables automatic rotation of secrets (e.g., changing the RDS password every 30 days automatically).

Compliance

  • AWS Artifact: A self-service portal for on-demand access to AWS’s compliance reports (ISO, PCI-DSS, SOC reports). It is used to give auditors evidence that the underlying AWS infrastructure has been independently assessed; it does not cover the customer's own workloads.