Unit 5 - Notes
Unit 5: Continuous Integration (CI) with GitHub Actions
1. Understanding Workflow Automation
Workflow automation in the context of DevOps refers to the automatic execution of a sequence of tasks, processes, or scripts triggered by specific events. In software development, this primarily involves Continuous Integration (CI) and Continuous Deployment (CD).
GitHub Actions is a CI/CD and automation platform built into GitHub. It lets developers automate their software development lifecycle directly from their repositories, which in many projects removes the need for a separate CI/CD tool such as Jenkins or CircleCI.
2. Core Components and Directory Structure
Workflow Directory Structure
GitHub Actions relies on YAML files to define workflows. These files must be stored in a specific directory at the root of your repository:
.github/workflows/
Any .yml or .yaml file placed inside this directory is automatically recognized by GitHub as a workflow definition.
Key Components
A GitHub Actions architecture is built upon six foundational components:
- Workflows: An automated procedure added to your repository. Workflows are defined by a YAML file and contain one or more jobs. They are triggered by events.
- Events: Specific activities in a repository that trigger a workflow run (e.g., code push, issue creation, pull request).
- Jobs: A set of steps in a workflow that execute on the same runner. By default, a workflow with multiple jobs will run those jobs in parallel, but they can be configured to run sequentially based on dependencies.
- Steps: Individual tasks that run commands in a job. A step can either run a shell script (run) or execute an action (uses). All steps in a job run on the same runner and share the same filesystem.
- Actions: Standalone, reusable commands that perform a complex but frequently repeated task (e.g., checking out code, setting up a Node.js environment).
- Runners: The server (virtual machine or container) that runs your workflows when they're triggered.
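As a rough illustration of how these components fit together, the sketch below annotates a small workflow file. The file name ci.yml, the job name, and the echo command are placeholders chosen for this example, not anything prescribed by GitHub.

```yaml
# .github/workflows/ci.yml (hypothetical file name)
name: CI                                 # Workflow: the whole file

on:                                      # Events: what triggers a run
  push:
    branches:
      - main

jobs:
  build:                                 # Job: a group of steps on one runner
    runs-on: ubuntu-latest               # Runner: a GitHub-hosted Ubuntu VM
    steps:
      - uses: actions/checkout@v3        # Step that executes an action
      - name: Say hello                  # Step that runs a shell command
        run: echo "Hello from $GITHUB_REPOSITORY"
```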
3. Events and Workflow Triggers
Workflows are executed based on triggers defined in the on block of the YAML file.
Common Triggers
- Push: Triggers when code is pushed to a specified branch.
- Pull Request: Triggers when a PR is opened, synchronized, or reopened.
- Schedule: Triggers at scheduled times using POSIX cron syntax.
- Manual Workflow: Uses the workflow_dispatch event to allow users to trigger the workflow manually from the GitHub UI or GitHub CLI. It can also accept manual input parameters.
Example of Triggers:
on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main
  schedule:
    - cron: '0 0 * * *' # Runs daily at midnight UTC
  workflow_dispatch:
    inputs:
      environment:
        description: 'Environment to deploy to'
        required: true
        default: 'staging'
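One way a job might read the manual input is sketched here. The job name show-target and the variable TARGET_ENV are made up for illustration. The value is passed through an environment variable rather than interpolated straight into the script, which is a common precaution against script injection.

```yaml
on:
  workflow_dispatch:
    inputs:
      environment:
        description: 'Environment to deploy to'
        required: true
        default: 'staging'

jobs:
  show-target:                     # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      - name: Print the chosen environment
        env:
          TARGET_ENV: ${{ inputs.environment }}   # value typed in the UI or CLI
        run: echo "Deploying to $TARGET_ENV"
```

From the GitHub CLI, a run like this could be started with something like `gh workflow run deploy.yml -f environment=production`, where deploy.yml is an assumed file name.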
4. Jobs, Matrix Strategies, and Steps
Jobs & Multi-job Workflows
A workflow can have multiple jobs. By default, they run in parallel. You can use the needs keyword to create dependencies, forming a pipeline where a job only starts if its prerequisite jobs succeed.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - run: echo "Building..."
  test:
    needs: build # Will wait for 'build' to complete
    runs-on: ubuntu-latest
    steps:
      - run: echo "Testing..."
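Dependent jobs often need data from the jobs they wait on. The sketch below shows one possible way to pass a value forward with job outputs, and a third job that waits on two prerequisites. The step id meta, the output name version, and the value 1.2.3 are illustrative only.

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    outputs:
      version: ${{ steps.meta.outputs.version }}   # expose a step output at job level
    steps:
      - id: meta
        run: echo "version=1.2.3" >> "$GITHUB_OUTPUT"
  test:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - run: echo "Testing version ${{ needs.build.outputs.version }}"
  deploy:
    needs: [build, test]                           # starts only after both succeed
    runs-on: ubuntu-latest
    steps:
      - run: echo "Deploying..."
```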
Matrix Strategies
A matrix strategy allows you to use variables in a single job definition to automatically create multiple job runs that are based on the combinations of the variables. This is highly useful for testing code across multiple language versions or operating systems.
jobs:
  test:
    runs-on: ${{ matrix.os }} # Take the OS from the matrix; a fixed value would ignore the os axis
    strategy:
      matrix:
        node-version: [14.x, 16.x, 18.x]
        os: [ubuntu-latest, windows-latest]
    steps:
      - uses: actions/checkout@v3
      - name: Use Node.js ${{ matrix.node-version }}
        uses: actions/setup-node@v3
        with:
          node-version: ${{ matrix.node-version }}
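A matrix can also be trimmed or extended. The following sketch, which assumes a Node.js project with a test script in package.json, uses exclude to drop one combination, include to add one, and fail-fast: false so that a failure in one combination does not cancel the others.

```yaml
jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false              # keep other combinations running if one fails
      matrix:
        node-version: [16.x, 18.x]
        os: [ubuntu-latest, windows-latest]
        exclude:
          - os: windows-latest
            node-version: 16.x      # skip this one combination
        include:
          - os: macos-latest
            node-version: 18.x      # add one extra combination
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
        with:
          node-version: ${{ matrix.node-version }}
      - run: npm test               # assumes package.json defines a test script
```

With these settings the matrix should expand to four jobs: three from the base grid after the exclusion, plus the added macOS combination.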
Steps & Shell Commands
Steps execute inside the runner. The run keyword executes shell commands. You can specify the shell (bash, pwsh, python) or run multi-line scripts.
steps:
  - name: Install Dependencies
    run: |
      npm install
      npm run build
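To illustrate the shell option, here is a small sketch that runs one step in each of the three shells mentioned. The job and step names are invented for the example. It assumes a GitHub-hosted Ubuntu runner, where PowerShell Core and Python are preinstalled.

```yaml
jobs:
  shells:                          # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      - name: Bash (explicit)
        shell: bash
        run: echo "Running in $(pwd)"
      - name: PowerShell Core
        shell: pwsh
        run: Write-Output "Runner OS is $env:RUNNER_OS"
      - name: Inline Python
        shell: python
        run: |
          import platform
          print(platform.python_version())
```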
5. Actions, Language Setup, and Optimization
Using Marketplace Actions
The GitHub Marketplace hosts thousands of pre-built actions created by the community and verified publishers. You invoke them using the uses keyword.
- actions/checkout@v3: Fetches your repository code into the runner.
- actions/upload-artifact@v4: Saves files generated during the build. (Version 3 of this action has been retired by GitHub, so use v4 or later.)
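A minimal sketch of the two actions working together might look like this. The dist folder, the file it contains, and the artifact name build-output are placeholders. The upload step uses v4 because v3 of the upload action is no longer supported.

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Produce a build output
        run: |
          mkdir -p dist
          echo "built at $(date -u)" > dist/build.txt
      - name: Upload the dist folder
        uses: actions/upload-artifact@v4
        with:
          name: build-output       # hypothetical artifact name
          path: dist/
```

The uploaded artifact then appears on the workflow run's summary page, where it can be downloaded.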
Language-Specific Actions
To compile or test code, the runner needs the correct language environment. Standard actions exist for this:
- Node.js: actions/setup-node
- Python: actions/setup-python
- Java: actions/setup-java
- Go: actions/setup-go
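As an example of how these setup actions are typically called, the sketch below prepares a Python job and a Java job. The version numbers and the temurin distribution are arbitrary choices for illustration.

```yaml
jobs:
  test-python:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'   # quoted so YAML does not read it as a number
      - run: python --version
  test-java:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-java@v3
        with:
          distribution: temurin    # setup-java requires a distribution
          java-version: '17'
      - run: java -version
```

Quoting version strings is a small but useful habit: an unquoted 3.10 would be parsed as the number 3.1.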
Using Caching for Faster Builds
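Dependencies (like node_modules or .m2 directories) can take a long time to download. The actions/cache action saves selected directories between workflow runs and restores them when the cache key matches, which can shorten CI runs considerably. The example below caches npm's download cache (~/.npm) and keys it on the lockfile hash, so the cache is refreshed whenever dependencies change.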
steps:
  - uses: actions/checkout@v3
  - name: Cache node modules
    uses: actions/cache@v3
    with:
      path: ~/.npm
      key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
      restore-keys: |
        ${{ runner.os }}-node-
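For npm projects there is a shorter alternative: actions/setup-node has a built-in cache option that handles the key and path itself. The sketch below assumes a package-lock.json is committed at the repository root.

```yaml
steps:
  - uses: actions/checkout@v3
  - uses: actions/setup-node@v3
    with:
      node-version: 18.x
      cache: npm           # caches npm's download cache, keyed on the lockfile
  - run: npm ci            # installs exactly what the lockfile specifies
```

The explicit actions/cache form remains useful when you need custom paths or keys, for example for tools that the setup actions do not know about.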
6. Runners: Execution Environments
GitHub-Hosted Runners
GitHub provides virtual machines managed and maintained by GitHub.
- Pros: Zero maintenance, fresh isolated environment for every job, automatically updated.
- Cons: Limited hardware resources, IP addresses change (harder to whitelist), potential queuing delays.
- Usage: set runs-on to ubuntu-latest, windows-latest, or macos-latest.
Self-Hosted Runners
You can install the GitHub Actions runner application on your own machines (on-premises, AWS EC2, Raspberry Pi, etc.).
- Pros: Highly customizable hardware, can be placed inside private networks (VPCs), persistent caches, no per-minute billing from GitHub.
- Cons: You manage the OS, updates, and scaling.
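A job is routed to a self-hosted runner through labels in runs-on. The sketch below assumes a runner registered with the default self-hosted, linux, and x64 labels; the job name is illustrative.

```yaml
jobs:
  build-on-prem:                          # hypothetical job name
    runs-on: [self-hosted, linux, x64]    # the runner must carry all listed labels
    steps:
      - uses: actions/checkout@v3
      - run: uname -a                     # shows which machine picked up the job
```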
Runner Security & Management
- Ephemeral vs. Persistent: GitHub-hosted runners are ephemeral (destroyed after use). Self-hosted runners are usually persistent, meaning state can leak between runs.
- Security Risk: Never use self-hosted runners on public repositories for pull requests without strict approval workflows. Malicious actors can submit PRs containing code that executes on your private infrastructure.
- Network Security: Self-hosted runners communicate outbound to GitHub via HTTPS (port 443), eliminating the need to open inbound firewall ports.
7. Docker & GitHub Actions
GitHub Actions excels at building and publishing Docker containers.
Building Docker Images in CI
You can run standard Docker commands using the run step, or use official Docker actions like docker/build-push-action.
Pushing to Docker Hub
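Pushing to Docker Hub requires credentials stored as secrets (DOCKER_USERNAME, DOCKER_PASSWORD) in the GitHub repository settings. A Docker Hub access token can be stored in DOCKER_PASSWORD in place of the account password.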
steps:
  - name: Log in to Docker Hub
    uses: docker/login-action@v2
    with:
      username: ${{ secrets.DOCKER_USERNAME }}
      password: ${{ secrets.DOCKER_PASSWORD }}
  - name: Build and push Docker image
    uses: docker/build-push-action@v3
    with:
      context: .
      push: true
      tags: user/app:latest
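The same result can be reached with plain Docker commands in run steps. The sketch below is one possible version, assuming a Dockerfile at the repository root. It reuses the placeholder image name user/app and additionally tags the image with the commit SHA so that each build stays traceable.

```yaml
jobs:
  docker:
    runs-on: ubuntu-latest
    env:
      IMAGE: user/app              # placeholder image name
    steps:
      - uses: actions/checkout@v3
      - name: Log in to Docker Hub
        env:
          DOCKER_USERNAME: ${{ secrets.DOCKER_USERNAME }}
          DOCKER_PASSWORD: ${{ secrets.DOCKER_PASSWORD }}
        run: echo "$DOCKER_PASSWORD" | docker login -u "$DOCKER_USERNAME" --password-stdin
      - name: Build and tag with latest and the commit SHA
        run: docker build -t "$IMAGE:latest" -t "$IMAGE:$GITHUB_SHA" .
      - name: Push both tags
        run: docker push --all-tags "$IMAGE"
```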
Pushing to GitHub Container Registry (GHCR)
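GHCR is GitHub's native container registry. Workflows can authenticate with the built-in GITHUB_TOKEN instead of a stored password, provided the job grants the token packages: write permission. Image names must be lowercase, so a repository name containing uppercase letters needs to be lowercased before it is used in a tag.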
steps:
  - name: Log in to GHCR
    uses: docker/login-action@v2
    with:
      registry: ghcr.io
      username: ${{ github.actor }}
      password: ${{ secrets.GITHUB_TOKEN }}
  - name: Build and push to GHCR
    uses: docker/build-push-action@v3
    with:
      context: .
      push: true
      tags: ghcr.io/${{ github.repository }}/app:latest
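Placed in a complete job, the GHCR steps might look like the sketch below. It adds two details that are easy to miss: an explicit permissions block so that GITHUB_TOKEN may write packages, and a step that lowercases the repository name, since image names cannot contain uppercase letters. The job name, the step id image, and the /app suffix are illustrative, and a Dockerfile at the repository root is assumed.

```yaml
jobs:
  publish:                         # hypothetical job name
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write              # lets GITHUB_TOKEN push to GHCR
    steps:
      - uses: actions/checkout@v3
      - uses: docker/login-action@v2
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Lowercase the image name
        id: image
        run: echo "name=ghcr.io/${GITHUB_REPOSITORY,,}/app" >> "$GITHUB_OUTPUT"
      - uses: docker/build-push-action@v3
        with:
          context: .
          push: true
          tags: ${{ steps.image.outputs.name }}:latest
```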
8. Continuous Deployment (CD): Deploying to Servers/Cloud
GitHub Actions extends beyond CI to handle CD, pushing artifacts or containers to live environments.
Deployments to Servers/Clouds
Deployments can be achieved in multiple ways depending on the target infrastructure:
- SSH Deployments: Using actions like appleboy/ssh-action to connect to a VPS and execute pull/restart commands.
- Cloud Providers (AWS, Azure, GCP): Providers offer official actions to authenticate and deploy.
  - AWS: aws-actions/configure-aws-credentials (followed by commands like aws ecs update-service or aws s3 sync).
  - Azure: azure/login (followed by Azure Web App deployment actions).
- Kubernetes: Using azure/k8s-set-context (or, on AWS, the aws eks update-kubeconfig CLI command after configuring credentials) to connect to a cluster, followed by kubectl apply commands.
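As a sketch of the SSH approach, the job below connects to a server and runs a pull-and-restart script. The secret names (SSH_HOST, SSH_USER, SSH_PRIVATE_KEY), the path /srv/app, and the service name app are all assumptions made for this example. The pinned action version is one known release; check the action's page for the current one.

```yaml
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Pull and restart over SSH
        uses: appleboy/ssh-action@v1.0.3
        with:
          host: ${{ secrets.SSH_HOST }}            # hypothetical secret names
          username: ${{ secrets.SSH_USER }}
          key: ${{ secrets.SSH_PRIVATE_KEY }}
          script: |
            cd /srv/app                            # hypothetical deploy path
            git pull --ff-only
            sudo systemctl restart app             # hypothetical service name
```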
Best Practices for Deployment Workflows
- Environments: Use GitHub Environments to require manual approval before a deployment job runs.
- Secrets Management: Never hardcode credentials. Use GitHub Secrets to store SSH keys, cloud access keys, and passwords.
- OIDC (OpenID Connect): Instead of storing long-lived cloud credentials in GitHub, use OIDC to allow GitHub Actions to request short-lived, temporary access tokens from cloud providers (AWS IAM, Azure AD, GCP IAM). This significantly enhances security.
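The practices above can be combined in a single deployment job. The following is a sketch for AWS only, under several assumptions: an environment named production exists with reviewers configured in the repository settings, an IAM role trusting GitHub's OIDC provider has been created (the ARN shown is a placeholder), and the site to publish lives in ./public.

```yaml
jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production        # job pauses here until required reviewers approve
    permissions:
      id-token: write              # needed to request the OIDC token
      contents: read
    steps:
      - uses: actions/checkout@v3
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-deploy   # placeholder ARN
          aws-region: us-east-1
      - name: Publish static files
        run: aws s3 sync ./public s3://example-bucket --delete           # placeholder bucket
```

No AWS access key is stored in GitHub here: the action exchanges the workflow's OIDC token for short-lived credentials that expire on their own.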