AWS Cloud Platform

Master Amazon Web Services — from core compute and storage to advanced serverless architectures and managed Kubernetes.

AWS Overview

Beginner

Amazon Web Services (AWS) is the world's most comprehensive cloud platform, offering 200+ services from data centers globally. As a DevOps engineer, you'll primarily work with compute, networking, storage, containers, and infrastructure automation services.

AWS CLI Setup

Most AWS operations in this guide use the AWS CLI. Install it and configure your credentials first.

# Install AWS CLI v2
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install

# Configure credentials
aws configure
# AWS Access Key ID: AKIA...
# AWS Secret Access Key: ...
# Default region name: us-east-1
# Default output format: json

# Verify
aws sts get-caller-identity

IAM — Identity & Access Management

Beginner

IAM controls who can access what in your AWS account. It's the foundation of AWS security.

Key Concepts

# Create a user
aws iam create-user --user-name devops-deployer

# Create a policy (save this document as deployer-policy.json)
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:Describe*",
                "ec2:RunInstances",
                "ec2:TerminateInstances",
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": "*"
        }
    ]
}

# Create the policy from the JSON document
aws iam create-policy \
  --policy-name DeployerPolicy \
  --policy-document file://deployer-policy.json

# Attach the policy to the user (replace 123456789012 with your account ID)
aws iam attach-user-policy \
  --user-name devops-deployer \
  --policy-arn arn:aws:iam::123456789012:policy/DeployerPolicy

Security Best Practice

Never use the root account for daily operations. Always use IAM roles instead of access keys where possible. Enable MFA on all human user accounts.

EC2 — Elastic Compute Cloud

Beginner

EC2 provides resizable virtual servers in the cloud. You can launch instances with different OS, CPU, memory, and storage configurations.

# Launch an EC2 instance
aws ec2 run-instances \
  --image-id ami-0c55b159cbfafe1f0 \
  --instance-type t3.micro \
  --key-name my-key-pair \
  --security-group-ids sg-0123456789abcdef0 \
  --subnet-id subnet-0123456789abcdef0 \
  --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=WebServer}]'

# List running instances
aws ec2 describe-instances \
  --filters "Name=instance-state-name,Values=running" \
  --query "Reservations[].Instances[].[InstanceId,PublicIpAddress,Tags[?Key=='Name'].Value|[0]]" \
  --output table

# Connect via SSH (use the public IP from the command above)
ssh -i my-key-pair.pem ec2-user@<public-ip>

# Stop / Start / Terminate
aws ec2 stop-instances --instance-ids i-1234567890abcdef0
aws ec2 start-instances --instance-ids i-1234567890abcdef0
aws ec2 terminate-instances --instance-ids i-1234567890abcdef0

Instance Types Cheat Sheet

Family | Use Case                   | Example
-------|----------------------------|------------------------
t3     | General purpose, burstable | Web servers, small apps
m6i    | General purpose, steady    | App servers, backends
c6i    | Compute optimized          | Batch processing, ML
r6i    | Memory optimized           | Databases, caching
g5     | GPU instances              | ML training, rendering
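The family prefix is the part of the type name before the dot, so the cheat sheet above can be encoded as a small lookup helper. This is an illustrative sketch; the mapping simply mirrors the table:

```python
# Illustrative lookup table mirroring the instance-family cheat sheet above.
INSTANCE_FAMILIES = {
    "t3":  ("General purpose, burstable", "Web servers, small apps"),
    "m6i": ("General purpose, steady",    "App servers, backends"),
    "c6i": ("Compute optimized",          "Batch processing, ML"),
    "r6i": ("Memory optimized",           "Databases, caching"),
    "g5":  ("GPU instances",              "ML training, rendering"),
}

def family_of(instance_type: str) -> str:
    """Extract the family prefix from a type like 't3.micro'."""
    return instance_type.split(".")[0]

print(family_of("t3.micro"))         # t3
print(INSTANCE_FAMILIES["r6i"][0])   # Memory optimized
```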

S3 — Simple Storage Service

Beginner

S3 is object storage with 99.999999999% (11 nines) durability. Use it for backups, static sites, data lakes, and artifact storage.
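The 11-nines figure is easier to appreciate with a back-of-envelope calculation: at that durability, storing ten million objects means statistically losing about one object every 10,000 years.

```python
# Back-of-envelope math for S3's 11-nines durability claim.
# 99.999999999% annual durability means roughly a 1e-11 chance
# of losing any given object in a year.
annual_loss_prob = 1 - 0.99999999999   # ~1e-11
objects = 10_000_000

expected_losses_per_year = objects * annual_loss_prob
print(expected_losses_per_year)        # ~1e-4, i.e. one object per ~10,000 years
```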

# Create a bucket
aws s3 mb s3://my-devops-artifacts-2026

# Upload files
aws s3 cp ./build/ s3://my-devops-artifacts-2026/builds/v1.0/ --recursive

# Sync a directory
aws s3 sync ./dist/ s3://my-website-bucket/ --delete

# List objects
aws s3 ls s3://my-devops-artifacts-2026/builds/

# Enable versioning
aws s3api put-bucket-versioning \
  --bucket my-devops-artifacts-2026 \
  --versioning-configuration Status=Enabled

# Set lifecycle policy (move to Glacier after 90 days)
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-devops-artifacts-2026 \
  --lifecycle-configuration '{
    "Rules": [{
      "ID": "ArchiveOldBuilds",
      "Status": "Enabled",
      "Transitions": [{
        "Days": 90,
        "StorageClass": "GLACIER"
      }],
      "Filter": {"Prefix": "builds/"}
    }]
  }'

VPC — Virtual Private Cloud

Beginner

A VPC is your isolated network in AWS. It contains subnets, route tables, internet gateways, and security groups.
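The CIDR arithmetic behind the commands in this section can be checked with Python's stdlib ipaddress module: a /16 VPC holds 256 non-overlapping /24 subnets, including the public and private subnets used below.

```python
import ipaddress

# Carve the example 10.0.0.0/16 VPC into /24 subnets.
vpc = ipaddress.ip_network("10.0.0.0/16")
subnets = list(vpc.subnets(new_prefix=24))

print(len(subnets))   # 256
print(subnets[1])     # 10.0.1.0/24  (the public subnet)
print(subnets[10])    # 10.0.10.0/24 (the private subnet)
```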

# Create VPC
aws ec2 create-vpc --cidr-block 10.0.0.0/16 \
  --tag-specifications 'ResourceType=vpc,Tags=[{Key=Name,Value=Production-VPC}]'

# Create public subnet
aws ec2 create-subnet \
  --vpc-id vpc-xxx \
  --cidr-block 10.0.1.0/24 \
  --availability-zone us-east-1a \
  --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=Public-1a}]'

# Create private subnet
aws ec2 create-subnet \
  --vpc-id vpc-xxx \
  --cidr-block 10.0.10.0/24 \
  --availability-zone us-east-1a \
  --tag-specifications 'ResourceType=subnet,Tags=[{Key=Name,Value=Private-1a}]'

# Create Internet Gateway
aws ec2 create-internet-gateway
aws ec2 attach-internet-gateway --vpc-id vpc-xxx --internet-gateway-id igw-xxx

RDS — Relational Database Service

Intermediate

RDS provides managed relational databases (PostgreSQL, MySQL, MariaDB, Oracle, SQL Server) with automated backups, patching, and optional Multi-AZ failover.

# Create a PostgreSQL RDS instance
# (for production, prefer --manage-master-user-password over a plaintext password)
aws rds create-db-instance \
  --db-instance-identifier production-db \
  --db-instance-class db.t3.medium \
  --engine postgres \
  --engine-version 15.4 \
  --master-username admin \
  --master-user-password MySecurePassword123 \
  --allocated-storage 100 \
  --storage-type gp3 \
  --vpc-security-group-ids sg-xxx \
  --db-subnet-group-name my-db-subnet-group \
  --multi-az \
  --storage-encrypted \
  --backup-retention-period 7

Lambda — Serverless Computing

Intermediate

AWS Lambda lets you run code without provisioning servers. You pay only for the compute time you consume.
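The pay-per-use model is simple enough to estimate in a few lines: you pay per request plus per GB-second of compute. The unit prices below are illustrative assumptions, not current AWS list prices:

```python
# Rough Lambda cost model. Unit prices are assumptions for illustration --
# check current AWS pricing for your region before relying on them.
PRICE_PER_GB_SECOND = 0.0000166667   # assumed x86 compute price
PRICE_PER_REQUEST = 0.0000002        # assumed per-request price

def monthly_cost(invocations: int, avg_ms: float, memory_mb: int) -> float:
    """Estimate monthly cost from invocation count, duration, and memory."""
    gb_seconds = invocations * (avg_ms / 1000) * (memory_mb / 1024)
    return invocations * PRICE_PER_REQUEST + gb_seconds * PRICE_PER_GB_SECOND

# 1M invocations/month, 200 ms average, 256 MB memory:
print(round(monthly_cost(1_000_000, 200, 256), 2))  # ~1.03 (USD)
```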

# lambda_function.py
import json
import boto3

def handler(event, context):
    """Process S3 upload events"""
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']
        print(f"New file uploaded: s3://{bucket}/{key}")

        # Process the file...

    return {
        'statusCode': 200,
        'body': json.dumps('Processing complete')
    }

# Deploy Lambda function
zip function.zip lambda_function.py

aws lambda create-function \
  --function-name process-uploads \
  --runtime python3.12 \
  --handler lambda_function.handler \
  --role arn:aws:iam::123456789012:role/lambda-execution-role \
  --zip-file fileb://function.zip \
  --timeout 30 \
  --memory-size 256

ECS & ECR — Container Services

Intermediate

ECS (Elastic Container Service) is AWS's container orchestration service. ECR (Elastic Container Registry) is a private Docker registry.
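ECR image URIs follow a fixed convention, `<account>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>`, which the push commands below rely on. A small illustrative parser for that convention:

```python
# Parse an ECR image URI of the form
# <account>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>
def parse_ecr_uri(uri: str) -> dict:
    registry, _, rest = uri.partition("/")
    repo, _, tag = rest.partition(":")
    parts = registry.split(".")
    return {
        "account": parts[0],
        "region": parts[3],
        "repo": repo,
        "tag": tag or "latest",  # Docker's default tag when none is given
    }

info = parse_ecr_uri("123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:v1")
print(info["region"], info["repo"], info["tag"])  # us-east-1 myapp v1
```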

# Create ECR repository
aws ecr create-repository --repository-name myapp

# Push image to ECR
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com

docker tag myapp:v1 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:v1
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:v1

# ECS Task Definition (JSON)
{
    "family": "webapp",
    "networkMode": "awsvpc",
    "requiresCompatibilities": ["FARGATE"],
    "cpu": "256",
    "memory": "512",
    "containerDefinitions": [{
        "name": "webapp",
        "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:v1",
        "portMappings": [{
            "containerPort": 3000,
            "protocol": "tcp"
        }],
        "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
                "awslogs-group": "/ecs/webapp",
                "awslogs-region": "us-east-1",
                "awslogs-stream-prefix": "ecs"
            }
        }
    }]
}

Route 53 — DNS Service

Intermediate

Route 53 is AWS's managed DNS service, covering hosted zones, record sets, and health-checked routing.

# Create a hosted zone
aws route53 create-hosted-zone --name example.com --caller-reference unique-string

# Create an A record pointing to ALB
aws route53 change-resource-record-sets --hosted-zone-id Z123456 \
  --change-batch '{
    "Changes": [{
      "Action": "CREATE",
      "ResourceRecordSet": {
        "Name": "app.example.com",
        "Type": "A",
        "AliasTarget": {
          "HostedZoneId": "Z35SXDOTRQ7X7K",
          "DNSName": "my-alb-123.us-east-1.elb.amazonaws.com",
          "EvaluateTargetHealth": true
        }
      }
    }]
  }'

CloudFormation

Intermediate

CloudFormation is AWS's native Infrastructure as Code service. It creates and manages resources from YAML/JSON templates.
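Because templates are plain JSON/YAML documents, they can also be generated programmatically. A minimal sketch (the bucket resource and logical IDs are made up for illustration):

```python
import json

# Build a minimal CloudFormation template as a plain dict:
# one versioned S3 bucket, with its name exported as an output.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "ArtifactBucket": {
            "Type": "AWS::S3::Bucket",
            "Properties": {"VersioningConfiguration": {"Status": "Enabled"}},
        }
    },
    "Outputs": {
        "BucketName": {"Value": {"Ref": "ArtifactBucket"}}
    },
}

print(json.dumps(template, indent=2))
```

The resulting file can be deployed with `aws cloudformation deploy --template-file template.json --stack-name artifacts`.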

# cloudformation-template.yaml
AWSTemplateFormatVersion: '2010-09-09'
Description: Web Application Stack

Parameters:
  Environment:
    Type: String
    AllowedValues: [dev, staging, production]
  InstanceType:
    Type: String
    Default: t3.micro
  VpcId:
    Type: AWS::EC2::VPC::Id

Resources:
  WebServerSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Allow HTTP and HTTPS
      VpcId: !Ref VpcId
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          CidrIp: 0.0.0.0/0
        - IpProtocol: tcp
          FromPort: 443
          ToPort: 443
          CidrIp: 0.0.0.0/0

  WebServer:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: !Ref InstanceType
      ImageId: ami-0c55b159cbfafe1f0
      SecurityGroupIds:
        - !Ref WebServerSecurityGroup
      Tags:
        - Key: Name
          Value: !Sub "${Environment}-webserver"

Outputs:
  WebServerPublicIP:
    Value: !GetAtt WebServer.PublicIp
    Description: Public IP of the web server

EKS — Elastic Kubernetes Service

Advanced

EKS runs the Kubernetes control plane for you; eksctl is the quickest way to stand up a cluster with managed node groups.

# Create EKS cluster with eksctl
eksctl create cluster \
  --name production-cluster \
  --region us-east-1 \
  --version 1.29 \
  --nodegroup-name standard-workers \
  --node-type t3.medium \
  --nodes 3 \
  --nodes-min 2 \
  --nodes-max 10 \
  --managed

# Update kubeconfig
aws eks update-kubeconfig --name production-cluster --region us-east-1

# Verify
kubectl get nodes

# Install AWS Load Balancer Controller
helm repo add eks https://aws.github.io/eks-charts
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
  -n kube-system \
  --set clusterName=production-cluster

AWS Well-Architected Framework

Advanced

The 6 pillars every DevOps engineer should know:

  1. Operational Excellence — Automate operations, respond to events, define standards
  2. Security — Protect data, systems, and assets through IAM, encryption, and compliance
  3. Reliability — Recover from failures, meet demand with auto-scaling and multi-AZ
  4. Performance Efficiency — Use compute resources efficiently, right-size instances
  5. Cost Optimization — Avoid unnecessary costs, use Reserved Instances and Savings Plans
  6. Sustainability — Minimize environmental impact of cloud workloads

Cost Optimization

Advanced

Cost Saving Strategies

A well-optimized AWS account can cut compute costs by 40-70% through the right pricing models (Savings Plans, Reserved Instances, Spot) and disciplined resource management (right-sizing, scheduling, and deleting idle resources).
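The headline percentage comes from pricing-model arithmetic like the following. The hourly rates are assumptions for illustration, not the AWS price list:

```python
# Illustrative Reserved Instance savings math -- rates are assumptions.
on_demand_hourly = 0.0416   # assumed on-demand rate for a small instance
reserved_hourly = 0.0250    # assumed 1-year reserved effective rate
hours_per_month = 730

od = on_demand_hourly * hours_per_month
ri = reserved_hourly * hours_per_month
savings_pct = 100 * (od - ri) / od
print(f"{savings_pct:.0f}% saved")   # ~40% from the pricing model alone
```

Stacking a commitment discount like this with right-sizing and shutting down idle resources is how accounts reach the upper end of that range.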