
OCP on AWS Local Zones - hands on steps

A quick walkthrough of installing OpenShift on AWS into an existing VPC, extending compute nodes to Local Zones.

The steps below are a hands-on guide to installing a cluster into a customized VPC (existing network), then deploying an application and exposing it through an Ingress backed by a dedicated Application Load Balancer controller. A quick validation is done from a source outside the cluster, in the metropolitan region where the app is deployed.

The goal is not to explain each step. If you are looking for more details, refer to the official OpenShift documentation.


Creating the Network/VPC

  • Create the Network Stack: VPC and Local Zone subnet
export CLUSTER_REGION=us-east-1
export CLUSTER_NAME=byon-lz415-00

WORKDIR=${PWD}
export INSTALL_DIR=${WORKDIR}/install-${CLUSTER_NAME}
mkdir -p $INSTALL_DIR

TEMPLATES_BASE=https://raw.githubusercontent.com/openshift/installer
TEMPLATES_VERSION=master
TEMPLATES_PATH=upi/aws/cloudformation

TEMPLATE_URL=${TEMPLATES_BASE}/${TEMPLATES_VERSION}/${TEMPLATES_PATH}
TEMPLATES=( "01_vpc.yaml" )
TEMPLATES+=( "01.99_net_local-zone.yaml" )

for TEMPLATE in "${TEMPLATES[@]}"; do
  echo "Updating ${TEMPLATE}"
  curl -sL "${TEMPLATE_URL}/${TEMPLATE}" > "${WORKDIR}/${TEMPLATE}"
done

export STACK_VPC=${CLUSTER_NAME}-vpc
aws cloudformation create-stack \
  --region ${CLUSTER_REGION} \
  --stack-name ${STACK_VPC} \
  --template-body file://${WORKDIR}/01_vpc.yaml \
  --parameters \
    ParameterKey=VpcCidr,ParameterValue="10.0.0.0/16" \
    ParameterKey=AvailabilityZoneCount,ParameterValue=3 \
    ParameterKey=SubnetBits,ParameterValue=12

aws --region $CLUSTER_REGION cloudformation wait stack-create-complete --stack-name ${STACK_VPC}
aws --region $CLUSTER_REGION cloudformation describe-stacks --stack-name ${STACK_VPC}

# Randomly choose one of the Local Zones available in the Region
AZ_NAME=$(aws --region $CLUSTER_REGION ec2 describe-availability-zones \
  --filters Name=opt-in-status,Values=opted-in Name=zone-type,Values=local-zone \
  | jq -r .AvailabilityZones[].ZoneName | shuf | head -n1)

AZ_NAME="us-east-1-nyc-1a"
AZ_SUFFIX=$(echo ${AZ_NAME/${CLUSTER_REGION}-/})

AZ_GROUP=$(aws --region $CLUSTER_REGION ec2 describe-availability-zones \
  --filters Name=zone-name,Values=$AZ_NAME \
  | jq -r .AvailabilityZones[].GroupName)

export STACK_LZ=${CLUSTER_NAME}-lz-${AZ_SUFFIX}
export ZONE_GROUP_NAME=${AZ_GROUP}
export VPC_ID=$(aws --region $CLUSTER_REGION cloudformation describe-stacks \
  --stack-name ${STACK_VPC} \
  | jq -r '.Stacks[0].Outputs[] | select(.OutputKey=="VpcId").OutputValue' )

export VPC_RTB_PUB=$(aws --region $CLUSTER_REGION cloudformation describe-stacks \
  --stack-name ${STACK_VPC} \
  | jq -r '.Stacks[0].Outputs[] | select(.OutputKey=="PublicRouteTableId").OutputValue' )

aws --region $CLUSTER_REGION ec2 modify-availability-zone-group \
    --group-name "${ZONE_GROUP_NAME}" \
    --opt-in-status opted-in
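
Opt-in can take a moment to propagate; a quick sanity check of the zone group status before creating the subnet stack (using `jq`, as above):

aws --region $CLUSTER_REGION ec2 describe-availability-zones \
  --all-availability-zones \
  --filters Name=zone-name,Values=$AZ_NAME \
  | jq -r '.AvailabilityZones[].OptInStatus'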

aws --region $CLUSTER_REGION cloudformation create-stack --stack-name ${STACK_LZ} \
     --template-body file://${WORKDIR}/01.99_net_local-zone.yaml \
     --parameters \
        ParameterKey=VpcId,ParameterValue="${VPC_ID}" \
        ParameterKey=PublicRouteTableId,ParameterValue="${VPC_RTB_PUB}" \
        ParameterKey=ZoneName,ParameterValue="${AZ_NAME}" \
        ParameterKey=SubnetName,ParameterValue="${CLUSTER_NAME}-public-${AZ_NAME}" \
        ParameterKey=PublicSubnetCidr,ParameterValue="10.0.128.0/20"

aws --region $CLUSTER_REGION cloudformation wait stack-create-complete --stack-name ${STACK_LZ}
aws --region $CLUSTER_REGION cloudformation describe-stacks --stack-name ${STACK_LZ}

mapfile -t SUBNETS < <(aws --region $CLUSTER_REGION cloudformation describe-stacks \
  --stack-name "${STACK_VPC}" \
  | jq -r '.Stacks[0].Outputs[0].OutputValue' | tr ',' '\n')

mapfile -t -O "${#SUBNETS[@]}" SUBNETS < <(aws --region $CLUSTER_REGION cloudformation describe-stacks \
  --stack-name "${STACK_VPC}" \
  | jq -r '.Stacks[0].Outputs[1].OutputValue' | tr ',' '\n')

# Get the Local Zone subnet ID
export SUBNET_ID=$(aws --region $CLUSTER_REGION cloudformation describe-stacks --stack-name "${STACK_LZ}" \
  | jq -r .Stacks[0].Outputs[0].OutputValue)


  • Append the LZ subnet to the subnets array (available only in phase-1 of the Local Zones implementation; not covered in the HC blog):

Phase-1 means the installer should discover the Local Zone subnet by its ID, parse it, and automatically create the MachineSets for those zones.

echo ${SUBNETS[*]}
SUBNETS+=(${SUBNET_ID})
echo ${SUBNETS[*]}
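
As a sanity check (assuming `jq` is available), confirm the appended subnet really lives in the Local Zone:

aws --region $CLUSTER_REGION ec2 describe-subnets --subnet-ids "${SUBNET_ID}" \
  | jq -r '.Subnets[0].AvailabilityZone'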

Setup the installer

Create the install-config.yaml

  • Create the install-config, setting the subnet IDs

Installing into an existing VPC (phase 1)

export BASE_DOMAIN=devcluster.openshift.com
export SSH_PUB_KEY_FILE=$HOME/.ssh/id_rsa.pub

CLUSTER_NAME_VARIANT=${CLUSTER_NAME}1
INSTALL_DIR=${CLUSTER_NAME_VARIANT}
mkdir $INSTALL_DIR

cat <<EOF > ${INSTALL_DIR}/install-config.yaml
apiVersion: v1
publish: External
baseDomain: ${BASE_DOMAIN}
metadata:
  name: "${CLUSTER_NAME_VARIANT}"
platform:
  aws:
    region: ${CLUSTER_REGION}
    subnets:
$(for SB in ${SUBNETS[*]}; do echo "    - $SB"; done)
pullSecret: '$(cat ${PULL_SECRET_FILE} | awk -v ORS= -v OFS= '{$1=$1}1')'
sshKey: |
  $(cat ${SSH_PUB_KEY_FILE})
EOF
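
Optionally review the rendered subnet list (using `yq`, as later in this guide):

yq ea .platform.aws.subnets ${INSTALL_DIR}/install-config.yaml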
  • Default (without subnets)

The installer creates the VPC and LZ subnets (phase 2)

export BASE_DOMAIN=devcluster.openshift.com
export SSH_PUB_KEY_FILE=$HOME/.ssh/id_rsa.pub

INSTALL_DIR=${CLUSTER_NAME}-7
mkdir $INSTALL_DIR

cat <<EOF > ${INSTALL_DIR}/install-config.yaml
apiVersion: v1
publish: External
baseDomain: ${BASE_DOMAIN}
metadata:
  name: "${CLUSTER_NAME}"
platform:
  aws:
    region: ${CLUSTER_REGION}
pullSecret: '$(cat ${PULL_SECRET_FILE} |awk -v ORS= -v OFS= '{$1=$1}1')'
sshKey: |
  $(cat ${SSH_PUB_KEY_FILE})
EOF
  • Install config setting hostedZone when installing into an existing VPC with LZ subnets
# 1) Manual action: create a Private Hosted Zone associated with the existing VPC
# 2) Discover its ID - or populate the hostedZone variable directly
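# Example for step 1 (assumption: the zone does not exist yet) - create the
# Private Hosted Zone associated with the existing VPC:
aws route53 create-hosted-zone \
  --name "${CLUSTER_NAME}.${BASE_DOMAIN}" \
  --caller-reference "$(date +%s)" \
  --vpc VPCRegion="${CLUSTER_REGION}",VPCId="${VPC_ID}"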
PHZ_BASE_DOMAIN="${CLUSTER_NAME}.${BASE_DOMAIN}"
hosted_zone_id="$(aws route53 list-hosted-zones-by-name \
            --dns-name "${PHZ_BASE_DOMAIN}" \
            --query "HostedZones[? Config.PrivateZone != \`true\` && Name == \`${PHZ_BASE_DOMAIN}.\`].Id" \
            --output text)"

echo "${hosted_zone_id}"
hostedZone=$(basename "${hosted_zone_id}")

cat <<EOF > ${INSTALL_DIR}/install-config.yaml
apiVersion: v1
publish: External
baseDomain: ${BASE_DOMAIN}
metadata:
  name: "${CLUSTER_NAME}"
platform:
  aws:
    region: ${CLUSTER_REGION}
    hostedZone: $hostedZone
    subnets:
$(for SB in ${SUBNETS[*]}; do echo "    - $SB"; done)
pullSecret: '$(cat ${PULL_SECRET_FILE} |awk -v ORS= -v OFS= '{$1=$1}1')'
sshKey: |
  $(cat ${SSH_PUB_KEY_FILE})
EOF

Create the manifests

export BASE_DOMAIN=devcluster.openshift.com
export SSH_PUB_KEY_FILE=$HOME/.ssh/id_rsa.pub

CLUSTER_NAME_VARIANT=${CLUSTER_NAME}1
INSTALL_DIR=${CLUSTER_NAME_VARIANT}
mkdir $INSTALL_DIR

cat <<EOF > ${INSTALL_DIR}/install-config.yaml
apiVersion: v1
publish: External
baseDomain: ${BASE_DOMAIN}
metadata:
  name: "${CLUSTER_NAME_VARIANT}"
platform:
  aws:
    region: ${CLUSTER_REGION}
    subnets:
$(for SB in ${SUBNETS[*]}; do echo "    - $SB"; done)
compute:
- name: edge
  replicas: 0
pullSecret: '$(cat ${PULL_SECRET_FILE} | awk -v ORS= -v OFS= '{$1=$1}1')'
sshKey: |
  $(cat ${SSH_PUB_KEY_FILE})
EOF

# Installer version/path
export INSTALLER=./openshift-install
export RELEASE="quay.io/openshift-release-dev/ocp-release:4.15.0-ec.2-x86_64"

cp $INSTALL_DIR/install-config.yaml $INSTALL_DIR/install-config.yaml-bkp

# Process the manifests
OPENSHIFT_INSTALL_RELEASE_IMAGE_OVERRIDE="$RELEASE" $INSTALLER version
OPENSHIFT_INSTALL_RELEASE_IMAGE_OVERRIDE="$RELEASE" $INSTALLER create manifests --dir $INSTALL_DIR

# Review whether the MTU patch has been created
ls -ls $INSTALL_DIR/manifests/cluster-network-*
yq ea .spec.defaultNetwork $INSTALL_DIR/manifests/cluster-network-03-config.yml

# Review whether the MachineSet for the Local Zone has been created
ls -la $INSTALL_DIR/openshift/99_openshift-cluster-api_worker-machineset*
cat $INSTALL_DIR/openshift/99_openshift-cluster-api_worker-machineset-3.yaml
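
To confirm the generated MachineSet targets the Local Zone, inspect its placement (same `yq` usage as above):

yq ea .spec.template.spec.providerSpec.value.placement \
  $INSTALL_DIR/openshift/99_openshift-cluster-api_worker-machineset-3.yaml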
  • Phase-0 only: manually patch the cluster network to use a lower MTU
cat <<EOF > $INSTALL_DIR/manifests/cluster-network-03-config.yml
apiVersion: operator.openshift.io/v1
kind: Network
metadata:
  name: cluster
spec:
  defaultNetwork:
    ovnKubernetesConfig:
      mtu: 1200
EOF
cat $INSTALL_DIR/manifests/cluster-network-03-config.yml
  • Create the cluster
OPENSHIFT_INSTALL_RELEASE_IMAGE_OVERRIDE="$RELEASE" \
  $INSTALLER create cluster --dir $INSTALL_DIR --log-level=debug
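
When the installation finishes, the installer writes the credentials under the install directory; point `oc` at the new cluster and confirm the edge node joined:

export KUBECONFIG=$INSTALL_DIR/auth/kubeconfig
oc get nodes -l node-role.kubernetes.io/edge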

Setup the AWS LB Operator

ALB Operator prerequisites

  • The ALB Operator requires the Kubernetes cluster tag (kubernetes.io/cluster/<infra-id>) on the VPC; without it, the controller manager fails to start:
$ oc logs pod/aws-load-balancer-operator-controller-manager-56664699b4-kjdd4 -n aws-load-balancer-operator
I0207 20:32:30.393782       1 request.go:682] Waited for 1.0400816s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/ingress.operator.openshift.io/v1?timeout=32s
1.6758019517970793e+09  INFO    controller-runtime.metrics  Metrics server is starting to listen    {"addr": "127.0.0.1:8080"}
1.6758019518810503e+09  ERROR   setup   failed to get VPC ID    {"error": "no VPC with tag \"kubernetes.io/cluster/ocp-lz-2nnns\" found"}
main.main
    /remote-source/workspace/app/main.go:133
runtime.main
    /usr/lib/golang/src/runtime/proc.go:250

Tag the VPC:

CLUSTER_ID=$(oc get infrastructures cluster -o jsonpath='{.status.infrastructureName}')
aws ec2 create-tags --resources ${VPC_ID} \
  --tags Key="kubernetes.io/cluster/${CLUSTER_ID}",Value="shared" \
  --region ${CLUSTER_REGION}
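
Verify the tag was applied (assumes VPC_ID is still exported from the network steps):

aws --region ${CLUSTER_REGION} ec2 describe-tags \
  --filters "Name=resource-id,Values=${VPC_ID}" \
            "Name=key,Values=kubernetes.io/cluster/${CLUSTER_ID}"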

Install the ALB Operator

Steps to install the AWS Load Balancer (ALB) Operator.

References:

  • Installing ALB Operator
  • Understanding the AWS Load Balancer Operator
  • Installing from OperatorHub using the CLI

Create the Credentials for the Operator

oc create namespace aws-load-balancer-operator
cat <<EOF | oc create -f -
apiVersion: cloudcredential.openshift.io/v1
kind: CredentialsRequest
metadata:
  name: aws-load-balancer-operator
  namespace: openshift-cloud-credential-operator
spec:
  providerSpec:
    apiVersion: cloudcredential.openshift.io/v1
    kind: AWSProviderSpec
    statementEntries:
      - action:
          - ec2:DescribeSubnets
        effect: Allow
        resource: "*"
      - action:
          - ec2:CreateTags
          - ec2:DeleteTags
        effect: Allow
        resource: arn:aws:ec2:*:*:subnet/*
      - action:
          - ec2:DescribeVpcs
        effect: Allow
        resource: "*"
  secretRef:
    name: aws-load-balancer-operator
    namespace: aws-load-balancer-operator
  serviceAccountNames:
    - aws-load-balancer-operator-controller-manager
EOF
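
The Cloud Credential Operator should mint the secret referenced in secretRef; confirm it exists before installing the operator:

oc get secret aws-load-balancer-operator -n aws-load-balancer-operator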

Install the Operator from OLM

cat <<EOF | oc create -f -
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: aws-load-balancer-operator
  namespace: aws-load-balancer-operator
spec:
  targetNamespaces:
  - aws-load-balancer-operator
EOF

Create the subscription

cat <<EOF | oc create -f -
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: aws-load-balancer-operator
  namespace: aws-load-balancer-operator
spec:
  channel: stable-v0
  installPlanApproval: Automatic 
  name: aws-load-balancer-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF

Check whether the install plan has been automatically approved:

$ oc get installplan -n aws-load-balancer-operator
NAME            CSV                                 APPROVAL    APPROVED
install-qlsxz   aws-load-balancer-operator.v0.2.0   Automatic   true
install-x7vwn   aws-load-balancer-operator.v0.2.0   Automatic   true

Check whether the operator has been deployed correctly:

$ oc get all -n aws-load-balancer-operator
NAME                                                                 READY   STATUS    RESTARTS   AGE
pod/aws-load-balancer-operator-controller-manager-56664699b4-j77js   2/2     Running   0          100s

NAME                                                                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/aws-load-balancer-operator-controller-manager-metrics-service   ClusterIP   172.30.57.143   <none>        8443/TCP   12m

NAME                                                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/aws-load-balancer-operator-controller-manager   1/1     1            1           12m

NAME                                                                       DESIRED   CURRENT   READY   AGE
replicaset.apps/aws-load-balancer-operator-controller-manager-56664699b4   1         1         1       12m

Create the ALB Controller

Steps to create the ALB Controller.

Creating an instance of AWS Load Balancer Controller

cat <<EOF | oc create -f -
apiVersion: networking.olm.openshift.io/v1alpha1
kind: AWSLoadBalancerController 
metadata:
  name: cluster 
spec:
  subnetTagging: Auto 
  ingressClass: cloud 
  config:
    replicas: 2 
  enabledAddons: 
    - AWSWAFv2
EOF

Check whether the controller was installed:

$ oc get all -n aws-load-balancer-operator -l app.kubernetes.io/name=aws-load-balancer-operator
NAME                                                        READY   STATUS    RESTARTS   AGE
pod/aws-load-balancer-controller-cluster-67b6dd6974-6r6tp   1/1     Running   0          43s
pod/aws-load-balancer-controller-cluster-67b6dd6974-vw5vw   1/1     Running   0          43s

NAME                                                              DESIRED   CURRENT   READY   AGE
replicaset.apps/aws-load-balancer-controller-cluster-67b6dd6974   2         2         2       44s
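
The operator should also register the IngressClass named in the CR (cloud, above); verify it is present:

oc get ingressclass cloud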

Deploy a sample APP on AWS Local Zones

Deploy a sample application and a service:

cat << EOF | oc create -f -
---
apiVersion: v1
kind: Namespace
metadata:
  name: lz-apps
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: lz-app-nyc-1
  namespace: lz-apps
spec:
  selector:
    matchLabels:
      app: lz-app-nyc-1
  replicas: 1
  template:
    metadata:
      labels:
        app: lz-app-nyc-1
        zone_group: us-east-1-nyc-1
    spec:
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      nodeSelector:
        zone_group: us-east-1-nyc-1
      tolerations:
      - key: "node-role.kubernetes.io/edge"
        operator: "Equal"
        value: ""
        effect: "NoSchedule"
      containers:
        - image: openshift/origin-node
          command:
           - "/bin/socat"
          args:
            - TCP4-LISTEN:8080,reuseaddr,fork
            - EXEC:'/bin/bash -c \"printf \\\"HTTP/1.0 200 OK\r\n\r\n\\\"; sed -e \\\"/^\r/q\\\"\"'
          imagePullPolicy: Always
          name: echoserver
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service 
metadata:
  name:  lz-app-nyc-1 
  namespace: lz-apps
spec:
  ports:
    - port: 80
      targetPort: 8080
      protocol: TCP
  type: NodePort
  selector:
    app: lz-app-nyc-1
EOF
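
Confirm the pod was scheduled onto the Local Zone node (the node name is visible with wide output):

oc get pods -n lz-apps -o wide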

Create the Ingress with the Local Zone subnet:

cat << EOF | oc create -f -
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-lz-nyc-1
  namespace: lz-apps
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: instance
    alb.ingress.kubernetes.io/subnets: ${SUBNET_ID}
  labels:
    zone_group: us-east-1-nyc-1
spec:
  ingressClassName: cloud
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: lz-app-nyc-1
                port:
                  number: 80
EOF
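
Provisioning the ALB can take a few minutes; a simple poll until the hostname appears in the Ingress status:

until oc get ingress -n lz-apps ingress-lz-nyc-1 \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}' 2>/dev/null | grep -q elb; do
  echo "Waiting for the ALB to be provisioned..."; sleep 15
done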

Test the endpoint (using ALB DNS):

$ HOST=$(oc get ingress -n lz-apps ingress-lz-nyc-1 --template='{{(index .status.loadBalancer.ingress 0).hostname}}')
$ echo $HOST
k8s-lzapps-ingressl-49a869b572-66443804.us-east-1.elb.amazonaws.com

$ curl $HOST
GET / HTTP/1.1
X-Forwarded-For: 179.181.81.124
X-Forwarded-Proto: http
X-Forwarded-Port: 80
Host: k8s-lzapps-ingressl-13226f2551-de.us-east-1.elb.amazonaws.com
X-Amzn-Trace-Id: Root=1-63e18147-1532a244542b04bc75ffd473
User-Agent: curl/7.61.1
Accept: */*

Test from different source locations:

export HOST=k8s-lzapps-ingressl-49a869b572-66443804.us-east-1.elb.amazonaws.com

# [1] NYC (outside AWS backbone)
$ curl -s http://ip-api.com/json/$(curl -s ifconfig.me) |jq -r '[.city, .countryCode]'
[
  "North Bergen",
  "US"
]
$ curl -sw "%{time_namelookup}   %{time_connect}     %{time_starttransfer}    %{time_total}\n" -o /dev/null $HOST
0.001452   0.004079     0.008914    0.009830

# [2] Within the Region (master nodes)
$ oc debug node/$(oc get nodes -l node-role.kubernetes.io/master -o jsonpath={'.items[0].metadata.name'}) -- chroot /host /bin/bash -c "\
hostname; \
curl -s http://ip-api.com/json/\$(curl -s ifconfig.me) |jq -r '[.city, .countryCode]';\
curl -sw \"%{time_namelookup}   %{time_connect}     %{time_starttransfer}    %{time_total}\\n\" -o /dev/null $HOST"
ip-10-0-54-118
[
  "Ashburn",
  "US"
]
0.002068   0.010196     0.019962    0.020985

# [3] London (outside AWS backbone)
$ curl -s http://ip-api.com/json/$(curl -s ifconfig.me) |jq -r '[.city, .countryCode]'
[
  "London",
  "GB"
]
$ curl -sw "%{time_namelookup}   %{time_connect}     %{time_starttransfer}    %{time_total}\n" -o /dev/null $HOST
0.003332   0.099921     0.197535    0.198802

# [4] Brazil
$ curl -s http://ip-api.com/json/$(curl -s ifconfig.me) |jq -r '[.city, .countryCode]'
[
  "Florianópolis",
  "BR"
]
$ curl -sw "%{time_namelookup}   %{time_connect}     %{time_starttransfer}    %{time_total}\n" -o /dev/null $HOST
0.022869   0.187408     0.355456    0.356435
Summary of connect times (seconds, time_connect):

| Server \ Client  | [1] NYC/US | [2] AWS Region/use1 | [3] London/UK | [4] Brazil |
|------------------|------------|---------------------|---------------|------------|
| us-east-1-nyc-1a | 0.004079   | 0.010196            | 0.099921      | 0.187408   |

Deploy sample Local Zone app with persistent storage

To deploy an application on a Local Zone subnet (edge nodes) with persistent storage, you must check which kind of storage is available at that location. At the time this article was written, most locations support only gp2 EBS storage. The default StorageClass on OCP is gp3, so you must set the gp2-csi StorageClass on the deployment to attach EBS volumes to your pods:
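
Check the storage classes available on the cluster before choosing:

oc get storageclass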

cat <<EOF | oc create -f -
kind: Namespace
apiVersion: v1
metadata:
  name: lz-app-ns
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: lz-pvc
  namespace: lz-app-ns
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: gp2-csi 
  volumeMode: Filesystem
---
apiVersion: apps/v1
kind: Deployment 
metadata:
  name: lz-app
  namespace: lz-app-ns 
spec:
  selector:
    matchLabels:
      app: lz-app
  replicas: 1
  template:
    metadata:
      labels:
        app: lz-app
        machine.openshift.io/zone-group: ${ZONE_GROUP_NAME} 
    spec:
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      nodeSelector: 
        machine.openshift.io/zone-group: ${ZONE_GROUP_NAME}
      tolerations: 
      - key: "node-role.kubernetes.io/edge"
        operator: "Equal"
        value: ""
        effect: "NoSchedule"
      containers:
        - image: openshift/origin-node
          command:
           - "/bin/socat"
          args:
            - TCP4-LISTEN:8080,reuseaddr,fork
            - EXEC:'/bin/bash -c \"printf \\\"HTTP/1.0 200 OK\r\n\r\n\\\"; sed -e \\\"/^\r/q\\\"\"'
          imagePullPolicy: Always
          name: echoserver
          ports:
            - containerPort: 8080
          volumeMounts:
            - mountPath: "/mnt/storage"
              name: data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: lz-pvc
EOF

cat <<EOF | oc create -f -
apiVersion: v1
kind: Service 
metadata:
  name: lz-app-svc
  namespace: lz-app-ns
spec:
  ports:
    - port: 80
      targetPort: 8080
      protocol: TCP
  type: NodePort
  selector: 
    app: lz-app
EOF

# Verify the zone group, machines, and app resources
echo $ZONE_GROUP_NAME

oc get machines -n openshift-machine-api

oc get pvc -n lz-app-ns

oc get all -n lz-app-ns