HyperShift CCM Managed Security Groups Lab
This guide provides hands-on steps to:
- Build HyperShift binary
- Build/BYO HyperShift control plane operator image for custom installs
- Install HyperShift using custom images with feature set TechPreviewNoUpgrade
- Install hosted cluster with feature set TechPreviewNoUpgrade
- Uninstall HyperShift operator
Building
# building the binary
make
# building the operator (control-plane-operator) image
export REGISTRY=${REGISTRY:-quay.io/mrbraga}
make docker-build IMG=${REGISTRY}/hypershift-controlplane-operator:devel
Create Layered Hosted Cluster
Prerequisites:
- Self-managed OCP cluster
- KUBECONFIG variable exported from the self-managed cluster
Steps to create a nested cluster, allowing flexibility when destroying layers:
OCP Self Managed -> HCP operator -> Hosted -> HCP Operator -> e2e
# Globals
export AWS_CREDS="$AWS_SHARED_CREDENTIALS_FILE"
export AWS_DEFAULT_REGION=us-east-1
export CLUSTER_BASE_DOMAIN=splat.devcluster.openshift.com
export PULL_SECRET_FILE="${HOME}/.openshift/pull-secret-latest.json"
export SSH_PUB_KEY_FILE=$HOME/.ssh/id_rsa.pub
# Create OIDC generic bucket
export OIDC_BUCKET_NAME="hcp-e2e-oidc"
# Setup Bucket for OIDC discovery documents
bucket_policy_file=${OIDC_BUCKET_NAME}-oidc-workload-clusters_policy.json
aws s3api create-bucket --bucket ${OIDC_BUCKET_NAME}
aws s3api delete-public-access-block --bucket ${OIDC_BUCKET_NAME}
echo '{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::${OIDC_BUCKET_NAME}/*"
    }
  ]
}' | envsubst > ${bucket_policy_file}
aws s3api put-bucket-policy --bucket ${OIDC_BUCKET_NAME} --policy file://${bucket_policy_file}
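As an optional sanity check, confirm the policy landed on the bucket before relying on it for OIDC discovery:

```shell
# Fetch the applied policy and verify it references the bucket objects.
aws s3api get-bucket-policy --bucket ${OIDC_BUCKET_NAME} \
  --query Policy --output text \
  | grep -q "arn:aws:s3:::${OIDC_BUCKET_NAME}/\*" && echo "policy applied"
```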
Install Hosted Cluster to be used as management cluster (Layered)
export CLUSTER_PREFIX=hcp-e2e-v5
# Hypershift operator must be stable (no TP, development, etc)
./bin/hypershift install \
--oidc-storage-provider-s3-bucket-name="${OIDC_BUCKET_NAME}" \
--oidc-storage-provider-s3-credentials="${AWS_CREDS}" \
--oidc-storage-provider-s3-region="${AWS_DEFAULT_REGION}" \
--tech-preview-no-upgrade=true
# Create hosted cluster to host the management cluster
OCP_RELEASE_IMAGE=quay.io/openshift-release-dev/ocp-release:4.22.0-ec.1-x86_64
./bin/hypershift create cluster aws \
--name="${CLUSTER_PREFIX}" \
--region="${AWS_DEFAULT_REGION}" \
--node-pool-replicas=2 \
--base-domain="${CLUSTER_BASE_DOMAIN}" \
--pull-secret="${PULL_SECRET_FILE}" \
--aws-creds="${AWS_CREDS}" \
--ssh-key="${SSH_PUB_KEY_FILE}" \
--release-image="${OCP_RELEASE_IMAGE}" \
--feature-set=TechPreviewNoUpgrade
# Wait for the cluster to be installed
oc get hostedclusters -A -w
# Extract the kubeconfig for the HC/mgr
./bin/hypershift create kubeconfig --name ${CLUSTER_PREFIX} > kubeconfig-${CLUSTER_PREFIX}
export KUBECONFIG_OLD=$KUBECONFIG
export KUBECONFIG=$PWD/kubeconfig-${CLUSTER_PREFIX}
# Ensure that the cluster is stable:
oc get co -w
BYO HyperShift Images
Step 1: Build the Control-Plane-Operator Image
export REGISTRY=${REGISTRY:-quay.io/mrbraga}
export TAG="feat-ccm-nlb-sg-$(git rev-parse --short HEAD)"
export CONTROL_PLANE_IMAGE=${REGISTRY}/hypershift-control-plane-operator:${TAG}
podman build -f Dockerfile.control-plane -t ${CONTROL_PLANE_IMAGE} .
Step 2: Push the Image to Your Registry
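A minimal sketch of the push, reusing the image reference built in Step 1 (assumes you are already authenticated to the registry, e.g. via podman login):

```shell
# Push the control-plane-operator image built in Step 1.
export CONTROL_PLANE_IMAGE=${REGISTRY}/hypershift-control-plane-operator:${TAG}
podman push ${CONTROL_PLANE_IMAGE}
```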
Step 3: Install HyperShift Operator with Custom Images
./bin/hypershift install \
--oidc-storage-provider-s3-bucket-name="${OIDC_BUCKET_NAME}" \
--oidc-storage-provider-s3-credentials="${AWS_CREDS}" \
--oidc-storage-provider-s3-region="${AWS_DEFAULT_REGION}" \
--tech-preview-no-upgrade=true \
--additional-operator-env-vars CONTROL_PLANE_OPERATOR_IMAGE=${CONTROL_PLANE_IMAGE} \
--development
# Scale up the HyperShift operator (the --development flag does not start it)
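A sketch of the scale-up, assuming the default operator deployment name (operator) and namespace (hypershift):

```shell
# --development installs the manifests with the operator deployment scaled
# to zero (so it can be run locally); scale it up to start in-cluster.
oc scale deployment/operator -n hypershift --replicas=1
oc wait deployment/operator -n hypershift --for=condition=Available --timeout=300s
```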
Step 4: Create Layered Hosted Cluster with Custom Operator
Note 1: Use --control-plane-operator-image only when you are building it.
Note 2: Use --feature-set=TechPreviewNoUpgrade only when you want to change the feature set.
HOSTED_CLUSTER_NAME="${CLUSTER_PREFIX}-hc1"
./bin/hypershift create cluster aws \
--name="${HOSTED_CLUSTER_NAME}" \
--region="${AWS_DEFAULT_REGION}" \
--node-pool-replicas=2 \
--base-domain="${CLUSTER_BASE_DOMAIN}" \
--pull-secret="${PULL_SECRET_FILE}" \
--aws-creds="${AWS_CREDS}" \
--ssh-key="${SSH_PUB_KEY_FILE}" \
--release-image="${OCP_RELEASE_IMAGE}" \
--control-plane-operator-image="${CONTROL_PLANE_IMAGE}" \
--feature-set=TechPreviewNoUpgrade
# Check the cluster information:
oc get --namespace clusters hostedclusters
# When completed, extract the credentials for workload cluster:
./bin/hypershift create kubeconfig --name ${HOSTED_CLUSTER_NAME} > kubeconfig-${HOSTED_CLUSTER_NAME}
# kubeconfig for management cluster
export KUBECONFIG_MGR=$KUBECONFIG
# kubeconfig for workload cluster
export KUBECONFIG=$PWD/kubeconfig-${HOSTED_CLUSTER_NAME}
Testing
CCM Upstream Tests
cd ${PATH_TO_CCM}
make e2e.test
./e2e.test --ginkgo.v --ginkgo.focus="loadbalancer NLB internal should be reachable with hairpinning traffic"
./e2e.test --ginkgo.v --ginkgo.focus="loadbalancer NLB should be reachable with target-node-labels"
HyperShift e2e
Reference: HyperShift e2e docs
make e2e
bin/test-e2e \
-test.v \
-test.run=TestCCMCreateCluster/Main/When_feature_set_is_TechPreviewNoUpgrade/LoadBalancer_service_should_have_security_groups_attached \
-test.timeout=2h \
--e2e.aws-region=${AWS_DEFAULT_REGION} \
--e2e.aws-credentials-file="${AWS_SHARED_CREDENTIALS_FILE}" \
--e2e.pull-secret-file="${PULL_SECRET_FILE}" \
--e2e.base-domain="${CLUSTER_BASE_DOMAIN}" \
--e2e.platform=AWS \
--e2e.latest-release-image="${OCP_RELEASE_IMAGE}" \
--e2e.aws-oidc-s3-bucket-name="${OIDC_BUCKET_NAME}" \
--e2e.aws-oidc-s3-region="${AWS_DEFAULT_REGION}"
Alternative test target with custom CPO image:
bin/test-e2e \
-test.v --ginkgo.vv \
-test.run=TestCreateCluster/Main/AWSCCMWithCustomizations \
-test.timeout=30m \
--e2e.platform=AWS \
--e2e.aws-region=${AWS_DEFAULT_REGION} \
--e2e.aws-credentials-file="${AWS_SHARED_CREDENTIALS_FILE}" \
--e2e.pull-secret-file="${PULL_SECRET_FILE}" \
--e2e.base-domain="${CLUSTER_BASE_DOMAIN}" \
--e2e.aws-oidc-s3-bucket-name="${OIDC_BUCKET_NAME}" \
--e2e.control-plane-operator-image="${CONTROL_PLANE_IMAGE}"
Manual NLB Verification
Create a test LoadBalancer Service in the guest cluster and verify security groups:
KUBECONFIG_HC=$PWD/kubeconfig-${HOSTED_CLUSTER_NAME}
oc --kubeconfig=${KUBECONFIG_HC} create namespace test-ccm-nlb-sg
oc --kubeconfig=${KUBECONFIG_HC} apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
name: test-nlb-service
namespace: test-ccm-nlb-sg
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
spec:
type: LoadBalancer
selector:
app: test
ports:
- port: 80
targetPort: 8080
protocol: TCP
EOF
oc --kubeconfig=${KUBECONFIG_HC} get svc -n test-ccm-nlb-sg test-nlb-service -w
Once the LoadBalancer has an external hostname, verify security groups are attached:
LB_HOSTNAME=$(oc --kubeconfig=${KUBECONFIG_HC} get svc -n test-ccm-nlb-sg test-nlb-service \
-o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
LB_NAME=$(echo ${LB_HOSTNAME} | cut -d'.' -f1 | rev | cut -d'-' -f2- | rev)
aws elbv2 describe-load-balancers --names ${LB_NAME} --region ${AWS_DEFAULT_REGION} \
--query 'LoadBalancers[0].SecurityGroups'
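To turn the manual check into a pass/fail assertion, count the attached groups (jq's length of null is 0, so a missing SecurityGroups field reads as zero):

```shell
# An empty or absent list means managed security groups are not active.
SG_COUNT=$(aws elbv2 describe-load-balancers --names ${LB_NAME} \
  --region ${AWS_DEFAULT_REGION} \
  --query 'LoadBalancers[0].SecurityGroups' --output json | jq 'length')
if [ "${SG_COUNT}" -ge 1 ]; then
  echo "OK: ${SG_COUNT} security group(s) attached"
else
  echo "FAIL: no security groups attached"
fi
```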
Cleanup:
oc --kubeconfig=${KUBECONFIG_HC} delete svc test-nlb-service -n test-ccm-nlb-sg
oc --kubeconfig=${KUBECONFIG_HC} delete namespace test-ccm-nlb-sg
Iteration Workflow
When iterating on CPO code changes without recreating the cluster:
# Rebuild with a new tag
export TAG="feat-ccm-nlb-sg-$(git rev-parse --short HEAD)"
export CONTROL_PLANE_IMAGE=${REGISTRY}/hypershift-control-plane-operator:${TAG}
podman build -f Dockerfile.control-plane -t ${CONTROL_PLANE_IMAGE} .
podman push ${CONTROL_PLANE_IMAGE}
# Patch the running HostedCluster to use the new image
oc patch hostedcluster ${HOSTED_CLUSTER_NAME} -n clusters --type=merge \
-p "{\"spec\":{\"controlPlaneOperatorImage\":\"${CONTROL_PLANE_IMAGE}\"}}"
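After the patch, the CPO deployment rolls out with the new image; a quick way to watch it, assuming the hosted control plane namespace follows the default clusters-<name> pattern:

```shell
# Wait for the control-plane-operator deployment to pick up the new image.
oc rollout status deployment/control-plane-operator \
  -n clusters-${HOSTED_CLUSTER_NAME} --timeout=300s
# Confirm the running image matches what was pushed.
oc get deployment/control-plane-operator -n clusters-${HOSTED_CLUSTER_NAME} \
  -o jsonpath='{.spec.template.spec.containers[0].image}'
```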
Destroy HyperShift Operator
# Render the manifests that were installed
bin/hypershift install render > hypershift-manifests.yaml
# delete the controller
oc delete ns hypershift
# Delete using the rendered manifests
oc delete -f hypershift-manifests.yaml
Troubleshooting
Verify CPO Image in Use
oc get pods -n clusters-${HOSTED_CLUSTER_NAME} -l app=control-plane-operator
oc get deployment -n clusters-${HOSTED_CLUSTER_NAME} \
-l hypershift.openshift.io/control-plane-component \
-o jsonpath='{.items[*].spec.template.spec.containers[*].image}'
Verify NLBSecurityGroupMode in CCM Config
oc get configmap aws-cloud-config -n clusters-${HOSTED_CLUSTER_NAME} \
-o jsonpath='{.data.aws\.conf}' | grep NLBSecurityGroupMode
IAM Permissions for Control Plane Components
Extract credentials from the CCM pod to validate IAM permissions:
CCM_POD=$(oc get pod -l app=cloud-controller-manager \
-n clusters-${HOSTED_CLUSTER_NAME} \
-o jsonpath='{.items[0].metadata.name}')
COMPONENT_IAM_ROLE=$(oc exec ${CCM_POD} \
-c cloud-controller-manager \
-n clusters-${HOSTED_CLUSTER_NAME} \
-- cat /etc/kubernetes/secrets/cloud-provider/credentials \
| grep role_arn | awk -F"= " '{print$2}')
TOKEN_PATH=$(oc exec ${CCM_POD} \
-c cloud-controller-manager \
-n clusters-${HOSTED_CLUSTER_NAME} \
-- cat /etc/kubernetes/secrets/cloud-provider/credentials \
| grep web_identity | awk -F"= " '{print$2}')
COMPONENT_TOKEN=$(oc exec ${CCM_POD} \
-c cloud-controller-manager \
-n clusters-${HOSTED_CLUSTER_NAME} \
-- cat ${TOKEN_PATH})
Assume the role and get temporary credentials:
aws sts assume-role-with-web-identity \
--role-arn "${COMPONENT_IAM_ROLE}" \
--web-identity-token "${COMPONENT_TOKEN}" \
--role-session-name my-session \
--query "Credentials.[AccessKeyId,SecretAccessKey,SessionToken]" \
--output text \
| awk '{print "[default]\naws_region = us-east-1\naws_access_key_id = "$1"\naws_secret_access_key = "$2"\naws_session_token = "$3}' \
| tee ./workload-creds.conf
Verify identity and simulate permissions:
AWS_SHARED_CREDENTIALS_FILE=$PWD/workload-creds.conf \
aws sts get-caller-identity
aws iam simulate-principal-policy \
--policy-source-arn ${COMPONENT_IAM_ROLE} \
--action-names \
"elasticloadbalancing:DescribeLoadBalancers" \
"elasticloadbalancing:DescribeTargetGroupAttributes" \
| jq -r '(["EVAL_ACTION","EVAL_DECISION"] | (., map(length*"-"))),
(.EvaluationResults[] | [.EvalActionName, .EvalDecision ])
| @tsv' | column -t
Example output: