[draft/wip] Deploy low-latency applications on the edge with OpenShift, AWS Local Zones and ALB Ingress Controller
Article split from the Day-2 post, intended to cover only the ALB setup and benchmark
info:
Status: WIP
PR to Collab (feel free to review)
Preview on Dev.to
Series Name: OpenShift in the edge
Series Post: #1 Setup Machines;
Series post id: #2
Let's talk about delivering applications with single-digit millisecond latency to end users on OpenShift Container Platform clusters using AWS Local Zones.
{% details Expand to know how to create Machines running in AWS Local Zones %}
- ToDo
{% enddetails %}
AWS Local Zones were created to place cloud infrastructure closer to large cities and IT centers, helping businesses deliver their solutions to end users faster.
As always, you need to design your application architecture to take advantage of that feature without being limited or negatively impacted. So let's discuss some options to deploy your application using edge resources on Local Zones, avoiding requests being routed to other locations.
I will walk through an example solution that relies on user-close infrastructure, describing the steps to create ingresses distributed across the big city-center networks and then deploy the application to the edge locations. Finally, I will deploy one instance of the sample "geo-app" application on the nodes distributed on the edge, then run a few tests to collect the latency from different locations (users) to different zone groups (parent/main region and Local Zones).
Summary
User stories
- As a company with hybrid cloud architecture, I would like to process real-time machine learning models in AWS specialized instances closer to my application.
- As a Regional Bakery operating within the eastern US, I want to deliver custom cakes to the city where my customers are, advertising the closest stores with availability and an estimated delivery time.
- As a doctor relying on telemedicine solutions, I would like to get exam results quickly in order to make fast decisions in an emergency. These kinds of systems demand ML processing with high computing power closer to the end users.
What you need to know
- The ALB Operator will be available to install through OLM
- You can install the Operator from source code
Resources available in Local Zones are limited when compared to the parent region. For example, compute instances are very limited in terms of sizes and types, and for ELB only the Application Load Balancer is supported. The price is also not the same: the compute price of a c5d.2xlarge in NYC (New York) is 20% higher than in the parent region, N. Virginia (us-east-1).
For that reason, it’s important to make sure that your architecture will take advantage of running close to the users.
To go deeper into the details about the limitations and pricing, check the Local Zones and EC2 pricing page.
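If you want to verify which instance types are actually offered in a given Local Zone before picking one, a query like the following works (the zone name us-east-1-nyc-1a is just an example):
aws ec2 describe-instance-type-offerings \
  --location-type availability-zone \
  --filters Name=location,Values=us-east-1-nyc-1a \
  --region us-east-1 \
  --query 'InstanceTypeOfferings[].InstanceType' \
  --output text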
Reference Architecture
The demo application used in this article takes advantage of the users' geo-location to deliver the content. Look at the diagram:
Requirements
Requirements:
- AWS CLI (used `1.20.52`)
- `jq` (used `1.5`)
- OpenShift client (`oc`)
Considerations:
The versions of the components used in this post:
- OpenShift/Kubernetes cluster: 4.10.10
It is not a goal of this document to cover the installation of the AWS Load Balancer Operator in depth; the references below walk through it.
{% details References to deploy the Application load Balancer Operator %}
- Application Load Balancer Operator
- ALB Operator: Build using Local Environment
- ALB Operator: Tutorial to use ALB
{% enddetails %}
{% details Quick steps to deploy the ALB Operator %}
Install the Application Load Balancer Operator
We will use the Application Load Balancer Operator to install the ALB Controller, so we can create Ingresses backed by the AWS Application Load Balancer. The installation will be performed from source; you can read about more options on the project page.
Note: if you don't want to build from source, or have already deployed the ALB Operator, you can jump to the application deployment section.
- Setup Local Development: https://github.com/openshift/aws-load-balancer-operator#local-development
- Follow the tutorial to install: https://github.com/openshift/aws-load-balancer-operator/blob/main/docs/tutorial.md
Quick steps after building the operand and operator container images and pushing them to your registry:
# Create the project to place the operator
oc new-project aws-load-balancer-operator
# Make sure the AWS Access Keys are exported as environment variables
cat << EOF > credentials
[default]
aws_access_key_id=${AWS_ACCESS_KEY_ID}
aws_secret_access_key=${AWS_SECRET_ACCESS_KEY}
EOF
# Create the secret used to deploy the ALB
# NOTE: you should not use your credentials in
# shared clusters, please take a look into [CCO](https://docs.openshift.com/container-platform/4.10/installing/installing_aws/manually-creating-iam.html)
# to create the `CredentialsRequest` instead.
oc create secret generic aws-load-balancer-operator -n aws-load-balancer-operator \
--from-file=credentials=credentials
# Export the image name to be used
export IMG=quay.io/mrbraga/aws-load-balancer-operator:latest
# Build operator (...)
make deploy
oc get all -n aws-load-balancer-operator
- Create the ALB Controller
cat <<EOF | envsubst | oc create -f -
apiVersion: networking.olm.openshift.io/v1alpha1
kind: AWSLoadBalancerController
metadata:
  name: cluster
spec:
  subnetTagging: Manual
  ingressClass: cloud
  config:
    replicas: 2
  enabledAddons:
  - AWSShield
  - AWSWAFv2
EOF
Done! Make sure the `AWSLoadBalancerController` was created correctly.
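For example, you can inspect the custom resource directly (a minimal check, assuming the CR is named `cluster` as created above):
oc get awsloadbalancercontroller cluster -o yaml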
You should be able to see the `subnetIds` discovered by the operator using the cluster tags, something like this:
status:
  (...)
  ingressClass: cloud
  subnets:
    internal:
    - subnet-01413cd7efad9178b
    - subnet-022f16980e77cc293
    - subnet-097b0dc75fc4fdc31
    - subnet-0be66875994175dee
    - subnet-0c0294c9de6751616
    subnetTagging: Manual
    tagged:
    - subnet-002c105e0bb598472
    - subnet-026977c5b3fb45435
    - subnet-0479f5a0434d84f85
    - subnet-05a634b3b37197f53
    - subnet-0e3bbcb042149a250
{% enddetails %}
Deploy the Application
Now it's time for action! In this section we will deploy a sample application that returns to the user both the public `clientIp` (taken from the HTTP request headers) and the `serverIp` (discovered when the app is initialized).
The app is called `geo-app`; feel free to change it by setting these environment variables:
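The exact values depend on your environment. Here is a minimal sketch of the variables used throughout the rest of this post; the zone names are examples for the NYC and Miami Local Zones and may differ in your account:
# Example values; adjust to your cluster and the Local Zones you have enabled
export CLUSTER_ID="<infra-id-of-your-cluster>"   # e.g. oc get infrastructure cluster -o jsonpath='{.status.infrastructureName}'
export REGION="us-east-1"
export APP_NS="geo-app"
export APP_BASE_NAME="geo-app"
# Local Zone names and their zone groups (zone group = zone name without the last letter)
export AZ_NAME_NYC="us-east-1-nyc-1a"
export AZ_GROUP_NYC="${AZ_NAME_NYC::-1}"
export AZ_NAME_MIA="us-east-1-mia-1a"
export AZ_GROUP_MIA="${AZ_NAME_MIA::-1}"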
Create the App Namespace:
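A minimal way to do it, using the `${APP_NS}` variable defined above:
oc create namespace ${APP_NS}
# or, the more OpenShift-idiomatic way:
# oc new-project ${APP_NS}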
Create the deployment function:
create_deployment() {
  LOCATION=$1; shift
  AZ_NAME=$1; shift
  APP_NAME="${APP_BASE_NAME}-${LOCATION}"
  cat <<EOF | envsubst | oc create -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ${APP_NAME}
  namespace: ${APP_NS}
spec:
  selector:
    matchLabels:
      app: ${APP_NAME}
  replicas: 1
  template:
    metadata:
      labels:
        app: ${APP_NAME}
        zone_group: ${AZ_NAME::-1}
    spec:
      nodeSelector:
        zone_group: ${AZ_NAME::-1}
      tolerations:
      - key: "node-role.kubernetes.io/edge"
        operator: "Equal"
        value: ""
        effect: "NoSchedule"
      containers:
      - image: quay.io/mrbraga/go-geo-app:latest
        imagePullPolicy: Always
        name: ${APP_NAME}
        ports:
        - containerPort: 8000
---
apiVersion: v1
kind: Service
metadata:
  name: ${APP_NAME}
  namespace: ${APP_NS}
  labels:
    zone_group: ${AZ_NAME::-1}
spec:
  ports:
  - port: 80
    targetPort: 8000
    protocol: TCP
  type: NodePort
  selector:
    app: ${APP_NAME}
EOF
}
Create the deployment for each:
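For example, calling the function once per Local Zone (the zone names follow the example variables defined earlier and are assumptions):
create_deployment "nyc" "${AZ_NAME_NYC}"
create_deployment "mia" "${AZ_NAME_MIA}"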
Make sure the application is running in each location:
- New York Local Zone resources
oc get pods -n ${APP_NS} -l zone_group=${AZ_GROUP_NYC} -o wide
oc get nodes -l topology.kubernetes.io/zone=${AZ_NAME_NYC}
- Miami Local Zone resources
oc get pods -n ${APP_NS} -l zone_group=${AZ_GROUP_MIA} -o wide
oc get nodes -l topology.kubernetes.io/zone=${AZ_NAME_MIA}
Create the ingress function to set up the edge locations:
create_ingress() {
  LOCATION=$1; shift
  SUBNET=$1; shift
  APP_NAME=${APP_BASE_NAME}-${LOCATION}
  cat <<EOF | envsubst | oc create -f -
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-lz-${LOCATION}
  namespace: ${APP_NS}
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: instance
    alb.ingress.kubernetes.io/subnets: ${SUBNET}
    alb.ingress.kubernetes.io/healthcheck-path: /healthz
  labels:
    location: ${LOCATION}
spec:
  ingressClassName: cloud
  rules:
  - http:
      paths:
      - path: /
        pathType: Exact
        backend:
          service:
            name: ${APP_NAME}
            port:
              number: 80
EOF
}
Create the ingress instances for each edge location:
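For example, looking up the public subnet created in each Local Zone and calling the function for it (the subnet Name tag pattern below is an assumption, mirroring the parent-region lookup used later in this post; adjust it to your tagging):
SUBNET_NYC=$(aws ec2 describe-subnets \
  --filters "Name=tag:Name,Values=${CLUSTER_ID}-public-${AZ_NAME_NYC}" \
  --query 'Subnets[].SubnetId' --output text)
SUBNET_MIA=$(aws ec2 describe-subnets \
  --filters "Name=tag:Name,Values=${CLUSTER_ID}-public-${AZ_NAME_MIA}" \
  --query 'Subnets[].SubnetId' --output text)
create_ingress "nyc" "${SUBNET_NYC}"
create_ingress "mia" "${SUBNET_MIA}"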
Finally, let's create an instance of the same application in the default zones within the parent region (`${REGION}`). Let's do it in a single script:
SUBNETS=(`aws ec2 describe-subnets \
--filters "Name=tag:Name,Values=${CLUSTER_ID}-public-${REGION}a" \
--query 'Subnets[].SubnetId' \
--output text`)
SUBNETS+=(`aws ec2 describe-subnets \
--filters "Name=tag:Name,Values=${CLUSTER_ID}-public-${REGION}b" \
--query 'Subnets[].SubnetId' \
--output text`)
SUBNETS+=(`aws ec2 describe-subnets \
--filters "Name=tag:Name,Values=${CLUSTER_ID}-public-${REGION}c" \
--query 'Subnets[].SubnetId' \
--output text`)
SUBNET=$(echo ${SUBNETS[@]} |tr ' ' ',')
APP_NAME="${APP_BASE_NAME}-main"
ZONE_GROUP="${REGION}"
cat <<EOF | envsubst | oc create -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ${APP_NAME}
  namespace: ${APP_NS}
spec:
  selector:
    matchLabels:
      app: ${APP_NAME}
  replicas: 1
  template:
    metadata:
      labels:
        app: ${APP_NAME}
        zone_group: ${ZONE_GROUP}
    spec:
      containers:
      - image: quay.io/mrbraga/go-geo-app:latest
        imagePullPolicy: Always
        name: ${APP_NAME}
        ports:
        - containerPort: 8000
---
apiVersion: v1
kind: Service
metadata:
  name: ${APP_NAME}
  namespace: ${APP_NS}
  labels:
    zone_group: ${ZONE_GROUP}
spec:
  ports:
  - port: 80
    targetPort: 8000
    protocol: TCP
  type: NodePort
  selector:
    app: ${APP_NAME}
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-main
  namespace: ${APP_NS}
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: instance
    alb.ingress.kubernetes.io/subnets: ${SUBNET}
    alb.ingress.kubernetes.io/healthcheck-path: /healthz
  labels:
    location: ${ZONE_GROUP}
spec:
  ingressClassName: cloud
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: ${APP_NAME}
            port:
              number: 80
EOF
Wait for the load balancers to be provisioned and the `ADDRESS` to be available:
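For example, watching the Ingress objects until the ALB hostname shows up in the ADDRESS column:
oc get ingress -n ${APP_NS} -w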
Get the Ingress URL for each location (note there is no trailing space inside the jsonpath expression):
APP_URL_MAIN=$(oc get ingress \
  -l location=${REGION} -n ${APP_NS} \
  -o jsonpath='{.items[0].status.loadBalancer.ingress[0].hostname}')
APP_URL_NYC=$(oc get ingress \
  -l location=nyc -n ${APP_NS} \
  -o jsonpath='{.items[0].status.loadBalancer.ingress[0].hostname}')
APP_URL_MIA=$(oc get ingress \
  -l location=mia -n ${APP_NS} \
  -o jsonpath='{.items[0].status.loadBalancer.ingress[0].hostname}')
Make sure you can reach all three deployments:
$ curl -s http://${APP_URL_MAIN}/ |jq .serverInfo.address
"3.233.82.41"
$ curl -s http://${APP_URL_MIA}/ |jq .serverInfo.address
"64.187.128.146"
$ curl -s http://${APP_URL_NYC}/ |jq .serverInfo.address
"15.181.162.164"
Test and Benchmark
Now it's time to make some measurements with `curl`.
We've run tests from 3 different sources to measure the total time to 3 different targets/zone groups (NYC, MIA, and the parent region):
- Client#1's location: Florianópolis/Brazil
- Client#2's location: Digital Ocean/NYC3 (New York)
- Client#3's location: Digital Ocean/SFN (San Francisco)
Setup curl
Create the file `curl-format-all.txt`:
cat <<EOF> curl-format-all.txt
time_namelookup: %{time_namelookup} Sec\n
time_connect: %{time_connect} Sec\n
time_appconnect: %{time_appconnect} Sec\n
time_pretransfer: %{time_pretransfer} Sec\n
time_redirect: %{time_redirect} Sec\n
time_starttransfer: %{time_starttransfer} Sec\n
----------\n
time_total: %{time_total} Sec\n
EOF
Create the file `curl-format-table.txt` to return the attributes in a single-line format suitable for tabulation:
cat <<EOF> curl-format-table.txt
%{time_namelookup} %{time_connect} %{time_starttransfer} %{time_total}
EOF
Create the functions used to test:
curl_all() {
  curl -w "@curl-format-all.txt" -o /dev/null -s ${1}
}
curl_table() {
  curl -w "@curl-format-table.txt" -o /dev/null -s ${1}
}
Also create the function to collect data points; it will send 10 requests, one per second:
curl_batch() {
  echo -e " \tTime\t\t\t DNS\t Connect\t TTFB\t Total\t "
  for x in $(seq 1 10); do
    echo -ne "$(date)\t $(curl_table "${1}")\n"
    sleep 1
  done
}
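For example, to collect a batch of data points against the NYC endpoint:
curl_batch "http://${APP_URL_NYC}"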
Testing from (`Client#1`) / Brazil
Client's information:
- IP address geolocation:
$ curl -s ${APP_URL_NYC} |jq .clientInfo.geoIP
{
"as": "AS18881 TELEFÔNICA BRASIL S.A",
"city": "Florianópolis",
"country": "Brazil",
"countryCode": "BR",
"isp": "TELEFÔNICA BRASIL S.A",
"lat": -27.6147,
"lon": -48.4976,
"org": "Global Village Telecom",
"query": "191.x.y.z",
"region": "SC",
"regionName": "Santa Catarina",
"status": "success",
"timezone": "America/Sao_Paulo",
"zip": "88000"
}
The server's location is not precise: the AWS public IP is geolocated to the parent region, not to the actual Local Zone. So the reported server location is the same for all deployments (the main region's zones and the Local Zones), which affects the accuracy of the distance calculation below.
- Distance:
$ curl -s ${APP_URL_NYC} |jq .distance
{
"kilometers": 7999.167714502702,
"miles": 4970.452379666934,
"nauticalMiles": 4316.340846502765
}
Summary of latency between the Brazilian client and the servers:
$ curl_all "http://${APP_URL_MAIN}"
time_namelookup: 0.001797 Sec
time_connect: 0.166161 Sec
time_appconnect: 0.000000 Sec
time_pretransfer: 0.166185 Sec
time_redirect: 0.000000 Sec
time_starttransfer: 0.335583 Sec
----------
time_total: 0.335633 Sec
$ curl_all "http://${APP_URL_NYC}"
time_namelookup: 0.001948 Sec
time_connect: 0.179021 Sec
time_appconnect: 0.000000 Sec
time_pretransfer: 0.179048 Sec
time_redirect: 0.000000 Sec
time_starttransfer: 0.361730 Sec
----------
time_total: 0.361774 Sec
$ curl_all "http://${APP_URL_MIA}"
time_namelookup: 0.001920 Sec
time_connect: 0.132105 Sec
time_appconnect: 0.000000 Sec
time_pretransfer: 0.132164 Sec
time_redirect: 0.000000 Sec
time_starttransfer: 0.287794 Sec
----------
time_total: 0.287832 Sec
Testing from (`Client#2`) / NYC
Client's information:
- IP address geolocation:
$ curl -s ${APP_URL_NYC} |jq .clientInfo.geoIP
{
"as": "AS14061 DigitalOcean, LLC",
"city": "Clifton",
"country": "United States",
"countryCode": "US",
"isp": "DigitalOcean, LLC",
"lat": 40.8364,
"lon": -74.1403,
"org": "Digital Ocean",
"query": "138.197.77.73",
"region": "NJ",
"regionName": "New Jersey",
"status": "success",
"timezone": "America/New_York",
"zip": "07014"
}
- Distance:
$ curl -s ${APP_URL_NYC} |jq .distance
{
"kilometers": 348.0206044278185,
"miles": 216.24997789647117,
"nauticalMiles": 187.79148080529555
}
Summary of latency between the New York client and the servers:
$ curl_all "http://${APP_URL_MAIN}"
time_namelookup: 0.001305 Sec
time_connect: 0.009792 Sec
time_appconnect: 0.000000 Sec
time_pretransfer: 0.009842 Sec
time_redirect: 0.000000 Sec
time_starttransfer: 0.022289 Sec
----------
time_total: 0.022362 Sec
$ curl_all "http://${APP_URL_NYC}"
time_namelookup: 0.001284 Sec
time_connect: 0.003788 Sec
time_appconnect: 0.000000 Sec
time_pretransfer: 0.003832 Sec
time_redirect: 0.000000 Sec
time_starttransfer: 0.015924 Sec
----------
time_total: 0.015983 Sec
$ curl_all "http://${APP_URL_MIA}"
time_namelookup: 0.001417 Sec
time_connect: 0.032383 Sec
time_appconnect: 0.000000 Sec
time_pretransfer: 0.032434 Sec
time_redirect: 0.000000 Sec
time_starttransfer: 0.089609 Sec
----------
time_total: 0.089729 Sec
Now we can see the advantage of operating on the edge when delivering applications to clients close to the server. Some insights from the values above:
- The time to connect to the NYC zone was 3x faster than to the parent region, and 10x faster than to the location far from the user
- The TTFB (`time_starttransfer`) was also 3 times faster than to the parent region
- The total time to deliver close to the user was about 30% faster than from the parent region
- It is not fully reflected in the TTFB breakdown, but the server side could still be improved, as the backend does some processing when calculating the GeoIP
Testing from (`Client#3`) / California
Client's information:
- IP address geolocation:
$ curl -s ${APP_URL_NYC} |jq .clientInfo.geoIP
{
"as": "AS14061 DigitalOcean, LLC",
"city": "Santa Clara",
"country": "United States",
"countryCode": "US",
"isp": "DigitalOcean, LLC",
"lat": 37.3931,
"lon": -121.962,
"org": "DigitalOcean, LLC",
"query": "143.244.176.204",
"region": "CA",
"regionName": "California",
"status": "success",
"timezone": "America/Los_Angeles",
"zip": "95054"
}
- Distance:
$ curl -s ${APP_URL_NYC} |jq .distance
{
"kilometers": 3850.508486723844,
"miles": 2392.5950491155672,
"nauticalMiles": 2077.7295406519584
}
Summary of latency between the California client and the servers:
$ curl_all "http://${APP_URL_MAIN}"
time_namelookup: 0.001552 Sec
time_connect: 0.072307 Sec
time_appconnect: 0.000000 Sec
time_pretransfer: 0.072370 Sec
time_redirect: 0.000000 Sec
time_starttransfer: 0.146625 Sec
----------
time_total: 0.146696 Sec
$ curl_all "http://${APP_URL_NYC}"
time_namelookup: 0.003108 Sec
time_connect: 0.073105 Sec
time_appconnect: 0.000000 Sec
time_pretransfer: 0.073180 Sec
time_redirect: 0.000000 Sec
time_starttransfer: 0.152823 Sec
----------
time_total: 0.152941 Sec
$ curl_all "http://${APP_URL_MIA}"
time_namelookup: 0.001725 Sec
time_connect: 0.070220 Sec
time_appconnect: 0.000000 Sec
time_pretransfer: 0.070274 Sec
time_redirect: 0.000000 Sec
time_starttransfer: 0.164575 Sec
----------
time_total: 0.164649 Sec
Benchmark Review
Let's move to the end of this benchmark by collecting more data points to normalize the results from the clients and servers described above.
The script below tests each server and should be executed on each client:
run_batch() {
  URL="${1}"; shift
  LOC_SHORT="${1}"; shift
  LOCATION=$(curl -s ${URL} | jq -r ".clientInfo.geoIP | ( .countryCode + \"-\" + .region )")
  FILE_OUT="curl-batch_${LOCATION}-${LOC_SHORT}.txt"
  # append the header and the batch output to the per-location file
  echo -e "\n ${URL}: " | tee -a ${FILE_OUT}
  curl_batch "${URL}" | tee -a ${FILE_OUT}
}
run_batch "${APP_URL_MAIN}" "main"
run_batch "${APP_URL_NYC}" "nyc"
run_batch "${APP_URL_MIA}" "mia"
You can find the raw files here.
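For reference, here is a rough sketch of how averages like the ones below can be computed from a raw batch file (the file name and the column layout are assumptions based on the output produced by run_batch above; time_total is the last column, in seconds):
# Average time_total (last column, seconds) of a raw batch file, printed in ms
awk '$NF ~ /^[0-9.]+$/ {sum+=$NF; n++} END {if (n) printf "%.4f ms\n", (sum/n)*1000}' curl-batch_BR-SC-main.txt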
Average in milliseconds:
Client / Server | Main | NYC | MIA |
---|---|---|---|
BR-SC | 342.6579 | 352.2908 | 299.3756 |
US-NY | 22.2609 | 16.0002 | 92.5647 |
US-CA | 149.5462 | 152.1854 | 164.0881 |
Percentage difference compared to the parent region (negative is slower):
Client / Server | Main | NYC | MIA |
---|---|---|---|
BR-SC | 0.00% | -2.81% | 12.63% |
US-NY | 0.00% | 28.12% | -315.82% |
US-CA | 0.00% | -1.76% | -9.72% |
Difference in `ms` from the parent region (negative is slower):
Client / Server | Main | NYC | MIA |
---|---|---|---|
BR-SC | 0 | -9.6329 | 43.2823 |
US-NY | 0 | 6.2607 | -70.3038 |
US-CA | 0 | -2.6392 | -14.5419 |
Conclusion
One of the biggest challenges in delivering solutions is improving application performance. That involves many layers, one of which is the infrastructure. Having an option to deliver low latency to end users with little code effort is a big advantage in time to market.
There are points to improve, such as the infrastructure limitations and the pricing. At the same time, if demand grows, more infrastructure can be provisioned and more options, services, and competitors may become available.
As we saw in the first part of this series, you can extend an existing OpenShift cluster deployed with IPI to AWS Local Zones without any issues, then use the Application Load Balancer Operator to deliver applications to the edge locations in big cities.
Looking at the benchmark results, the improvement over the parent region is clear when the client is close to the Local Zone; otherwise, there is no benefit in delivering from the edge, as its resources are more expensive than in the parent region.
Either way, you can see how easy it is now to build modern Kubernetes applications with easy access to edge networks.
Next topics to review:
- Use the Route 53 geolocation routing policy to deliver a single endpoint to users. Checking whether its precision can distinguish between the Local Zone and the parent region is not part of this research.
- You can take advantage of the ephemeral storage available on most instances in Local Zones, so if you are processing files locally, the disk I/O will be much faster and the solution cheaper.
- Check AWS Wavelength, one more opportunity to deliver low-latency applications directly from RAN (Radio Access Network) / 5G devices.
If you would like to explore any topic described here in more depth, feel free to leave a comment!
Thanks for reaching the end of this research with me! :)
References