In this guide, we’ll walk you through how to install Droid's agent into your Kubernetes cluster. Then we’ll deploy a sample application to show off what it can do.
This guide uses Linkerd (with the viz extension), Flagger, and Flagger's loadtester test application.
Step 1: Install Prerequisites
Before starting this quickstart tutorial, make sure you have the prerequisites installed: Linkerd with its viz extension, and Flagger.
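The exact install commands depend on your Linkerd and Flagger versions, so double-check the upstream docs; a typical sequence looks like:

```
# Install Linkerd and its viz extension (assumes the linkerd CLI is installed)
linkerd install --crds | kubectl apply -f -
linkerd install | kubectl apply -f -
linkerd viz install | kubectl apply -f -

# Install Flagger configured for Linkerd, pointing it at the viz Prometheus
helm repo add flagger https://flagger.app && helm repo update
helm upgrade -i flagger flagger/flagger \
  --namespace linkerd \
  --set meshProvider=linkerd \
  --set metricsServer=http://prometheus.linkerd-viz:9090
```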
Step 2: Bootstrap
Install the load testing service to generate traffic during the canary analysis:
```
kubectl apply -k https://github.com/fluxcd/flagger//kustomize/tester?ref=main
```
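To confirm the load tester came up before moving on, you can wait for its rollout (the kustomize base deploys a `flagger-loadtester` Deployment into the `test` namespace):

```
kubectl -n test rollout status deploy/flagger-loadtester --timeout=120s
```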
If you want to install a demo test app, create a deployment and a horizontal pod autoscaler:
```
git clone https://github.com/HybridK8s/demos && cd demos/droid
helm upgrade -i test-app test-app -n test
```
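To check that the chart created both objects (assuming the chart's defaults name them `test-app`), you can list them:

```
kubectl -n test get deploy,hpa test-app
```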
Step 3: Pair cluster with HybridK8s Droid
Log in to the HybridK8s Console. On the "Clusters" page, click New Cluster, which requires you to:
- Add your cluster name and environment.
- Choose the mesh type as Linkerd.
- Prometheus metric store URL (optional): if you already run a Prometheus metric store in your cluster, add its URL (for Linkerd viz's bundled Prometheus this is typically `http://prometheus.linkerd-viz:9090`); otherwise leave it empty.
- Choose the service type as LoadBalancer.
- Click Create
You should now see the cluster details, including the cluster key (unique per cluster for security reasons) and the other details you added.
Follow the commands on the cluster detail page to install the agent in the cluster:
```
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
helm repo add hybridk8s https://hybridk8s.github.io/agent-chart && helm repo update
kubectl create ns agent
```
Please ensure you use the right Cluster key.
```
helm upgrade -i hybridk8s-agent -n agent hybridk8s/agent --set config.AGENT_AGENTINFO_APIKEY=<CLUSTER_KEY>
```
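Before moving on, it's worth confirming the agent pod is healthy (the release name and namespace below match the command above):

```
kubectl -n agent get pods
helm status hybridk8s-agent -n agent
```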
🎉 Congrats! Milestone achieved! 🎯
Step 4: Applying Canary
Create a canary custom resource for the test-app deployment.
Here's a template `canary.yaml` you can add (ideally in the Helm chart directory). Just make sure to add your cluster API key in `canary.yaml` before applying:
```yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: test-app
  namespace: test
spec:
  # deployment reference
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-app
  # the maximum time in seconds for the canary deployment
  # to make progress before it is rolled back (default 600s)
  progressDeadlineSeconds: 800
  service:
    # ClusterIP port number
    port: 80
    # container port number or name (optional)
    targetPort: 8080
  analysis:
    # schedule interval (default 60s)
    interval: 60s
    # max number of failed metric checks before rollback
    threshold: 1
    # max traffic percentage routed to canary
    # percentage (0-100)
    maxWeight: 50
    # canary increment step
    # percentage (0-100)
    stepWeight: 5
    # Linkerd Prometheus checks
    webhooks:
      - name: load-test
        type: rollout
        url: http://flagger-loadtester.test/
        metadata:
          cmd: "hey -z 60m -q 100 -c 2 http://test-app.demo/test"
      - name: verify
        type: rollout
        url: https://api.hybridk8s.tech/api/flagger/verify
        timeout: 600s
        metadata:
          api_key: "<CLUSTER_KEY>"
          app: "demo-app-1"
          primary: "test-app-primary"
          canary: "test-app"
          container: "test-app"
          duration: "60"
```
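With these analysis settings, a fully healthy rollout is promoted in a predictable amount of time: traffic shifts toward the canary in `stepWeight` increments, one per `interval`, until `maxWeight` is reached. A quick sketch of that arithmetic:

```shell
# Rough promotion-time estimate for the canary.yaml above,
# assuming every metric check passes (no rollback).
step_weight=5   # stepWeight: traffic added per step (%)
max_weight=50   # maxWeight: max traffic routed to the canary (%)
interval=60     # interval: seconds between checks

steps=$(( max_weight / step_weight ))
total_seconds=$(( steps * interval ))
echo "${steps} steps, ~$(( total_seconds / 60 )) minutes until promotion"
# → 10 steps, ~10 minutes until promotion
```

Raising `threshold` above 1 would tolerate occasional flaky checks before rolling back; the template keeps it strict.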
Now apply the canary to the cluster.
```
kubectl apply -f ./canary.yaml
```
Go grab a cup of coffee ☕️ ... it'll take a few minutes to brew the magic! ✨
Step 5: Let's Try Making a Faulty Deployment
Check whether the test-app canary and primary endpoints are initialized completely 🏁 If the canary is still initializing, take a sip ☕️ and wait a minute! ⏰
```
kubectl describe canary -n test test-app
```
Once the canary is successfully initialized 🏁, let's change the Docker image tag to `faulty` in the test-app. Think of this as similar to an error being introduced in any deployment.
```
helm upgrade -i test-app test-app -n test --set image.tag=faulty
```
☕️ ... Take some sips! It'll take a few minutes for the magic to happen! ✨
You can see the magic happening via CLI or Linkerd dashboard.
CLI fans, use:
```
kubectl describe canary -n test test-app
```
Visualisation admirers, use:
```
linkerd viz dashboard
```
We can see traffic splitting 🚦, response rates and other metrics.
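Under the hood, Flagger drives Linkerd's traffic shifting through an SMI TrafficSplit resource. If you're curious, you can inspect the weights directly (the resource name is assumed here to match the target deployment):

```
kubectl -n test get trafficsplit test-app -o yaml
```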
After a few minutes the canary will fail 🛑 and automatically roll back 🔄, because Droid compared the primary's metrics and logs with the canary's and found the canary worse off. You can see in detail why the deployment failed on the platform.
In case of metric failures:
In case of log failures:
Happy deploying! ✨☕️