Getting Started

In this guide, we'll walk you through how to install Droid's agent into your Kubernetes cluster. Then we'll deploy a sample application to show off what it can do.

This guide uses Linkerd (with the viz extension), Flagger, and the loadtester test application.

Step 1: Install Pre-requisites

Please make sure you've got the pre-requisites. For this quickstart tutorial, install:
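As a quick sanity check, the commands used later in this guide rely on a few CLI tools being on your PATH. The sketch below is a minimal shell helper (the function name missing_tools is illustrative, not part of Droid) that prints any tool it cannot find:

```shell
# Print the name of every given CLI tool that is not on PATH.
# Returns non-zero if at least one tool is missing.
missing_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      rc=1
    fi
  done
  return "$rc"
}
```

For example, `missing_tools kubectl helm git linkerd` prints nothing when all four tools invoked in this guide are installed.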

Step 2: Bootstrap

Install the load testing service to generate traffic during the canary analysis:

kubectl apply -k

If you want to install a demo test app, create a deployment and a horizontal pod autoscaler:

git clone && cd demos/droid

helm upgrade -i test-app test-app -n test

Step 3: Pair cluster with HybridK8s Droid

Log in to the HybridK8s Console. On the "Clusters" page, click New Cluster and fill in the following:

  • Add your cluster name and environment.
  • Choose Linkerd as the mesh type.
  • Prometheus metric store URL (optional): if you already run a Prometheus metric store in your cluster, add its URL; otherwise leave this empty.
  • Choose LoadBalancer as the service type.
  • Click Create

You should now see the cluster details, including the cluster key (unique to each cluster for security reasons) and the other details you added.

Follow the commands on the cluster detail page to install an agent in the cluster:

kubectl apply -f
helm repo add hybridk8s && helm repo update && kubectl create ns agent

Please ensure you use the right Cluster key.

helm upgrade -i hybridk8s-agent -n agent hybridk8s/agent --set config.AGENT_AGENTINFO_APIKEY=<CLUSTER_KEY>

🎉 Congrats! Milestone achieved! 🎯

Step 4: Applying Canary

Create a canary custom resource for the test-app deployment.

Here's a template canary.yaml you can add (ideally in the Helm chart directory). Just make sure to add your Cluster API Key to canary.yaml before applying it:

apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: test-app
  namespace: test
spec:
  # deployment reference
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-app
  # the maximum time in seconds for the canary deployment
  # to make progress before it is rolled back (default 600s)
  progressDeadlineSeconds: 800
  service:
    # ClusterIP port number
    port: 80
    # container port number or name (optional)
    targetPort: 8080
  analysis:
    # schedule interval (default 60s)
    interval: 60s
    # max number of failed metric checks before rollback
    threshold: 1
    # max traffic percentage routed to canary
    # percentage (0-100)
    maxWeight: 50
    # canary increment step
    # percentage (0-100)
    stepWeight: 5
    # webhooks: load generation and Droid verification
    webhooks:
    - name: load-test
      type: rollout
      url: http://flagger-loadtester.test/
      metadata:
        cmd: "hey -z 60m -q 100 -c 2 http://test-app.demo/test"
    - name: verify
      type: rollout
      # Flagger requires a url for every webhook; the placeholder below
      # stands in for your Droid verification endpoint
      url: "<DROID_WEBHOOK_URL>"
      timeout: 600s
      metadata:
        api_key: "<CLUSTER_KEY>"
        app: "demo-app-1"
        primary: "test-app-primary"
        canary: "test-app"
        container: "test-app"
        duration: "60"

Now apply the canary to the cluster.

kubectl apply -f ./canary.yaml
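Because the template ships with a <CLUSTER_KEY> placeholder, it can be worth checking that the file has been edited before running the apply command above. A minimal sketch, assuming the placeholder text is exactly <CLUSTER_KEY> as in the template (key_is_set is a hypothetical helper name):

```shell
# Succeeds (exit 0) only if the file exists and no longer contains
# the literal <CLUSTER_KEY> placeholder.
key_is_set() {
  [ -f "$1" ] && ! grep -q '<CLUSTER_KEY>' "$1"
}
```

One way to use it is `key_is_set ./canary.yaml && kubectl apply -f ./canary.yaml`, which skips the apply while the placeholder is still present.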

Go grab a cup of coffee ☕️ ... it'll take a few minutes to brew the magic! ✨

Step 5: Let's Try Making a Faulty Deployment

Check whether the test-app canary and primary endpoints are fully initialized 🏁. If the canary is still being initialized, take a sip ☕️ and wait a minute! ⏰

kubectl describe canary -n test test-app
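Instead of re-running describe by hand, you can poll the canary's phase. This is a sketch that assumes the canary is managed by Flagger, which records the current phase (for example Initializing or Initialized) in status.phase; wait_for_canary_phase is a hypothetical helper, not a Droid or Flagger command:

```shell
# Poll the canary's status.phase until it equals the wanted phase.
# Args: namespace, canary name, wanted phase,
#       max attempts (default 30), seconds between attempts (default 10).
wait_for_canary_phase() {
  ns=$1; name=$2; want=$3; tries=${4:-30}; pause=${5:-10}
  i=0
  phase=""
  while [ "$i" -lt "$tries" ]; do
    # Read only the phase field; treat a failed lookup as "no phase yet".
    phase=$(kubectl get canary "$name" -n "$ns" \
      -o jsonpath='{.status.phase}' 2>/dev/null) || phase=""
    if [ "$phase" = "$want" ]; then
      echo "canary $name is $phase"
      return 0
    fi
    i=$((i + 1))
    sleep "$pause"
  done
  echo "timed out; last phase: ${phase:-unknown}" >&2
  return 1
}
```

With the defaults, `wait_for_canary_phase test test-app Initialized` checks every 10 seconds, up to 30 times, and returns non-zero if the phase is never reached.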

Once the canary is successfully initialised 🏁, let's change the Docker image tag of test-app to faulty. This simulates an error being introduced in a deployment.

helm upgrade -i test-app test-app -n test --set image.tag=faulty
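How long this takes depends on the analysis settings in canary.yaml. Flagger's documentation gives the minimum time for a successful promotion as interval * (maxWeight / stepWeight), and the time to roll back a failing canary as interval * threshold. The sketch below plugs in the values from the template above (the function names are illustrative):

```shell
# Minimum seconds for a successful canary run:
#   interval * (maxWeight / stepWeight)
min_promotion_seconds() {
  echo $(( $1 * ($2 / $3) ))
}

# Seconds until rollback while checks keep failing:
#   interval * threshold
rollback_seconds() {
  echo $(( $1 * $2 ))
}

min_promotion_seconds 60 50 5   # interval=60s, maxWeight=50, stepWeight=5 -> prints 600
rollback_seconds 60 1           # interval=60s, threshold=1 -> prints 60
```

So with this template a healthy release needs at least 10 minutes of analysis, while a failing one is rolled back after roughly one failed check interval.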

โ˜•๏ธ ... Take some sips! It'll take a few minutes to realise the magic! โœจ

You can see the magic happening via CLI or Linkerd dashboard.

CLI fans, use:

kubectl describe canary -n test test-app

Visualisation admirers, use:

linkerd viz dashboard

We can see traffic splitting 🚦, response rates and other metrics.

After a few minutes the canary will fail 🛑 and automatically roll back 🔄, because Droid compares the primary's metrics and logs with the canary's and finds that the canary does not look healthy. You can see in detail why the deployment failed on the platform.

In case of metric failures:

In case of log failures:

Happy deploying! ✨☕️
