Ansible Operator Tutorial

An in-depth walkthrough of building and running an Ansible-based operator.

NOTE: If your project was created with an operator-sdk version prior to v1.0.0, please migrate or consult the legacy docs.

Prerequisites

  • Go through the installation guide.
  • Make sure your user is authorized with cluster-admin permissions.
  • Have access to an image registry for the operator images (e.g. hub.docker.com, quay.io), and be logged in to it from your command line environment.
    • example.com is used as the registry and namespace in these examples. Replace it with another value if using a different registry or namespace.
    • Set up authentication and certificates if the registry is private or uses a custom CA.

Overview

We will create a sample project to show how an Ansible-based operator works. The sample will:

  • Create a Memcached Deployment if it doesn’t exist
  • Ensure that the Deployment size is the same as specified by the Memcached CR spec
  • Report the outcome of each reconciliation in the Memcached CR status

Create a new project

Use the CLI to create a new memcached-operator project:

mkdir memcached-operator
cd memcached-operator
operator-sdk init --plugins=ansible --domain example.com

Among the files generated by this command is a Kubebuilder PROJECT file. Subsequent operator-sdk commands (and help text) run from the project root read this file and are aware that the project type is Ansible.

Next, we will create a Memcached API.

operator-sdk create api --group cache --version v1alpha1 --kind Memcached --generate-role

The scaffolded operator has the following structure:

  • Memcached Custom Resource Definition, and a sample Memcached resource.
  • A “Manager” that reconciles the state of the cluster to the desired state
    • A reconciler, which is an Ansible Role or Playbook.
    • A watches.yaml file, which connects the Memcached resource to the memcached Ansible Role.

See scaffolded files reference and watches reference for more detailed information.
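
For orientation, the scaffolded watches.yaml for this project should look roughly like the sketch below. The exact contents are an assumption derived from the group, version, and kind passed to create api above, and may differ between operator-sdk versions.

```yaml
---
# watches.yaml (sketch): each entry maps one watched resource
# to the Ansible content that reconciles it.
- version: v1alpha1          # --version passed to 'create api'
  group: cache.example.com   # --group plus the --domain passed to 'init'
  kind: Memcached            # --kind passed to 'create api'
  role: memcached            # resolves to roles/memcached in the project
```

Optional per-entry settings are described in the watches reference.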

Modify the Manager

Now we need to provide the reconcile logic, in the form of an Ansible Role, which will run every time a Memcached resource is created, updated, or deleted.

Update roles/memcached/tasks/main.yml:

---
# Create the memcached Deployment if it is missing and keep it in line with the CR.
- name: start memcached
  kubernetes.core.k8s:
    definition:
      kind: Deployment
      apiVersion: apps/v1
      metadata:
        # ansible_operator_meta holds the name and namespace of the CR being reconciled.
        name: '{{ ansible_operator_meta.name }}-memcached'
        namespace: '{{ ansible_operator_meta.namespace }}'
      spec:
        # size is taken from the CR spec.
        replicas: "{{ size }}"
        selector:
          matchLabels:
            app: memcached
        template:
          metadata:
            labels:
              app: memcached
          spec:
            containers:
            - name: memcached
              command:
              - memcached
              - -m=64
              - -o
              - modern
              - -v
              image: "docker.io/memcached:1.4.36-alpine"
              ports:
                - containerPort: 11211

This memcached role will:

  • Ensure a memcached Deployment exists
  • Set the Deployment size

Note that the tasks in this Ansible role file are what actually define the behavior of the spec and status of the Memcached custom resource. Because Kubernetes allows arbitrary fields to be entered when creating resources, we don’t need to define specific fields in the CRD. We won’t do so in this tutorial, but it is recommended to also define these fields in the CRD, so that Kubernetes users can see which fields the custom resource uses.

It is also good practice to set default values for variables used in Ansible Roles, so edit roles/memcached/defaults/main.yml:

---
# defaults file for Memcached
size: 1
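
The recommendation to describe fields in the CRD can be sketched as follows. This is a hypothetical fragment of one version entry in the generated CRD manifest (assumed to live under config/crd/bases/); the description text and the minimum constraint are illustrative, not scaffolded output.

```yaml
# Fragment of spec.versions[] in the Memcached CRD (illustrative only)
- name: v1alpha1
  served: true
  storage: true
  schema:
    openAPIV3Schema:
      type: object
      properties:
        spec:
          type: object
          # Keep accepting fields that are not listed here.
          x-kubernetes-preserve-unknown-fields: true
          properties:
            size:
              type: integer
              minimum: 0
              description: Number of memcached replicas to run.
```

With a schema like this, the API server rejects a non-integer or negative size before the role ever runs.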

Finally, update the Memcached sample, config/samples/cache_v1alpha1_memcached.yaml:

apiVersion: cache.example.com/v1alpha1
kind: Memcached
metadata:
  name: memcached-sample
spec:
  size: 3

The key-value pairs in the Custom Resource spec are passed to Ansible as extra variables.

Note: The names of all variables in the spec field are converted to snake_case by the operator before running Ansible. For example, serviceAccount in the spec becomes service_account in Ansible. You can disable this case conversion by setting the snakeCaseParameters option to false in your watches.yaml. It is recommended that you perform some type validation in Ansible on the variables to ensure that your application is receiving the expected input.
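
A minimal sketch of such validation, assuming it is placed as the first task in roles/memcached/tasks/main.yml. The task name and failure message are illustrative; the check accepts any value that converts to a non-negative integer.

```yaml
# Hypothetical validation task, not part of the scaffold.
- name: validate size
  ansible.builtin.assert:
    that:
      - size is defined
      # int(-1) falls back to -1 when the value cannot be converted.
      - (size | int(-1)) >= 0
    fail_msg: "spec.size must be a non-negative integer"
```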

Configure the operator’s image registry

All that remains is to build and push the operator image to the desired image registry. Your Makefile composes image tags either from values written at project initialization or from the CLI. In particular, IMAGE_TAG_BASE lets you define a common image registry, namespace, and partial name for all your image tags. Update this to another registry and/or namespace if the current value is incorrect. Afterwards you can update the IMG variable definition like so:

-IMG ?= controller:latest
+IMG ?= $(IMAGE_TAG_BASE):$(VERSION)

Once done, you do not have to set IMG or any other image variable in the CLI. The following command will build and push an operator image tagged as example.com/memcached-operator:v0.0.1 to the configured registry:

make docker-build docker-push

Run the Operator

There are three ways to run the operator:

1. Run locally outside the cluster

Execute the following command, which installs your CRDs and runs the manager locally:

make install run

2. Run as a Deployment inside the cluster

By default, a new namespace named <project-name>-system (e.g. memcached-operator-system) is created and used for the deployment.

Operator authors commonly need to modify the manifests in config/rbac to give their Operator the permissions it needs to reconcile.
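
As a hedged illustration, a rule like the one below could be appended to the rules list of the operator's role in config/rbac. The choice of Services is an assumption made for this sketch; the role in this tutorial only manages a Deployment and does not need it.

```yaml
# Hypothetical extra rule: lets the operator manage Services in the core API group.
- apiGroups:
    - ""
  resources:
    - services
  verbs:
    - create
    - delete
    - get
    - list
    - patch
    - update
    - watch
```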

Run the following to customize the manifests and deploy the operator.

make deploy

The scaffolded Makefile uses kustomize to apply custom configurations and generate manifests from the config/ directory, which are piped to kubectl. Run the following command to see the manifests that were applied to the cluster.

kustomize build config/default
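
The namespace and name prefix seen in the rendered manifests come from the kustomization in config/default. A trimmed sketch of that file, assuming the default scaffold for a project named memcached-operator (the real file contains further entries that vary by operator-sdk version):

```yaml
# config/default/kustomization.yaml (trimmed sketch)
namespace: memcached-operator-system   # applied to every rendered resource
namePrefix: memcached-operator-        # prepended to every resource name
```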

Verify that the memcached-operator is up and running:

$ kubectl get deployment -n memcached-operator-system
NAME                                    READY   UP-TO-DATE   AVAILABLE   AGE
memcached-operator-controller-manager   1/1     1            1           8m

3. Deploy your Operator with OLM

First, install OLM:

operator-sdk olm install

Bundle your operator, then build and push the bundle image. The bundle target generates a bundle in the bundle directory containing manifests and metadata defining your operator. bundle-build and bundle-push build and push a bundle image defined by bundle.Dockerfile.

make bundle bundle-build bundle-push

Finally, run your bundle. If your bundle image is hosted in a registry that is private and/or has a custom CA, the corresponding authentication and certificate configuration must be complete before you run it.

operator-sdk run bundle example.com/memcached-operator-bundle:v0.0.1

Check out the docs for a deep dive into operator-sdk's OLM integration.

Create a Memcached CR

Make sure the sample Memcached CR manifest at config/samples/cache_v1alpha1_memcached.yaml defines the following spec:

apiVersion: cache.example.com/v1alpha1
kind: Memcached
metadata:
  name: memcached-sample
spec:
  size: 3

Create the CR:

kubectl apply -f config/samples/cache_v1alpha1_memcached.yaml

Ensure that the memcached operator creates the deployment for the sample CR with the correct size:

$ kubectl get deployment
NAME                                    READY   UP-TO-DATE   AVAILABLE   AGE
memcached-sample-memcached              3/3     3            3           1m

Check the pods and the CR status to confirm that the operator reports a successful reconciliation:

$ kubectl get pods
NAME                                         READY     STATUS    RESTARTS   AGE
memcached-sample-memcached-6fd7c98d8-7dqdr   1/1       Running   0          1m
memcached-sample-memcached-6fd7c98d8-g5k7v   1/1       Running   0          1m
memcached-sample-memcached-6fd7c98d8-m7vn7   1/1       Running   0          1m
$ kubectl get memcached/memcached-sample -o yaml
apiVersion: cache.example.com/v1alpha1
kind: Memcached
metadata:
  creationTimestamp: "2021-03-17T19:54:42Z"
  generation: 1
  managedFields:
  - apiVersion: cache.example.com/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        .: {}
        f:conditions: {}
    manager: ansible-operator
    operation: Update
    time: "2021-03-17T19:54:42Z"
  - apiVersion: cache.example.com/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        .: {}
        f:size: {}
    manager: kubectl
    operation: Update
    time: "2021-03-17T19:54:42Z"
  name: memcached-sample
  namespace: default
  resourceVersion: "1008"
  uid: 4b023125-132a-44e3-80de-20801c7a9268
spec:
  size: 3
status:
  conditions:
  - ansibleResult:
      changed: 0
      completion: 2021-03-17T19:54:54.890394
      failures: 0
      ok: 1
      skipped: 0
    lastTransitionTime: "2021-03-17T19:54:42Z"
    message: Awaiting next reconciliation
    reason: Successful
    status: "True"
    type: Running

Update the size

Change the spec.size field in the Memcached CR from 3 to 5. The following command patches the live resource directly:

kubectl patch memcached memcached-sample -p '{"spec":{"size": 5}}' --type=merge
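
As an alternative sketch, the same change can be made declaratively by editing config/samples/cache_v1alpha1_memcached.yaml and re-applying it with kubectl apply -f, as in the Create a Memcached CR step. Only the size value differs from the sample used earlier:

```yaml
apiVersion: cache.example.com/v1alpha1
kind: Memcached
metadata:
  name: memcached-sample
spec:
  size: 5   # previously 3
```

Keeping the change in the manifest means the file on disk continues to match the resource in the cluster.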

Confirm that the operator changes the deployment size:

$ kubectl get deployment
NAME                                    READY   UP-TO-DATE   AVAILABLE   AGE
memcached-sample-memcached              5/5     5            5           3m

Cleanup

Run the following to delete all deployed resources:

kubectl delete -f config/samples/cache_v1alpha1_memcached.yaml
make undeploy

Next Steps

We recommend reading through the Ansible development section for tips and tricks, including how to run the operator locally.

In this tutorial, the scaffolded watches.yaml was used as-is, but it supports additional optional features. See the watches reference.

For brevity, some of the scaffolded files were left out of this guide. See Scaffolding Reference.

This example built a namespace-scoped operator, but Ansible operators can also be used with cluster-wide scope.

OLM will manage creation of most if not all resources required to run your operator, using a bit of setup from other operator-sdk commands. Check out the OLM integration guide.

Last modified October 12, 2023: (docs): update broken links (#6599) (07cbb522)