Developing E2E tests

E2E tests are meant to verify the proper functioning of a Cluster API management cluster in an environment that resembles a real production environment.

The following guidelines should be followed when developing E2E tests:

The Cluster API test framework provides a set of helper methods for getting your test in place quickly; the test E2E package provides examples of how this can be achieved, as well as reusable test specs for the most common Cluster API use cases.

Prerequisites

Each E2E test requires a set of artifacts to be available:

  • Binaries & docker images for Kubernetes, CNI, CRI & CSI
  • Manifests & docker images for the Cluster API core components
  • Manifests & docker images for the Cluster API infrastructure provider; in most cases also machine images are required (AMI, OVA etc.)
  • Credentials for the target infrastructure provider
  • Other support tools (e.g. kustomize, gsutil etc.)

The Cluster API test framework provides support for building and retrieving the manifest files for the Cluster API core components and for the Cluster API infrastructure provider (see Setup).

For the remaining tasks, you can find examples of how they can be implemented, e.g. in the CAPA E2E tests and the CAPG E2E tests.

Setup

To run E2E tests, you need a Kubernetes cluster with a complete set of Cluster API providers installed. Setting up these elements is usually implemented in a BeforeSuite function and consists of two steps:

  • Defining an E2E config file
  • Creating the management cluster and installing providers

Defining an E2E config file

The E2E config file provides a convenient and flexible way to define common tasks for setting up a management cluster.

Using the config file it is possible to:

  • Define the list of providers to be installed in the management cluster. Most notably, for each provider it is possible to define:
    • One or more versions of the provider's manifest (built from the sources, or pulled from a remote location).
    • A list of additional files to be added to the provider repository, to be used e.g. to provide cluster-templates.yaml files.
  • Define the list of variables to be used when doing clusterctl init or clusterctl config cluster.
  • Define a list of intervals to be used in the test specs for defining timeouts for the wait and Eventually methods.
  • Define the list of images to be loaded in the management cluster (this is specific to management clusters based on kind).

An example E2E config file can be found here.
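To make the shape of such a config more concrete, here is a minimal Go sketch of the information listed above. The E2EConfig and Provider types, their field names, the Validate method, and all sample values are hypothetical stand-ins invented for illustration; they are not the types defined by the Cluster API test framework. The sketch also assumes that each interval entry is a [timeout, polling] pair.

```go
package main

import (
	"errors"
	"fmt"
)

// Provider is a hypothetical description of one provider to install.
type Provider struct {
	Name     string
	Versions []string // one or more manifest versions (built locally or pulled)
	Files    []string // additional repository files, e.g. cluster templates
}

// E2EConfig is a hypothetical, simplified model of an E2E config file.
type E2EConfig struct {
	Providers []Provider
	Variables map[string]string   // values consumed by clusterctl
	Intervals map[string][]string // assumed form: key -> [timeout, polling]
	Images    []string            // images to pre-load (kind-based clusters only)
}

// Validate reports the first structural problem found in the config.
func (c E2EConfig) Validate() error {
	if len(c.Providers) == 0 {
		return errors.New("at least one provider is required")
	}
	for _, p := range c.Providers {
		if p.Name == "" {
			return errors.New("provider name must not be empty")
		}
		if len(p.Versions) == 0 {
			return fmt.Errorf("provider %q needs at least one version", p.Name)
		}
	}
	for key, iv := range c.Intervals {
		if len(iv) != 2 {
			return fmt.Errorf("interval %q must be [timeout, polling]", key)
		}
	}
	return nil
}

func main() {
	// All values below are made-up sample data.
	cfg := E2EConfig{
		Providers: []Provider{{
			Name:     "example-infra",
			Versions: []string{"v0.0.1"},
			Files:    []string{"cluster-template.yaml"},
		}},
		Variables: map[string]string{"KUBERNETES_VERSION": "v1.0.0"},
		Intervals: map[string][]string{"default/wait-cluster": {"3m", "10s"}},
		Images:    []string{"example.test/controller:dev"},
	}
	if err := cfg.Validate(); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("config is valid")
}
```

Validating the config before creating any cluster lets a suite fail fast on a misconfigured provider instead of partway through provisioning.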

Creating the management cluster and installing providers

In order to run Cluster API E2E tests, you need a Kubernetes cluster; the NewKindClusterProvider gives you a type that can be used to create a local kind cluster and pre-load images into it, but existing clusters can also be used if available.

Once you have a Kubernetes cluster, the InitManagementClusterAndWatchControllerLogs method provides a convenient way for installing providers.

This method:

  • Runs clusterctl init using the above local repository.
  • Waits for the provider controllers to be running.
  • Creates log watchers for all the providers.

Writing test specs

A typical test spec is a sequence of:

  • Creating a namespace to host all the test objects in isolation.
  • Creating objects in the management cluster and waiting for the corresponding infrastructure to be provisioned.
  • Executing operations, e.g. changing the Kubernetes version or running clusterctl move, and waiting for the action to complete.
  • Deleting objects in the management cluster and waiting for the corresponding infrastructure to be terminated.

Creating Namespaces

The CreateNamespaceAndWatchEvents method provides a convenient way to create a namespace and set up watches for capturing namespace events.

Creating objects

There are two possible approaches for creating objects in the management cluster:

  • Create object by object: create the Cluster object, then AwsCluster, Machines, AwsMachines etc.
  • Apply a cluster-templates.yaml file thus creating all the objects this file contains.

The first approach leverages the controller-runtime Client and gives you full control, but it comes with some drawbacks: it does not directly reflect real user workflows and, most importantly, the resulting tests are not as reusable with other infrastructure providers (see Writing portable E2E tests).

We recommend using the ClusterTemplate method and the Apply method for creating objects in the cluster. These methods mimic the recommended user workflows and are based on cluster-templates.yaml files that can be provided via the E2E config file, and are thus easily swappable when changing the target infrastructure provider.

After creating objects in the cluster, use the existing methods in the Cluster API test framework to discover which objects were created, so your code can adapt to different cluster-templates.yaml files.

Once you have object references, the framework includes methods for waiting for the corresponding infrastructure to be provisioned, e.g. WaitForClusterToProvision, WaitForKubeadmControlPlaneMachinesToExist.

Exec operations

You can use Cluster API test framework methods to modify Cluster API objects; as a last option, use the controller-runtime Client.

The Cluster API test framework also includes methods for executing clusterctl operations, e.g. the ClusterTemplate method and the ClusterctlMove method; in order to improve observability, each clusterctl operation creates a detailed log.

After using clusterctl operations, you can rely on the Get and on the Wait methods defined in the Cluster API test framework to check if the operation completed successfully.

Tear down

After a test completes/fails, it is required to:

  • Collect all the logs for the Cluster API controllers
  • Dump all the relevant Cluster API/Kubernetes objects
  • Cleanup all the infrastructure resources created during the test

Those tasks are usually implemented in the AfterSuite function, and again the Cluster API test framework provides useful methods for them.

Please note that although test specs are expected to delete objects in the management cluster and wait for the corresponding infrastructure to be terminated, a test spec can fail before starting object deletion, or object deletion itself can fail.

As a consequence, when scheduling/running a test suite, it is required to ensure all the generated resources are cleaned up. In Kubernetes, this is implemented by the boskos project.

Writing portable E2E tests

A portable E2E test is a test that can run with different infrastructure providers by simply changing the test configuration file.

The following recommendations help you write portable E2E tests:

  • Create a separate E2E config file for each target infrastructure provider, providing different sets of env variables and timeout intervals.
  • Use the InitManagementCluster method for setting up the management cluster.
  • Use the ClusterTemplate method and the Apply method for creating objects in the cluster using cluster-templates.yaml files instead of hard coding object creation.
  • Use the Get methods defined in the Cluster API test framework to check the objects being created, so your code can adapt to different cluster-templates.yaml files.
  • Never hard code the infrastructure provider name in your test spec. Instead, use the InfrastructureProvider method to get access to the name of the infrastructure provider defined in the E2E config file.
  • Never hard code wait intervals in your test spec. Instead use the GetIntervals method to get access to the intervals defined in the E2E config file.
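The last two recommendations can be illustrated with a short Go sketch in which the spec reads both the provider name and the wait intervals from the config. The Config type, the IntervalsFor method, the describe function, and the "<spec>/<key>" with "default/<key>" fallback convention are assumptions made for this example, not the framework's actual API; all values are made-up sample data.

```go
package main

import "fmt"

// Config is a hypothetical, simplified E2E config.
type Config struct {
	InfrastructureProvider string
	Intervals              map[string][]string // assumed form: [timeout, polling]
}

// IntervalsFor returns the intervals registered under "<spec>/<key>",
// falling back to "default/<key>" when no spec-specific entry exists.
func (c Config) IntervalsFor(spec, key string) ([]string, bool) {
	if iv, ok := c.Intervals[spec+"/"+key]; ok {
		return iv, true
	}
	iv, ok := c.Intervals["default/"+key]
	return iv, ok
}

// describe stands in for a test spec: nothing in it names a provider or a
// duration, so the same code works with any config.
func describe(c Config, spec string) string {
	iv, ok := c.IntervalsFor(spec, "wait-cluster")
	if !ok || len(iv) < 2 {
		return fmt.Sprintf("%s: no intervals configured", c.InfrastructureProvider)
	}
	return fmt.Sprintf("%s: wait up to %s, polling every %s",
		c.InfrastructureProvider, iv[0], iv[1])
}

func main() {
	fast := Config{
		InfrastructureProvider: "provider-a",
		Intervals:              map[string][]string{"default/wait-cluster": {"3m", "10s"}},
	}
	slow := Config{
		InfrastructureProvider: "provider-b",
		Intervals: map[string][]string{
			"default/wait-cluster":     {"10m", "10s"},
			"quick-start/wait-cluster": {"20m", "30s"},
		},
	}
	fmt.Println(describe(fast, "quick-start"))
	fmt.Println(describe(slow, "quick-start"))
}
```

Switching the target provider then means swapping the config value passed to the spec, with no change to the spec itself.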

Cluster API conformance tests

As of today, there is no well-defined suite of E2E tests that can be used as a baseline for Cluster API conformance.

However, creating such a suite could provide huge value for the long-term success of the project.

The test E2E package provides examples of how this can be achieved by implementing a set of reusable test specs for the most common Cluster API use cases.