clusterctl init fails due to too large Managedclusters CRD on EKS #6080

@bjowczarek

/kind bug

What steps did you take and what happened:

$ ./clusterctl-darwin-arm64 version
clusterctl version: &version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.11", GitCommit:"01ece4223f61c2b5a08f5906bb7720e20775dba5", GitTreeState:"clean", BuildDate:"2025-08-19T15:38:43Z", GoVersion:"go1.23.12", Compiler:"gc", Platform:"darwin/arm64"}

$ ./clusterctl-darwin-arm64 init -i azure:v1.20.5 -v=5
No default config file available
Fetching providers
Potential override file searchFile="/Users/user/Library/Application Support/cluster-api/overrides/infrastructure-azure/v1.20.5/infrastructure-components.yaml" provider="infrastructure-azure" version="v1.20.5"
Fetching file="infrastructure-components.yaml" provider="azure" type="InfrastructureProvider" version="v1.20.5"
Potential override file searchFile="/Users/user/Library/Application Support/cluster-api/overrides/cluster-api/v1.9.11/metadata.yaml" provider="cluster-api" version="v1.9.11"
Fetching file="metadata.yaml" provider="cluster-api" type="CoreProvider" version="v1.9.11"
Potential override file searchFile="/Users/user/Library/Application Support/cluster-api/overrides/infrastructure-azure/v1.20.5/metadata.yaml" provider="infrastructure-azure" version="v1.20.5"
Fetching file="metadata.yaml" provider="azure" type="InfrastructureProvider" version="v1.20.5"
Creating Namespace="cert-manager-test"
Creating Issuer="test-selfsigned" Namespace="cert-manager-test"
Creating Certificate="selfsigned-cert" Namespace="cert-manager-test"
Deleting Namespace="cert-manager-test"
Deleting Issuer="test-selfsigned" Namespace="cert-manager-test"
Deleting Certificate="selfsigned-cert" Namespace="cert-manager-test"
Skipping installing cert-manager as it is already installed
Installing provider="infrastructure-azure" version="v1.20.5" targetNamespace="capz-system"
Creating objects provider="infrastructure-azure" version="v1.20.5" targetNamespace="capz-system"
Creating Namespace="capz-system"
Creating Issuer="capz-selfsigned-issuer" Namespace="capz-system"
Creating Issuer="azureserviceoperator-selfsigned-issuer" Namespace="capz-system"
Creating Certificate="capz-serving-cert" Namespace="capz-system"
Creating Certificate="azureserviceoperator-serving-cert" Namespace="capz-system"
Creating CustomResourceDefinition="azureasomanagedclusters.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azureasomanagedclustertemplates.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azureasomanagedcontrolplanes.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azureasomanagedcontrolplanetemplates.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azureasomanagedmachinepools.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azureasomanagedmachinepooltemplates.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azureclusteridentities.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azureclusters.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azureclustertemplates.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azuremachinepoolmachines.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azuremachinepools.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azuremachines.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azuremachinetemplates.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azuremanagedclusters.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azuremanagedclustertemplates.infrastructure.cluster.x-k8s.io"
Patching CustomResourceDefinition="azuremanagedcontrolplanes.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azuremanagedcontrolplanetemplates.infrastructure.cluster.x-k8s.io"
Patching CustomResourceDefinition="azuremanagedmachinepools.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azuremanagedmachinepooltemplates.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="bastionhosts.network.azure.com"
Creating CustomResourceDefinition="extensions.kubernetesconfiguration.azure.com"
Creating CustomResourceDefinition="fleetsmembers.containerservice.azure.com"
Patching CustomResourceDefinition="managedclusters.containerservice.azure.com"
Retrying with backoff cause="failed to create provider object apiextensions.k8s.io/v1, Kind=CustomResourceDefinition, /managedclusters.containerservice.azure.com: Post \"https://xxxxxxxx.gr7.us-east-2.eks.amazonaws.com/apis/apiextensions.k8s.io/v1/customresourcedefinitions?timeout=30s\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
Retrying with backoff cause="failed to get current provider object: failed to get API group resources: unable to retrieve the complete list of server APIs: apiextensions.k8s.io/v1: Get \"https://xxxxxxxx.gr7.us-east-2.eks.amazonaws.com/apis/apiextensions.k8s.io/v1?timeout=30s\": http2: client connection lost"
Patching CustomResourceDefinition="managedclusters.containerservice.azure.com"
Retrying with backoff cause="failed to create provider object apiextensions.k8s.io/v1, Kind=CustomResourceDefinition, /managedclusters.containerservice.azure.com: Post \"https://xxxxxxxx.gr7.us-east-2.eks.amazonaws.com/apis/apiextensions.k8s.io/v1/customresourcedefinitions?timeout=30s\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
Patching CustomResourceDefinition="managedclusters.containerservice.azure.com"
Retrying with backoff cause="failed to create provider object apiextensions.k8s.io/v1, Kind=CustomResourceDefinition, /managedclusters.containerservice.azure.com: Post \"https://xxxxxxxx.gr7.us-east-2.eks.amazonaws.com/apis/apiextensions.k8s.io/v1/customresourcedefinitions?timeout=30s\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
Retrying with backoff cause="failed to get current provider object: failed to get API group resources: unable to retrieve the complete list of server APIs: apiextensions.k8s.io/v1: Get \"https://xxxxxxxx.gr7.us-east-2.eks.amazonaws.com/apis/apiextensions.k8s.io/v1?timeout=30s\": http2: client connection lost"
Patching CustomResourceDefinition="managedclusters.containerservice.azure.com"
Retrying with backoff cause="failed to create provider object apiextensions.k8s.io/v1, Kind=CustomResourceDefinition, /managedclusters.containerservice.azure.com: Post \"https://xxxxxxxx.gr7.us-east-2.eks.amazonaws.com/apis/apiextensions.k8s.io/v1/customresourcedefinitions?timeout=30s\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
Patching CustomResourceDefinition="managedclusters.containerservice.azure.com"
Retrying with backoff cause="failed to create provider object apiextensions.k8s.io/v1, Kind=CustomResourceDefinition, /managedclusters.containerservice.azure.com: Post \"https://xxxxxxxx.gr7.us-east-2.eks.amazonaws.com/apis/apiextensions.k8s.io/v1/customresourcedefinitions?timeout=30s\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
Patching CustomResourceDefinition="managedclusters.containerservice.azure.com"
Retrying with backoff cause="failed to create provider object apiextensions.k8s.io/v1, Kind=CustomResourceDefinition, /managedclusters.containerservice.azure.com: Post \"https://xxxxxxxx.gr7.us-east-2.eks.amazonaws.com/apis/apiextensions.k8s.io/v1/customresourcedefinitions?timeout=30s\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
Retrying with backoff cause="failed to get current provider object: failed to get API group resources: unable to retrieve the complete list of server APIs: apiextensions.k8s.io/v1: Get \"https://xxxxxxxx.gr7.us-east-2.eks.amazonaws.com/apis/apiextensions.k8s.io/v1?timeout=30s\": http2: client connection lost"
Patching CustomResourceDefinition="managedclusters.containerservice.azure.com"
Error: action failed after 10 attempts: failed to create provider object apiextensions.k8s.io/v1, Kind=CustomResourceDefinition, /managedclusters.containerservice.azure.com: Post "https://xxxxxxxx.gr7.us-east-2.eks.amazonaws.com/apis/apiextensions.k8s.io/v1/customresourcedefinitions?timeout=30s": net/http: request canceled (Client.Timeout exceeded while awaiting headers)
sigs.k8s.io/cluster-api/cmd/clusterctl/client/cluster.retryWithExponentialBackoff
	sigs.k8s.io/cluster-api/cmd/clusterctl/client/cluster/client.go:238
sigs.k8s.io/cluster-api/cmd/clusterctl/client/cluster.(*providerComponents).Create
	sigs.k8s.io/cluster-api/cmd/clusterctl/client/cluster/components.go:89
sigs.k8s.io/cluster-api/cmd/clusterctl/client/cluster.installComponentsAndUpdateInventory
	sigs.k8s.io/cluster-api/cmd/clusterctl/client/cluster/installer.go:116
sigs.k8s.io/cluster-api/cmd/clusterctl/client/cluster.(*providerInstaller).Install
	sigs.k8s.io/cluster-api/cmd/clusterctl/client/cluster/installer.go:99
sigs.k8s.io/cluster-api/cmd/clusterctl/client.(*clusterctlClient).Init
	sigs.k8s.io/cluster-api/cmd/clusterctl/client/init.go:153
sigs.k8s.io/cluster-api/cmd/clusterctl/cmd.runInit
	sigs.k8s.io/cluster-api/cmd/clusterctl/cmd/init.go:149
sigs.k8s.io/cluster-api/cmd/clusterctl/cmd.init.func13
	sigs.k8s.io/cluster-api/cmd/clusterctl/cmd/init.go:89
github.com/spf13/cobra.(*Command).execute
	github.com/spf13/cobra@v1.8.1/command.go:985
github.com/spf13/cobra.(*Command).ExecuteC
	github.com/spf13/cobra@v1.8.1/command.go:1117
github.com/spf13/cobra.(*Command).Execute
	github.com/spf13/cobra@v1.8.1/command.go:1041
sigs.k8s.io/cluster-api/cmd/clusterctl/cmd.Execute
	sigs.k8s.io/cluster-api/cmd/clusterctl/cmd/root.go:113
main.main
	sigs.k8s.io/cluster-api/cmd/clusterctl/main.go:27
runtime.main
	runtime/proc.go:272
runtime.goexit
	runtime/asm_arm64.s:1223

The issue is clearly caused by the managedclusters.containerservice.azure.com CRD being too large for our EKS API server to process in a single create or patch request. I was able to create it manually with kubectl by splitting it into 4 chunks and applying them as a series of patches.
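To check the size hypothesis, a rough sketch like the one below could list the CRDs in a provider manifest from largest to smallest. It assumes the manifest is a multi-document YAML file separated by `---` lines (as infrastructure-components.yaml appears to be) and uses plain text matching instead of a YAML parser. The function names are hypothetical and are not part of clusterctl.

```python
import re


def split_documents(manifest: str) -> list[str]:
    """Split a multi-document YAML string on '---' separator lines."""
    parts = re.split(r"(?m)^---[ \t]*$", manifest)
    return [p.strip("\n") for p in parts if p.strip()]


def top_level_kind(doc: str) -> str:
    """Return the value of the unindented 'kind:' key, or '?' if absent."""
    match = re.search(r"(?m)^kind:[ \t]*(\S+)", doc)
    return match.group(1) if match else "?"


def metadata_name(doc: str) -> str:
    """Return metadata.name, assuming two-space indentation (an assumption)."""
    match = re.search(r"(?m)^metadata:\n(?:[ ]+.*\n)*?[ ]{2}name:[ \t]*(\S+)", doc)
    return match.group(1) if match else "?"


def crd_sizes(manifest: str) -> list[tuple[str, int]]:
    """List (name, size in bytes) for every CRD, largest first."""
    sizes = [
        (metadata_name(doc), len(doc.encode("utf-8")))
        for doc in split_documents(manifest)
        if top_level_kind(doc) == "CustomResourceDefinition"
    ]
    return sorted(sizes, key=lambda item: item[1], reverse=True)
```

Calling `crd_sizes(open("infrastructure-components.yaml").read())` on a locally downloaded copy of the manifest would show whether one CRD stands out. The measured size is that of the YAML text, which only approximates the size of the request clusterctl actually sends.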

Unfortunately, clusterctl still attempts a patch operation every time, which fails as before. The same problem appeared with versions 1.18.x and 1.19.x.

$ ./clusterctl-darwin-arm64 init -i azure:v1.20.5 -v=5
No default config file available
Fetching providers
Potential override file searchFile="/Users/user/Library/Application Support/cluster-api/overrides/infrastructure-azure/v1.20.5/infrastructure-components.yaml" provider="infrastructure-azure" version="v1.20.5"
Fetching file="infrastructure-components.yaml" provider="azure" type="InfrastructureProvider" version="v1.20.5"
Potential override file searchFile="/Users/user/Library/Application Support/cluster-api/overrides/cluster-api/v1.9.11/metadata.yaml" provider="cluster-api" version="v1.9.11"
Fetching file="metadata.yaml" provider="cluster-api" type="CoreProvider" version="v1.9.11"
Potential override file searchFile="/Users/user/Library/Application Support/cluster-api/overrides/infrastructure-azure/v1.20.5/metadata.yaml" provider="infrastructure-azure" version="v1.20.5"
Fetching file="metadata.yaml" provider="azure" type="InfrastructureProvider" version="v1.20.5"
Creating Namespace="cert-manager-test"
Creating Issuer="test-selfsigned" Namespace="cert-manager-test"
Creating Certificate="selfsigned-cert" Namespace="cert-manager-test"
Deleting Namespace="cert-manager-test"
Deleting Issuer="test-selfsigned" Namespace="cert-manager-test"
Deleting Certificate="selfsigned-cert" Namespace="cert-manager-test"
Skipping installing cert-manager as it is already installed
Installing provider="infrastructure-azure" version="v1.20.5" targetNamespace="capz-system"
Creating objects provider="infrastructure-azure" version="v1.20.5" targetNamespace="capz-system"
Creating Namespace="capz-system"
Creating Issuer="capz-selfsigned-issuer" Namespace="capz-system"
Creating Issuer="azureserviceoperator-selfsigned-issuer" Namespace="capz-system"
Creating Certificate="capz-serving-cert" Namespace="capz-system"
Creating Certificate="azureserviceoperator-serving-cert" Namespace="capz-system"
Creating CustomResourceDefinition="azureasomanagedclusters.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azureasomanagedclustertemplates.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azureasomanagedcontrolplanes.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azureasomanagedcontrolplanetemplates.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azureasomanagedmachinepools.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azureasomanagedmachinepooltemplates.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azureclusteridentities.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azureclusters.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azureclustertemplates.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azuremachinepoolmachines.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azuremachinepools.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azuremachines.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azuremachinetemplates.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azuremanagedclusters.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azuremanagedclustertemplates.infrastructure.cluster.x-k8s.io"
Patching CustomResourceDefinition="azuremanagedcontrolplanes.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azuremanagedcontrolplanetemplates.infrastructure.cluster.x-k8s.io"
Patching CustomResourceDefinition="azuremanagedmachinepools.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="azuremanagedmachinepooltemplates.infrastructure.cluster.x-k8s.io"
Creating CustomResourceDefinition="bastionhosts.network.azure.com"
Creating CustomResourceDefinition="extensions.kubernetesconfiguration.azure.com"
Creating CustomResourceDefinition="fleetsmembers.containerservice.azure.com"
Patching CustomResourceDefinition="managedclusters.containerservice.azure.com"
Retrying with backoff cause="failed to create provider object apiextensions.k8s.io/v1, Kind=CustomResourceDefinition, /managedclusters.containerservice.azure.com: Post \"https://xxxxxxxx.gr7.us-east-2.eks.amazonaws.com/apis/apiextensions.k8s.io/v1/customresourcedefinitions?timeout=30s\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
Retrying with backoff cause="failed to get current provider object: failed to get API group resources: unable to retrieve the complete list of server APIs: apiextensions.k8s.io/v1: Get \"https://xxxxxxxx.gr7.us-east-2.eks.amazonaws.com/apis/apiextensions.k8s.io/v1?timeout=30s\": http2: client connection lost"
Patching CustomResourceDefinition="managedclusters.containerservice.azure.com"
Retrying with backoff cause="failed to create provider object apiextensions.k8s.io/v1, Kind=CustomResourceDefinition, /managedclusters.containerservice.azure.com: Post \"https://xxxxxxxx.gr7.us-east-2.eks.amazonaws.com/apis/apiextensions.k8s.io/v1/customresourcedefinitions?timeout=30s\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
Patching CustomResourceDefinition="managedclusters.containerservice.azure.com"
Retrying with backoff cause="failed to create provider object apiextensions.k8s.io/v1, Kind=CustomResourceDefinition, /managedclusters.containerservice.azure.com: Post \"https://xxxxxxxx.gr7.us-east-2.eks.amazonaws.com/apis/apiextensions.k8s.io/v1/customresourcedefinitions?timeout=30s\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
Retrying with backoff cause="failed to get current provider object: failed to get API group resources: unable to retrieve the complete list of server APIs: apiextensions.k8s.io/v1: Get \"https://xxxxxxxx.gr7.us-east-2.eks.amazonaws.com/apis/apiextensions.k8s.io/v1?timeout=30s\": http2: client connection lost"
Patching CustomResourceDefinition="managedclusters.containerservice.azure.com"
Retrying with backoff cause="failed to create provider object apiextensions.k8s.io/v1, Kind=CustomResourceDefinition, /managedclusters.containerservice.azure.com: Post \"https://xxxxxxxx.gr7.us-east-2.eks.amazonaws.com/apis/apiextensions.k8s.io/v1/customresourcedefinitions?timeout=30s\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
Patching CustomResourceDefinition="managedclusters.containerservice.azure.com"
Retrying with backoff cause="failed to create provider object apiextensions.k8s.io/v1, Kind=CustomResourceDefinition, /managedclusters.containerservice.azure.com: Post \"https://xxxxxxxx.gr7.us-east-2.eks.amazonaws.com/apis/apiextensions.k8s.io/v1/customresourcedefinitions?timeout=30s\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
Patching CustomResourceDefinition="managedclusters.containerservice.azure.com"
Retrying with backoff cause="failed to create provider object apiextensions.k8s.io/v1, Kind=CustomResourceDefinition, /managedclusters.containerservice.azure.com: Post \"https://xxxxxxxx.gr7.us-east-2.eks.amazonaws.com/apis/apiextensions.k8s.io/v1/customresourcedefinitions?timeout=30s\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)"
Retrying with backoff cause="failed to get current provider object: failed to get API group resources: unable to retrieve the complete list of server APIs: apiextensions.k8s.io/v1: Get \"https://xxxxxxxx.gr7.us-east-2.eks.amazonaws.com/apis/apiextensions.k8s.io/v1?timeout=30s\": http2: client connection lost"
Patching CustomResourceDefinition="managedclusters.containerservice.azure.com"
Error: action failed after 10 attempts: failed to create provider object apiextensions.k8s.io/v1, Kind=CustomResourceDefinition, /managedclusters.containerservice.azure.com: Post "https://xxxxxxxx.gr7.us-east-2.eks.amazonaws.com/apis/apiextensions.k8s.io/v1/customresourcedefinitions?timeout=30s": net/http: request canceled (Client.Timeout exceeded while awaiting headers)
sigs.k8s.io/cluster-api/cmd/clusterctl/client/cluster.retryWithExponentialBackoff
	sigs.k8s.io/cluster-api/cmd/clusterctl/client/cluster/client.go:238
sigs.k8s.io/cluster-api/cmd/clusterctl/client/cluster.(*providerComponents).Create
	sigs.k8s.io/cluster-api/cmd/clusterctl/client/cluster/components.go:89
sigs.k8s.io/cluster-api/cmd/clusterctl/client/cluster.installComponentsAndUpdateInventory
	sigs.k8s.io/cluster-api/cmd/clusterctl/client/cluster/installer.go:116
sigs.k8s.io/cluster-api/cmd/clusterctl/client/cluster.(*providerInstaller).Install
	sigs.k8s.io/cluster-api/cmd/clusterctl/client/cluster/installer.go:99
sigs.k8s.io/cluster-api/cmd/clusterctl/client.(*clusterctlClient).Init
	sigs.k8s.io/cluster-api/cmd/clusterctl/client/init.go:153
sigs.k8s.io/cluster-api/cmd/clusterctl/cmd.runInit
	sigs.k8s.io/cluster-api/cmd/clusterctl/cmd/init.go:149
sigs.k8s.io/cluster-api/cmd/clusterctl/cmd.init.func13
	sigs.k8s.io/cluster-api/cmd/clusterctl/cmd/init.go:89
github.com/spf13/cobra.(*Command).execute
	github.com/spf13/cobra@v1.8.1/command.go:985
github.com/spf13/cobra.(*Command).ExecuteC
	github.com/spf13/cobra@v1.8.1/command.go:1117
github.com/spf13/cobra.(*Command).Execute
	github.com/spf13/cobra@v1.8.1/command.go:1041
sigs.k8s.io/cluster-api/cmd/clusterctl/cmd.Execute
	sigs.k8s.io/cluster-api/cmd/clusterctl/cmd/root.go:113
main.main
	sigs.k8s.io/cluster-api/cmd/clusterctl/main.go:27
runtime.main
	runtime/proc.go:272
runtime.goexit
	runtime/asm_arm64.s:1223

What did you expect to happen:

Is there any way to install the Azure provider without its CRDs, assuming they are already present on the cluster? Do you plan to have this single CRD delivered in chunks during the initialisation or upgrade process?
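As an illustration of what "without CRDs" could mean in practice, here is a minimal sketch that separates the CustomResourceDefinition documents in a components manifest from everything else. It assumes a multi-document YAML file with `---` separator lines and relies on text matching only. The name `partition_crds` is hypothetical. The log above shows clusterctl looking for an override file under the overrides/infrastructure-azure/v1.20.5/ directory, so the CRD-free half could presumably be tried there, but whether clusterctl then installs cleanly is untested.

```python
import re

# Matches an unindented 'kind: CustomResourceDefinition' line; nested
# 'kind:' fields inside a CRD spec are indented and so are ignored.
CRD_KIND = re.compile(r"(?m)^kind:[ \t]*CustomResourceDefinition[ \t]*$")


def partition_crds(manifest: str) -> tuple[str, str]:
    """Return (crds_only, everything_else) as two multi-document YAML strings."""
    docs = [
        part.strip("\n")
        for part in re.split(r"(?m)^---[ \t]*$", manifest)
        if part.strip()
    ]
    crds = [doc for doc in docs if CRD_KIND.search(doc)]
    rest = [doc for doc in docs if not CRD_KIND.search(doc)]

    def join(items: list[str]) -> str:
        # Re-assemble documents with the standard YAML separator.
        return "\n---\n".join(items) + ("\n" if items else "")

    return join(crds), join(rest)
```

The CRD half could then be applied separately by whatever method works against the cluster, such as the chunked kubectl patches described earlier.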

Anything else you would like to add:

Environment:

  • cluster-api-provider-azure version: 1.20.5 (capi 1.9.11)
  • Kubernetes version: (use kubectl version): v1.32.9-eks-3025e55
  • OS (e.g. from /etc/os-release): AL2023
