Support combined calico/calico image for OSS deployments#4608

Draft
caseydavenport wants to merge 14 commits into tigera:master from caseydavenport:casey-uber-operator-v2

Conversation

@caseydavenport

Teaches the operator to deploy the combined calico/calico image when the calico component is present in the versions file. Each component deployment gets the appropriate command and health probes.

When the combined image is available, the operator sets:

  • image: calico/calico:$VERSION for typha, kube-controllers, apiserver, webhooks, goldmane, whisker-backend, dikastes, csi, key-cert, guardian
  • command: ["calico", "<subcommand>"] to select the component
  • Exec probes using calico health --port=<port> --type=readiness|liveness for components with HealthAggregator

Health probe changes:

  • Goldmane: exec probe via generic calico health command (port 8080)
  • Kube-controllers: exec probe via generic calico health command (port 9440), replacing the legacy file-based check-status

Depends on: projectcalico/calico#12225

None

Switch the operator to use the consolidated calico/calico uber image
for all OSS (non-enterprise, non-FIPS) component deployments: typha,
kube-controllers, apiserver, CSI driver, CNI plugin, and flexvol.

Each component gets an explicit Command override to dispatch to the
correct subcommand (e.g., ["calico", "typha"]). Enterprise and FIPS
deployments continue using per-component images unchanged.

Extend the uber calico/calico image usage to goldmane, webhooks, and
whisker-backend components for non-enterprise, non-FIPS deployments.
Each component gets a command override to dispatch to the correct
subcommand in the uber binary.

The uber image doesn't include the separate /health binary. Use
httpGet probes against goldmane's health server instead of exec
probes when using the uber image.

The HTTP probes fail on dual-stack clusters because the kubelet resolves
localhost to [::1] (IPv6) while goldmane's health server binds only to
127.0.0.1. Use exec probes with the new "calico goldmane-check" subcommand
to match how the standalone image worked.

The goldmane HealthAggregator now binds all interfaces instead of
localhost, so kubelet HTTP probes work on dual-stack clusters.
Switch from exec-based probes back to native httpGet probes.
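The difference between the two bind addresses can be seen with a plain Go listener. This is a standalone sketch using only the standard `net` package, not goldmane's HealthAggregator code; `listenIP` is a hypothetical helper and port 0 just asks the OS for a free port.

```go
package main

import (
	"fmt"
	"net"
)

// listenIP opens a TCP listener on addr, reports the IP it is bound to,
// and closes the listener again.
func listenIP(addr string) (net.IP, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, err
	}
	defer ln.Close()
	return ln.Addr().(*net.TCPAddr).IP, nil
}

func main() {
	// "127.0.0.1:0" binds the IPv4 loopback only, so a client that connects
	// to [::1] cannot reach it. ":0" binds the unspecified address, which
	// accepts connections on every interface.
	for _, addr := range []string{"127.0.0.1:0", ":0"} {
		ip, err := listenIP(addr)
		if err != nil {
			fmt.Println("listen failed:", err)
			continue
		}
		fmt.Printf("%q bound to %v (loopback=%t, unspecified=%t)\n",
			addr, ip, ip.IsLoopback(), ip.IsUnspecified())
	}
}
```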

Switch goldmane and kube-controllers uber probes to use the generic
"calico health" command with exec probes. This avoids host-to-pod HTTP
traffic while using the standardized health check infrastructure.

Kube-controllers now starts an HTTP health server on port 9094 via
--health-port flag, alongside the legacy file-based status.

When the combined calico/calico image is available, use it for:
- ebpf-bootstrap init container: calico node init [--best-effort]
- preStop hook: calico node shutdown
- readiness probe: calico node health --felix-ready [--bird-ready]

The node container itself still uses calico/node since it needs the
full base image with BIRD, runit, etc.

The calico-node binary now uses cobra subcommands. Update the operator
to use the new syntax when the combined image layout is active:

- Init container: calico-node init [--best-effort] (stays on node image)
- PreStop: calico-node shutdown
- Readiness: calico-node health --felix-ready [--bird-ready]

The node container still uses calico/node since it needs the full base
image with BIRD, runit, iptables, and bpftool.

The node image now includes the combined calico binary at /usr/bin/calico,
avoiding the need to compile it separately. Update init container, probes,
and lifecycle hooks to use "calico node" subcommands via /usr/bin/calico
when the combined image layout is active.

The uber calico binary now namespaces all component entry points under
'calico component'. Update all command arrays in render code and tests
to insert "component" after "calico" for component commands.

User-facing commands (health, ctl, version) remain at the top level.