CRD Reference
API reference for the Machine and MachineOperation custom resources.
API group: unbounded-cloud.io/v1alpha3
This document describes the custom resource definitions shipped with machina: Machine and MachineOperation.
Machine
| Property | Value |
|---|---|
| Kind | Machine |
| Plural | machines |
| Short name | mach |
| Scope | Cluster |
| Status subresource | Yes |
Printer columns:
| Name | JSON Path | Description |
|---|---|---|
| Host | .spec.ssh.host | SSH target address |
| Phase | .status.phase | Current lifecycle phase |
| K8s Version | .spec.kubernetes.version | Desired Kubernetes version |
| Age | standard | Time since creation |
spec.ssh
SSH connection details. When ssh is nil, the machina controller skips the Machine entirely.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| ssh | SSHSpec | No | — | SSH connection configuration. |
| ssh.host | string | Yes | — | Hostname or IP, optionally with port (e.g. 1.2.3.4:2222). Port 22 is assumed when omitted. |
| ssh.username | string | No | "azureuser" | SSH username. |
| ssh.privateKeyRef | SecretKeySelector | Yes | — | Reference to a Secret containing the SSH private key. Must reside in the unbounded-kube namespace. |
| ssh.privateKeyRef.name | string | Yes | — | Secret name. |
| ssh.privateKeyRef.namespace | string | Yes | — | Secret namespace (must be unbounded-kube). |
| ssh.privateKeyRef.key | string | No | "ssh-privatekey" | Key within the Secret’s data map. |
| ssh.bastion | BastionSSHSpec | No | — | Optional jump host for the SSH connection. |
| ssh.bastion.host | string | Yes | — | Bastion hostname or IP, optionally with port. |
| ssh.bastion.username | string | No | "azureuser" | Bastion SSH username. |
| ssh.bastion.privateKeyRef | *SecretKeySelector | No | Same as ssh.privateKeyRef | Bastion SSH key. Falls back to the parent ssh.privateKeyRef when omitted. |
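The fragment below is a minimal sketch of how these fields fit together: a Secret that stores the private key under the default ssh-privatekey key, and a Machine whose bastion uses its own key instead of falling back to ssh.privateKeyRef. The names ssh-key, bastion-key, and worker-03, the host addresses, and the kubernetes.io/ssh-auth Secret type are assumptions for illustration; this reference only specifies the key name and the namespace. It is also assumed that the bastion key Secret lives in unbounded-kube.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: ssh-key                # hypothetical name
  namespace: unbounded-kube    # required namespace for ssh.privateKeyRef
type: kubernetes.io/ssh-auth   # assumption: not mandated by this reference
stringData:
  # Default key looked up when privateKeyRef.key is omitted.
  ssh-privatekey: |
    -----BEGIN OPENSSH PRIVATE KEY-----
    <redacted>
    -----END OPENSSH PRIVATE KEY-----
---
apiVersion: unbounded-cloud.io/v1alpha3
kind: Machine
metadata:
  name: worker-03              # hypothetical name
spec:
  ssh:
    host: "10.0.0.51"          # no port given, so port 22 is assumed
    privateKeyRef:
      name: ssh-key
      namespace: unbounded-kube
    bastion:
      host: "bastion.example.com:2222"
      privateKeyRef:           # overrides the fallback to ssh.privateKeyRef
        name: bastion-key      # hypothetical Secret
        namespace: unbounded-kube
```

Because username is omitted in both places, the documented default "azureuser" applies to the target host and to the bastion.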
spec.pxe
PXE boot configuration consumed by the metalman controller.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| pxe | PXESpec | No | — | PXE boot configuration. |
| pxe.image | string | Yes | — | OCI image reference containing netboot artifacts (e.g. "ghcr.io/azure/images/host-ubuntu2404:v1"). |
| pxe.dhcpLeases | []DHCPLease | No | — | Static DHCP leases served during PXE boot. |
| pxe.dhcpLeases[].ipv4 | string | Yes | — | Static IPv4 address to assign. |
| pxe.dhcpLeases[].mac | string | Yes | — | NIC MAC address (matched case-insensitively). |
| pxe.dhcpLeases[].subnetMask | string | Yes | — | Subnet mask. |
| pxe.dhcpLeases[].gateway | string | Yes | — | Default gateway. |
| pxe.dhcpLeases[].dns | []string | No | — | DNS server addresses. |
| pxe.redfish | RedfishSpec | No | — | BMC access via the Redfish API. |
| pxe.redfish.url | string | Yes | — | Redfish endpoint URL. |
| pxe.redfish.username | string | Yes | — | Redfish username. |
| pxe.redfish.deviceID | string | No | "1" | Redfish system device ID. |
| pxe.redfish.passwordRef | SecretKeySelector | Yes | — | Secret containing the Redfish password. |
| pxe.cloudInit | CloudInitSpec | No | — | Optional cloud-init customization for PXE-booted machines. |
| pxe.cloudInit.userDataConfigMapRef | ConfigMapKeySelector | No | — | Reference to a ConfigMap containing custom cloud-init user-data. |
| pxe.cloudInit.userDataConfigMapRef.name | string | Yes | — | ConfigMap name. |
| pxe.cloudInit.userDataConfigMapRef.namespace | string | Yes | — | ConfigMap namespace. |
| pxe.cloudInit.userDataConfigMapRef.key | string | No | "user-data" | Key within the ConfigMap. |
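As a sketch of what pxe.cloudInit.userDataConfigMapRef can point at, the ConfigMap below stores cloud-init user-data under the default user-data key. The ConfigMap name, the namespace, and the cloud-config content are illustrative assumptions; how metalman combines this user-data with its own bootstrap configuration is not described in this reference.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-cloud-init          # hypothetical name
  namespace: unbounded-kube    # hypothetical namespace
data:
  # "user-data" is the default key, so userDataConfigMapRef.key can be omitted.
  user-data: |
    #cloud-config
    packages:
      - htop
```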
spec.kubernetes
Kubernetes join configuration.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| kubernetes | KubernetesSpec | No | — | Kubernetes join settings. |
| kubernetes.version | string | No | Cluster version | Desired Kubernetes version (e.g. "v1.34.0"). A v prefix is added automatically if missing. |
| kubernetes.nodeRef | *LocalObjectReference | No | — | Reference to the corresponding Node object. Set by the controller. |
| kubernetes.nodeLabels | map[string]string | No | — | Labels to apply to the Node (not yet propagated by the machina controller). |
| kubernetes.bootstrapTokenRef.name | string | Yes | — | Name of the bootstrap token Secret in kube-system. |
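A short sketch of the version handling described in the table: the value below omits the v prefix, which the table says is added automatically. The bootstrap token name is a placeholder.

```yaml
spec:
  kubernetes:
    version: "1.34.0"                 # treated as v1.34.0
    bootstrapTokenRef:
      name: bootstrap-token-abc123    # placeholder Secret name in kube-system
```

Leaving version out altogether falls back to the cluster version, per the Default column.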
spec.operations
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| operations.rebootCounter | int64 | No | 0 | Triggers a reboot when the spec value exceeds the status value. |
| operations.repaveCounter | int64 | No | 0 | Triggers a PXE repave when the spec value exceeds the status value. |
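The counter comparison can be pictured with the following sketch, which shows the spec and status of a single Machine side by side. The Machine name and the counter values are hypothetical, and the status block is included only for comparison; it is written by the controller, not by users.

```yaml
apiVersion: unbounded-cloud.io/v1alpha3
kind: Machine
metadata:
  name: worker-01        # hypothetical name
spec:
  operations:
    rebootCounter: 2     # exceeds the status value, so a reboot is triggered
    repaveCounter: 1     # equal to the status value, so no repave is triggered
status:
  operations:
    rebootCounter: 1     # last-acted reboot counter
    repaveCounter: 1     # last-acted repave counter
```

Incrementing a spec counter is therefore the way to request the next reboot or repave.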
spec.provider and spec.providerID
provider selects the external control provider for out-of-band operations.
Built-in providers are AzureVM and OCIInstance, and provider-specific
controllers may use their own non-empty provider names. providerID identifies
the underlying infrastructure resource and follows the Kubernetes Node provider
ID convention.
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| provider | string | For external operations | – | External control provider, such as AzureVM, OCIInstance, or a provider-specific value handled by a custom controller. |
| providerID | string | For external operations | – | Provider-specific resource ID such as azure:///subscriptions/.../virtualMachines/name or oci://ocid1.instance.... |
Machine operation credentials are selected by the Machine site label. Providers that support OIDC/workload identity use WorkloadIdentity; providers or sites that need provider-specific credential material use ExternalPlugin with a referenced Secret.
Custom provider controllers can implement pkg/machineops.Provider and reuse pkg/machineops/controller.MachineOperationReconciler with SiteName and ProviderName set for their deployment.
apiVersion: unbounded-cloud.io/v1alpha3
kind: MachineOperationCredential
metadata:
  name: remote-azure
spec:
  siteName: remote
  provider: AzureVM
  auth:
    mode: WorkloadIdentity
---
apiVersion: unbounded-cloud.io/v1alpha3
kind: MachineOperationCredential
metadata:
  name: remote-oci
spec:
  siteName: remote
  provider: OCIInstance
  auth:
    mode: ExternalPlugin
    secretRef:
      namespace: unbounded-kube
      name: remote-oci-auth
MachineOperation
| Property | Value |
|---|---|
| Kind | MachineOperation |
| Plural | machineoperations |
| Short name | mop |
| Scope | Cluster |
| Status subresource | Yes |
MachineOperation is a job-like CR for discrete operations. The in-host agent handles Kubernetes node operations such as NodeReboot and agent operations such as AgentReset; machine-ops-controller handles out-of-band VM operations such as Azure VM power actions. PXE/BMC operations remain owned by metalman for now.
| Field | Type | Required | Description |
|---|---|---|---|
| spec.machineRef | string | No | Target Machine name. Either machineRef or machineSelector must be set. |
| spec.machineSelector | LabelSelector | No | Selects Machines by label. Supported for agent-handled operations (NodeReboot, AgentUpgrade, AgentReset). Each matching agent independently picks up the operation. Not supported for host operations. |
| spec.operationKind | string | Yes | One of NodeReboot, AgentUpgrade, AgentReset, HostReboot, HostPowerOff, HostPowerOn, HostReplace. |
| spec.parameters | map[string]string | No | Operation-specific parameters. |
| spec.ttlSecondsAfterFinished | int32 | No | Delete completed or failed operations after this many seconds. |
| status.phase | string | No | Pending, InProgress, Complete, or Failed. |
| status.message | string | No | Human-readable status message. |
| status.startedAt | time | No | Operation start timestamp. |
| status.completedAt | time | No | Terminal phase timestamp. |
| status.targets | []TargetStatus | No | Per-Machine target status snapshot used by metalman host operations. |
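As an illustration of the selector form, the sketch below requests a NodeReboot on every Machine that carries a given label. It assumes LabelSelector follows the standard Kubernetes shape with matchLabels; the operation name and the example.com/pool label are hypothetical.

```yaml
apiVersion: unbounded-cloud.io/v1alpha3
kind: MachineOperation
metadata:
  name: edge-pool-reboot           # hypothetical name
spec:
  machineSelector:
    matchLabels:
      example.com/pool: edge       # hypothetical label on the target Machines
  operationKind: NodeReboot        # agent-handled, so a selector is supported
  ttlSecondsAfterFinished: 600     # delete the operation 10 minutes after it finishes
```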
AgentUpgrade is handled by the in-host agent and requires spec.parameters.downloadURL. The URL must point to an unbounded-agent release tarball; the agent stages it as the inactive blue/green daemon binary, records the previous binary as last known good, and restarts unbounded-agent-daemon.service. If systemd cannot keep the upgraded daemon running, unbounded-agent-daemon-recovery.service switches the daemon back to the last known good binary.
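A minimal AgentUpgrade request might look like the sketch below. The operation name, the target Machine, and the download URL are placeholders; the URL in particular does not point at a real release.

```yaml
apiVersion: unbounded-cloud.io/v1alpha3
kind: MachineOperation
metadata:
  name: worker-01-agent-upgrade    # hypothetical name
spec:
  machineRef: worker-01            # hypothetical target Machine
  operationKind: AgentUpgrade
  parameters:
    # Placeholder URL for an unbounded-agent release tarball.
    downloadURL: "https://example.com/releases/unbounded-agent.tar.gz"
  ttlSecondsAfterFinished: 300
```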
The Azure VM provider handles:
| Operation | Azure action |
|---|---|
| HostReboot | VirtualMachinesClient.BeginRestart |
| HostPowerOff | VirtualMachinesClient.BeginPowerOff |
| HostPowerOn | VirtualMachinesClient.BeginStart |
| HostReplace | VirtualMachinesClient.Get, BeginDelete, then BeginCreateOrUpdate |
HostReplace for AzureVM destructively replaces the VM: it reads the existing VM model, detaches NICs and data disks, deletes the VM resource, and recreates the same VM name with fresh cloud-init custom data that installs unbounded-agent. The old OS disk is not reused. Operation completion means the replacement VM create operation completed; it does not mean the Kubernetes Node is Ready. The Machine controller continues tracking whether the Kubernetes Node disappears and rejoins. Configure machine-ops-controller --api-server-endpoint with an API server address reachable from replaced hosts; the generated agent bootstrap config uses that value.
This replacement flow avoids Azure standalone VM customData immutability during native reimage. It intentionally destroys host-local state on the old OS disk.
The OCI instance provider handles:
| Operation | OCI action |
|---|---|
| HostReboot | RESET |
| HostPowerOff | STOP |
| HostPowerOn | START |
| HostReplace | STOP old instance, LaunchInstance replacement, patch Machine.spec.providerID, then terminate old instance |
HostReplace for OCIInstance creates a replacement instance because OCI launch user_data is immutable after instance creation. The controller stops the old instance, launches a new instance in the same availability domain, subnet, shape, and fault domain, requests a public IP for bootstrap egress, patches Machine.spec.providerID to the new instance OCID after the replacement reaches RUNNING, and then terminates the old instance. The replacement reuses the original Machine name as the kubelet node name so it rejoins through the existing Kubernetes Node object. Operation completion means the replacement is running, provider ID handoff succeeded, and old-instance cleanup succeeded; it does not wait for the Kubernetes Node to become Ready.
The OCI replacement flow copies display name, defined tags, freeform tags, selected agent/availability/shape settings, and primary VNIC subnet/NSG/source-destination-check settings. It adds Unbounded freeform tags for idempotent retry lookup. It does not preserve the exact private IP, boot volume, or attached data volumes; active attached data volumes fail the operation before the old instance is stopped. By default, the replacement uses the latest compatible Canonical Ubuntu 24.04 image for the source instance shape. Set spec.parameters.imageID to use a specific OCI image OCID. Set spec.parameters.sshAuthorizedKeys to append SSH authorized keys to replacement metadata for break-glass debugging.
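The two optional parameters can be sketched as follows. The image OCID and the SSH key are elided placeholders, and passing a single key as one string is an assumption; this reference does not specify how multiple keys are encoded.

```yaml
apiVersion: unbounded-cloud.io/v1alpha3
kind: MachineOperation
metadata:
  name: oci-worker-01-replace      # hypothetical name
spec:
  machineRef: oci-worker-01
  operationKind: HostReplace       # destructive: the boot volume is not preserved
  parameters:
    imageID: "ocid1.image.oc1..."  # placeholder OCID, elided on purpose
    sshAuthorizedKeys: "ssh-ed25519 AAAA... ops@example.com"  # placeholder key
  ttlSecondsAfterFinished: 300
```

Omitting imageID keeps the default behavior described above, which selects the latest compatible Canonical Ubuntu 24.04 image for the source instance shape.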
Metalman handles bare-metal host operations for Machines with spec.pxe.redfish
and no external spec.provider/spec.providerID. Bare-metal host operations may
target one Machine with spec.machineRef or a site-scoped set of Machines with
spec.machineSelector. Selector-based bare-metal host operations must select a
single metalman site with unbounded-cloud.io/site=<site>.
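A selector-based bare-metal operation scoped to one site could be written as in the sketch below. It assumes the standard Kubernetes matchLabels form for machineSelector; the operation name and the site value lab are illustrative.

```yaml
apiVersion: unbounded-cloud.io/v1alpha3
kind: MachineOperation
metadata:
  name: lab-host-reboot                # hypothetical name
spec:
  machineSelector:
    matchLabels:
      unbounded-cloud.io/site: lab     # selects exactly one metalman site
  operationKind: HostReboot
  ttlSecondsAfterFinished: 300
```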
For metalman operations, status.targets[] is snapshotted when execution starts
and remains authoritative even if labels later change. Each entry includes:
| Field | Type | Description |
|---|---|---|
| machineRef | string | Target Machine name. |
| phase | string | Target phase: Pending, InProgress, Complete, or Failed. |
| stage | string | Target operation stage such as WaitingOff, WaitingOn, or WaitingRepave. |
| message | string | Human-readable target progress or failure message. |
| startedAt | time | Target start timestamp. |
| completedAt | time | Target terminal timestamp. |
| observedGeneration | int64 | Machine generation acted on. |
| targetOperations | OperationsStatus | Counter targets used by bare-metal HostReplace. |
| attempts | int32 | External action attempts for retryable Redfish operations. |
| lastAttemptAt | time | Most recent external action attempt timestamp. |
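To make the shape of status.targets[] concrete, the following is a purely hypothetical snapshot of a two-Machine operation in progress. Every value, including the names, timestamps, and message wording, is invented for illustration and is not captured from a real controller.

```yaml
status:
  phase: InProgress
  startedAt: "2025-01-01T12:00:00Z"
  targets:
    - machineRef: baremetal-01
      phase: InProgress
      stage: WaitingOff                    # one of the documented stage values
      message: "waiting for power off"     # hypothetical wording
      startedAt: "2025-01-01T12:00:00Z"
      observedGeneration: 3
      attempts: 1
      lastAttemptAt: "2025-01-01T12:00:05Z"
    - machineRef: baremetal-02
      phase: Pending                       # not started yet
```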
Machine status
| Field | Type | Description |
|---|---|---|
| phase | string | Current lifecycle phase (see table below). |
| message | string | Human-readable status message. |
| ssh.fingerprint | string | SSH host key fingerprint (not yet implemented). |
| redfish.certFingerprint | string | BMC TLS certificate SHA-256 fingerprint. Set by metalman using TOFU. |
| tpm.ekPublicKey | string | TPM endorsement key in PEM format. Set by metalman attestation using TOFU. |
| operations.rebootCounter | int64 | Last-acted reboot counter value. |
| operations.repaveCounter | int64 | Last-acted repave counter value. |
| conditions | []Condition | Standard Kubernetes conditions (see below). |
Conditions
| Type | Set By | Description |
|---|---|---|
| SSHReachable | machina | True / False based on a TCP probe to the SSH port. |
| Provisioning | machina | True while the install script is running over SSH. lastTransitionTime records when provisioning started, used to detect stale provisioning attempts (e.g. after a controller restart). |
| Provisioned | machina | True after successful SSH provisioning. ObservedGeneration tracks the spec generation. |
| PoweredOff | metalman | Tracks BMC power state during a reboot cycle. Removed after power-on completes. Not defined as a CRD type constant; set directly by the metalman redfish reconciler. |
| BootOrderConfigSupported | metalman | Set to False when the BMC does not support boot order configuration. Not defined as a CRD type constant; set directly by the metalman redfish reconciler. |
| Repaved | metalman | False/Pending during repave; True/Succeeded after /pxe/disable. Stale False conditions are removed after a 30-minute timeout. |
Phase lifecycle
The machina controller drives the following phases:
| Phase | Meaning | Requeue interval |
|---|---|---|
| Pending | SSH is unreachable. | 30 s |
| Provisioning | Install script is running over SSH. | — |
| Joining | Provisioned; waiting for a Node with the matching label. | 30 s |
| Ready | Node exists, or no kubernetes spec is present. | 5 min |
| Failed | Provisioning encountered an error. | 60 s |
| Rebooting | Reserved for metalman or provider controllers. | — |
Labels and annotations
Labels:
| Label | Applied to | Description |
|---|---|---|
| unbounded-cloud.io/machine | Node | Maps the Node back to its Machine CR. Set during provisioning. |
| unbounded-cloud.io/site | Machine | Scopes a metalman instance to a subset of Machines. |
| unbounded-cloud.io/default-bootstrap-token | Secret | Marks a Secret as the default bootstrap token for auto-discovery. |
Annotations:
| Annotation | Description |
|---|---|
| unbounded-cloud.io/provider | Associates a Machine with a provider controller (extension point). |
Examples
Minimal SSH-only Machine:
apiVersion: unbounded-cloud.io/v1alpha3
kind: Machine
metadata:
  name: worker-01
spec:
  ssh:
    host: "10.0.0.50"
    privateKeyRef:
      name: ssh-key
      namespace: unbounded-kube
  kubernetes:
    version: v1.34.0
    bootstrapTokenRef:
      name: bootstrap-token-abc123
SSH with bastion:
apiVersion: unbounded-cloud.io/v1alpha3
kind: Machine
metadata:
  name: worker-02
spec:
  ssh:
    host: "192.168.1.100:2222"
    username: ubuntu
    privateKeyRef:
      name: ssh-key
      namespace: unbounded-kube
      key: id_ed25519
    bastion:
      host: "bastion.example.com"
      username: jump
  kubernetes:
    version: v1.34.0
    bootstrapTokenRef:
      name: bootstrap-token-abc123
Azure VM with external power operations:
apiVersion: unbounded-cloud.io/v1alpha3
kind: Machine
metadata:
  name: azure-worker-01
spec:
  provider: AzureVM
  providerID: azure:///subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-workers/providers/Microsoft.Compute/virtualMachines/azure-worker-01
  configurationRef:
    name: azure-workers
---
apiVersion: unbounded-cloud.io/v1alpha3
kind: MachineOperation
metadata:
  name: azure-worker-01-hardreboot
spec:
  machineRef: azure-worker-01
  operationKind: HostReboot
  ttlSecondsAfterFinished: 300
OCI instance with external power operations:
apiVersion: unbounded-cloud.io/v1alpha3
kind: Machine
metadata:
  name: oci-worker-01
spec:
  provider: OCIInstance
  providerID: oci://ocid1.instance.oc1...
  configurationRef:
    name: oci-workers
---
apiVersion: unbounded-cloud.io/v1alpha3
kind: MachineOperation
metadata:
  name: oci-worker-01-poweroff
spec:
  machineRef: oci-worker-01
  operationKind: HostPowerOff
  ttlSecondsAfterFinished: 300
PXE / bare-metal Machine:
apiVersion: unbounded-cloud.io/v1alpha3
kind: Machine
metadata:
  name: baremetal-01
  labels:
    unbounded-cloud.io/site: lab
spec:
  ssh:
    host: "10.0.0.60"
    privateKeyRef:
      name: ssh-key
      namespace: unbounded-kube
  pxe:
    image: ghcr.io/azure/images/host-ubuntu2404:v1
    dhcpLeases:
      - ipv4: "10.0.0.60"
        mac: "aa:bb:cc:dd:ee:ff"
        subnetMask: "255.255.255.0"
        gateway: "10.0.0.1"
        dns:
          - "8.8.8.8"
    redfish:
      url: "https://bmc-01.example.com"
      username: admin
      passwordRef:
        name: bmc-password
        namespace: unbounded-kube
    cloudInit:
      userDataConfigMapRef:
        name: my-cloud-init
        namespace: unbounded-kube
  kubernetes:
    version: v1.34.0
    bootstrapTokenRef:
      name: bootstrap-token-abc123
Netboot OCI Images
Netboot images are standard OCI container images built FROM scratch that
contain all files needed for PXE booting a machine under /disk/. This follows
the kubevirt containerDisk convention.
Files with a .tmpl suffix are Go templates rendered per-machine at serve time;
other files are served verbatim. A metadata.yaml file provides image-level
configuration (e.g. dhcpBootImageName).
Image layout
Template data
Templates receive the following data object:
| Field | Type | Description |
|---|---|---|
| .Machine | *Machine | The Machine CR that initiated the request. |
| .ApiserverURL | string | External Kubernetes API server URL. |
| .ServeURL | string | External metalman HTTP URL. |
| .KubernetesVersion | string | Resolved Kubernetes version for the machine. |
| .ClusterDNS | string | Cluster DNS service IP. |
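As a rough sketch of how a .tmpl file might reference this data object, the fragment below uses each field once. The file name and all key names on the left are invented for illustration and do not correspond to a file in any shipped image. The .Machine.Name reference assumes the Machine type exposes standard Kubernetes object metadata.

```yaml
# Hypothetical template, for example /disk/example.yaml.tmpl.
# Rendered per machine at serve time; the keys are illustrative only.
hostname: "{{ .Machine.Name }}"               # assumes standard object metadata
apiServer: "{{ .ApiserverURL }}"
artifactBase: "{{ .ServeURL }}"
kubernetesVersion: "{{ .KubernetesVersion }}"
clusterDNS: "{{ .ClusterDNS }}"
```

A file without the .tmpl suffix would be served verbatim, so the template actions above would reach the machine unrendered.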
Building images
Images are built, tagged, and pushed using standard container tooling:
docker build -t ghcr.io/azure/images/host-ubuntu2404:v1 .
docker push ghcr.io/azure/images/host-ubuntu2404:v1
See images/host-ubuntu2404/ for an example Containerfile.
metadata.yaml
dhcpBootImageName: shimx64.efi
The dhcpBootImageName field specifies the boot filename included in DHCP
responses (option 67).
CRD relationships
See Also
- SSH Guide – SSH provisioning walkthrough using these CRDs.
- PXE Guide – Bare-metal provisioning walkthrough using Machine and OCI netboot images.
- Networking CRDs – Site, GatewayPool, and related CRDs from unbounded-net.
- CLI Reference – The kubectl unbounded commands that create these resources.
- Architecture – How these CRDs drive the provisioning pipelines.