Robos News

Robotics Deployment & Automation: Docker, Kubernetes, and CI/CD Pipelines for Robot Systems (2026 Guide)

Standard DevOps documentation covers general software deployment; this guide covers the additional constraints — real-time kernels, hardware device access, DDS middleware, and multi-architecture ARM builds — that robotics teams must solve before those tools work on a robot fleet. If you've tried to apply a standard Kubernetes tutorial to a ROS 2 system and hit a wall, this is the guide that fills the gap.

Why Robotics Deployment Is Different: Constraints That General DevOps Docs Ignore

General DevOps tooling assumes your workload is stateless, runs on commodity x86 hardware, and communicates over standard TCP/IP. Robot software breaks all three assumptions.

Real-time OS requirements. Many robot control loops require deterministic scheduling — jitter measured in microseconds, not milliseconds. This means your container host may need a PREEMPT_RT-patched kernel or a dedicated RTOS partition. Docker and Kubernetes do not configure this for you; it must be baked into the node image before orchestration begins.
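A provisioning script or container entrypoint can check the host kernel before starting control nodes. The sketch below is a heuristic, not an authoritative test: it assumes an RT-patched kernel exposes /sys/kernel/realtime containing "1" and reports PREEMPT_RT (or "PREEMPT RT" on older builds) in its version string. The function names are hypothetical.

```python
from pathlib import Path
from typing import Optional
import platform


def looks_like_preempt_rt(kernel_version: str, realtime_flag: Optional[str]) -> bool:
    """Heuristic: True if the inputs suggest a PREEMPT_RT kernel."""
    # Assumption: RT-patched kernels expose /sys/kernel/realtime containing "1".
    if realtime_flag is not None and realtime_flag.strip() == "1":
        return True
    # Fall back to the build string (what `uname -v` prints).
    # Older kernels print "PREEMPT RT", newer ones "PREEMPT_RT".
    normalized = kernel_version.upper().replace(" ", "_")
    return "PREEMPT_RT" in normalized


def host_is_preempt_rt() -> bool:
    """Read the two indicators from the running host and apply the heuristic."""
    flag_path = Path("/sys/kernel/realtime")
    flag = flag_path.read_text() if flag_path.exists() else None
    return looks_like_preempt_rt(platform.version(), flag)
```

Because containers share the host kernel, calling host_is_preempt_rt() inside a container reports on the node itself, which is the property that matters for the control loop.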

Hardware-in-the-loop access. Robots depend on physical devices: LiDAR over USB or PCIe, IMUs on serial buses, GPU accelerators for perception. Containers need explicit --device flags or Kubernetes device plugins (such as the NVIDIA GPU operator or custom USB device plugins) to expose these safely. Forgetting this is the single most common reason a "working" container image fails on the actual robot.
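As a rough illustration of the explicit --device approach, the helper below assembles a docker run command that maps only the named device nodes. The image name and device paths in the usage note are placeholders, and the helper itself is hypothetical; only the docker run --device flag comes from Docker.

```python
def docker_run_args(image: str, devices: list) -> list:
    """Build a `docker run` argv that exposes only the listed device nodes."""
    args = ["docker", "run", "--rm"]
    for dev in devices:
        # Refuse anything outside /dev so a typo cannot silently map a host directory.
        if not dev.startswith("/dev/"):
            raise ValueError(f"not a device node path: {dev}")
        # Map the host device to the same path inside the container.
        args += ["--device", f"{dev}:{dev}"]
    args.append(image)
    return args
```

For example, docker_run_args("registry.example.com/robot/lidar-driver:jazzy-0123abc", ["/dev/ttyUSB0"]) yields a command that exposes a single serial device rather than the whole of /dev.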

ROS 2 lifecycle and DDS middleware. ROS 2 managed (lifecycle) nodes move through four primary states: unconfigured, inactive, active, and finalized. Kubernetes health probes should reflect these lifecycle states, for example through an exec probe that queries the node, rather than a generic HTTP endpoint. DDS, the underlying transport, uses multicast discovery by default, and most Kubernetes overlay networks do not forward multicast. Teams must configure a unicast peer list, run a discovery server, or use a bridge (such as the Zenoh bridge for ROS 2/DDS or eProsima's DDS Router) so that nodes in different pods can discover each other.
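One way to map lifecycle states to probes is a small script run by an exec probe, which turns the node's reported state into an exit code. The sketch below assumes the state arrives as text in the form "active [3]" (which is what ros2 lifecycle get is expected to print); the state-to-probe mapping is one reasonable policy, not a standard.

```python
import re

# Policy assumption: only "active" nodes should receive work (readiness),
# while any non-finalized primary state counts as alive (liveness).
READY_STATES = {"active"}
ALIVE_STATES = {"unconfigured", "inactive", "active"}


def parse_lifecycle_state(cli_output: str) -> str:
    """Extract the state label from text such as 'active [3]'."""
    match = re.match(r"\s*([a-z_]+)", cli_output.lower())
    if match is None:
        raise ValueError(f"unrecognized lifecycle output: {cli_output!r}")
    return match.group(1)


def probe_exit_code(cli_output: str, probe: str) -> int:
    """Return 0 (probe passes) or 1 (probe fails) for the given probe type."""
    if probe not in ("readiness", "liveness"):
        raise ValueError(f"unknown probe type: {probe}")
    state = parse_lifecycle_state(cli_output)
    allowed = READY_STATES if probe == "readiness" else ALIVE_STATES
    return 0 if state in allowed else 1
```

With this policy an inactive node fails readiness but passes liveness, so Kubernetes stops treating it as ready without restarting it.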


Containerizing Robot Software: Docker and ROS 2 in Practice

Running ROS 2 in Docker for production is viable and increasingly standard, but requires deliberate image design.

Multi-architecture builds for ARM. Most robot compute boards (NVIDIA Jetson, Raspberry Pi CM4, NXP i.MX) are ARM-based. Use docker buildx with --platform linux/arm64,linux/amd64 to produce multi-arch manifests. Pin your base image to a specific ROS 2 release tag (e.g., ros:jazzy-ros-base) aligned with your REP-2000 target platform to avoid silent ABI mismatches between releases.
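A CI script might assemble the buildx invocation like this. It is a minimal sketch: the helper name and the registry path in the usage note are placeholders, and it assumes a buildx builder with multi-platform support is already configured on the runner.

```python
def buildx_command(image: str, tag: str,
                   platforms=("linux/arm64", "linux/amd64"),
                   context: str = ".") -> list:
    """Build the argv for a multi-arch `docker buildx build` that pushes a manifest list."""
    return [
        "docker", "buildx", "build",
        "--platform", ",".join(platforms),  # one manifest covering every listed arch
        "--tag", f"{image}:{tag}",
        "--push",                           # multi-arch results go to a registry
        context,
    ]
```

Passing the result to subprocess.run(..., check=True) executes the build; for example, buildx_command("registry.example.com/robot/nav", "jazzy-0123abc") targets both ARM64 boards and x86 development machines from one Dockerfile.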

Sensor driver access. Mount /dev selectively rather than running --privileged. Use udev rules on the host and map only the required device nodes. For cameras and LiDAR, also consider sharing the host network namespace (--network host) to avoid DDS discovery issues — though this trades isolation for compatibility.

DDS networking inside containers. Set the same ROS_DOMAIN_ID in the environment of every pod that should communicate. For multi-host fleets using Fast DDS, point FASTRTPS_DEFAULT_PROFILES_FILE at an XML profile (or use your DDS vendor's equivalent) that disables multicast and either lists explicit peer IPs or uses a discovery server. Without this, nodes in separate pods will typically not find each other.
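The discovery-server variant can be expressed as pod environment variables. The sketch below assumes Fast DDS as the RMW implementation, where ROS_DISCOVERY_SERVER takes a semicolon-separated list of host:port entries; the helper name and addresses are illustrative, and the 0 to 101 range check follows the ROS 2 guidance on domain IDs that are safe across platforms.

```python
def dds_env(domain_id: int, discovery_servers: list) -> list:
    """Return Kubernetes-style env entries for ROS 2 discovery settings."""
    # ROS 2 documentation recommends domain IDs 0..101 for cross-platform safety.
    if not 0 <= domain_id <= 101:
        raise ValueError("ROS_DOMAIN_ID should be between 0 and 101")
    # Kubernetes requires env values to be strings.
    env = [{"name": "ROS_DOMAIN_ID", "value": str(domain_id)}]
    if discovery_servers:
        servers = ";".join(f"{host}:{port}" for host, port in discovery_servers)
        # Assumption: Fast DDS reads this variable to locate discovery servers.
        env.append({"name": "ROS_DISCOVERY_SERVER", "value": servers})
    return env
```

The returned list can be placed under a container's env key so that every pod in the fleet shares one definition.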


Orchestrating Robot Fleets with Kubernetes: Edge Deployments, K3s, and KubeEdge

Full Kubernetes is often too heavy for a single robot's onboard compute. The practical split is:

  • K3s for capable edge nodes (Jetson AGX, industrial PCs): lightweight, single-binary, supports standard Kubernetes manifests with minimal overhead.
  • KubeEdge when you need cloud-to-edge synchronization: it extends the Kubernetes control plane to edge nodes that may be intermittently connected, caching desired state locally so workloads keep running during network partitions and resynchronize on reconnect.
  • Full Kubernetes (e.g., RKE2) for fixed infrastructure like robot workcells on a factory LAN with reliable connectivity.

Map ROS 2 node groups to Kubernetes DaemonSets for per-node services (hardware drivers) and Deployments for replicated services (perception pipelines). Use nodeAffinity rules to pin hardware-dependent pods to the specific robot node that has the required device.
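As an illustration of the DaemonSet plus nodeAffinity pattern, the following sketch builds a manifest as a Python dictionary that can be serialized to JSON for kubectl apply -f. The node label key and image are hypothetical, and device exposure (device plugin resources or similar) is deliberately left out.

```python
import json


def driver_daemonset(name: str, image: str, device_label: str) -> dict:
    """DaemonSet that schedules only on nodes labelled `<device_label>=true`."""
    labels = {"app": name}
    affinity = {
        "nodeAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": {
                "nodeSelectorTerms": [{
                    "matchExpressions": [
                        {"key": device_label, "operator": "In", "values": ["true"]}
                    ]
                }]
            }
        }
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name},
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "affinity": affinity,
                    "containers": [{"name": name, "image": image}],
                },
            },
        },
    }


def to_json(manifest: dict) -> str:
    """Serialize a manifest; kubectl accepts JSON as well as YAML."""
    return json.dumps(manifest, indent=2)
```

Labelling a robot node with the chosen key (for example robotics.example.com/lidar=true) is then enough for the driver pod to land on it and nowhere else.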

For rolling updates to a live fleet, set maxUnavailable: 1 and use preStop lifecycle hooks to trigger a ROS 2 node shutdown via the lifecycle service call before the container terminates — preventing abrupt mid-operation failures.
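The rolling-update settings described above might look like the following Deployment spec fragment, again built as a Python dictionary. It assumes the container image provides bash and a ROS 2 install under /opt/ros/$ROS_DISTRO, and that the target is a managed node that accepts the shutdown transition; the node, container, and image names are placeholders.

```python
def rolling_update_spec(container: str, image: str, ros_node: str,
                        grace_seconds: int = 30) -> dict:
    """Deployment spec fragment: one pod down at a time, graceful ROS 2 shutdown."""
    # Assumption: sourcing setup.bash makes the `ros2` CLI available in the hook.
    shutdown_cmd = (
        "source /opt/ros/$ROS_DISTRO/setup.bash && "
        f"ros2 lifecycle set {ros_node} shutdown"
    )
    return {
        "strategy": {
            "type": "RollingUpdate",
            "rollingUpdate": {"maxUnavailable": 1},
        },
        "template": {
            "spec": {
                # Give the preStop hook time to finish before SIGKILL.
                "terminationGracePeriodSeconds": grace_seconds,
                "containers": [{
                    "name": container,
                    "image": image,
                    "lifecycle": {
                        "preStop": {
                            "exec": {"command": ["/bin/bash", "-c", shutdown_cmd]}
                        }
                    },
                }],
            }
        },
    }
```

The grace period should exceed the longest expected shutdown time, since Kubernetes kills the container once it expires regardless of the hook's progress.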


Infrastructure as Code for Robotics Labs and Production Lines

Terraform handles cloud and virtualized infrastructure: provisioning the Kubernetes control plane, container registries, artifact storage, and VPN gateways that connect your robot fleet to CI systems. Use Terraform workspaces to separate staging (simulation) and production (physical robot) environments.

Ansible is the right tool for the physical layer that Terraform cannot reach: configuring real-time kernel parameters (/etc/sysctl.d/), installing vendor GPU drivers, setting udev rules, and bootstrapping K3s on bare-metal robot compute boards. A typical robotics Ansible role sequence:

  1. Apply RT kernel and CPU isolation (isolcpus) settings
  2. Install container runtime and device plugins
  3. Join the node to the K3s cluster
  4. Deploy base ROS 2 system packages and DDS configuration

Keep hardware-specific variables in Ansible inventory host vars, not in playbooks, so the same playbook applies to heterogeneous robot hardware.
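To show why per-host variables keep a playbook generic, the sketch below derives the isolcpus kernel argument from host-specific data. The host names, the rt_isolated_cpus variable, and the helper are hypothetical; in Ansible the same logic would typically live in a template that reads host vars.

```python
# Hypothetical per-host data, mirroring what inventory host vars might hold.
HOST_VARS = {
    "jetson-agx-01": {"rt_isolated_cpus": [4, 5]},
    "ipc-02": {"rt_isolated_cpus": [2, 3, 6]},
    "sim-runner": {"rt_isolated_cpus": []},
}


def isolcpus_arg(isolated_cpus: list) -> str:
    """Render the kernel command-line argument for CPU isolation, or '' if none."""
    if not isolated_cpus:
        return ""
    # Sort and de-duplicate so the rendered value is stable between runs.
    cpu_list = ",".join(str(cpu) for cpu in sorted(set(isolated_cpus)))
    return f"isolcpus={cpu_list}"


def render_all(host_vars: dict) -> dict:
    """Apply the same rendering logic to every host in the inventory."""
    return {host: isolcpus_arg(v["rt_isolated_cpus"]) for host, v in host_vars.items()}
```

The rendering logic is identical for every board; only the data differs, which is the separation the host-vars advice is aiming for.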


GitOps CI/CD Pipelines for Robot Firmware and Software

GitHub Actions and GitLab CI are among the platforms most commonly used by ROS 2 teams for building and testing packages, reflecting the broader shift toward treating robot software like production software.

Build and test stage (GitHub Actions / GitLab CI). Use the ros-tooling/setup-ros action or an equivalent to install ROS 2 dependencies. Run colcon build and colcon test in a matrix across your target ROS 2 distros (aligned with REP-2000 release targets). Build ARM artifacts under QEMU emulation or on a native ARM runner, and run the tests there too, to catch architecture-specific bugs before they reach hardware.

Artifact promotion. Push passing multi-arch images to a container registry with semantic version tags, and also tag each image by Git SHA and ROS 2 distro so a deployment can be traced to an exact commit. Never deploy latest to production robots.
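A pipeline can enforce the tagging rule mechanically. The sketch below uses a hypothetical tag format of distro-shortsha and a simple check that rejects untagged or latest references; the function names and the format itself are assumptions, not a convention from any particular registry.

```python
import re


def image_tag(distro: str, git_sha: str, short: int = 12) -> str:
    """Compose a tag such as 'jazzy-0123abcd4567' from a distro and a commit SHA."""
    if not re.fullmatch(r"[a-z]+", distro):
        raise ValueError(f"unexpected distro name: {distro!r}")
    if not re.fullmatch(r"[0-9a-f]{7,40}", git_sha):
        raise ValueError(f"not a hex commit SHA: {git_sha!r}")
    return f"{distro}-{git_sha[:short]}"


def is_deployable(image_ref: str) -> bool:
    """Reject references that are untagged or tagged 'latest'."""
    if "@sha256:" in image_ref:
        return True  # pinned by digest, which is immutable
    # Only the last path component can carry a tag; a registry host may contain ':port'.
    last = image_ref.rsplit("/", 1)[-1]
    if ":" not in last:
        return False
    tag = last.split(":", 1)[1]
    return tag not in ("", "latest")
```

Running is_deployable over every image in a manifest before it is merged gives the GitOps repository a cheap guard against mutable tags.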

GitOps delivery with Argo CD. Store Kubernetes manifests in a dedicated Git repo. Argo CD watches this repo and syncs changes to the fleet. For robot fleets, use Argo CD ApplicationSets to generate per-robot or per-site applications from a single template, enabling fleet-wide rollouts with per-robot override capability.
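A per-robot ApplicationSet using the list generator could be generated along these lines. This is a sketch under assumptions: the repository URL, the overlays/<robot> directory layout, the namespaces, and the cluster names are all placeholders, and each robot is assumed to be registered in Argo CD as a named cluster.

```python
def fleet_applicationset(repo_url: str, robots: list) -> dict:
    """ApplicationSet that stamps out one Application per robot from one template."""
    elements = [{"robot": r["name"], "cluster": r["cluster"]} for r in robots]
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "ApplicationSet",
        "metadata": {"name": "robot-fleet", "namespace": "argocd"},
        "spec": {
            # List generator: each element supplies the template parameters.
            "generators": [{"list": {"elements": elements}}],
            "template": {
                "metadata": {"name": "{{robot}}-stack"},
                "spec": {
                    "project": "default",
                    "source": {
                        "repoURL": repo_url,
                        "targetRevision": "main",
                        # Hypothetical layout: one overlay directory per robot.
                        "path": "overlays/{{robot}}",
                    },
                    "destination": {"name": "{{cluster}}", "namespace": "ros"},
                },
            },
        },
    }
```

Per-robot overrides then live in each overlay directory, while adding a robot to the fleet is a one-line change to the element list.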

Staged rollout pattern:

  1. Simulation environment (software-in-the-loop)
  2. Hardware-in-the-loop test rig
  3. Canary robot (single unit in production)
  4. Full fleet rollout
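
The staged pattern above amounts to a small promotion gate: a release advances one stage at a time and holds where its checks fail. A minimal sketch follows, with hypothetical stage identifiers standing in for the four steps.

```python
# Hypothetical stage identifiers, in promotion order.
STAGES = ("simulation", "hil-rig", "canary", "fleet")


def next_stage(current: str, checks_passed: bool) -> str:
    """Return the stage a release should be in after evaluating `current`."""
    if current not in STAGES:
        raise ValueError(f"unknown stage: {current}")
    if not checks_passed:
        return current  # hold at the failing stage; never skip ahead
    index = STAGES.index(current)
    # The last stage has nowhere further to go.
    return STAGES[min(index + 1, len(STAGES) - 1)]
```

A pipeline would call next_stage after each stage's tests and only update the manifests for the following environment when the returned stage differs from the current one.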

Deployment Checklist and Tool Selection Matrix

Pre-deployment checklist:

  • Host kernel patched for real-time if required by control loop
  • Device plugins configured for all hardware peripherals
  • DDS discovery configured for container network (unicast or discovery server)
  • Multi-arch image built and tested on target ARM board
  • ROS 2 lifecycle probes mapped to Kubernetes readiness/liveness checks
  • maxUnavailable set appropriately for fleet rolling update policy
  • Secrets (API keys, TLS certs) managed via Kubernetes Secrets or Vault — not baked into images

Tool selection matrix:

Scenario                                   | Recommended stack
Single robot, onboard compute only         | Docker Compose + systemd
Small fleet (2–20 robots), reliable LAN    | K3s + Argo CD + GitHub Actions
Large fleet, intermittent connectivity     | KubeEdge + Argo CD ApplicationSets
Mixed cloud + edge (simulation + physical) | Terraform (cloud) + Ansible (edge) + K3s
Safety-critical, hard real-time control    | Bare metal + RTOS, containerize only non-RT nodes

Frequently asked questions

Can you run ROS 2 in a Docker container for production robot deployment?

Yes — ROS 2 runs reliably in Docker containers for production use, provided you address three non-obvious requirements: expose hardware devices explicitly via device flags or Kubernetes device plugins rather than running privileged containers; configure DDS discovery for unicast or a discovery server since multicast is typically blocked in container networks; and build multi-architecture images if your robot uses ARM hardware. Teams using this approach pin images to specific ROS 2 release tags per REP-2000 targets to avoid ABI drift between distro updates.

What is the best CI/CD pipeline setup for deploying updates to a fleet of autonomous robots?

The most robust pattern combines GitHub Actions or GitLab CI for building and testing ROS 2 packages (including cross-compiled ARM artifacts), a container registry for versioned multi-arch images, and Argo CD for GitOps-based delivery to the fleet. Use Argo CD ApplicationSets to manage per-robot or per-site applications from a single template. Always stage rollouts: simulation → hardware-in-the-loop test rig → canary robot → full fleet. Never deploy untagged or `latest` images to production robots.

What is the difference between K3s and KubeEdge for robot fleet orchestration?

K3s is a lightweight Kubernetes distribution suited for capable edge nodes with reliable network connectivity; it can run the full Kubernetes API on the robot itself. KubeEdge extends a central Kubernetes control plane to edge nodes and is designed for intermittent connectivity: it caches desired state on the edge node, so robots keep running their workloads while the cloud connection is down and pick up pending updates once it returns. For most indoor robot fleets on a stable LAN, K3s is simpler; for outdoor or mobile fleets with unreliable uplinks, KubeEdge is the stronger choice.

How do you handle real-time kernel requirements when containerizing robot software?

The real-time kernel must be configured on the container host, not inside the container — containers share the host kernel, so a `PREEMPT_RT`-patched kernel on the host benefits all containers running on that node. Use Ansible to apply RT kernel installation, CPU isolation (`isolcpus`), and scheduler tuning as part of node provisioning. For hard real-time control loops, consider keeping those processes outside containers entirely and only containerizing the non-RT software stack (perception, planning, telemetry).

Robos News Newsroom
