Solved: I built an automated Talos + Proxmox + GitOps homelab starter (ArgoCD + Workflows + DR)

Published: January 1, 2026 at 06:24 PM EST
7 min read
Source: Dev.to

Executive Summary

TL;DR: This post describes an automated, resilient homelab that replaces manual, inconsistent, and fragile setups. It combines Talos Linux, Proxmox, and a GitOps approach using ArgoCD and Argo Workflows for infrastructure provisioning, application management, and disaster recovery.

🎯 Key Takeaways

  • Proxmox VE + Talos Linux – a robust, API‑driven foundation for automated VM provisioning and a secure, immutable Kubernetes OS.
  • ArgoCD – implements a GitOps workflow that continuously syncs Kubernetes cluster configurations and applications from a Git repository, eliminating configuration drift and enabling automated deployments.
  • Argo Workflows – orchestrates complex operational tasks such as automated backups (Proxmox VMs via PBS, Kubernetes apps via Velero) and disaster‑recovery testing, greatly enhancing homelab resilience and recovery capabilities.

Building a robust, automated homelab or small‑scale IT environment presents unique challenges. This post details how integrating Talos Linux, Proxmox, and a GitOps approach with ArgoCD, Argo Workflows, and strategic Disaster Recovery (DR) can transform a manual, fragile setup into a resilient, self‑healing system.

Symptoms: The Homelab Headache

Many IT professionals building or maintaining homelabs encounter a recurring set of frustrations that hinder scalability, reliability, and efficient management. These symptoms usually stem from a lack of automation and a reactive approach to infrastructure.

1. Manual VM & Kubernetes provisioning

  • What happens: New virtual machines are created on hypervisors (e.g., Proxmox) → OS is installed manually → networking is configured → Kubernetes cluster is bootstrapped.
  • Impact:
    • Extremely time‑consuming.
    • Prone to human error.
    • Each node becomes a “snowflake,” making consistency impossible.

2. Configuration drift & inconsistency

  • What happens: Manual tweaks to VMs, Kubernetes manifests, or network settings diverge from the intended state.
  • Impact:
    • Environments quickly lose alignment with the desired configuration.
    • Troubleshooting becomes difficult.
    • Deployments become unreliable because the desired state isn’t codified or enforced.

3. Lack of automated deployments & updates

  • What happens: Deploying new apps, updating services, or patching the OS requires manual SSH sessions, ad‑hoc scripts, or dashboard clicks.
  • Impact:
    • Slow, inefficient workflow.
    • Increased risk of downtime or unexpected failures.

4. Fragile disaster‑recovery (DR) strategy

  • What happens: No clear, automated DR plan; backups are manual, often outdated, and recovery procedures are untested.
  • Impact:
    • A single hardware failure or misconfiguration can cause data loss.
    • Service outages become prolonged and complex to resolve.

5. Operational burden of Kubernetes

  • What happens: Managing the control plane, keeping nodes up‑to‑date, and ensuring application resilience require constant attention.
  • Impact:
    • High operational overhead.
    • Complexity can quickly overwhelm a homelab enthusiast without automation.

Solution 1: Proxmox + Talos for a Robust & Minimalist Infrastructure Base

The foundation of a reliable homelab begins with a solid, automated infrastructure layer. This solution combines Proxmox VE for virtualization with Talos Linux for a secure, minimal, and immutable Kubernetes operating system.

Proxmox VE – The Virtualization Workhorse

Proxmox VE provides a powerful, open‑source platform for managing virtual machines, containers, and storage. Its API‑driven nature makes it an ideal candidate for infrastructure automation, allowing you to provision VMs programmatically instead of relying on manual GUI clicks.

Example: Automating VM Provisioning (Conceptual)

#!/usr/bin/env bash
# Basic VM creation using qm (simplified for illustration)
# In practice, wrap this in Terraform, Ansible, etc.
set -euo pipefail

VMID="101"
VMNAME="talos-node-01"
MEM="4096"          # 4 GB RAM (value is in MiB)
CPUS="2"
DISK_SIZE="32"      # GiB; qm expects a bare number when allocating a new disk
STORAGE="local-lvm"
ISO_STORAGE="local:iso"   # where a Talos ISO would live (unused in this simplified script)
OS_TYPE="l26"
NET_BRIDGE="vmbr0"

# 1️⃣ Create the VM
qm create "$VMID" \
    --name "$VMNAME" \
    --memory "$MEM" \
    --cores "$CPUS" \
    --ostype "$OS_TYPE"

# 2️⃣ Attach storage (allocates a new, empty volume)
qm set "$VMID" \
    --scsihw virtio-scsi-pci \
    --scsi0 "$STORAGE:$DISK_SIZE"

# 3️⃣ Add network
qm set "$VMID" \
    --net0 "virtio,bridge=$NET_BRIDGE"

# 4️⃣ Cloud‑Init drive (carries configuration data; it is not bootable)
qm set "$VMID" \
    --ide2 "$STORAGE:cloudinit"

# 5️⃣ Set boot order
# The new disk is empty: import a Talos disk image into it first, or attach
# the Talos ISO and add that drive to the boot order.
qm set "$VMID" \
    --boot "order=scsi0"

# 6️⃣ Start the VM
qm start "$VMID"

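Note: Talos does not run an installer command from Cloud‑Init and does not consume Ignition files. When the VM boots a Talos "nocloud" image, the Cloud‑Init user data should carry the Talos machine configuration (for example controlplane.yaml or worker.yaml).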
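To show how the single‑VM script generalizes, here is a minimal sketch that provisions several nodes from one list. The VM IDs, names, and sizes are illustrative assumptions, and the script defaults to a dry run that only prints the qm commands it would execute.

```shell
#!/usr/bin/env bash
# Sketch: provision several Talos VMs from one node list.
# VM IDs, names, and sizes below are illustrative assumptions.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the qm commands, 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# provision_vm <vmid> <name> <memory-MiB> <cores> <disk-GiB>
provision_vm() {
    run qm create "$1" --name "$2" --memory "$3" --cores "$4" --ostype l26
    run qm set "$1" --scsihw virtio-scsi-pci --scsi0 "local-lvm:$5"
    run qm set "$1" --net0 "virtio,bridge=vmbr0"
}

# One line per node: vmid name memory cores disk
while read -r vmid name mem cores disk; do
    provision_vm "$vmid" "$name" "$mem" "$cores" "$disk"
done <<'EOF'
101 talos-cp-01 4096 2 32
102 talos-worker-01 8192 4 64
EOF
```

Run it on a Proxmox host with DRY_RUN=0 once the printed commands look right. Keeping the node list as data means adding a node is a one‑line change that can live in Git.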

Talos Linux – Kubernetes‑Native OS

Talos Linux is a secure, minimal, and immutable operating system designed specifically for running Kubernetes. It eliminates unnecessary components, reducing the attack surface and operational overhead. Its API‑driven management model aligns perfectly with a GitOps approach.

  • Minimal Footprint: No shell, no package manager, no unnecessary services.
  • Immutability: The OS never drifts; all changes are applied via atomic updates.
  • API‑Driven: Configuration and operations are performed via a gRPC API, ideal for automation.
  • Enhanced Security: Reduced attack surface and cryptographic integrity checks.

Example: Generating Talos Configuration

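#!/usr/bin/env bash
set -euo pipefail

# Replace the placeholders with your node addresses
CONTROL_PLANE_IP="<CONTROL_PLANE_IP>"
WORKER_IP="<NODE_IP>"

# 1️⃣ Generate cluster config (writes controlplane.yaml, worker.yaml, talosconfig)
talosctl gen config my-cluster "https://${CONTROL_PLANE_IP}:6443"

# 2️⃣ Apply to each node (--insecure because new nodes start in maintenance mode)
talosctl apply-config \
    --insecure \
    --nodes "$CONTROL_PLANE_IP" \
    --file controlplane.yaml
talosctl apply-config \
    --insecure \
    --nodes "$WORKER_IP" \
    --file worker.yaml

# 3️⃣ Bootstrap control plane (run once, against a single control‑plane node)
talosctl bootstrap \
    --talosconfig ./talosconfig \
    --endpoints "$CONTROL_PLANE_IP" \
    --nodes "$CONTROL_PLANE_IP"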

These commands are typically wrapped in CI/CD pipelines so the entire provisioning‑to‑bootstrap process is fully automated.

What’s Next?

The upcoming sections will cover:

  1. GitOps with ArgoCD – Keeping Kubernetes manifests in sync with a Git repository.
  2. Argo Workflows for Automation – Orchestrating backups, restores, and DR drills.
  3. Disaster Recovery Strategy – Using Proxmox Backup Server (PBS) and Velero to protect both VM and Kubernetes workloads.

Solution 1 – Talos Configuration (Bootstrap the Cluster)

# gen config takes no node lists; a node's role is decided by which
# generated file (controlplane.yaml or worker.yaml) you apply to it.
talosctl gen config my-talos-cluster https://192.168.1.10:6443 \
    --output ./cluster-configs \
    --with-kubespan

The command creates controlplane.yaml, worker.yaml, and talosconfig in ./cluster-configs. Apply the machine configs with:

# Control‑plane node (repeat for 192.168.1.11 and 192.168.1.12)
# --insecure is required while the node is still in maintenance mode
talosctl apply-config \
    --insecure \
    --nodes 192.168.1.10 \
    --file ./cluster-configs/controlplane.yaml

# Worker node (repeat for 192.168.1.14)
talosctl apply-config \
    --insecure \
    --nodes 192.168.1.13 \
    --file ./cluster-configs/worker.yaml
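Applying configs is only half of the bootstrap: etcd still has to be bootstrapped once, and you need a kubeconfig afterwards. The sketch below strings those steps together for every node. It reuses the IP addresses from the examples above, assumes the generated files live in ./cluster-configs, and defaults to a dry run that only prints the talosctl commands.

```shell
#!/usr/bin/env bash
# Sketch: apply machine configs to every node, then bootstrap etcd once.
# The IP addresses mirror the examples above; adjust them for your network.
set -eu

DRY_RUN="${DRY_RUN:-1}"            # 1 = print commands only
CONFIG_DIR="./cluster-configs"
CONTROL_PLANES="192.168.1.10 192.168.1.11 192.168.1.12"
WORKERS="192.168.1.13 192.168.1.14"

run() {
    if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# apply_role <config-file> <ip>...
apply_role() {
    file="$1"; shift
    for ip in "$@"; do
        # --insecure is needed while a node is still in maintenance mode
        run talosctl apply-config --insecure --nodes "$ip" --file "$file"
    done
}

rollout() {
    # Word splitting on the IP lists is intentional here.
    # shellcheck disable=SC2086
    apply_role "$CONFIG_DIR/controlplane.yaml" $CONTROL_PLANES
    # shellcheck disable=SC2086
    apply_role "$CONFIG_DIR/worker.yaml" $WORKERS

    first_cp="${CONTROL_PLANES%% *}"
    # Bootstrap exactly one control-plane node; the others join on their own.
    run talosctl bootstrap --talosconfig "$CONFIG_DIR/talosconfig" \
        --endpoints "$first_cp" --nodes "$first_cp"
    # Fetch a kubeconfig for kubectl once the control plane is up.
    run talosctl kubeconfig --talosconfig "$CONFIG_DIR/talosconfig" \
        --endpoints "$first_cp" --nodes "$first_cp"
}

rollout
```

In a real pipeline you would likely add a wait between the apply and bootstrap steps, since nodes reboot after receiving their configuration.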

Solution 2 – GitOps with Argo CD (Automated Configuration Management)

GitOps Principles

Principle          | Description
-------------------|------------------------------------------------------------
Declarative        | Desired state is declared in Git (YAML manifests).
Version‑controlled | All changes are committed, providing audit history and easy rollbacks.
Automated          | Git changes automatically trigger cluster updates.
Reconciled         | A controller continuously aligns the actual cluster state with the Git‑defined desired state.
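The "Reconciled" principle is easier to picture with a toy model. In the sketch below, one directory stands in for the Git checkout and another for the live cluster. This is only an illustration of the idea and not how Argo CD is implemented; the file names are made up, and it handles flat directories of files only.

```shell
#!/usr/bin/env bash
# Toy model of a GitOps reconcile loop: "desired" stands in for the Git
# checkout and "live" for the cluster. This is NOT how Argo CD is implemented.
set -eu

# reconcile <desired-dir> <live-dir>: make live match desired, report changes
reconcile() {
    desired="$1"; live="$2"
    # Create or repair anything that is missing or has drifted (self-heal).
    for src in "$desired"/*; do
        [ -e "$src" ] || continue
        name="$(basename "$src")"
        if ! cmp -s "$src" "$live/$name"; then
            cp "$src" "$live/$name"
            echo "synced $name"
        fi
    done
    # Remove anything that is no longer declared (prune).
    for dst in "$live"/*; do
        [ -e "$dst" ] || continue
        name="$(basename "$dst")"
        if [ ! -e "$desired/$name" ]; then
            rm "$dst"
            echo "pruned $name"
        fi
    done
}

work="$(mktemp -d)"
mkdir "$work/desired" "$work/live"
echo "replicas: 2" > "$work/desired/deployment.yaml"
echo "replicas: 5" > "$work/live/deployment.yaml"   # manual drift
echo "stale"       > "$work/live/old-service.yaml"  # no longer in Git

reconcile "$work/desired" "$work/live"   # repairs the drift, prunes the stale file
reconcile "$work/desired" "$work/live"   # second pass finds nothing to do
```

The second call printing nothing is the point: once the live state matches the declared state, reconciliation is a no‑op, which is what makes running it continuously safe.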

Argo CD – The GitOps Controller

Key features include automated sync, rollback/roll‑forward, health monitoring, and multi‑cluster support.

Deploying an Application with Argo CD

# applications/argocd/application-nginx.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: nginx-hello-world
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/my-homelab-gitops.git
    targetRevision: HEAD
    path: applications/nginx-hello-world
  destination:
    server: https://kubernetes.default.svc
    namespace: default
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true

Repository layout (example)

my-homelab-gitops/
├── infrastructure/
│   └── talos/
│       └── cluster-config-patches/
├── applications/
│   ├── nginx-hello-world/
│   │   ├── deployment.yaml
│   │   └── service.yaml
│   └── argocd/
│       └── application-nginx.yaml
└── argocd-apps/
    ├── homelab-infra.yaml
    └── homelab-apps.yaml

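Committing the Application manifest is not enough on its own: Argo CD only acts on Application resources that exist in the cluster. Once the manifest has been applied (directly with kubectl, or through a parent application such as argocd-apps/homelab-apps.yaml that tracks the directory), Argo CD deploys and manages the nginx‑hello‑world app and keeps it in sync with Git.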
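One common way to get Application manifests into the cluster without applying each one by hand is the "app of apps" pattern: a single parent Application that tracks the directory holding the child manifests. The sketch below writes such a parent manifest. Its contents are an assumption about what argocd-apps/homelab-apps.yaml could look like, since the original layout only names the file.

```shell
#!/usr/bin/env bash
# Sketch: write a parent "app of apps" manifest that tracks the directory
# holding the child Application manifests. The file contents are an assumption
# about what argocd-apps/homelab-apps.yaml could look like.
set -eu

OUT_DIR="${OUT_DIR:-$(mktemp -d)}"
REPO_URL="https://github.com/your-org/my-homelab-gitops.git"

cat > "$OUT_DIR/homelab-apps.yaml" <<EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: homelab-apps
  namespace: argocd
spec:
  project: default
  source:
    repoURL: $REPO_URL
    targetRevision: HEAD
    path: applications/argocd
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
EOF

echo "Wrote $OUT_DIR/homelab-apps.yaml"
echo "Apply it once with: kubectl apply -n argocd -f $OUT_DIR/homelab-apps.yaml"
```

After that one manual apply, every Application manifest committed under applications/argocd is picked up by the parent, so new apps need only a Git commit.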

Solution 3 – Argo Workflows & Integrated DR (Operational Automation & Resilience)

Typical Homelab Use Cases

Use Case                      | Description
------------------------------|------------------------------------------------------------
Automated Backups             | Trigger Proxmox VM backups and Velero Kubernetes backups.
DR Testing                    | Spin up test environments, restore backups, validate services.
Infrastructure Provisioning   | Orchestrate creation of new Talos nodes on Proxmox.
Application Release Pipelines | Manage complex deployments with pre‑/post‑hooks.

Example: Conceptual Backup Workflow

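apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: backup-and-verify-
spec:
  entrypoint: backup-and-verify
  templates:
  - name: backup-and-verify
    steps:
    - - name: snapshot-vm
        template: vm-snapshot
    - - name: backup-k8s
        template: k8s-backup
    # Both verification steps run in parallel
    - - name: verify-snapshot
        template: verify-snapshot
      - name: verify-k8s-backup
        template: verify-k8s-backup

  # "proxmox-cli" is a placeholder for whatever Proxmox API client your image ships
  - name: vm-snapshot
    container:
      image: your-registry/proxmox-cli:latest
      command: ["/bin/sh", "-c"]
      args:
        - |
          echo "Creating snapshot for VM 101..."
          proxmox-cli snapshot create --vm-id 101 --name "backup-$(date +%s)"

  # The workflow name keeps the backup name unique, so nightly runs do not collide
  - name: k8s-backup
    container:
      image: velero/velero:latest
      command: ["/velero", "backup", "create", "daily-backup-{{workflow.name}}", "--wait"]

  - name: verify-snapshot
    container:
      image: your-registry/proxmox-cli:latest
      command: ["/bin/sh", "-c"]
      args:
        - |
          echo "Verifying Proxmox snapshot..."
          proxmox-cli snapshot list --vm-id 101 | grep backup-

  # Placeholder image: verification needs a shell, grep, and the velero CLI together
  - name: verify-k8s-backup
    container:
      image: your-registry/velero-cli:latest
      command: ["/bin/sh", "-c"]
      args:
        - |
          echo "Verifying Velero backup..."
          velero backup get "daily-backup-{{workflow.name}}" | grep Completed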

Schedule this workflow with a CronWorkflow for nightly execution, add alerting, and extend it with restoration steps for full DR testing.
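As a rough illustration of the scheduling step, the helper below prints a CronWorkflow manifest. It assumes the backup workflow has been saved as a WorkflowTemplate named backup-and-verify; the resource name and the 02:00 schedule are made up for the example. Depending on your Argo Workflows version, the schedule may need to go in a schedules list instead of the single schedule field, so check the documentation for the release you run.

```shell
#!/usr/bin/env bash
# Sketch: print a CronWorkflow manifest that runs a WorkflowTemplate nightly.
# Names and the schedule are illustrative assumptions.
set -eu

# cron_workflow <name> <cron-schedule> <workflow-template-name>
cron_workflow() {
    cat <<EOF
apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
  name: $1
spec:
  schedule: "$2"
  concurrencyPolicy: Forbid
  workflowSpec:
    workflowTemplateRef:
      name: $3
EOF
}

# Nightly at 02:00; Forbid skips a run if the previous one is still going.
cron_workflow nightly-backup "0 2 * * *" backup-and-verify
```

You can commit the printed manifest to the GitOps repository so Argo CD manages the schedule, or pipe it straight into kubectl apply -f - while experimenting.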

Integrated Disaster Recovery (DR) Overview

  • Infrastructure as Code: Rebuild Proxmox + Talos from Git after a disaster.
  • ArgoCD: Sync applications automatically to a fresh cluster.
  • Proxmox Backup Server (PBS): VM‑level backups for base OS and stateful workloads.
  • Velero: Kubernetes‑native backups of resources and persistent volumes.
  • Argo Workflows: Automate the entire recovery pipeline, from VM provisioning to backup restoration and health verification.
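The recovery order implied by the list above can be sketched as a small runbook script. The helper scripts under ./scripts and the backup name are hypothetical placeholders, and the script defaults to a dry run that prints each step instead of executing it. It also assumes Velero is installed again (for example through Argo CD) before the final restore runs.

```shell
#!/usr/bin/env bash
# Sketch of a recovery runbook in dry-run form. Script names under ./scripts
# and the backup name are hypothetical placeholders.
set -eu

DRY_RUN="${DRY_RUN:-1}"                      # 1 = print commands only
BACKUP_NAME="${BACKUP_NAME:-daily-backup}"
ARGOCD_INSTALL="https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml"

run() {
    if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

recover() {
    echo "[1/5] Recreate VMs on Proxmox"
    run ./scripts/provision-vms.sh
    echo "[2/5] Configure and bootstrap Talos"
    run ./scripts/bootstrap-talos.sh
    echo "[3/5] Install Argo CD"
    run kubectl create namespace argocd
    run kubectl apply -n argocd -f "$ARGOCD_INSTALL"
    echo "[4/5] Hand the cluster back to Git"
    run kubectl apply -n argocd -f argocd-apps/
    echo "[5/5] Restore Kubernetes state from Velero"
    run velero restore create --from-backup "$BACKUP_NAME" --wait
}

recover
```

Once each step is reliable on its own, the same sequence translates naturally into an Argo Workflow running on a separate management cluster, since the cluster being recovered cannot orchestrate its own rebuild.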

Feature Comparison

Feature                 | Manual DR Strategy                        | Automated GitOps DR Strategy
------------------------|-------------------------------------------|----------------------------------------------
RTO                     | Hours to days                             | Minutes to hours
RPO                     | Variable, depends on last manual backup   | Low, frequent automated backups
Consistency             | Highly variable, prone to human error     | High, enforced by Git and automation
Testing                 | Infrequent, disruptive                    | Frequent, automated, non‑disruptive (sandbox)
Infrastructure Recovery | Manual VM recreation, OS install          | Automated provisioning via IaC
Application Recovery    | Manual redeployment, config, data restore | ArgoCD auto‑sync, Velero restore
Complexity              | High for large environments               | High initial setup, low ongoing maintenance
Operational Cost        | High labor, extended downtime             | Lower labor, quicker recovery, reduced impact
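The RPO row is worth making concrete: with scheduled backups, the worst‑case data loss is roughly the time since the last successful backup. The following sketch checks a backup timestamp against an RPO target. The 24‑hour target and the 26‑hour‑old backup are made‑up numbers for illustration.

```shell
#!/usr/bin/env bash
# Sketch: check whether the newest backup still satisfies an RPO target.
# The timestamps and the 24-hour target are illustrative assumptions.
set -eu

# backup_age_hours <backup-epoch> <now-epoch>  (whole hours, rounded down)
backup_age_hours() {
    echo $(( ($2 - $1) / 3600 ))
}

# within_rpo <backup-epoch> <now-epoch> <rpo-hours>
# Succeeds when the backup is no older than the RPO target.
within_rpo() {
    [ $(( $2 - $1 )) -le $(( $3 * 3600 )) ]
}

now="$(date +%s)"
last_backup=$(( now - 26 * 3600 ))   # pretend the last backup ran 26 hours ago

if within_rpo "$last_backup" "$now" 24; then
    echo "RPO met"
else
    echo "RPO violated: last backup is $(backup_age_hours "$last_backup" "$now")h old"
fi
```

A check like this fits well as the final step of a nightly workflow, turning a silently failing backup job into an alert.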

Conclusion

By adopting a comprehensive strategy that leverages

  • Proxmox for virtualization
  • Talos Linux for a minimalist Kubernetes OS
  • GitOps driven by Argo CD and Argo Workflows for automation and disaster recovery

you can transform your homelab into a self‑healing, consistent, and secure environment. The upfront effort pays off in stability, scalability, and peace of mind, letting you focus on experimentation rather than firefighting.

Darian Vance

👉 Read the original article on TechResolve.blog
