I am working on setting up a home server but I want it to be reproducible if I need to make large changes, switch out hardware, or restore from a failure. What do you use to handle this?
Fleet from Rancher to deploy everything to k8s. Baremetal management with Tinkerbell and Metal3 to manage my OS deployments to baremetal in k8s. Harvester is the OS/K8S platform and all of its configs can be delivered at install or as cloudinit k8s objects. Ansible for the switches (as KubeOVN gets better in Harvester default separate hardware might be removed), I’m not brave enough for cross planning that yet. For backups I use Velero, and shoot that into the cloud encrypted, plus some nodes that I leave offline most of the time except to do backups and update them. I use Hauler manifests and a kube cronjob to grab images, helm charts, RPMs, and ISOs to a local store. I use SOPS to store the secrets I need to bootstrap in git. OpenTofu for application configs that are painful in helm. Ansible for everything else.
For total rebuilds I take all of that config and load it into a cloudinit script that I stick on a Rocky or sles iso that, assuming the network is up enough to configure, rebuilds from scratch, then I have a manual step to restore lost data.
That covers everything infra but physical layout in a git repo. Just got a pikvm v4 on order along with a pikvm switch, so hopefully I can get more of the junk on Metal3 for proper power control too and less IPXE shenanigans.
Next steps for me are CI/CD pipelines for deploying a mock version of the lab into Harvester as VMs, running integration tests, and, if they pass, merging the staged branch into prod. I do that manually a little already but would really like to automate it. Once I do that, I'll start running Renovate to grab the latest stable versions of stuff for me.
Definitely overkill lol. But I like it. Haven’t found a more complete solution that doesn’t feel like a comp sci dissertation yet.
The goal is pretty simple. Express as much as possible as helm values, k8s manifests, tofu, ansible, and cloud init, in that order of preference, because as you go up the stack you get more state management for “free”. Stick that in git and test and deploy from that source as much as possible. Everything else is just about getting there as fast as possible, and keeping the 3-2-1 rule alive and well for it all (3 copies, 2 different media, 1 off-site).
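For anyone who wants to sanity-check their own setup against the 3-2-1 rule mentioned above, here is a minimal sketch in Python. The `Copy` class, its fields, and the example entries are all hypothetical and only there for illustration; they are not taken from the setup described.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of the data. All fields are hypothetical labels."""
    name: str
    medium: str    # e.g. "nvme", "hdd", "object-storage"
    offsite: bool  # True if the copy lives somewhere else physically


def satisfies_3_2_1(copies: list[Copy]) -> bool:
    """True if there are >= 3 copies, on >= 2 media, with >= 1 off-site."""
    media = {c.medium for c in copies}
    return (
        len(copies) >= 3
        and len(media) >= 2
        and any(c.offsite for c in copies)
    )


# Hypothetical inventory, loosely shaped like "live data + offline node + cloud".
copies = [
    Copy("live cluster", "nvme", offsite=False),
    Copy("offline node", "hdd", offsite=False),
    Copy("cloud bucket", "object-storage", offsite=True),
]
print(satisfies_3_2_1(copies))
```

Something like this could run as a check in the same repo as the rest of the config, so the backup inventory gets reviewed along with everything else.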
Carefully
Ansible!
Incus and ansible
Git-controlled docker-compose files and backed-up docker data volumes. Pretty easy to go back to a point in time.
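A rough sketch of the "backed up data volumes" half, assuming the volumes are plain directories on the host (bind mounts or similar). The function name and the paths in the usage note are made up for illustration; tools like restic or borgmatic would do this better, this only shows the idea of timestamped point-in-time archives.

```python
import tarfile
from datetime import datetime, timezone
from pathlib import Path


def snapshot_volume(volume_dir: Path, backup_dir: Path) -> Path:
    """Write a timestamped .tar.gz of volume_dir into backup_dir.

    Returns the path of the archive that was created.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    # UTC timestamp in the file name so archives sort chronologically.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    archive = backup_dir / f"{volume_dir.name}-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the volume name.
        tar.add(str(volume_dir), arcname=volume_dir.name)
    return archive
```

Usage would be something like `snapshot_volume(Path("/srv/volumes/app"), Path("/mnt/backups"))`, with both paths being placeholders. It is probably safest to stop the container first so the files are not changing while they are archived.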
That’s actually a really good idea. From now on I will do the same. Thanks!
Packer builds the terraformable/openTofuable templates to launch into the hypervisor where chef (eventually mgmtConfig) will manage them from there until they die.
All that is launched by git. Fire and forget. Updates are cronned.
There are no containers. Don’t got time to fuck about. If Systemd wasn’t an absolute embarrassment I’d not worry about updates even as much as I do, which isn’t much aside from the aforementioned cancer.
NixOS
Out of curiosity: Are you running nix-ops with nix-secrets or how did you cover orchestration & credentials?
I use flakes and all hosts are configured from a single flake, where each host has its own configuration. I have some custom modules and even custom package in the same flake. I also use home manager. I have 4 hosts managed in total: home server, laptop, gaming PC, and a cloud server. All hosts were provisioned using nixos-anywhere + disko, except for the first one which was installed manually. For secrets I use sops-nix, encrypted secrets are stored in the same flake/repo.
How do you manage your home server configuration
Poorly, which is to say that I just let borgmatic back up all my compose files and hope for the best
Yep.
“I manage my server in yaml. Sometimes yml.”
reproducible
You tried writing bash scripts that set things up for you, haven’t you? It’s NixOS for you.
NixOS for configuration and restic for data
Proxmox on the metal, then every service as a docker container inside an LXC or VM. Proxmox does nice snapshots (to my NAS), making it a breeze to move them from machine to machine or blow away the Proxmox install and reimport them. All the docker compose files are in git, and the things I apply to every LXC/VM (my monitoring endpoint, apt cache setup etc) are all applied with ansible playbooks also in git. All the LXCs are cloned from a golden image that has my keys, tailscale setup etc.
This is pretty much my setup as well. Proxmox on bare metal, then everything I do are in Ubuntu LXC containers, which have docker installed inside each of them running whatever docker stack.
I just installed Portainer and got the standalone agents installed on each LXC container, so it’s helped massively with managing each docker setup.
Of course you can do whatever base image you want for the LXC container, I just prefer Ubuntu for my homelab.
I do need to set up a golden image though to make stand-ups easier…one thing at a time though!
So you run the docker containers inside a Proxmox container (LXC)?
systemd unit files, because it’s all podman containers.
With NixOS, you get a reproducible environment. When you need to change your hardware, you simply back up your data, write your NixOS configuration, and you can reproduce your previous environment.
I use it to manage all my services.
Terraform and ansible. Script service configuration and use source control. Containerize services where possible to make them system agnostic.
How do you decide what’s for Terraform and what’s for Ansible?
I used to have a file with every cli command and notes on how each thing was set up. When I had to reinstall it from scratch it took all day going through lots of manual steps and remembering how it should all go.
Recently I converted the whole thing to Ansible. Now I could rebuild my entire system on a brand new OS installation with one command that completes in minutes. It’s all modular and I can add new services easily whether they are docker containers or scripts or whatever. If I ever break anything, it will reset everything to its intended state and leave it alone otherwise. And it’s free and pretty easy to learn and start using.
Plus I use git along with it for version control, so I can always revert to any previous configuration instantly.
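The "reset everything to its intended state and leave it alone otherwise" behaviour is idempotence. A tiny sketch of the idea in Python, not how Ansible is actually implemented; `ensure_file` is a hypothetical helper that only touches a file when it differs from the desired content and reports whether it changed anything.

```python
from pathlib import Path


def ensure_file(path: Path, content: str) -> bool:
    """Make path contain exactly content. Return True if something changed."""
    if path.exists() and path.read_text() == content:
        return False  # already in the desired state, leave it alone
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content)
    return True  # file was missing or had drifted, so it was rewritten
```

Running it twice in a row with the same arguments changes something at most once, which is why a whole playbook can be re-run safely after something breaks.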








