Edit: it seems like my explanation turned out to be too confusing. In simple terms, my topology would look something like this:
I would have a reverse proxy hosted in front of multiple instances of git servers (let’s take 5 for now). When a client performs an action, like pulling a repo/pushing to a repo, it would go through the reverse proxy and to one of the 5 instances. The changes would then be synced from that instance to the rest, achieving a highly available architecture.
Basically, I want a highly available git server. Is this possible?
I have been reading GitHub’s blog on Spokes, their distributed system for Git. It’s a great idea except I can’t find where I can pull and self-host it from.
Any ideas on how I can run a distributed cluster of Git servers? I’d like to run it in 3+ VMs + a VPS in the cloud so if something dies I still have a git server running somewhere to pull from.
Thanks
This is a fantastic comment. Thank you so much for taking the time.
I wasn’t planning to run a GUI for my git servers unless really required, so I’ll probably use SSH. Thanks, yes that makes the part of the reverse proxy a lot easier.
I think your idea of having a designated “master” (server 1) and having rolling updates to the rest of the servers is a brilliant idea. The replication procedure becomes a lot easier this way, and it also removes the need for the reverse-proxy too! - I can just use Keepalived, set up weights to make one of them the master and corresponding slaves for failover. It also won’t do round-robin so no special stuff for sticky sessions! This is great news from the perspective of networking for this project.
Hmm, you said to enable pushing repos to the remote git repo instead of having it pull? I was going create a wireguard tunnel and have it accessible from my network for some stuff but I guess it makes sense.
Thanks again for the wonderful comment.