I had two problems: a huge mess of dev environments on my personal laptop, and a random assortment of mini projects and experiments scattered across different cloud providers and VPSs. Maximum cognitive load and mental switching costs between projects.
During the 2020 pandemic, I decided to do some housekeeping and refactor and standardize my approach to development. It's taken me about 3 years to get here, and thanks to LLMs/ChatGPT I finally got this project over the finish line.
Originally I "just wanted" the "Heroku experience" (git push to deploy) in a local dev environment, basically gamifying my coding environment, i.e. "make my dev environment feel like StarCraft (the RTS game)": `git push` and see a notification of a successful CI/CD run. But one thing led to another and I ended up with a mini cloud. People complain about Kubernetes/Nomad complexity, but eventually complexity catches up with you and you realize life is easier with container orchestration. The juice is worth the squeeze.
So it's not quite a "dev env" on my laptop anymore, but more like "my laptop is a remote to an ultra-fast mini cloud".
I discovered NixOS after reading "Erase your darlings", where the author describes how NixOS can bootstrap itself from two immutable directories: /nix and /boot. Everything else can be deleted on boot. Any state (documents, /etc config, home directory dotfiles) can be restored after boot from a backed-up network drive. This alone was both a stress reliever (clean system after every boot) and a forcing function (the declarative config must be correct for stuff to work after boot).
Then I spent a few months going deep into NixOS configuration rabbit holes and Nix packages, starting with https://github.com/mitchellh/nixos-config
But since then I've mostly stopped using Nix packages for anything substantial, because I realized "systemd is NOT all you need", and not all Nix packages are as easy to use as Docker images.
Yes, you can do almost everything you'd want with native NixOS config, Nix packages, and systemd, but it's a lot easier with Nomad and Docker (because vendors usually maintain Docker images):

```sh
nomad run myapp.nomad.hcl
nomad node drain <node>
```

The latter, for example, has Nomad move a running image to another machine so you can reboot the node and run `nix flake update && nixos-rebuild switch` to upgrade the underlying Linux and NixOS.

Nomad is not strictly necessary: you could use Docker Swarm exclusively, or maybe even Docker Compose. Or you can swap Nomad for Kubernetes (and long-term I may end up on Kubernetes). All offer ways to achieve zero-downtime deploys and dependency management. I like both Kubernetes and Nomad, but Nomad is dead simple to configure.

You can also use Nomad without Docker, just running app binaries instead of Docker images. But you'll often need a supporting app via a Docker image anyway, and then, why not simplify and use Docker for everything?
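For a sense of scale, a whole deployable service is just a small job file. A minimal sketch (the image name, port, and datacenter are illustrative, not my actual setup):

```sh
# Write a minimal Nomad job that runs a Docker image, then deploy it
cat > myapp.nomad.hcl <<'EOF'
job "myapp" {
  datacenters = ["dc1"]

  group "web" {
    network {
      port "http" { to = 8080 }
    }

    task "app" {
      driver = "docker"
      config {
        image = "gitea.example.com/me/myapp:latest" # illustrative registry/repo
        ports = ["http"]
      }
    }
  }
}
EOF
nomad run myapp.nomad.hcl
```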
Self-hosted git via Docker is not too bad either. You can run a GitHub clone via the Gitea Docker image, create a repo and a job runner, and get a CI/CD system. I spent a week trying to get Sourcehut working, but Gitea felt like GitHub and the setup was comparatively easy.
The big takeaway here is: Docker is simple and vendors/projects provide working images. These are often better than the packages provided by Nix.
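As an example of how little work a vendor image takes, a hedged sketch of running Gitea (ports, volume path, and version tag are illustrative):

```sh
# Gitea from the official image: web UI on 3000, git-over-SSH on 2222,
# all state kept in a single bind-mounted data directory
docker run -d --name gitea \
  -p 3000:3000 -p 2222:22 \
  -v /persist/gitea:/data \
  gitea/gitea:1.21
```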
Waypoint is a CLI that lets you template your Nomad job files so you can run `waypoint up` from CI/CD, and organize your secrets and environments.

TL;DR: it builds, tags, and pushes Docker images to Gitea artifact hosting, interpolates the Nomad job file with the image tag and secrets, and runs `nomad run` on the job file.
I actually love this little app, but HashiCorp has deprecated it (as of Oct 2023).
You can replicate most of its functionality with a few shell scripts, but I intend to keep using it, at least for the most basic of deployments.
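A hedged sketch of what that shell replacement could look like, following the TL;DR above (the registry host, template placeholder, and file names are made up):

```sh
#!/usr/bin/env sh
set -eu

# Build, tag, and push the image to the Gitea registry
TAG="gitea.example.com/me/myapp:$(git rev-parse --short HEAD)"
docker build -t "$TAG" .
docker push "$TAG"

# Interpolate the fresh image tag into the job template, then deploy
sed "s|__IMAGE__|$TAG|g" myapp.nomad.hcl.tpl > myapp.nomad.hcl
nomad run myapp.nomad.hcl
```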
Most are already familiar with monitoring. I mostly ignore logs (I use the `wander` app to tail Nomad logs and `journalctl` for systemd logs) but don't collect them. Instead I pump custom metrics to Prometheus, and spend a lot of time tweaking my fancy dashboards to read from Prometheus and Postgres.
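As one hedged example of getting a custom metric into Prometheus, pushing through a Pushgateway (the Consul service address is illustrative; exposing a /metrics endpoint for scraping works just as well):

```sh
# Publish a counter to a Pushgateway; Prometheus scrapes it from there
echo "myapp_jobs_processed_total 42" | \
  curl --data-binary @- http://pushgateway.service.consul:9091/metrics/job/myapp
```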
At a certain scale, or for certain use cases, you need log collection and search; your mileage may vary. Many great solutions exist (e.g. Loki).
Likewise, for error tracking I'd consider self-hosted Sentry.io.
ext4 is the gold standard, and dead simple. XFS is common for big databases. ZFS is complicated but, as far as I can tell, considered quite stable. The reason to use it is that it abstracts away having lots of different disks, presenting them as a single volume. It's simpler than RAID and lets you mirror or stripe, and add/replace disks if they break.
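A hedged sketch of that (pool and device names are illustrative):

```sh
# Two disks presented as one mirrored volume
zpool create rpool mirror /dev/nvme0n1 /dev/nvme1n1
zpool status rpool

# If a disk dies, swap it in place and ZFS resilvers the mirror
zpool replace rpool /dev/nvme1n1 /dev/nvme2n1
```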
Most importantly, ZFS has snapshots:

```sh
sudo zfs list -t snapshot
sudo zfs snapshot rpool/persist/backups@2023-12-25
```

You can sync ZFS over the network, and of course "rollback to snapshot on boot":

```sh
zfs rollback -r rpool/local/root@blank
```
ZFS snapshots are great for huge Postgres data dirs. Say you have a 3TB Postgres data dir and you want to test a new version of Postgres. You can snapshot the data dir, then let the new version run against the data (or against a writable clone of the snapshot, since snapshots themselves are read-only). If it doesn't work, roll back to the snapshot.
Or you can `zfs send` the data dir snapshot to another machine instead of running a `pg_dump` and `pg_restore`.
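Putting that together, a hedged sketch (dataset names and the remote host are illustrative):

```sh
# Snapshot the live data dir, then clone it for the new Postgres to test against
zfs snapshot rpool/persist/pgdata@pre-upgrade
zfs clone rpool/persist/pgdata@pre-upgrade rpool/persist/pgdata-test
# ...point the new Postgres at the clone's mountpoint; destroy the clone if it fails

# Or stream the snapshot to another machine instead of pg_dump/pg_restore
zfs send rpool/persist/pgdata@pre-upgrade | ssh otherhost zfs recv tank/pgdata
```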
Gitea is an open source GitHub. It looks and acts exactly like it, except with an extremely fast UI. Likewise the act runner. It's a little complicated to set up (but way easier than Sourcehut), and worth it because I can `git push` and get a full CI/CD run in under 2 seconds (also partly due to a fast 7950X).
I use a custom-built Docker image with all the deps pre-installed, so the runner does not install anything (other than project deps like npm, go mod, etc.).
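A hedged sketch of a minimal workflow to go with that (Gitea Actions uses GitHub Actions syntax; the `runs-on` label pointing at a pre-baked runner image is an assumption about how the runner is registered):

```sh
mkdir -p .gitea/workflows
cat > .gitea/workflows/ci.yaml <<'EOF'
name: ci
on: [push]
jobs:
  build:
    runs-on: prebaked   # assumed act runner label mapped to the custom image
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm test   # only project deps get installed here
EOF
```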
Syncthing is an open source Dropbox. I can keep my "code" and "docs" directories on a mirrored ZFS volume (2x Samsung 990 Pro) and sync them to all laptops.
Mutagen is similar to Syncthing, but for one-off projects. Specifically, I use it to avoid writing a Docker + Nomad config for quick dev work:
```sh
mkdir my-app
mutagen sync create --ignore=prod.log --name=sync-my-app-to-7950x1 \
  ~/syncthing/my-app myuser@7950x1:code/my-app

# then on the server, inside a screen session:
screen
npm start
```
Now I can edit that project locally on my laptop and keep it synced in real time with a server, without having to sync my entire code directory to that server. If the project matures, I create a Docker image and a Nomad config.
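When the one-off work is done, sessions are easy to clean up (the session name matches the one created above):

```sh
mutagen sync list                              # show active sessions and their status
mutagen sync terminate sync-my-app-to-7950x1   # stop syncing this project
```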
I run Tailscale on the laptop and the servers, and use the Tailscale IP so Syncthing and Mutagen work from anywhere.
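A hedged sketch of the setup on each machine (the first run prints a login URL):

```sh
sudo tailscale up   # join the machine to the tailnet
tailscale ip -4     # the stable 100.x address to use in Syncthing and Mutagen
```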
The servers stay behind NAT (router, no public IP), and Cloudflare Tunnel makes them publicly accessible behind a domain name:
```nix
# NixOS config serving example.com from a Nomad job
services.cloudflared = {
  enable = true;
  tunnels = {
    "abababab-abab-abab-abab-abababababab" = {
      credentialsFile = "/etc/cloudflared/credentials.json";
      ingress = {
        "example.com" = "http://example-app-on-nomad.service.consul:9999";
      };
      default = "http_status:404";
    };
  };
};
```
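The tunnel itself is created once, out of band; a hedged sketch of the usual cloudflared flow (the tunnel name is illustrative):

```sh
cloudflared tunnel login                          # authenticate this machine with Cloudflare
cloudflared tunnel create homelab                 # prints the tunnel UUID, writes credentials.json
cloudflared tunnel route dns homelab example.com  # point the hostname at the tunnel
```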
I also use Netlify, but for the sake of standardizing on Cloudflare everywhere, frontends deploy to Cloudflare Pages:

```sh
wrangler pages deploy --project-name=myfrontend --env production --branch=production dist
```
Cloudflare prefers that you don't push images/assets through their tunnel service, because they have a highly optimized and cheap image hosting service (Cloudflare Images). You can use their libraries/SDK, or just POST to their API:
```js
// Cloudflare Images upload endpoint (account ID placeholder)
const url = `https://api.cloudflare.com/client/v4/accounts/${CLOUDFLARE_ACCOUNT_ID}/images/v1`;

// fileBlob is an illustrative File/Blob containing the image to upload
const formData = new FormData();
formData.append("file", fileBlob);

const response = await fetch(url, {
  method: "POST",
  body: formData,
  headers: {
    Authorization: `Bearer ${CLOUDFLARE_IMAGE_UPLOAD_TOKEN}`,
    // 'Content-Type': 'multipart/form-data' is automatically set by fetch when using FormData
  },
});
```
The payoff:

- `sudo reboot` and get that "new car smell", that "reformatted PC smell": a fresh OS with all your Docker images still running
- 3+ copies and snapshots of your data
- Cheap and fast hardware
- Stable app versions that survive Linux updates
Next up: move off the Apple ecosystem and onto a NixOS Linux laptop, so I can enjoy deploying my personal environment, config/preferences, and keys to a new machine in seconds.