kentborg/alias_sync

alias_sync

Synchronize e-mail aliases across a pair of geographically separated Postfix mail servers.

What This Is

A single Rust binary that runs as a daemon, a Postfix pipe transport, and a CLI tool. Users create new e-mail aliases by sending a specially formatted message to a designated address. The program validates the request (SASL authentication, format checks), coordinates with the peer server over mutual TLS so both create the alias atomically, and sends a confirmation e-mail back through the new alias to prove it works on both servers.

The two servers communicate using self-signed, pinned mTLS certificates that the program generates and manages itself. No certificate authority, no OpenSSL.

Why It Exists

The author runs personal e-mail on two redundant Postfix/Dovecot servers, physically located about 3,000 miles apart. Creating aliases by hand on both machines was tedious and error-prone. This program automates it.

This is a personal tool solving a personal problem. Very few people run their own mail servers, and fewer still run redundant pairs. It is published not because others are expected to use it, but because the way it was built is interesting.

How It Was Built

All code in this repository was written by Claude (Anthropic's Opus 4.5, Opus 4.6, Sonnet 4.5, and Haiku 4.5 models) using Claude Code. The human designed the system, wrote detailed specifications, directed implementation, and reviewed everything, but did not write code. This was the human's first experience with Claude Code, chosen as a learning project.

The collaboration produced 16 specification documents before implementation began. Claude was instrumental in both writing and cross-checking these specs, catching inconsistencies from different perspectives across different contexts.

This is decidedly not vibe coding. A human with long software experience was firmly in charge of design and made all architectural decisions.

The Stunt

The explicit goal is for this program to be complete and correct the first time it runs against reality -- not just passing tests in synthetic environments, but actually working on the first real install and first real execution. This includes:

  • First dpkg -i on the development machine installs correctly
  • First debug-mode run works end-to-end
  • First cross-compiled deploy to two production Raspberry Pi 4 servers, 3,000 miles apart, works

This is an unusual ambition for a non-trivial program. The investment in getting it right before running it was correspondingly large:

  • ~7,500 lines of non-test Rust across 4 workspace crates
  • ~5,400 lines of tests (323 test functions across five layers)
  • ~4,700 lines of specifications, most written before implementation began

The specifications and tests were not afterthoughts. Writing 16 detailed spec documents first meant that Claude and the human were working from the same model of the system. The tests -- unit, integration, two-instance, installation, and cross-architecture -- were the mechanism for closing the gap between "it compiles" and "it works." Neither alone would have been sufficient.

How It Went

The first production install failed. The .deb package installed without error, but the postinst script never ran -- no directories created, no config, no certificates. The cause: the Rust tar crate's set_path("./postinst") silently strips the ./ prefix, but dpkg requires it to recognize maintainer scripts. The test suite didn't catch it because (a) the tar crate also strips ./ on read, so round-trip checks passed, (b) the test normalized paths before asserting, masking the difference, and (c) fakeroot dpkg --root with --force-* flags is more lenient than production dpkg. Three layers of tests, three ways to miss the same bug.

The fix was two lines: write the path bytes directly into the tar header, bypassing the crate's normalization. The test fix was to assert on raw header bytes instead of the crate's accessor. See docs/development/postmortem-tar-prefix.md for the full analysis.
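To make the prefix issue concrete, here is a minimal sketch of a ustar header whose name field is written byte-for-byte, so a leading ./ survives. It is not the project's code: it uses only the standard library rather than the tar crate, and the function names (raw_tar_header, write_octal, raw_name) are hypothetical. The second helper reads the raw name bytes back, which is the kind of assertion that does not pass through any path normalization.

```rust
/// Write `value` as zero-padded octal followed by a NUL terminator,
/// filling the whole field. Returns None if the value does not fit.
fn write_octal(field: &mut [u8], value: u64) -> Option<()> {
    let width = field.len() - 1;
    let text = format!("{:0width$o}", value, width = width);
    if text.len() != width {
        return None;
    }
    field[..width].copy_from_slice(text.as_bytes());
    field[width] = 0;
    Some(())
}

/// Build a 512-byte ustar header for a regular file. The path is copied
/// into the name field exactly as given (hypothetical helper, no
/// normalization of any kind).
fn raw_tar_header(path: &[u8], size: u64, mode: u32) -> Option<[u8; 512]> {
    if path.is_empty() || path.len() > 100 {
        return None; // longer names need the prefix field or extensions
    }
    let mut h = [0u8; 512];
    h[..path.len()].copy_from_slice(path); // name, offset 0, 100 bytes
    write_octal(&mut h[100..108], u64::from(mode))?; // mode
    write_octal(&mut h[108..116], 0)?; // uid (root)
    write_octal(&mut h[116..124], 0)?; // gid (root)
    write_octal(&mut h[124..136], size)?; // size
    write_octal(&mut h[136..148], 0)?; // mtime
    h[156] = b'0'; // typeflag: regular file
    h[257..263].copy_from_slice(b"ustar\0"); // magic
    h[263..265].copy_from_slice(b"00"); // version

    // The checksum is the byte sum of the header with the checksum
    // field itself treated as eight spaces.
    h[148..156].fill(b' ');
    let sum: u32 = h.iter().map(|&b| u32::from(b)).sum();
    h[148..156].copy_from_slice(format!("{:06o}\0 ", sum).as_bytes());
    Some(h)
}

/// Read the name field back as raw bytes, up to the first NUL.
fn raw_name(header: &[u8; 512]) -> &[u8] {
    let end = header[..100].iter().position(|&b| b == 0).unwrap_or(100);
    &header[..end]
}
```

A test built on raw_name compares against the literal bytes b"./postinst", so a stripped prefix shows up as a failure instead of being hidden by a round trip through the same library.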

The second attempt failed too. The binary ran on the build machine's qemu-user-static (which shares the host's libraries) but not on the actual Raspberry Pi: glibc 2.39 from the build machine (Debian trixie) vs glibc 2.36 on the target (Debian bookworm). The fix was switching from aarch64-unknown-linux-gnu to aarch64-unknown-linux-musl for fully static binaries with no glibc dependency. See docs/development/postmortem-glibc-mismatch.md.

The third attempt failed differently. The binary installed and ran correctly -- directories, config, certificates, daemon, all fine. But the postinst script told Postfix to use a transport table file that the installer never created. It printed instructions for the admin to create the file manually, used a placeholder domain requiring substitution, and silently ignored postmap failure before reloading Postfix. Postfix's transport_maps lookup failed for every message (not just ours), deferring all mail. A total outage on a production mail server, caused by the installer. See docs/development/postmortem-transport-maps.md.

The fix was architectural: the program now owns the transport file entirely. A new configure-postfix subcommand queries Postfix for the server's domains, shows a report of every change it will make, and only applies after the admin confirms. It writes the transport file, runs postmap, verifies the .db exists, sets postconf values, and reloads Postfix -- but only if every preceding step succeeded. The installer no longer touches Postfix configuration at all; it prints the report and tells the admin to run configure-postfix as a separate step.
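The "only if every preceding step succeeded" rule can be sketched as a list of named steps that stops at the first error. This is an illustration under assumptions, not the configure-postfix implementation: the Step type, the apply_all function, and the log format are all hypothetical, and real steps would run external commands rather than plain functions.

```rust
/// One named configuration step (hypothetical type for illustration).
struct Step {
    name: &'static str,
    run: fn() -> Result<(), String>,
}

/// Run the steps in order. The first failure aborts the sequence, so a
/// later step such as a Postfix reload is never reached after an error.
/// Completed steps are appended to `log` for the final report.
fn apply_all(steps: &[Step], log: &mut Vec<String>) -> Result<(), String> {
    for step in steps {
        (step.run)().map_err(|e| format!("{} failed: {}", step.name, e))?;
        log.push(format!("ok: {}", step.name));
    }
    Ok(())
}
```

Placing the reload last in the list is what turns a postmap failure into an aborted run instead of a reload against a missing table.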

The fourth attempt was the first where installation succeeded without incident. But once mail actually reached the program, it hit the first runtime bug: the pipe auth check compared the SASL login name (kentborg) against the full recipient address (makeaddress@borg.org), which can never match. Any authenticated user should be allowed to create aliases; the destination-match check was simply wrong. Three installation failures before the code itself got a chance to run -- and then it had a bug too.
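The shape of that bug, and of the corrected rule, fits in a few lines. The sketch below is reconstructed from the description above, not copied from the repository; both function names are hypothetical, and how the SASL login actually reaches the program is not shown.

```rust
/// The check as described: a bare login name compared with a full
/// recipient address. For a login without a domain part this is
/// always false.
fn buggy_is_authorized(sasl_login: &str, recipient: &str) -> bool {
    sasl_login == recipient
}

/// The corrected rule as described: any authenticated sender may
/// create aliases, so only the presence of a login matters.
fn is_authorized(sasl_login: Option<&str>) -> bool {
    matches!(sasl_login, Some(login) if !login.trim().is_empty())
}
```

With the values from the incident, buggy_is_authorized("kentborg", "makeaddress@borg.org") is false, while is_authorized(Some("kentborg")) is true.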

The fifth attempt succeeded. A specially formatted message sent to the secondary server (low-priority MX) triggered the protocol, synchronized the new alias to the primary, and both servers received the confirmation e-mail. A test message sent to the new alias on the primary server was delivered correctly.

What the Failures Have in Common

The stunt goal was not quite met. Three installation failures preceded any runtime execution, and the first runtime execution had a logic bug. The program worked correctly on the second run against production.

Every failure happened at the same place: the boundary between code and the real-world environment. Tests are models -- necessarily simpler than reality. The question is which simplifications are safe bets and which are not.

The tar prefix bug: tested against dpkg, but against the wrong dpkg behavior -- fakeroot with lenient flags, more permissive than production. The model was wrong about how dpkg interprets tar headers.

The glibc bug: qemu-user-static shares the host's libraries, not the target machine's. The model was wrong about what "running arm64 binaries" means.

The transport_maps and pipe auth bugs: Postfix was never in the test environment at all -- only mock_commands stubs. A real Postfix instance in the test suite would have been possible, and might have caught both. Every simplification is a bet that the simplified thing doesn't matter. Some of those bets were wrong.

Architecture

A Cargo workspace with four crates:

  • sync-protocol -- Generic transaction-based sync protocol with crash recovery. Makes no mention of aliases or e-mail.
  • alias-file -- Parses, writes, and fingerprints Postfix alias files. Handles metadata stored in adjacent comment lines.
  • communications -- mTLS transport layer using rustls.
  • alias_sync (binary crate) -- Wires the libraries together. CLI, Postfix integration, daemon loop, certificate management.
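As a rough picture of what the alias-file crate has to deal with, here is a minimal sketch that parses one line of a Postfix alias file in the common "name: target, target" form. The function name parse_alias_line is hypothetical, and the sketch deliberately ignores quoted names, continuation lines, and the metadata comments the real crate handles.

```rust
/// Parse a single alias line of the form `name: target1, target2`.
/// Returns None for blank lines, comments, and lines without both a
/// name and at least one target. Quoting and continuation lines are
/// not handled in this sketch.
fn parse_alias_line(line: &str) -> Option<(String, Vec<String>)> {
    let line = line.trim();
    if line.is_empty() || line.starts_with('#') {
        return None;
    }
    let (name, rest) = line.split_once(':')?;
    let name = name.trim();
    let targets: Vec<String> = rest
        .split(',')
        .map(str::trim)
        .filter(|t| !t.is_empty())
        .map(str::to_owned)
        .collect();
    if name.is_empty() || targets.is_empty() {
        return None;
    }
    Some((name.to_owned(), targets))
}
```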

Additional binary targets:

  • maintainer_scripts -- dpkg postinst/prerm/postrm as compiled Rust, not shell scripts. Dispatches on argv[0].
  • mock_commands -- Fake Postfix tools and system commands for testing and debug installs.
  • inject_deb_scripts -- Inserts the compiled maintainer scripts into the .deb package after cargo-deb runs.
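Dispatching on argv[0] can be sketched as follows. This is an assumption-laden illustration, not the maintainer_scripts source: the Script enum and script_from_argv0 are hypothetical names. It accepts both a bare name such as postinst and the <package>.postinst form under which dpkg keeps installed scripts, and the package name in the usage line is made up.

```rust
use std::path::Path;

/// Which maintainer script the binary was invoked as.
#[derive(Debug, PartialEq)]
enum Script {
    Postinst,
    Prerm,
    Postrm,
}

/// Decide the role from argv[0]. Only the final path component is
/// considered, and only the part after its last dot, so both
/// "postinst" and "pkg.postinst" select the same role.
fn script_from_argv0(argv0: &str) -> Option<Script> {
    let file_name = Path::new(argv0).file_name()?.to_str()?;
    match file_name.rsplit('.').next()? {
        "postinst" => Some(Script::Postinst),
        "prerm" => Some(Script::Prerm),
        "postrm" => Some(Script::Postrm),
        _ => None,
    }
}
```

A main function would call it as script_from_argv0(&std::env::args().next().unwrap_or_default()) and exit non-zero on None; for example, "/var/lib/dpkg/info/alias-sync.postinst" selects Script::Postinst.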

Unusual Design Choices

Maintainer scripts are Rust binaries. dpkg requires only that they be executable. The project's philosophy is to minimize shell scripting to near zero.

The .deb package is a first-class development artifact, built early and exercised continuously, not bolted on at the end. Installation correctness is treated with the same rigor as runtime correctness.

No async runtime. Blocking I/O with threads. The workload is tiny (a handful of aliases per year) and simplicity matters more than throughput.
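The thread-per-connection style this implies looks roughly like the sketch below. It is plain TCP with a made-up line-based "ack" reply, purely to show the blocking structure; the real program speaks mTLS through rustls, and the names handle and serve are hypothetical.

```rust
use std::io::{BufRead, BufReader, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

/// Serve one connection with blocking reads and writes: every line
/// received is answered with "ack <line>".
fn handle(stream: TcpStream) -> std::io::Result<()> {
    let mut reader = BufReader::new(stream.try_clone()?);
    let mut writer = stream;
    let mut line = String::new();
    while reader.read_line(&mut line)? != 0 {
        writer.write_all(format!("ack {}", line).as_bytes())?;
        line.clear();
    }
    Ok(())
}

/// Accept loop: one OS thread per connection, no executor, no futures.
fn serve(listener: TcpListener) {
    for conn in listener.incoming() {
        match conn {
            Ok(stream) => {
                thread::spawn(move || {
                    let _ = handle(stream);
                });
            }
            Err(e) => eprintln!("accept failed: {e}"),
        }
    }
}
```

At a handful of requests per year, a thread per connection costs nothing measurable, and each connection's control flow reads top to bottom.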

Testing

If the goal is to work correctly the first time, testing has to go well beyond "does it compile." There are 323 tests across five layers:

Unit tests in each library crate verify parsing, serialization, fingerprinting, protocol state machines, crash recovery, and TLS certificate handling in isolation. Pure logic, no I/O.

Integration tests spawn the real compiled binary as a child process with --base-path redirection and mock commands standing in for Postfix tools and system utilities. These test the CLI, the pipe transport (e-mail path), daemon startup and shutdown, signal handling, certificate watching, and error cases -- all exercising real code paths, not test doubles.

Two-instance tests start two daemons on localhost with real mTLS certificates, exchange certs, verify connectivity, create aliases at both realistic and rapid-fire rates, check duplicate rejection, and confirm both servers end up with identical alias files.

Installation tests build the actual .deb package, install it into a temporary root via fakeroot dpkg --root, and verify the full postinst sequence: user creation, directory setup, certificate generation, config file writing, and Postfix configuration report. Purge is tested too -- Postfix cleanup, file removal, user removal.

Cross-architecture tests repeat the two-instance lifecycle using the arm64 cross-compiled .deb, with binaries running under qemu-user-static. This validates that the actual package destined for the production Raspberry Pi servers installs and runs correctly before it ever leaves the development machine.

A mock commands binary (mock_commands) simulates postconf, postalias, sendmail, adduser, systemctl, and others, logging every invocation so tests can assert not just outcomes but the exact sequence of system interactions.
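The record-then-assert idea can be sketched with an append-only log file. The helper names (record_invocation, recorded_commands) and the one-line-per-invocation format are assumptions for illustration; the actual mock_commands log format is not documented here.

```rust
use std::fs::OpenOptions;
use std::io::Write;
use std::path::Path;

/// What a mock command would do on startup: append its own argv to a
/// shared log, one invocation per line.
fn record_invocation(log: &Path, argv: &[&str]) -> std::io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(log)?;
    writeln!(file, "{}", argv.join(" "))
}

/// What a test would do afterwards: read the log back in order.
fn recorded_commands(log: &Path) -> std::io::Result<Vec<String>> {
    Ok(std::fs::read_to_string(log)?
        .lines()
        .map(str::to_owned)
        .collect())
}
```

A test can then compare recorded_commands against the expected list, which checks ordering as well as presence: a reload recorded before a table rebuild fails the assertion even though both commands ran.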

just ci    # fmt-check, lint, lint-release, test, test-deb,
           # test-dpkg, test-dual, doc-check (~45s)

Project Status

Feature-complete. All tests passing. Deployed and working on both production Raspberry Pi servers. Three installation failures preceded first execution; the first runtime execution had a logic bug. The program worked correctly on the second run against production.

License

MIT


This README was written by Claude (Opus 4.6). All code in this repository was also written by Claude.
