14.3.3.2 Blue Green Proxy Switch
A focused guide to Blue Green Proxy Switch, connecting core concepts with practical Docker and container operations.
A blue-green proxy switch is the mechanism by which a reverse proxy or load balancer redirects traffic from the currently live environment to the newly validated one. It is the single moment in a blue-green deployment where production traffic is actually affected, which makes the reliability and speed of the switch the most important detail in the pattern.
Reload versus restart
Apply the proxy configuration change with a reload, not a restart. A restart briefly closes the proxy's listening socket and can interrupt in-flight connections, while a reload re-reads the configuration and keeps serving existing connections under the old configuration until they complete:
docker exec proxy nginx -s reload
# If the proxy is Traefik, there is no reload command: it applies dynamic configuration changes on its own.
# The command below only checks that the process is healthy (it requires the ping endpoint to be enabled).
docker exec proxy traefik healthcheck
Nginx's reload behavior specifically starts new worker processes with the updated configuration while letting existing workers finish handling their current connections under the old configuration, which is what makes the switch effectively seamless from a connected client's perspective.
Static configuration file swap
The most direct approach to the switch is updating the proxy's upstream target in its configuration file and triggering a reload:
upstream backend {
    server green.internal:3000;
}
# Run on the host; assumes /etc/nginx/conf.d is bind-mounted into the proxy container.
cp nginx-green.conf /etc/nginx/conf.d/upstream.conf
docker exec proxy nginx -s reload
Keeping pre-written configuration files for both the blue and green target, rather than generating the switch configuration on the fly during the deployment itself, removes any chance of a templating error happening at the exact moment the switch needs to be fast and reliable.
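One way to gain confidence in the pre-written files ahead of time is a small check that each file points at the backend its name implies. The sketch below assumes the file naming (nginx-blue.conf, nginx-green.conf) and the host pattern (<env>.internal:3000) used in this section; the function name check_target_conf is hypothetical.

```shell
#!/bin/sh
# check_target_conf ENV [DIR]
# Succeeds only if DIR/nginx-ENV.conf exists and contains an upstream
# server line for ENV.internal:3000 (host pattern assumed from this section).
check_target_conf() {
    env_name="$1"
    dir="${2:-.}"
    file="${dir}/nginx-${env_name}.conf"
    [ -f "$file" ] || { echo "missing: $file" >&2; return 1; }
    grep -Eq "^[[:space:]]*server[[:space:]]+${env_name}\.internal:3000;" "$file" || {
        echo "unexpected upstream in $file" >&2
        return 1
    }
}

# Example: run for both targets in CI, long before any switch happens.
#   check_target_conf blue && check_target_conf green
```

Running a check like this in the pipeline, rather than at switch time, keeps the switch itself limited to a copy and a reload.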
Label-driven switching with a container-aware proxy
Proxies like Traefik that discover backends through labels can have the switch driven by which environment's containers are currently labeled as the active target, without editing a static configuration file at all:
services:
  api-green:
    labels:
      - "traefik.http.routers.api.rule=Host(`api.example.com`)"
      - "traefik.enable=true"
  api-blue:
    labels:
      - "traefik.enable=false"
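# Docker fixes a container's labels at creation time; there is no command to change them on a running container.
# To switch, swap the traefik.enable values in the Compose file and let Compose recreate the affected services:
docker compose up -d api-green api-blue
This approach shifts the switch from "edit a file and reload a process" to "change container metadata that the proxy is already watching," which can be less error-prone once label-driven discovery is correctly set up. Note that changing a label means recreating the container, so connections to a recreated container are interrupted while it restarts.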
DNS-based switching
For setups where the proxy itself is external (a cloud load balancer or CDN), the switch can be implemented as a DNS or load balancer target change rather than a reverse proxy configuration reload:
aws elbv2 modify-listener --listener-arn arn:aws:elasticloadbalancing:... \
--default-actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:...:targetgroup/green-tg
DNS-based switching carries a consideration that configuration-file-based switching does not: DNS changes propagate according to the record's TTL, so some clients may keep resolving to the old environment for a period after the switch is initiated. This is generally not a concern for a load balancer target group switch, but it is a meaningful caveat when switching at the DNS record level directly.
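Before a DNS-level switch, it can help to check how long resolvers may keep serving the old record. The sketch below reads the TTL from a dig answer line; the record name is the example hostname used in this section, and the helper name ttl_of_answer is hypothetical.

```shell
#!/bin/sh
# ttl_of_answer: reads `dig +noall +answer` output on stdin and prints
# the TTL (second field) of the first answer record.
ttl_of_answer() {
    awk 'NF >= 5 { print $2; exit }'
}

# Example (requires dig and network access):
#   dig +noall +answer api.example.com A | ttl_of_answer
```

A caching resolver reports the remaining TTL rather than the configured one, so querying the authoritative name server gives the more useful upper bound.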
Verifying the switch took effect
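After triggering the switch, confirm that traffic is genuinely flowing to the new environment rather than assuming the configuration change succeeded:
# Prints the status code and the address curl connected to (the proxy or load balancer, not the backend).
curl -s -o /dev/null -w "%{http_code} %{remote_ip}\n" https://api.example.com/healthz
# Counts recent health check requests that reached the green container (the app may log to stderr, hence 2>&1).
docker logs --since 1m myapp-green-api-1 2>&1 | grep -c "GET /healthz"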
A switch that silently failed to apply, due to a syntax error in the new configuration or a proxy that did not actually pick up the reload signal, is far easier to catch immediately through this kind of verification than to discover later through unrelated symptoms.
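The verification can also be scripted as a bounded polling loop. The sketch below is one possible shape, under a loud assumption: the application reports its environment in a hypothetical X-Environment response header ("blue" or "green"). The function names probe and wait_for_env and the PROBE_INTERVAL variable are likewise illustrative.

```shell
#!/bin/sh
# probe URL: prints the value of the X-Environment response header.
# Assumption: the application sets this (hypothetical) header itself.
probe() {
    curl -s -o /dev/null -D - "$1" | tr -d '\r' |
        awk -F': ' 'tolower($1) == "x-environment" { print $2; exit }'
}

# wait_for_env URL WANT [TRIES]: polls until probe reports WANT.
# Returns 0 on success, 1 if WANT is never seen within TRIES attempts.
wait_for_env() {
    url="$1"; want="$2"; tries="${3:-10}"
    i=0
    while [ "$i" -lt "$tries" ]; do
        if [ "$(probe "$url")" = "$want" ]; then
            return 0
        fi
        i=$((i + 1))
        sleep "${PROBE_INTERVAL:-1}"
    done
    return 1
}

# Example: fail the pipeline if green is not serving within 10 attempts.
#   wait_for_env https://api.example.com/healthz green 10 || exit 1
```

Bounding the number of attempts means a switch that never took effect fails the pipeline quickly instead of hanging it.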
Handling in-flight requests during the switch
Requests already in progress against the old environment at the moment of the switch should be allowed to complete against that environment rather than abruptly cut off, which a proper reload (rather than restart) generally preserves automatically, since existing connections continue being served by the proxy worker that was already handling them:
proxy_read_timeout 60s;
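Setting an appropriately generous read timeout for long-running requests ensures the proxy itself does not prematurely terminate a slow but legitimate in-flight request during or shortly after a switch. In Nginx, proxy_read_timeout limits the gap between two successive reads from the upstream rather than the total response time, and 60s is also its default, so it needs raising only if a legitimate request can stay silent for longer than that.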
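To see whether old connections are still draining after a reload, the old Nginx workers can be counted: while they finish their connections, their process title reads "worker process is shutting down". The sketch below counts such lines in a process listing; the container name proxy comes from this section, and the helper name count_draining is hypothetical.

```shell
#!/bin/sh
# count_draining: reads a process listing on stdin and prints how many
# Nginx workers are still finishing connections under the old configuration.
count_draining() {
    # grep -c prints 0 but exits non-zero when nothing matches; ignore that status.
    grep -c 'worker process is shutting down' || true
}

# Example:
#   docker top proxy | count_draining
```

A count of 0 suggests the old workers have exited, which is a reasonable signal before stopping the old environment.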
The rollback is the same mechanism, reversed
Because the switch is just a configuration change pointing at a different backend, rollback uses the exact same mechanism, simply reapplying the previous configuration:
cp nginx-blue.conf /etc/nginx/conf.d/upstream.conf
docker exec proxy nginx -s reload
This symmetry is one of the most valuable properties of the proxy switch approach: there is no separate, less-tested "rollback path" distinct from the "forward path," since both are the identical operation applied with a different target, which means whatever confidence exists in the switch mechanism working correctly applies equally to rollback.
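The symmetry can be made explicit with a tiny helper that maps each environment to its opposite, so the rollback target is derived rather than typed by hand. This is a minimal sketch; the function name other_env is hypothetical.

```shell
#!/bin/sh
# other_env ENV: prints the opposite environment name; fails on anything else.
other_env() {
    case "$1" in
        blue)  echo green ;;
        green) echo blue ;;
        *)     echo "unknown environment: $1" >&2; return 1 ;;
    esac
}

# Example: roll back from green by reapplying the other target's file.
#   cp "nginx-$(other_env green).conf" /etc/nginx/conf.d/upstream.conf
#   docker exec proxy nginx -s reload
```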
Automating the switch as part of the deployment pipeline
Treating the switch as a scripted, version-controlled step in the deployment pipeline, rather than a manual edit performed by an operator, removes the risk of a typo or missed step during a moment that directly affects live traffic:
./scripts/switch-environment.sh green
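#!/bin/sh
# Usage: switch-environment.sh <blue|green>
# Assumes /etc/nginx/conf.d is bind-mounted into the proxy container.
set -eu
TARGET="$1"
CONF=/etc/nginx/conf.d/upstream.conf
cp "$CONF" "$CONF.bak"
cp "nginx-${TARGET}.conf" "$CONF"
# Test first; if the new file is invalid, restore the previous one and stop.
docker exec proxy nginx -t || { cp "$CONF.bak" "$CONF"; exit 1; }
docker exec proxy nginx -s reload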
Including a configuration test (nginx -t) immediately before the reload catches a malformed configuration file before it is actually applied, which is a small addition that prevents the switch from breaking the proxy entirely due to a syntax mistake.
Common mistakes
- Restarting the proxy instead of reloading it, briefly dropping in-flight connections during what should be a seamless switch.
- Generating the switch configuration dynamically at deployment time instead of using pre-written, tested configuration files for each target environment.
- Switching at the DNS record level without accounting for client-side caching of the previous record according to its TTL.
- Performing the switch manually without any automated verification that it actually took effect, only discovering a failed switch through unrelated downstream symptoms.
- Treating the rollback path as separate from and less tested than the forward switch path, when in a well-designed setup they should be the identical mechanism applied in the opposite direction.
A blue-green proxy switch should be implemented as a fast, reload-based (not restart-based), scripted, and immediately verified operation, with the rollback path being the literal reverse of the same mechanism, since the entire safety value of blue-green deployment depends on this one moment being as close to instantaneous and reliable as possible.