Rebalance swarm: promote all nodes to managers and remove hostname constraints

- Promoted p1, p2, p3 from worker to manager nodes for 4-node quorum
- Removed unnecessary hostname constraints from service configs
- Only traefik and portainer remain pinned to p0
- Services now auto-balance across all nodes via GlusterFS shared storage
- Updated README with cluster overview and distribution strategy

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
49
README.md
@@ -1,3 +1,50 @@
 # swarm-production
 
 Production Docker Swarm Infrastructure
+
+## Cluster Overview
+
+### Nodes
+- **p0** (Manager/Leader) - Infrastructure services
+- **p1** (Manager) - Application services
+- **p2** (Manager) - Application services
+- **p3** (Manager) - Application services
+
+All nodes are managers, forming a 4-manager Raft group: quorum is 3 managers, so the swarm can tolerate 1 manager failure while maintaining quorum.
+
+### Storage
+- **GlusterFS** mounted at `/home/doc/swarm-data/` on all nodes
+- Shared storage enables services to run on any node without storage constraints
+
+## Service Distribution Strategy
+
+### Pinned Services
+Services that must run on specific nodes:
+
+- **traefik** (p0) - Published ports 80/443, needs stable IP for DNS
+- **portainer** (p0) - Management UI, stays with leader for convenience
+- **rsync** (manager constraint) - Backup service, needs manager access
+
+### Floating Services
+Services that can run on any node (swarm auto-balances):
+
+- adminer
+- authentik (server, worker, redis)
+- n8n
+- paperless (webserver, redis)
+- tracker-nginx
+- uptime-kuma
+
+## Recent Changes (2025-10-30)
+
+### Swarm Rebalancing
+- Promoted p1, p2, p3 from workers to managers
+- Removed unnecessary hostname constraints from service configs
+- Force-redeployed services to redistribute across all nodes
+- Verified GlusterFS accessibility on all nodes
+
+### Results
+- Achieved balanced workload distribution across all 4 nodes
+- Improved high availability with 4-node manager quorum
+- Services now self-balance automatically when nodes fail/recover
+- Fixed Portainer agent connectivity by restarting agents after manager promotion
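The README's quorum statement rests on simple majority arithmetic, which can be checked with a minimal sketch. Swarm managers use Raft, where a majority of managers must be reachable; the function names below are illustrative and not part of any Docker API.

```python
def raft_quorum(managers: int) -> int:
    """Smallest majority of `managers` voting nodes."""
    if managers < 1:
        raise ValueError("a swarm needs at least one manager")
    return managers // 2 + 1


def fault_tolerance(managers: int) -> int:
    """Number of managers that can be lost while a majority remains."""
    return managers - raft_quorum(managers)


if __name__ == "__main__":
    for n in (1, 3, 4, 5):
        print(f"{n} managers: quorum {raft_quorum(n)}, "
              f"tolerates {fault_tolerance(n)} failure(s)")
```

For four managers this gives a quorum of 3 and a tolerance of 1, the same tolerance as three managers. That is why odd manager counts are usually preferred.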
@@ -10,9 +10,6 @@ services:
       - ADMINER_DESIGN=nette
     deploy:
       replicas: 1
-      placement:
-        constraints:
-          - node.hostname == p0
 networks:
   homelab:
     external: true
@@ -10,9 +10,6 @@ services:
       - homelab
     deploy:
       replicas: 1
-      placement:
-        constraints:
-          - node.hostname == p0
 
   authentik_server:
     image: ghcr.io/goauthentik/server:2025.10.0
@@ -38,9 +35,6 @@ services:
       - homelab
     deploy:
       replicas: 1
-      placement:
-        constraints:
-          - node.hostname == p0
       labels:
         - "traefik.enable=true"
         - "traefik.http.routers.authentik.rule=Host(`auth.frostlabs.me`)"
@@ -75,9 +69,6 @@ services:
       - homelab
     deploy:
       replicas: 1
-      placement:
-        constraints:
-          - node.hostname == p0
     depends_on:
       - redis
 
@@ -17,9 +17,6 @@ services:
       - /var/run/docker.sock:/var/run/docker.sock:ro
     deploy:
       replicas: 1
-      placement:
-        constraints:
-          - node.hostname == p0
       restart_policy:
         condition: on-failure
         delay: 5s
@@ -5,9 +5,6 @@ services:
       - homelab
     deploy:
       replicas: 1
-      placement:
-        constraints:
-          - node.hostname == p0
 
   paperless_webserver:
     image: ghcr.io/paperless-ngx/paperless-ngx:latest
@@ -48,9 +45,6 @@ services:
       - homelab
     deploy:
       replicas: 1
-      placement:
-        constraints:
-          - node.hostname == p0
     depends_on:  # Fixed: removed postgres dependency
       - paperless_redis
 
@@ -14,9 +14,10 @@ services:
       retries: 3
       start_period: 60s
     deploy:
-      placement:
-        constraints: [node.hostname == p0]
       replicas: 1
+      placement:
+        preferences:
+          - spread: node.hostname
       restart_policy:
         condition: on-failure
         delay: 10s
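The hunk above swaps a hard hostname constraint for a `spread: node.hostname` preference. The sketch below illustrates the general idea of spreading: each replica goes to the label bucket that currently holds the fewest tasks of the service. It is a simplified model, not Docker's scheduler; the `existing_load` tie-breaker and the function name are assumptions made for illustration.

```python
from collections import Counter


def spread(node_labels: dict[str, str],
           existing_load: dict[str, int],
           replicas: int) -> list[str]:
    """Place replicas one at a time, preferring the emptiest label bucket.

    node_labels maps node name -> value of the spread label (here the hostname).
    existing_load maps node name -> tasks already running there (tie-breaker).
    """
    per_bucket = Counter()      # tasks of this service per label value
    load = dict(existing_load)  # copy so the caller's dict is not mutated
    placed = []
    for _ in range(replicas):
        node = min(
            node_labels,
            key=lambda n: (per_bucket[node_labels[n]], load.get(n, 0), n),
        )
        placed.append(node)
        per_bucket[node_labels[node]] += 1
        load[node] = load.get(node, 0) + 1
    return placed


if __name__ == "__main__":
    labels = {"p0": "p0", "p1": "p1", "p2": "p2", "p3": "p3"}
    load = {"p0": 2, "p1": 0, "p2": 1, "p3": 1}
    print(spread(labels, load, 1))
    print(spread(labels, load, 4))
```

With `replicas: 1`, as in this service, every bucket starts empty, so in this model the choice falls to the tie-breaker alone. The preference only visibly spreads tasks once the service has more than one replica.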
@@ -11,9 +11,6 @@ services:
       - /home/doc/swarm-data/appdata/webfiles/production/taylors-development:/usr/share/nginx/html:ro
     deploy:
       replicas: 1
-      placement:
-        constraints:
-          - node.role == worker
 networks:
   homelab:
     external: true
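Removing `node.role == worker` in the last hunk matters more than the hostname removals: once every node is a manager, no node satisfies that constraint and the service cannot be scheduled. The sketch below models constraint matching for the two attributes used in this commit. The `Node` class and helper names are hypothetical, and only the `==` and `!=` operators are handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    hostname: str
    role: str  # "manager" or "worker"


def matches(node: Node, constraint: str) -> bool:
    """Evaluate one `node.<attr> == value` or `node.<attr> != value` expression."""
    for op in ("==", "!="):
        if op in constraint:
            key, value = (part.strip() for part in constraint.split(op, 1))
            break
    else:
        raise ValueError(f"unsupported constraint: {constraint!r}")
    if not key.startswith("node."):
        raise ValueError(f"unsupported key: {key!r}")
    actual = getattr(node, key.split(".", 1)[1])
    return (actual == value) if op == "==" else (actual != value)


def eligible(nodes: list[Node], constraints: list[str]) -> list[str]:
    """Hostnames of nodes that satisfy every constraint."""
    return [n.hostname for n in nodes
            if all(matches(n, c) for c in constraints)]


before = [Node("p0", "manager"), Node("p1", "worker"),
          Node("p2", "worker"), Node("p3", "worker")]
after = [Node(h, "manager") for h in ("p0", "p1", "p2", "p3")]

if __name__ == "__main__":
    print(eligible(before, ["node.role == worker"]))
    print(eligible(after, ["node.role == worker"]))
    print(eligible(after, []))
```

In this model the worker constraint matches p1 to p3 before the promotion and no node afterwards, while an empty constraint list leaves all four nodes eligible.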