Mac Mini AI Server Monitoring: Keep Your Agent Running 24/7
A Mac mini is the best value AI server money can buy in 2026 — 128GB unified memory, silent operation, and low power draw. But deploying a local LLM agent on bare metal means you're on the hook for keeping it alive. This guide covers the production monitoring stack used to run always-on AI agents without cloud babysitting.
What you'll build:
- Health check script that polls your server every 5 minutes
- Auto-restart via launchd when the process dies
- Log watcher that fires alerts on critical errors
- Disk, memory, and Ollama availability checks
- Remote access via Tailscale for monitoring from anywhere
1. Why Mac Mini for AI Agents?
The M4 Max Mac mini with 128GB unified memory can run 32B+ parameter models at full speed with no throttling, no cloud latency, and zero per-token costs after hardware purchase. For always-on autonomous agents — the kind that run decision loops, check email, monitor repos, and respond to webhooks — this is transformative.
The trade-off: you're the ops team. When the process crashes at 3am, there's no AWS auto-scaling group to restart it. You need a monitoring stack.
Hardware recommendation (2026):
- Mac mini M4 Max, 128GB — runs 70B models, ideal for production agents
- Mac mini M4 Pro, 64GB — runs 32B models well, good budget option
- Mac mini M4, 32GB — runs 14B models, fine for lightweight agents
2. The Monitoring Architecture
A production Mac mini AI server needs four monitoring layers:
Layer 1: Process Health
Is the server process running? Does it respond to HTTP health checks? Is Ollama reachable on port 11434? These checks run every 5 minutes via a cron job.
Layer 2: Resource Monitoring
Free memory, disk usage, and CPU load. LLM inference is memory-bound — if swap pressure builds up, inference grinds to a halt. Alert at 80% memory pressure, hard stop at 95%.
Layer 3: Log Watching
Tail the server log for critical errors — out of memory, port conflicts, authentication failures, and runaway loop detection. Write structured alerts to a JSONL file for queryability.
Layer 4: Auto-Recovery
launchd keeps the server process alive across crashes and reboots. The health monitor provides a second layer — if launchd doesn't catch it, the health check does.
3. Setting Up launchd for Auto-Restart
launchd is macOS's native process supervisor. It starts services at login, restarts them on crash, and survives reboots. Here's a complete plist for an AI agent server:
<!-- ~/Library/LaunchAgents/com.myagent.server.plist -->
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
"http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.myagent.server</string>
<key>ProgramArguments</key>
<array>
<string>/path/to/bun</string>
<string>run</string>
<string>/Users/you/my-agent/src/cli.ts</string>
<string>server</string>
</array>
<key>WorkingDirectory</key>
<string>/Users/you/my-agent</string>
<key>EnvironmentVariables</key>
<dict>
<key>PATH</key>
<string>/usr/local/bin:/usr/bin:/bin:/opt/homebrew/bin</string>
<key>NODE_ENV</key>
<string>production</string>
</dict>
<!-- Auto-restart on crash -->
<key>KeepAlive</key>
<dict>
<key>Crashed</key>
<true/>
</dict>
<!-- Throttle restarts to avoid rapid loops -->
<key>ThrottleInterval</key>
<integer>10</integer>
<!-- Log output -->
<key>StandardOutPath</key>
<string>/Users/you/my-agent/.monitor/server.log</string>
<key>StandardErrorPath</key>
<string>/Users/you/my-agent/.monitor/server.log</string>
<!-- Start at login -->
<key>RunAtLoad</key>
<true/>
</dict>
</plist>
Load it with:
launchctl load ~/Library/LaunchAgents/com.myagent.server.plist
launchctl start com.myagent.server
# Verify it's running
launchctl list | grep myagent
To restart manually after a config change:
launchctl kickstart -k "gui/$(id -u)/com.myagent.server"
4. Health Monitor Script
Run this every 5 minutes via cron. It checks server HTTP health, Ollama availability, disk space, and memory pressure — then writes structured logs and triggers auto-restart if needed:
#!/bin/bash
# health-monitor.sh — runs every 5 minutes via cron
SERVER_URL="http://localhost:YOUR_PORT"
AUTH_TOKEN="your-auth-token"
LOG_FILE="$HOME/ai-server/.monitor/health-monitor.log"
ALERTS_FILE="$HOME/ai-server/.monitor/alerts.jsonl"
TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%SZ)
# Check server HTTP health
SERVER_STATUS="fail"
if curl -sf -H "Authorization: Bearer $AUTH_TOKEN" "$SERVER_URL/health" --connect-timeout 5 --max-time 10 > /dev/null 2>&1; then
SERVER_STATUS="ok"
fi
# Check Ollama
OLLAMA_STATUS="fail"
if curl -sf http://localhost:11434/api/tags --connect-timeout 3 --max-time 8 > /dev/null 2>&1; then
OLLAMA_STATUS="ok"
fi
# Check disk (alert if > 80%)
DISK_PCT=$(df -h / | awk 'NR==2 {gsub(/%/,"",$5); print $5}')
DISK_STATUS="ok:${DISK_PCT}%"
[ "$DISK_PCT" -gt 80 ] && DISK_STATUS="warn:${DISK_PCT}%"
# Check memory (free pages × 16KB)
FREE_PAGES=$(vm_stat | awk '/Pages free/{gsub(/\./,"",$3); print $3}')
FREE_MB=$(( FREE_PAGES * 16384 / 1048576 ))
MEM_STATUS="ok:${FREE_MB}MB"
[ "$FREE_MB" -lt 4096 ] && MEM_STATUS="warn:${FREE_MB}MB"
# Log the check
echo "[$TIMESTAMP] Check: server=$SERVER_STATUS ollama=$OLLAMA_STATUS disk=$DISK_STATUS mem=$MEM_STATUS" >> "$LOG_FILE"
# Auto-restart if server is down
if [ "$SERVER_STATUS" = "fail" ]; then
echo "{\"ts\":\"$TIMESTAMP\",\"severity\":\"critical\",\"message\":\"Server down — attempting restart\"}" >> "$ALERTS_FILE"
launchctl kickstart -k "gui/$(id -u)/com.myagent.server" 2>/dev/null || {
# Fallback: kill whatever holds the server port, then relaunch directly
SERVER_PORT="${SERVER_URL##*:}"
lsof -ti:"$SERVER_PORT" | xargs kill 2>/dev/null
sleep 2
nohup bun run ~/ai-server/src/cli.ts server > ~/ai-server/.monitor/server.log 2>&1 &
}
sleep 15
if curl -sf -H "Authorization: Bearer $AUTH_TOKEN" "$SERVER_URL/health" > /dev/null 2>&1; then
echo "{\"ts\":\"$TIMESTAMP\",\"severity\":\"info\",\"message\":\"Server restarted successfully\"}" >> "$ALERTS_FILE"
fi
fi
Add to cron with crontab -e:
*/5 * * * * /Users/you/my-agent/scripts/health-monitor.sh
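One failure mode worth guarding against: cron will start a new run while a previous one is still blocked on a slow curl or the 15-second restart check. A minimal lock guard for the top of health-monitor.sh prevents overlapping runs (the lock path is an assumption; match it to your layout):

```shell
#!/bin/bash
# lock-guard snippet for the top of health-monitor.sh
LOCK_DIR="$HOME/ai-server/.monitor/health-monitor.lock"
mkdir -p "$(dirname "$LOCK_DIR")"        # make sure .monitor exists

# mkdir without -p is atomic: it either creates the lock directory
# or fails because another run already holds it
if ! mkdir "$LOCK_DIR" 2>/dev/null; then
  echo "previous health check still running, skipping" >&2
  exit 0
fi
trap 'rmdir "$LOCK_DIR"' EXIT            # release the lock on any exit path
```

Using a directory rather than a lock file avoids the check-then-create race: the existence test and the creation happen in a single mkdir call.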
5. Log Watcher for Critical Alerts
The log watcher tails the server log in real time and writes structured alerts when it detects known error patterns. Run it as a background daemon via launchd:
#!/bin/bash
# log-watcher.sh — tails server log, emits structured alerts
LOG_FILE="$HOME/ai-server/.monitor/server.log"
ALERTS_FILE="$HOME/ai-server/.monitor/alerts.jsonl"
# Wait for log to exist
while [ ! -f "$LOG_FILE" ]; do sleep 5; done
tail -F "$LOG_FILE" | while read -r line; do
TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%SZ)
case "$line" in
*"ENOMEM"*|*"out of memory"*)
echo "{\"ts\":\"$TIMESTAMP\",\"severity\":\"critical\",\"message\":\"OOM: $line\"}" >> "$ALERTS_FILE"
;;
*"EADDRINUSE"*|*"address already in use"*)
echo "{\"ts\":\"$TIMESTAMP\",\"severity\":\"error\",\"message\":\"Port conflict: $line\"}" >> "$ALERTS_FILE"
;;
*"Embedding API error"*|*"Failed to embed"*)
echo "{\"ts\":\"$TIMESTAMP\",\"severity\":\"warning\",\"source\":\"log-watcher\",\"message\":\"$line\"}" >> "$ALERTS_FILE"
;;
*"SIGKILL"*|*"process dead"*)
echo "{\"ts\":\"$TIMESTAMP\",\"severity\":\"critical\",\"message\":\"Process killed: $line\"}" >> "$ALERTS_FILE"
;;
esac
done
6. Remote Access with Tailscale
Tailscale creates a private WireGuard mesh network between your devices. Your Mac mini gets a stable hostname (e.g. mac-mini.your-tailnet.ts.net) that works from anywhere — phone, laptop, another server — without port forwarding or dynamic DNS.
Setup (5 minutes)
- Install Tailscale on the Mac mini: brew install tailscale
- Run tailscale up and authenticate
- Install Tailscale on your phone/laptop
- Connect to your AI server: curl -H "Authorization: Bearer your-auth-token" http://your-hostname.example-tailnet.ts.net:YOUR_PORT/health
iOS App Integration
Connect your iOS app to the agent server via the Tailscale hostname. The connection works on LTE, WiFi, and corporate networks without any router configuration. Auth with Bearer tokens in the Authorization header for security.
7. Monitoring Checklist
Before going to sleep with your AI agent running unsupervised:
- ✓ launchd plist loaded and process visible in Activity Monitor
- ✓ health-monitor.sh running in cron, log file growing
- ✓ log-watcher.sh running as background daemon
- ✓ Tailscale connected, hostname resolves from phone
- ✓ Disk has at least 50GB free (models + logs + workspace)
- ✓ Ollama responds to curl localhost:11434/api/tags
- ✓ Auth token in .env, not hardcoded in scripts
- ✓ Alerts JSONL file writable, not growing unbounded
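The last checklist item deserves a concrete mechanism. A minimal rotation sketch that keeps alerts.jsonl bounded, suitable for a daily cron entry (the 1MB threshold and 1000-line retention are assumptions):

```shell
#!/bin/bash
# rotate-alerts.sh — keep alerts.jsonl from growing unbounded
ALERTS_FILE="$HOME/ai-server/.monitor/alerts.jsonl"
MAX_BYTES=$((1024 * 1024))   # rotate once the file passes 1MB

if [ -f "$ALERTS_FILE" ]; then
  # stat -f%z is the macOS form; stat -c%s is the GNU/Linux fallback
  SIZE=$(stat -f%z "$ALERTS_FILE" 2>/dev/null || stat -c%s "$ALERTS_FILE")
  if [ "$SIZE" -gt "$MAX_BYTES" ]; then
    # keep only the most recent 1000 alerts; old ones are disposable
    tail -n 1000 "$ALERTS_FILE" > "$ALERTS_FILE.tmp" && mv "$ALERTS_FILE.tmp" "$ALERTS_FILE"
  fi
fi
```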
8. Common Failure Modes
Embedding API 500 errors
Ollama's embedding model (nomic-embed-text) occasionally returns 500s under memory pressure. These are usually transient. If persistent: restart Ollama with pkill ollama && ollama serve &. Check available memory — the embedding model needs ~500MB headroom.
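Because these 500s are usually transient, scripted embedding calls benefit from retry-with-backoff rather than an immediate alert. A sketch (the retry helper and its backoff schedule are my own, not part of Ollama):

```shell
#!/bin/bash
# retry.sh — run a command up to N times with exponential backoff
retry() {
  local attempts=$1; shift
  local delay=2 i
  for i in $(seq 1 "$attempts"); do
    if "$@"; then
      return 0            # command succeeded
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"      # back off before the next attempt
      delay=$((delay * 2))
    fi
  done
  return 1                # all attempts failed
}

# example: retry an embedding request up to 3 times before alerting
# retry 3 curl -sf http://localhost:11434/api/embeddings \
#   -d '{"model":"nomic-embed-text","prompt":"hello"}' > /dev/null
```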
Port 3000 already in use
Happens when launchd starts a new instance before the old one fully exits. Fix: lsof -ti:3000 | xargs kill -9, then restart via launchctl. If this recurs, raise the plist's ThrottleInterval (for example from 10 to 15 seconds) so the old instance has time to release the port.
Runaway memory consumption
An agent stuck in a loop can exhaust memory in hours. Set a max-memory limit in your agent runtime and add a heartbeat that kills+restarts the process if RSS exceeds a threshold. Monitor with ps aux | grep 'server' — watch the RSS column.
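The RSS heartbeat described above can be sketched as a small cron script of its own (the 8GB limit and the pgrep pattern are assumptions; match them to your agent):

```shell
#!/bin/bash
# rss-watchdog.sh — restart the agent if resident memory passes a limit
MAX_RSS_KB=$((8 * 1024 * 1024))   # 8GB; ps reports RSS in kilobytes

# find the agent by its command line (pattern is an assumption)
PID=$(pgrep -f "cli.ts server" | head -n 1)
if [ -n "$PID" ]; then
  RSS_KB=$(ps -o rss= -p "$PID" | tr -d ' ')
  if [ "${RSS_KB:-0}" -gt "$MAX_RSS_KB" ]; then
    echo "RSS ${RSS_KB}KB over limit, restarting" >&2
    launchctl kickstart -k "gui/$(id -u)/com.myagent.server"
  fi
fi
```

Restarting via launchctl kickstart (rather than kill) keeps launchd's supervision state consistent, so KeepAlive and ThrottleInterval still apply.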
Context window overflow
Very long multi-step agent tasks accumulate tool results until the context overflows. Implement a context summarizer that condenses older steps. Alert when token count exceeds 80% of the model's context limit. Truncate from the oldest messages first, preserving the system prompt and the most recent turns.
Conclusion
A Mac mini AI server with proper monitoring runs more reliably than most cloud deployments. The key is layered redundancy: launchd for process supervision, health checks for HTTP-level validation, log watching for error detection, and Tailscale for remote access. With this stack in place, your AI agent runs 24/7 with zero manual babysitting.
The monitoring scripts above are battle-tested patterns from production agent systems. Want pre-packaged monitoring skills you can install with one click? Check out the Skillgate marketplace — health monitoring, log alerting, and auto-recovery as drop-in agent skills.
Production monitoring as a Skillgate skill.
Health checks, auto-restart, log watching, and alerting — pre-built, tested, and ready to install.
Browse Monitoring Skills