Mac Mini AI Server Monitoring: Keep Your Agent Running 24/7
A Mac mini is the best value AI server money can buy in 2026 — 128GB unified memory, silent operation, and low power draw. But deploying a local LLM agent on bare metal means you're on the hook for keeping it alive. This guide covers the production monitoring stack used to run always-on AI agents without cloud babysitting.
What you'll build:
- Health check script that polls your server every 5 minutes
- Auto-restart via launchd when the process dies
- Log watcher that fires alerts on critical errors
- Disk, memory, and Ollama availability checks
- Remote access via Tailscale for monitoring from anywhere
1. Why Mac Mini for AI Agents?
The M4 Max Mac mini with 128GB unified memory can run 32B+ parameter models at full speed with no throttling, no cloud latency, and zero per-token costs after hardware purchase. For always-on autonomous agents — the kind that run decision loops, check email, monitor repos, and respond to webhooks — this is transformative.
The trade-off: you're the ops team. When the process crashes at 3am, there's no AWS auto-scaling group to restart it. You need a monitoring stack.
Hardware recommendation (2026):
- Mac mini M4 Max, 128GB — runs 70B models, ideal for production agents
- Mac mini M4 Pro, 64GB — runs 32B models well, good budget option
- Mac mini M4, 32GB — runs 14B models, fine for lightweight agents
2. The Monitoring Architecture
A production Mac mini AI server needs four monitoring layers:
Layer 1: Process Health
Is the server process running? Does it respond to HTTP health checks? Is Ollama reachable on port 11434? These checks run every 5 minutes via a cron job.
Layer 2: Resource Monitoring
Free memory, disk usage, and CPU load. LLM inference is memory-bound — if swap pressure builds up, inference grinds to a halt. Alert at 80% memory pressure, hard stop at 95%.
Layer 3: Log Watching
Tail the server log for critical errors — out of memory, port conflicts, authentication failures, and runaway loop detection. Write structured alerts to a JSONL file for queryability.
Layer 4: Auto-Recovery
launchd keeps the server process alive across crashes and reboots. The health monitor provides a second layer — if launchd doesn't catch it, the health check does.
3. Setting Up launchd for Auto-Restart
launchd is macOS's native process supervisor. It starts services at login, restarts them on crash, and survives reboots. Here's a complete plist for an AI agent server:
<!-- ~/Library/LaunchAgents/com.myagent.server.plist -->
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
"http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.myagent.server</string>
<key>ProgramArguments</key>
<array>
<string>/path/to/bun</string>
<string>run</string>
<string>/Users/you/my-agent/src/cli.ts</string>
<string>server</string>
</array>
<key>WorkingDirectory</key>
<string>/Users/you/my-agent</string>
<key>EnvironmentVariables</key>
<dict>
<key>PATH</key>
<string>/usr/local/bin:/usr/bin:/bin:/opt/homebrew/bin</string>
<key>NODE_ENV</key>
<string>production</string>
</dict>
<!-- Auto-restart on crash -->
<key>KeepAlive</key>
<dict>
<key>Crashed</key>
<true/>
</dict>
<!-- Throttle restarts to avoid rapid loops -->
<key>ThrottleInterval</key>
<integer>10</integer>
<!-- Log output -->
<key>StandardOutPath</key>
<string>/Users/you/my-agent/.monitor/server.log</string>
<key>StandardErrorPath</key>
<string>/Users/you/my-agent/.monitor/server.log</string>
<!-- Start at login -->
<key>RunAtLoad</key>
<true/>
</dict>
</plist>
Load it with:
launchctl load ~/Library/LaunchAgents/com.myagent.server.plist
launchctl start com.myagent.server
# Verify it's running
launchctl list | grep myagent
To restart manually after a config change:
launchctl kickstart -k "gui/$(id -u)/com.myagent.server"
4. Health Monitor Script
Run this every 5 minutes via cron. It checks server HTTP health, Ollama availability, disk space, and memory pressure — then writes structured logs and triggers auto-restart if needed:
#!/bin/bash
# health-monitor.sh — runs every 5 minutes via cron
SERVER_URL="http://localhost:YOUR_PORT"
AUTH_TOKEN="your-auth-token"
LOG_FILE="$HOME/ai-server/.monitor/health-monitor.log"
ALERTS_FILE="$HOME/ai-server/.monitor/alerts.jsonl"
TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%SZ)
# Check server HTTP health
SERVER_STATUS="fail"
if curl -sf -H "Authorization: Bearer $AUTH_TOKEN" "$SERVER_URL/health" --connect-timeout 5 --max-time 10 > /dev/null 2>&1; then
SERVER_STATUS="ok"
fi
# Check Ollama
OLLAMA_STATUS="fail"
if curl -sf http://localhost:11434/api/tags --connect-timeout 3 --max-time 8 > /dev/null 2>&1; then
OLLAMA_STATUS="ok"
fi
# Check disk (alert if > 80%)
DISK_PCT=$(df -h / | awk 'NR==2 {gsub(/%/,"",$5); print $5}')
DISK_STATUS="ok:${DISK_PCT}%"
[ "$DISK_PCT" -gt 80 ] && DISK_STATUS="warn:${DISK_PCT}%"
# Check memory (free pages × 16KB)
FREE_PAGES=$(vm_stat | awk '/Pages free/{gsub(/\./,"",$3); print $3}')
FREE_MB=$(( FREE_PAGES * 16384 / 1048576 ))
MEM_STATUS="ok:${FREE_MB}MB"
[ "$FREE_MB" -lt 4096 ] && MEM_STATUS="warn:${FREE_MB}MB"
# Log the check
echo "[$TIMESTAMP] Check: server=$SERVER_STATUS ollama=$OLLAMA_STATUS disk=$DISK_STATUS mem=$MEM_STATUS" >> "$LOG_FILE"
# Auto-restart if server is down
if [ "$SERVER_STATUS" = "fail" ]; then
echo "{\"ts\":\"$TIMESTAMP\",\"severity\":\"critical\",\"message\":\"Server down — attempting restart\"}" >> "$ALERTS_FILE"
launchctl kickstart -k "gui/$(id -u)/com.myagent.server" 2>/dev/null || {
# Fallback: kill whatever holds the server port, then relaunch directly
SERVER_PORT="${SERVER_URL##*:}"
lsof -ti:"$SERVER_PORT" | xargs kill 2>/dev/null
sleep 2
nohup bun run ~/ai-server/src/cli.ts server > ~/ai-server/.monitor/server.log 2>&1 &
}
sleep 15
if curl -sf -H "Authorization: Bearer $AUTH_TOKEN" "$SERVER_URL/health" > /dev/null 2>&1; then
echo "{\"ts\":\"$TIMESTAMP\",\"severity\":\"info\",\"message\":\"Server restarted successfully\"}" >> "$ALERTS_FILE"
fi
fi
Add to cron with crontab -e:
*/5 * * * * /Users/you/my-agent/scripts/health-monitor.sh
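One failure mode worth guarding against: cron will start a new run while a previous one is still blocked on a slow curl or the 15-second restart check. A minimal lock guard for the top of health-monitor.sh prevents overlapping runs (the lock path is an assumption; match it to your layout):

```shell
#!/bin/bash
# lock-guard snippet for the top of health-monitor.sh
LOCK_DIR="$HOME/ai-server/.monitor/health-monitor.lock"
mkdir -p "$(dirname "$LOCK_DIR")"        # make sure .monitor exists

# mkdir without -p is atomic: it either creates the lock directory
# or fails because another run already holds it
if ! mkdir "$LOCK_DIR" 2>/dev/null; then
  echo "previous health check still running, skipping" >&2
  exit 0
fi
trap 'rmdir "$LOCK_DIR"' EXIT            # release the lock on any exit path
```

Using a directory rather than a lock file avoids the check-then-create race: the existence test and the creation happen in a single mkdir call.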
5. Log Watcher for Critical Alerts
The log watcher tails the server log in real time and writes structured alerts when it detects known error patterns. Run it as a background daemon via launchd:
#!/bin/bash
# log-watcher.sh — tails server log, emits structured alerts
LOG_FILE="$HOME/ai-server/.monitor/server.log"
ALERTS_FILE="$HOME/ai-server/.monitor/alerts.jsonl"
# Wait for log to exist
while [ ! -f "$LOG_FILE" ]; do sleep 5; done
tail -F "$LOG_FILE" | while read -r line; do
TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%SZ)
case "$line" in
*"ENOMEM"*|*"out of memory"*)
echo "{\"ts\":\"$TIMESTAMP\",\"severity\":\"critical\",\"message\":\"OOM: $line\"}" >> "$ALERTS_FILE"
;;
*"EADDRINUSE"*|*"address already in use"*)
echo "{\"ts\":\"$TIMESTAMP\",\"severity\":\"error\",\"message\":\"Port conflict: $line\"}" >> "$ALERTS_FILE"
;;
*"Embedding API error"*|*"Failed to embed"*)
echo "{\"ts\":\"$TIMESTAMP\",\"severity\":\"warning\",\"source\":\"log-watcher\",\"message\":\"$line\"}" >> "$ALERTS_FILE"
;;
*"SIGKILL"*|*"process dead"*)
echo "{\"ts\":\"$TIMESTAMP\",\"severity\":\"critical\",\"message\":\"Process killed: $line\"}" >> "$ALERTS_FILE"
;;
esac
done
6. Remote Access with Tailscale
Tailscale creates a private WireGuard mesh network between your devices. Your Mac mini gets a stable hostname (e.g. mac-mini.your-tailnet.ts.net) that works from anywhere — phone, laptop, another server — without port forwarding or dynamic DNS.
Setup (5 minutes)
- Install Tailscale on the Mac mini: brew install tailscale
- Run tailscale up and authenticate
- Install Tailscale on your phone/laptop
- Connect to your AI server: curl -H "Authorization: Bearer your-auth-token" http://your-hostname.example-tailnet.ts.net:YOUR_PORT/health
iOS App Integration
Connect your iOS app to the agent server via the Tailscale hostname. The connection works on LTE, WiFi, and corporate networks without any router configuration. Auth with Bearer tokens in the Authorization header for security.
7. Monitoring Checklist
Before going to sleep with your AI agent running unsupervised:
- ✓ launchd plist loaded and process visible in Activity Monitor
- ✓ health-monitor.sh running in cron, log file growing
- ✓ log-watcher.sh running as background daemon
- ✓ Tailscale connected, hostname resolves from phone
- ✓ Disk has at least 50GB free (models + logs + workspace)
- ✓ Ollama responds to curl localhost:11434/api/tags
- ✓ Auth token in .env, not hardcoded in scripts
- ✓ Alerts JSONL file writable, not growing unbounded
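The last checklist item deserves a concrete mechanism. A minimal rotation sketch that keeps alerts.jsonl bounded, suitable for a daily cron entry (the 1MB threshold and 1000-line retention are assumptions):

```shell
#!/bin/bash
# rotate-alerts.sh — keep alerts.jsonl from growing unbounded
ALERTS_FILE="$HOME/ai-server/.monitor/alerts.jsonl"
MAX_BYTES=$((1024 * 1024))   # rotate once the file passes 1MB

if [ -f "$ALERTS_FILE" ]; then
  # stat -f%z is the macOS form; stat -c%s is the GNU/Linux fallback
  SIZE=$(stat -f%z "$ALERTS_FILE" 2>/dev/null || stat -c%s "$ALERTS_FILE")
  if [ "$SIZE" -gt "$MAX_BYTES" ]; then
    # keep only the most recent 1000 alerts; old ones are disposable
    tail -n 1000 "$ALERTS_FILE" > "$ALERTS_FILE.tmp" && mv "$ALERTS_FILE.tmp" "$ALERTS_FILE"
  fi
fi
```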
8. Common Failure Modes
Embedding API 500 errors
Ollama's embedding model (nomic-embed-text) occasionally returns 500s under memory pressure. These are usually transient. If persistent: restart Ollama with pkill ollama && ollama serve &. Check available memory — the embedding model needs ~500MB headroom.
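Because these 500s are usually transient, scripted embedding calls benefit from retry-with-backoff rather than an immediate alert. A sketch (the retry helper and its backoff schedule are my own, not part of Ollama):

```shell
#!/bin/bash
# retry.sh — run a command up to N times with exponential backoff
retry() {
  local attempts=$1; shift
  local delay=2 i
  for i in $(seq 1 "$attempts"); do
    if "$@"; then
      return 0            # command succeeded
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"      # back off before the next attempt
      delay=$((delay * 2))
    fi
  done
  return 1                # all attempts failed
}

# example: retry an embedding request up to 3 times before alerting
# retry 3 curl -sf http://localhost:11434/api/embeddings \
#   -d '{"model":"nomic-embed-text","prompt":"hello"}' > /dev/null
```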
Port 3000 already in use
Happens when launchd starts a new instance before the old one fully exits. Fix: lsof -ti:3000 | xargs kill -9, then restart via launchctl. If this recurs, raise the plist's ThrottleInterval (for example from 10 to 15 seconds) so the old instance has time to release the port.
Runaway memory consumption
An agent stuck in a loop can exhaust memory in hours. Set a max-memory limit in your agent runtime and add a heartbeat that kills+restarts the process if RSS exceeds a threshold. Monitor with ps aux | grep 'server' — watch the RSS column.
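The RSS heartbeat described above can be sketched as a small cron script of its own (the 8GB limit and the pgrep pattern are assumptions; match them to your agent):

```shell
#!/bin/bash
# rss-watchdog.sh — restart the agent if resident memory passes a limit
MAX_RSS_KB=$((8 * 1024 * 1024))   # 8GB; ps reports RSS in kilobytes

# find the agent by its command line (pattern is an assumption)
PID=$(pgrep -f "cli.ts server" | head -n 1)
if [ -n "$PID" ]; then
  RSS_KB=$(ps -o rss= -p "$PID" | tr -d ' ')
  if [ "${RSS_KB:-0}" -gt "$MAX_RSS_KB" ]; then
    echo "RSS ${RSS_KB}KB over limit, restarting" >&2
    launchctl kickstart -k "gui/$(id -u)/com.myagent.server"
  fi
fi
```

Restarting via launchctl kickstart (rather than kill) keeps launchd's supervision state consistent, so KeepAlive and ThrottleInterval still apply.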
Context window overflow
Very long multi-step agent tasks accumulate tool results until the context overflows. Implement a context summarizer that condenses older steps. Alert when token count exceeds 80% of the model's context limit. Truncate from the oldest messages first, preserving the system prompt and the most recent turns.
Conclusion
A Mac mini AI server with proper monitoring runs more reliably than most cloud deployments. The key is layered redundancy: launchd for process supervision, health checks for HTTP-level validation, log watching for error detection, and Tailscale for remote access. With this stack in place, your AI agent runs 24/7 with zero manual babysitting.
The monitoring scripts above are battle-tested patterns from production agent systems. Want pre-packaged monitoring skills you can install with one click? Check out the Skillgate marketplace — health monitoring, log alerting, and auto-recovery as drop-in agent skills.
Production monitoring as a Skillgate skill.
Health checks, auto-restart, log watching, and alerting — pre-built, tested, and ready to install.
Browse Monitoring Skills