This change adds a health monitoring and automatic recovery system for the file watcher service, so that the watcher keeps running and restarts automatically if it crashes.
Backend Service - /apps/service/src/watcher-health.service.ts
Frontend Component - /apps/web/src/app/components/WatcherHealthStatus.tsx
Documentation - /docs/WATCHER_HEALTH_MONITORING.md
App Module - /apps/service/src/app.module.ts
  Added WatcherHealthService to providers
App Service - /apps/service/src/app.service.ts
  watcherHealthStatus()
  watcherRecentErrors(limit?)
  clearWatcherErrors()
  setWatcherAutoRecovery(enabled)
  isWatcherAutoRecoveryEnabled()
App Controller - /apps/service/src/app.controller.ts
  GET /watcher/health - Get health status
  GET /watcher/errors - List recent errors
  DELETE /watcher/errors - Clear error logs
  POST /watcher/auto-recovery - Set auto-recovery status
  GET /watcher/auto-recovery - Get auto-recovery status
Watcher Service - /apps/service/src/watcher.service.ts
  Added a ready event listener to log when the watcher is ready
Stats Section - /apps/web/src/app/components/StatsSection.tsx
  Added the WatcherHealthStatus component
watcher_errors database table

User starts watcher
↓
WatcherHealthService begins monitoring
↓
Every 30 seconds: Health check runs
↓
Is watcher still running?
├─ YES: Continue monitoring (no action)
└─ NO:
   ├─ Log error to database
   ├─ Emit WebSocket alert to UI
   └─ If auto-recovery enabled:
      ├─ Attempt restart with last config
      ├─ Log recovery attempt
      ├─ If successful: Reset attempt counter
      └─ If failed: Increment counter (max 5/hour)
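The flow above can be sketched in TypeScript roughly as follows. This is a minimal illustration, not the code in watcher-health.service.ts: the `Watcher` interface, the `WatcherHealthMonitor` class, the in-memory `events` array (standing in for the database log), and the sliding one-hour window used to enforce the limit are all assumptions made for the example. The WebSocket alert is only marked with a comment.

```typescript
// Hypothetical sketch of the health-check flow; all names are illustrative.
interface Watcher {
  isRunning(): boolean;
  restart(): boolean; // returns true if the restart succeeded
}

interface HealthEvent {
  at: number;
  message: string;
  recoveryAttempt: boolean;
}

type CheckResult = "healthy" | "down" | "recovered" | "recovery-failed";

const MAX_FAILED_ATTEMPTS_PER_HOUR = 5;
const HOUR_MS = 60 * 60 * 1000;

class WatcherHealthMonitor {
  autoRecovery = true; // enabled by default
  readonly events: HealthEvent[] = []; // stands in for the watcher_errors table
  private failedAttempts: number[] = []; // timestamps of failed restarts
  private readonly watcher: Watcher;
  private readonly now: () => number;

  constructor(watcher: Watcher, now: () => number = Date.now) {
    this.watcher = watcher;
    this.now = now;
  }

  // One health check; a real service would call this every 30 seconds.
  check(): CheckResult {
    if (this.watcher.isRunning()) return "healthy"; // YES branch: no action

    const t = this.now();
    this.events.push({ at: t, message: "Watcher is not running", recoveryAttempt: false });
    // (a WebSocket alert to the UI would be emitted here)

    if (!this.autoRecovery) return "down";

    // Forget failures older than one hour, then enforce the hourly limit.
    this.failedAttempts = this.failedAttempts.filter((ts) => t - ts < HOUR_MS);
    if (this.failedAttempts.length >= MAX_FAILED_ATTEMPTS_PER_HOUR) return "down";

    const ok = this.watcher.restart();
    this.events.push({
      at: t,
      message: ok ? "Recovery succeeded" : "Recovery failed",
      recoveryAttempt: true,
    });

    if (ok) {
      this.failedAttempts = []; // success resets the counter
      return "recovered";
    }
    this.failedAttempts.push(t); // failure increments the counter
    return "recovery-failed";
  }
}
```

In a running service, something like `setInterval(() => monitor.check(), 30_000)` would drive the check. Injecting the clock through the constructor keeps the rate limit testable without waiting an hour.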
Auto-recovery is enabled by default. Users can:
  Disable it through the API: POST /watcher/auto-recovery { "enabled": false }
  Set the watcher_auto_recovery setting to false
To simulate a crash, kill the watcher process: pkill -f "watcher"

# Get health status
curl http://localhost:3001/watcher/health
# Get recent errors
curl "http://localhost:3001/watcher/errors?limit=20"
# Clear error logs
curl -X DELETE http://localhost:3001/watcher/errors
# Enable auto-recovery
curl -X POST http://localhost:3001/watcher/auto-recovery \
  -H "Content-Type: application/json" \
  -d '{"enabled": true}'
# Check auto-recovery status
curl http://localhost:3001/watcher/auto-recovery
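
For callers that prefer TypeScript over curl, the same requests could be described with a small helper like the one below. Only the paths, HTTP methods, and the `{ "enabled": ... }` body come from the endpoints listed above. The `watcherApi` and `send` names, the request object shape, and the injected `doFetch` parameter are hypothetical. The response format is not documented here, so the sketch returns the body as text.

```typescript
// Hypothetical request helper; names and shapes are assumptions for illustration.
interface RequestInitLike {
  method: string;
  headers?: Record<string, string>;
  body?: string;
}

interface WatcherRequest {
  url: string;
  init: RequestInitLike;
}

interface ResponseLike {
  ok: boolean;
  status: number;
  text(): Promise<string>;
}

function watcherApi(baseUrl: string) {
  // Strip trailing slashes so paths join cleanly.
  const root = baseUrl.replace(/\/+$/, "");
  const get = (path: string): WatcherRequest => ({ url: root + path, init: { method: "GET" } });

  return {
    health: () => get("/watcher/health"),
    errors: (limit?: number) =>
      get("/watcher/errors" + (limit === undefined ? "" : `?limit=${encodeURIComponent(String(limit))}`)),
    clearErrors: (): WatcherRequest => ({
      url: root + "/watcher/errors",
      init: { method: "DELETE" },
    }),
    setAutoRecovery: (enabled: boolean): WatcherRequest => ({
      url: root + "/watcher/auto-recovery",
      init: {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ enabled }),
      },
    }),
    getAutoRecovery: () => get("/watcher/auto-recovery"),
  };
}

// Sends a request with a caller-supplied fetch-compatible function.
async function send(
  req: WatcherRequest,
  doFetch: (url: string, init: RequestInitLike) => Promise<ResponseLike>,
): Promise<string> {
  const res = await doFetch(req.url, req.init);
  if (!res.ok) {
    throw new Error(`${req.init.method} ${req.url} failed with status ${res.status}`);
  }
  return res.text();
}
```

On a runtime with a global `fetch`, a call might look like `await send(watcherApi("http://localhost:3001").health(), fetch)`.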
✅ Build successful - All TypeScript compiles without errors
✅ Tests - Existing tests continue to pass
✅ No breaking changes - Fully backward compatible
New table created automatically on first run:
CREATE TABLE IF NOT EXISTS watcher_errors (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp TEXT NOT NULL,
  message TEXT NOT NULL,
  recovery_attempt INTEGER DEFAULT 0,
  created_at TEXT NOT NULL
);
No existing tables or data are affected.
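
As a rough illustration of how application code might work with this table, the sketch below mirrors the columns in a TypeScript type and builds parameterized statements. The `WatcherErrorRow`, `buildInsert`, and `buildRecentQuery` names are hypothetical. Storing the TEXT timestamps as ISO 8601 strings and ordering recent errors by `id` are assumptions, and the exact meaning of `recovery_attempt` (flag or counter) is not stated in this summary.

```typescript
// Row shape mirroring the watcher_errors columns (names from the CREATE TABLE).
interface WatcherErrorRow {
  id?: number; // assigned by the database (AUTOINCREMENT)
  timestamp: string;
  message: string;
  recovery_attempt: number; // defaults to 0 in the schema
  created_at: string;
}

interface Statement {
  sql: string;
  params: ReadonlyArray<string | number>;
}

// Builds a parameterized INSERT. Assumption: timestamps are ISO 8601 strings.
function buildInsert(message: string, recoveryAttempt = 0, now: Date = new Date()): Statement {
  const iso = now.toISOString();
  return {
    sql:
      "INSERT INTO watcher_errors (timestamp, message, recovery_attempt, created_at) " +
      "VALUES (?, ?, ?, ?)",
    params: [iso, message, recoveryAttempt, iso],
  };
}

// Builds a query for the most recent errors. Assumption: a higher id is newer.
function buildRecentQuery(limit = 20): Statement {
  const n = Number.isFinite(limit) ? Math.max(1, Math.floor(limit)) : 20;
  return {
    sql:
      "SELECT id, timestamp, message, recovery_attempt, created_at " +
      "FROM watcher_errors ORDER BY id DESC LIMIT ?",
    params: [n],
  };
}
```

Passing values as parameters instead of concatenating them into the SQL string keeps error messages that contain quotes from breaking the statement.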
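Users can now check watcher health, list and clear recent watcher errors, and turn auto-recovery on or off through the endpoints listed above.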
For detailed information, see /docs/WATCHER_HEALTH_MONITORING.md