Server Monitoring Control Room

Real-time visibility for infrastructure health, container workload performance, and alert readiness.

Prometheus, Grafana, Alertmanager, Node Exporter, cAdvisor

Primary Dashboards

System Overview

High-level health, uptime, CPU, memory, and network throughput in one place.

Open Overview

Containers & Services

Top container resource consumers, total container load, and service activity.

Open Containers

Storage & Network

Disk utilization, IO throughput, inode pressure, and network reliability.

Open Storage

Availability & Probes

HTTP probe uptime, latency, and endpoint health checks.

Open Availability

Alerts & Incidents

Current firing alerts, pending alerts, and target status.

Open Alerts

Logs Explorer

Unified system and container logs powered by Loki.

Open Logs

Blackbox Probes

SSL expiry and external endpoint checks with trendlines.

Open Blackbox

Legacy System Health

Original system health dashboard retained for quick comparisons.

Open Legacy View

Quick Access

Grafana Workspace

Explore dashboards, alerts, and data sources.

Open Grafana

Prometheus UI

Run ad-hoc queries and inspect scrape health.

Open Prometheus
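
The scrape-health check available in the Prometheus UI can also be run over the Prometheus HTTP API. The sketch below is a minimal illustration, not part of this deployment: it queries the built-in `up` metric through `/api/v1/query` and lists targets whose last scrape failed. The base URL `http://localhost:9090` is an assumption (the Prometheus default port), and the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: Prometheus is reachable on its default port on this host.
PROMETHEUS_URL = "http://localhost:9090"


def build_query_url(base_url: str, promql: str) -> str:
    """Return the instant-query URL for a PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def down_targets(payload: dict) -> list[str]:
    """List 'job/instance' for each series of an `up` query whose value is 0."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    down = []
    for series in payload["data"]["result"]:
        # Each instant-vector sample is a [timestamp, "value"] pair.
        _timestamp, value = series["value"]
        if float(value) == 0:
            labels = series["metric"]
            down.append(f"{labels.get('job', '?')}/{labels.get('instance', '?')}")
    return down


def fetch_down_targets(base_url: str = PROMETHEUS_URL) -> list[str]:
    """Run the `up` query against a live Prometheus and report failed targets."""
    with urlopen(build_query_url(base_url, "up"), timeout=5) as response:
        return down_targets(json.load(response))
```

Calling `fetch_down_targets()` against a running instance returns an empty list when every target is being scraped successfully.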

Alertmanager

Review active alerts and routing rules.

Open Alertmanager

Node Metrics

Raw exporter endpoint for troubleshooting.

View Metrics

Service Status

Prometheus: Collecting

Grafana: Online

Alertmanager: Monitoring

Node Exporter: Exporting

cAdvisor: Tracking

Operations Notes

Default Grafana credentials should be changed.

Prometheus retention is tuned for lower disk usage: 15d of history or a 2GB cap, whichever limit is reached first.
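
To judge whether the 15d window and the 2GB cap fit together, the Prometheus capacity rule of thumb (disk needed = retention seconds × samples per second × bytes per sample) can be turned around. The sketch below is a rough estimate only. It assumes the cap is counted in binary units and that a compressed sample costs about 2 bytes, the upper end of the 1 to 2 bytes the Prometheus documentation quotes. The function names are hypothetical. Prometheus normally takes these limits from the `--storage.tsdb.retention.time` and `--storage.tsdb.retention.size` flags, but how this deployment sets them is not shown here.

```python
# Assumptions: 2GB means 2 * 1024**3 bytes, and each compressed sample costs
# roughly 2 bytes on disk. Both are estimates, not measurements.
RETENTION_SECONDS = 15 * 24 * 60 * 60
SIZE_CAP_BYTES = 2 * 1024**3


def max_sustained_ingest(size_cap_bytes: float, retention_seconds: float,
                         bytes_per_sample: float = 2.0) -> float:
    """Samples per second that fit under the size cap for the whole window."""
    if retention_seconds <= 0 or bytes_per_sample <= 0:
        raise ValueError("retention and bytes_per_sample must be positive")
    return size_cap_bytes / (retention_seconds * bytes_per_sample)


def days_until_cap(size_cap_bytes: float, samples_per_second: float,
                   bytes_per_sample: float = 2.0) -> float:
    """Days of history that fit under the size cap at a given ingest rate."""
    if samples_per_second <= 0 or bytes_per_sample <= 0:
        raise ValueError("ingest rate and bytes_per_sample must be positive")
    return size_cap_bytes / (samples_per_second * bytes_per_sample) / 86400
```

If the real ingest rate is above what `max_sustained_ingest(SIZE_CAP_BYTES, RETENTION_SECONDS)` returns, the size cap will remove old data before it reaches 15 days.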

Need a new target? Add it to prometheus.yml under /opt/server-monitoring.
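
As an illustration of what a new target entry looks like, the sketch below renders a static scrape job in the shape Prometheus expects under `scrape_configs`. The job name and address in the usage note are hypothetical placeholders, and the helper is not part of this deployment. The existing prometheus.yml may already have a job that the target belongs in.

```python
import json


def render_scrape_job(job_name: str, targets: list[str]) -> str:
    """Render a static scrape job as a YAML fragment for `scrape_configs`.

    JSON string and list syntax is valid YAML flow syntax, so json.dumps
    is used here to quote the values safely.
    """
    if not targets:
        raise ValueError("at least one target is required")
    return "\n".join([
        f"  - job_name: {json.dumps(job_name)}",
        "    static_configs:",
        f"      - targets: {json.dumps(targets)}",
    ])
```

For example, `print(render_scrape_job("my-service", ["10.0.0.5:9100"]))` produces a fragment to paste under `scrape_configs:`. Prometheus picks up the change only after a restart or a configuration reload.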