Saneops ingests Grafana Alerting webhooks, correlates related firings into a single incident, deduplicates flapping alerts, and drafts a first-pass root-cause analysis (RCA) using your own LLM API key. The result: roughly 80% fewer pages reaching humans, with each remaining page already enriched with context.
How the Grafana → Saneops pipeline works
- Webhook receiver: Each Saneops tenant has a unique inbound URL of the form https://app.saneops.in/webhooks/grafana/<tenant-token>. Add it as a Contact Point in Grafana Alerting under Alerts & IRM → Contact points → Webhook.
- Payload normalisation: Saneops parses the standard Alertmanager-format payload Grafana sends: status, labels, annotations, fingerprint, startsAt, endsAt. Both firing and resolved transitions are honoured.
- Correlation: Two alerts within a configurable time window (default 10 minutes) sharing strong labels (service, namespace, cluster, deployment, job, app, pod) cluster into one incident. Semantic similarity over the alert text adds a fallback signal.
- Deduplication: Identical fingerprints arriving in close succession (flapping) are merged. Saneops counts the dedup hits but doesn't page on them.
- Auto-resolve: If Grafana sends status: resolved (requires send_resolved: true on the receiver), Saneops closes the incident automatically. As a safety net, idle incidents are also auto-closed after 24 hours by default.
- LLM RCA: Once an incident reaches a configurable severity threshold, Saneops asks your tenant's LLM (Anthropic, OpenAI, Gemini, Grok, DeepSeek, or local Ollama) for a first-draft root-cause summary. The full prompt is auditable.
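The correlation and deduplication steps can be pictured with a minimal Python sketch. This is not Saneops's implementation. It assumes that two alerts correlate when they agree on at least one strong label key and value, that the 10-minute window slides from the incident's most recent event, and it leaves out the semantic-similarity fallback. The Incident class and the ingest function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Strong labels as listed in the pipeline description.
STRONG_LABELS = {"service", "namespace", "cluster", "deployment", "job", "app", "pod"}
WINDOW = timedelta(minutes=10)  # documented default correlation window


@dataclass
class Incident:  # hypothetical structure, for illustration only
    last_seen: datetime
    alerts: list = field(default_factory=list)
    fingerprints: set = field(default_factory=set)
    strong: set = field(default_factory=set)
    dedup_hits: int = 0


def strong_pairs(labels):
    """Return the (key, value) pairs of an alert that count as strong labels."""
    return {(k, v) for k, v in labels.items() if k in STRONG_LABELS}


def ingest(incidents, alert, now):
    """Merge a flapping alert, attach it to a related incident, or open a new one."""
    # Assumption: only incidents with activity inside the window are candidates.
    live = [inc for inc in incidents if now - inc.last_seen <= WINDOW]

    # Deduplication: a fingerprint already seen is counted, not added again.
    for inc in live:
        if alert["fingerprint"] in inc.fingerprints:
            inc.dedup_hits += 1
            inc.last_seen = now
            return inc

    # Correlation: assumed rule is "shares at least one strong label pair".
    pairs = strong_pairs(alert["labels"])
    target = next((inc for inc in live if inc.strong & pairs), None)
    if target is None:
        target = Incident(last_seen=now)
        incidents.append(target)

    target.alerts.append(alert)
    target.fingerprints.add(alert["fingerprint"])
    target.strong |= pairs
    target.last_seen = now
    return target
```

Under these assumptions, two alerts for the same service arriving two minutes apart land in one incident, and a repeat of the first fingerprint only increments dedup_hits.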
Want to see this on your own alert stream? Saneops is free for the first 1,000 alerts/month — no card, BYOK LLM, Docker self-host or hosted cloud.
Setup — five minutes end-to-end
1. Create a Saneops tenant
Sign up at app.saneops.in/signup. The signup form gives you a slug, an admin user, and a tenant-scoped API key on submit. No credit card.
2. Copy your Grafana webhook URL
In the Saneops UI: Integrations → Grafana → Connect. The page displays your tenant's exact webhook URL plus a one-paste sample payload you can curl to verify connectivity before wiring Grafana.
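If you would rather script the connectivity check than paste a curl command, a rough Python equivalent could look like the sketch below. The payload is a hand-written stand-in limited to the fields named earlier (status, labels, annotations, fingerprint, startsAt, endsAt), not the actual sample shown on the Connect page, and build_test_request is a hypothetical helper name. The alert name, summary, and fingerprint values are made up.

```python
import json
import urllib.request


def build_test_request(webhook_url: str) -> urllib.request.Request:
    """Build a POST request carrying a minimal Alertmanager-style payload."""
    payload = {
        "status": "firing",
        "alerts": [
            {
                "status": "firing",
                # Illustrative values only; use the sample from the Connect page if available.
                "labels": {"alertname": "HighErrorRate", "service": "checkout"},
                "annotations": {"summary": "5xx rate above threshold"},
                "fingerprint": "0123456789abcdef",
                "startsAt": "2024-01-01T12:00:00Z",
                # Zero timestamp, as commonly sent while an alert is still firing.
                "endsAt": "0001-01-01T00:00:00Z",
            }
        ],
    }
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# To actually send it, substitute your tenant's URL and uncomment:
# with urllib.request.urlopen(build_test_request("https://app.saneops.in/webhooks/grafana/<your-token>"), timeout=10) as resp:
#     print(resp.status)
```

Building the request separately from sending it lets you inspect the body before anything leaves your machine.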
3. Add the contact point in Grafana
Grafana > Alerts & IRM > Contact points > New contact point
Name: Saneops
Integration: Webhook
URL: https://app.saneops.in/webhooks/grafana/<your-token>
HTTP Method: POST
Max alerts: 0 (no truncation)
Disable resolved messages: NO ← important; without this, Saneops can't auto-close
4. Route alert rules to it
Create or edit a notification policy that points to the Saneops contact point. You can route everything, or scope by labels — Saneops handles either.
5. Verify
Trigger a test alert from Grafana (Test button on the contact point). Within seconds the Saneops Webhook inspector shows the raw payload, and a corresponding incident appears under Incidents.
What you get back from Saneops
- One incident per real outage — not one per Grafana alert rule firing.
- Common-label distillation — the labels every clustered alert agrees on, surfaced at the incident level.
- Service rollup — incidents tagged with the affected services so blast-radius is obvious.
- LLM-drafted RCA — a 3-bullet first-draft summary at the top of every incident with severity ≥ high.
- Outbound notifications — fan back out to Slack, Microsoft Teams, email, PagerDuty Events v2, OpsGenie, Zenduty, or generic webhook only when an incident actually warrants paging a human.
- Workflow automation — runbook DAGs that fire on incident events: auto-acknowledge low severity, post Slack messages with templated context, page on-call only when severity ≥ critical, etc.
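Common-label distillation, as described in the list above, amounts to intersecting the label sets of the clustered alerts. The following Python sketch shows one plausible way to compute it, assuming each alert carries a labels dictionary as in the Alertmanager-format payload. The function name common_labels is hypothetical, and this is not necessarily how Saneops computes it.

```python
def common_labels(alerts):
    """Return the labels whose key and value are identical across every alert."""
    if not alerts:
        return {}
    # Start from the first alert's labels and drop anything a later alert disagrees on.
    common = dict(alerts[0]["labels"])
    for alert in alerts[1:]:
        labels = alert["labels"]
        common = {k: v for k, v in common.items() if labels.get(k) == v}
    return common


clustered = [
    {"labels": {"service": "checkout", "cluster": "prod-eu", "pod": "checkout-1"}},
    {"labels": {"service": "checkout", "cluster": "prod-eu", "pod": "checkout-2"}},
]
print(common_labels(clustered))  # the pod label differs, so only service and cluster remain
```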
Enable send_resolved: true on your Grafana receiver (in the contact point form, leave Disable resolved messages set to NO, as in step 3). Without it, Saneops has to fall back to the 24-hour idle-resolve sweep, and incidents stay visible longer than they should.
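The difference between the two closing paths can be illustrated with a small Python sketch. It assumes an incident closes once every fingerprint it contains has reported resolved, and that the idle sweep closes any open incident with no events for 24 hours. The dictionary layout and the names new_incident, apply_event, and idle_sweep are hypothetical, chosen for illustration rather than taken from Saneops.

```python
from datetime import datetime, timedelta

IDLE_TIMEOUT = timedelta(hours=24)  # documented default for the idle sweep


def new_incident(now):
    """Create an open incident with no firing alerts yet (illustrative layout)."""
    return {"state": "open", "firing": set(), "last_event": now, "closed_by": None}


def apply_event(incident, fingerprint, status, now):
    """Track firing and resolved transitions; close when nothing is left firing."""
    incident["last_event"] = now
    if status == "firing":
        incident["firing"].add(fingerprint)
    elif status == "resolved":
        incident["firing"].discard(fingerprint)
        if not incident["firing"]:
            incident["state"] = "closed"
            incident["closed_by"] = "resolved"


def idle_sweep(incidents, now):
    """Close open incidents that have been silent for the idle timeout."""
    swept = []
    for inc in incidents:
        if inc["state"] == "open" and now - inc["last_event"] >= IDLE_TIMEOUT:
            inc["state"] = "closed"
            inc["closed_by"] = "idle-sweep"
            swept.append(inc)
    return swept
```

With resolved messages arriving, an incident closes minutes after recovery. Without them, the same incident stays open until the sweep reaches it a day later.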