The Alert Nobody Reads
Your Slack channel has 147 unread error alerts. Your team muted it last Tuesday. The one critical alert about payment failures is buried between 146 alerts about a harmless deprecation warning.
This is alert fatigue, and it's the silent killer of error tracking ROI. You invested in tooling, configured alerts, and now your team ignores all of them. The tool is technically working — it's catching every error — but it's operationally useless.
Why Alert Fatigue Happens
Alert fatigue is a configuration problem before it is a people problem. It happens when:
1. You Alert on Every New Error
The default in most tools is "alert me when a new error appears." This sounds reasonable until you realize that a single bad deploy can generate 50 new error types in 10 minutes. Your Slack channel explodes, and the team learns to ignore it.
2. You Don't Filter by Severity
An info-level log message and a fatal crash both trigger the same alert. When everything is urgent, nothing is urgent.
3. You Alert on Volume, Not Change
"You had 1,000 errors today" is meaningless if you had 1,000 errors yesterday too. What matters is *change* — "errors increased 300% in the last hour."
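The change-based check is easy to express in code. A minimal sketch (the function name and defaults are illustrative, not from any specific tool):

```python
def is_spike(current_hour_count: int, previous_hour_count: int,
             threshold_pct: float = 200.0) -> bool:
    """Alert on relative change, not absolute volume."""
    if previous_hour_count == 0:
        # No baseline to compare against: any nonzero count deserves a look.
        return current_hour_count > 0
    change_pct = (current_hour_count - previous_hour_count) / previous_hour_count * 100
    return change_pct > threshold_pct
```

With this check, 1,000 errors today after 1,000 yesterday stays quiet, while a jump from 100 to 400 (a 300% increase) fires.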
4. You Never Clean Up Resolved Alerts
Old alert rules for bugs you've already fixed keep firing on edge cases. The noise grows over time.
The Alert Rules That Actually Work
Here are the only 4 alert rules you need:
Rule 1: Fatal Crash (Immediate)
Condition: New issue with level = fatal
Action: Slack + email
Threshold: 1 event
Fatal means the process died. This is always urgent. Keep this channel sacred — if it pings, something is genuinely broken.
Rule 2: Error Spike (Within 15 Minutes)
Condition: Error rate increases > 200% vs. last hour
Action: Slack
Window: 15 minutes
This catches bad deploys and infrastructure issues. A 200% threshold filters out normal fluctuation while catching real spikes.
Rule 3: New High-Frequency Error (Within 1 Hour)
Condition: New error with > 50 events in 60 minutes
Action: Slack
Window: 60 minutes
New errors that happen frequently are worth investigating. New errors that happen once are usually noise.
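Rule 3 amounts to a sliding-window counter per issue: stay quiet until the event rate crosses the threshold. A self-contained sketch (class and method names are illustrative, not a Bugsly API):

```python
import time
from collections import deque
from typing import Optional

class FrequencyGate:
    """Suppress a new error's alert until it crosses a rate threshold."""

    def __init__(self, min_events: int = 50, window_seconds: int = 3600):
        self.min_events = min_events
        self.window_seconds = window_seconds
        self.timestamps: deque = deque()

    def record(self, now: Optional[float] = None) -> bool:
        """Record one event; return True when the rule should fire."""
        now = time.time() if now is None else now
        self.timestamps.append(now)
        # Drop events that have aged out of the window.
        while self.timestamps and now - self.timestamps[0] > self.window_seconds:
            self.timestamps.popleft()
        return len(self.timestamps) >= self.min_events
```

A one-off error never reaches the threshold, so it never pings anyone; a burst of 50+ events inside the hour does.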
Rule 4: Weekly Digest (Scheduled)
Condition: Scheduled summary
Action: Email
Schedule: Monday 9 AM
A weekly summary of top unresolved issues, error trends, and resolved-vs-new counts. This is for planning, not incident response.
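The four rules above can be captured declaratively. The field names below are illustrative; map them onto whatever schema your error tracker exposes:

```python
# Declarative sketch of the four alert rules. Field names are
# illustrative, not a specific tool's schema.
ALERT_RULES = [
    {
        "name": "fatal-crash",
        "condition": {"type": "new_issue", "level": "fatal"},
        "threshold": 1,
        "actions": ["slack", "email"],
    },
    {
        "name": "error-spike",
        "condition": {"type": "threshold", "metric": "error_rate",
                      "increase_pct": 200, "baseline": "last_hour"},
        "window_minutes": 15,
        "actions": ["slack"],
    },
    {
        "name": "new-high-frequency",
        "condition": {"type": "event_frequency", "new_issue_only": True},
        "threshold": 50,
        "window_minutes": 60,
        "actions": ["slack"],
    },
    {
        "name": "weekly-digest",
        "condition": {"type": "scheduled", "cron": "0 9 * * MON"},
        "actions": ["email"],
    },
]
```

Keeping the rules in one reviewable structure like this also makes the monthly review (below) a diff instead of a click-through.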
Implementing This in Practice
Step 1: Delete All Existing Alert Rules
Seriously. Start from zero. Your current rules are the reason your team muted the channel.
Step 2: Create One Dedicated Channel
Don't mix error alerts with deployment notifications, CI results, and PR reviews. Create #errors-critical and keep it focused.
Step 3: Add the 4 Rules Above
Most error tracking tools support these conditions. In Bugsly, you can set condition type (event_frequency, new_issue, regression, threshold) with custom windows and thresholds.
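If your tool exposes an HTTP API, rule creation can be scripted. This is a hypothetical sketch — the endpoint path, token handling, and payload shape are assumptions, not Bugsly's documented API; check your tool's docs for the real schema:

```python
import json
import urllib.request

def build_alert_rule_request(api_token: str, rule: dict,
                             base_url: str = "https://api.example.com/v1"):
    """Build a POST request that would create one alert rule.

    Hypothetical endpoint and payload schema, for illustration only.
    The caller executes it with urllib.request.urlopen(req).
    """
    return urllib.request.Request(
        f"{base_url}/alert-rules",
        data=json.dumps(rule).encode(),
        headers={"Authorization": f"Bearer {api_token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
```

Scripting rule creation keeps the four rules in version control, so "delete everything and start from zero" becomes repeatable rather than a one-time cleanup.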
Step 4: Review Monthly
Once a month, look at your alert history:
- How many alerts fired? How many were actionable?
- If more than 30% were ignored, your thresholds are too sensitive
- If a critical issue was missed, you need a new rule
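The review checklist above reduces to one ratio. A minimal sketch (function name and return strings are illustrative):

```python
def review_alerts(fired: int, actionable: int) -> str:
    """Monthly review heuristic: flag rules when >30% of alerts were ignored."""
    if fired == 0:
        return "no data"
    ignored_pct = (fired - actionable) / fired * 100
    if ignored_pct > 30:
        return "thresholds too sensitive"
    return "healthy"
```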
The "Two-Alert Test"
Here's a simple test for your alert configuration: if your team gets more than 2 alerts per day on average, your rules are too noisy. Two alerts per day is sustainable. Twenty is not.
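The test is just an average over your alert history, which makes it easy to automate (names below are illustrative):

```python
def passes_two_alert_test(alerts_in_period: int, days: int) -> bool:
    """True if the team averages two or fewer alerts per day."""
    return alerts_in_period / days <= 2
```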
The Cultural Fix
Alert rules are technical, but alert fatigue is cultural. Three things that help:
- Assign an on-call rotation — one person owns alerts for the week. Everyone else can mute the channel guilt-free.
- Acknowledge alerts — use emoji reactions (checkmark = handled, eyes = investigating) so the team knows someone is on it.
- Post-incident: "was this alert useful?" — after every incident, ask whether the alert helped or was noise. Adjust rules accordingly.
The Goal
The goal isn't zero alerts. It's zero *wasted* alerts. Every alert should make someone think "I need to act on this" — not "oh, another one of those." When your alert channel commands respect, your error tracking tool is finally earning its keep.