
How to Prioritize Production Bugs When Everything Feels Urgent

A practical framework for triaging production errors when your dashboard shows hundreds of unresolved issues and every Slack alert feels critical.

The 3 AM Slack Alert Problem

Your phone buzzes at 3 AM. Error alert. You open the dashboard and see 247 unresolved issues. The most recent one happened 2 minutes ago. The one with the highest event count has 12,000 occurrences. A fatal crash appeared yesterday that affects 3 users.

Which one do you fix first?

If your answer is "whichever one I saw last" or "the one my boss asks about," you don't have a prioritization system — you have a reaction system. And reaction systems burn out engineers.

Why "Last Seen" Is the Wrong Default Sort

Many error tracking tools sort issues by "last seen" by default, which puts the most recently occurred error first. This sounds logical but creates a terrible triage experience:

  • A harmless console.warn from a bot triggers every 5 seconds and always sits at the top
  • A critical data corruption bug that happened once yesterday gets buried
  • New errors from a deploy push important recurring errors down the list
  • Your team triages whatever is newest, not whatever matters most

Last seen rewards *recency*, not *impact*. What you actually want is a sort that rewards frequency × severity × recency.
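
One way to turn frequency × severity × recency into a single sortable number is sketched below. This is an illustration, not a standard formula: the function name `priority_score`, the exponential decay, and the 24-hour half-life are all assumptions, and the severity weights are borrowed from the multipliers suggested later in this post.

```python
def priority_score(events_per_hour: float, severity_weight: float,
                   hours_since_last_seen: float,
                   half_life_hours: float = 24.0) -> float:
    """Frequency x severity x recency.

    Recency is modeled as an exponential decay (an assumption): an issue
    last seen `half_life_hours` ago counts half as much as one seen now.
    """
    recency = 0.5 ** (hours_since_last_seen / half_life_hours)
    return events_per_hour * severity_weight * recency


# Hypothetical issues, for illustration only.
issues = [
    {"name": "console warning", "events_per_hour": 5,
     "severity_weight": 2, "hours_since_last_seen": 0},
    {"name": "fatal checkout crash", "events_per_hour": 40,
     "severity_weight": 10, "hours_since_last_seen": 2},
    {"name": "order export error", "events_per_hour": 200,
     "severity_weight": 5, "hours_since_last_seen": 48},
]

# Highest score first.
ranked = sorted(
    issues,
    key=lambda i: priority_score(i["events_per_hour"],
                                 i["severity_weight"],
                                 i["hours_since_last_seen"]),
    reverse=True,
)
ranked_names = [i["name"] for i in ranked]
```

A "last seen" sort would put the console warning first; with this score it drops to the bottom. The half-life is the knob to tune: a shorter one makes the sort behave more like "last seen", a longer one more like raw frequency.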

A Better Framework: The 4-Quadrant Triage

Here's a practical framework for deciding which bugs to fix first:

Quadrant 1: Fix Now (High Frequency + High Severity)

  • Fatal crashes that kill the process
  • Data corruption or loss bugs
  • Authentication/authorization bypasses
  • Payment processing errors

These affect many users and cause real damage. Drop everything.

Quadrant 2: Fix Today (High Frequency + Low Severity)

  • UI glitches that affect every page load
  • Performance degradation (slow but not broken)
  • Warning-level errors that spam your logs

These are annoying and create noise, but the app still works.

Quadrant 3: Fix This Week (Low Frequency + High Severity)

  • Edge case crashes that affect 1% of users
  • Intermittent data loss under specific conditions
  • Race conditions that require a specific sequence to trigger

Dangerous but rare. Schedule these — don't ignore them.

Quadrant 4: Backlog (Low Frequency + Low Severity)

  • Deprecated API warnings
  • Third-party library errors you can't control
  • Bot traffic errors
  • Errors in admin-only features

Track these but don't prioritize. Mark as "ignored" if they're not actionable.
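
The four quadrants reduce to two yes/no questions, which can be sketched as a small function. The threshold of 100 events per hour and the name `triage_quadrant` are hypothetical; whether a bug counts as high severity is left as a human judgment passed in as a flag.

```python
# Hypothetical cutoff for "high frequency"; tune it to your own traffic.
HIGH_FREQUENCY_PER_HOUR = 100


def triage_quadrant(events_per_hour: float, is_high_severity: bool) -> str:
    """Map an issue onto one of the four triage quadrants."""
    is_high_frequency = events_per_hour >= HIGH_FREQUENCY_PER_HOUR
    if is_high_frequency and is_high_severity:
        return "fix now"        # Quadrant 1
    if is_high_frequency:
        return "fix today"      # Quadrant 2
    if is_high_severity:
        return "fix this week"  # Quadrant 3
    return "backlog"            # Quadrant 4
```

For example, `triage_quadrant(500, True)` lands in "fix now", while a rare race condition that loses data, `triage_quadrant(2, True)`, lands in "fix this week".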

How to Implement This in Your Error Tracking Tool

Step 1: Sort by Event Count, Not Last Seen

Change your default sort to show most-frequent errors first. An error that happens 500 times per hour is usually more urgent than one that happened once just now, unless that single occurrence is severe (Step 2 covers that case).

In Bugsly, issues are now sorted by frequency by default. The most impactful errors surface first without any configuration.
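
If you export issues from a tool that does not sort this way, the change is a one-line difference in the sort key. The snippet below is a sketch over made-up data; the field names `event_count` and `minutes_since_last_seen` are assumptions, not any particular tool's schema.

```python
# Hypothetical exported issues, for illustration only.
issues = [
    {"title": "TypeError in profile page", "event_count": 1,
     "minutes_since_last_seen": 0},
    {"title": "Timeout calling payments API", "event_count": 500,
     "minutes_since_last_seen": 12},
    {"title": "Deprecated API warning", "event_count": 40,
     "minutes_since_last_seen": 3},
]

# "Last seen" sort: the one-off TypeError sits at the top.
by_last_seen = sorted(issues, key=lambda i: i["minutes_since_last_seen"])

# Frequency sort: the payments timeout with 500 events sits at the top.
by_event_count = sorted(issues, key=lambda i: i["event_count"], reverse=True)
```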

Step 2: Use Severity Levels as Multipliers

Not all errors are equal:

  • Fatal (process dies) → 10x urgency multiplier
  • Error (operation fails) → 5x multiplier
  • Warning (something off but recoverable) → 2x multiplier
  • Info/Debug → 1x multiplier

Look for visual indicators. Bugsly marks fatal and high-frequency issues with a flame icon so they stand out in the list.
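
Applied to raw event counts, the multipliers above might look like the following sketch. The lowercase level names and the fallback of 1x for unknown levels are assumptions; `weighted_count` is a hypothetical helper, not part of any tool's API.

```python
# Multipliers from the list above; level names are assumed to be lowercase.
SEVERITY_MULTIPLIER = {
    "fatal": 10,
    "error": 5,
    "warning": 2,
    "info": 1,
    "debug": 1,
}


def weighted_count(event_count: int, level: str) -> int:
    """Scale an event count by its severity multiplier.

    Unknown levels fall back to 1x (an assumption made for this sketch).
    """
    return event_count * SEVERITY_MULTIPLIER.get(level.lower(), 1)


fatal_score = weighted_count(100, "fatal")      # 100 * 10
warning_score = weighted_count(400, "warning")  # 400 * 2
```

Here 100 fatal events (weighted 1000) outrank 400 warnings (weighted 800), even though the warnings are four times as frequent.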

Step 3: Check User Impact, Not Just Event Count

10,000 events from one bot = low priority.

50 events from 50 different real users = high priority.

If your tool shows "users affected" or "unique users," use that metric alongside event count.
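
If your tool only exposes raw events, the same metric can be approximated by counting distinct user IDs. This sketch assumes each event carries a user ID (or nothing for anonymous traffic); the helper name `unique_users` and the sample IDs are made up for illustration.

```python
from typing import Iterable, Optional


def unique_users(user_ids: Iterable[Optional[str]]) -> int:
    """Count distinct non-empty user IDs across an issue's events."""
    return len({uid for uid in user_ids if uid})


# 10,000 events that all come from a single crawler.
bot_issue = ["crawler-1"] * 10_000

# 50 events, each from a different real user.
real_issue = [f"user-{n}" for n in range(50)]

bot_users = unique_users(bot_issue)
real_users = unique_users(real_issue)
```

The bot issue has 200 times more events but only one affected user, while the second issue affects 50.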

Step 4: Set Up Smart Alerts

Don't alert on every new error. Alert on:

  • New errors with fatal level
  • Error rate spikes (50%+ increase in 10 minutes)
  • New errors that affect more than N users

This prevents alert fatigue while catching the things that actually matter.
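
The three alert conditions can be sketched as a single predicate. Everything here is an assumption for illustration: the `IssueWindow` fields, the value chosen for N, and the decision to skip the spike check when there is no previous window to compare against.

```python
from dataclasses import dataclass

SPIKE_RATIO = 1.5   # a 50% increase over the previous 10 minutes
MIN_USERS = 10      # hypothetical value for N


@dataclass
class IssueWindow:
    is_new: bool
    level: str
    users_affected: int
    events_last_10_min: int
    events_prev_10_min: int


def should_alert(w: IssueWindow) -> bool:
    """Return True only for the three conditions worth waking someone up for."""
    # 1. New errors with fatal level.
    if w.is_new and w.level == "fatal":
        return True
    # 2. New errors that affect more than N users.
    if w.is_new and w.users_affected > MIN_USERS:
        return True
    # 3. Error rate spike. With no baseline (previous window is 0) the
    #    ratio is undefined, so new issues are left to rules 1 and 2.
    if (w.events_prev_10_min > 0
            and w.events_last_10_min >= SPIKE_RATIO * w.events_prev_10_min):
        return True
    return False
```

A new warning seen by two users stays silent, while an existing error that jumps from 100 to 150 events in ten minutes triggers an alert.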

The Daily Triage Routine

Here's a 10-minute daily triage routine that works:

  1. Open dashboard — check the health indicator (green/yellow/red)
  2. Scan top 5 unresolved issues — sorted by frequency
  3. For each: read the AI analysis, then decide: fix now / fix today / fix this week / backlog / ignore
  4. Bulk-mark anything you're not fixing as "ignored" with a reason
  5. Create tasks for the 1-3 issues you'll fix today

This takes 10 minutes and ensures nothing critical sits unnoticed.

The Counter-Intuitive Truth

The best bug prioritization isn't about fixing more bugs. It's about fixing fewer bugs — the *right* bugs. A team that fixes 3 high-impact bugs per day ships better software than a team that fixes 20 low-impact bugs per day.

Your error tracking dashboard should help you make that distinction instantly. If it can't, the tool is working against you.
