
Level 1 - Alert noise scorecard rule

Alert noise occurs when monitoring systems generate too many alerts, making it difficult to identify real problems. This scorecard rule helps you identify policies that create excessive alerts so you can focus on genuine issues.

About this scorecard rule

This alert noise rule is part of Level 1 (Reactive) in the business uptime maturity model. It helps you identify alert policies that generate too many incidents, which can overwhelm your team and mask critical problems.

Why this matters: Alert fatigue slows response times and can cause teams to miss genuine critical issues. Teams that receive too many alerts often become desensitized and may ignore or delay responses to legitimate problems.

How this rule works

This rule evaluates incidents over a 7-day period and flags any alert policy that generates more than 14 incidents during that time. The threshold works out to 2 incidents per day, a volume most teams can handle effectively without experiencing alert fatigue.
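
The evaluation is straightforward to reason about. The following Python sketch shows the kind of check this rule performs, assuming a hypothetical export of incident records (a list of dicts with policy_name and opened_at fields) covering the past week; it is illustrative only, not New Relic's implementation.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

THRESHOLD = 14          # more than 14 incidents in 7 days fails the rule
WINDOW = timedelta(days=7)

def noisy_policies(incidents, now=None):
    """Return policies that opened more than THRESHOLD incidents in the last 7 days.

    `incidents` is a hypothetical export: a list of dicts with
    'policy_name' (str) and 'opened_at' (datetime) keys.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - WINDOW

    # Count incidents per policy, keeping only those opened inside the window.
    counts = Counter(
        i["policy_name"] for i in incidents if i["opened_at"] >= cutoff
    )
    return {policy: n for policy, n in counts.items() if n > THRESHOLD}

# A policy with 15 or more incidents in the window would be flagged, e.g.:
# noisy_policies(incidents) -> {"Checkout latency": 23}
```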

Understanding your score

  • Pass (Green): No alert policies generated more than 14 incidents in the past 7 days
  • Fail (Red): One or more policies exceeded the 14-incident threshold
  • Target: All alert policies should generate manageable incident volumes that your team can respond to effectively

What this means for your team:

  • Passing score: Your alert policies are well-tuned and generating actionable alerts
  • Failing score: Some policies may be too sensitive or need adjustment to reduce false positives

How to reduce alert noise

If your score indicates excessive alert noise, follow these steps to optimize your alert policies:

1. Identify problematic policies

  1. Review the failing policies: Look at which specific policies triggered more than 14 incidents
  2. Analyze incident patterns: Check if incidents occur at regular intervals or during specific conditions
  3. Assess incident validity: Determine if the incidents represent genuine issues that require attention

2. Optimize alert conditions

Adjust thresholds:

  • Increase threshold values to reduce sensitivity if alerts trigger on normal fluctuations
  • Use percentage-based thresholds instead of absolute values when appropriate
  • Consider the normal operating range of your systems
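
The percentage-based idea above can be sketched as comparing a sample against a threshold expressed as a fraction of the metric's normal operating level rather than a fixed absolute value. The baseline figure and 30% tolerance below are assumptions for illustration, not values from your account.

```python
def breaches_threshold(value, baseline, pct_threshold=0.30):
    """Return True when `value` deviates from `baseline` by more than pct_threshold.

    Expressing the threshold as a fraction of the normal operating level
    (an assumed 30% here) tolerates ordinary fluctuations that a fixed
    absolute threshold would flag.
    """
    return abs(value - baseline) > pct_threshold * baseline

# Assumed baseline of 200 ms response time:
breaches_threshold(230, baseline=200)   # False: within the normal range
breaches_threshold(320, baseline=200)   # True: a 60% deviation warrants an alert
```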

Modify evaluation windows:

  • Extend the time window to avoid alerts on temporary spikes
  • Use longer evaluation periods for metrics that naturally fluctuate
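
One way to picture a longer evaluation window: open an incident only when every sample in the window violates the threshold, so a single temporary spike is ignored. The sketch below is a generic illustration with made-up sample data, not a New Relic condition definition.

```python
def sustained_violation(samples, threshold, window=5):
    """Return True only if the last `window` samples all exceed `threshold`.

    A short spike (one or two bad samples) never fires; the condition
    must hold for the whole window.
    """
    if len(samples) < window:
        return False
    return all(s > threshold for s in samples[-window:])

cpu = [42, 45, 91, 47, 44, 88, 92, 95, 90, 93]    # made-up CPU % samples
sustained_violation(cpu, threshold=85, window=5)  # True: the last 5 samples are all high
```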

Implement smarter detection:

  • Consider using anomaly detection instead of static thresholds
  • Use baseline comparisons for metrics with predictable patterns
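
A baseline comparison can be as simple as flagging values that sit far outside a rolling history. The sketch below uses a z-score against recent samples; it is a generic example of the idea, not New Relic's anomaly detection.

```python
from statistics import mean, stdev

def is_anomalous(history, value, z_limit=3.0):
    """Flag `value` when it is more than `z_limit` standard deviations
    from the mean of recent `history` (an assumed list of prior samples)."""
    if len(history) < 2:
        return False
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_limit

history = [120, 118, 125, 122, 119, 121, 124]   # assumed baseline samples
is_anomalous(history, 123)   # False: within the normal band
is_anomalous(history, 180)   # True: far outside the baseline
```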

3. Consolidate and streamline alerts

  • Group related conditions: Combine multiple related alert conditions into a single policy
  • Use alert correlation: Set up rules to group related incidents and reduce duplicate notifications
  • Prioritize critical alerts: Ensure that high-priority alerts are clearly distinguished from informational ones
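
The correlation idea can be sketched as grouping incidents that share a key (for example, the same service) and open close together in time, then notifying once per group. The field names here are assumptions for illustration.

```python
from datetime import timedelta

def correlate(incidents, key="service", gap=timedelta(minutes=10)):
    """Group incidents that share `key` and open within `gap` of the previous
    incident in that group, so one underlying issue produces one notification.

    `incidents` is an assumed list of dicts with 'service' and 'opened_at' keys.
    """
    groups = []
    for inc in sorted(incidents, key=lambda i: i["opened_at"]):
        for group in groups:
            if (group[-1][key] == inc[key]
                    and inc["opened_at"] - group[-1]["opened_at"] <= gap):
                group.append(inc)
                break
        else:
            groups.append([inc])
    return groups   # notify once per group instead of once per incident
```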

4. Validate your changes

After making adjustments:

  1. Monitor incident volume for the next 7 days
  2. Verify that legitimate issues are still being detected
  3. Confirm that your team can respond effectively to remaining alerts

Measuring improvement

Track these metrics to verify your alert optimization efforts are working:

  • Reduced incident volume: Fewer total incidents generated by your alert policies
  • Improved response times: Teams can respond faster when alerts are more focused
  • Higher alert confidence: Team members trust alerts and respond appropriately
  • Fewer false positives: A greater share of incidents require genuine action rather than dismissal
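
If you export incident records, the week-over-week comparison can be automated. The sketch below computes two of the signals above, total incident volume and the share of incidents dismissed without action, from an assumed export with an 'actioned' boolean field; it is a rough illustration rather than a prescribed report.

```python
def improvement_report(last_week, this_week):
    """Compare two assumed incident exports (lists of dicts with an
    'actioned' bool) and report volume and dismissal-rate changes."""
    def dismissal_rate(incidents):
        if not incidents:
            return 0.0
        return sum(1 for i in incidents if not i["actioned"]) / len(incidents)

    return {
        "volume_change": len(this_week) - len(last_week),
        "dismissal_rate_before": round(dismissal_rate(last_week), 2),
        "dismissal_rate_after": round(dismissal_rate(this_week), 2),
    }
```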

Common scenarios and solutions

High-frequency, low-impact alerts:

  • Problem: Alerts trigger on minor metric fluctuations
  • Solution: Increase thresholds or use longer evaluation windows

Cascading alerts:

  • Problem: One issue triggers multiple related alerts
  • Solution: Implement alert correlation or create dependency-based alerting

Seasonal or predictable patterns:

  • Problem: Alerts fire during known busy periods
  • Solution: Use dynamic baselines or time-based alert conditions
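
For predictable busy periods, a time-based condition simply applies a higher bar (or skips alerting) during the known window. The hours and thresholds below are assumptions for illustration only.

```python
from datetime import datetime

BUSY_HOURS = range(9, 18)    # assumed known busy window, 09:00-17:59
NORMAL_THRESHOLD = 500       # assumed requests/min considered alert-worthy off-peak
BUSY_THRESHOLD = 1200        # higher bar during the expected daily peak

def should_alert(value, at: datetime):
    """Apply a time-aware threshold so expected daily peaks don't page anyone."""
    limit = BUSY_THRESHOLD if at.hour in BUSY_HOURS else NORMAL_THRESHOLD
    return value > limit

should_alert(800, datetime(2025, 6, 2, 11, 0))   # False: inside the busy window
should_alert(800, datetime(2025, 6, 2, 23, 0))   # True: unusual for off-peak
```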

Important considerations

  • Balance sensitivity with noise: Ensure that reducing noise doesn't eliminate detection of genuine issues
  • Regular review: Alert policies should be reviewed and adjusted as your systems evolve
  • Team feedback: Involve your response team in evaluating alert effectiveness
  • Custom thresholds: The 14-incident threshold may need adjustment based on your team size and response capacity

Next steps

  1. Immediate action: Address any policies currently failing this rule
  2. Ongoing monitoring: Review this scorecard rule weekly to catch new sources of alert noise
  3. Advance to Level 2: Once alert noise is under control, focus on proactive monitoring practices

For additional guidance on alert optimization, see our Alert Quality Management implementation guide.
