
Level 1 - Alert noise scorecard rule

Alert noise occurs when monitoring systems generate too many alerts, making it difficult to identify real problems. This scorecard rule helps you identify policies that create excessive alerts so you can focus on genuine issues.

About this scorecard rule

This alert noise rule is part of Level 1 (Reactive) in the business uptime maturity model. It helps you identify alert policies that generate too many incidents, which can overwhelm your team and mask critical problems.

Why this matters: Alert fatigue slows response times and can cause teams to miss genuine critical issues. Teams that receive too many alerts often become desensitized and may ignore or delay responses to legitimate problems.

How this rule works

This rule evaluates incidents over a 7-day period and flags any alert policy that generates more than 14 incidents during that time. The threshold works out to 2 incidents per day, a volume most teams can handle effectively without experiencing alert fatigue.
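
The evaluation is straightforward to reason about. The following Python sketch shows the kind of check this rule performs, assuming a hypothetical export of incident records (a list of dicts with policy_name and opened_at fields) covering the past week; it is illustrative only, not New Relic's implementation.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

THRESHOLD = 14          # more than 14 incidents in 7 days fails the rule
WINDOW = timedelta(days=7)

def noisy_policies(incidents, now=None):
    """Return policies that opened more than THRESHOLD incidents in the last 7 days.

    `incidents` is a hypothetical export: a list of dicts with
    'policy_name' (str) and 'opened_at' (datetime) keys.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - WINDOW

    # Count incidents per policy, keeping only those opened inside the window.
    counts = Counter(
        i["policy_name"] for i in incidents if i["opened_at"] >= cutoff
    )
    return {policy: n for policy, n in counts.items() if n > THRESHOLD}

# A policy with 15 or more incidents in the window would be flagged, e.g.:
# noisy_policies(incidents) -> {"Checkout latency": 23}
```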

Understanding your score

  • Pass (Green): No alert policies generated more than 14 incidents in the past 7 days
  • Fail (Red): One or more policies exceeded the 14-incident threshold
  • Target: All alert policies should generate manageable incident volumes that your team can respond to effectively

What this means for your team:

  • Passing score: Your alert policies are well-tuned and generating actionable alerts
  • Failing score: Some policies may be too sensitive or need adjustment to reduce false positives

How to reduce alert noise

If your score indicates excessive alert noise, follow these steps to optimize your alert policies:

1. Identify problematic policies

  1. Review the failing policies: Look at which specific policies triggered more than 14 incidents
  2. Analyze incident patterns: Check if incidents occur at regular intervals or during specific conditions
  3. Assess incident validity: Determine if the incidents represent genuine issues that require attention

2. Optimize alert conditions

Adjust thresholds:

  • Increase threshold values to reduce sensitivity if alerts trigger on normal fluctuations
  • Use percentage-based thresholds instead of absolute values when appropriate
  • Consider the normal operating range of your systems
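
The percentage-based idea above can be sketched as comparing a sample against a threshold expressed as a fraction of the metric's normal operating level rather than a fixed absolute value. The baseline figure and 30% tolerance below are assumptions for illustration, not values from your account.

```python
def breaches_threshold(value, baseline, pct_threshold=0.30):
    """Return True when `value` deviates from `baseline` by more than pct_threshold.

    Expressing the threshold as a fraction of the normal operating level
    (an assumed 30% here) tolerates ordinary fluctuations that a fixed
    absolute threshold would flag.
    """
    return abs(value - baseline) > pct_threshold * baseline

# Assumed baseline of 200 ms response time:
breaches_threshold(230, baseline=200)   # False: within the normal range
breaches_threshold(320, baseline=200)   # True: a 60% deviation warrants an alert
```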

Modify evaluation windows:

  • Extend the time window to avoid alerts on temporary spikes
  • Use longer evaluation periods for metrics that naturally fluctuate
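
One way to picture a longer evaluation window: open an incident only when every sample in the window violates the threshold, so a single temporary spike is ignored. The sketch below is a generic illustration with made-up sample data, not a New Relic condition definition.

```python
def sustained_violation(samples, threshold, window=5):
    """Return True only if the last `window` samples all exceed `threshold`.

    A short spike (one or two bad samples) never fires; the condition
    must hold for the whole window.
    """
    if len(samples) < window:
        return False
    return all(s > threshold for s in samples[-window:])

cpu = [42, 45, 91, 47, 44, 88, 92, 95, 90, 93]    # made-up CPU % samples
sustained_violation(cpu, threshold=85, window=5)  # True: the last 5 samples are all high
```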

Implement smarter detection:

  • Consider using anomaly detection instead of static thresholds
  • Use baseline comparisons for metrics with predictable patterns
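
A baseline comparison can be as simple as flagging values that sit far outside a rolling history. The sketch below uses a z-score against recent samples; it is a generic example of the idea, not New Relic's anomaly detection.

```python
from statistics import mean, stdev

def is_anomalous(history, value, z_limit=3.0):
    """Flag `value` when it is more than `z_limit` standard deviations
    from the mean of recent `history` (an assumed list of prior samples)."""
    if len(history) < 2:
        return False
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_limit

history = [120, 118, 125, 122, 119, 121, 124]   # assumed baseline samples
is_anomalous(history, 123)   # False: within the normal band
is_anomalous(history, 180)   # True: far outside the baseline
```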

3. Consolidate and streamline alerts

  • Group related conditions: Combine multiple related alert conditions into a single policy
  • Use alert correlation: Set up rules to group related incidents and reduce duplicate notifications
  • Prioritize critical alerts: Ensure that high-priority alerts are clearly distinguished from informational ones
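
The correlation idea can be sketched as grouping incidents that share a key (for example, the same service) and open close together in time, then notifying once per group. The field names here are assumptions for illustration.

```python
from datetime import timedelta

def correlate(incidents, key="service", gap=timedelta(minutes=10)):
    """Group incidents that share `key` and open within `gap` of the previous
    incident in that group, so one underlying issue produces one notification.

    `incidents` is an assumed list of dicts with 'service' and 'opened_at' keys.
    """
    groups = []
    for inc in sorted(incidents, key=lambda i: i["opened_at"]):
        for group in groups:
            if (group[-1][key] == inc[key]
                    and inc["opened_at"] - group[-1]["opened_at"] <= gap):
                group.append(inc)
                break
        else:
            groups.append([inc])
    return groups   # notify once per group instead of once per incident
```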

4. Validate your changes

After making adjustments:

  1. Monitor incident volume for the next 7 days
  2. Verify that legitimate issues are still being detected
  3. Confirm that your team can respond effectively to remaining alerts

Measuring improvement

Track these metrics to verify your alert optimization efforts are working:

  • Reduced incident volume: Fewer total incidents generated by your alert policies
  • Improved response times: Teams can respond faster when alerts are more focused
  • Higher alert confidence: Team members trust alerts and respond appropriately
  • Fewer false positives: A greater share of incidents require genuine action rather than dismissal
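
If you export incident records, the week-over-week comparison can be automated. The sketch below computes two of the signals above, total incident volume and the share of incidents dismissed without action, from an assumed export with an 'actioned' boolean field; it is a rough illustration rather than a prescribed report.

```python
def improvement_report(last_week, this_week):
    """Compare two assumed incident exports (lists of dicts with an
    'actioned' bool) and report volume and dismissal-rate changes."""
    def dismissal_rate(incidents):
        if not incidents:
            return 0.0
        return sum(1 for i in incidents if not i["actioned"]) / len(incidents)

    return {
        "volume_change": len(this_week) - len(last_week),
        "dismissal_rate_before": round(dismissal_rate(last_week), 2),
        "dismissal_rate_after": round(dismissal_rate(this_week), 2),
    }
```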

Common scenarios and solutions

High-frequency, low-impact alerts:

  • Problem: Alerts trigger on minor metric fluctuations
  • Solution: Increase thresholds or use longer evaluation windows

Cascading alerts:

  • Problem: One issue triggers multiple related alerts
  • Solution: Implement alert correlation or create dependency-based alerting

Seasonal or predictable patterns:

  • Problem: Alerts fire during known busy periods
  • Solution: Use dynamic baselines or time-based alert conditions
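
For predictable busy periods, a time-based condition simply applies a higher bar (or skips alerting) during the known window. The hours and thresholds below are assumptions for illustration only.

```python
from datetime import datetime

BUSY_HOURS = range(9, 18)    # assumed known busy window, 09:00-17:59
NORMAL_THRESHOLD = 500       # assumed requests/min considered alert-worthy off-peak
BUSY_THRESHOLD = 1200        # higher bar during the expected daily peak

def should_alert(value, at: datetime):
    """Apply a time-aware threshold so expected daily peaks don't page anyone."""
    limit = BUSY_THRESHOLD if at.hour in BUSY_HOURS else NORMAL_THRESHOLD
    return value > limit

should_alert(800, datetime(2025, 6, 2, 11, 0))   # False: inside the busy window
should_alert(800, datetime(2025, 6, 2, 23, 0))   # True: unusual for off-peak
```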

Important considerations

  • Balance sensitivity with noise: Ensure that reducing noise doesn't eliminate detection of genuine issues
  • Regular review: Alert policies should be reviewed and adjusted as your systems evolve
  • Team feedback: Involve your response team in evaluating alert effectiveness
  • Custom thresholds: The 14-incident threshold may need adjustment based on your team size and response capacity

Next steps

  1. Immediate action: Address any policies currently failing this rule
  2. Ongoing monitoring: Review this scorecard rule weekly to catch new sources of alert noise
  3. Advance to Level 2: Once alert noise is under control, focus on proactive monitoring practices

For additional guidance on alert optimization, see our Alert Quality Management implementation guide.
