Errors Dashboard

Introduction

The Errors Dashboard, accessible from the Reliability Dashboard, lists all reported errors in one location.

📘

Learn how OverOps Calculates the Reliability Score

Click here for a visual overview of the OverOps dashboards, and the way they all connect to provide QA, DevOps and SRE teams a complete picture of application reliability across multiple environments.

Errors Dashboard

To open a specific dashboard, such as New Errors or Increasing Errors, select it from the Errors dropdown or enter its name in the Find dashboards by name field:

Select a Specific Dashboard

At the top of the dashboard are the standard environment filters. Use them to observe the behavior of any environment to which the current datasource is connected, and to narrow the view by target application, deployment, or server group.

🚧

Note

Garbage Collector (GC) metrics (accessible through the Correlate dropdown in the dashboard) are currently reported for Oracle/HotSpot only and are not supported on Java 9.

Clicking an error name in the table opens the OverOps Automated Root Cause (ARC) analysis for that error, which shows the complete state of the event at the moment it exceeded the threshold. Developers and SREs can then see the actual application state that caused the error and determine whether the cause is code related or infrastructure related.

  • The Errors Dashboard shows baseline vs. active values, consistent with the Slowdown and Increasing Errors dashboards, for easier correlation.
  • Clicking one of the top indicators opens the Errors drill-down dashboard, which zooms in on a single event type.
    • The indicators (top boxes) enable easy filtering by important event types: uncaught, critical, HTTP error, and log error.
    • All errors are ranked in the dashboard.
  • Event groups are scored so that the number of participating applications is taken into account.
  • Fail rate deltas in the home dashboard use up-arrow indicators to show increases in transaction fail rate.

📘

Errors Dashboard JSON Model

Customize the dashboard, or integrate any of its widgets into your Reliability Dashboard, using the Grafana JSON Model of this dashboard.
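As a rough illustration of reusing a widget through the JSON Model, the sketch below copies one panel from one exported dashboard model into another. It relies only on the general Grafana convention that a dashboard model holds a `panels` list whose entries carry `id` and `title` fields, with row panels optionally nesting further `panels`. The helper names and the panel titles in the usage example are hypothetical, and the actual layout of the Errors Dashboard model may differ, so treat this as a starting point rather than the OverOps procedure.

```python
import copy


def all_panels(model):
    """Yield every panel in a dashboard model, including panels nested in rows."""
    for panel in model.get("panels", []):
        yield panel
        # Row panels may keep their children in a nested "panels" list.
        yield from all_panels(panel)


def copy_panel(source, target, title):
    """Copy the first panel named `title` from `source` into `target`.

    Both arguments are dashboard JSON models already parsed into dicts
    (for example with json.load). Hypothetical helper, not an OverOps API.
    """
    matches = [p for p in all_panels(source) if p.get("title") == title]
    if not matches:
        raise KeyError(f"no panel titled {title!r} in the source dashboard")

    panel = copy.deepcopy(matches[0])
    # Panel ids must be unique within a dashboard, so assign a fresh one.
    used_ids = [p.get("id") or 0 for p in all_panels(target)]
    panel["id"] = max(used_ids, default=0) + 1
    target.setdefault("panels", []).append(panel)
    return panel


# Usage with small stand-in models; the titles are illustrative only.
errors_model = {
    "panels": [
        {"id": 1, "title": "New Errors", "type": "table"},
        {"id": 2, "title": "Details", "type": "row",
         "panels": [{"id": 3, "title": "Increasing Errors", "type": "table"}]},
    ]
}
reliability_model = {"panels": [{"id": 7, "title": "Reliability Score"}]}

copied = copy_panel(errors_model, reliability_model, "Increasing Errors")
```

The copied panel keeps its original `gridPos` (if any), so its position may need adjusting after the target model is imported back into Grafana.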