Errors Dashboard
Introduction
The Errors Dashboard - accessible from the Reliability Dashboard - lists all the reported errors in one location.
Learn how OverOps Calculates the Reliability Score
Click here for a visual overview of the OverOps dashboards, and the way they all connect to provide QA, DevOps and SRE teams a complete picture of application reliability across multiple environments.
You can then access the specific dashboard you wish to view - New Errors, Increasing Errors, etc. - by selecting it from the Errors dropdown or by entering its name in the Find dashboards by name field.
At the top of the dashboard are the standard environment filters, which let you observe the behavior of any environment to which the current datasource is connected and slice the data by target application, deployment, or server group.
Note
Garbage Collector (GC) metrics (accessible through the Correlate dropdown function in the dashboard) are currently reported for Oracle/HotSpot only and are not currently supported for Java 9.
Clicking an error name in the table opens the OverOps Automated Root Cause (ARC) analysis for that error, which shows the complete state of the event at the moment it exceeded the threshold. This lets developers and SREs see the actual application state that caused the error and determine whether the cause is code or infrastructure related.
- The Errors dashboard shows baseline vs. active, consistent with the Slowdowns and Increasing Errors dashboards, for better correlation.
- Clicking one of the top indicators takes you to the Errors Drill-down dashboard, which zooms in on a single event type.
- The indicators (top boxes) enable easy filtering by important event types: uncaught, critical, HTTP error, and log error.
- All errors are ranked in the dashboard.
- Fix - Event groups are scored taking into account the number of participating apps.
- Fail rate deltas within the home dashboard use up arrow indicators to show increases in transaction fail rate.
Errors Dashboard JSON Model
Customize the dashboard, or integrate any of its widgets into your Reliability Dashboard, using the Grafana JSON Model of this dashboard.
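As a rough illustration of reusing a widget through the JSON Model, the Python sketch below copies one panel from one dashboard model into another. It assumes the common Grafana layout, in which a dashboard's JSON Model holds a top-level `panels` list whose entries carry `id`, `title` and `gridPos` fields. The `copy_panel` helper, the panel titles and the trimmed models are hypothetical and not taken from the OverOps dashboards. Panels nested inside rows are not handled.

```python
import copy
import json


def copy_panel(source_model, target_model, panel_title):
    """Return a copy of target_model with the named panel from source_model appended."""
    panel = next(
        (p for p in source_model.get("panels", []) if p.get("title") == panel_title),
        None,
    )
    if panel is None:
        raise KeyError(f"no panel titled {panel_title!r} in source dashboard")

    result = copy.deepcopy(target_model)
    panels = result.setdefault("panels", [])
    new_panel = copy.deepcopy(panel)

    # Panel ids must be unique within a dashboard, so assign the next free id.
    new_panel["id"] = max((p.get("id", 0) for p in panels), default=0) + 1

    # Place the copied panel below the existing ones (gridPos uses x, y, w, h).
    bottom = max(
        (p.get("gridPos", {}).get("y", 0) + p.get("gridPos", {}).get("h", 0) for p in panels),
        default=0,
    )
    grid = dict(new_panel.get("gridPos", {"w": 12, "h": 8}))
    grid.update({"x": 0, "y": bottom})
    new_panel["gridPos"] = grid

    panels.append(new_panel)
    return result


# Hypothetical, heavily trimmed models; real exports contain many more fields.
errors_dashboard = {
    "title": "Errors",
    "panels": [
        {"id": 1, "title": "New Errors", "type": "table",
         "gridPos": {"x": 0, "y": 0, "w": 24, "h": 10}},
    ],
}
reliability_dashboard = {
    "title": "Reliability",
    "panels": [
        {"id": 4, "title": "Reliability Score", "type": "stat",
         "gridPos": {"x": 0, "y": 0, "w": 12, "h": 6}},
    ],
}

merged = copy_panel(errors_dashboard, reliability_dashboard, "New Errors")
print(json.dumps(merged, indent=2))
```

The helper works on copies, so neither input model is modified. The merged result could then be pasted back into the target dashboard's JSON Model editor.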