Slowdown Capabilities
Introduction
OverOps can automatically detect slowdowns and identify a possible root cause for each one. It does this with an automatic timers mechanism that starts at the relevant entry point (the start of your transaction), follows the deeper method calls, and identifies the most significant method that could be causing the slowdown, giving you a better understanding of what caused it.
Important Notes
Slowdowns, Automatic Timers, and Timer Events are currently supported only on the JVM languages listed here.
If the event you clicked (either from the OverOps Home or by using the OverOps Reliability Dashboards) was not found and you've received the “Event not found” message, this may be the result of one of the following causes:
- The “Automatic entry point timers” UDF is disabled or has not been set. In this case, check the UDF status in the Alert Settings of the “Performance Slowdowns” view.
- The OverOps Micro-Agent flag for using the new timers feature has not been enabled. Add the flag as described under Automatic Timers and Slowdown Snapshots.
- You're using Java 10, Java 11, or J9. Timer Events, Automatic Timers, and Slowdown Snapshots are currently not supported on these runtimes.
If the issue persists and no events are being captured or found, please contact the OverOps support team.
Automatic Timers and Slowdown Snapshots
Once the OverOps Micro-Agent encounters an entry point, it periodically collects statistics for each transaction. The backend service collects this data, calculates the threshold for the entry point (based on a standard deviation calculation from the method’s average running time), and relays it back to the Micro-Agent.
When the timer feature is enabled, the Micro-Agent takes a snapshot of a slowdown event whenever a transaction runs longer than the calculated threshold. While doing so, it looks for the most significant methods using the OverOps heuristic search algorithm. The result is a deeper stack trace with more relevant data to help you analyze where most of the running time was spent.
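To make the threshold idea concrete, here is a minimal Java sketch of a threshold derived from the average running time plus a number of standard deviations. It is an illustration only: the class and method names are hypothetical, the use of the population standard deviation is an assumption, and the actual OverOps backend calculation is not documented here.

```java
import java.util.List;

// Hypothetical helper for illustration; not part of any OverOps API.
final class SlowdownThreshold {

    // Returns mean + deviations * standard deviation of the sampled running times.
    // Assumption: population standard deviation over the given samples.
    static double compute(List<Double> runtimesMs, double deviations) {
        if (runtimesMs.isEmpty()) {
            throw new IllegalArgumentException("at least one sample is required");
        }
        double mean = runtimesMs.stream()
                .mapToDouble(Double::doubleValue)
                .average()
                .orElse(0.0);
        double variance = runtimesMs.stream()
                .mapToDouble(v -> (v - mean) * (v - mean))
                .average()
                .orElse(0.0);
        return mean + deviations * Math.sqrt(variance);
    }

    public static void main(String[] args) {
        // Sampled running times (ms) for one entry point.
        List<Double> samples = List.of(100.0, 110.0, 90.0, 100.0);
        double threshold = compute(samples, 2.0);

        // A transaction counts as a slowdown when it runs longer than the threshold.
        double observedMs = 250.0;
        System.out.println("threshold (ms): " + threshold);
        System.out.println("slowdown: " + (observedMs > threshold));
    }
}
```

A larger number of deviations raises the threshold, so fewer transactions are treated as slowdowns.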
Using the automatic timer requires enabling a special UDF called “Automatic entry point timers”. If this feature has been enabled for your use, you'll see an Admin view called “Performance Slowdowns” under the Admin category. The Timers UDF is activated on this view as an Anomaly Alert and automatically sets the new timers on entry points that exceed their average threshold.
The parameters for this UDF can be set from the Alert Settings to control the following behaviors:
- The time period to use for the baseline
- The number of standard deviations for the threshold
- The minimum allowed threshold
- The minimum delta value to a threshold that will cause an update of its value
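As a rough illustration of the last two parameters, the Java sketch below shows one plausible way a minimum allowed threshold and a minimum delta could interact. The class and parameter names are hypothetical, and the exact rule the UDF applies is an assumption rather than documented behavior.

```java
// Hypothetical illustration of the "minimum allowed threshold" and
// "minimum delta" parameters; not the actual UDF implementation.
final class ThresholdUpdate {

    // Returns the threshold to keep using. The newly calculated candidate is
    // first raised to the minimum allowed threshold, and it replaces the
    // current value only if it differs from it by at least minDeltaMs.
    static double next(double currentMs, double candidateMs,
                       double minThresholdMs, double minDeltaMs) {
        double clamped = Math.max(candidateMs, minThresholdMs);
        return Math.abs(clamped - currentMs) >= minDeltaMs ? clamped : currentMs;
    }

    public static void main(String[] args) {
        // Small change: the current threshold is kept.
        System.out.println(next(100.0, 103.0, 50.0, 5.0));
        // Large change: the new threshold is adopted.
        System.out.println(next(100.0, 120.0, 50.0, 5.0));
        // Candidate below the floor: the minimum allowed threshold is used.
        System.out.println(next(100.0, 10.0, 50.0, 5.0));
    }
}
```

Under this reading, the minimum delta avoids constant small timer updates, and the minimum threshold keeps very fast entry points from triggering snapshots on trivial variations.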
Also, to get the proper snapshot data of the transaction that has introduced the slowness, you'll need to enable the following runtime flag when you run the agent with your application:
-Dtakipi.parallax
java -Xmx2G -agentlib:TakipiAgent -Dtakipi.parallax -jar myapp.jar
Alternatively, you can set this parameter in the agent.properties file:
takipi.parallax=1
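If you want to confirm that the command-line form of the flag actually reached your JVM, a small check like the following Java sketch may help. It relies only on standard JVM behavior: a -D option without a value defines the system property as an empty string. The class name is hypothetical, and the check says nothing about a value set in agent.properties, which is read by the agent and is not a JVM system property.

```java
import java.util.Properties;

// Hypothetical sanity check; it only inspects JVM system properties.
final class ParallaxFlagCheck {

    // A flag passed as -Dname (no value) is defined with an empty string,
    // so presence is tested with a null check rather than a value check.
    static boolean isDefined(Properties props, String name) {
        return props.getProperty(name) != null;
    }

    public static void main(String[] args) {
        boolean set = isDefined(System.getProperties(), "takipi.parallax");
        System.out.println(set
                ? "takipi.parallax was passed on the command line"
                : "takipi.parallax was not passed on the command line");
    }
}
```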
Setting Automatic Timer Alerts
Alerts for Automatic Timer events are set either through the Alerts Dashboard or by adding an alert through the Views pane. In either case, remember to set an alert for "Automatic entry point timers".
Timer Events
OverOps can also tell you when your code runs longer than expected using timer events, which track predefined methods for latency.
You can start tracking method timing by adding timers in the Timer Settings screen. Enter the relevant class and method, and the runtime threshold over which you'd like to get alerted. When the method runs for longer than the threshold you defined, OverOps will alert you and provide deep contextual information so you can get to the root cause of the problem as fast as possible. The Timer Events screen also displays a graph with the average runtime of a transaction.
Timer Settings
You can open the Timer Settings dialog in either of the following ways:
From the Timer category
In the Views pane, click Timer to open the category, then click Add Timer to open the Timer Settings dialog.
Right-clicking an Event
This displays the Timer Settings dialog, where you'll define mandatory and optional parameters as described below.
Defining Mandatory Parameters
- Class: the class in which to watch for latency. The class may be selected from an available class list or added using free text. The format may contain the package name (e.g. "com.package.Class"). If a package is not specified, all classes with that name will be tracked.
- Method: the method to watch for latency. The method may be selected from an available method list or added using free text. All methods with the selected name will be tracked.
- Threshold: the threshold over which to collect events. Thresholds are specified in milliseconds. Each time a method runs for longer than the defined threshold, an event will fire.
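The matching rules above can be sketched in Java as follows. This is a conceptual model only: TimerRule and its methods are hypothetical names, and the way OverOps matches classes internally (for example, nested classes) is not covered by this article.

```java
// Hypothetical model of a timer definition; not an OverOps API.
final class TimerRule {
    private final String classPattern; // "com.package.Class" or just "Class"
    private final String methodName;
    private final long thresholdMs;

    TimerRule(String classPattern, String methodName, long thresholdMs) {
        this.classPattern = classPattern;
        this.methodName = methodName;
        this.thresholdMs = thresholdMs;
    }

    // With a package in the pattern, only that exact class matches.
    // Without one, every class with that simple name matches.
    // All methods with the configured name match, whatever their signature.
    boolean matches(String fullyQualifiedClass, String method) {
        if (!methodName.equals(method)) {
            return false;
        }
        if (classPattern.contains(".")) {
            return classPattern.equals(fullyQualifiedClass);
        }
        String simpleName =
                fullyQualifiedClass.substring(fullyQualifiedClass.lastIndexOf('.') + 1);
        return classPattern.equals(simpleName);
    }

    // An event fires only when the method runs longer than the threshold.
    boolean shouldFire(long elapsedMs) {
        return elapsedMs > thresholdMs;
    }

    public static void main(String[] args) {
        TimerRule rule = new TimerRule("OrderService", "process", 200);
        System.out.println(rule.matches("com.shop.OrderService", "process"));
        System.out.println(rule.shouldFire(350));
    }
}
```

Specifying the package narrows tracking to a single class, which is useful when several packages contain classes with the same name.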
Defining Optional Parameters
- Server: the list of servers to be tracked. Click here to learn how to assign names to your monitored servers.
- Application: the list of applications to be tracked (e.g. "Producer-Service", "Consumer-Service", "Web-frontend", etc.). Click here to learn how to assign application names to your JVMs.
Once you've added a Timer, you can edit the threshold, server, and application values. If you edit a threshold, subsequent events will appear separately in the grid. When done, click Save Changes.
Setting Timer Event Alerts
Alerts for Timer events are set either through the Alerts Dashboard or by adding an alert through the Views pane. In either case, remember to set an alert for "Anomaly".
You'll find more information in the Event List and the ARC Analysis Screen articles.
Timer Events in the Events Grid
The events grid displays Timer events for the methods that run longer than the defined thresholds. Contextual data for these events will be displayed in the same way as exceptions or logged errors.
The graphs also include a visual indication of the current snapshot and provide the ability to go back to the current snapshot after manipulating the graph.
Note that Timer events are part of the Timers views only and do not appear in other Views.