Real-Time ADDM
ADDM is further enhanced in the new Enterprise Manager Cloud Control 12c version with a real-time feature. This is known as Real-Time ADDM (supported for any Oracle database Target of version 10.2.0.4 and higher) and is very useful for situations when the database has become very slow, or has hung and normal ADDM cannot work since it may not be possible to even log in to the database. Even if it is possible to log in to the database, taking an AWR snapshot or running an ADDM task may worsen the problem.
Sometimes it may be due to just one blocking session that the entire database arrives at this unresponsive state.
Often, not being able to find out the root cause of the problem, DBAs may resort to bouncing the database to fix the situation. But this is a very drastic step as all uncommitted transactions will be aborted and rolled back, and mid-tier connections and states will also be lost. The actual cause of the problem will not be found as all diagnostic information will also be lost and the problem can happen again.
In such a scenario, Real-Time ADDM bypasses the SQL layer through the Agent and directly accesses the database by using a special proprietary diagnostic connection capability. A lightweight connection is made via the Agent without any I/O and without acquiring additional locks or global resources such as enqueues or latches, as it does not run any SQL.
An attempt is also made to start a normal JDBC connection. Whatever data is returned by either of these connections can be used for analysis by Real-Time ADDM. It performs a non-locking analysis and will be able to detect problems and hangs. If it is a RAC database, a cluster-wide analysis will be provided. Real-Time ADDM is able to detect manifold issues such as global database-resource contentions, including databases that are CPU bound, I/O bound or interconnect bound, or with OS-level memory paging.
If there are memory issues, such as excessive growth of the PGA or library cache/memory allocation issues, they are also detected. Real-Time ADDM also warns you if resource limits for processes or sessions are reached, or if there are deadlocks affecting the database. Database hangs are detected, for example, if the ASM-storage system is not responding, the instance is in the process of shutting down, the memory pool is being flushed, and so on. Top-blocker analysis is performed.
The recommendations that are provided for single or RAC databases are precise and can be set into action immediately, and diagnostics data will be collected for further analysis if there is no fix available. In fact, there is no other tool like Real-Time ADDM available today, which can easily access a hung database and then recommend a solution by understanding and analyzing the problem in detail.
To access this feature of Real-Time ADDM, you need to go to Performance | Real-Time ADDM from the database menu. This brings up a screen where you need to enter the SYSDBA credentials that will be used to access the database in the diagnostic connection mode.
The Real-Time ADDM page now appears. Click on the Start button. This starts up both the normal JDBC connection as well as the diagnostics connection, and you can see the progress of each. The first time this is run, an error appears about a missing installation of some required PL/SQL packages. Click on the error to see the full error message in the box at the lower half of the screen, the text in the screenshot may not be readable but is included for the purpose of illustration:
You can click on the PL/SQL Package Deployment link, this allows you to deploy the package using either the Simple Installation method or the Advanced Installation method, as seen in the following screenshot. The difference is that in the latter, you can schedule a future time for deployment and you can also run on multiple databases.
In both methods, either a DBA credential is required or you can use the DBSNMP
user (provided the CONNECT
, CREATE TYPE
, and CREATE PROCEDURE
privileges have been granted to the DBSNMP
user).
After this deployment completes, you can start the Real-Time ADDM utility again. This time there is no error and both the types of connections now report their full findings as shown in the following screenshot. There is one finding:
We move to the Findings tab and there the main finding is reported: a single SQL statement is taking up 75.29 percent of session activity in the database, which is a lot:
The Hardware Resources tab reports that the hosts are not CPU-bound, and graphically shows the average process load on the CPU including both Oracle and other processes. The Oracle databases have only taken up 30.2% of the Run Queue. There are also other subtabs available, such as Memory, I/O, and Interconnect, which give the related information:
In the Activity tab, we can see the SQL IDs of the statements with the highest activity and also the user sessions with the highest activity.
In the Hang Data tab, under the Final Blockers subtab (seen in the following screenshot), we see the Top Final Blockers by cumulative blocking time section. Session Ids 18 and 43 are waiting on Session ID 10 (the screenshot may not be fully readable but is included for the purpose of illustration):
The Blocked Sessions subtab also shows more information about the top waits in the Top Waiters by Wait Time section (which in this case are Session Ids 18 and 43), Immediate Blockers of Sessions 18 and Blocker Chain above Session 18, which could be useful in the case of a series of blocking sessions. In this case it is seen that Session ID:10 is performing an I/O as shown by the text db file async I/O, whereas Session ID:18 is waiting for a local-write wait. It turns out that there were two sessions doing a Cartesian join on tables along with a sort on the results, and this resulted in the blocking activity:
To have found this information via the standard ADDM would have been quite difficult, given the slow response of the database. And there can also be other cases when the database is actually so hung that a normal JDBC connection is not possible, in this case, the lightweight diagnostic connection of Real-Time ADDM would be a lifesaver for the DBA.
The last tab of Statistics on the Real-Time ADDM page shows database data and host data, including current OS load (number of processes), background CPU usage per second, database time per second, the average synchronous single-block read latency, and so on.