
Efficient “1-3-5” Problem Identification and Resolution with zCloud

Original article by 孙雪 (Sun Xue), 2020-04-24

This article describes how to use the zCloud platform to identify, trace and locate, and analyze and resolve database issues in operations and maintenance (O&M).

Put simply: identify the problem within 1 minute, locate it within 3 minutes, and solve it within 5 minutes.

For those who are not yet familiar with zCloud, an introduction is available here:

https://www.modb.pro/db/23395
The Opportunity and Challenge of Multi-Heterogeneous Architecture and Resources for PaaS Providers

Now, let's get started.

With the continuous development of the financial sector, many financial institutions have built foundational platforms for resource sharing, usage control, and permission management, and have standardized and automated O&M with a variety of tools for monitoring, health scoring and profiling, and performance analysis.

These tools have improved the efficiency and quality of O&M to some extent.

However, there are downsides as well. Because there are so many tools, an abnormal database issue often has to be analyzed and located across several of them. This adds considerable management complexity and operational difficulty for DBAs and managers, and it increases the overall time to resolution. If the business is interrupted for a long time, a poor user experience and wider social impact are inevitable, especially for financial service providers.

The zCloud database cloud management platform covers the full traceability process for database problems: problem identification, problem location, and problem solving. It helps O&M personnel stay aware of the database running status at all times, discover risks early, and solve problems quickly.

That is the “1-3-5” goal: identify the problem within 1 minute, locate it within 3 minutes, and solve it within 5 minutes.

Problem Identification

The main real-time visualization screen of zCloud displays the running status of the database, such as CPU usage, memory trend, DB Time, used and free tablespace, and the alerts that the DBA should pay attention to.

c1.png

Looking at the alert board on the main visualization screen, we see an alarm at 10:23:05: the number of blocked sessions on Oracle database instance P01 (192.168.99.32) exceeds the threshold of 3 and has currently reached 5. If sessions stay blocked for too long, the user experience suffers badly and more serious problems may follow. An alert of this kind therefore has a high severity level and needs to be checked and handled immediately.

c2.png
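The alert rule described above (more than 3 blocked sessions triggers an alarm) can be illustrated with a minimal sketch. This is not zCloud's implementation; the `Alert` class and `check_blocked_sessions` function are hypothetical names introduced only to show the threshold logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alert:
    """Hypothetical alert record; field names are illustrative only."""
    instance: str
    metric: str
    value: int
    threshold: int

    def message(self) -> str:
        return (f"{self.metric} on {self.instance} exceeds "
                f"{self.threshold}, currently {self.value}")


def check_blocked_sessions(instance: str, blocked: int,
                           threshold: int = 3) -> Optional[Alert]:
    """Return an Alert when the blocked-session count is above the threshold."""
    if blocked > threshold:
        return Alert(instance, "blocked sessions", blocked, threshold)
    return None


# Values taken from the alarm in the article: threshold 3, current value 5.
alert = check_blocked_sessions("P01 (192.168.99.32)", blocked=5)
if alert is not None:
    print(alert.message())
```

A count equal to the threshold does not raise an alert here, which matches the wording "exceeds 3" in the alarm text.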

Meanwhile, DBAs receive the alert by email, SMS, and other channels (as long as the corresponding interface has been configured).

Clicking the alert ID takes us to the alert detail page, which shows the IP address of the affected instance, the severity level, the alarm duration, and a detailed description.

As shown in the picture, the first alarm time is 2020-04-21 10:22:26 and the latest alarm occurred at 10:33:35, so the alarm has already lasted 11 minutes!

c3.png
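As a quick check of the duration quoted above, the elapsed time between the first and latest alarm timestamps can be computed directly. The timestamps come from the article; the variable names are illustrative.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"
first_alarm = datetime.strptime("2020-04-21 10:22:26", FMT)
latest_alarm = datetime.strptime("2020-04-21 10:33:35", FMT)

# Split the elapsed time into whole minutes and remaining seconds.
elapsed = latest_alarm - first_alarm
minutes, seconds = divmod(int(elapsed.total_seconds()), 60)
print(f"alarm active for {minutes} min {seconds} s")
```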

Next, you can conduct in-depth analysis through the real-time performance analysis process provided by zCloud.

Problem Traceability

Entering the monitoring and alarm module of zCloud and finding the P01 instance, we can see that the number of real-time active sessions increased substantially from about 10:20 and has stayed high since, with Application waits accounting for the largest share.

We then narrow the scope of the analysis by selecting the interval from 10:20 to 10:40 in the trend graph, which gives us the TOP SQL and TOP SESSION lists for that period.

The UPDATE statement with SQL ID apqbtnw2abvju accounts for 99.64% of Activity%.

Something must be wrong…

c4.png
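One common way to obtain an activity percentage like the one above is to count how often each SQL ID appears in periodic samples of active sessions. The sketch below assumes that approach; whether zCloud computes Activity% this way is not stated in the article. The sample data, the session IDs, and the second SQL ID are made up for illustration.

```python
from collections import Counter


def top_sql(samples, limit=5):
    """Rank SQL IDs by their share of active-session samples.

    `samples` is a list of (sql_id, session_id) tuples, one per sampled
    active session.
    """
    counts = Counter(sql_id for sql_id, _session_id in samples)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(sql_id, round(100.0 * n / total, 2))
            for sql_id, n in counts.most_common(limit)]


# Made-up samples: 9 of 10 belong to the UPDATE statement from the article.
samples = ([("apqbtnw2abvju", 101)] * 6
           + [("apqbtnw2abvju", 102)] * 3
           + [("hypothetical_sql", 103)])

for sql_id, pct in top_sql(samples):
    print(f"{sql_id}: {pct}%")
```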

Click the SQL ID to view the details of this SQL and analyze its text, its execution plan, and the objects it accesses.

The part marked in red is where the problem lies. The operation in the execution plan is TABLE ACCESS FULL (a full table scan) on the object TEST2, with a CPU cost of 125,886,275 and an IO cost of 1377, not to mention a large number of wait events.

The SQL performance details also include object statistics, so we can view the statistics of the TEST2 table, such as user name, number of rows, block size, and indexes.

c5.png

To analyze the alarm from multiple dimensions, we can also view the status of the active sessions. Unsurprisingly, the TOP SQL is still being executed in the active sessions. There are four such sessions in all, taking up 80% of system resources. The top wait event in these four sessions is enq: TX - row lock contention.

So the cause turns out to be blocked sessions waiting on a row-level lock!

c6.png

Next, on the session blocking analysis tab, the first row shows the blocking source.

With that, the alarm has been quickly traced to its source.

c7.png
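Finding the blocking source amounts to following each waiting session to the session that blocks it until a session that is not itself blocked is reached. The sketch below shows that idea on made-up data; it is not zCloud's implementation. In Oracle, the "blocked by" relationship is exposed through the `BLOCKING_SESSION` column of `V$SESSION`; the session IDs and function names here are hypothetical.

```python
def find_root_blocker(blocked_by, sid):
    """Follow the blocking chain from `sid` to the session at its head."""
    seen = set()
    current = sid
    while blocked_by.get(current) is not None:
        if current in seen:
            raise ValueError(f"blocking cycle detected at session {current}")
        seen.add(current)
        current = blocked_by[current]
    return current


def blocking_sources(blocked_by):
    """Map each root blocker to the sorted list of sessions waiting behind it."""
    roots = {}
    for sid, blocker in blocked_by.items():
        if blocker is None:
            continue  # this session is not waiting on anyone
        roots.setdefault(find_root_blocker(blocked_by, sid), []).append(sid)
    return {root: sorted(waiters) for root, waiters in roots.items()}


# Made-up data: session -> session that blocks it (None means not blocked).
blocked_by = {10: None, 21: 10, 22: 10, 23: 21, 24: 10, 25: 23}

for root, waiters in blocking_sources(blocked_by).items():
    print(f"session {root} blocks {len(waiters)} sessions: {waiters}")
```

In this example session 10 is the single source behind all five blocked sessions, including those that wait on it only indirectly through sessions 21 and 23.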

Problem Processing

zCloud provides a one-click function for killing a session. After locating the high-consumption session, the DBA can click the “kill session” button directly to solve the problem quickly. Because killing a session is a high-risk operation, it is recommended to double-check before executing it.

c8.png
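For reference, the manual equivalent of this step in Oracle is the `ALTER SYSTEM KILL SESSION 'sid,serial#'` statement. The sketch below only builds that statement and models the "double-check" advice by requiring the operator to retype the session identifier. It does not connect to a database and is not how zCloud implements the button; the function names and the sample SID and serial number are hypothetical.

```python
from typing import Optional


def build_kill_statement(sid: int, serial: int, immediate: bool = True) -> str:
    """Build the Oracle statement that terminates one session."""
    # Accept only positive integers so nothing else can reach the SQL text.
    if not (isinstance(sid, int) and isinstance(serial, int)):
        raise TypeError("sid and serial must be integers")
    if sid <= 0 or serial <= 0:
        raise ValueError("sid and serial must be positive")
    statement = f"ALTER SYSTEM KILL SESSION '{sid},{serial}'"
    if immediate:
        statement += " IMMEDIATE"
    return statement


def confirm_and_build(sid: int, serial: int,
                      typed_confirmation: str) -> Optional[str]:
    """Return the statement only if the operator retyped 'sid,serial' exactly."""
    if typed_confirmation.strip() != f"{sid},{serial}":
        return None
    return build_kill_statement(sid, serial)


# Hypothetical session identifiers for illustration.
print(confirm_and_build(123, 4567, "123,4567"))
print(confirm_and_build(123, 4567, "yes"))
```

Requiring the identifier to be retyped, rather than a plain yes/no, makes it harder to kill the wrong session by reflex.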

Problem Review

The zCloud platform allows historical performance data from the past two years to be viewed at any time. You can select a time range to view the historical TOP SQL and drill down to analyze the details.

c9.png

For convenience, zCloud can also generate AWR reports that can be viewed online at any time, which makes it easy to record events. For example, select the period from 10:00 to 11:00 on 2020-04-21 and generate an AWR report for event analysis and record keeping.

c10.png

c11.png

c12.png

zCloud is a database cloud management platform that helps large enterprises manage database resources centrally and use them efficiently in a private cloud environment. It simplifies database operation and maintenance with automated and intelligent functions, helps customers optimize resources to reduce costs, and delivers databases quickly to provide efficient IT support for rapid business growth.

Last modified: 2020-04-24 18:52:01