一、 概述

随着信息系统业务需求快速增长,业务系统关联日益复杂,信息系统性能问题逐渐突显,一旦出现信息系统性能问题及不可用问题,将严重影响信息系统的稳定运行及用户体验。

结合运维实践,数据库性能问题是造成信息系统性能下降和非停的重要原因之一,如何进行常态化的数据库性能趋势分析,及时发现数据库性能衰减“病灶”,常态化提升信息系统性能,避免救火式性能优化,成为衡量信息系统管理部门运营能力的重要指标之一。

二、研究目标

使用python语言进行Oracle数据库性能趋势分析。

三、工具介绍

开发语言:python2.7

数据库:Oracle11.2.0.4

Web框架:Django

图形展示工具:echart

四、算法介绍

核心算法由运行可靠率、资源竞争率、进程等待率和SQL稳定率四部分组成,如下图所示,本文主要以SQL稳定率为例:

Trend =100-100*sum(( c_time-h_time) /h_time)

说明:

Trend: 表示信息系统性能趋势(%)

c_time: 前一小时SQL平均执行时间(秒)

h_time: 3个月内SQL平均执行时间(秒)

五、效果展示

(1)、系统性能趋势:

(2)、TOPSQL性能趋势分析

(3)、TOPSQL日性能趋势分析

(4)、TOPSQL月性能趋势分析

六、核心代码

核心代码分为数据采集层、数据转换层、web展示层。

(1)、数据采集层:


点击(此处)折叠或打开

def get_topsql_info(username,password,ip,port,dbname,c_type,param=0,b_param=0):

s_top10 = ''

#s_snap_id = 0

print oracle_link_target



if c_type == 'sql_topsql':

sql_topsql="

select round(Elapsed_Time, 2) Elapsed_Time,

round(cpu_time, 2) cpu_time,

Executions,

round(elap_per_exec, 2) elap_per_exec,

round(total_db_time, 2) total_db_time,

sql_id,

substr(nvl(sql_module, ' ** SQL module Not Available ** '), 1, 30) sql_module,

sql_text

from (select nvl((sqt.elap / 1000000), to_number(null)) Elapsed_Time,

nvl((sqt.cput / 1000000), to_number(null)) CPU_Time,

sqt.exec Executions,

decode(sqt.exec,

0,

to_number(null),

(sqt.elap / sqt.exec / 1000000)) Elap_per_Exec,

(100 *

(sqt.elap /

(SELECT sum(e.VALUE) - sum(b.value)

FROM DBA_HIST_SYS_TIME_MODEL e, DBA_HIST_SYS_TIME_MODEL b

WHERE B.SNAP_ID = "+str(b_param)+"

AND E.SNAP_ID = "+str(param)+"

AND B.DBID = (select dbid from v$database)

AND E.DBID = (select dbid from v$database)

AND B.INSTANCE_NUMBER =

(select instance_number from v$instance)

AND E.INSTANCE_NUMBER =

(select instance_number from v$instance)

and e.STAT_NAME = 'DB time'

and b.stat_name = 'DB time'))) Total_DB_Time,

sqt.sql_id,

to_char(decode(sqt.module,

null,

null,

'Module: ' || sqt.module)) SQL_Module,

nvl(to_char(substr(st.sql_text, 1, 30)),

' ** SQL Text Not Available ** ') SQL_Text

from (select sql_id,

max(module) module,

sum(elapsed_time_delta) elap,

sum(cpu_time_delta) cput,

sum(executions_delta) exec

from dba_hist_sqlstat

dba_hist_sqltext st

where st.sql_id(+) = sqt.sql_id

order by nvl(sqt.elap, -1) desc, sqt.sql_id)

where rownum < 100

"

elif c_type == 'top10':

#a list of top10: m_top10

m_top10=get_hsql_info(t,'top10')



#after get top10



#end get top10

for h_sql_id in m_top10:

l_sql_id = h_sql_id[0]

s_top10 = s_top10+",'"+l_sql_id+"'"

s_top10 = s_top10.strip(',')

sql_hsql_top10="select sql_id,to_char(substr(sql_text,1,2000)) sql_text,length(sql_text) sql_length,command_type from dba_hist_sqltext t where t.sql_id in ("+s_top10+')'

else:

cmd=sql_tablespace



#print s_top10

#print log_cmd_i



cmd =""

if c_type == 'sql_topsql':

cmd=sql_topsql

elif c_type == 'top10':

cmd=sql_hsql_top10

else:

cmd=sql_tablespace

#print len(m_top10)



print 'before get topsql exe sql: '

print cmd

print 'get db shell: '

conn = cx_Oracle.connect(oracle_link_target)

cursor = conn.cursor()

cur = cursor.execute(cmd)

db_list = cur.fetchall()

#print 'before return db_list'

#print db_list

return db_list



cursor.close()

conn.close()


(2)、数据转换层


点击(此处)折叠或打开

select row_number() over(partition by ip order by to_number(total_db_time) desc) rn,

ip,

db_name,

sql_id,

decode(elap_per_exec, '0', 0.01, elap_per_exec) elap_per_exec,

decode(elap_avg_exec, '0', 0.01, elap_avg_exec) elap_avg_exec,

decode(sign(decode(elap_avg_exec, '0', 0.01, elap_avg_exec) - decode(elap_per_exec, '0', 0.01, elap_per_exec)),

1,

'up',

-1,

'down',

'equ') sql_status,

round((decode(elap_avg_exec, '0', 0.01, elap_avg_exec) -

decode(elap_per_exec, '0', 0.01, elap_per_exec)) /

decode(elap_avg_exec, '0', 0.01, elap_avg_exec),

2) sql_cont,

executions,

total_db_time,

substr(sql_module, 1, 12) sql_module,

substr(sql_text, 1, 12) sql_text,

ch_date

from (select rownum rn,

d.ip,

d.db_name,

d.sql_id,

replace(d.elap_per_exec, 'None', 0) elap_per_exec,

e.elap_avg_exec,

d.executions,

d.sql_module,

d.sql_text,

d.ch_date,

d.total_db_time

from hsql.h_topsql d,

(select b.ip,

b.sql_id,

round(avg(replace(b.elap_per_exec, 'None', 0)),

2) elap_avg_exec

from hsql.h_topsql_bak b

group by b.ip, b.sql_id) e

where d.sql_id = e.sql_id

and d.ip = e.ip)));


(3)、web展示层


点击(此处)折叠或打开

def topsql_line_servlet(request):

cursor = conn.cursor()

query = "select ip,

(select service_name

from hsql.h_instance h

where h.ip = b.ip

and rownum = 1) service_name,

sql_id,

executions,

elap_per_exec,

to_char(ch_date, 'hh34:mi') sj,

to_char(ch_date, 'yyyy-mm-dd') rq

from hsql.h_topsql b

where ch_date > trunc(sysdate)

order by sj"



print query

cursor.execute(query)

resultset = cursor.fetchall()

cursor.close()

conn.close()


七、总结

通过Oracle性能趋势分析工具的应用可以进行细粒度的数据库性能管理,及时发现潜在的信息系统性能衰减隐患,通过持续性、常态化的信息系统性能优化,优化信息系统提升,提升用户体验。