CN110807017A - AWR report analysis method based on Beautiful Soup analysis technology - Google Patents

AWR report analysis method based on Beautiful Soup analysis technology

Info

Publication number
CN110807017A
Authority
CN
China
Prior art keywords
awr
report
analysis
rpt
parsing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910986091.3A
Other languages
Chinese (zh)
Inventor
潘敏君
吴健
邱涛
王泽荃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Meichuang Science & Technology Co Ltd
Original Assignee
Hangzhou Meichuang Science & Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Meichuang Science & Technology Co Ltd
Priority to CN201910986091.3A
Publication of CN110807017A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an AWR report parsing method based on Beautiful Soup parsing technology. It overcomes the low efficiency, instability and unreliable conclusions of manual AWR report analysis, parses AWR reports in batches in a simpler, more efficient, more stable and more effective way, and analyzes database performance along eight dimensions: host resources, database resources, session logins, SQL parsing, SQL execution, transaction commits, RAC statistics and database parameters. It also provides correct conclusions and suggestions, so that a reader can quickly understand the performance of the database.

Description

AWR report analysis method based on Beautiful Soup analysis technology
Technical Field
The invention relates to the field of database performance analysis and optimization, in particular to an efficient, stable and effective AWR report analysis method based on Beautiful Soup analysis technology.
Background
At present, AWR reports of ORACLE databases are analyzed mainly by hand. Given the same original AWR report, technicians with different knowledge and experience approach it from different angles, use different methods and focus on different indicators, so each conclusion reflects personal experience and its validity cannot be guaranteed. In addition, an AWR report contains numerous indicators spread across many modules with many points of focus, so manual analysis of such a large and complicated set of indicators can hardly yield stable conclusions. Manual analysis is also extremely inefficient when reports must be analyzed in batches or compared against one another.
Disclosure of Invention
The invention aims to overcome the low efficiency, instability and unreliable conclusions of the existing manual analysis of AWR reports, and provides a method for rapidly parsing and analyzing AWR reports of an ORACLE database in batches, based on Beautiful Soup parsing technology.
In order to achieve the purpose, the invention adopts the following technical scheme:
an AWR report analysis method based on Beautiful Soup analysis technology comprises the following steps:
(1-1) setting four sets of parsing templates and the corresponding parameters of each template; the four sets of parsing templates are awr_rpt_u11, awr_rpt_u11.2.0.3, awr_rpt_12 and awr_rpt_cdb;
(1-2) introducing the Beautiful Soup library of the PYTHON language, and converting the HTML file of an AWR report into a DOM object of the AWR report;
(1-3) extracting the description information of each table of parsing template A in the DOM object of the AWR report; storing the description information in dictionary form in the static parameter file static_params.py; parsing template A is any one of awr_rpt_u11, awr_rpt_u11.2.0.3, awr_rpt_12 and awr_rpt_cdb;
(1-4) passing the DOM object of the AWR report into parsing template A;
(1-5) calling a built-in method of the Beautiful Soup library to search for the description information of each table of parsing template A, and locating the content of each table according to the description information;
(1-6) returning the lists that satisfy parsing template A as the result to an analysis program for analysis, to obtain an analysis report.
The invention overcomes the low efficiency, instability and unreliable conclusions of the existing manual analysis of AWR reports, parses AWR reports in batches in a simpler, more efficient, more stable and more effective way, and analyzes database performance along eight dimensions: host resources, database resources, session logins, SQL parsing, SQL execution, transaction commits, RAC statistics and database parameters. It also provides correct conclusions and suggestions, so that a reader can quickly understand the performance of the database.
Preferably, the versions of the analysis report templates are ORACLE 10.2.0.1, ORACLE 10.2.0.4, ORACLE 10.2.0.5, ORACLE 11.2.0.1, ORACLE 11.2.0.4, ORACLE 11.2.0.5, ORACLE 12.2.0.1, or ORACLE 12.2.0.2.
Preferably, the HTML file of the AWR report is obtained using the SQL script select output from table(dbms_workload_repository.AWR_REPORT_HTML(:v_dbid, :inst_id, :b_id, :e_id, 0)) and is stored in a storage directory; the storage directory is formed by concatenating the RAW_AWR_FOLDER attribute of the current_app object, the history_id of the AWR report HTML file, and the file name of the AWR report HTML file.
Preferably, a parsing class is set for each table in the DOM object of the AWR report; for example, the parsing class Memory_Dynamic_Components converts the data of the Memory Dynamic Components table into a list or dictionary object, to facilitate data processing.
Therefore, the invention has the following beneficial effects: AWR reports are parsed in batches in a simpler, more efficient, more stable and more effective way, and database performance is analyzed along eight dimensions: host resources, database resources, session logins, SQL parsing, SQL execution, transaction commits, RAC statistics and database parameters; correct conclusions and suggestions are also provided, so that a reader can quickly understand the performance of the database.
Drawings
FIG. 1 is a flow chart of the present invention.
Detailed Description
The invention is further described with reference to the following figures and detailed description.
The embodiment shown in fig. 1 is an AWR report parsing method based on a Beautiful Soup parsing technology, and includes the following steps:
1. Setting four sets of parsing templates and the corresponding parameters of each template; the four sets of parsing templates are awr_rpt_u11, awr_rpt_u11.2.0.3, awr_rpt_12 and awr_rpt_cdb.
2. Introducing the Beautiful Soup library of the PYTHON language, and converting the AWR report (HTML file) into an AWR report (DOM object).
2.1 Using the SQL script "select output from table(dbms_workload_repository.AWR_REPORT_HTML(:v_dbid, :inst_id, :b_id, :e_id, 0));" to obtain the AWR report (HTML file), which is stored in the storage directory;
the storage directory is formed by concatenating the RAW_AWR_FOLDER attribute of the current_app object, the history_id of the AWR report (HTML file), and the file name of the AWR report (HTML file).
For example, an AWR report (HTML file) of version 11.2.0.4 (the AWR report version number is determined manually) is obtained using the SQL script; with RAW_AWR_FOLDER = 'C:\User\do-main-tool\static' and history_id = 1, the AWR report (HTML file) is stored at 'C:\User\do-main-tool\static\1\awr11.2.0.4.html'.
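The path concatenation in step 2.1 can be sketched as follows. This is a minimal illustration, not the applicant's code: the function name build_report_path is hypothetical, and ntpath is used only so that the Windows-style path from the example comes out the same on any operating system. The SQL text is kept as a constant; running it would need an Oracle client, which is not shown.

```python
import ntpath

# SQL text from step 2.1; the bind variables would be supplied by
# whatever Oracle client is used (not shown in this sketch).
AWR_HTML_SQL = (
    "select output from table("
    "dbms_workload_repository.awr_report_html("
    ":v_dbid, :inst_id, :b_id, :e_id, 0))"
)


def build_report_path(raw_awr_folder, history_id, file_name):
    """Join folder, history id and file name into the storage path.

    build_report_path is a hypothetical helper name. ntpath gives
    Windows-style joining regardless of the host operating system.
    """
    return ntpath.join(raw_awr_folder, str(history_id), file_name)


report_path = build_report_path(r"C:\User\do-main-tool\static", 1, "awr11.2.0.4.html")
print(report_path)
```

In an application the folder would come from the RAW_AWR_FOLDER attribute rather than a literal string.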
2.2 Introducing the Beautiful Soup library of the PYTHON language, and using it to convert the AWR report (HTML file) obtained in step 2.1 into an AWR report (DOM object).
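A minimal sketch of step 2.2, assuming the bs4 package is installed. The sample HTML is an invented stand-in for a real AWR report, and load_awr_report is a hypothetical helper name; the built-in "html.parser" backend is chosen here only because it needs no extra dependency.

```python
from bs4 import BeautifulSoup

# A tiny invented stand-in for a real AWR report; real reports are far larger.
SAMPLE_HTML = """
<html><body>
<h3>Memory Dynamic Components</h3>
<table summary="This table displays memory dynamic components">
<tr><th>Component</th><th>Begin Snap Size (Mb)</th></tr>
<tr><td>DEFAULT buffer cache</td><td>1,024.00</td></tr>
</table>
</body></html>
"""


def load_awr_report(html_text):
    """Parse AWR report HTML into a BeautifulSoup tree (the "DOM object")."""
    # "html.parser" is Python's built-in backend, so no extra parser is needed.
    return BeautifulSoup(html_text, "html.parser")


soup = load_awr_report(SAMPLE_HTML)
tables = soup.find_all("table")
print(len(tables), tables[0]["summary"])
```

In practice html_text would be read from the file stored in step 2.1.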
2.3 Writing one parsing class for each table in the AWR report (DOM object) from step 2.2.
For example, the parsing template needs the value of buffer cache, which is in the Memory Dynamic Components table of the AWR report (DOM object); the parsing class of the Memory Dynamic Components table is named Memory_Dynamic_Components, and it converts the data of the Memory Dynamic Components table into a list or dictionary object, to facilitate data processing.
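One way such a parsing class might look is sketched below. Only the class name Memory_Dynamic_Components comes from the text; the constructor, the parse method, the column headings and the sample rows are assumptions made for illustration.

```python
from bs4 import BeautifulSoup


class Memory_Dynamic_Components:
    """Hypothetical parsing class for the Memory Dynamic Components table."""

    def __init__(self, table):
        self.table = table  # a bs4 Tag for one <table>

    def parse(self):
        """Return the table as a list of dicts keyed by the header cells."""
        rows = self.table.find_all("tr")
        if not rows:
            return []
        header = [th.get_text(strip=True) for th in rows[0].find_all("th")]
        records = []
        for row in rows[1:]:
            cells = [td.get_text(strip=True) for td in row.find_all("td")]
            # Skip rows whose cell count does not match the header.
            if len(cells) == len(header):
                records.append(dict(zip(header, cells)))
        return records


# Invented sample data, not taken from a real report.
html = """
<table>
<tr><th>Component</th><th>Current Size (Mb)</th></tr>
<tr><td>DEFAULT buffer cache</td><td>1,024.00</td></tr>
<tr><td>shared pool</td><td>512.00</td></tr>
</table>
"""
table = BeautifulSoup(html, "html.parser").find("table")
records = Memory_Dynamic_Components(table).parse()
print(records[0])
```

Returning plain dictionaries keeps the later template logic independent of Beautiful Soup objects.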
3. Extracting the description information of each table in the AWR report (DOM object) corresponding to the parsing template awr_rpt_u11.2.0.3;
for a high-version AWR report (DOM object), the text content of the table's summary is extracted directly; for a low version, other tags (h3 tag, p tag) are used instead, bearing in mind that some tables have no h3 tag and some have no p tag; these data are stored in the static parameter file static_params.py.
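The two extraction paths can be sketched as below, under the assumption that "summary" refers to the summary attribute of the table tag and that the low-version fallback is the nearest preceding h3 or p tag. The function name table_description and the sample HTML are illustrative only.

```python
from bs4 import BeautifulSoup


def table_description(table):
    """Return a table's description text, or None if nothing usable is found."""
    summary = table.get("summary")
    if summary:
        return summary.strip()
    # Fallback (assumed): nearest preceding <h3> or <p> in document order.
    label = table.find_previous(["h3", "p"])
    return label.get_text(strip=True) if label else None


# Invented sample: one table with a summary, one described by an <h3>.
html = """
<table summary="This table displays instance efficiency percentages"><tr><td>x</td></tr></table>
<h3>Top Timed Events</h3>
<table><tr><td>y</td></tr></table>
"""
soup = BeautifulSoup(html, "html.parser")
# Keyed by table sequence number, mirroring a dictionary-style parameter file.
descriptions = {i: table_description(t) for i, t in enumerate(soup.find_all("table"))}
print(descriptions)
```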
3.1 Getting the AWR report (DOM object) version. The version value of the AWR report (DOM object) is stored in the first table. The parsing function pars() of that table's parsing class dblnstinfo is therefore called; it returns a dictionary object, and the value of the key release is the version number.
3.2 Selecting the MarkDown template awr_rpt_u11.2.0.3 and the parameters in static_params.py that correspond to the AWR report (DOM object) version number.
4. Passing the AWR report (DOM object) into the MarkDown template awr_rpt_u11.2.0.3;
4.1 passing the AWR report (DOM object) as the first item into the awr_rpt_u11.2.0.3 template;
4.2 passing the parsing classes of all the tables from step 2.3 as the second item into the awr_rpt_u11.2.0.3 template;
4.3 passing the database parameters in static_params.py as the third item into the awr_rpt_u11.2.0.3 template;
4.4 setting a CommonMethod class for processing tables, which normalizes table data types and output formats. The CommonMethod class contains data processing functions such as the int method, which removes ',' characters and spaces from string data in a table, converts the result to float data and keeps two decimal places; this class is passed as the fourth item into the awr_rpt_u11 template.
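The int method described in step 4.4 could look roughly like this. The class and method names follow the text; the static-method form and the use of round() to keep two decimal places are assumptions.

```python
class CommonMethod:
    """Sketch of shared helpers that normalize table cell values."""

    @staticmethod
    def int(text):
        """Strip ',' and spaces from a numeric string; return a float with two decimals.

        The name follows the description in step 4.4. Defined here it is only
        an attribute of the class and does not replace the built-in int.
        """
        cleaned = text.replace(",", "").replace(" ", "")
        return round(float(cleaned), 2)


print(CommonMethod.int(" 12,345.678 "))
```

Centralizing this conversion means every parsing class sees numbers in the same format, whatever thousands separators the report used.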
5. Calling a built-in method of the Beautiful Soup library to search for the elements that meet the requirement (the requirement being the description information of each table of the parsing template awr_rpt_u11.2.0.3), and locating the matching table contents according to the description information;
5.1 Tables in a version 11.2.0.4 AWR report are simple to locate, and the find_all() method of the Beautiful Soup library can be called directly.
The find_all() function finds all elements that meet the requirement (the description information of each table of the parsing template awr_rpt_u11). During parsing, the sequence number of the table to be parsed is passed into the parsing function, and the parsing script obtains the corresponding text content from the static parameter file static_params.py. Then the find_all method of BeautifulSoup is called to find all tables containing the description information.
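A minimal sketch of the lookup in step 5.1, assuming the description text is matched against each table's summary attribute. The dictionary TABLE_DESCRIPTIONS stands in for the contents of static_params.py, and its entries, like the function name find_tables, are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical excerpt of static_params.py: table sequence number -> description.
TABLE_DESCRIPTIONS = {
    1: "database instance information",
    2: "memory dynamic components",
}


def find_tables(soup, seq_no):
    """Return every <table> whose summary contains the stored description."""
    wanted = TABLE_DESCRIPTIONS[seq_no]
    # bs4 calls the function with each summary value (None when the attribute is absent).
    return soup.find_all("table", summary=lambda s: s is not None and wanted in s)


# Invented sample with two described tables and one without a summary.
html = """
<table summary="This table displays database instance information"><tr><td>a</td></tr></table>
<table summary="This table displays memory dynamic components"><tr><td>b</td></tr></table>
<table><tr><td>c</td></tr></table>
"""
soup = BeautifulSoup(html, "html.parser")
matches = find_tables(soup, 2)
print(len(matches), matches[0].td.get_text())
```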
5.2 All the parsing classes from step 2.3 further process the result returned in step 5.1; the common processing modes are:
a) Returning three values: the parsing status (success or failure), the list of table rows, and the title
b) Converting row or column headings into dictionary objects
c) Directly returning a list of all or some of the rows of the table
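The three processing modes can be illustrated on rows that have already been extracted as lists of strings. All function names and the sample data are hypothetical; the sketch only shows the shape of each return value.

```python
def parse_rows(title, rows):
    """Mode (a): return (status, data rows, title); status is False when there is no data."""
    if len(rows) < 2:
        return False, [], title
    return True, rows[1:], title


def rows_to_dict(rows):
    """Mode (b): map each row heading (first cell) to a dict of column heading -> value."""
    header = rows[0]
    return {row[0]: dict(zip(header[1:], row[1:])) for row in rows[1:]}


def first_rows(rows, limit=None):
    """Mode (c): return all data rows, or only the first `limit` of them."""
    data = rows[1:]
    return data if limit is None else data[:limit]


# Invented sample rows; the first row holds the column headings.
rows = [
    ["Component", "Begin Size (Mb)", "Current Size (Mb)"],
    ["DEFAULT buffer cache", "1,024.00", "1,024.00"],
    ["shared pool", "512.00", "640.00"],
]
status, data, title = parse_rows("Memory Dynamic Components", rows)
by_component = rows_to_dict(rows)
print(status, by_component["shared pool"]["Current Size (Mb)"], first_rows(rows, 1))
```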
6. After the tables are located in step 5, the lists that satisfy the conditions of the analysis report template awr_rpt_u11.2.0.3 are returned as the result and submitted to the analysis program for analysis, until the analysis is complete and an analysis report is formed.
6.1 The parsing result is returned to the analysis report template; the data are either displayed in the awr_rpt_u11.2.0.3 template or used in logical calculations to judge whether the database is abnormal. For example, the buffer cache hit rate parameter in section 2.1.2 (buffer cache) of the awr_rpt_u11.2.0.3 template has a reference value of 90% to 100%; a value below 90% indicates an abnormality.
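The buffer cache check in step 6.1 amounts to a range test, sketched here. The function name and the "normal"/"abnormal" labels are hypothetical, and treating a value above 100% as abnormal is an assumption, since the text only covers values below 90%.

```python
def check_buffer_cache_hit_rate(hit_rate_percent, low=90.0, high=100.0):
    """Return 'normal' inside the 90%-100% reference range, else 'abnormal'."""
    # Values above 100% are treated as abnormal too (an assumption).
    return "normal" if low <= hit_rate_percent <= high else "abnormal"


print(check_buffer_cache_hit_rate(99.2), check_buffer_cache_hit_rate(85.0))
```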
6.2 Writing the analyzed content (a character string) into a Markdown file, and saving the Markdown file to the storage location from step 2.1.
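Step 6.2 can be sketched with the standard library alone. The helper name, the file name awr_report.md and the report text are hypothetical, and a temporary directory stands in for the storage directory of step 2.1.

```python
import os
import tempfile


def write_markdown_report(content, folder, file_name):
    """Write the analyzed content to a Markdown file and return its path."""
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, file_name)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(content)
    return path


# A temporary directory stands in for the step 2.1 storage directory.
demo_folder = tempfile.mkdtemp()
saved = write_markdown_report(
    "# AWR analysis\n\nBuffer cache hit rate: normal\n", demo_folder, "awr_report.md"
)
print(saved)
```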
Explanation of terms:
AWR: AWR stands for Automatic Workload Repository. Following its existing methodology, ORACLE breaks performance analysis down into fine-grained indicators, collects them at regular intervals and stores the collected information in the database; when a problem occurs, a report can be generated to analyze the indicators. This report is the AWR report. The AWR report is rich in content and detailed in its indicators: it covers basic database information, host-related information, global database performance indicators, local SQL performance indicators, hot access objects and so on, and is backed by a performance analysis methodology based on a time model.
DOM: DOM stands for Document Object Model, which allows the content and structure of a document to be accessed and modified in a platform- and language-independent manner. For example, in Web development, JavaScript is used to access, create, delete or modify the document structure of HTML.
Beautiful Soup: Beautiful Soup is a Python library for extracting data from HTML or XML files; working with the parser of your choice, it provides the usual ways of navigating, searching and modifying a document. Beautiful Soup can convert an HTML document into a DOM model, making it convenient to search for the desired elements and data.
High version: ORACLE version 11.2.0.3 and above
Low version: versions below ORACLE 11.2.0.3
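The high/low distinction can be expressed as a tuple comparison on the release string. This is a sketch under the assumption that the release value is a dot-separated string such as '11.2.0.4.0'; the function names are hypothetical.

```python
def parse_release(release):
    """Turn a release string such as '11.2.0.4.0' into a tuple of ints."""
    return tuple(int(part) for part in release.strip().split("."))


def is_high_version(release, threshold=(11, 2, 0, 3)):
    """True for ORACLE 11.2.0.3 and above, False for anything below it."""
    return parse_release(release) >= threshold


print(is_high_version("11.2.0.4.0"), is_high_version("10.2.0.5.0"))
```

Comparing integer tuples avoids the pitfalls of comparing version strings character by character.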
It should be understood that this example is for illustrative purposes only and is not intended to limit the scope of the present invention. Further, it should be understood that various changes or modifications of the present invention may be made by those skilled in the art after reading the teaching of the present invention, and such equivalents may fall within the scope of the present invention as defined in the appended claims.

Claims (4)

1. An AWR report analysis method based on Beautiful Soup analysis technology is characterized by comprising the following steps:
(1-1) setting four sets of parsing templates and the corresponding parameters of each template; the four sets of parsing templates are awr_rpt_u11, awr_rpt_u11.2.0.3, awr_rpt_12 and awr_rpt_cdb;
(1-2) introducing the Beautiful Soup library of the PYTHON language, and converting the HTML file of an AWR report into a DOM object of the AWR report;
(1-3) extracting the description information of each table of parsing template A in the DOM object of the AWR report; storing the description information in dictionary form in the static parameter file static_params.py; parsing template A is any one of awr_rpt_u11, awr_rpt_u11.2.0.3, awr_rpt_12 and awr_rpt_cdb;
(1-4) passing the DOM object of the AWR report into parsing template A;
(1-5) calling a built-in method of the Beautiful Soup library to search for the description information of each table of parsing template A, and locating the content of each table according to the description information;
(1-6) returning the lists that satisfy parsing template A as the result to an analysis program for analysis, to obtain an analysis report.
2. The AWR report parsing method based on Beautiful Soup parsing technology as claimed in claim 1, wherein the version of the parsing report template is ORACLE 10.2.0.1, ORACLE 10.2.0.4, ORACLE 10.2.0.5, ORACLE 11.2.0.1, ORACLE 11.2.0.4, ORACLE 11.2.0.5, ORACLE 12.2.0.1 or ORACLE 12.2.0.2.
3. The AWR report parsing method based on Beautiful Soup parsing technology as claimed in claim 1, wherein the SQL script select output from table(dbms_workload_repository.AWR_REPORT_HTML(:v_dbid, :inst_id, :b_id, :e_id, 0)) is used to obtain the HTML file of the AWR report, and the HTML file is stored in the storage directory; the storage directory is formed by concatenating the RAW_AWR_FOLDER attribute of the current_app object, the history_id of the AWR report HTML file, and the file name of the AWR report HTML file.
4. The AWR report parsing method based on Beautiful Soup parsing technology as claimed in claim 1, 2 or 3, wherein each table in the DOM object of the AWR report is provided with a parsing class, and the parsing class is used for converting each table data in the DOM object of the AWR report into a list or a dictionary object, so that data processing is facilitated.
CN201910986091.3A 2019-10-16 2019-10-16 AWR report analysis method based on Beautiful Soup analysis technology Pending CN110807017A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910986091.3A CN110807017A (en) 2019-10-16 2019-10-16 AWR report analysis method based on Beautiful Soup analysis technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910986091.3A CN110807017A (en) 2019-10-16 2019-10-16 AWR report analysis method based on Beautiful Soup analysis technology

Publications (1)

Publication Number Publication Date
CN110807017A true CN110807017A (en) 2020-02-18

Family

ID=69488506

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910986091.3A Pending CN110807017A (en) 2019-10-16 2019-10-16 AWR report analysis method based on Beautiful Soup analysis technology

Country Status (1)

Country Link
CN (1) CN110807017A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114742033A (en) * 2022-06-10 2022-07-12 武汉四通信息服务有限公司 Data analysis method and device, storage medium and electronic equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107506276A (en) * 2017-06-26 2017-12-22 杭州沃趣科技股份有限公司 A kind of method for realizing batch collection Oracle AWR reports
CN107656858A (en) * 2016-07-26 2018-02-02 深圳联友科技有限公司 A kind of method and system of automatic O&M monitoring oracle database
CN108363761A (en) * 2018-02-02 2018-08-03 深圳市华讯方舟软件信息有限公司 Hadoop awr automatic loads analyze information bank, analysis method and storage medium

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
LIRONGYANG: "Python study notes: parsing HTML with the BeautifulSoup module", 《Cnblogs (博客园), HTTPS://WWW.CNBLOGS.COM/LIRONGYANG/P/9692549.HTML》 *
WAN了个蛋: "Simple application of the BeautifulSoup template", 《Cnblogs (博客园), HTTPS://WWW.CNBLOGS.COM/QTCLM/P/11317034》 *
寸草心2130: "Static files for xadmin etc. fail to load after a Django project goes live (dja", 《CSDN, HTTPS://BLOG.CSDN.NET/QQ_35531549/ARTICLE/DETAILS/86600406》 *
小小工匠: "Common ORACLE performance monitoring SQL [Part 2]", 《CSDN, HTTPS://BLOG.CSDN.NET/YANGSHANGWEI/ARTICLE/DETAILS/52917132》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (Application publication date: 20200218)