WO2020259391A1 - Database script performance testing method and device - Google Patents

Database script performance testing method and device Download PDF

Info

Publication number
WO2020259391A1
WO2020259391A1 PCT/CN2020/096905 CN2020096905W WO2020259391A1 WO 2020259391 A1 WO2020259391 A1 WO 2020259391A1 CN 2020096905 W CN2020096905 W CN 2020096905W WO 2020259391 A1 WO2020259391 A1 WO 2020259391A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
database
abstract
table information
sentence
Prior art date
Application number
PCT/CN2020/096905
Other languages
French (fr)
Chinese (zh)
Inventor
谢锐杰
Original Assignee
深圳前海微众银行股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳前海微众银行股份有限公司 filed Critical 深圳前海微众银行股份有限公司
Publication of WO2020259391A1 publication Critical patent/WO2020259391A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/36Preventing errors by testing or debugging software
    • G06F11/3668Software testing
    • G06F11/3672Test management
    • G06F11/3684Test management for test design, e.g. generating new test cases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/36Preventing errors by testing or debugging software
    • G06F11/3668Software testing
    • G06F11/3672Test management
    • G06F11/3688Test management for test execution, e.g. scheduling of test suites
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/217Database tuning

Definitions

  • the embodiment of the present invention relates to the field of financial technology (Fintech), and in particular to a method and device for testing database script performance.
  • the first one is for banks/financial institutions to simulate actual online production database data by calculating numbers in the test environment.
  • the disadvantage of this scheme lies in the time cost of the number creation process, and it will inevitably increase the resource requirements of the test environment.
  • the second is for banks/financial institutions to build a set of quasi-production environment database for performance testing through online production database data copies.
  • the shortcomings of this solution are related to resource requirements.
  • the database resource requirements of banks/financial institutions will double, resulting in a shortage of resources.
  • the embodiment of the present invention provides a method and device for testing database script performance, which is used to save the manpower and resource expenses of manufacturing a test environment or preparing a quasi-production environment.
  • an embodiment of the present invention provides a method for testing database script performance, including:
  • the performance index of the database script is determined.
  • the performance index of the database script is obtained, which is used to indicate the priority of determining the script optimization, which can save testing Human resources for environmental construction and reduction of resource expenditure for preparing quasi-production environment.
  • the preprocessing of the statements in the database script information includes:
  • the information related to the actual business in the sentence of the database script information is deleted.
  • the consumption of the abstract sentence includes current consumption and estimated consumption
  • the table information of the production database includes data volume information, index information, index discrimination information, and data growth trend
  • table information of the test database Including data volume information, index information, index discrimination information, and data growth trends
  • Determining the consumption of the abstracted statement according to the table information of the production database and/or the table information of the test database, the abstracted statement corresponding to the database script information, and the execution plan of the abstracted statement include:
  • the plan is to determine the current consumption of abstract sentences in the table information of the production database; and combine the data growth trend in the table information of the production database to substitute estimated recent data statistics into the table information of the production database
  • the execution plan of the abstraction statement determines the estimated consumption of the abstraction statement that exists in the table information of the production database
  • the data volume information and/or index information and/or index discrimination degree in the table information of the test database corresponding to the abstract sentences in the table information of the production database are not present
  • the information is converted according to a preset ratio, and the converted data volume information and/or index information and/or index discrimination information are substituted into the execution plan of the abstract statement that does not exist in the table information of the production database, and the non-existence is determined.
  • the current consumption of abstract sentences that exist in the table information of the production database; and the data growth trend in the table information of the test database corresponding to the abstract sentences that do not exist in the table information of the production database is substituted into the estimated recent data statistics information
  • the execution plan of the abstract sentence without the table information of the production database determines the estimated consumption of the abstract sentence without the table information of the production database.
  • the determining the performance index of the database script according to the consumption of the abstraction sentence and the execution frequency of the abstraction sentence includes:
  • a four-quadrant classification is performed to obtain the abstract sentence Performance.
  • the method further includes:
  • an embodiment of the present invention provides an apparatus for testing database script performance, including:
  • the obtaining unit is used to obtain table information of the online production database, table information of the test database, and database script information in the test database test process;
  • the processing unit is used to preprocess the sentences in the database script information to obtain the abstract sentences corresponding to the database script information; and determine the execution frequency of the abstract sentences; according to the table information of the production database And/or the table information of the test database, the abstraction sentence corresponding to the database script information, and the execution plan of the abstraction sentence, determine the consumption of the abstraction sentence;
  • the execution frequency of the abstract statement determines the performance index of the database script.
  • processing unit is specifically configured to:
  • the information related to the actual business in the sentence of the database script information is deleted.
  • the consumption of the abstract sentence includes current consumption and estimated consumption
  • the table information of the production database includes data volume information, index information, index discrimination information, and data growth trend
  • table information of the test database Including data volume information, index information, index discrimination information, and data growth trends
  • the processing unit is specifically used for:
  • the plan is to determine the current consumption of abstract sentences in the table information of the production database; and combine the data growth trend in the table information of the production database to substitute estimated recent data statistics into the table information of the production database
  • the execution plan of the abstraction statement determines the estimated consumption of the abstraction statement that exists in the table information of the production database
  • the data volume information and/or index information and/or index discrimination degree in the table information of the test database corresponding to the abstract sentences in the table information of the production database are not present
  • the information is converted according to a preset ratio, and the converted data volume information and/or index information and/or index discrimination information are substituted into the execution plan of the abstract statement that does not exist in the table information of the production database, and the non-existence is determined.
  • the current consumption of abstract sentences that exist in the table information of the production database; and the data growth trend in the table information of the test database corresponding to the abstract sentences that do not exist in the table information of the production database is substituted into the estimated recent data statistics information
  • the execution plan of the abstract sentence without the table information of the production database determines the estimated consumption of the abstract sentence without the table information of the production database.
  • processing unit is specifically configured to:
  • a four-quadrant classification is performed to obtain the abstract sentence Performance.
  • processing unit is further configured to:
  • the execution frequency of the database script determines the optimization plan of the database script.
  • an embodiment of the present invention also provides a computing device, including:
  • Memory used to store program instructions
  • the processor is configured to call the program instructions stored in the memory, and execute the above-mentioned database script performance test method according to the obtained program.
  • embodiments of the present invention also provide a computer-readable non-volatile storage medium, including computer-readable instructions, which when the computer reads and executes the computer-readable instructions, cause the computer to execute the aforementioned database script performance Method of testing.
  • FIG. 1 is a schematic diagram of a system architecture provided by an embodiment of the present invention
  • FIG. 2 is a schematic flowchart of a method for testing database script performance according to an embodiment of the present invention
  • Figure 3 is a schematic diagram of a four-quadrant distribution provided by an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of an apparatus for testing database script performance provided by an embodiment of the present invention.
  • FIG. 1 exemplarily shows a system architecture to which the embodiment of the present invention is applicable.
  • the system architecture may be a server 100 used by a bank/financial institution for database script testing.
  • the server 100 may include a database statistical information collection module 110, The script sentence capture module 120, the sentence analysis module 130, and the report management module 140.
  • the database statistical information collection module 110 can collect database statistical information of banks/financial institutions, that is, table information, including table field information, data volume information, index information, index discrimination information, and data growth trends.
  • the database statistical information collection module 110 can collect table information of an online production database of a bank/financial institution and table information of a test database.
  • the script sentence capturing module 120 can capture the sentence information executed during the database test of the bank/financial institution in the test environment, abstract the captured sentences, classify and count the frequency.
  • the sentence analysis module 130 can calculate sentence consumption in combination with the sentence execution plan and the database statistics of the bank/financial institution.
  • the report management module 140 can perform four-quadrant classification based on the sentence frequency information and sentence consumption. And generate related statement performance test report.
  • FIG. 1 is only an example, which is not limited in the embodiment of the present invention.
  • FIG. 2 exemplarily shows the process of a method for testing database script performance provided by an embodiment of the present invention.
  • the process can be executed by a database script performance testing device, which can be located in the above-mentioned bank/financial institution.
  • the server 100 of the bank/financial institution may also be the server 100.
  • the SQL database will be taken as an example, and the statements involved are SQL statements.
  • the embodiment of the present invention is only an example, and is not limited to the SQL database.
  • the process specifically includes:
  • Step 201 Obtain table information of the online production database, table information of the test database, and database script information in the test database test process.
  • the database statistical information collection module 110 can be used to collect table information of the online production database and collect table information of the test database.
  • the script sentence capturing module 120 is used to capture the database script information during the test database test, and the script information includes a large amount of sentence information. It should be noted that when collecting the table information of the online generation database, the table information of the standby database of the production database can be collected.
  • the table information may include table field information, data volume information, index information, index discrimination information, and data growth trends.
  • Step 202 Preprocess the sentences in the database script information to obtain abstract sentences corresponding to the database script information; and determine the execution frequency of the abstract sentences.
  • the script statement capturing module 120 After the script statement capturing module 120 captures the script information of the database, it can delete the information related to the actual business in the statement of the database script information, and retain the main information required for execution plan analysis, which is equivalent to abstracting the statement. For example, you can delete the numerical information in the where condition, so that the script only retains the main information needed for execution plan analysis, and the rest are abstract statements.
  • the SQL script information captured by the script statement capturing module 120 may be as shown in Table 1.
  • the elements that affect the execution plan of the SQL statement include the table name (or subquery) after the From keyword, the field information after the where condition and the order of the fields, and the simple after the Select keyword The field name (the most commonly used), the constant data after the set keyword in the Update statement and the constant data after the where keyword in the Update statement do not affect the execution plan. Then the data in Table 1 above is abstracted and preprocessed, The obtained SQL abstraction statement can be shown in Figure 4.
  • the five abstract sentences shown in Table 2 can be grouped into one category, as shown in Table 3. After completing the SQL abstraction sentence classification, the execution frequency of the SQL abstraction sentence is counted according to the number of occurrences, and the SQL abstraction sentence and its execution frequency are registered in the table.
  • Step 203 Determine the consumption of the abstract sentence according to the table information of the production database and/or the table information of the test database, the abstraction sentence corresponding to the database script information, and the execution plan of the abstraction sentence .
  • the consumption of abstract sentences includes current consumption and estimated consumption.
  • determining the consumption of the abstract sentence it can be specifically: for the abstract sentence that exists in the table information of the production database, substituting the data volume information and/or index information and/or index discrimination information in the table information of the production database into There is an execution plan for the abstract statement of the table information of the production database, and the current consumption of the abstract statement of the table information of the production database is determined. And combined with the data growth trend in the table information of the production database, the estimated recent data statistics information is substituted into the execution plan of the abstract statement of the table information of the production database, and the estimation of the abstract statement of the table information of the production database is determined. Consumption.
  • the data volume information and/or index information and/or index discrimination information in the table information of the test database corresponding to the abstract statements in the table information of the production database are Perform conversion at a preset ratio, and substitute the converted data volume information and/or index information and/or index discrimination information into the execution plan of the abstract statement that does not exist in the production database table information, and determine that there is no production database table information The current consumption of abstract statements.
  • the statement analysis module 130 can be used to analyze the execution plan of the SQL abstract statement, and execute the Explain command that comes with the SQL database on the SQL abstract statement in the test environment to obtain the execution plan of the SQL abstract statement running in the test environment. .
  • the statistical information collected by the production database (content included in the table information) is substituted into the execution plan to calculate SQL consumption.
  • the number of scanned rows in the execution plan is taken as the statistical latitude. SQL consumption statistics.
  • the current consumption of the SQL abstraction statement can be calculated, combined with the data growth trend, the estimated recent data statistics are substituted into the execution plan, and the calculation Estimated consumption of SQL abstraction statements.
  • the estimated recent data is based on the data in the historical table combined with data growth trends.
  • the data volume information of the entire table substituted into the statistical data is used for consumption statistics.
  • the data volume information of the entire table and the index discrimination information are converted into index data volume information for consumption statistics.
  • the statistical data in the table information of the production database is as follows:
  • the data volume in table a is 20000, and the growth trend is linear growth, with a periodical growth step of 3000;
  • the data volume in the statistics of table b is 50000, because table b is a primary key index, the amount of scanned data is 1;
  • the data volume of table c is 10000, and the increasing trend is that the statistical average value of the index discrimination degree of the first 8 bits of the tmpsno field is 1000, and the scanned data volume obtained by combining conversion is 10000/1000 about 10.
  • the growth trend is a curve growth, and the slope change of the last few cycles is +0.15.
  • SQL consumption is 20000+(1*3000)+(10000+10000*0.15)/1000, which is about 23012.
  • the SQL consumption is 20000+(3*3000)+(10000*(1+0.15) ⁇ 3/1000) about 29015.
  • the table information of the test database If there is no data in production, according to the table information of the test database, it is converted according to a certain preset ratio and substituted into the calculation of the consumption of the SQL abstraction statement (including current consumption and estimated consumption).
  • the preset ratio can be set based on experience.
  • Step 204 Determine the performance index of the database script according to the consumption of the abstract sentence and the execution frequency of the abstract sentence.
  • the post-conversion consumption of abstract sentences can be determined according to the consumption and consumption threshold of abstract sentences, and the post-conversion consumption of abstract sentences can be determined according to the execution frequency and frequency threshold of abstract sentences. frequency.
  • the four-quadrant classification is performed to obtain the performance indicators of abstract sentences.
  • the performance index of abstract sentences is the content after four-quadrant classification. Among them, the current consumption and estimated consumption of abstract sentences are classified into four quadrants respectively.
  • the consumption threshold and frequency threshold can be set based on experience.
  • the SQL consumption in Table 2 is the consumption of SQL abstract statements.
  • the four-quadrant classification of the data in Table 2 above can obtain the four-quadrant distribution of SQL consumption as shown in Figure 3.
  • the SQL consumption value is a positive number that needs to be optimized, and the higher the frequency, the higher the optimization priority.
  • an optimization plan for the database script can be determined based on the performance indicators of the database script, the characteristics of the database, the table information of the production database, the table information of the test database, and the execution frequency of the abstract statement.
  • the report management module 140 is mainly used to provide the SQL script statement adjustment and optimization plan.
  • the SQL script statement optimization has mature optimization implementation plans in the industry, and the optimization plan and business logic are relatively related.
  • the embodiment of the present invention is based on the original optimization plan. On the basis of this, some special points related to the MYSQL database and the script sentence optimization proposal combined with statistical information are proposed.
  • the field order of SQL database query conditions will affect the execution consumption of SQL statements.
  • traverse the different sorting combinations in the query respectively count the consumption of SQL abstract statements, and select the lowest cost sorting method Mark as the best sort.
  • Mark the involved non-optimal sorted SQL abstraction statement as a sorted optimized SQL statement.
  • the index information in the basic statistical information of the database and the statistical information of the abstract sentence are obtained, and the secondary statistics are performed to obtain the fields in the query conditions corresponding to each table.
  • the number of occurrences to determine whether the field has an index, index type, and the position of the field in the joint index are Count the fields corresponding to the frequently occurring tables. If the field has no index, it is marked as the field information that may need to be optimized.
  • the report management module 140 can sort the analysis results and the SQL script sentence optimization plan, generate related SQL script sentence performance test reports, and submit it to the project team for SQL script sentence optimization.
  • the priority during optimization can be high frequency high consumption, low frequency high consumption.
  • the data growth trend information in the table information may not be collected, which can save the workload of the database statistical information collection module 110 and can only analyze the current consumption of the current current SQL statements. In this case, the estimated consumption of SQL statements cannot be achieved, and early warning cannot be given.
  • the execution frequency of the above abstract statements may not be counted. In this case, it is only necessary to perform full optimization on the SQL abstract statements analyzed as high consumption.
  • the embodiment of the present invention shows that by obtaining the table information of the online production database, the table information of the test database, and the database script information in the test process, the sentences in the database script information are preprocessed to obtain the abstract sentences corresponding to the database script information , And determine the execution frequency of the abstract statement, and determine the consumption of the abstract statement according to the table information of the production database and/or the table information of the test database, the abstract statement corresponding to the database script information, and the execution plan of the abstract statement.
  • the consumption of abstract sentences and the execution frequency of the abstract sentences determine the performance index of the database script. Because the table information of the online production database and the table information of the test database are combined to perform the performance test of the database script, it can save the manpower and resource expenses of the test environment or the preparation of the quasi-production environment.
  • database script performance test process in the embodiment of the present invention can be applied to various database script performance tests in banks/financial institutions.
  • FIG. 4 exemplarily shows the structure of a database script performance test device provided by an embodiment of the present invention.
  • the device can execute the process of database script performance test.
  • the device can be located in the server 100. , Or the server 100.
  • the device specifically includes:
  • the obtaining unit 401 is configured to obtain table information of the online production database, table information of the test database, and database script information during the test database test process;
  • the processing unit 402 is configured to preprocess the sentences in the database script information to obtain abstract sentences corresponding to the database script information; and determine the execution frequency of the abstract sentences; according to the table of the production database Information and/or the table information of the test database, the abstraction sentence corresponding to the database script information, and the execution plan of the abstraction sentence, determine the consumption of the abstraction sentence; according to the consumption sum of the abstraction sentence
  • the execution frequency of the abstract sentence determines the performance index of the database script.
  • processing unit 402 is specifically configured to:
  • the information related to the actual business in the sentence of the database script information is deleted.
  • the consumption of the abstract sentence includes current consumption and estimated consumption
  • the table information of the production database includes data volume information, index information, index discrimination information, and data growth trend
  • table information of the test database Including data volume information, index information, index discrimination information, and data growth trends
  • the processing unit 402 is specifically configured to:
  • the plan is to determine the current consumption of abstract sentences in the table information of the production database; and combine the data growth trend in the table information of the production database to substitute estimated recent data statistics into the table information of the production database
  • the execution plan of the abstraction statement determines the estimated consumption of the abstraction statement that exists in the table information of the production database
  • the data volume information and/or index information and/or index discrimination degree in the table information of the test database corresponding to the abstract sentences in the table information of the production database are not present
  • the information is converted according to a preset ratio, and the converted data volume information and/or index information and/or index discrimination information are substituted into the execution plan of the abstract statement that does not exist in the table information of the production database, and the non-existence is determined.
  • the current consumption of abstract sentences that exist in the table information of the production database; and the data growth trend in the table information of the test database corresponding to the abstract sentences that do not exist in the table information of the production database is substituted into the estimated recent data statistics information
  • the execution plan of the abstract sentence without the table information of the production database determines the estimated consumption of the abstract sentence without the table information of the production database.
  • processing unit 402 is specifically configured to:
  • a four-quadrant classification is performed to obtain the abstract sentence Performance.
  • processing unit 402 is further configured to:
  • the execution frequency of the database script determines the optimization plan of the database script.
  • an embodiment of the present invention also provides a computing device, including:
  • Memory used to store program instructions
  • the processor is configured to call the program instructions stored in the memory, and execute the above-mentioned database script performance test method according to the obtained program.
  • the embodiments of the present invention also provide a computer-readable non-volatile storage medium, including computer-readable instructions.
  • the computer reads and executes the computer-readable instructions, the computer executes the above-mentioned database. Script performance testing method.
  • These computer program instructions can also be stored in a computer-readable memory that can guide a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including the instruction device.
  • the device implements the functions specified in one process or multiple processes in the flowchart and/or one block or multiple blocks in the block diagram.
  • These computer program instructions can also be loaded on a computer or other programmable data processing equipment, so that a series of operation steps are executed on the computer or other programmable equipment to produce computer-implemented processing, so as to execute on the computer or other programmable equipment.
  • the instructions provide steps for implementing functions specified in a flow or multiple flows in the flowchart and/or a block or multiple blocks in the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Data Mining & Analysis (AREA)
  • Debugging And Monitoring (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A database script performance testing method and device. Said method comprises: acquiring table information of an online production database, table information of a test database, and database script information in a test process (201); preprocessing sentences in the database script information to obtain abstraction sentences corresponding to the database script information, and determining an execution frequency of the abstraction sentences (202); determining consumption of the abstraction sentences according to the table information of the production database and/or the table information of the test database, the abstraction sentences corresponding to the database script information, and an execution plan of the abstraction sentences (203); and determining performance indicators of a database script according to the consumption of the abstraction sentences and the execution frequency of the abstraction sentences (204). As the performance of the database script is tested by combining the table information of the on-line production database and the table information of the test database, number fabrication in a test environment or the human labour and resource overhead for preparing a quasi-production environment can be reduced.

Description

一种数据库脚本性能测试的方法及装置Method and device for testing database script performance
相关申请的交叉引用Cross references to related applications
本申请要求在2019年06月26日提交中国专利局、申请号为201910562575.5、申请名称为“一种数据库脚本性能测试的方法及装置”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。This application claims the priority of a Chinese patent application filed with the Chinese Patent Office on June 26, 2019, the application number is 201910562575.5, and the application name is "a method and device for database script performance testing", the entire content of which is incorporated by reference In this application.
技术领域Technical field
本发明实施例涉及金融科技(Fintech)领域,尤其涉及一种数据库脚本性能测试的方法及装置。The embodiment of the present invention relates to the field of financial technology (Fintech), and in particular to a method and device for testing database script performance.
背景技术Background technique
随着计算机技术的发展,越来越多的技术(如大数据、分布式、区块链(Blockchain)、人工智能等)应用在金融领域,传统金融业正在逐步向金融科技(Fintech)转变,性能测试技术也不例外,但由于金融、支付行业的安全性、实时性要求,也对技术提出的更高的要求。With the development of computer technology, more and more technologies (such as big data, distributed, Blockchain, artificial intelligence, etc.) are applied in the financial field. The traditional financial industry is gradually changing to Fintech. Performance testing technology is no exception, but due to the security and real-time requirements of the financial and payment industries, it also places higher requirements on technology.
目前,银行/金融机构在进行数据库脚本性能测试时,通常有以下两种形式:第一种为银行/金融机构通过在测试环境造数的方式来模拟实际的线上生产数据库的数据。这种方案的缺点在于造数过程的时间开销,同时也必然加大测试环境的资源需求,数据量大了测试资源就需要扩充,数据量小了可能无法达到性能测试的数据需求。第二种为银行/金融机构通过线上生产数据库的数据副本,搭建一套准生产环境数据库用作性能测试使用。这种方案的缺点在与资源的需求上,对银行/金融机构的数据库资源需求会产生翻倍,造成资源紧缺。At present, when banks/financial institutions conduct database script performance testing, there are usually two forms: The first one is for banks/financial institutions to simulate actual online production database data by calculating numbers in the test environment. The disadvantage of this scheme lies in the time cost of the number creation process, and it will inevitably increase the resource requirements of the test environment. When the amount of data is large, the test resources need to be expanded. If the amount of data is small, the data requirements for performance testing may not be met. The second is for banks/financial institutions to build a set of quasi-production environment database for performance testing through online production database data copies. The shortcomings of this solution are related to resource requirements. The database resource requirements of banks/financial institutions will double, resulting in a shortage of resources.
发明内容Summary of the invention
本发明实施例提供一种数据库脚本性能测试的方法及装置,用以节约测 试环境造数或准备准生产环境的人力及资源开销。The embodiment of the present invention provides a method and device for testing database script performance, which is used to save the manpower and resource expenses of manufacturing a test environment or preparing a quasi-production environment.
第一方面,本发明实施例提供的一种数据库脚本性能测试的方法,包括:In the first aspect, an embodiment of the present invention provides a method for testing database script performance, including:
获取线上生产数据库的表信息、测试数据库的表信息以及所述测试数据库测试过程中的数据库脚本信息;Acquiring table information of the online production database, table information of the test database, and database script information during the test process of the test database;
对所述数据库脚本信息中的语句进行预处理,得到所述数据库脚本信息对应的抽象化语句;并确定出所述抽象化语句的执行频次;Preprocessing the sentences in the database script information to obtain abstract sentences corresponding to the database script information; and determining the execution frequency of the abstract sentences;
根据所述生产数据库的表信息和/或所述测试数据库的表信息、所述数据库脚本信息对应的抽象化语句以及所述抽象化语句的执行计划,确定所述抽象化语句的消耗;Determine the consumption of the abstraction sentence according to the table information of the production database and/or the table information of the test database, the abstraction sentence corresponding to the database script information, and the execution plan of the abstraction sentence;
根据所述抽象化语句的消耗和所述抽象化语句的执行频次,确定出所述数据库脚本的性能指标。According to the consumption of the abstract sentences and the execution frequency of the abstract sentences, the performance index of the database script is determined.
上述技术方案中,由于结合了线上生产数据库的表信息和测试数据库的表信息来进行数据库脚本的性能测试,从而得到数据库脚本的性能指标,用于指示判别脚本优化的优先级,可以节省测试环境造数的人力资源以及降低准备准生产环境的资源开销。In the above technical solution, since the table information of the online production database and the table information of the test database are combined to perform the performance test of the database script, the performance index of the database script is obtained, which is used to indicate the priority of determining the script optimization, which can save testing Human resources for environmental construction and reduction of resource expenditure for preparing quasi-production environment.
可选的,所述对所述数据库脚本信息中的语句进行预处理,包括:Optionally, the preprocessing of the statements in the database script information includes:
将所述数据库脚本信息的语句中的与实际业务相关的信息删除。The information related to the actual business in the sentence of the database script information is deleted.
可选的,所述抽象化语句的消耗包括当前消耗和预估消耗;所述生产数据库的表信息包括数据量信息、索引信息、索引区分度信息、数据增长趋势;所述测试数据库的表信息包括数据量信息、索引信息、索引区分度信息、数据增长趋势;Optionally, the consumption of the abstract sentence includes current consumption and estimated consumption; the table information of the production database includes data volume information, index information, index discrimination information, and data growth trend; table information of the test database Including data volume information, index information, index discrimination information, and data growth trends;
所述根据所述生产数据库的表信息和/或所述测试数据库的表信息、所述数据库脚本信息对应的抽象化语句以及所述抽象化语句的执行计划,确定所述抽象化语句的消耗,包括:Determining the consumption of the abstracted statement according to the table information of the production database and/or the table information of the test database, the abstracted statement corresponding to the database script information, and the execution plan of the abstracted statement, include:
对于存在生产数据库的表信息的抽象化语句,将生产数据库的表信息中的数据量信息和/或索引信息和/或索引区分度信息代入所述存在生产数据库的表信息的抽象化语句的执行计划,确定出所述存在生产数据库的表信息的 抽象化语句的当前消耗;以及结合生产数据库的表信息中的数据增长趋势将预估的近期数据统计信息代入所述存在生产数据库的表信息的抽象化语句的执行计划,确定出所述存在生产数据库的表信息的抽象化语句的预估消耗;For abstract sentences that exist in the table information of the production database, substituting the data volume information and/or index information and/or index discrimination information in the table information of the production database into the abstract statements that exist in the table information of the production database. The plan is to determine the current consumption of abstract sentences in the table information of the production database; and combine the data growth trend in the table information of the production database to substitute estimated recent data statistics into the table information of the production database The execution plan of the abstraction statement determines the estimated consumption of the abstraction statement that exists in the table information of the production database;
对于不存在生产数据库的表信息的抽象化语句,将所述不存在生产数据库的表信息的抽象化语句对应的测试数据库的表信息中的数据量信息和/或索引信息和/或索引区分度信息按照预设比例进行转换,将转换后的数据量信息和/或索引信息和/或索引区分度信息代入所述不存在生产数据库的表信息的抽象化语句的执行计划,确定出所述不存在生产数据库的表信息的抽象化语句的当前消耗;以及结合所述不存在生产数据库的表信息的抽象化语句对应的测试数据库的表信息中的数据增长趋势将预估的近期数据统计信息代入所述不存在生产数据库的表信息的抽象化语句的执行计划,确定出所述不存在生产数据库的表信息的抽象化语句的预估消耗。For abstract sentences that do not exist in the table information of the production database, the data volume information and/or index information and/or index discrimination degree in the table information of the test database corresponding to the abstract sentences in the table information of the production database are not present The information is converted according to a preset ratio, and the converted data volume information and/or index information and/or index discrimination information are substituted into the execution plan of the abstract statement that does not exist in the table information of the production database, and the non-existence is determined. The current consumption of abstract sentences that exist in the table information of the production database; and the data growth trend in the table information of the test database corresponding to the abstract sentences that do not exist in the table information of the production database is substituted into the estimated recent data statistics information The execution plan of the abstract sentence without the table information of the production database determines the estimated consumption of the abstract sentence without the table information of the production database.
可选的,所述根据所述抽象化语句的消耗和所述抽象化语句的执行频次,确定出所述数据库脚本的性能指标,包括:Optionally, the determining the performance index of the database script according to the consumption of the abstraction sentence and the execution frequency of the abstraction sentence includes:
根据所述抽象化语句的消耗和消耗阈值,确定出所述抽象化语句的转换后消耗;Determine the post-conversion consumption of the abstract sentence according to the consumption and consumption threshold of the abstract sentence;
根据所述抽象化语句的执行频次和频次阈值,确定出所述抽象化语句的转换后频次;Determine the converted frequency of the abstract sentence according to the execution frequency and the frequency threshold of the abstract sentence;
根据所述抽象化语句的消耗、所述抽象化语句的执行频次、所述抽象化语句的转换后消耗和所述抽象化语句的转换后频次,进行四象限分类,得到所述抽象化语句的性能指标。According to the consumption of the abstract sentence, the execution frequency of the abstract sentence, the post-conversion consumption of the abstract sentence, and the post-conversion frequency of the abstract sentence, a four-quadrant classification is performed to obtain the abstract sentence Performance.
可选的,在所述确定出所述数据库脚本的性能指标之后,还包括:Optionally, after determining the performance index of the database script, the method further includes:
若还存在数据核对状态为核对失败的同步信息,则根据所述数据核对状态为核对失败的同步信息中的时间分片信息,重新从所述联机处理系统的数据库中读取数据,插入到所述批量处理系统的数据库中,并更新所述补充处理状态根据所述数据库脚本的性能指标、所述数据库的特征、所述生产数据库的表信息、所述测试数据库的表信息和所述抽象化语句的执行频次,确定 出所述数据库脚本的优化方案。If there is still synchronization information whose data verification status is verification failure, according to the time slicing information in the synchronization information whose data verification status is verification failure, re-read data from the database of the online processing system and insert it into all In the database of the batch processing system, and update the supplementary processing state according to the performance index of the database script, the characteristics of the database, the table information of the production database, the table information of the test database, and the abstraction The execution frequency of the statement determines the optimization plan of the database script.
第二方面,本发明实施例提供一种数据库脚本性能测试的装置,包括:In a second aspect, an embodiment of the present invention provides an apparatus for testing database script performance, including:
获取单元,用于获取线上生产数据库的表信息、测试数据库的表信息以及所述测试数据库测试过程中的数据库脚本信息;The obtaining unit is used to obtain table information of the online production database, table information of the test database, and database script information in the test database test process;
处理单元,用于对所述数据库脚本信息中的语句进行预处理,得到所述数据库脚本信息对应的抽象化语句;并确定出所述抽象化语句的执行频次;根据所述生产数据库的表信息和/或所述测试数据库的表信息、所述数据库脚本信息对应的抽象化语句以及所述抽象化语句的执行计划,确定所述抽象化语句的消耗;根据所述抽象化语句的消耗和所述抽象化语句的执行频次,确定出所述数据库脚本的性能指标。The processing unit is used to preprocess the sentences in the database script information to obtain the abstract sentences corresponding to the database script information; and determine the execution frequency of the abstract sentences; according to the table information of the production database And/or the table information of the test database, the abstraction sentence corresponding to the database script information, and the execution plan of the abstraction sentence, determine the consumption of the abstraction sentence; The execution frequency of the abstract statement determines the performance index of the database script.
可选的,所述处理单元具体用于:Optionally, the processing unit is specifically configured to:
将所述数据库脚本信息的语句中的与实际业务相关的信息删除。The information related to the actual business in the sentence of the database script information is deleted.
可选的,所述抽象化语句的消耗包括当前消耗和预估消耗;所述生产数据库的表信息包括数据量信息、索引信息、索引区分度信息、数据增长趋势;所述测试数据库的表信息包括数据量信息、索引信息、索引区分度信息、数据增长趋势;Optionally, the consumption of the abstract sentence includes current consumption and estimated consumption; the table information of the production database includes data volume information, index information, index discrimination information, and data growth trend; table information of the test database Including data volume information, index information, index discrimination information, and data growth trends;
所述处理单元具体用于:The processing unit is specifically used for:
对于存在生产数据库的表信息的抽象化语句,将生产数据库的表信息中的数据量信息和/或索引信息和/或索引区分度信息代入所述存在生产数据库的表信息的抽象化语句的执行计划,确定出所述存在生产数据库的表信息的抽象化语句的当前消耗;以及结合生产数据库的表信息中的数据增长趋势将预估的近期数据统计信息代入所述存在生产数据库的表信息的抽象化语句的执行计划,确定出所述存在生产数据库的表信息的抽象化语句的预估消耗;For abstract sentences that exist in the table information of the production database, substituting the data volume information and/or index information and/or index discrimination information in the table information of the production database into the abstract statements that exist in the table information of the production database. The plan is to determine the current consumption of abstract sentences in the table information of the production database; and combine the data growth trend in the table information of the production database to substitute estimated recent data statistics into the table information of the production database The execution plan of the abstraction statement determines the estimated consumption of the abstraction statement that exists in the table information of the production database;
对于不存在生产数据库的表信息的抽象化语句,将所述不存在生产数据库的表信息的抽象化语句对应的测试数据库的表信息中的数据量信息和/或索引信息和/或索引区分度信息按照预设比例进行转换,将转换后的数据量信息和/或索引信息和/或索引区分度信息代入所述不存在生产数据库的表信息的 抽象化语句的执行计划,确定出所述不存在生产数据库的表信息的抽象化语句的当前消耗;以及结合所述不存在生产数据库的表信息的抽象化语句对应的测试数据库的表信息中的数据增长趋势将预估的近期数据统计信息代入所述不存在生产数据库的表信息的抽象化语句的执行计划,确定出所述不存在生产数据库的表信息的抽象化语句的预估消耗。For abstract sentences that do not exist in the table information of the production database, the data volume information and/or index information and/or index discrimination degree in the table information of the test database corresponding to the abstract sentences in the table information of the production database are not present The information is converted according to a preset ratio, and the converted data volume information and/or index information and/or index discrimination information are substituted into the execution plan of the abstract statement that does not exist in the table information of the production database, and the non-existence is determined. The current consumption of abstract sentences that exist in the table information of the production database; and the data growth trend in the table information of the test database corresponding to the abstract sentences that do not exist in the table information of the production database is substituted into the estimated recent data statistics information The execution plan of the abstract sentence without the table information of the production database determines the estimated consumption of the abstract sentence without the table information of the production database.
可选的,所述处理单元具体用于:Optionally, the processing unit is specifically configured to:
根据所述抽象化语句的消耗和消耗阈值,确定出所述抽象化语句的转换后消耗;Determine the post-conversion consumption of the abstract sentence according to the consumption and consumption threshold of the abstract sentence;
根据所述抽象化语句的执行频次和频次阈值,确定出所述抽象化语句的转换后频次;Determine the converted frequency of the abstract sentence according to the execution frequency and the frequency threshold of the abstract sentence;
根据所述抽象化语句的消耗、所述抽象化语句的执行频次、所述抽象化语句的转换后消耗和所述抽象化语句的转换后频次,进行四象限分类,得到所述抽象化语句的性能指标。According to the consumption of the abstract sentence, the execution frequency of the abstract sentence, the post-conversion consumption of the abstract sentence, and the post-conversion frequency of the abstract sentence, a four-quadrant classification is performed to obtain the abstract sentence Performance.
可选的,所述处理单元还用于:Optionally, the processing unit is further configured to:
在所述确定出所述数据库脚本的性能指标之后,根据所述数据库脚本的性能指标、所述数据库的特征、所述生产数据库的表信息、所述测试数据库的表信息和所述抽象化语句的执行频次,确定出所述数据库脚本的优化方案。After the performance index of the database script is determined, according to the performance index of the database script, the characteristics of the database, the table information of the production database, the table information of the test database, and the abstract sentence The execution frequency of the database script determines the optimization plan of the database script.
第三方面,本发明实施例还提供了一种计算设备,包括:In the third aspect, an embodiment of the present invention also provides a computing device, including:
存储器,用于存储程序指令;Memory, used to store program instructions;
处理器,用于调用所述存储器中存储的程序指令,按照获得的程序执行上述数据库脚本性能测试的方法。The processor is configured to call the program instructions stored in the memory, and execute the above-mentioned database script performance test method according to the obtained program.
第四方面,本发明实施例还提供了一种计算机可读非易失性存储介质,包括计算机可读指令,当计算机读取并执行所述计算机可读指令时,使得计算机执行上述数据库脚本性能测试的方法。In a fourth aspect, embodiments of the present invention also provide a computer-readable non-volatile storage medium, including computer-readable instructions, which when the computer reads and executes the computer-readable instructions, cause the computer to execute the aforementioned database script performance Method of testing.
附图说明Description of the drawings
为了更清楚地说明本发明实施例中的技术方案,下面将对实施例描述中 所需要使用的附图作简要介绍,显而易见地,下面描述中的附图仅仅是本发明的一些实施例,对于本领域的普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。In order to more clearly describe the technical solutions in the embodiments of the present invention, the following will briefly introduce the drawings needed in the description of the embodiments. Obviously, the drawings in the following description are only some embodiments of the present invention. For those of ordinary skill in the art, without creative work, other drawings can be obtained from these drawings.
图1为本发明实施例提供的一种系统架构的示意图;FIG. 1 is a schematic diagram of a system architecture provided by an embodiment of the present invention;
图2为本发明实施例提供的一种数据库脚本性能测试的方法的流程示意图;2 is a schematic flowchart of a method for testing database script performance according to an embodiment of the present invention;
图3为本发明实施例提供的一种四象限分布的示意图;Figure 3 is a schematic diagram of a four-quadrant distribution provided by an embodiment of the present invention;
图4为本发明实施例提供的一种数据库脚本性能测试的装置的结构示意图。FIG. 4 is a schematic structural diagram of an apparatus for testing database script performance provided by an embodiment of the present invention.
具体实施方式Detailed ways
为了使本发明的目的、技术方案和优点更加清楚,下面将结合附图对本发明作进一步地详细描述,显然,所描述的实施例仅仅是本发明一部分实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其它实施例,都属于本发明保护的范围。In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention will be further described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, rather than all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.
图1示例性的示出了本发明实施例所适用的一种系统架构,该系统架构可以为银行/金融机构用于数据库脚本测试的服务器100,该服务器100可以包括数据库统计信息采集模块110、脚本语句捕获模块120、语句分析模块130和报表管理模块140。FIG. 1 exemplarily shows a system architecture to which the embodiment of the present invention is applicable. The system architecture may be a server 100 used by a bank/financial institution for database script testing. The server 100 may include a database statistical information collection module 110, The script sentence capture module 120, the sentence analysis module 130, and the report management module 140.
其中,数据库统计信息采集模块110可以采集银行/金融机构的数据库统计信息,也就是表信息,包括表字段信息、数据量信息、索引信息、索引区分度信息和数据增长趋势等。例如,数据库统计信息采集模块110可以采集银行/金融机构的线上生产数据库的表信息和测试数据库的表信息。The database statistical information collection module 110 can collect database statistical information of banks/financial institutions, that is, table information, including table field information, data volume information, index information, index discrimination information, and data growth trends. For example, the database statistical information collection module 110 can collect table information of an online production database of a bank/financial institution and table information of a test database.
脚本语句捕获模块120可以捕获测试环境下银行/金融机构的数据库测试过程中执行的语句信息,对捕获的语句进行抽象化,归类并统计频次。The script sentence capturing module 120 can capture the sentence information executed during the database test of the bank/financial institution in the test environment, abstract the captured sentences, classify and count the frequency.
语句分析模块130可以结合语句执行计划及银行/金融机构的数据库统计 信息来计算语句的消耗。The sentence analysis module 130 can calculate sentence consumption in combination with the sentence execution plan and the database statistics of the bank/financial institution.
报表管理模块140可以结合语句频次信息和语句的消耗,进行四象限分类。并生成相关语句性能测试报表。The report management module 140 can perform four-quadrant classification based on the sentence frequency information and sentence consumption. And generate related statement performance test report.
需要说明的是,上述图1所示的结构仅是一种示例,本发明实施例对此不做限定。It should be noted that the structure shown in FIG. 1 is only an example, which is not limited in the embodiment of the present invention.
基于上述描述,图2示例性的示出了本发明实施例提供的一种数据库脚本性能测试的方法的流程,该流程可以由数据库脚本性能测试的装置执行,该装置可以位于上述银行/金融机构的服务器100内,也可以是该银行/金融机构的服务器100。Based on the above description, FIG. 2 exemplarily shows the process of a method for testing database script performance provided by an embodiment of the present invention. The process can be executed by a database script performance testing device, which can be located in the above-mentioned bank/financial institution. The server 100 of the bank/financial institution may also be the server 100.
需要说明的是,下面在描述数据库脚本性能测试的流程中,将以SQL数据库为例,涉及的语句为SQL语句。本发明实施例仅是示例作用,并不限于该SQL数据库。It should be noted that in the following description of the database script performance test process, the SQL database will be taken as an example, and the statements involved are SQL statements. The embodiment of the present invention is only an example, and is not limited to the SQL database.
如图2所示,该流程具体包括:As shown in Figure 2, the process specifically includes:
步骤201,获取线上生产数据库的表信息、测试数据库的表信息以及所述测试数据库测试过程中的数据库脚本信息。Step 201: Obtain table information of the online production database, table information of the test database, and database script information in the test database test process.
在具体应用时,可以通过数据库统计信息采集模块110来采集线上生产数据库的表信息,以及采集测试数据库的表信息。通过脚本语句捕获模块120来捕获测试数据库测试过程中的数据库脚本信息,脚本信息包括大量的语句信息。需要说明的是,采集线上生成数据库的表信息时,可以采集生产数据库的备库的表信息。在本发明实施例中,表信息可以包括表字段信息、数据量信息、索引信息、索引区分度信息和数据增长趋势等信息。In specific applications, the database statistical information collection module 110 can be used to collect table information of the online production database and collect table information of the test database. The script sentence capturing module 120 is used to capture the database script information during the test database test, and the script information includes a large amount of sentence information. It should be noted that when collecting the table information of the online generation database, the table information of the standby database of the production database can be collected. In the embodiment of the present invention, the table information may include table field information, data volume information, index information, index discrimination information, and data growth trends.
步骤202,对所述数据库脚本信息中的语句进行预处理,得到所述数据库脚本信息对应的抽象化语句;并确定出所述抽象化语句的执行频次。Step 202: Preprocess the sentences in the database script information to obtain abstract sentences corresponding to the database script information; and determine the execution frequency of the abstract sentences.
当脚本语句捕获模块120捕获到数据库的脚本信息之后,就可以将数据库脚本信息的语句中的与实际业务相关的信息删除,保留执行计划分析需要的主干信息,相当于对语句进行抽象化处理。例如可以删除where条件中的数值信息,使得脚本只保留执行计划分析需要的主干信息,剩下的就是抽象 化语句。After the script statement capturing module 120 captures the script information of the database, it can delete the information related to the actual business in the statement of the database script information, and retain the main information required for execution plan analysis, which is equivalent to abstracting the statement. For example, you can delete the numerical information in the where condition, so that the script only retains the main information needed for execution plan analysis, and the rest are abstract statements.
举例来说,通过脚本语句捕获模块120捕获的SQL脚本信息可以如表1所示。在表1中,SQL语句的组成中,影响SQL语句执行计划的要素包括From关键字后的表名(或子查询),where条件后的字段信息及字段的先后排序,Select关键字后的简单的字段名(最常用的),Update语句中set关键词后常量数据及where关键字后的常量数据这块是不影响执行计划的,那么将上面表1中的数据进行抽象化预处理后,得到的SQL抽象化语句可以如图4所示。For example, the SQL script information captured by the script statement capturing module 120 may be as shown in Table 1. In Table 1, in the composition of the SQL statement, the elements that affect the execution plan of the SQL statement include the table name (or subquery) after the From keyword, the field information after the where condition and the order of the fields, and the simple after the Select keyword The field name (the most commonly used), the constant data after the set keyword in the Update statement and the constant data after the where keyword in the Update statement do not affect the execution plan. Then the data in Table 1 above is abstracted and preprocessed, The obtained SQL abstraction statement can be shown in Figure 4.
表1Table 1
select*from student where class=’103’and name=’Tom’;select*from student where class=’103’and name=’Tom’;
select name,score from student where class=’103’and name=’Tom’;select name, score from student where class=’103’ and name=’Tom’;
select name,score from student where class=’104’and name=’Juily’;select name, score from student where class=’104’ and name=’Juily’;
delete from student where class=’103’and name=’Tom’;delete from student where class=’103’and name=’Tom’;
update student set score=’100’where class=’103’and name=’Tom’;update student set score=’100’where class=’103’ and name=’Tom’;
然后将图2所示的SQL抽象化语句进行分类,按照执行计划依赖因素来统计。Then classify the SQL abstract statements shown in Figure 2 and count according to the execution plan dependent factors.
表2Table 2
Figure PCTCN2020096905-appb-000001
Figure PCTCN2020096905-appb-000001
可以将表2所示的5个抽象化语句同一为1类,如表3所示。完成SQL抽象化语句分类之后,按照出现的次数统计出SQL抽象化语句的执行频次,并将该SQL抽象化语句及其执行频次登记在表中。The five abstract sentences shown in Table 2 can be grouped into one category, as shown in Table 3. After completing the SQL abstraction sentence classification, the execution frequency of the SQL abstraction sentence is counted according to the number of occurrences, and the SQL abstraction sentence and its execution frequency are registered in the table.
表3table 3
Figure PCTCN2020096905-appb-000002
Figure PCTCN2020096905-appb-000002
步骤203,根据所述生产数据库的表信息和/或所述测试数据库的表信息、所述数据库脚本信息对应的抽象化语句以及所述抽象化语句的执行计划,确定所述抽象化语句的消耗。Step 203: Determine the consumption of the abstract sentence according to the table information of the production database and/or the table information of the test database, the abstraction sentence corresponding to the database script information, and the execution plan of the abstraction sentence .
本发明实施例中,抽象化语句的消耗包括当前消耗和预估消耗。在确定该抽象化语句的消耗时,具体可以为:对于存在生产数据库的表信息的抽象化语句,将生产数据库的表信息中的数据量信息和/或索引信息和/或索引区分度信息代入存在生产数据库的表信息的抽象化语句的执行计划,确定出存在生产数据库的表信息的抽象化语句的当前消耗。以及结合生产数据库的表信息中的数据增长趋势将预估的近期数据统计信息代入存在生产数据库的表信息的抽象化语句的执行计划,确定出存在生产数据库的表信息的抽象化语句的预估消耗。In the embodiment of the present invention, the consumption of abstract sentences includes current consumption and estimated consumption. When determining the consumption of the abstract sentence, it can be specifically: for the abstract sentence that exists in the table information of the production database, substituting the data volume information and/or index information and/or index discrimination information in the table information of the production database into There is an execution plan for the abstract statement of the table information of the production database, and the current consumption of the abstract statement of the table information of the production database is determined. And combined with the data growth trend in the table information of the production database, the estimated recent data statistics information is substituted into the execution plan of the abstract statement of the table information of the production database, and the estimation of the abstract statement of the table information of the production database is determined. Consumption.
对于不存在生产数据库的表信息的抽象化语句,将不存在生产数据库的表信息的抽象化语句对应的测试数据库的表信息中的数据量信息和/或索引信息和/或索引区分度信息按照预设比例进行转换,将转换后的数据量信息和/或索引信息和/或索引区分度信息代入不存在生产数据库的表信息的抽象化语句的执行计划,确定出不存在生产数据库的表信息的抽象化语句的当前消耗。以及结合不存在生产数据库的表信息的抽象化语句对应的测试数据库的表信息中的数据增长趋势将预估的近期数据统计信息代入不存在生产数据库的表信息的抽象化语句的执行计划,确定出不存在生产数据库的表信息的抽象化语句的预估消耗。For abstract statements that do not exist in the table information of the production database, the data volume information and/or index information and/or index discrimination information in the table information of the test database corresponding to the abstract statements in the table information of the production database are Perform conversion at a preset ratio, and substitute the converted data volume information and/or index information and/or index discrimination information into the execution plan of the abstract statement that does not exist in the production database table information, and determine that there is no production database table information The current consumption of abstract statements. And combined with the data growth trend in the table information of the test database corresponding to the abstract statement of the table information that does not exist in the production database, substitute the estimated recent data statistics into the execution plan of the abstract statement of the table information that does not exist in the production database, and determine The estimated consumption of abstract statements that do not exist in the production database table information.
举例来说,可以通过语句分析模块130,就SQL抽象化语句进行执行计划分析,在测试环境中对SQL抽象化语句执行SQL数据库自带的Explain命令得到SQL抽象化语句在测试环境运行的执行计划。For example, the statement analysis module 130 can be used to analyze the execution plan of the SQL abstract statement, and execute the Explain command that comes with the SQL database on the SQL abstract statement in the test environment to obtain the execution plan of the SQL abstract statement running in the test environment. .
对于存在产数据库的表信息的SQL抽象化语句,在执行计划中代入生产数据库采集的统计信息(表信息中包括的内容),计算SQL消耗,这里以执行计划中的扫描行数为统计纬度进行SQL消耗统计。For SQL abstract statements with table information in the production database, the statistical information collected by the production database (content included in the table information) is substituted into the execution plan to calculate SQL consumption. Here, the number of scanned rows in the execution plan is taken as the statistical latitude. SQL consumption statistics.
根据统计的当前正在统计的SQL抽象化语句对应的生产数据库的表的数据量信息可以计算出SQL抽象化语句的当前消耗,结合数据增长趋势,将预估的近期数据统计信息代入执行计划,计算出SQL抽象化语句的预估消耗。该预估的近期数据是依据历史的表中的数据结合数据增长趋势得到的。According to the statistics of the data volume information of the production database table corresponding to the SQL abstraction statement currently being counted, the current consumption of the SQL abstraction statement can be calculated, combined with the data growth trend, the estimated recent data statistics are substituted into the execution plan, and the calculation Estimated consumption of SQL abstraction statements. The estimated recent data is based on the data in the historical table combined with data growth trends.
对于执行计划中查询类型为“ALL”的,代入统计数据的全表的数据量信息进行消耗统计。对于查询类型为“range”和“index”的,结合全表的数据量信息及索引区分度信息,换算成索引数据量信息,进行消耗统计。For the query type "ALL" in the execution plan, the data volume information of the entire table substituted into the statistical data is used for consumption statistics. For the query types "range" and "index", the data volume information of the entire table and the index discrimination information are converted into index data volume information for consumption statistics.
例如,如表4所示的SQL抽象化语句在测试环境中的执行计划。For example, the execution plan of the SQL abstraction statement in the test environment as shown in Table 4.
表4Table 4
Figure PCTCN2020096905-appb-000003
Figure PCTCN2020096905-appb-000003
生产数据库的表信息中统计数据如下:The statistical data in the table information of the production database is as follows:
a表统计信息中数据量20000,增长趋势为线性增长,一个周期增长步长为3000;The data volume in table a is 20000, and the growth trend is linear growth, with a periodical growth step of 3000;
b表统计信息中数据量为50000,由于b表是主键索引,扫描数据量为1;The data volume in the statistics of table b is 50000, because table b is a primary key index, the amount of scanned data is 1;
c表数据量为10000,增长趋势为tmpsno字段前8位的索引区分度统计均值为1000,则结合换算得到的扫描数据量为10000/1000约为10。增长趋势为曲线增长,上几个周期的斜率变化为+0.15。The data volume of table c is 10000, and the increasing trend is that the statistical average value of the index discrimination degree of the first 8 bits of the tmpsno field is 1000, and the scanned data volume obtained by combining conversion is 10000/1000 about 10. The growth trend is a curve growth, and the slope change of the last few cycles is +0.15.
将上述a、b、c三个表整体代入执行计划,可以得到SQL抽象化语句的当前消耗为:20000*1+10约为20010。Substituting the above three tables a, b, and c into the execution plan, the current consumption of the SQL abstraction statement can be obtained: 20000*1+10 is about 20010.
SQL抽象化语句的预估消耗为:The estimated consumption of SQL abstract statements is:
1周期后SQL消耗为20000+(1*3000)+(10000+10000*0.15)/1000约为23012。After 1 cycle, SQL consumption is 20000+(1*3000)+(10000+10000*0.15)/1000, which is about 23012.
3周期后SQL消耗为20000+(3*3000)+(10000*(1+0.15)^3/1000)约为29015。After 3 cycles, the SQL consumption is 20000+(3*3000)+(10000*(1+0.15)^3/1000) about 29015.
对于生产上暂无数据的,根据测试数据库的表信息,按一定预设比例换算并代入计算SQL抽象化语句的消耗(包括当前消耗和预估消耗)。该预设比例可以依据经验设置。If there is no data in production, according to the table information of the test database, it is converted according to a certain preset ratio and substituted into the calculation of the consumption of the SQL abstraction statement (including current consumption and estimated consumption). The preset ratio can be set based on experience.
步骤204,根据所述抽象化语句的消耗和所述抽象化语句的执行频次,确定出所述数据库脚本的性能指标。Step 204: Determine the performance index of the database script according to the consumption of the abstract sentence and the execution frequency of the abstract sentence.
当得到抽象化语句的消耗之后,就可以根据抽象化语句的消耗和消耗阈值,确定出抽象化语句的转换后消耗,根据抽象化语句的执行频次和频次阈值,确定出抽象化语句的转换后频次。然后根据抽象化语句的消耗、抽象化语句的执行频次、抽象化语句的转换后消耗和抽象化语句的转换后频次,进行四象限分类,得到抽象化语句的性能指标。抽象化语句的性能指标也就是四象限分类后的内容。其中,抽象化语句的当前消耗和预估消耗都分别进行四象限分类。该消耗阈值和频次阈值可以依据经验设置。When the consumption of abstract sentences is obtained, the post-conversion consumption of abstract sentences can be determined according to the consumption and consumption threshold of abstract sentences, and the post-conversion consumption of abstract sentences can be determined according to the execution frequency and frequency threshold of abstract sentences. frequency. Then according to the consumption of abstract sentences, the execution frequency of abstract sentences, the consumption after conversion of abstract sentences, and the frequency after conversion of abstract sentences, the four-quadrant classification is performed to obtain the performance indicators of abstract sentences. The performance index of abstract sentences is the content after four-quadrant classification. Among them, the current consumption and estimated consumption of abstract sentences are classified into four quadrants respectively. The consumption threshold and frequency threshold can be set based on experience.
例如,假设现在有一组消耗信息如表2所示,其中,消耗阈值为20000,频次阈值为500。表2中的SQL消耗即为SQL抽象化语句的消耗。将上述表2中的数据进行四象限分类,可以得到如图3所示的SQL消耗四象限分布。其中,在对数据库脚本进行优化时,SQL消耗值为正数的是需要优化的,频次越高的优化优先级越高。For example, suppose there is a set of consumption information as shown in Table 2, where the consumption threshold is 20000 and the frequency threshold is 500. The SQL consumption in Table 2 is the consumption of SQL abstract statements. The four-quadrant classification of the data in Table 2 above can obtain the four-quadrant distribution of SQL consumption as shown in Figure 3. Among them, when optimizing the database script, the SQL consumption value is a positive number that needs to be optimized, and the higher the frequency, the higher the optimization priority.
表5table 5
SQL抽象化语句编号SQL abstraction statement number 执行频次Execution frequency SQL消耗SQL consumption 转换后频次Frequency after conversion 转换后消耗Consumption after conversion
11 1010 3200032000 -490-490 1200012000
22 150150 3030 -350-350 -19970-19970
33 350350 40004000 -150-150 -16000-16000
44 4040 40014001 -460-460 -15999-15999
55 280280 40024002 -220-220 -15998-15998
66 6060 40034003 -440-440 -15997-15997
77 55 40044004 -495-495 -15996-15996
88 8080 40054005 -420-420 -15995-15995
99 23twenty three 40064006 -477-477 -15994-15994
1010 100100 40074007 -400-400 -15993-15993
1111 110110 5000050000 -390-390 3000030000
1212 730730 4800048000 230230 2800028000
1313 750750 50005000 250250 -15000-15000
1414 770770 1500015000 270270 -5000-5000
1515 770770 33333333 270270 -16667-16667
1616 899899 5200052000 399399 3200032000
在得到抽象化语句的性能指标之后,还可以根据数据库脚本的性能指标、数据库的特征、生产数据库的表信息、测试数据库的表信息和抽象化语句的执行频次,确定出数据库脚本的优化方案。After obtaining the performance indicators of the abstract statement, an optimization plan for the database script can be determined based on the performance indicators of the database script, the characteristics of the database, the table information of the production database, the table information of the test database, and the execution frequency of the abstract statement.
这里主要是通过报表管理模块140,给出SQL脚本语句调整优化方案,SQL脚本语句优化现在业界已有成熟的优化实现方案,而且优化方案和业务逻辑关联较大,本发明实施例在原有优化方案的基础上,提出几点MYSQL数据库相关的特殊点及结合统计信息的脚本语句优化建议方案。Here, the report management module 140 is mainly used to provide the SQL script statement adjustment and optimization plan. The SQL script statement optimization has mature optimization implementation plans in the industry, and the optimization plan and business logic are relatively related. The embodiment of the present invention is based on the original optimization plan. On the basis of this, some special points related to the MYSQL database and the script sentence optimization proposal combined with statistical information are proposed.
例如,1、SQL数据库查询条件的字段顺序会影响SQL语句执行消耗,根据SQL抽象化语句的主干信息,遍历查询中的不同排序组合,分别统计SQL抽象化语句的消耗,取消耗最低的排序方式标记为最优排序。并将涉及的非最优排序的SQL抽象化语句标记为待排序优化SQL语句。For example, 1. The field order of SQL database query conditions will affect the execution consumption of SQL statements. According to the main information of SQL abstract statements, traverse the different sorting combinations in the query, respectively count the consumption of SQL abstract statements, and select the lowest cost sorting method Mark as the best sort. And mark the involved non-optimal sorted SQL abstraction statement as a sorted optimized SQL statement.
2、结合生产数据库的表信息和测试数据库的表信息,得出的数据库基础统计信息中的索引信息及抽象化语句的统计信息,进行二次统计,得出每张表对应的查询条件中字段的出现次数,确定字段是否存在索引、索引类型、字段在联合索引中的位置。统计高频出现的表对应的字段,如该字段无索引, 标记为可能需要优化的字段信息。2. Combined with the table information of the production database and the table information of the test database, the index information in the basic statistical information of the database and the statistical information of the abstract sentence are obtained, and the secondary statistics are performed to obtain the fields in the query conditions corresponding to each table. The number of occurrences to determine whether the field has an index, index type, and the position of the field in the joint index. Count the fields corresponding to the frequently occurring tables. If the field has no index, it is marked as the field information that may need to be optimized.
然后报表管理模块140可以整理分析结果和SQL脚本语句优化方案,生成相关SQL脚本语句性能测试报表,提交项目组进行SQL脚本语句优化。进行优化时的优先级可以为高频高消耗、低频高消耗。Then the report management module 140 can sort the analysis results and the SQL script sentence optimization plan, generate related SQL script sentence performance test reports, and submit it to the project team for SQL script sentence optimization. The priority during optimization can be high frequency high consumption, low frequency high consumption.
需要说明的是,在具体实施过程中,可以不采集表信息中的数据增长趋势信息,这样可以节约数据库统计信息采集模块110的工作量,可以只分析当前现状的SQL语句的当前消耗。这种情况下,无法实现SQL语句的预估消耗,无法进行提前预警。It should be noted that in the specific implementation process, the data growth trend information in the table information may not be collected, which can save the workload of the database statistical information collection module 110 and can only analyze the current consumption of the current current SQL statements. In this case, the estimated consumption of SQL statements cannot be achieved, and early warning cannot be given.
可选的,上述抽象化语句的执行频次可以不做统计,这种情况下只需对分析为高消耗的SQL抽象化语句进行全量优化。Optionally, the execution frequency of the above abstract statements may not be counted. In this case, it is only necessary to perform full optimization on the SQL abstract statements analyzed as high consumption.
本发明实施例表明,通过获取线上生产数据库的表信息、测试数据库的表信息、测试过程中的数据库脚本信息,对数据库脚本信息中的语句进行预处理,得到数据库脚本信息对应的抽象化语句,并确定出抽象化语句的执行频次,根据生产数据库的表信息和/或测试数据库的表信息、数据库脚本信息对应的抽象化语句以及抽象化语句的执行计划,确定抽象化语句的消耗,根据抽象化语句的消耗和所述抽象化语句的执行频次,确定出数据库脚本的性能指标。由于结合了线上生产数据库的表信息和测试数据库的表信息来进行数据库脚本的性能测试,可以节省测试环境造数或准备准生产环境的人力及资源开销。The embodiment of the present invention shows that by obtaining the table information of the online production database, the table information of the test database, and the database script information in the test process, the sentences in the database script information are preprocessed to obtain the abstract sentences corresponding to the database script information , And determine the execution frequency of the abstract statement, and determine the consumption of the abstract statement according to the table information of the production database and/or the table information of the test database, the abstract statement corresponding to the database script information, and the execution plan of the abstract statement. The consumption of abstract sentences and the execution frequency of the abstract sentences determine the performance index of the database script. Because the table information of the online production database and the table information of the test database are combined to perform the performance test of the database script, it can save the manpower and resource expenses of the test environment or the preparation of the quasi-production environment.
需要说明的是,本发明实施例中的数据库脚本性能测试的流程可以应用于银行/金融机构中的各种数据库脚本性能测试中。It should be noted that the database script performance test process in the embodiment of the present invention can be applied to various database script performance tests in banks/financial institutions.
基于相同的技术构思,图4示例性的示出了本发明实施例提供的一种数据库脚本性能测试的装置的结构,该装置可以执行数据库脚本性能测试的流程,该装置可以位于上述服务器100内,也可以是该服务器100。Based on the same technical concept, FIG. 4 exemplarily shows the structure of a database script performance test device provided by an embodiment of the present invention. The device can execute the process of database script performance test. The device can be located in the server 100. , Or the server 100.
如图4所示,该装置具体包括:As shown in Figure 4, the device specifically includes:
获取单元401,用于获取线上生产数据库的表信息、测试数据库的表信息以及所述测试数据库测试过程中的数据库脚本信息;The obtaining unit 401 is configured to obtain table information of the online production database, table information of the test database, and database script information during the test database test process;
处理单元402,用于对所述数据库脚本信息中的语句进行预处理,得到所述数据库脚本信息对应的抽象化语句;并确定出所述抽象化语句的执行频次;根据所述生产数据库的表信息和/或所述测试数据库的表信息、所述数据库脚本信息对应的抽象化语句以及所述抽象化语句的执行计划,确定所述抽象化语句的消耗;根据所述抽象化语句的消耗和所述抽象化语句的执行频次,确定出所述数据库脚本的性能指标。The processing unit 402 is configured to preprocess the sentences in the database script information to obtain abstract sentences corresponding to the database script information; and determine the execution frequency of the abstract sentences; according to the table of the production database Information and/or the table information of the test database, the abstraction sentence corresponding to the database script information, and the execution plan of the abstraction sentence, determine the consumption of the abstraction sentence; according to the consumption sum of the abstraction sentence The execution frequency of the abstract sentence determines the performance index of the database script.
可选的,所述处理单元402具体用于:Optionally, the processing unit 402 is specifically configured to:
将所述数据库脚本信息的语句中的与实际业务相关的信息删除。The information related to the actual business in the sentence of the database script information is deleted.
可选的,所述抽象化语句的消耗包括当前消耗和预估消耗;所述生产数据库的表信息包括数据量信息、索引信息、索引区分度信息、数据增长趋势;所述测试数据库的表信息包括数据量信息、索引信息、索引区分度信息、数据增长趋势;Optionally, the consumption of the abstract sentence includes current consumption and estimated consumption; the table information of the production database includes data volume information, index information, index discrimination information, and data growth trend; table information of the test database Including data volume information, index information, index discrimination information, and data growth trends;
所述处理单元402具体用于:The processing unit 402 is specifically configured to:
对于存在生产数据库的表信息的抽象化语句,将生产数据库的表信息中的数据量信息和/或索引信息和/或索引区分度信息代入所述存在生产数据库的表信息的抽象化语句的执行计划,确定出所述存在生产数据库的表信息的抽象化语句的当前消耗;以及结合生产数据库的表信息中的数据增长趋势将预估的近期数据统计信息代入所述存在生产数据库的表信息的抽象化语句的执行计划,确定出所述存在生产数据库的表信息的抽象化语句的预估消耗;For abstract sentences that exist in the table information of the production database, substituting the data volume information and/or index information and/or index discrimination information in the table information of the production database into the abstract statements that exist in the table information of the production database. The plan is to determine the current consumption of abstract sentences in the table information of the production database; and combine the data growth trend in the table information of the production database to substitute estimated recent data statistics into the table information of the production database The execution plan of the abstraction statement determines the estimated consumption of the abstraction statement that exists in the table information of the production database;
对于不存在生产数据库的表信息的抽象化语句,将所述不存在生产数据库的表信息的抽象化语句对应的测试数据库的表信息中的数据量信息和/或索引信息和/或索引区分度信息按照预设比例进行转换,将转换后的数据量信息和/或索引信息和/或索引区分度信息代入所述不存在生产数据库的表信息的抽象化语句的执行计划,确定出所述不存在生产数据库的表信息的抽象化语句的当前消耗;以及结合所述不存在生产数据库的表信息的抽象化语句对应的测试数据库的表信息中的数据增长趋势将预估的近期数据统计信息代入所述不存在生产数据库的表信息的抽象化语句的执行计划,确定出所述不存在 生产数据库的表信息的抽象化语句的预估消耗。For abstract sentences that do not exist in the table information of the production database, the data volume information and/or index information and/or index discrimination degree in the table information of the test database corresponding to the abstract sentences in the table information of the production database are not present The information is converted according to a preset ratio, and the converted data volume information and/or index information and/or index discrimination information are substituted into the execution plan of the abstract statement that does not exist in the table information of the production database, and the non-existence is determined. The current consumption of abstract sentences that exist in the table information of the production database; and the data growth trend in the table information of the test database corresponding to the abstract sentences that do not exist in the table information of the production database is substituted into the estimated recent data statistics information The execution plan of the abstract sentence without the table information of the production database determines the estimated consumption of the abstract sentence without the table information of the production database.
可选的,所述处理单元402具体用于:Optionally, the processing unit 402 is specifically configured to:
根据所述抽象化语句的消耗和消耗阈值,确定出所述抽象化语句的转换后消耗;Determine the post-conversion consumption of the abstract sentence according to the consumption and consumption threshold of the abstract sentence;
根据所述抽象化语句的执行频次和频次阈值,确定出所述抽象化语句的转换后频次;Determine the converted frequency of the abstract sentence according to the execution frequency and the frequency threshold of the abstract sentence;
根据所述抽象化语句的消耗、所述抽象化语句的执行频次、所述抽象化语句的转换后消耗和所述抽象化语句的转换后频次,进行四象限分类,得到所述抽象化语句的性能指标。According to the consumption of the abstract sentence, the execution frequency of the abstract sentence, the post-conversion consumption of the abstract sentence, and the post-conversion frequency of the abstract sentence, a four-quadrant classification is performed to obtain the abstract sentence Performance.
可选的,所述处理单元402还用于:Optionally, the processing unit 402 is further configured to:
在所述确定出所述数据库脚本的性能指标之后,根据所述数据库脚本的性能指标、所述数据库的特征、所述生产数据库的表信息、所述测试数据库的表信息和所述抽象化语句的执行频次,确定出所述数据库脚本的优化方案。After the performance index of the database script is determined, according to the performance index of the database script, the characteristics of the database, the table information of the production database, the table information of the test database, and the abstract sentence The execution frequency of the database script determines the optimization plan of the database script.
基于相同的技术构思,本发明实施例还提供了一种计算设备,包括:Based on the same technical concept, an embodiment of the present invention also provides a computing device, including:
存储器,用于存储程序指令;Memory, used to store program instructions;
处理器,用于调用所述存储器中存储的程序指令,按照获得的程序执行上述数据库脚本性能测试的方法。The processor is configured to call the program instructions stored in the memory, and execute the above-mentioned database script performance test method according to the obtained program.
基于相同的技术构思,本发明实施例还提供了一种计算机可读非易失性存储介质,包括计算机可读指令,当计算机读取并执行所述计算机可读指令时,使得计算机执行上述数据库脚本性能测试的方法。Based on the same technical concept, the embodiments of the present invention also provide a computer-readable non-volatile storage medium, including computer-readable instructions. When the computer reads and executes the computer-readable instructions, the computer executes the above-mentioned database. Script performance testing method.
本发明是参照根据本发明实施例的方法、设备(系统)、和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机程序指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器,使得通过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在流 程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each process and/or block in the flowchart and/or block diagram, and the combination of processes and/or blocks in the flowchart and/or block diagram can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing equipment to generate a machine, so that the instructions executed by the processor of the computer or other programmable data processing equipment can be generated It is a device that realizes the functions specified in one process or multiple processes in the flowchart and/or one block or multiple blocks in the block diagram.
这些计算机程序指令也可存储在能引导计算机或其他可编程数据处理设备以特定方式工作的计算机可读存储器中,使得存储在该计算机可读存储器中的指令产生包括指令装置的制造品,该指令装置实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能。These computer program instructions can also be stored in a computer-readable memory that can guide a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including the instruction device. The device implements the functions specified in one process or multiple processes in the flowchart and/or one block or multiple blocks in the block diagram.
这些计算机程序指令也可装载到计算机或其他可编程数据处理设备上,使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理,从而在计算机或其他可编程设备上执行的指令提供用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的步骤。These computer program instructions can also be loaded on a computer or other programmable data processing equipment, so that a series of operation steps are executed on the computer or other programmable equipment to produce computer-implemented processing, so as to execute on the computer or other programmable equipment. The instructions provide steps for implementing functions specified in a flow or multiple flows in the flowchart and/or a block or multiple blocks in the block diagram.
尽管已描述了本发明的优选实施例,但本领域内的技术人员一旦得知了基本创造性概念,则可对这些实施例作出另外的变更和修改。所以,所附权利要求意欲解释为包括优选实施例以及落入本发明范围的所有变更和修改。Although the preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn the basic creative concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications falling within the scope of the present invention.
显然,本领域的技术人员可以对本发明进行各种改动和变型而不脱离本发明的精神和范围。这样,倘若本发明的这些修改和变型属于本发明权利要求及其等同技术的范围之内,则本发明也意图包含这些改动和变型在内。Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from the spirit and scope of the present invention. In this way, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalent technologies, the present invention is also intended to include these modifications and variations.

Claims (12)

  1. 一种数据库脚本性能测试的方法,其特征在于,包括:A method for testing database script performance, which is characterized in that it includes:
    获取线上生产数据库的表信息、测试数据库的表信息以及所述测试数据库测试过程中的数据库脚本信息;Acquiring table information of the online production database, table information of the test database, and database script information during the test process of the test database;
    对所述数据库脚本信息中的语句进行预处理,得到所述数据库脚本信息对应的抽象化语句;并确定出所述抽象化语句的执行频次;Preprocessing the sentences in the database script information to obtain abstract sentences corresponding to the database script information; and determining the execution frequency of the abstract sentences;
    根据所述生产数据库的表信息和/或所述测试数据库的表信息、所述数据库脚本信息对应的抽象化语句以及所述抽象化语句的执行计划,确定所述抽象化语句的消耗;Determine the consumption of the abstraction sentence according to the table information of the production database and/or the table information of the test database, the abstraction sentence corresponding to the database script information, and the execution plan of the abstraction sentence;
    根据所述抽象化语句的消耗和所述抽象化语句的执行频次,确定出所述数据库脚本的性能指标。According to the consumption of the abstract sentences and the execution frequency of the abstract sentences, the performance index of the database script is determined.
  2. 如权利要求1所述的方法,其特征在于,所述对所述数据库脚本信息中的语句进行预处理,包括:The method according to claim 1, wherein the preprocessing the sentences in the database script information comprises:
    将所述数据库脚本信息的语句中的与实际业务相关的信息删除。The information related to the actual business in the sentence of the database script information is deleted.
  3. 如权利要求1所述的方法,其特征在于,所述抽象化语句的消耗包括当前消耗和预估消耗;所述生产数据库的表信息包括数据量信息、索引信息、索引区分度信息、数据增长趋势;所述测试数据库的表信息包括数据量信息、索引信息、索引区分度信息、数据增长趋势;The method of claim 1, wherein the consumption of the abstract sentence includes current consumption and estimated consumption; the table information of the production database includes data volume information, index information, index discrimination information, and data growth Trend; the table information of the test database includes data volume information, index information, index discrimination information, and data growth trends;
    所述根据所述生产数据库的表信息和/或所述测试数据库的表信息、所述数据库脚本信息对应的抽象化语句以及所述抽象化语句的执行计划,确定所述抽象化语句的消耗,包括:Determining the consumption of the abstracted statement according to the table information of the production database and/or the table information of the test database, the abstracted statement corresponding to the database script information, and the execution plan of the abstracted statement, include:
    对于存在生产数据库的表信息的抽象化语句,将生产数据库的表信息中的数据量信息和/或索引信息和/或索引区分度信息代入所述存在生产数据库的表信息的抽象化语句的执行计划,确定出所述存在生产数据库的表信息的抽象化语句的当前消耗;以及结合生产数据库的表信息中的数据增长趋势将预估的近期数据统计信息代入所述存在生产数据库的表信息的抽象化语句的 执行计划,确定出所述存在生产数据库的表信息的抽象化语句的预估消耗;For abstract sentences that exist in the table information of the production database, substituting the data volume information and/or index information and/or index discrimination information in the table information of the production database into the abstract statements that exist in the table information of the production database. The plan is to determine the current consumption of abstract sentences in the table information of the production database; and combine the data growth trend in the table information of the production database to substitute estimated recent data statistics into the table information of the production database The execution plan of the abstract statement determines the estimated consumption of the abstract statement in the table information of the production database;
    对于不存在生产数据库的表信息的抽象化语句,将所述不存在生产数据库的表信息的抽象化语句对应的测试数据库的表信息中的数据量信息和/或索引信息和/或索引区分度信息按照预设比例进行转换,将转换后的数据量信息和/或索引信息和/或索引区分度信息代入所述不存在生产数据库的表信息的抽象化语句的执行计划,确定出所述不存在生产数据库的表信息的抽象化语句的当前消耗;以及结合所述不存在生产数据库的表信息的抽象化语句对应的测试数据库的表信息中的数据增长趋势将预估的近期数据统计信息代入所述不存在生产数据库的表信息的抽象化语句的执行计划,确定出所述不存在生产数据库的表信息的抽象化语句的预估消耗。For abstract sentences that do not exist in the table information of the production database, the data volume information and/or index information and/or index discrimination degree in the table information of the test database corresponding to the abstract sentences in the table information of the production database are not present The information is converted according to a preset ratio, and the converted data volume information and/or index information and/or index discrimination information are substituted into the execution plan of the abstract statement that does not exist in the table information of the production database, and the non-existence is determined. The current consumption of abstract sentences that exist in the table information of the production database; and the data growth trend in the table information of the test database corresponding to the abstract sentences that do not exist in the table information of the production database is substituted into the estimated recent data statistics information The execution plan of the abstract sentence without the table information of the production database determines the estimated consumption of the abstract sentence without the table information of the production database.
  4. 如权利要求1所述的方法,其特征在于,所述根据所述抽象化语句的消耗和所述抽象化语句的执行频次,确定出所述数据库脚本的性能指标,包括:3. The method of claim 1, wherein the determining the performance index of the database script according to the consumption of the abstraction sentence and the execution frequency of the abstraction sentence comprises:
    根据所述抽象化语句的消耗和消耗阈值,确定出所述抽象化语句的转换后消耗;Determine the post-conversion consumption of the abstract sentence according to the consumption and consumption threshold of the abstract sentence;
    根据所述抽象化语句的执行频次和频次阈值,确定出所述抽象化语句的转换后频次;Determine the converted frequency of the abstract sentence according to the execution frequency and the frequency threshold of the abstract sentence;
    根据所述抽象化语句的消耗、所述抽象化语句的执行频次、所述抽象化语句的转换后消耗和所述抽象化语句的转换后频次,进行四象限分类,得到所述抽象化语句的性能指标。According to the consumption of the abstract sentence, the execution frequency of the abstract sentence, the post-conversion consumption of the abstract sentence, and the post-conversion frequency of the abstract sentence, a four-quadrant classification is performed to obtain the abstract sentence Performance.
  5. 如权利要求1至4任一项所述的方法,其特征在于,在所述确定出所述数据库脚本的性能指标之后,还包括:The method according to any one of claims 1 to 4, wherein after the determining the performance index of the database script, the method further comprises:
    根据所述数据库脚本的性能指标、所述数据库的特征、所述生产数据库的表信息、所述测试数据库的表信息和所述抽象化语句的执行频次,确定出所述数据库脚本的优化方案。According to the performance index of the database script, the characteristics of the database, the table information of the production database, the table information of the test database, and the execution frequency of the abstract sentence, an optimization plan for the database script is determined.
  6. 一种数据库脚本性能测试的装置,其特征在于,包括:A device for testing database script performance, which is characterized in that it comprises:
    获取单元,用于获取线上生产数据库的表信息、测试数据库的表信息以 及所述测试数据库测试过程中的数据库脚本信息;The obtaining unit is used to obtain table information of the online production database, table information of the test database, and database script information in the test database test process;
    处理单元,用于对所述数据库脚本信息中的语句进行预处理,得到所述数据库脚本信息对应的抽象化语句;并确定出所述抽象化语句的执行频次;根据所述生产数据库的表信息和/或所述测试数据库的表信息、所述数据库脚本信息对应的抽象化语句以及所述抽象化语句的执行计划,确定所述抽象化语句的消耗;根据所述抽象化语句的消耗和所述抽象化语句的执行频次,确定出所述数据库脚本的性能指标。The processing unit is used to preprocess the sentences in the database script information to obtain the abstract sentences corresponding to the database script information; and determine the execution frequency of the abstract sentences; according to the table information of the production database And/or the table information of the test database, the abstraction sentence corresponding to the database script information, and the execution plan of the abstraction sentence, determine the consumption of the abstraction sentence; The execution frequency of the abstract statement determines the performance index of the database script.
  7. 如权利要求6所述的装置,其特征在于,所述处理单元具体用于:The device according to claim 6, wherein the processing unit is specifically configured to:
    将所述数据库脚本信息的语句中的与实际业务相关的信息删除。The information related to the actual business in the sentence of the database script information is deleted.
  8. 如权利要求6所述的装置,其特征在于,所述抽象化语句的消耗包括当前消耗和预估消耗;所述生产数据库的表信息包括数据量信息、索引信息、索引区分度信息、数据增长趋势;所述测试数据库的表信息包括数据量信息、索引信息、索引区分度信息、数据增长趋势;7. The device of claim 6, wherein the consumption of the abstract sentence includes current consumption and estimated consumption; the table information of the production database includes data volume information, index information, index discrimination information, and data growth Trend; the table information of the test database includes data volume information, index information, index discrimination information, and data growth trends;
    所述处理单元具体用于:The processing unit is specifically used for:
    对于存在生产数据库的表信息的抽象化语句,将生产数据库的表信息中的数据量信息和/或索引信息和/或索引区分度信息代入所述存在生产数据库的表信息的抽象化语句的执行计划,确定出所述存在生产数据库的表信息的抽象化语句的当前消耗;以及结合生产数据库的表信息中的数据增长趋势将预估的近期数据统计信息代入所述存在生产数据库的表信息的抽象化语句的执行计划,确定出所述存在生产数据库的表信息的抽象化语句的预估消耗;For abstract sentences that exist in the table information of the production database, substituting the data volume information and/or index information and/or index discrimination information in the table information of the production database into the abstract statements that exist in the table information of the production database. The plan is to determine the current consumption of abstract sentences in the table information of the production database; and combine the data growth trend in the table information of the production database to substitute estimated recent data statistics into the table information of the production database The execution plan of the abstraction statement determines the estimated consumption of the abstraction statement that exists in the table information of the production database;
    对于不存在生产数据库的表信息的抽象化语句,将所述不存在生产数据库的表信息的抽象化语句对应的测试数据库的表信息中的数据量信息和/或索引信息和/或索引区分度信息按照预设比例进行转换,将转换后的数据量信息和/或索引信息和/或索引区分度信息代入所述不存在生产数据库的表信息的抽象化语句的执行计划,确定出所述不存在生产数据库的表信息的抽象化语句的当前消耗;以及结合所述不存在生产数据库的表信息的抽象化语句对应的测试数据库的表信息中的数据增长趋势将预估的近期数据统计信息代入所 述不存在生产数据库的表信息的抽象化语句的执行计划,确定出所述不存在生产数据库的表信息的抽象化语句的预估消耗。For abstract sentences that do not exist in the table information of the production database, the data volume information and/or index information and/or index discrimination degree in the table information of the test database corresponding to the abstract sentences in the table information of the production database are not present The information is converted according to a preset ratio, and the converted data volume information and/or index information and/or index discrimination information are substituted into the execution plan of the abstract statement that does not exist in the table information of the production database, and the non-existence is determined. The current consumption of abstract sentences that exist in the table information of the production database; and the data growth trend in the table information of the test database corresponding to the abstract sentences that do not exist in the table information of the production database is substituted into the estimated recent data statistics information The execution plan of the abstract sentence without the table information of the production database determines the estimated consumption of the abstract sentence without the table information of the production database.
  9. 如权利要求6所述的装置,其特征在于,所述处理单元具体用于:The device according to claim 6, wherein the processing unit is specifically configured to:
    根据所述抽象化语句的消耗和消耗阈值,确定出所述抽象化语句的转换后消耗;Determine the post-conversion consumption of the abstract sentence according to the consumption and consumption threshold of the abstract sentence;
    根据所述抽象化语句的执行频次和频次阈值,确定出所述抽象化语句的转换后频次;Determine the converted frequency of the abstract sentence according to the execution frequency and the frequency threshold of the abstract sentence;
    根据所述抽象化语句的消耗、所述抽象化语句的执行频次、所述抽象化语句的转换后消耗和所述抽象化语句的转换后频次,进行四象限分类,得到所述抽象化语句的性能指标。According to the consumption of the abstract sentence, the execution frequency of the abstract sentence, the post-conversion consumption of the abstract sentence, and the post-conversion frequency of the abstract sentence, a four-quadrant classification is performed to obtain the abstract sentence Performance.
  10. 如权利要求6至9任一项所述的装置,其特征在于,所述处理单元还用于:The device according to any one of claims 6 to 9, wherein the processing unit is further configured to:
    在所述确定出所述数据库脚本的性能指标之后,根据所述数据库脚本的性能指标、所述数据库的特征、所述生产数据库的表信息、所述测试数据库的表信息和所述抽象化语句的执行频次,确定出所述数据库脚本的优化方案。After the performance index of the database script is determined, according to the performance index of the database script, the characteristics of the database, the table information of the production database, the table information of the test database, and the abstract sentence The execution frequency of the database script determines the optimization plan of the database script.
  11. 一种计算设备,其特征在于,包括:A computing device, characterized by comprising:
    存储器,用于存储程序指令;Memory, used to store program instructions;
    处理器,用于调用所述存储器中存储的程序指令,按照获得的程序执行权利要求1至5任一项所述的方法。The processor is configured to call the program instructions stored in the memory, and execute the method according to any one of claims 1 to 5 according to the obtained program.
  12. 一种计算机可读非易失性存储介质,其特征在于,包括计算机可读指令,当计算机读取并执行所述计算机可读指令时,使得计算机执行如权利要求1至5任一项所述的方法。A computer-readable non-volatile storage medium, characterized by comprising computer-readable instructions, when the computer reads and executes the computer-readable instructions, the computer is caused to execute any one of claims 1 to 5 Methods.
PCT/CN2020/096905 2019-06-26 2020-06-18 Database script performance testing method and device WO2020259391A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910562575.5 2019-06-26
CN201910562575.5A CN110287114B (en) 2019-06-26 2019-06-26 Method and device for testing performance of database script

Publications (1)

Publication Number Publication Date
WO2020259391A1 true WO2020259391A1 (en) 2020-12-30

Family

ID=68006171

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/096905 WO2020259391A1 (en) 2019-06-26 2020-06-18 Database script performance testing method and device

Country Status (2)

Country Link
CN (1) CN110287114B (en)
WO (1) WO2020259391A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110287114B (en) * 2019-06-26 2021-06-04 深圳前海微众银行股份有限公司 Method and device for testing performance of database script
CN112035513A (en) * 2020-09-02 2020-12-04 中国平安人寿保险股份有限公司 SQL statement performance optimization method, device, terminal and storage medium
CN112181831A (en) * 2020-09-28 2021-01-05 中国平安财产保险股份有限公司 Script performance verification method, device and equipment based on keywords and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080250046A1 (en) * 2005-03-15 2008-10-09 International Business Machines Corporation Analysis of performance data from a relational database system for applications using stored procedures or sql
CN102486748A (en) * 2010-12-02 2012-06-06 金蝶软件(中国)有限公司 Method and device for performance test
CN105760286A (en) * 2016-02-17 2016-07-13 中国工商银行股份有限公司 Application database dynamic property detection method and detection device
CN107844425A (en) * 2017-11-16 2018-03-27 中国银行股份有限公司 A kind of database statement inspection method and device
CN110287114A (en) * 2019-06-26 2019-09-27 深圳前海微众银行股份有限公司 A kind of method and device of database script performance test

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7860899B2 (en) * 2007-03-26 2010-12-28 Oracle International Corporation Automatically determining a database representation for an abstract datatype
US20100306591A1 (en) * 2009-06-01 2010-12-02 Murali Mallela Krishna Method and system for performing testing on a database system
CN103399851B (en) * 2013-06-25 2017-02-08 上海携程商务有限公司 Method and system for analyzing and predicting performance of structured query language (SQL) scrip
CN103390066B (en) * 2013-08-08 2016-02-17 上海新炬网络信息技术有限公司 A kind of database overall automation optimizes prior-warning device and disposal route thereof
CN103617273B (en) * 2013-12-05 2016-09-14 用友网络科技股份有限公司 The method and system of SQL scripting object
CN103761080B (en) * 2013-12-25 2017-02-15 中国农业大学 Structured query language (SQL) based MapReduce operation generating method and system
CN108984374B (en) * 2017-06-01 2021-05-07 腾讯科技(深圳)有限公司 Method and system for testing database performance
CN108170775A (en) * 2017-12-26 2018-06-15 上海新炬网络技术有限公司 A kind of database SQL indexes dynamic optimization method
CN108388626A (en) * 2018-02-12 2018-08-10 平安科技(深圳)有限公司 SQL automatic optimization methods, device, computer equipment and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080250046A1 (en) * 2005-03-15 2008-10-09 International Business Machines Corporation Analysis of performance data from a relational database system for applications using stored procedures or sql
CN102486748A (en) * 2010-12-02 2012-06-06 金蝶软件(中国)有限公司 Method and device for performance test
CN105760286A (en) * 2016-02-17 2016-07-13 中国工商银行股份有限公司 Application database dynamic property detection method and detection device
CN107844425A (en) * 2017-11-16 2018-03-27 中国银行股份有限公司 A kind of database statement inspection method and device
CN110287114A (en) * 2019-06-26 2019-09-27 深圳前海微众银行股份有限公司 A kind of method and device of database script performance test

Also Published As

Publication number Publication date
CN110287114A (en) 2019-09-27
CN110287114B (en) 2021-06-04

Similar Documents

Publication Publication Date Title
WO2020259391A1 (en) Database script performance testing method and device
CN107633265B (en) Data processing method and device for optimizing credit evaluation model
CN106845717B (en) Energy efficiency evaluation method based on multi-model fusion strategy
CN111240662A (en) Spark machine learning system and learning method based on task visual dragging
CN105373606A (en) Unbalanced data sampling method in improved C4.5 decision tree algorithm
CN108470022A (en) A kind of intelligent work order quality detecting method based on operation management
CN112417176B (en) Method, equipment and medium for mining implicit association relation between enterprises based on graph characteristics
WO2021128523A1 (en) Technology readiness level determination method and system based on science and technology big data
Zhang et al. Fast fine-grained air quality index level prediction using random forest algorithm on cluster computing of spark
CN109376247A (en) A kind of software defect automatic classification method based on correlation rule
CN110225055A (en) A kind of network flow abnormal detecting method and system based on KNN semi-supervised learning model
CN103593470A (en) Double-degree integrated unbalanced data stream classification algorithm
CN109255029A (en) A method of automatic Bug report distribution is enhanced using weighted optimization training set
CN110737805A (en) Method and device for processing graph model data and terminal equipment
CN104537383A (en) Massive organizational structure data classification method and system based on particle swarm
CN112785156B (en) Industrial collar and sleeve identification method based on clustering and comprehensive evaluation
CN109840536A (en) A kind of power grid power supply reliability horizontal clustering method and system
WO2023082788A1 (en) Method and apparatus for predicting oxygen content in flue gas and load, method and apparatus for selecting prediction model, and method and apparatus for predicting flue gas emission
WO2022143431A1 (en) Method and apparatus for training anti-money laundering model
CN116185684A (en) Construction method and application of server fault root cause analysis model
CN111861056A (en) Evaluation method for technical innovation level
CN115470279A (en) Data source conversion method, device, equipment and medium based on enterprise data
Liu et al. Application of master data classification model in enterprises
Zou Research on data cleaning in big data environment
CN112948469A (en) Data mining method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20830942

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20830942

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 060422)

122 Ep: pct application non-entry in european phase

Ref document number: 20830942

Country of ref document: EP

Kind code of ref document: A1