CA3130988A1 - Method and device for identifying repetitive association calculation and computer system - Google Patents

Method and device for identifying repetitive association calculation and computer system

Info

Publication number
CA3130988A1
Authority
CA
Canada
Prior art keywords
correlation
sql statement
query
datasheet
identifying
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CA3130988A
Other languages
French (fr)
Inventor
Qingyan Ding
Wei Xu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
10353744 Canada Ltd
Original Assignee
10353744 Canada Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 10353744 Canada Ltd
Publication of CA3130988A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application discloses a method and a device for identifying repeated correlation calculation, and a computer system. The method comprises obtaining a first SQL statement and a second SQL statement to be identified; analyzing the first SQL statement, and identifying a first correlation query included in the first SQL statement, wherein the correlation query includes correlation calculation between datasheets required to be performed for executing the SQL statement; analyzing the second SQL statement, and identifying a second correlation query included in the second SQL statement; and determining that there is repeated correlation calculation existent in the first SQL statement and the second SQL statement when there is repeated correlation calculation existent in the first correlation query and the second correlation query.

Description

METHOD AND DEVICE FOR IDENTIFYING REPETITIVE ASSOCIATION CALCULATION AND COMPUTER SYSTEM
BACKGROUND OF THE INVENTION
Technical Field
[0001] The present invention relates to the field of data processing technology, and more particularly to a method of identifying repeated correlation calculation, a device for identifying repeated correlation calculation, and a computer system.
Description of Related Art
[0002] In data processing scenarios such as big data offline tasks, large numbers of SQL statements must be processed. While these SQL statements are executed, the same correlation calculation between two datasheets is often performed repeatedly. Such repeated correlation calculations waste computation resources and storage resources, severely reduce the operating efficiency of data platforms, and increase their operating costs. Accordingly, there is an urgent need for a method of identifying repeated correlation calculations included in plural SQL statements.
SUMMARY OF THE INVENTION
[0003] In order to address the deficiencies in the state of the art, a major objective of the present invention is to provide a method of identifying repeated correlation calculation, a device for identifying repeated correlation calculation, and a computer system, so as to identify repeated correlation calculations included in SQL statements.
[0004] In order to achieve the above objective, according to the first aspect of the present invention, there is provided a method of identifying repeated correlation calculation, and the method comprises:

Date Recue/Date Received 2021-11-16
[0005] obtaining a first SQL statement and a second SQL statement to be identified;
[0006] analyzing the first SQL statement, and identifying a first correlation query included in the first SQL statement, wherein the correlation query includes correlation calculation between datasheets required to be performed for executing the SQL statement;
[0007] analyzing the second SQL statement, and identifying a second correlation query included in the second SQL statement; and
[0008] determining that there is repeated correlation calculation existent in the first SQL statement and the second SQL statement when there is repeated correlation calculation existent in the first correlation query and the second correlation query.
[0009] In some embodiments, the correlation calculation includes corresponding datasheets and a correlation relation keyword, wherein the correlation relation keyword is employed to describe correlation calculation required to be performed between datasheets, and the step of analyzing the first SQL statement, and identifying a first correlation query included in the first SQL statement includes:
[0010] analyzing the first SQL statement, and identifying a first correlation relation keyword included in the first SQL statement, and a first datasheet and a second datasheet to which the first correlation relation keyword corresponds; and
[0011] determining first correlation calculation included in the first correlation query according to the first datasheet, the second datasheet and the first correlation relation keyword.
[0012] In some embodiments, the step of analyzing the first SQL statement, and identifying a first correlation query included in the first SQL statement includes:
[0013] analyzing the first SQL statement, and generating json data to which the first SQL statement corresponds; and
[0014] identifying the first correlation query included in the first SQL statement according to the json data.
[0015] In some embodiments, the second SQL statement includes a sub-query and a correlation query between the sub-query and a third datasheet, and the step of analyzing the second SQL statement, and identifying a second correlation query included in the second SQL statement includes:
[0016] analyzing the second SQL statement, and identifying a second correlation relation keyword included in the sub-query, and a fourth datasheet and a fifth datasheet to which the second correlation relation keyword corresponds;
[0017] determining second correlation calculation included in the second correlation query according to the second correlation relation keyword, and the fourth datasheet and the fifth datasheet to which the second correlation relation keyword corresponds;
[0018] identifying a third correlation relation keyword included in the second SQL statement, and the third datasheet and the sub-query to which the third correlation relation keyword corresponds;
[0019] determining third correlation calculation included in the second correlation query according to the third correlation relation keyword, the third datasheet and the fourth datasheet; and
[0020] determining fourth correlation calculation included in the second correlation query according to the third correlation relation keyword, the third datasheet and the fifth datasheet.
[0021] In some embodiments, the first SQL statement and the second SQL statement include corresponding datasheets to be processed, and the method comprises:
[0022] replacing, when the corresponding datasheets to be processed are temporary sheets, the datasheets to be processed with corresponding entity sheets.
[0023] In some embodiments, the step of determining that there is repeated correlation calculation existent in the first SQL statement and the second SQL statement when there is repeated correlation calculation existent in the first correlation query and the second correlation query includes:
[0024] grouping the correlation calculations according to the correlation relation keywords; and
[0025] determining that there is repeated correlation calculation existent in the first SQL statement and the second SQL statement when any group includes identical correlation calculation of corresponding datasheets.
[0026] According to the second aspect of the present invention, the present application provides a device for identifying repeated correlation calculation, and the device comprises:
[0027] an obtaining module, for obtaining a first SQL statement and a second SQL statement to be identified;
[0028] an analyzing module, for analyzing the first SQL statement, and identifying a first correlation query included in the first SQL statement, wherein the correlation query includes correlation calculation between datasheets required to be performed for executing the SQL statement; analyzing the second SQL statement, and identifying a second correlation query included in the second SQL statement; and
[0029] a processing module, for determining that there is repeated correlation calculation existent in the first SQL statement and the second SQL statement when there is repeated correlation calculation existent in the first correlation query and the second correlation query.
[0030] In some embodiments, the analyzing module can be further employed for analyzing the first SQL statement, and identifying a first correlation relation keyword included in the first SQL statement, and a first datasheet and a second datasheet to which the first correlation relation keyword corresponds; and determining first correlation calculation included in the first correlation query according to the first datasheet, the second datasheet and the first correlation relation keyword.
[0031] In some embodiments, the second SQL statement includes a sub-query and a correlation query between the sub-query and a third datasheet, and the analyzing module can be further employed for analyzing the second SQL statement, and identifying a second correlation relation keyword included in the sub-query, and a fourth datasheet and a fifth datasheet to which the second correlation relation keyword corresponds; determining second correlation calculation included in the second correlation query according to the second correlation relation keyword, and the fourth datasheet and the fifth datasheet to which the second correlation relation keyword corresponds; identifying a third correlation relation keyword included in the analyzed second SQL statement, wherein the third correlation relation keyword is employed to describe correlation query of the sub-query and the third datasheet; determining third correlation calculation included in the second correlation query according to the third correlation relation keyword, the third datasheet and the fourth datasheet; and determining fourth correlation calculation included in the second correlation query according to the third correlation relation keyword, the third datasheet and the fifth datasheet.
[0032] According to the third aspect of the present invention, the present application provides a computer system, and the system comprises:
[0033] one or more processor(s);
[0034] a memory, associated with the one or more processor(s), wherein the memory is employed to store a program instruction, and the program instruction performs the following operations when it is read and executed by the one or more processor(s):
[0035] obtaining a first SQL statement and a second SQL statement to be identified;
[0036] analyzing the first SQL statement, and identifying a first correlation query included in the first SQL statement, wherein the correlation query includes correlation calculation between datasheets required to be performed for executing the SQL statement;
[0037] analyzing the second SQL statement, and identifying a second correlation query included in the second SQL statement; and
[0038] determining that there is repeated correlation calculation existent in the first SQL statement and the second SQL statement when there is repeated correlation calculation existent in the first correlation query and the second correlation query.
[0039] The present invention achieves the following advantageous effects.
[0040] The present application provides a method of identifying repeated correlation calculation, and the method comprises obtaining a first SQL statement and a second SQL statement to be identified; analyzing the first SQL statement, and identifying a first correlation query included in the first SQL statement, wherein the correlation query includes correlation calculation between datasheets required to be performed for executing the SQL statement; analyzing the second SQL statement, and identifying a second correlation query included in the second SQL statement; and determining that there is repeated correlation calculation existent in the first SQL statement and the second SQL statement when there is repeated correlation calculation existent in the first correlation query and the second correlation query. The method is applicable to identifying whether plural SQL statements contain repeated correlation calculations, so that the SQL statements that contain repeated correlation calculations can subsequently be optimized and readjusted, further enhancing the operating efficiency of the data platform.
BRIEF DESCRIPTION OF THE DRAWINGS
[0041] To describe the technical solutions in the embodiments of the present invention more clearly, the drawings required to illustrate the embodiments are briefly introduced below. The drawings introduced below are directed to only some embodiments of the present invention, and persons ordinarily skilled in the art may derive other drawings from them without creative effort.
[0042] Fig. 1 is a flowchart illustrating identification of repeated correlation calculation provided by an embodiment of the present application;
[0043] Fig. 2 is a flowchart illustrating identification of repeated correlation calculation of a task provided by an embodiment of the present application;

[0044] Fig. 3 is a flowchart illustrating a method provided by an embodiment of the present application;
[0045] Fig. 4 is a view illustrating the structure of a device provided by an embodiment of the present application; and
[0046] Fig. 5 is a view illustrating the structure of a computer system provided by an embodiment of the present application.
DETAILED DESCRIPTION OF THE INVENTION
[0047] To make the objectives, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and comprehensively below with reference to the accompanying drawings. The embodiments described are only some, rather than all, of the embodiments of the present invention. Any other embodiments that persons ordinarily skilled in the art can make on the basis of these embodiments without creative effort shall fall within the protection scope of the present invention.
[0048] To solve the problems noted in the Description of Related Art, the present application provides a method of identifying repeated correlation calculation. As shown in Figs. 1 and 2, identification of repeated correlation calculation employing this method includes the following steps.
[0049] Step 1 - obtaining an SQL statement to be analyzed.
[0050] A udf function can be used to extract an original SQL query task from a Hive task or a sparkSQL task, and the SQL statement to be analyzed is subsequently extracted from the original task.
[0051] Step 2 - analyzing the SQL statement, and respectively generating corresponding Json data according to a correlation query obtained by analysis.
[0052] The udf function can be developed by using the antlr technique for analyzing SQL statements.
[0053] On the basis of the input SQL statement, the udf function can analyze and obtain the correlation relation keywords included in the SQL statement, the datasheets to be queried to which the correlation relation keywords correspond, the database in which those datasheets are located, and the correlation conditions to which the correlation relation keywords correspond, and can generate the corresponding Json data from all the data obtained by analysis.
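As a minimal sketch of the kind of Json data paragraph [0053] describes, the record below packages one parsed correlation relation. The patent does not give a schema, so the function name `build_join_record` and every field name are illustrative assumptions, as is the sample correlation condition.

```python
import json


def build_join_record(keyword, tables, database, condition):
    """Package one parsed correlation relation as a JSON string.

    All field names are hypothetical; the patent does not specify a schema.
    """
    record = {
        "keyword": keyword.upper(),  # correlation relation keyword, e.g. "LEFT JOIN"
        "tables": list(tables),      # datasheets the keyword relates
        "database": database,        # database in which the datasheets are located
        "condition": condition,      # correlation condition, if any
    }
    # sort_keys yields a stable serialization, so equal records produce equal text
    return json.dumps(record, sort_keys=True)


# Hypothetical input values for illustration only
record_json = build_join_record("left join", ["B", "C"], "Ti", "B.id = C.id")
```

A stable serialization is convenient here because later steps compare records for equality.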
[0054] When the datasheets to be queried include temporary sheets, the temporary sheets can be replaced with the corresponding entity sheets.
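The replacement of temporary sheets can be sketched as a simple lookup, assuming a mapping from each temporary sheet to its entity sheet is already available. How that mapping is obtained is not described in the text; the table names and the helper `resolve_tables` are hypothetical.

```python
def resolve_tables(tables, temp_to_entity):
    """Replace temporary sheet names with their entity sheets; leave others unchanged."""
    return [temp_to_entity.get(name, name) for name in tables]


# Hypothetical mapping from a temporary sheet to the entity sheet behind it
temp_to_entity = {"tmp_orders": "dw.orders"}
resolved = resolve_tables(["tmp_orders", "dw.users"], temp_to_entity)
```

Resolving names before comparison means two statements that reach the same entity sheet through different temporary sheets can still be matched.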
[0055] A correlation relation keyword can include Join and Union. Join includes JOIN, LEFT JOIN, INNER JOIN, RIGHT JOIN, CROSS JOIN, FULL JOIN and NOT JOIN. To ensure precise identification, analyzing rules can be preset for the various types of SQL statements, namely SQL statements containing JOIN, SQL statements containing UNION, and SQL statements simultaneously containing JOIN and UNION or sub-queries. When an SQL statement is found by analysis to contain a correlation relation keyword, the SQL statement is further analyzed according to the corresponding analyzing rule, so as to identify such data as the datasheet to be queried to which the correlation relation keyword corresponds, the database in which that datasheet is located, and the correlation condition to which the correlation relation keyword corresponds.
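The choice of analyzing rule can be illustrated by a rough classifier that sorts a statement into the JOIN, UNION, or combined type. This is only a sketch: it uses regular expressions rather than the antlr-based analysis the text mentions, so keywords inside string literals or comments would be misread, and the category labels are assumptions.

```python
import re

# Matches JOIN with an optional leading modifier; keyword list is a simplification
JOIN_RE = re.compile(r"\b(?:(?:LEFT|RIGHT|INNER|CROSS|FULL)\s+)?JOIN\b", re.IGNORECASE)
UNION_RE = re.compile(r"\bUNION\b", re.IGNORECASE)


def classify(sql):
    """Return a hypothetical rule label for the statement type."""
    has_join = bool(JOIN_RE.search(sql))
    has_union = bool(UNION_RE.search(sql))
    if has_join and has_union:
        return "join_and_union"
    if has_join:
        return "join"
    if has_union:
        return "union"
    return "none"


kind = classify("SELECT * FROM a LEFT JOIN b ON a.id = b.id")
```

A real implementation would decide from the parse tree, which also makes sub-queries visible.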

[0056] Step 3 - determining a correlation query included in the SQL statement according to the Json data to which the SQL statement corresponds.
[0057] To identify the correlation query, the one or more correlation calculation(s) included in the correlation query should be identified. The correlation relation keyword and the corresponding datasheets included in the SQL statement can be determined according to the data contained in the Json data, and a correlation relation keyword together with its corresponding datasheets constitutes one correlation calculation.
[0058] For instance, suppose the Json data show that a first SQL statement includes a correlation query of table A and a sub-query of database Ti, and that the sub-query includes a correlation query of table B and table C. The correlation relation keyword of table B and table C is LEFT JOIN; the correlation relation keyword of table A and the sub-query is Union, and the correlation condition is a first correlation condition. It can then be determined that the first correlation relation keyword included in this SQL statement is JOIN, the corresponding datasheets to be queried are table B and table C, and the database in which these datasheets are located is Ti; the second correlation relation keyword is Union, the corresponding datasheets to be queried are table A and table B, and the correlation condition is the first correlation condition; the third correlation relation keyword is Union, the corresponding datasheets to be queried are table A and table C, and the correlation condition is the first correlation condition. The first SQL statement therefore includes three correlation calculations: the Union correlation calculation of table A and table B, the Union correlation calculation of table A and table C, and the JOIN correlation calculation of table B and table C.
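The expansion in paragraph [0058] can be sketched as follows: a correlation between a table and a sub-query is broken into one calculation per table inside the sub-query, plus the sub-query's own calculation. Collapsing LEFT JOIN and similar variants to JOIN mirrors the example; treating table pairs as unordered sets is an added assumption, and the function names are hypothetical.

```python
def normalize(keyword):
    """Collapse JOIN variants (LEFT JOIN, RIGHT JOIN, ...) to 'JOIN', as the example does."""
    kw = keyword.upper()
    return "JOIN" if kw.endswith("JOIN") else kw


def expand(outer_keyword, outer_table, sub_keyword, sub_tables):
    """List the pairwise correlation calculations for a table correlated with a sub-query."""
    # The sub-query's own calculation comes first
    calcs = [(normalize(sub_keyword), frozenset(sub_tables))]
    # Then one calculation between the outer table and each table of the sub-query
    for table in sub_tables:
        calcs.append((normalize(outer_keyword), frozenset({outer_table, table})))
    return calcs


first_calcs = expand("UNION", "A", "LEFT JOIN", ["B", "C"])
```

Each tuple pairs a keyword with a set of tables, which is the unit that the later grouping step compares.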
[0059] Likewise, if the Json data show that a second SQL statement includes a correlation query of table B and table C of database Ti, and the correlation relation keyword is RIGHT JOIN, then the correlation calculation included in the second SQL statement is the JOIN correlation calculation of table B and table C.
[0060] Step 4 - grouping the json data according to the included correlation relation keyword, and counting, within each group, whether json data with identical datasheets to be processed occur and how many times they occur.
[0061] Step 5 - judging, when the number of occurrences is not less than a preset threshold, that the json data with identical datasheets to be processed contain repeated correlation calculation.
[0062] Preferably, when the number of occurrences is not less than 1, it is judged that the json data with identical datasheets to be processed contain repeated correlation calculation.
[0063] Since the first SQL statement and the second SQL statement both include the JOIN correlation calculation of table B and table C, it can be judged that there is repeated correlation calculation existent in the first SQL statement and the second SQL statement.
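Steps 4 and 5 can be sketched as a grouping pass followed by a threshold check, applied here to the two example statements. In this sketch the count is the number of distinct statements that contain a calculation, so a threshold of 2 means "appears in at least two statements"; the patent's own counting convention and threshold may differ, and the function name and statement labels are assumptions.

```python
from collections import defaultdict


def find_repeats(calcs_by_statement, threshold=2):
    """Group calculations by keyword and report table sets seen in >= threshold statements."""
    # keyword -> table set -> labels of the statements containing that calculation
    groups = defaultdict(lambda: defaultdict(set))
    for stmt_id, calcs in calcs_by_statement.items():
        for keyword, tables in calcs:
            groups[keyword][frozenset(tables)].add(stmt_id)
    repeats = {}
    for keyword, by_tables in groups.items():
        for tables, stmt_ids in by_tables.items():
            if len(stmt_ids) >= threshold:
                repeats[(keyword, tables)] = sorted(stmt_ids)
    return repeats


# Correlation calculations of the two example statements
calcs_by_statement = {
    "first": [("JOIN", {"B", "C"}), ("UNION", {"A", "B"}), ("UNION", {"A", "C"})],
    "second": [("JOIN", {"B", "C"})],
}
repeats = find_repeats(calcs_by_statement)
```

Grouping by keyword first keeps a UNION of two tables from being confused with a JOIN of the same two tables.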
[0064] A hive table can be generated from the identified repeated correlation calculations and the corresponding SQL statements and submitted to technical personnel for reference, so that they can optimize and readjust the SQL statements and the process of querying with them, thereby enhancing the operating efficiency of the system.
Embodiment 2
[0065] Corresponding to the aforementioned embodiment, the present application proposes a method of identifying repeated correlation calculation. As shown in Fig. 3, the method comprises:
[0066] 310 - obtaining a first SQL statement and a second SQL statement to be identified;
[0067] preferably, the first SQL statement and the second SQL statement include corresponding datasheets to be processed;
[0068] 311 - replacing, when the corresponding datasheets to be processed are temporary sheets, the datasheets to be processed with corresponding entity sheets.
[0069] 320 - analyzing the first SQL statement, and identifying a first correlation query included in the first SQL statement, wherein the correlation query includes correlation calculation between datasheets required to be performed for executing the SQL statement;
[0070] preferably, the correlation calculation includes corresponding datasheets and a correlation relation keyword, wherein the correlation relation keyword is employed to describe correlation calculation required to be performed between datasheets, and the step of analyzing the first SQL statement, and identifying a first correlation query included in the first SQL statement includes:
[0071] 321 - analyzing the first SQL statement, and identifying a first correlation relation keyword included in the first SQL statement, and a first datasheet and a second datasheet to which the first correlation relation keyword corresponds;
[0072] 322 - determining first correlation calculation included in the first correlation query according to the first datasheet, the second datasheet and the first correlation relation keyword.
[0073] Preferably, the step of analyzing the first SQL statement, and identifying a first correlation query included in the first SQL statement includes:
[0074] 323 - analyzing the first SQL statement, and generating json data to which the first SQL statement corresponds;
[0075] 324 - identifying the first correlation query included in the first SQL statement according to the json data.
[0076] 330 - analyzing the second SQL statement, and identifying a second correlation query included in the second SQL statement;
[0077] preferably, the second SQL statement includes a sub-query and a correlation query between the sub-query and a third datasheet, and the step of analyzing the second SQL statement, and identifying a second correlation query included in the second SQL statement includes:
[0078] 331 - analyzing the second SQL statement, and identifying a second correlation relation keyword included in the sub-query, and a fourth datasheet and a fifth datasheet to which the second correlation relation keyword corresponds;
[0079] 332 - determining second correlation calculation included in the second correlation query according to the second correlation relation keyword, and the fourth datasheet and the fifth datasheet to which the second correlation relation keyword corresponds;
[0080] 333 - identifying a third correlation relation keyword included in the second SQL statement, and the third datasheet and the sub-query to which the third correlation relation keyword corresponds;
[0081] 334 - determining third correlation calculation included in the second correlation query according to the third correlation relation keyword, the third datasheet and the fourth datasheet;
[0082] 335 - determining fourth correlation calculation included in the second correlation query according to the third correlation relation keyword, the third datasheet and the fifth datasheet.
[0083] 340 - determining that there is repeated correlation calculation existent in the first SQL statement and the second SQL statement when there is repeated correlation calculation existent in the first correlation query and the second correlation query.
[0084] Preferably, the step of determining that there is repeated correlation calculation existent in the first SQL statement and the second SQL statement when there is repeated correlation calculation existent in the first correlation query and the second correlation query includes:
[0085] 341 - grouping the correlation calculations according to the correlation relation keywords;
[0086] 342 - determining that there is repeated correlation calculation existent in the first SQL statement and the second SQL statement when any group includes identical correlation calculation of corresponding datasheets.

Embodiment 3
[0087] Corresponding to the above method, the present application proposes a device for identifying repeated correlation calculation. As shown in Fig. 4, the device comprises:
[0088] an obtaining module 410, for obtaining a first SQL statement and a second SQL statement to be identified;
[0089] an analyzing module 420, for analyzing the first SQL statement, and identifying a first correlation query included in the first SQL statement, wherein the correlation query includes correlation calculation between datasheets required to be performed for executing the SQL statement; analyzing the second SQL statement, and identifying a second correlation query included in the second SQL statement; and
[0090] a processing module 430, for determining that there is repeated correlation calculation existent in the first SQL statement and the second SQL statement when there is repeated correlation calculation existent in the first correlation query and the second correlation query.
[0091] Preferably, the analyzing module 420 can be further employed for analyzing the first SQL statement, and identifying a first correlation relation keyword included in the first SQL statement, and a first datasheet and a second datasheet to which the first correlation relation keyword corresponds; and determining first correlation calculation included in the first correlation query according to the first datasheet, the second datasheet and the first correlation relation keyword.
[0092] Preferably, the second SQL statement includes a sub-query and a correlation query between the sub-query and a third datasheet, and the analyzing module 420 can be further employed for analyzing the second SQL statement, and identifying a second correlation relation keyword included in the sub-query, and a fourth datasheet and a fifth datasheet to which the second correlation relation keyword corresponds; determining second correlation calculation included in the second correlation query according to the second correlation relation keyword, and the fourth datasheet and the fifth datasheet to which the second correlation relation keyword corresponds; identifying a third correlation relation keyword included in the analyzed second SQL statement, wherein the third correlation relation keyword is employed to describe correlation query of the sub-query and the third datasheet; determining third correlation calculation included in the second correlation query according to the third correlation relation keyword, the third datasheet and the fourth datasheet; and determining fourth correlation calculation included in the second correlation query according to the third correlation relation keyword, the third datasheet and the fifth datasheet.
[0093] Preferably, the analyzing module 420 can be further employed for analyzing the first SQL statement, and generating json data to which the first SQL statement corresponds; and identifying the first correlation query included in the first SQL statement according to the json data.
[0094] Preferably, the first SQL statement and the second SQL statement include corresponding datasheets to be processed, and the obtaining module 410 can be further employed for replacing, when the corresponding datasheets to be processed are temporary sheets, the datasheets to be processed with corresponding entity sheets.
[0095] Preferably, the processing module 430 can be further employed for grouping the correlation calculations according to the correlation relation keywords; and determining that there is repeated correlation calculation existent in the first SQL statement and the second SQL statement when any group includes identical correlation calculation of corresponding datasheets.
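The division of labor among modules 410, 420 and 430 might be sketched as three small classes. This is an illustrative decomposition under stated assumptions, not the device's actual design: the class and method names are hypothetical, and the analyzing module takes a stand-in parser (a dictionary lookup) in place of real SQL analysis.

```python
class ObtainingModule:
    """Holds the SQL statements to be identified (module 410 in the text)."""

    def __init__(self, statements):
        self._statements = dict(statements)

    def obtain(self):
        return self._statements


class AnalyzingModule:
    """Turns a statement into its set of correlation calculations (module 420)."""

    def __init__(self, parse):
        # parse: callable mapping SQL text to (keyword, frozenset of tables) pairs
        self._parse = parse

    def analyze(self, sql):
        return set(self._parse(sql))


class ProcessingModule:
    """Decides whether two statements share a correlation calculation (module 430)."""

    def has_repeat(self, first_calcs, second_calcs):
        return bool(first_calcs & second_calcs)


# Stand-in parser results keyed by placeholder statement text
parsed = {
    "sql_1": {("JOIN", frozenset({"B", "C"})), ("UNION", frozenset({"A", "B"}))},
    "sql_2": {("JOIN", frozenset({"B", "C"}))},
}
obtaining = ObtainingModule({"first": "sql_1", "second": "sql_2"})
analyzing = AnalyzingModule(parsed.__getitem__)
processing = ProcessingModule()
statements = obtaining.obtain()
result = processing.has_repeat(
    analyzing.analyze(statements["first"]),
    analyzing.analyze(statements["second"]),
)
```

Passing the parser in as a callable keeps the comparison logic independent of how the SQL is actually analyzed.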
Embodiment 4
[0096] Corresponding to the aforementioned method, device and system, embodiment 4 of the present application provides a computer system that comprises one or more processor(s); and a memory, associated with the one or more processor(s), wherein the memory is employed to store a program instruction, and the program instruction performs the following operations when it is read and executed by the one or more processor(s):
[0097] obtaining a first SQL statement and a second SQL statement to be identified;
[0098] analyzing the first SQL statement, and identifying a first correlation query included in the first SQL statement, wherein the correlation query includes correlation calculation between datasheets required to be performed for executing the SQL statement;
[0099] analyzing the second SQL statement, and identifying a second correlation query included in the second SQL statement; and
[0100] determining that there is repeated correlation calculation existent in the first SQL statement and the second SQL statement when there is repeated correlation calculation existent in the first correlation query and the second correlation query.
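The four operations above can be illustrated end to end with a deliberately simplified Python sketch. It is not the analysis performed by the present application: it uses regular expressions rather than a real SQL parser, ignores sub-queries, pairs every joined datasheet with the first datasheet after FROM, and lowercases datasheet names, all of which are assumptions made for brevity. It relies on two standard SQL facts: a bare JOIN is an INNER JOIN, and the OUTER keyword is optional.

```python
import re
from typing import Set, Tuple

Correlation = Tuple[str, str, str]  # (keyword, left datasheet, right datasheet)

# Simplified patterns; a production system would use a real SQL parser.
FROM_RE = re.compile(r"\bFROM\s+(\w+)", re.IGNORECASE)
JOIN_RE = re.compile(
    r"\b(?:(LEFT|RIGHT|FULL|INNER|CROSS)\s+(?:OUTER\s+)?)?JOIN\s+(\w+)",
    re.IGNORECASE,
)


def correlation_calcs(sql: str) -> Set[Correlation]:
    """Extract the correlation calculations of a flat SQL statement."""
    base = FROM_RE.search(sql)
    if base is None:
        return set()
    left = base.group(1).lower()
    calcs = set()
    for match in JOIN_RE.finditer(sql):
        # Normalize: bare JOIN means INNER JOIN; OUTER is dropped.
        keyword = (match.group(1) or "INNER").upper() + " JOIN"
        calcs.add((keyword, left, match.group(2).lower()))
    return calcs


def has_repeated_correlation(first_sql: str, second_sql: str) -> bool:
    """True when both statements contain an identical correlation calculation."""
    return bool(correlation_calcs(first_sql) & correlation_calcs(second_sql))


if __name__ == "__main__":
    q1 = "SELECT * FROM orders o LEFT JOIN customers c ON o.cid = c.id"
    q2 = "SELECT o.id FROM orders o LEFT OUTER JOIN customers c ON o.cid = c.id"
    print(has_repeated_correlation(q1, q2))
```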
[0101] Fig. 5 exemplarily illustrates the framework of the computer system that can specifically include a processor 1510, a video display adapter 1511, a magnetic disk driver 1512, an input/output interface 1513, a network interface 1514, and a memory 1520. The processor 1510, the video display adapter 1511, the magnetic disk driver 1512, the input/output interface 1513, the network interface 1514, and the memory 1520 can be communicably connected with one another via a communication bus 1530.
[0102] The processor 1510 can be embodied in such a way as a general CPU (Central Processing Unit), a microprocessor, an ASIC (Application Specific Integrated Circuit), or one or more integrated circuit(s) for executing relevant program(s) to realize the technical solutions provided by the present application.
[0103] The memory 1520 can be embodied in such a form as a ROM (Read Only Memory), a RAM (Random Access Memory), a static storage device, or a dynamic storage device. The memory 1520 can store an operating system 1521 for controlling the running of a computer system 1500, and a basic input/output system (BIOS) for controlling lower-level operations of the computer system 1500. In addition, the memory 1520 can also store a web browser 1523, a data storage management system 1524, and an icon font processing system 1525, etc. The icon font processing system 1525 can be an application program that specifically realizes the aforementioned various step operations in the embodiments of the present application. In summary, when the technical solutions provided by the present application are to be realized via software or firmware, the relevant program codes are stored in the memory 1520, and invoked and executed by the processor 1510.
[0104] The input/output interface 1513 is employed to connect with an input/output module to realize input and output of information. The input/output module can be equipped in the device as a component part (not shown in the drawings), and can also be externally connected with the device to provide corresponding functions. The input means can include a keyboard, a mouse, a touch screen, a microphone, and various sensors etc., and the output means can include a display, a loudspeaker, a vibrator, an indicator light etc.
[0105] The network interface 1514 is employed to connect to a communication module (not shown in the drawings) to realize intercommunication between the current device and other devices. The communication module can realize communication in a wired mode (via USB, network cable, for example) or in a wireless mode (via mobile network, WIFI, Bluetooth, etc.).
[0106] The bus 1530 includes a pathway for transmitting information between various component parts of the device (such as the processor 1510, the video display adapter 1511, the magnetic disk driver 1512, the input/output interface 1513, the network interface 1514, and the memory 1520).
[0107] Additionally, the computer system 1500 may further obtain information of specific collection conditions from a virtual resource object collection condition information database 1541 for judgment on conditions, and so on.
[0108] As should be noted, although merely the processor 1510, the video display adapter 1511, the magnetic disk driver 1512, the input/output interface 1513, the network interface 1514, the memory 1520, and the bus 1530 are illustrated for the aforementioned device, the device may further include other component parts prerequisite for realizing normal running during specific implementation. In addition, as can be understood by persons skilled in the art, the aforementioned device may as well only include component parts necessary for realizing the solutions of the present application, without including the entire component parts as illustrated.
[0109] As can be known from the description of the aforementioned embodiments, persons skilled in the art can clearly understand that the present application can be realized through software plus a required general hardware platform. Based on such understanding, the technical solutions of the present application, or the contributions made thereby over the state of the art, can be essentially embodied in the form of a software product, and such a computer software product can be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk etc., and includes plural instructions enabling a computer device (such as a personal computer, a cloud server, or a network device etc.) to execute the methods described in various embodiments or some sections of the embodiments of the present application.
[0110] The various embodiments are described progressively in the Description; identical or similar sections among the various embodiments can be inferred from one another, and each embodiment stresses what is different from the other embodiments. Particularly, with respect to the system or system embodiment, since it is essentially similar to the method embodiment, its description is relatively simple, and the relevant sections thereof can be inferred from the corresponding sections of the method embodiment. The system or system embodiment as described above is merely exemplary in nature: units described therein as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; that is to say, they can be located at a single site or distributed over a plurality of network units. Some or all of the modules can be selected according to practical requirements to realize the objectives of the embodied solutions. Persons ordinarily skilled in the art can understand and implement this without spending creative effort.
[0111] What is described above is merely directed to preferred embodiments of the present invention and is not meant to restrict the present invention. Any amendment, equivalent replacement and improvement made within the spirit and scope of the present invention shall be covered within the protection scope of the present invention.

Claims (10)

What is claimed is:
1. A method of identifying repeated correlation calculation, characterized in that the method comprises:
obtaining a first SQL statement and a second SQL statement to be identified;
analyzing the first SQL statement, and identifying a first correlation query included in the first SQL statement, wherein the correlation query includes correlation calculation between datasheets required to be performed for executing the SQL statement;
analyzing the second SQL statement, and identifying a second correlation query included in the second SQL statement; and
determining that there is repeated correlation calculation existent in the first SQL statement and the second SQL statement when there is repeated correlation calculation existent in the first correlation query and the second correlation query.
2. The method according to Claim 1, characterized in that the correlation calculation includes corresponding datasheets and a correlation relation keyword, wherein the correlation relation keyword is employed to describe correlation calculation required to be performed between datasheets, and that the step of analyzing the first SQL statement, and identifying a first correlation query included in the first SQL statement includes:
analyzing the first SQL statement, and identifying a first correlation relation keyword included in the first SQL statement, and a first datasheet and a second datasheet to which the first correlation relation keyword corresponds; and
determining first correlation calculation included in the first correlation query according to the first datasheet, the second datasheet and the first correlation relation keyword.
3. The method according to Claim 1, characterized in that the step of analyzing the first SQL statement, and identifying a first correlation query included in the first SQL statement includes:
analyzing the first SQL statement, and generating json data to which the first SQL statement corresponds; and
identifying the first correlation query included in the first SQL statement according to the json data.
4. The method according to Claim 1 or 2, characterized in that the second SQL statement includes a sub-query and a correlation query between the sub-query and a third datasheet, and that the step of analyzing the second SQL statement, and identifying a second correlation query included in the second SQL statement includes:
analyzing the second SQL statement, and identifying a second correlation relation keyword included in the sub-query, and a fourth datasheet and a fifth datasheet to which the second correlation relation keyword corresponds;
determining second correlation calculation included in the second correlation query according to the second correlation relation keyword, and the fourth datasheet and the fifth datasheet to which the second correlation relation keyword corresponds;
identifying a third correlation relation keyword included in the second SQL statement, and the third datasheet and the sub-query to which the third correlation relation keyword corresponds;
determining third correlation calculation included in the second correlation query according to the third correlation relation keyword, the third datasheet and the fourth datasheet; and
determining fourth correlation calculation included in the second correlation query according to the third correlation relation keyword, the third datasheet and the fifth datasheet.
5. The method according to any one of Claims 1 to 3, characterized in that the first SQL statement and the second SQL statement include corresponding datasheets to be processed, and that the method comprises:
replacing, when the corresponding datasheets to be processed are temporary sheets, the datasheets to be processed with corresponding entity sheets.
6. The method according to Claim 2, characterized in that the step of determining that there is repeated correlation calculation existent in the first SQL statement and the second SQL statement when there is repeated correlation calculation existent in the first correlation query and the second correlation query includes:
grouping the correlation calculations according to the correlation relation keywords; and
determining that there is repeated correlation calculation existent in the first SQL statement and the second SQL statement when any group includes identical correlation calculation of corresponding datasheets.
7. A device for identifying repeated correlation calculation, characterized in that the device comprises:
an obtaining module, for obtaining a first SQL statement and a second SQL statement to be identified;
an analyzing module, for analyzing the first SQL statement, and identifying a first correlation query included in the first SQL statement, wherein the correlation query includes correlation calculation between datasheets required to be performed for executing the SQL statement;
analyzing the second SQL statement, and identifying a second correlation query included in the second SQL statement; and
a processing module, for determining that there is repeated correlation calculation existent in the first SQL statement and the second SQL statement when there is repeated correlation calculation existent in the first correlation query and the second correlation query.
8. The device according to Claim 7, characterized in that the analyzing module can be further employed for analyzing the first SQL statement, and identifying a first correlation relation keyword included in the first SQL statement, and a first datasheet and a second datasheet to which the first correlation relation keyword corresponds; and determining first correlation calculation included in the first correlation query according to the first datasheet, the second datasheet and the first correlation relation keyword.
9. The device according to Claim 7 or 8, characterized in that the second SQL statement includes a sub-query and a correlation query between the sub-query and a third datasheet, and that the analyzing module can be further employed for analyzing the second SQL statement, and identifying a second correlation relation keyword included in the sub-query, and a fourth datasheet and a fifth datasheet to which the second correlation relation keyword corresponds;
determining second correlation calculation included in the second correlation query according to the second correlation relation keyword, and the fourth datasheet and the fifth datasheet to which the second correlation relation keyword corresponds;
identifying a third correlation relation keyword included in the analyzed second SQL statement, wherein the third correlation relation keyword is employed to describe correlation query of the sub-query and the third datasheet;
determining third correlation calculation included in the second correlation query according to the third correlation relation keyword, the third datasheet and the fourth datasheet; and
determining fourth correlation calculation included in the second correlation query according to the third correlation relation keyword, the third datasheet and the fifth datasheet.
10. A computer system, characterized in that the system comprises:
one or more processor(s); and
a memory, associated with the one or more processor(s), wherein the memory is employed to store a program instruction, and the program instruction performs the following operations when it is read and executed by the one or more processor(s):
obtaining a first SQL statement and a second SQL statement to be identified;
analyzing the first SQL statement, and identifying a first correlation query included in the first SQL statement, wherein the correlation query includes correlation calculation between datasheets required to be performed for executing the SQL statement;
analyzing the second SQL statement, and identifying a second correlation query included in the second SQL statement; and
determining that there is repeated correlation calculation existent in the first SQL statement and the second SQL statement when there is repeated correlation calculation existent in the first correlation query and the second correlation query.

CA3130988A 2020-09-16 2021-09-16 Method and device for identifying repetitive association calculation and computer system Pending CA3130988A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010973509.X 2020-09-16
CN202010973509.XA CN112307050B (en) 2020-09-16 2020-09-16 Identification method and device for repeated correlation calculation and computer system

Publications (1)

Publication Number Publication Date
CA3130988A1 (en) 2022-03-16

Family

ID=74483971

Family Applications (1)

Application Number Title Priority Date Filing Date
CA3130988A Pending CA3130988A1 (en) 2020-09-16 2021-09-16 Method and device for identifying repetitive association calculation and computer system

Country Status (2)

Country Link
CN (1) CN112307050B (en)
CA (1) CA3130988A1 (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108038135A (en) * 2017-11-21 2018-05-15 平安科技(深圳)有限公司 Electronic device, the method for multilist correlation inquiry and storage medium
CN109656946B (en) * 2018-09-29 2022-12-16 创新先进技术有限公司 Multi-table association query method, device and equipment
CN110909016B (en) * 2019-10-12 2023-06-16 中国平安财产保险股份有限公司 Repeated association detection method, device, equipment and storage medium based on database

Also Published As

Publication number Publication date
CN112307050A (en) 2021-02-02
CN112307050B (en) 2022-11-15

Legal Events

Date Code Title Description
EEER Examination request

Effective date: 20220916
