CA3144052A1 - Method and apparatus for recognizing new sql statements in database audit systems - Google Patents

Method and apparatus for recognizing new sql statements in database audit systems Download PDF

Info

Publication number
CA3144052A1
CA3144052A1 CA3144052A CA3144052A CA3144052A1 CA 3144052 A1 CA3144052 A1 CA 3144052A1 CA 3144052 A CA3144052 A CA 3144052A CA 3144052 A CA3144052 A CA 3144052A CA 3144052 A1 CA3144052 A1 CA 3144052A1
Authority
CA
Canada
Prior art keywords
statement
values
sql
skeleton
sql statement
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CA3144052A
Other languages
French (fr)
Inventor
Jianning Wei
Yi Ge
Zonglin LI
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
10353744 Canada Ltd
Original Assignee
10353744 Canada Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 10353744 Canada Ltd filed Critical 10353744 Canada Ltd
Publication of CA3144052A1 publication Critical patent/CA3144052A1/en
Pending legal-status Critical Current

Links

Abstract

A method and an apparatus for recognizing new SQL statements in database audit systems relate to the technical field of big data, and feature a capability of effectively reducing resources occupancy of database audit systems, thereby enhancing system performance. The method includes: receiving and modularization processing a SQL statement in a network message to extract a skeleton statement; performing digest computation on the skeleton statement, and processing a resulting digest value into an integer value; and checking a table for comparing the integer value, and based on a comparative result determining whether the SQL statement is a new SQL statement. The apparatus implements the method.

Description

METHOD AND APPARATUS FOR RECOGNIZING NEW SQL STATEMENTS IN
DATABASE AUDIT SYSTEMS
BACKGROUND OF THE INVENTION
Technical Field [0001] The present invention relates to the technical field of big data, and more particularly to a method and an apparatus for recognizing new SQL statements in database audit systems.
Description of Related Art
[0002] Database audit systems are mainly used to audit various operations on database servers.
They acquire network messages of database servers by means of bypass images or plugins, and then extract and store SQL statements from the messages to their databases for subsequent operations such as querying, filtering, and analysis, thereby achieving monitoring and audit for database servers.
[0003] Database audit systems analyze the extracted SQL statements to determine whether database servers under audit have vulnerability or are attacked. From the perspective of technical implementations, the process is to extract, analyze, and store SQL
statements.
Therein, analysis is to audit whether SQL statements have vulnerability or are attacked.
For ensuring accuracy of audit, audit systems have to analyze all SQL
statements. As the number of database servers deployed by an audit system increases, the business flow of the database under audit increasingly grows, making the existing software/hardware platforms incompetent to meet requirements in the industry.
[0004] As revealed by analysis, in the actual business flow, there are many identical SQL
statements, and there are also some statements having the same skeleton with the only difference therebetween relying on parameter values. In fact, these identical or similar statements need only to be analyzed once in an audit system. Assuming that a database audit system is focused on different SQL statements, it will require merely a small part Date Recue/Date Received 2021-12-24 of the flow currently caused. To focus an audit system on the difference, it is essential to tell which statements appear for the first time prior to SQL analysis.
[0005] The conventional approaches to recognition of SQL statements mainly include: character matching, AT recognition, and SQL semantic parsing. These known approaches, however, have their shortcomings. For example, character matching is unable to ignore different parameter values. In a case with ID = 3 and ID = 4, with character matching, the two will be regarded as two different statements, making it less effecting in screening out repeated statements. In addition, handing massive extra-long SQL statements is highly demanding in terms of CPU and storage space. AT recognition in practical use takes time to learn for it to become accurate and it necessitates additional hardware servers, leading to increased costs for deployment. SQL semantic parsing is also resource-consuming, and the only purpose of recognizing new statements is to screen out numerous repeated statements and lower subsequent workload of SQL semantic parsing. Therefore, only new statements require semantic parsing.
SUMMARY OF THE INVENTION
[0006] The objective of the present invention is to provide a method and an apparatus for recognizing new SQL statements in database audit systems, featuring a capability of effectively reduce resource occupancy of database audit systems, thereby enhancing system performance.
[0007] For achieving the foregoing objective, in a first aspect, the present invention provides a method for recognizing new SQL statements in database audit systems, which comprises:
[0008] receiving and modularization processing a SQL statement in a network message to extract a skeleton statement;
[0009] performing digest computation on the skeleton statement, and processing a resulting digest value into an integer value; and
[0010] checking a table for comparing the integer value, and based on a comparative result determining whether the SQL statement is a new SQL statement.

Date Recue/Date Received 2021-12-24
[0011] Preferably, the step of modularization processing a SQL statement to extract a skeleton statement comprises:
[0012] recognizing SQL semantic keywords in the SQL statement, and weeding out contents of the SQL statement other than the SQL semantic keywords so as to obtain the skeleton statement.
[0013] More preferably, before the step of performing digest computation on the skeleton statement, the method further comprises:
[0014] determining whether the skeleton statement contains combination values;
[0015] if the skeleton statement contains combination values, merging the combination values, and using a universal character to replace parameter values contained therein for devaluing; and
[0016] if the skeleton statement does not contain combination values, directly replacing parameter values contained therein with the universal character for devaluing, in which
[0017] the parameter values include variable parameters and/or character strings.
[0018] More preferably, before the step of performing digest computation on the skeleton statement, the method further comprises:
[0019] replacing non-ascii codes of the skeleton statement with the universal character.
[0020] Preferably, the step of performing digest computation on the skeleton statement, and converting digest values so as to obtain integer values comprises:
[0021] using a MD5 algorithm to perform the digest computation on the processed skeleton statement, so as to obtain the digest values; and
[0022] extracting some byte data with fixed positions from the digest values, and converting the byte data into the integer values by means of an integer algorithm.
[0023] Further, the step of checking a table for comparing the integer value, and based on a comparative result determining whether the SQL statement is a new SQL
statement comprises:
[0024] making the table contain a plurality of key-value pairs that correspond to the skeleton statement in a one-on-one manner, wherein the key values correspond to the integer values of the skeleton statement, and the value values are receipt times of the Date Recue/Date Received 2021-12-24 corresponding SQL statement;
[0025] checking the table for the integer values corresponding to the currently received SQL
statement, if the table contains key values identical to the integer values, determining that the SQL statement is not a new SQL statement, using the integer values as the key values while using the receipt times of the SQL statement as the value values, and storing them into the table in the form of key-value pairs; and
[0026] if the table does not contain a key value identical to any of the integer values, determining that the SQL statement is a new SQL statement.
[0027] Further, after the step of determining that the SQL statement is not a new SQL statement, the method further comprises:
[0028] determining a difference of receipt times between the SQL statement newly added into the table and the SQL statement that already exists in the table and has the identical key-value pair, and when the difference of the receipt times exceeds a threshold, deleting the key-value pairs pre-existing in the table.
[0029] Preferably, the method of claim 1 further comprises:
[0030] marking a new statement identification of the SQL statement recognized as a new SQL
statement;
[0031] integrating identification results, the corresponding SQL statement, and the corresponding integer values into an entry of json data, submitting the json data to a message queue for consumption; and
[0032] using a risk engine to consume the json data from the message queue, so as to provide risk analysis services for the new SQL statement.
[0033] As compared to the prior art, the method for recognizing new SQL
statements in database audit systems of the present invention has the following beneficial effects:
[0034] According to the disclosed method for recognizing new SQL statements in database audit systems, for analyzing SQL statements in received network messages, modularized processing is followed by the process of extracting skeleton statements, which are then subject to digest computation. The resulting digest values are converted into integer values to be compared with a table. If there is a key value in the table identical to the Date Recue/Date Received 2021-12-24 integer value, it is determined that the SQL statement is not a new SQL
statement, and the integer values are used as key values while receipt times of SQL
statements are used as value values to be sotted into the table in the form of key-value pairs. If there is not a key value in the table identical to the integer value, it is determined that the SQL statement is a new SQL statement.
[0035] It is thus clear that the present invention first templates SQL
statements and extracts SQL
syntactic skeletons that are then subject to digest computation. At last, one integer value is used to mark a set of SQL statements having the same semantic meaning. As compared to the known character matching scheme, the scheme of the present invention shiels off changing value value information, and focuses on statement skeletons. As compared to the known SQL semantic parsing scheme, the scheme of the present invention only considers whether statements are identical instead of performing semantic parsing, so as to prevent system resources from being unnecessarily occupied by semantic parsing. As demonstrated by data from actual use in production environments, the method for recognizing new statements as described in the present invention has resource occupancy minimized to an ignorable level in a system, thereby effectively reducing resource consumption of the database audit system and improving system performance.
[0036] In a second aspect, the present invention provides an apparatus for recognizing new SQL
statements in database audit systems to be used in the method for recognizing new SQL
statements in database audit systems as disclosed in the foregoing technical scheme. The apparatus comprises:
[0037] a skeleton extracting unit, for receiving and modularization processing a SQL statement in a network message to extract a skeleton statement;
[0038] an integer processing unit, for performing digest computation on the skeleton statement, and processing a resulting digest value into an integer value; and
[0039] a recognizing unit, for checking a table for comparing the integer value, and based on a comparative result determining whether the SQL statement is a new SQL
statement.
[0040] As compared to the prior art, the disclosed apparatus for recognizing new SQL statements in database audit systems provides beneficial effects that are similar to those provided by Date Recue/Date Received 2021-12-24 the disclosed method for recognizing new SQL statements in database audit systems as enumerated above, and thus no repetitions are made herein.
[0041] In a third aspect, the present invention provides a computer-readable storage medium storing therein a computer program, wherein the computer program when executed by a processor performing the steps of the method for recognizing new SQL
statements in database audit systems as described previously.
[0042] As compared to the prior art, the disclosed computer-readable storage medium provides beneficial effects that are similar to those provided by the disclosed method for recognizing new SQL statements in database audit systems as enumerated above, and thus no repetitions are made herein.
BRIEF DESCRIPTION OF THE DRAWINGS
[0043] The accompanying drawings are provided herein for better understanding of the present invention and form a part of this disclosure. The illustrative embodiments and their descriptions are for explaining the present invention and by no means form any improper limitation to the present invention, wherein:
[0044] FIG. 1 is a flowchart of a method for recognizing new SQL statements in database audit systems according to one embodiment of the present invention;
[0045] FIG. 2 is a flowchart of a process for templating SQL statements according to one embodiment of the present invention;
[0046] FIG. 3 is a flowchart of a process for recognizing a new SQL statement according to one embodiment of the present invention; and
[0047] FIG. 4 is a flowchart of a process for a risk engine to detect a new SQL statement according to one embodiment of the present invention DETAILED DESCRIPTION OF THE INVENTION
[0048] To make the foregoing objectives, features, and advantages of the present invention clearer and more understandable, the following description will be directed to some Date Recue/Date Received 2021-12-24 embodiments as depicted in the accompanying drawings to detail the technical schemes disclosed in these embodiments. It is, however, to be understood that the embodiments referred herein are only a part of all possible embodiments and thus not exhaustive. Based on the embodiments of the present invention, all the other embodiments can be conceived without creative labor by people of ordinary skill in the art, and all these and other embodiments shall be embraced in the scope of the present invention Embodiment 1
[0049] Referring to FIG. 1-FIG. 4, the present embodiment provides a method for recognizing new SQL statements in database audit systems. The method comprises:
[0050] receiving and modularization processing a SQL statement in a network message to extract a skeleton statement; performing digest computation on the skeleton statement, and processing a resulting digest value into an integer value; and checking a table for comparing the integer value, and based on a comparative result determining whether the SQL statement is a new SQL statement.
[0051] According to the disclosed method for recognizing new SQL statements in database audit systems, for analyzing SQL statements in received network messages, modularized processing is followed by the process of extracting skeleton statements, which are then subject to digest computation. The resulting digest values are converted into integer values to be compared with a table. If there is a key value in the table identical to the integer value, it is determined that the SQL statement is not a new SQL
statement, and the integer values are used as key values while receipt times of SQL
statements are used as value values to be sotted into the table in the form of key-value pairs. If there is not a key value in the table identical to the integer value, it is determined that the SQL statement is a new SQL statement.
[0052] It is thus clear that the present embodiment first templates SQL
statements and extracts SQL syntactic skeletons that are then subject to digest computation. At last, one integer value is used to mark a set of SQL statements having the same semantic meaning. As compared to the known character matching scheme, the scheme of the present embodiment shiels off changing value value information, and focuses on statement Date Recue/Date Received 2021-12-24 skeletons. As compared to the known SQL semantic parsing scheme, the scheme of the present embodiment only considers whether statements are identical instead of performing semantic parsing, so as to prevent system resources from being unnecessarily occupied by semantic parsing. As demonstrated by data from actual use in production environments, the method for recognizing new statements as described in the present embodiment has resource occupancy minimized to an ignorable level in a system, thereby effectively reducing resource consumption of the database audit system and improving system performance.
[0053] Referring to FIG. 2, in the embodiment described above, the step of modularization processing a SQL statement to extract a skeleton statement comprises:
[0054] recognizing SQL semantic keywords in the SQL statement, and weeding out contents of the SQL statement other than the SQL semantic keywords so as to obtain the skeleton statement. To be exemplificative, the SQL semantic keywords include SELECT and INSERT.
[0055] In the foregoing embodiment, before the step of performing digest computation on the skeleton statement, the method further comprises:
[0056] determining whether the skeleton statement contains combination values;
if the skeleton statement contains combination values, merging the combination values, and using a universal character to replace parameter values contained therein for devaluing; and if the skeleton statement does not contain combination values, directly replacing parameter values contained therein with the universal character for devaluing, in which, for example, the combination values are VALUES (1,`TOM'), (2,`JIMMV), (3,`SAM'), (4,`LF).
In specific implementations, the universal character may be "?", and the value value replaced with the universal character is VALUES(?). The parameter values include variable parameters and/or character strings. To be exemplificative, the variable parameter may be "1", and the character string may be "TOM".
[0057] In the foregoing embodiment, before the step of performing digest computation on the skeleton statement, the method further comprises:
[0058] replacing non-ascii codes in the skeleton statements with the universal characters. For Date Recue/Date Received 2021-12-24 example, for concat('%',VAM,'%'), the Chinese character string is replaced by "?".
[0059] In the foregoing embodiment, the step of performing digest computation on the skeleton statement, and converting digest values so as to obtain integer values comprises:
[0060] using a MD5 algorithm to perform the digest computation on the processed skeleton statement, so as to obtain the digest values; and extracting some byte data with fixed positions from the digest values, and converting the byte data into the integer values by means of an integer algorithm. To be exemplificative, for a digest value that is a 16-bit number, 10055897659947028311, the byte data of the 8th through 15th bits among the 16 bits are taken and transformed into a 64-bit integer value.
[0061] In the foregoing embodiment, the step of checking a table for comparing the integer value, and based on a comparative result determining whether the SQL statement is a new SQL
statement comprises:
[0062] making the table contain a plurality of key-value pairs that correspond to the skeleton statement in a one-on-one manner, wherein the key values correspond to the integer values of the skeleton statement, and the value values are receipt times of the corresponding SQL statement; checking the table for the integer values corresponding to the currently received SQL statement, if the table contains key values identical to the integer values, determining that the SQL statement is not a new SQL statement, using the integer values as the key values while using the receipt times of the SQL
statement as the value values, and storing them into the table in the form of key-value pairs;
and if the table does not contain a key value identical to any of the integer values, determining that the SQL statement is a new SQL statement.
[0063] In the foregoing embodiment, after the step of determining that the SQL
statement is not a new SQL statement, the method further comprises:
[0064] determining a difference of receipt times between the SQL statement newly added into the table and the SQL statement that already exists in the table and has the identical key-value pair, and when the difference of the receipt times exceeds a threshold, deleting the key-value pairs pre-existing in the table. In specific implementations, the system periodically detects key-value pairs in the table. If the key value of a newly added SQL

Date Recue/Date Received 2021-12-24 statement has a matched key-value pair in the table, the difference between the receipt times is calculated according to the value value of the newly added SQL
statement and the corresponding value value in the table. If the difference between the receipt times exceeds the threshold, the pre-existing key-value pair is deleted from the table.
[0065] In the foregoing embodiment, after the step of checking a table for comparing the integer value, and based on a comparative result determining whether the SQL statement is a new SQL statement, the method further comprises:
[0066] marking a new statement identification of the SQL statement recognized as a new SQL
statement; integrating identification results, the corresponding SQL
statement, and the corresponding integer values into an entry of json data, submitting the json data to a message queue for consumption; and using a risk engine to consume the json data from the message queue, so as to provide risk analysis services for the new SQL
statement.
[0067] For facilitating understanding, the scheme of the present embodiment will be described exemplarily below:
[0068] In specific implementations, SQL statements are templated. Only SQL
semantic keywords, e.g., SELECT, INSERT, in the SQL statements are retained, and value values, such as numeral values and character string, int the SQL statements are screened off.
Additionally, some particular statements WHERE AND( ), multi-line statements inserted by INSERT are merged so that only SQL syntactic skeletons are retained and extracted into one skeleton statement that does not include any specific numeral value.
Then the extracted skeleton statement is then subject to MD5 digest computation. The highest 8 bits among the 16 bits are taken and converted into a 64-bit integer value.
The purpose of this is to use one long integer value to represent a SQL statement. At last, the long integer value is stored in the HASH table. A specific implementation may be realized through the following steps.
[0069] Step 1 involves receiving SQL statements extracted from network messages, wherein a SQL statement is a complete SQ statement that includes INSERT and/or SELECT
SQL
semantic keywords, such as an INSERT semantic keyword of INSERT INTO
USERINFO 0 (USERID, USERNAME) VALUES (1,`TOM'), (2,`JIMMY'), (3,' SAM'), Date Recue/Date Received 2021-12-24 (4,1F), and a SELECT semantic keyword of SELECT * FROM USERINFO 0 WHERE
ID=1;
[0070] Step 2 is about devaluing the SQL statement by replacing all the numeral values or character strings in the SQL statement with"?". For example, WHERE ID=1 is rewritten into WHERE ID=?, and WHERE NAME="titlename" is rewritten into WHERE
NAME="?". It is to be noted that a completer numeral value can be replaced by one question mark only. For example, 123 is replaced by "?", but not"? ? ?".
[0071] After the processing of Step 2, the SELECT semantic keyword as mentioned above is rewritten as SELECT * FROM USERINFO 0 WHERE ID=?.
[0072] At Step 3, the combined value of multiple entries of records in the INSERT statement are merged. In an example where the INSERT statement inserts a JSON array object data that contains multiple entries of records, the different record numbers will be recognized as different statements, and thus these records have to be merged into one.
[0073] After Step 3, the INSERT statement is rewritten as INSERT INTO USERINFO

(USERID, USERNAME) VALUES (?, ?).
[0074] Step 4 involves merging other statements in the SELECT semantic keywords that need a merged value. For example, plural conditions in the statements WHERE AND, WHERE
OR, and WHERE NOT are merged.
[0075] At Step 5, the non-ascii code is processed. For example, in concat('%',WM,'%'), the Chinese character string is replaced by "?".
[0076] At Step 6, the single skeleton statement obtained after the process of Step 5 is subject to MD5 digest operation. For example, the digest values corresponding to INSERT
statement and the SELECT statement at the previous step, respectively, are 10055897659947028311 and 3935916992245373588.
[0077] Step 7 is about simplifying the digest value by taking the 8th through the 15th byte data among the 16 bytes, and transforming the 8-bit value into a 64-bit integer value.
[0078] When a complete SQL statement is simplified into one integer value through the steps shown in FIG. 2, they are adopted into a hashmap table whose key is the simplified integer value of the SQL statement and value is the timestamp of when the SQL
statement was Date Recue/Date Received 2021-12-24 received.
[0079] Step 1: extracting SQL statements from network messages, templating the SQL
statements through for example the step shown in FIG. 2, and then extracting digest values;
[0080] Step 2: using the integer value as the product of digest computation as a key for reading the hashmap storage table, if the key can be read from the hashmap storage table, determining that it is not a new SQL statement, for which case only the timestamp of the node is updated, wherein the timestamp is used for determining timeout, and if the key cannot be read from the hashmap storage table, determining that it is a new SQL statement, for which case the key value and the value value are added into the hashmap storage table;
[0081] Step 3: if it is a new SQL statement, adding a new statement mark, and organizing the new statement mark, the SQL statement, and the SQL statement digest integer value into an entry of json data;
[0082] Step 4: submitting the json data to a message queue for consumption.
[0083] FIG. 4 is a flowchart illustrates how a risk engine processes a new SQL
statement. The risk engine performs security analysis on the SQL statement, and the risk engine only processes SQL statements with new statement marks.
[0084] At Step 1, the risk engine reads JSON data form message queues, wherein the JSON data contains a complete SQL statement and a new statement mark.
[0085] At Step 2, the risk engine determines that the SQL statement is a new statement according to the new statement mark, and performs statement analysis to check whether a risk exists.
If the SQL statement is not a new statement, it is not processed.
[0086] To sum up, the present embodiment has the following beneficial effects:
[0087] SQL statement templating: SQL statements are devalued, merged, extracted for statement skeletons to form SQL statement templates, thereby reducing occupation of system resources and improving system performance.
[0088] Digest processing: the processed SQL statement template are subject to digest processing, so that a skeleton statement can be represented by a long integer numeral value, which facilitates fast table looking-up.

Date Recue/Date Received 2021-12-24
[0089] KMP optimizing algorithm: For statement matching required by the template process, the known, intuitive character matching has a time complexity of 0(MxN), yet the KMP
algorithm has a time complexity of 0(M+N). The present embodiment optimizes the KMP algorithm for SQL statements, and has a reduced time complexity of 0(M-2N) or below.
Embodiment 2
[0090] The present embodiment provides an apparatus for recognizing new SQL
statements in database audit systems, which comprises:
[0091] a skeleton extracting unit, for receiving and modularization processing a SQL statement in a network message to extract a skeleton statement;
[0092] an integer processing unit, for performing digest computation on the skeleton statement, and processing a resulting digest value into an integer value; and
[0093] a recognizing unit, for checking a table for comparing the integer value, and based on a comparative result determining whether the SQL statement is a new SQL
statement.
[0094] As compared to the prior art, the disclosed apparatus for recognizing new SQL statements in database audit systems provides beneficial effects that are similar to those provided by the disclosed method for recognizing new SQL statements in database audit systems as enumerated above, and thus no repetitions are made herein.
Embodiment 3
[0095] The present embodiment provides a computer-readable storage medium, storing thereon a computer program. When the computer program is executed by a processor, it implements the steps of the method for recognizing new SQL statements in database audit systems as described previously.
[0096] As compared to the prior art, the disclosed apparatus for recognizing new SQL statements in database audit systems provides beneficial effects that are similar to those provided by the disclosed method for recognizing new SQL statements in database audit systems as enumerated above, and thus no repetitions are made herein.
[0097] As will be appreciated by people of ordinary skill in the art, implementation of all or a part of the steps of the method of the present invention as described previously may be Date Recue/Date Received 2021-12-24 realized by having a program instruct related hardware components. The program may be stored in a computer-readable storage medium, and the program is about performing the individual steps of the methods described in the foregoing embodiments.
The storage medium may be a ROM/RAM, a disk, a compact disk, a memory card or the like.
[0098] The present invention has been described with reference to the preferred embodiments and it is understood that the embodiments are not intended to limit the scope of the present invention. Moreover, as the contents disclosed herein should be readily understood and can be implemented by a person skilled in the art, all equivalent changes or modifications which do not depart from the concept of the present invention should be encompassed by the appended claims. Hence, the scope of the present invention shall only be defined by the appended claims.

Date Recue/Date Received 2021-12-24

Claims (10)

What is claimed is:
1. A method for recognizing new SQL statements in database audit systems, wherein comprising:
receiving and modularization processing a SQL statement in a network message to extract a skeleton statement;
performing digest computation on the skeleton statement, and processing a resulting digest value into an integer value; and checking a table for comparing the integer value, and based on a comparative result determining whether the SQL statement is a new SQL statement.
2. The method of claim 1, wherein the step of modularization processing a SQL
statement to extract a skeleton statement comprises:
recognizing SQL semantic keywords in the SQL statement, and weeding out contents of the SQL
statement other than the SQL semantic keywords so as to obtain the skeleton statement.
3. The method of claim 1 or 2, wherein before the step of performing digest computation on the skeleton statement, the method further comprises:
determining whether the skeleton statement contains combination values;
if the skeleton statement contains combination values, merging the combination values, and using a universal character to replace parameter values contained therein for devaluing; and if the skeleton statement does not contain combination values, directly replacing parameter values contained therein with the universal character for devaluing, in which the parameter values include variable parameters and/or character strings.
4. The method of claim 1 or 3, wherein before the step of performing digest computation on the skeleton statement, the method further comprises:
replacing non-ascii codes of the skeleton statement with the universal character.
Date Recue/Date Received 2021-12-24
5. The method of claim 3 or 4, wherein the step of performing digest computation on the skeleton statement, and converting digest values so as to obtain integer values comprises:
using a MD5 algorithm to perform the digest computation on the processed skeleton statement, so as to obtain the digest values; and extracting some byte data with fixed positions from the digest values, and converting the byte data into the integer values by means of an integer algorithm.
6. The method of claim 5, wherein the step of checking a table for comparing the integer value, and based on a comparative result determining whether the SQL statement is a new SQL
statement comprises:
making the table contain a plurality of key-value pairs that correspond to the skeleton statement in a one-on-one manner, wherein the key values correspond to the integer values of the skeleton statement, and the value values are receipt times of the corresponding SQL
statement;
checking the table for the integer values corresponding to the currently received SQL statement, if the table contains key values identical to the integer values, determining that the SQL statement is not a new SQL statement, using the integer values as the key values while using the receipt times of the SQL statement as the value values, and storing them into the table in the form of key-value pairs; and if the table does not contain a key value identical to any of the integer values, determining that the SQL statement is a new SQL statement.
7. The method of claim 6, wherein after the step of determining that the SQL
statement is not a new SQL statement, the method further comprises:
determining a difference of receipt times of the SQL statements corresponding to the SQL
statement newly added into the table with the identical key values pre-existing in the table, and when the difference of the receipt times exceeds a threshold, deleting the key-value pairs pre-existing in the table.

Date Recue/Date Received 2021-12-24
8. The method of claim 1, wherein further comprising:
marking a new statement identification of the SQL statement recognized as a new SQL statement;
integrating identification results, the corresponding SQL statement, and the corresponding integer values into an entry of json data, submitting the json data to a message queue for consumption; and using a risk engine to consume the json data from the message queue, so as to provide risk analysis services for the new SQL statement.
9. An apparatus for recognizing new SQL statements in database audit systems, wherein comprising:
a skeleton extracting unit, for receiving and modularization processing a SQL
statement in a network message to extract a skeleton statement;
an integer processing unit, for performing digest computation on the skeleton statement, and processing a resulting digest value into an integer value; and a recognizing unit, for checking a table for comparing the integer value, and based on a comparative result determining whether the SQL statement is a new SQL
statement.
10. A computer-readable storage medium storing therein a computer program, wherein the computer program when executed by a processor performing a method as described in any of claims 1 through 8.

Date Recue/Date Received 2021-12-24
CA3144052A 2020-12-28 2021-12-24 Method and apparatus for recognizing new sql statements in database audit systems Pending CA3144052A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011579141.5 2020-12-28
CN202011579141.5A CN112686029A (en) 2020-12-28 2020-12-28 SQL new sentence identification method and device for database audit system

Publications (1)

Publication Number Publication Date
CA3144052A1 true CA3144052A1 (en) 2022-06-28

Family

ID=75452621

Family Applications (1)

Application Number Title Priority Date Filing Date
CA3144052A Pending CA3144052A1 (en) 2020-12-28 2021-12-24 Method and apparatus for recognizing new sql statements in database audit systems

Country Status (2)

Country Link
CN (1) CN112686029A (en)
CA (1) CA3144052A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114172890B (en) * 2021-11-03 2024-02-27 阿里巴巴(中国)有限公司 File second transmission processing method and device, storage medium and electronic equipment

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101453358B (en) * 2007-12-06 2011-05-11 北京启明星辰信息技术股份有限公司 Sql sentence audit method and system for oracle database binding variable
EP2944055A4 (en) * 2013-01-11 2016-08-17 Db Networks Inc Systems and methods for detecting and mitigating threats to a structured data storage system
CN103365979A (en) * 2013-07-03 2013-10-23 交通银行股份有限公司 Long-distance double-center online processing method and system based on open database
CN103886021B (en) * 2014-02-24 2018-10-02 上海上讯信息技术股份有限公司 A kind of database audit result interpretation method and system based on natural language
CN105373607B (en) * 2015-11-13 2020-04-24 国网智能电网研究院 Method for compressing SQL access log of power business system
CN105912594B (en) * 2016-04-05 2020-01-07 深信服科技股份有限公司 SQL statement processing method and system

Also Published As

Publication number Publication date
CN112686029A (en) 2021-04-20

Similar Documents

Publication Publication Date Title
US10402163B2 (en) Intelligent data extraction
CN111352907A (en) Method and device for analyzing pipeline file, computer equipment and storage medium
CN110929125A (en) Search recall method, apparatus, device and storage medium thereof
CN112364637B (en) Sensitive word detection method and device, electronic equipment and storage medium
CN111984792A (en) Website classification method and device, computer equipment and storage medium
CN112035599A (en) Query method and device based on vertical search, computer equipment and storage medium
CN114818643A (en) Log template extraction method for reserving specific service information
CA3144052A1 (en) Method and apparatus for recognizing new sql statements in database audit systems
CN114372144A (en) Gambling domain name identification method based on certificate and domain name resolution
CN111767543B (en) Replay attack vulnerability determination method, device, equipment and readable storage medium
CN116302089B (en) Picture similarity-based code clone detection method, system and storage medium
CN115801455B (en) Method and device for detecting counterfeit website based on website fingerprint
CN111314109A (en) Weak key-based large-scale Internet of things equipment firmware identification method
CN111291535A (en) Script processing method and device, electronic equipment and computer readable storage medium
CN112711678A (en) Data analysis method, device, equipment and storage medium
CN113472686B (en) Information identification method, device, equipment and storage medium
CN114266046A (en) Network virus identification method and device, computer equipment and storage medium
CN111460268B (en) Method and device for determining database query request and computer equipment
CN115392238A (en) Equipment identification method, device, equipment and readable storage medium
CN116414976A (en) Document detection method and device and electronic equipment
CN114281761A (en) Data file loading method and device, computer equipment and storage medium
CN114528908A (en) Network request data classification model training method, classification method and storage medium
CN114385436A (en) Server grouping method and device, electronic equipment and storage medium
EP1365331A2 (en) Determination of a semantic snapshot
CN115048543B (en) Image similarity judgment method, image searching method and device

Legal Events

Date Code Title Description
EEER Examination request

Effective date: 20220916

EEER Examination request

Effective date: 20220916

EEER Examination request

Effective date: 20220916

EEER Examination request

Effective date: 20220916

EEER Examination request

Effective date: 20220916