CN107784003B - Data query anomaly detection method, device, equipment and system - Google Patents

Data query anomaly detection method, device, equipment and system

Info

Publication number
CN107784003B
Authority
CN
China
Prior art keywords
data query
data
query instruction
condition
symbol
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610742829.8A
Other languages
Chinese (zh)
Other versions
CN107784003A (en
Inventor
蔡聪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cainiao Smart Logistics Holding Ltd
Original Assignee
Cainiao Smart Logistics Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Cainiao Smart Logistics Holding Ltd
Priority to CN201610742829.8A
Publication of CN107784003A
Application granted
Publication of CN107784003B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a data query anomaly detection method, device, equipment and system. The method comprises the following steps: acquiring a data query instruction, wherein the data query instruction is used for querying a data table; determining whether the data query instruction has an association condition, wherein the association condition is used for associating data tables; and if the data query instruction includes the association condition, determining, according to the association condition, whether a query result corresponding to the data query instruction is abnormal. In the embodiment, the association condition in the data query instruction, such as SQL, is obtained, and whether the query result of the SQL may be abnormal is predicted from the association condition; an early warning is then issued so that a data developer can modify the SQL in advance, which improves the efficiency of detecting data-quantity anomalies.

Description

Data query anomaly detection method, device, equipment and system
Technical Field
The present application relates to internet technologies, and in particular, to a method, an apparatus, a device, and a system for detecting data query anomalies.
Background
Structured Query Language (SQL) is a database query and programming language used for accessing data and for querying, updating, and managing databases.
A database stores a large number of data tables, and data tables that have a certain association relationship can be queried through SQL query statements. Take a shopping website as an example: a large number of users log in to the site every day, browse commodities, click to purchase them, and review the goods they have purchased. These behaviors generate a large number of data tables, which are stored in the database. For example, the database may store a data table A of user identification numbers and user nicknames, a data table B of user nicknames and user transaction amounts, a data table C of user transaction dates and user transaction amounts, and so on. If a data table containing user identification numbers and user transaction amounts is required, it can be obtained by associating data table A and data table B through an SQL query statement.
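The association of data table A and data table B described above can be sketched as follows. This is only an illustrative sketch using Python's built-in sqlite3 module; the table names, column names, and sample rows are assumptions made for the example and do not come from the application.

```python
import sqlite3

# In-memory database with two hypothetical tables modelled on
# "data table A" (user ID, nickname) and "data table B" (nickname, amount).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (user_id TEXT, nickname TEXT);
    CREATE TABLE table_b (nickname TEXT, amount REAL);
    INSERT INTO table_a VALUES ('u1', 'alice'), ('u2', 'bob');
    INSERT INTO table_b VALUES ('alice', 30.0), ('bob', 12.5);
""")

# Associate A and B through the shared nickname column to obtain
# user ID together with transaction amount.
rows = conn.execute("""
    SELECT a.user_id, b.amount
    FROM table_a AS a
    JOIN table_b AS b ON a.nickname = b.nickname
    ORDER BY a.user_id
""").fetchall()
print(rows)
```

Because each nickname appears once in each table, the join here returns one row per user; the anomalies discussed below arise when the join condition does not discriminate between rows.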
However, when the SQL query statement itself is faulty, the amount of data returned may be abnormally large or abnormally small, and the query result does not meet the intended query condition.
Disclosure of Invention
The present application provides a data query anomaly detection method, device, equipment and system, which are used for improving the efficiency of detecting data-quantity anomalies.
In one aspect, the present application provides a data query anomaly detection method, including:
acquiring a data query instruction, wherein the data query instruction is used for querying a data table;
determining whether the data query instruction has an association condition, wherein the association condition is used for associating data tables;
and if the data query instruction includes the association condition, determining, according to the association condition, whether a query result corresponding to the data query instruction is abnormal.
In another aspect, the present application provides a data query anomaly detection apparatus, including:
an acquisition module, configured to acquire a data query instruction, wherein the data query instruction is used for querying a data table;
a first determining module, configured to determine whether the data query instruction has an association condition, wherein the association condition is used for associating data tables;
and a second determining module, configured to determine, when the first determining module determines that the data query instruction includes the association condition, whether a query result corresponding to the data query instruction is abnormal according to the association condition.
In yet another aspect, the present application provides a server, comprising: a processing unit and a storage unit;
the storage unit is used for storing data tables;
the processing unit is coupled to the storage unit and is used for: acquiring a data query instruction, wherein the data query instruction is used for querying a data table in the storage unit; determining whether the data query instruction has an association condition, wherein the association condition is used for associating data tables; and if the data query instruction includes the association condition, determining, according to the association condition, whether a query result corresponding to the data query instruction is abnormal.
In still another aspect, the present application provides a data query anomaly detection system, including the data query anomaly detection apparatus or the server described above.
According to the present application, the association condition in a data query instruction such as SQL is obtained, and whether the query result of the SQL may be abnormal is predicted from the association condition; an early warning is then issued so that a data developer can modify the SQL in advance, which improves the efficiency of detecting data-quantity anomalies.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a flowchart of a data query anomaly detection method according to an embodiment of the present application;
Fig. 2 is a diagram illustrating a structure of a syntax tree according to an embodiment of the present application;
Fig. 3 is a flowchart of a data query anomaly detection method according to a second embodiment of the present application;
Fig. 4 is a flowchart of a data query anomaly detection method according to a third embodiment of the present application;
Fig. 5 is a flowchart of a data query anomaly detection method according to a fourth embodiment of the present application;
Fig. 6 is a flowchart of a data query anomaly detection method according to a fifth embodiment of the present application;
Fig. 7 is a flowchart of a data query anomaly detection method according to a sixth embodiment of the present application;
Fig. 8 is a flowchart of a data query anomaly detection method according to a seventh embodiment of the present application;
Fig. 9 is a schematic structural diagram of a data query anomaly detection apparatus according to an embodiment of the present application;
Fig. 10 is a schematic structural diagram of a data query anomaly detection apparatus according to a second embodiment of the present application;
Fig. 11 is a schematic structural diagram of a data query anomaly detection apparatus according to a third embodiment of the present application;
Fig. 12 is a schematic structural diagram of a server according to an embodiment of the present application;
Fig. 13 is a schematic structural diagram of a server according to another embodiment of the present application.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
In the prior art, the SQL select statement can be used to select data from a data table. Assume that the database of a certain shopping site stores a data table of member ID, member name, and registration time, as shown in Table 1 below:
TABLE 1
[Table 1 is provided as an image in the original publication; its columns are member ID, member name, and registration time.]
The member ID and member name of members whose registration time is June 2016 can be selected from Table 1 by a corresponding select statement, and the selected data can be put into a new table such as Table 2:
TABLE 2
Member ID    Member name    Registration time
003          Wang X         June 2016
004          Zhao X         June 2016
005          Han X          June 2016
006          Bai X          June 2016
In addition, the database of the shopping site may also store a data table about the member order, which is shown in table 3 below:
TABLE 3
Order number    Member ID    Order time
1000            001          January 2016
1001            001          February 2016
1002            001          March 2016
1003            002          June 2016
1004            003          June 2016
1005            004          June 2016
1008            007          August 2016
1009            001          September 2016
Table 3 shows the order number of each order and the time at which the member placed it. The orders of each month can be selected from Table 3 by a select statement in SQL. If each order number denotes one order, the orders of a given month, for example June 2016, can be selected from Table 3 by a corresponding select statement, and the selected data can be put into a new table such as Table 4:
TABLE 4
[Table 4 is provided as an image in the original publication; per the description, its columns are order number and order time, and it contains three rows.]
In addition, tables in a database may be associated with each other through keys. The "registration time" column of Table 2 may be used as a key, and the "order time" column of Table 4 may also be used as a key, so that Table 2 and Table 4 can be associated through these two columns. The join statement of SQL may be used to implement the association operation on data tables; the keyword join identifies the association operation. Suppose Table 2 is denoted table2, Table 4 is denoted table4, the member ID is denoted ID-P, the member name is denoted Name-P, the registration time is denoted RegistTime, the order number is denoted OrderNO, and the order time is denoted OrderTime. When table2 and table4 are associated through the join statement, the association condition may be table2.RegistTime = table4.OrderTime, and the SQL instruction corresponding to the association operation may be join on table2.RegistTime = table4.OrderTime.
All values in the "registration time" column of Table 2 are the same, namely June 2016, and all values in the "order time" column of Table 4 are the same, also June 2016. The values of the "registration time" column of Table 2 are therefore identical to the values of the "order time" column of Table 4, so a Cartesian product occurs when Table 2 and Table 4 are associated through "registration time" and "order time". The result of the association operation is shown in Table 5:
TABLE 5
Member ID    Member name    Order number
003          Wang X         1003
004          Zhao X         1003
005          Han X          1003
006          Bai X          1003
003          Wang X         1004
004          Zhao X         1004
005          Han X          1004
006          Bai X          1004
003          Wang X         1005
004          Zhao X         1005
005          Han X          1005
006          Bai X          1005
As shown in Table 2, "member ID", "member name", and "registration time" are the column names of Table 2, and each column contains 4 values, i.e., Table 2 contains 4 rows of data. As shown in Table 4, "order number" and "order time" are the column names of Table 4, and each column contains 3 values, i.e., Table 4 contains 3 rows of data. After the Cartesian product occurs, the data table formed by associating Table 2 and Table 4 contains 4 × 3 = 12 rows of data, as shown in Table 5.
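The 4 × 3 = 12 row blow-up can be reproduced with a small sketch. It uses Python's built-in sqlite3 module; the column names MemberID and MemberName, the short member names, and the '2016-06' date encoding are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table2 (MemberID TEXT, MemberName TEXT, RegistTime TEXT);
    CREATE TABLE table4 (OrderNO TEXT, OrderTime TEXT);
    -- Four members, all registered in June 2016.
    INSERT INTO table2 VALUES
        ('003', 'Wang', '2016-06'), ('004', 'Zhao', '2016-06'),
        ('005', 'Han',  '2016-06'), ('006', 'Bai',  '2016-06');
    -- Three orders, all placed in June 2016.
    INSERT INTO table4 VALUES
        ('1003', '2016-06'), ('1004', '2016-06'), ('1005', '2016-06');
""")

# The join condition is true for every pair of rows, so every member
# row is paired with every order row.
joined = conn.execute("""
    SELECT table2.MemberID, table2.MemberName, table4.OrderNO
    FROM table2
    JOIN table4 ON table2.RegistTime = table4.OrderTime
""").fetchall()
row_count = len(joined)
print(row_count)
```

The join is syntactically valid and has an association condition, yet it behaves like an unconditioned cross product because the condition never filters anything out.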
In an internet system, Tables 1 and 3 may each contain thousands of rows of data, and Tables 2 and 4 may also each contain thousands of rows. When two tables containing thousands of rows each are associated and a Cartesian product occurs, the amount of data in the resulting table becomes abnormally large, which may be catastrophic for the internet system.
As can be seen from Tables 2 and 4, the cause of the Cartesian product is as follows: every value in the "registration time" column of Table 2 is June 2016, and every value in the "order time" column of Table 4 is June 2016. If table2.RegistTime and table4.OrderTime are each treated as variables, each variable takes a single value, and the two values are always equal while Table 2 and Table 4 are being associated. In this case, the association condition table2.RegistTime = table4.OrderTime can be considered to hold constantly during the association of Table 2 and Table 4.
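One way to test for the situation just described, where each join column holds a single value and the two values coincide, is sketched below. This is an illustrative check only, not the detection procedure of the application; the helper name single_shared_value and the table contents are assumptions, and the table and column names are interpolated into the SQL text, so they must come from a trusted source.

```python
import sqlite3

def single_shared_value(conn, left_table, left_col, right_table, right_col):
    """Return True if each column holds exactly one distinct value
    and that value is the same on both sides."""
    left = conn.execute(
        f"SELECT DISTINCT {left_col} FROM {left_table}").fetchall()
    right = conn.execute(
        f"SELECT DISTINCT {right_col} FROM {right_table}").fetchall()
    return len(left) == 1 and left == right

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table2 (MemberID TEXT, RegistTime TEXT);
    CREATE TABLE table4 (OrderNO TEXT, OrderTime TEXT);
    INSERT INTO table2 VALUES ('003', '2016-06'), ('004', '2016-06');
    INSERT INTO table4 VALUES ('1003', '2016-06'), ('1004', '2016-06');
""")
constant = single_shared_value(
    conn, "table2", "RegistTime", "table4", "OrderTime")
print(constant)
```

A check of this kind needs access to the data, unlike the purely textual checks on the SQL statement discussed later.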
It can be seen that an association condition that holds constantly is a sufficient condition for an abnormal SQL query result. Therefore, to avoid an abnormal SQL query result, whether the association condition holds constantly can be detected before the SQL association operation is executed, and constantly-true association conditions can be avoided as far as possible, which reduces the probability of an abnormal SQL query result.
In addition, an association condition may be expressed as field expression 1, a relation symbol, and field expression 2, with field expression 1 and field expression 2 located on the two sides of the relation symbol. A field expression may be a constant, a character, a character string, a variable name, or the like, or a combination of at least two of these. For example, table2.RegistTime and table4.OrderTime may each serve as a field expression; the constant string aaa may serve as a field expression; a numerical value may serve as a field expression; and (constant string + numerical value) may serve as a field expression. The association condition is thus the logical relationship formed by field expression 1, the relation symbol, and field expression 2. Specifically, the relation symbol may be an equal sign, a not-equal sign, a greater-than sign, a less-than sign, a greater-than-or-equal sign, a less-than-or-equal sign, and the like.
Taking the equal sign as the relation symbol, the following checks can be used to detect whether the association condition holds constantly:
1) Whether field expression 1 and field expression 2 are the same numerical value. If field expression 1 is the numerical value 100 and field expression 2 is the numerical value 100, the association condition is 100 = 100. Since 100 always equals itself, it can be determined in this case that the association condition holds constantly.
2) Whether field expression 1 and field expression 2 are the same character or character string. If field expression 1 is the character string aaa and field expression 2 is the character string aaa, the association condition is aaa = aaa. Since the character string aaa always equals itself, it can be determined in this case that the association condition holds constantly.
3) Whether field expression 1 and field expression 2 point to the same value, character, or character string even though they differ in form. Take the association condition table2.RegistTime = table4.OrderTime described above: table2.RegistTime and table4.OrderTime differ in form, but during the association of Table 2 and Table 4 the value of table2.RegistTime is June 2016 and the value of table4.OrderTime is also June 2016. The two values are therefore always equal during the association, and the association condition can again be considered to hold constantly.
The relation symbol may also be a not-equal sign. For example, field expression 1 is the numerical value 50 and field expression 2 is the numerical value 100. If the association condition is that field expression 1 is not equal to field expression 2, the association condition is 50 != 100, where "!=" denotes the not-equal sign. Since 50 never equals 100, the association condition 50 != 100 always holds.
Without loss of generality, the key to judging whether the association condition holds constantly is whether the logical relationship formed by field expression 1, the relation symbol, and field expression 2 holds unconditionally, or holds throughout the execution of the association operation, that is, throughout the association of the data tables.
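Checks 1) and 2) above, together with the not-equal example, can be sketched for the case where both field expressions are literals. This is a minimal sketch under stated assumptions: the function names are hypothetical, string literals are assumed to be written in single quotes, and check 3), which depends on the data in the tables, is deliberately left undecided.

```python
import operator
import re

# Relation symbols mapped to Python comparison functions.
OPS = {
    "=": operator.eq, "!=": operator.ne,
    ">": operator.gt, "<": operator.lt,
    ">=": operator.ge, "<=": operator.le,
}
NUMBER = re.compile(r"-?\d+(\.\d+)?")

def parse_literal(expr):
    """Return the literal value of expr, or None if expr is not a
    numeric literal or a single-quoted string literal."""
    expr = expr.strip()
    if len(expr) >= 2 and expr[0] == "'" and expr[-1] == "'":
        return expr[1:-1]
    if NUMBER.fullmatch(expr):
        return float(expr)
    return None  # e.g. a column reference such as table2.RegistTime

def always_true_for_literals(left, symbol, right):
    """True only when both sides are literals of the same kind and the
    relation holds; column references cannot be decided from text."""
    lv, rv = parse_literal(left), parse_literal(right)
    if lv is None or rv is None or type(lv) is not type(rv):
        return False
    return OPS[symbol](lv, rv)

print(always_true_for_literals("100", "=", "100"))
print(always_true_for_literals("50", "!=", "100"))
```

Returning False for column references means "not proven constant from the text alone", not "proven safe".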
Therefore, to avoid an abnormal SQL query result, in this embodiment whether the association condition holds constantly is detected before the SQL association operation is executed. When the association condition is found to hold constantly, an early warning can be issued so that a data developer modifies the SQL query statement in advance. In other words, the SQL query statement itself is checked before it is executed, so that constantly-true association conditions are avoided as far as possible. Compared with detecting an abnormal SQL query result only after it has occurred, this improves the efficiency of detecting abnormal SQL query results.
Fig. 1 is a flowchart of a data query anomaly detection method according to an embodiment of the present application. As shown in Fig. 1, the method includes the following steps:
Step S101, acquiring a data query instruction, wherein the data query instruction is used for querying a data table.
The execution subject of this embodiment may be a computer, or a terminal device that includes a processor and has a specific processing function. Taking a computer as an example, the computer includes a database, or the computer is connected to a database server that includes a database; a large number of data tables are stored in the database, and the computer can access them. The data query instruction may be SQL that a data developer inputs to the computer and that can query data tables having a certain association relationship in the database. For example, the database of a shopping site stores a data table of member ID, member name, and registration time, as shown in Table 1; a select statement in SQL selects the member ID and member name of members whose registration time is June 2016 from Table 1 and puts the selected data into a new table such as Table 2. The database of the shopping site may also store a data table of member orders, as shown in Table 3; the orders of each month can be selected from Table 3 by a select statement in SQL, and if each order number denotes one order, the orders of June 2016 can be selected from Table 3 by a corresponding select statement and put into a new table such as Table 4. In addition, the join statement of SQL can be used to associate Table 2 with Table 4.
Step S102, determining whether the data query instruction has an association operation.
Whether the SQL includes an association operation may be judged by regular-expression matching on the SQL text or by syntax tree analysis. It may also be determined by whether the SQL contains a keyword of the association operation: in SQL, the keyword join identifies an association operation, and, without loss of generality, other command symbols or statements may also identify association operations. In addition, as shown in Fig. 1, if the SQL does not include an association operation, the subsequent steps are not executed.
Step S103, determining whether the data query instruction has an association condition, wherein the association condition is used for associating data tables.
After it is determined that the SQL includes an association operation, it is further determined whether the SQL includes an association condition. This may specifically be implemented as follows: the syntax of the data query instruction is analyzed to obtain a syntax tree, and whether the data query instruction has an association condition is determined according to whether the syntax tree includes a target child node, wherein the target child node is used for identifying the association condition.
In this embodiment, the data query instruction may be an SQL query instruction. For the association operation performed on Table 2 and Table 4 in the above embodiment, the SQL query instruction implementing the association may be join on table2.RegistTime = table4.OrderTime, which expresses that table2 and table4 are associated with table2.RegistTime = table4.OrderTime as the association condition, where table2 denotes Table 2 and table4 denotes Table 4 of the above embodiment. An SQL query instruction corresponds to a syntax tree. The tree structure shown in Fig. 2 is a partial structure of the syntax tree corresponding to join on table2.RegistTime = table4.OrderTime; the syntax tree can be obtained by parsing the SQL query instruction according to the prior art. As shown in Fig. 2, the syntax tree includes a TOK_JOIN node. To determine whether the SQL query instruction has an association condition, the syntax tree corresponding to the instruction can be examined: specifically, if the TOK_JOIN node has three child nodes, it can be determined that the SQL query instruction includes an association condition. As shown in Fig. 2, the third child node below the TOK_JOIN node is an "=" symbol, and the two child nodes of the "=" symbol are "." symbols. The left and right child nodes of the left "." symbol are TOK_TABLE_OR_COL and RegistTime, respectively, where TOK_TABLE_OR_COL denotes table2; the left and right child nodes of the right "." symbol are TOK_TABLE_OR_COL and OrderTime, respectively, where TOK_TABLE_OR_COL denotes table4. The third child node of TOK_JOIN therefore means table2.RegistTime = table4.OrderTime, which is consistent with the association condition table2.RegistTime = table4.OrderTime described in the foregoing embodiment. Accordingly, the third child node of TOK_JOIN can serve as the target child node that identifies whether the SQL query instruction has an association condition; in other words, if the TOK_JOIN node has three child nodes, it can be determined that the SQL query instruction has an association condition.
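The syntax-tree check described above can be sketched on a hand-built tree. This is a minimal sketch, not a real SQL parser: the Node class, the helper names, and the TOK_TABREF labels used for the first two children of TOK_JOIN are assumptions made for illustration; only the TOK_JOIN, "=", "." and TOK_TABLE_OR_COL labels follow the description of Fig. 2.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)

def column(table: str, name: str) -> Node:
    # A "." node whose left child is TOK_TABLE_OR_COL (denoting the
    # table) and whose right child is the column name.
    return Node(".", [Node("TOK_TABLE_OR_COL", [Node(table)]), Node(name)])

def find_join_condition(node: Node) -> Optional[Node]:
    """Return the third child of the first TOK_JOIN node found,
    or None if that node has fewer than three children."""
    if node.label == "TOK_JOIN":
        return node.children[2] if len(node.children) >= 3 else None
    for child in node.children:
        found = find_join_condition(child)
        if found is not None:
            return found
    return None

tables = [Node("TOK_TABREF", [Node("table2")]),
          Node("TOK_TABREF", [Node("table4")])]
condition = Node("=", [column("table2", "RegistTime"),
                       column("table4", "OrderTime")])

with_condition = Node("TOK_JOIN", tables + [condition])
without_condition = Node("TOK_JOIN", list(tables))

print(find_join_condition(with_condition).label)
print(find_join_condition(without_condition))
```

In practice the tree would come from the SQL engine's own parser; the sketch only shows how the presence of a third child can be used as the signal.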
In addition, in other embodiments, determining whether the SQL includes the association condition may be further specifically implemented as: analyzing the data query instruction to obtain a plurality of field expressions; and determining whether the data query instruction has an associated condition according to whether a target field expression is included in the field expressions, wherein the target field expression is used for identifying the associated condition.
For example, whether the SQL query instruction includes on can be determined from the literal text of join on table2.RegistTime = table4.OrderTime. If the SQL query instruction includes on, the logical expression following on is the association condition, so whether the SQL query instruction includes an association condition can be determined by whether it contains a specific token such as on. Since on follows the keyword join in this SQL query instruction, the instruction includes an association condition, specifically table2.RegistTime = table4.OrderTime. The association condition is used to associate table2 with table4 according to table2.RegistTime = table4.OrderTime, where RegistTime is a column of table2 and OrderTime is a column of table4.
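The literal-text approach can be sketched with regular expressions. This is an illustrative sketch only: the function names are hypothetical, and the pattern simply takes everything after the first on that follows join, so it does not handle where clauses, multiple joins, or the word on inside string literals.

```python
import re
from typing import Optional

JOIN_RE = re.compile(r"\bjoin\b", re.IGNORECASE)
# Everything after the first standalone "on" that follows "join".
ON_RE = re.compile(r"\bjoin\b.*?\bon\b\s+(.+)", re.IGNORECASE | re.DOTALL)

def has_join(sql: str) -> bool:
    """True if the SQL text contains the keyword join."""
    return JOIN_RE.search(sql) is not None

def association_condition(sql: str) -> Optional[str]:
    """Return the text of the association condition, or None."""
    match = ON_RE.search(sql)
    if match is None:
        return None
    return match.group(1).strip().rstrip(";").strip()

sql = "select * from table2 join table4 on table2.RegistTime = table4.OrderTime"
print(has_join(sql))
print(association_condition(sql))
```

A text match like this is cheap but brittle, which is presumably why the description also offers the syntax-tree route.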
Step S104, if the data query instruction includes the association condition, determining, according to the association condition, whether a query result corresponding to the data query instruction is abnormal.
As can be seen from step S103, the SQL shown in this embodiment has an association condition, specifically table2.RegistTime = table4.OrderTime, so whether the query result of the SQL is abnormal is further predicted from this association condition. Specifically, this can be determined by evaluating the expression of the association condition table2.RegistTime = table4.OrderTime, or by judging the types of the field expressions in the association condition.
Step S105, if the data query instruction does not include the association condition, determining that the number of query results corresponding to the data query instruction is abnormal.
In this embodiment, if the SQL does not include the association condition, a cartesian product will occur during execution of the SQL, which may cause an exception to the number of query results of the SQL.
In this embodiment, the association condition in the data query instruction, such as SQL, is obtained, and whether the query result of the SQL may be abnormal is predicted from the association condition; an early warning is then issued so that a data developer can modify the SQL in advance, which improves the efficiency of detecting data-quantity anomalies.
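The flow of steps S101 to S105 can be sketched as a single function. This is a simplified sketch under stated assumptions: the function name detect and the returned messages are hypothetical, detection is done on the SQL text only, and step S104 is reduced to the simplest case of an equality whose two sides are textually identical (for example 1 = 1).

```python
import re

JOIN = re.compile(r"\bjoin\b", re.IGNORECASE)
ON_CLAUSE = re.compile(r"\bon\b\s+(.+)", re.IGNORECASE | re.DOTALL)
EQUALITY = re.compile(r"([^\s=!<>]+)\s*=\s*([^\s=!<>]+)")

def detect(sql: str) -> str:
    # S102: no association operation, nothing further to check.
    join = JOIN.search(sql)
    if join is None:
        return "no association operation"
    # S103: look for an association condition after the join keyword.
    clause = ON_CLAUSE.search(sql, join.end())
    if clause is None:
        # S105: a join without a condition yields a Cartesian product.
        return "abnormal: no association condition"
    # S104: inspect the association condition itself.
    condition = clause.group(1).strip().rstrip(";").strip()
    equality = EQUALITY.fullmatch(condition)
    if equality and equality.group(1) == equality.group(2):
        return "abnormal: condition always holds"
    return "undetermined from text"

print(detect("select * from table2 join table4 on 1 = 1"))
```

The "undetermined from text" outcome covers conditions such as table2.RegistTime = table4.OrderTime, whose constancy depends on the data rather than on the statement.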
Fig. 3 is a flowchart of a data query anomaly detection method according to a second embodiment of the present application. As shown in Fig. 3, the method includes the following steps:
Step S301, acquiring a data query instruction, wherein the data query instruction is used for querying a data table.
Step S302, determining whether the data query instruction has an association condition, wherein the association condition is used for associating data tables.
Step S301 is the same as step S101, and step S302 is the same as step S103; the detailed method is not repeated here.
Step S303, if the data query instruction includes the association condition, determining, according to the association condition, whether the number of query results corresponding to the data query instruction is abnormal.
As can be seen from step S103, the SQL shown in this embodiment has an association condition, specifically table2.RegistTime = table4.OrderTime, so whether the number of query results of the SQL is abnormal is further predicted from this association condition. Specifically, this may be determined by evaluating the expression of the association condition table2.RegistTime = table4.OrderTime, or by judging the types of the field expressions in the association condition.
For the association condition table2.RegistTime = table4.OrderTime, the two sides table2.RegistTime and table4.OrderTime differ in form. As can be seen from Tables 2 and 4, however, table2.RegistTime and table4.OrderTime both take the value June 2016, and June 2016 is the only value of each. Both sides of the equation in the association condition are therefore June 2016, and the values of table2.RegistTime and table4.OrderTime are always equal during the association of Table 2 and Table 4. Since the association condition table2.RegistTime = table4.OrderTime always holds during the association, a Cartesian product occurs when Table 2 and Table 4 are associated through "registration time" and "order time", and the result of the association operation is shown in Table 5. "Member ID", "member name", and "registration time" are the column names of Table 2, and each column contains 4 values, i.e., Table 2 contains 4 rows of data. "Order number" and "order time" are the column names of Table 4, and each column contains 3 values, i.e., Table 4 contains 3 rows of data. After the Cartesian product occurs, the data table formed by associating Table 2 and Table 4 contains 4 × 3 = 12 rows of data, as shown in Table 5.
For the internet system, the data included in tables 1 and 3 may be thousands of data, and the data included in tables 2 and 4 may also be thousands of data, and after two tables including thousands of rows of data are associated and generate cartesian products, the amount of data included in the data table formed by association becomes abnormally large, which may be a catastrophic result for the internet system.
Without loss of generality, if the data table a includes m rows of data and the data table b includes n rows of data, after a cartesian product occurs, the data table formed by associating the data table a and the data table b will include m × n rows of data.
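The m × n growth can be reproduced with a small experiment. The following is only an illustrative sketch using an in-memory SQLite database; the table names a and b, the column t, and the function joined_row_count are hypothetical and are not part of the described method.

```python
import sqlite3


def joined_row_count(m: int, n: int) -> int:
    """Join an m-row table with an n-row table on a condition that always holds."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (id INTEGER, t TEXT)")
    conn.execute("CREATE TABLE b (id INTEGER, t TEXT)")
    # Every row carries the same constant, mirroring a column whose only value is June 2016.
    conn.executemany("INSERT INTO a VALUES (?, '201606')", [(i,) for i in range(m)])
    conn.executemany("INSERT INTO b VALUES (?, '201606')", [(i,) for i in range(n)])
    # The join condition is true for every pair of rows, so the join degenerates
    # into a Cartesian product.
    (count,) = conn.execute("SELECT COUNT(*) FROM a JOIN b ON a.t = b.t").fetchone()
    conn.close()
    return count
```

With a 4-row table and a 3-row table, as in Table 2 and Table 4, joined_row_count(4, 3) counts 12 joined rows.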
In this embodiment, the association condition in the data query instruction, such as SQL, is acquired, and whether the number of query results of the SQL is likely to be abnormal is predicted according to the association condition. An abnormal number of query results reflects an abnormal query result of the SQL, and an early warning is issued so that a data developer can modify the SQL in advance, which improves the efficiency of data quantity anomaly detection.
Fig. 4 is a flowchart of a data query anomaly detection method provided in the third embodiment of the present application, and as shown in fig. 4, the method includes the following steps:
step S401, a data query instruction is obtained, and the data query instruction is used for querying a data table.
Step S402, determining whether the data query instruction has an associated condition, wherein the associated condition is used for associating a data table.
Step S401 is the same as step S101, and step S402 is the same as step S102, and the detailed method is not described here again.
Step S403, if the data query instruction includes the association condition, acquiring a relationship symbol in the association condition and field expressions on the left and right sides of the relationship symbol.
As can be seen from step S101, the association condition included in the SQL is table2.RegistTime = table4.OrderTime, the relation symbol in the association condition is the equal sign "=", and the field expressions on its left and right sides are table2.RegistTime and table4.OrderTime, respectively.
In addition, in other embodiments, the association condition may also be an inequality expression such as table2.RegistTime <> 'January 2016', in which case the relation symbol in the association condition is the inequality sign "<>".
In addition, in other embodiments, the association condition may also be a containment expression such as a.id in ('bbb'), in which case the relation symbol in the association condition is the containment symbol "in", where a may be any table name and id may be a column name of table a.
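Splitting an association condition into its relation symbol and its two field expressions can be sketched as follows. This is a simplified illustration that assumes the condition is available as a single string containing exactly one of the three relation symbols mentioned above; the names Condition and split_condition are hypothetical, and a real implementation would more likely work on the syntax tree.

```python
import re
from typing import NamedTuple, Optional


class Condition(NamedTuple):
    left: str
    symbol: str
    right: str


# "<>" is listed before "=", and "in" is matched only as a whole word.
_PATTERN = re.compile(
    r"^\s*(?P<left>.+?)\s*(?P<symbol><>|=|\bin\b)\s*(?P<right>.+?)\s*$",
    re.IGNORECASE,
)


def split_condition(text: str) -> Optional[Condition]:
    """Return the left expression, relation symbol and right expression, or None."""
    match = _PATTERN.match(text)
    if match is None:
        return None
    return Condition(match["left"], match["symbol"].lower(), match["right"])
```

For example, split_condition("a.id in ('bbb')") yields the left expression a.id, the symbol in, and the right expression ('bbb').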
Step S404, determining whether the quantity of the query results corresponding to the data query instruction is abnormal or not according to the relationship symbol and the field expressions on the left side and the right side of the relationship symbol.
For the association condition table2.RegistTime = table4.OrderTime, this embodiment predicts whether the number of query results of the SQL will be abnormal when the SQL is executed, according to the equal sign "=" and the field expressions table2.RegistTime and table4.OrderTime on its two sides.
In addition, in other embodiments, whether the number of query results of the SQL will be abnormal when the SQL is executed may be predicted according to the inequality sign "<>" and the field expressions table2.RegistTime and 'January 2016' on its left and right sides.
In addition, in other embodiments, whether the number of query results of the SQL will be abnormal when the SQL is executed may be predicted according to the containment symbol "in" and the field expressions a.id and 'bbb' on its two sides.
In this embodiment, the relation symbol in the association condition and the field expressions on its left and right sides are obtained, and whether the number of query results of the SQL is likely to be abnormal is predicted from them. An abnormal number of query results reflects an abnormal query result of the SQL, so that a data developer can modify the SQL in advance, which improves the efficiency of data quantity anomaly detection.
Fig. 5 is a flowchart of a data query anomaly detection method according to a fourth embodiment of the present application, and as shown in fig. 5, the method includes the following steps:
step S501, a data query instruction is obtained, and the data query instruction is used for querying a data table.
Step S502, determining whether the data query instruction has an associated condition, wherein the associated condition is used for associating a data table.
Step S501 is the same as step S101, and step S502 is the same as step S102, and the detailed method is not repeated here.
Step S503, if the data query instruction includes the association condition, obtaining a relationship symbol in the association condition and field expressions on the left and right sides of the relationship symbol.
Step S503 is the same as step S403, and the detailed method is not described herein again.
Step S504, determining whether the association condition is always satisfied according to the relationship symbol and the field expressions on the left side and the right side of the relationship symbol.
For the association condition table2.RegistTime = table4.OrderTime, this embodiment determines whether the condition always holds according to the equal sign "=" and the field expressions table2.RegistTime and table4.OrderTime on its left and right sides.
In addition, in other embodiments, whether the association condition table2.RegistTime <> 'January 2016' always holds may be determined according to the inequality sign "<>" and the field expressions table2.RegistTime and 'January 2016' on its two sides. As can be seen from Table 2, the only value of table2.RegistTime is 'June 2016', which is not 'January 2016'. The probability that table2.RegistTime is 'January 2016' is therefore 0, so table2.RegistTime <> 'January 2016' can be determined to always hold.
Further, in other embodiments, whether the association condition a.id in ('bbb') always holds may be determined according to the containment symbol "in" and the field expressions a.id and 'bbb' on its two sides.
Step S505, if the association condition is always satisfied, determining that the number of query results corresponding to the data query instruction is abnormal.
In this embodiment, if the association condition table2.RegistTime = table4.OrderTime always holds, a Cartesian product occurs when the SQL is executed, which makes the number of query results corresponding to the SQL abnormal.
In addition, in other embodiments, if the association condition table2.RegistTime <> 'January 2016' always holds, a Cartesian product occurs when the SQL is executed, which makes the number of query results corresponding to the SQL abnormal.
In addition, in other embodiments, if the association condition a.id in ('bbb') always holds, a Cartesian product occurs when the SQL is executed, which makes the number of query results corresponding to the SQL abnormal.
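One way to picture the "always holds" decision for the three relation symbols is to compare the sets of values each side can take. The sketch below is an assumption made for illustration only: the text does not prescribe this representation, and the function always_holds and its value-set parameters are hypothetical.

```python
from typing import Set


def always_holds(symbol: str, left_values: Set[str], right_values: Set[str]) -> bool:
    """Return True if the condition is true for every pairing of possible values."""
    if not left_values or not right_values:
        # Nothing is known about one side, so no warning is raised.
        return False
    if symbol == "=":
        # Equality holds for every pairing only when both sides are the same single value.
        return len(left_values) == 1 and left_values == right_values
    if symbol == "<>":
        # Inequality holds for every pairing when the sides share no value.
        return left_values.isdisjoint(right_values)
    if symbol == "in":
        # Containment holds when every left value appears in the right-hand list.
        return left_values <= right_values
    raise ValueError(f"unsupported relation symbol: {symbol}")
```

Under this model, a column whose only value is '201606' compared with '201601' using "<>" is reported as always holding, which matches the reasoning given for Table 2.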
In this embodiment, the relation symbol in the association condition and the field expressions on its left and right sides are obtained, and whether the association condition always holds is determined from them. If the association condition always holds, a Cartesian product occurs when the SQL is executed, which makes the number of query results corresponding to the SQL abnormal. An early warning is issued so that a data developer can modify the SQL in advance, which improves the efficiency of data quantity anomaly detection.
Fig. 6 is a flowchart of a data query anomaly detection method according to a fifth embodiment of the present application, and as shown in fig. 6, the method includes the following steps:
step S601, a data query instruction is obtained, and the data query instruction is used for querying a data table.
Step S602, determining whether the data query instruction has an associated condition, wherein the associated condition is used for associating a data table.
Step S601 is the same as step S101, and step S602 is the same as step S102, and the detailed method is not repeated here.
Step S603, if the data query instruction includes the association condition, obtaining a relationship symbol in the association condition and field expressions on the left and right sides of the relationship symbol.
Step S603 is the same as step S403, and the detailed method is not described herein again.
Optionally, the relation symbol is an equal sign, for example, table2.RegistTime = table4.OrderTime.
Step S604, whether the field expressions on the left side and the right side of the equal sign are the same or not is determined.
In the present embodiment, for the association condition table2.RegistTime = table4.OrderTime, it is determined whether the field expressions table2.RegistTime and table4.OrderTime on the left and right sides of the equal sign "=" are the same.
Step S605, if the field expressions on the left and right sides of the equal sign are the same, determining that the association condition always holds.
Formally, table2.RegistTime and table4.OrderTime are different, but both represent June 2016 and that value is fixed, so the association condition always holds.
In addition, in other embodiments, the association condition may also be a.id = a.id or join on 1 = 1. Since the field expressions on the left and right sides of the equal sign are both a.id, or both 1, the association condition a.id = a.id or join on 1 = 1 always holds.
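The comparison of the two sides of the equal sign can be sketched as a plain text comparison. This is a minimal sketch that assumes the field expressions are available as strings and that differences in whitespace are insignificant; the helper name same_expression is hypothetical.

```python
def same_expression(left: str, right: str) -> bool:
    """Compare two field expressions textually, ignoring whitespace."""
    def normalize(expr: str) -> str:
        # Remove all whitespace so that "a . id" and "a.id" compare equal.
        return "".join(expr.split())

    return normalize(left) == normalize(right)
```

With this check, both a.id = a.id and 1 = 1 are recognised as having identical sides, while table2.RegistTime = table4.OrderTime is not, which is why a separate check on constants is useful.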
Step S606, if the association condition is always satisfied, determining that the number of the query results corresponding to the data query instruction is abnormal.
If the association condition always holds, a Cartesian product occurs during the association operation, and the number of query results generated after the association operation is executed increases abnormally, which makes the number of query results abnormal.
In this embodiment, the relation symbol in the association condition and the field expressions on its left and right sides are obtained. If the relation symbol is an equal sign, it is determined whether the field expressions on the two sides of the equal sign are the same. If they are the same, the association condition is determined to always hold, a Cartesian product occurs during the association operation, and the number of query results generated after the association operation is predicted to increase abnormally. An early warning is issued so that a data developer can modify the SQL in advance, which improves the efficiency of data quantity anomaly detection.
Fig. 7 is a flowchart of a data query anomaly detection method according to a sixth embodiment of the present application, and as shown in fig. 7, the method includes the following steps:
step S701, a data query instruction is obtained, and the data query instruction is used for querying a data table.
Step S702, determining whether the data query instruction has an associated condition, wherein the associated condition is used for associating a data table.
Step S701 is the same as step S101, and step S702 is the same as step S102, and the detailed method is not repeated here.
Step S703, if the data query instruction includes the association condition, obtaining a relationship symbol in the association condition and field expressions on the left and right sides of the relationship symbol.
Step S703 is the same as step S403, and the detailed method is not described herein again.
Optionally, the relation symbol is an equal sign, for example, table2.RegistTime = table4.OrderTime.
Step S704, determining whether the field expressions on the left side and the right side of the equal sign point to the same constant.
For the association condition table2.RegistTime = table4.OrderTime, the field expressions table2.RegistTime and table4.OrderTime are formally different, so it is determined whether they point to the same constant. Specifically, this is determined from the syntax tree shown in Fig. 2. As shown in Fig. 2, the third child node of the node TOK_JOIN represents the relational operator of "table2.RegistTime = table4.OrderTime". An in-order traversal is performed on the left subtree of this child node, and the text contents of all nodes other than the TOK_TABLE_OR_COL nodes are spliced in sequence to obtain the character string table2.RegistTime. The right subtree is traversed in the same way to obtain its character string.
As shown in Fig. 2, the syntax tree is used to query the constant represented by table2.RegistTime. The specific query method comprises the following steps:
1) table2.RegistTime is first split into table2, which is the alias of an SQL subquery, and RegistTime, which is the field name.
2) Among all the TOK_SUBQUERY child nodes under TOK_JOIN, the TOK_SUBQUERY node whose right child node has the text content table2 is found.
3) In the subtree of that TOK_SUBQUERY node, if there is a TOK_SELEXPR node under TOK_INSERT whose right child node has the text content RegistTime and whose left child node is a constant, the value of that constant is the value represented by RegistTime; otherwise, RegistTime does not represent a constant value. In the syntax tree shown in Fig. 2, the value represented by RegistTime is found to be the constant '201606', which represents June 2016.
Following the same steps 1), 2) and 3), the constant represented by table4.OrderTime can be queried, and it is also the constant '201606'.
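Steps 1) to 3) can be sketched on a simplified stand-in for the syntax tree. The sketch below assumes each subquery has already been reduced to a mapping from output field name to either a constant or a column reference; this dictionary layout and the names resolve_constant and point_to_same_constant are hypothetical and do not reproduce the actual TOK_* node structure of Fig. 2.

```python
from typing import Dict, Optional, Tuple

# alias -> {output field name -> (kind, value)}, where kind is "const" or "column".
SelectList = Dict[str, Tuple[str, str]]


def resolve_constant(expr: str, subqueries: Dict[str, SelectList]) -> Optional[str]:
    """Return the constant that "alias.field" stands for, or None if it is not a constant."""
    alias, _, field = expr.partition(".")   # step 1: split subquery alias and field name
    select_list = subqueries.get(alias)     # step 2: locate the subquery with that alias
    if select_list is None or field not in select_list:
        return None
    kind, value = select_list[field]        # step 3: inspect the matching select item
    return value if kind == "const" else None


def point_to_same_constant(left: str, right: str,
                           subqueries: Dict[str, SelectList]) -> bool:
    """True only if both field expressions resolve to one and the same constant."""
    left_const = resolve_constant(left, subqueries)
    return left_const is not None and left_const == resolve_constant(right, subqueries)
```

If both table2.RegistTime and table4.OrderTime are select items defined as the constant '201606', point_to_same_constant reports that they point to the same constant even though the two expressions differ in form.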
Step S705, if the field expressions on the left and right sides of the equal sign point to the same constant, determining that the association condition is always true.
As can be seen from step S704, the field expressions table2.RegistTime and table4.OrderTime on the left and right sides of the equal sign point to the same constant '201606', so the association condition is determined to always hold.
Step S706, if the association condition is constantly satisfied, determining that the number of query results corresponding to the data query instruction is abnormal.
If the association condition always holds, a Cartesian product occurs during the association operation, and the data size of the query result of the SQL is predicted to be abnormal.
In this embodiment, whether the association condition always holds is determined by detecting whether the field expressions on the left and right sides of its equal sign point to the same constant. The two field expressions may differ in form yet point to the same constant. If they do, they are essentially the same and the association condition always holds. This avoids misjudging the association condition merely because the two field expressions differ in form, and improves the accuracy of data quantity anomaly detection.
Fig. 8 is a flowchart of a data query anomaly detection method according to a seventh embodiment of the present application, and as shown in fig. 8, the method includes the following steps:
step S801, acquiring a data query instruction, wherein the data query instruction is used for querying a data table.
Step S802, determining whether the data query instruction has an associated condition, wherein the associated condition is used for associating a data table.
Step S801 is the same as step S101, and step S802 is the same as step S102, and the detailed method is not repeated here.
Step S803, if the data query instruction includes the association condition, obtaining a relationship symbol in the association condition and field expressions on the left and right sides of the relationship symbol.
Step S803 is the same as step S403, and the detailed method is not described here again.
Step S804, determining whether the types of the field expressions on the left side and the right side of the relation symbol are the same according to the field expressions on the left side and the right side of the relation symbol.
For example, for the association condition table2.RegistTime = table4.OrderTime, the relation symbol is the equal sign "=", and the field expressions on its left and right sides are table2.RegistTime and table4.OrderTime, respectively. The types of table2.RegistTime and table4.OrderTime may be the same or may be different.
Step S805, if the types of the field expressions on the left and right sides of the relation symbol are different, performing type conversion on the types of the field expressions on the left and right sides of the relation symbol.
If the field expressions table2.RegistTime and table4.OrderTime on the left and right sides of the equal sign "=" differ in type, type conversion is performed on them so that the converted types are the same.
Optionally, type conversion is performed on the types of the field expressions on the left and right sides of the relational symbol, which may be implemented to convert the types of the field expressions on the left and right sides of the relational symbol into the same target type, specifically, conversion may be performed according to the corresponding relationship shown in table 1:
TABLE 1
Type      Bigint   Double   String    Datetime  Boolean  Decimal
Bigint    -        Double   Double    N         N        Decimal
Double    Double   -        Double    N         N        Decimal
String    Double   Double   -         Datetime  N        Decimal
Datetime  N        N        Datetime  -         N        N
Boolean   N        N        N         N         -        N
Decimal   Decimal  Decimal  Decimal   N         N        -
In Table 1, Bigint represents a long integer type, Double represents a double-precision floating-point type, String represents a string type, Datetime represents a time type, Boolean represents a Boolean type, and Decimal represents a decimal type. "N" indicates that type conversion cannot be performed between the two data types, and "-" indicates that no type conversion is needed between the two data types.
The conversion relationship between different data types is not limited to the type conversion relationship shown in table 1, and the type conversion relationship shown in table 1 is only one of a plurality of types of conversion relationships. In other embodiments, the conversion relationships between different data types may also be different from the type conversion relationships shown in Table 1.
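The correspondence in Table 1 can be held in a lookup structure. The sketch below is illustrative only: it assumes the diagonal cells of Table 1 are the "-" entries described in the legend, and the names TYPES, _ROWS and target_type are hypothetical.

```python
from typing import Optional

TYPES = ["Bigint", "Double", "String", "Datetime", "Boolean", "Decimal"]

# Each row lists the target type for the row type paired with each column type,
# in the order given by TYPES. "N" means not convertible, "-" means no conversion needed.
_ROWS = {
    "Bigint":   ["-", "Double", "Double", "N", "N", "Decimal"],
    "Double":   ["Double", "-", "Double", "N", "N", "Decimal"],
    "String":   ["Double", "Double", "-", "Datetime", "N", "Decimal"],
    "Datetime": ["N", "N", "Datetime", "-", "N", "N"],
    "Boolean":  ["N", "N", "N", "N", "-", "N"],
    "Decimal":  ["Decimal", "Decimal", "Decimal", "N", "N", "-"],
}


def target_type(left: str, right: str) -> Optional[str]:
    """Return the common type both sides are converted to, or None if not convertible."""
    cell = _ROWS[left][TYPES.index(right)]
    if cell == "-":
        return left          # same type on both sides, nothing to convert
    if cell == "N":
        return None          # the two types cannot be converted to a common type
    return cell
```

For instance, a Bigint field compared with a String field is converted to Double, while a Boolean field cannot be compared with a Double field under this table.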
Assume that table2.RegistTime represents 07:30 on June 7, 2016, which may be stored in the table as the numeric value 201606070730, and that table4.OrderTime represents 07:30:55 and 200 milliseconds on June 7, 2016, which may be stored in the table as the numeric value 20160607073055200. When table2.RegistTime takes the value 201606070730 and table4.OrderTime takes the value 20160607073055200, the two are not equal. Because table4.OrderTime has more digits than table2.RegistTime, the two values need to be converted to the same number of digits before they are compared. Optionally, the 55 seconds and 200 milliseconds are ignored, in which case the value of table4.OrderTime becomes 201606070730, which equals the value of table2.RegistTime. It can be seen that two variables that were originally unequal become equal after the conversion to the same number of digits; that is, an association condition that originally did not hold comes to always hold after the conversion. In Internet data storage, data type conversion occurs frequently, for example when Double data is converted to String data. Data of the two types may be unequal before conversion but become equal once converted to the same type, and if both values are constants, the association condition may change from not holding to always holding.
Step S806, determining whether the converted field expressions lose precision.
In this embodiment, if a field expression loses precision during conversion, the association condition may not hold before the conversion but hold after it, or hold before the conversion but not hold after it.
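The equal-digit conversion in this example can be checked with a few lines of arithmetic. This is a minimal sketch; the helper truncate_to_digits is a hypothetical name, and dropping trailing digits is only the optional conversion mentioned in the text.

```python
def truncate_to_digits(value: int, digits: int) -> int:
    """Keep only the leading `digits` decimal digits of a non-negative integer."""
    extra = len(str(value)) - digits
    return value if extra <= 0 else value // 10 ** extra


regist_time = 201606070730        # 07:30 on June 7, 2016 (12 digits)
order_time = 20160607073055200    # 07:30:55.200 on June 7, 2016 (17 digits)

# The stored values differ, but dropping the seconds and milliseconds makes them equal.
unequal_before = regist_time != order_time
equal_after = truncate_to_digits(order_time, len(str(regist_time))) == regist_time
```

Both unequal_before and equal_after evaluate to True, which is exactly the situation in which a condition that did not hold before conversion holds afterwards.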
Step S807, determining that the number of query results corresponding to the data query instruction is abnormal if the association condition is satisfied after performing type conversion on the types of the field expressions on the left and right sides of the relational symbol.
The maximum length of String data may be 8 MB. In other embodiments, String data is therefore longer than Double data and needs to be truncated when it is converted to the Double type. The truncated String data may be the same as the Double data; that is, String data that differs from the Double data before type conversion may be the same as the Double data after type conversion. The amount of data associated in this way is larger than the real amount of data, so the number of query results is abnormal.
In this embodiment, the types of the field expressions on the two sides of the equal sign of the association condition are compared, and field expressions of two different types are converted to the same type. Two field expressions that are unequal before type conversion may become equal after it. By detecting this change, it is determined that the amount of associated data is larger than the real amount of data, which makes the number of query results abnormal, and the accuracy of data quantity anomaly detection is further improved.
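A related effect can be observed with ordinary double-precision floats. The sketch below uses rounding in a string-to-float cast as a stand-in for the truncation described above; it is an illustration under that assumption, and the function cast_creates_match is a hypothetical name.

```python
from decimal import Decimal


def cast_creates_match(string_value: str, double_value: float) -> bool:
    """True if the values differ exactly but compare equal once the string is cast to Double."""
    # Decimal represents both values exactly, so this is the comparison before any precision loss.
    differ_before = Decimal(string_value) != Decimal(double_value)
    # float() rounds the string to the nearest representable double.
    equal_after = float(string_value) == double_value
    return differ_before and equal_after
```

The string "9007199254740993" (2^53 + 1) has no exact double representation, so casting it to Double yields 9007199254740992.0; rows that should not match then do match, which inflates the number of associated rows.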
A data query abnormality detection apparatus according to one or more embodiments of the present application will be described in detail below. These data query abnormality detection apparatuses may be implemented in a server. Those skilled in the art will appreciate that these apparatuses can be constructed from commercially available hardware components configured to perform the steps taught in the present solution. For example, the processor components (or processing modules, processing units) may be single-chip microcomputers, microcontrollers, microprocessors and the like from Texas Instruments, Intel Corporation, ARM and other companies.
Fig. 9 is a schematic structural diagram of a data query anomaly detection apparatus according to an embodiment of the present application, as shown in fig. 9, the apparatus includes: an acquisition module 11, a first determination module 12, and a second determination module 13.
The obtaining module 11 is configured to obtain a data query instruction, where the data query instruction is used to query a data table;
a first determining module 12, configured to determine whether there is an association condition in the data query instruction, where the association condition is used to associate a data table;
a second determining module 13, configured to, when the first determining module determines that the data query instruction includes the association condition, determine whether a query result corresponding to the data query instruction is abnormal according to the association condition.
The apparatus shown in fig. 9 may execute the data query abnormality detection method described in the embodiment shown in fig. 1, and the implementation principle and the technical effect are not described again.
Fig. 10 is a schematic structural diagram of the data query anomaly detection apparatus according to the second embodiment of the present application, and as shown in fig. 10, the obtaining module 11 is further configured to obtain a relationship symbol in the association condition and field expressions on the left and right sides of the relationship symbol; the second determining module 13 is specifically configured to determine whether the number of query results corresponding to the data query instruction is abnormal according to the relationship symbol and the field expressions on the left and right sides of the relationship symbol.
The second determination module 13 includes: a first determining unit 131 and a second determining unit 132.
The first determining unit 131 is configured to determine whether the association condition is always satisfied according to the relationship symbol and the field expressions on the left and right sides of the relationship symbol.
A second determining unit 132, configured to determine that an exception occurs in the number of query results corresponding to the data query instruction if the association condition is constantly satisfied.
The apparatus shown in fig. 10 may execute the data query abnormality detection method described in the embodiment shown in fig. 3, and the implementation principle and the technical effect are not described again.
On the basis of the embodiment shown in fig. 10, the relation symbol is an equal sign; the first determining unit 131 is specifically configured to determine whether the field expressions on the left and right sides of the equal sign are the same; and if the field expressions on the left and right sides of the equal sign are the same, determine that the association condition always holds.
Or, the relation symbol is an equal sign; the first determining unit 131 is specifically configured to determine whether the field expressions on the left and right sides of the equal sign point to the same constant; and if the field expressions on the left and right sides of the equal sign point to the same constant, determine that the association condition always holds.
The apparatus shown in fig. 10 may execute the data query abnormality detection method described in the embodiments shown in fig. 4, 5, and 6, and the implementation principle and technical effect thereof are not described again.
Fig. 11 is a schematic structural diagram of a data query anomaly detection apparatus according to a third embodiment of the present application, and as shown in fig. 11, the second determining module 13 includes: a first determining unit 131, a second determining unit 132, and a type converting unit 133.
A first determining unit 131, configured to determine whether types of field expressions on left and right sides of the relationship symbol are the same according to the field expressions on the left and right sides of the relationship symbol;
a type conversion unit 133, configured to perform type conversion on the types of the field expressions on the left and right sides of the relational symbol when the types of the field expressions on the left and right sides of the relational symbol are different;
the second determining unit 132 determines that the number of query results corresponding to the data query instruction is abnormal when the association condition is not satisfied before the type conversion is performed on the types of the left and right field expressions of the relational symbol and the association condition is satisfied after the type conversion is performed on the types of the left and right field expressions of the relational symbol.
The apparatus shown in fig. 11 may execute the data query abnormality detection method described in the embodiments shown in fig. 7 and 8, and the implementation principle and technical effect thereof are not described again.
With regard to the data query abnormality detection apparatus in the above embodiment, the specific manner in which each module and unit performs operations has been described in detail in the embodiment related to the method, and will not be elaborated here.
Having described the internal functions and structure of the data query abnormality detection apparatus as described above, as shown in fig. 12, in practice, the data query abnormality detection apparatus may be implemented as a server including: a processing unit and a storage unit.
The storage unit is used for storing a data table.
The processing unit is coupled to the storage unit and used for acquiring a data query instruction, and the data query instruction is used for querying a data table in the storage unit; determining whether the data query instruction has an associated condition, wherein the associated condition is used for associating a data table; and if the data query instruction comprises the association condition, determining whether a query result corresponding to the data query instruction is abnormal or not according to the association condition.
In the embodiment, by acquiring the association condition in the data query instruction, such as SQL, and predicting whether the query result of the SQL may be abnormal according to the association condition, a pre-warning is performed to allow a data developer to modify the SQL in advance, so that the efficiency of detecting the data quantity abnormality is improved.
Fig. 13 is a block diagram of a server according to another embodiment of the present application, and as shown in fig. 13, the server 1900 includes a processing component 1922, which further includes one or more processors and memory resources, represented by memory 1932, for storing instructions, such as application programs, that are executable by the processing component 1922. The application programs stored in memory 1932 may include one or more modules that each correspond to a set of instructions. Further, the processing component 1922 is configured to execute instructions to perform the methods of steps S101-S807 described above.
The device 1900 may also include a power component 1926 configured to perform power management of the device 1900, a wired or wireless network interface 1950 configured to connect the device 1900 to a network, and an input/output (I/O) interface 1958. The device 1900 may operate based on an operating system stored in memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In addition, an embodiment of the present application further provides a data query anomaly detection system, which includes the data query anomaly detection device shown in Figs. 9 to 11, or the server shown in Fig. 12 or Fig. 13.
As in the preceding embodiments, the system acquires the association condition in a data query instruction, such as an SQL statement, and predicts from it whether the query result may be abnormal, giving an early warning so that the data developer can modify the SQL in advance and improving the efficiency of detecting data-volume anomalies.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.

Claims (15)

1. A method for data query anomaly detection, comprising:
acquiring a data query instruction, wherein the data query instruction is used for querying a data table;
determining whether the data query instruction has an association condition, wherein the association condition is used for associating data tables;
if the data query instruction comprises the association condition, determining, according to the association condition, whether a query result corresponding to the data query instruction is abnormal;
wherein the determining, according to the association condition, whether the number of query results corresponding to the data query instruction is abnormal comprises:
acquiring a relational symbol in the association condition and the field expressions on the left side and the right side of the relational symbol;
determining, according to the relational symbol and the field expressions on the left side and the right side of the relational symbol, whether the association condition is always satisfied;
and if the association condition is always satisfied, determining that the number of query results corresponding to the data query instruction is abnormal.
2. The method according to claim 1, wherein the determining whether the query result corresponding to the data query instruction is abnormal according to the association condition comprises:
and determining whether the quantity of the query results corresponding to the data query instruction is abnormal or not according to the association condition.
3. The method of claim 1, wherein the relational symbol is an equals sign;
the determining, according to the relational symbol and the field expressions on the left side and the right side of the relational symbol, whether the association condition is always satisfied comprises:
determining whether the field expressions on the left side and the right side of the equals sign are the same;
and if the field expressions on the left side and the right side of the equals sign are the same, determining that the association condition is always satisfied.
4. The method of claim 1, wherein the relational symbol is an equals sign;
the determining, according to the relational symbol and the field expressions on the left side and the right side of the relational symbol, whether the association condition is always satisfied comprises:
determining whether the field expressions on the left side and the right side of the equals sign point to the same constant;
and if the field expressions on the left side and the right side of the equals sign point to the same constant, determining that the association condition is always satisfied.
5. The method according to claim 1, wherein the determining, according to the relational symbol and the field expressions on the left side and the right side of the relational symbol, whether the number of query results corresponding to the data query instruction is abnormal comprises:
determining, according to the field expressions on the left side and the right side of the relational symbol, whether the types of those field expressions are the same;
if the types of the field expressions on the left side and the right side of the relational symbol are different, performing type conversion on the field expressions on the left side and the right side of the relational symbol;
and if the association condition holds after the type conversion is performed on the field expressions on the left side and the right side of the relational symbol, determining that the number of query results corresponding to the data query instruction is abnormal.
6. The method of claim 5, wherein the type converting the type of the field expression on the left and right sides of the relational symbol comprises:
and converting the types of the field expressions on the left side and the right side of the relational symbol into the same target type.
7. The method of claim 1, further comprising, after the determining whether the data query instruction has an association condition:
if the data query instruction does not include the association condition, determining that the number of query results corresponding to the data query instruction is abnormal.
8. The method according to any one of claims 1-7, wherein the determining whether the data query instruction has an association condition comprises:
performing syntax analysis on the data query instruction to obtain a syntax tree;
and determining whether the data query instruction has an association condition according to whether the syntax tree comprises a target child node, wherein the target child node is used for identifying the association condition.
9. The method according to any one of claims 1-7, wherein the determining whether the data query instruction has an association condition comprises:
parsing the data query instruction to obtain a plurality of field expressions;
and determining whether the data query instruction has an association condition according to whether the field expressions include a target field expression, wherein the target field expression is used for identifying the association condition.
10. A data query anomaly detection device, characterized by comprising:
an obtaining module, configured to obtain a data query instruction, wherein the data query instruction is used for querying a data table;
a first determining module, configured to determine whether the data query instruction has an association condition, wherein the association condition is used for associating data tables;
a second determining module, configured to determine, when the first determining module determines that the data query instruction includes the association condition, whether a query result corresponding to the data query instruction is abnormal according to the association condition;
wherein the obtaining module is further configured to obtain a relational symbol in the association condition and the field expressions on the left side and the right side of the relational symbol;
the second determining module includes:
a first determining unit, configured to determine, according to the relational symbol and the field expressions on the left side and the right side of the relational symbol, whether the association condition is always satisfied;
and a second determining unit, configured to determine that the number of query results corresponding to the data query instruction is abnormal if the association condition is always satisfied.
11. The data query anomaly detection device according to claim 10, wherein the relational symbol is an equals sign;
the first determining unit is specifically configured to determine whether the field expressions on the left side and the right side of the equals sign are the same, and, if the field expressions on the left side and the right side of the equals sign are the same, to determine that the association condition is always satisfied.
12. The data query anomaly detection device according to claim 10, wherein the relational symbol is an equals sign;
the first determining unit is specifically configured to determine whether the field expressions on the left side and the right side of the equals sign point to the same constant, and, if the field expressions on the left side and the right side of the equals sign point to the same constant, to determine that the association condition is always satisfied.
13. The data query anomaly detection device according to claim 10, wherein said second determination module comprises:
a first determining unit, configured to determine whether types of field expressions on left and right sides of the relationship symbol are the same according to the field expressions on the left and right sides of the relationship symbol;
the type conversion unit is used for performing type conversion on the types of the field expressions on the left side and the right side of the relational symbol when the types of the field expressions on the left side and the right side of the relational symbol are different;
and the second determining unit is used for determining that the number of the query results corresponding to the data query instruction is abnormal when the association condition is not established before the type conversion is performed on the types of the field expressions on the left side and the right side of the relational symbol and the association condition is established after the type conversion is performed on the types of the field expressions on the left side and the right side of the relational symbol.
14. A server, comprising: a processing unit and a storage unit;
the storage unit is used for storing a data table;
the processing unit is coupled to the storage unit and is configured to: acquire a data query instruction, wherein the data query instruction is used for querying a data table in the storage unit; determine whether the data query instruction has an association condition, wherein the association condition is used for associating data tables; and, if the data query instruction comprises the association condition, determine, according to the association condition, whether a query result corresponding to the data query instruction is abnormal;
wherein the determining, according to the association condition, whether the number of query results corresponding to the data query instruction is abnormal comprises:
acquiring a relational symbol in the association condition and the field expressions on the left side and the right side of the relational symbol;
determining, according to the relational symbol and the field expressions on the left side and the right side of the relational symbol, whether the association condition is always satisfied;
and if the association condition is always satisfied, determining that the number of query results corresponding to the data query instruction is abnormal.
15. A data query anomaly detection system, comprising: the data query anomaly detection device according to any one of claims 10-13, or the server according to claim 14.
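
The type-conversion check recited in claims 5, 6 and 13 can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function name `abnormal_after_type_conversion` is hypothetical, both operands are assumed to be constants, and `float` is an assumed target type, since the claims only require that both sides be converted to the same target type.

```python
import ast


def abnormal_after_type_conversion(left, right, target=float):
    """Flag `left = right` when the operands differ in type but are equal once converted.

    `target` is the common type both operands are converted to (an assumption;
    the claims do not name a specific target type).
    """
    try:
        lval, rval = ast.literal_eval(left), ast.literal_eval(right)
    except (ValueError, SyntaxError):
        return False  # not constants; this sketch cannot judge column operands
    if type(lval) is type(rval):
        return False  # same type on both sides: no conversion takes place
    try:
        return target(lval) == target(rval)
    except (TypeError, ValueError):
        return False  # conversion failed, so the condition cannot hold
```

With these assumptions, a condition such as `'1' = 1` is flagged, because the string and the integer become equal once both are converted to the target type, whereas `'a' = 1` is not.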

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610742829.8A CN107784003B (en) 2016-08-26 2016-08-26 Data query anomaly detection method, device, equipment and system

Publications (2)

Publication Number Publication Date
CN107784003A CN107784003A (en) 2018-03-09
CN107784003B true CN107784003B (en) 2021-09-21

Family

ID=61441304

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610742829.8A Active CN107784003B (en) 2016-08-26 2016-08-26 Data query anomaly detection method, device, equipment and system

Country Status (1)

Country Link
CN (1) CN107784003B (en)

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20180418

Address after: P.O. Box 847, fourth floor, Capital Building, Grand Cayman, Cayman Islands

Applicant after: CAINIAO SMART LOGISTICS HOLDING Ltd.

Address before: P.O. Box 847, fourth floor, Capital Building, Grand Cayman, Cayman Islands

Applicant before: ALIBABA GROUP HOLDING Ltd.

GR01 Patent grant