CN112131245A

CN112131245A - High-performance data access system and method of mimicry defense architecture

Info

Publication number: CN112131245A
Application number: CN202011011365.6A
Authority: CN
Inventors: 徐悦; 倪明; 余新胜; 解维
Original assignee: CETC 32 Research Institute
Current assignee: CETC 32 Research Institute
Priority date: 2020-09-23
Filing date: 2020-09-23
Publication date: 2020-12-25

Abstract

The invention provides a high-performance data access system and method for a mimicry defense architecture, comprising: an SQL attack detection module, which integrates an SQL injection detection algorithm in order to detect non-bypass SQL attacks that mimicry cannot defend against, and which discards any statement detected to contain SQL injection; an SQL request processing module, which processes SQL requests and adopts read-write separation to reduce the access pressure on a single node; and a database request processing module, which uses database slicing so that application requirements can be met by subsequently adding nodes. The invention simplifies n database clusters into one database cluster, thereby removing the redundancy of the back-end database. It can efficiently merge the data requests of the heterogeneous executors and addresses the problem of highly concurrent access. In addition, the middleware adds a defense mechanism to compensate for the reduction in security caused by reducing the number of database clusters.

Description

High-performance data access system and method of mimicry defense architecture

Technical Field

The invention relates to the technical field of mimicry defense, in particular to a high-performance data access system and a method of a mimicry defense architecture.

Background

Mimicry defense technology: the Dynamic Heterogeneous Redundancy (DHR) architecture is built around the intrinsic function of the target system, and an integrated defense capability is obtained by introducing a decision mechanism, a policy scheduling mechanism, a negative feedback control mechanism, and a multi-dimensional dynamic reconfiguration mechanism based on a mimic camouflage strategy.

Dynamic heterogeneous redundancy architecture: a closed-loop, iterative, robust control structure with multi-dimensional dynamic reconfiguration, composed of functionally equivalent heterogeneous executors, an input agent, an output agent, a decision strategy, a feedback controller, and a scheduler. It is the core architecture of mimicry defense technology and can effectively counter security threats arising from unknown vulnerabilities and backdoors.

A. Distributed transaction model: in a distributed transaction, the transaction participants, the servers supporting the transaction, the resource servers, and the transaction manager are located on different nodes of a distributed system. The X/Open organization proposed the Distributed Transaction Processing (DTP) model together with the XA specification. The model comprises three components: the application program (AP), the resource manager (RM), and the transaction manager (TM); their relationships are shown in Fig. 1. The AP communicates with both the TM and the RM, and the TM and the RM communicate with each other. The DTP model defines the XA interface, through which the TM and the RM communicate bidirectionally.
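By way of a non-limiting illustration, the coordination between a TM and several RMs can be sketched with a two-phase commit, the protocol on which XA is based. The class and method names below (`ResourceManager`, `TransactionManager`, `prepare`, `commit`, `rollback`) are hypothetical and do not reproduce the XA interface itself.

```python
class ResourceManager:
    """Hypothetical RM: stages a write, then commits it or rolls it back."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulates this RM's vote in phase 1
        self.staged = None
        self.committed = []

    def prepare(self, value):
        self.staged = value
        return self.can_commit

    def commit(self):
        self.committed.append(self.staged)
        self.staged = None

    def rollback(self):
        self.staged = None


class TransactionManager:
    """Hypothetical TM: runs a two-phase commit over all RMs."""

    def __init__(self, resource_managers):
        self.rms = resource_managers

    def execute(self, value):
        # Phase 1: every RM prepares and votes.
        votes = [rm.prepare(value) for rm in self.rms]
        if all(votes):
            # Phase 2: unanimous "yes", so commit everywhere.
            for rm in self.rms:
                rm.commit()
            return True
        # Phase 2: at least one "no", so roll back everywhere.
        for rm in self.rms:
            rm.rollback()
        return False
```

In this sketch a single dissenting RM causes every RM to discard its staged write, which is the all-or-nothing property a distributed transaction requires.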

B. Read-write separation: database operations fall into four kinds: insert, delete, update, and query. A master database (master) handles the three write operations (insert, update, and delete), a slave database (slave) handles the read operation (query), and master-slave replication keeps the data on the master and the slaves synchronized. Fig. 2 shows a typical scenario in which an application uses read-write separation: read operations can be distributed across the nodes while write operations are still performed on the master node, which reduces the I/O pressure on the master node and thereby improves the stability of the database system. In MySQL, for example, MySQL Proxy is commonly used as the intermediate layer for read-write separation. It sits between the clients and the back-end database cluster and can provide functions such as creating new commands, load balancing, fault analysis, and filtering and modifying queries.
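By way of a non-limiting illustration, the routing decision behind read-write separation can be sketched as follows. The class name `ReadWriteRouter`, the node labels, and the round-robin choice among slaves are assumptions made for this example; they are not taken from MySQL Proxy or from the embodiment.

```python
import itertools


class ReadWriteRouter:
    """Hypothetical router: queries go to slaves, everything else to the master."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)  # assumed round-robin policy

    def route(self, sql):
        words = sql.split()
        first_word = words[0].upper() if words else ""
        if first_word == "SELECT":
            return next(self._slaves)
        # INSERT, UPDATE, DELETE and any unrecognized statement stay on the master.
        return self.master
```

Sending every statement that is not a plain query to the master is a conservative choice: a statement misclassified as a write only costs some master capacity, whereas a write sent to a slave would break replication.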

C. CAP theory: the CAP theory was first proposed by Eric Brewer in 2000, mainly to describe the practical limits of distributed transactions. As shown in Fig. 3, it covers consistency, availability, and partition tolerance. Consistency means that all nodes read the same data at the same time. Availability means that every request receives a response, whether it succeeds or fails. Partition tolerance means that the system continues to operate normally even if some messages are lost or other faults occur within the system. In a distributed system, however, at most two of these three properties can be fully satisfied at the same time. In terms of Fig. 3, the total area of the circle is constant, so strengthening one or two of the properties comes at the expense of the rest.

D. Database slicing: a database can be sliced in two ways, horizontally and vertically. Horizontal slicing takes the rows of a table as its unit: the data in a table is split across different nodes, which is often used to relieve the pressure on a single database or a single table. When a table holds an excessive amount of data that is still growing, a slicing rule can be bound to the table so that its data is split across several data nodes. Common horizontal slicing rules select the node by the actual value of a field, by the range of a field, or by the hash of a field. Vertical slicing takes the table as its basic unit and divides the tables among different databases, so that the tables of one business are stored together on one data node. Vertical slicing is the simpler of the two methods and suits scenarios in which a database contains too many tables. Because the tables belonging to the same business are gathered on one node, the impact on applications with clear business logic and low coupling is small.
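By way of a non-limiting illustration, the three horizontal slicing rules named above (by field value, by field range, and by field hash) can be sketched as follows. The function names and the use of CRC32 as the hash are assumptions made for this example.

```python
import zlib


def shard_by_value(value, mapping):
    """Route by the actual field value, e.g. a region code mapped to a node."""
    return mapping[value]


def shard_by_range(value, boundaries):
    """Route by range; `boundaries` are ascending exclusive upper bounds."""
    for node, upper in enumerate(boundaries):
        if value < upper:
            return node
    return len(boundaries)  # the last node takes everything beyond the bounds


def shard_by_hash(key, node_count):
    """Route by a stable hash of the key modulo the number of nodes."""
    return zlib.crc32(str(key).encode("utf-8")) % node_count
```

A stable hash such as CRC32 is used here instead of Python's built-in `hash`, because the built-in hash of strings varies between interpreter runs and would send the same key to different nodes after a restart.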

Disclosure of Invention

Aiming at the defects in the prior art, the invention aims to provide a high-performance data access system and method of a mimicry defense architecture.

The invention provides a high-performance data access system of a mimicry defense architecture, which comprises:

the SQL attack detection module: integrates an SQL injection detection algorithm in order to detect non-bypass SQL attacks that mimicry cannot defend against, and discards a statement once it is detected to contain SQL injection;

the SQL request processing module: processes SQL requests and adopts read-write separation to reduce the access pressure on a single node;

the database request processing module: uses database slicing so that application requirements can be met by subsequently adding nodes.

Preferably, the SQL request processing module adopts a read-write separation technique:

one master node corresponds to a plurality of slave nodes, and each slave node keeps a full copy of the data on the master node;

the master node is responsible for write operations and, at the same time, records the written content in its local binary log (binlog);

each slave node is responsible for read operations, establishes a connection with the master node through an I/O thread, and sends a binlog dump command to the master node;

the binlog data returned by the master node is stored in the relay log of the slave node;

and the slave nodes synchronize the data through their SQL threads.

Preferably, the SQL request processing module:

the process flow for each SQL request is as follows:

the client initiates an SQL request, and the SQL request statement is sent to the SQL executor, which forwards it to the SQL parser;

the SQL parser parses the condition information in the statement, obtains the routing information for the SQL request from the SQL router according to the parsing result, and returns the routing information to the SQL executor;

the SQL executor sends the SQL statement to the corresponding nodes for execution, and the execution results are returned to the SQL executor;

and the results of the same SQL request from the heterogeneous executors are gathered at the SQL executor for merging, and the final result is returned to the client.

Preferably, the SQL request processing module:

for the n heterogeneous executors selected by policy scheduling at each stage in the mimicry DHR architecture, a user's read-write request is sent to the database through every heterogeneous executor;

in order to reduce the burden on the database, a queue and an internal judgment module are arranged in the data middleware, and one credible request is selected and sent to the database, so that each database request sent saves the data volume of (n-1) redundant requests;

in the data interaction model, a cache area is set so that data can be read from the cache.
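By way of a non-limiting illustration, the queue and the internal judgment module can be sketched as follows. The text does not state the rule by which a credible request is selected; the strict-majority vote used here, like the class name `RequestMerger` and the request identifier, is an assumption made for this example.

```python
from collections import Counter, defaultdict


class RequestMerger:
    """Hypothetical merger: queues the n copies of a request, forwards one."""

    def __init__(self, executor_count):
        self.n = executor_count
        self.pending = defaultdict(list)  # request id -> SQL strings received

    def submit(self, request_id, sql):
        """Queue one executor's copy; return the SQL to forward, or None."""
        queue = self.pending[request_id]
        queue.append(sql)
        if len(queue) < self.n:
            return None  # still waiting for the remaining executors
        del self.pending[request_id]
        statement, votes = Counter(queue).most_common(1)[0]
        # Assumed judgment rule: forward only if a strict majority agrees.
        return statement if votes > self.n // 2 else None
```

With n = 3, the database receives one statement instead of three, and a copy altered by a compromised executor is outvoted by the two unaltered copies.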

Preferably, the database slicing technique uses a horizontal slicing technique;

the database processing module: after the horizontal slicing technique is used, the received SQL statement is processed as follows:

SQL parsing: the SQL statement is parsed to obtain an abstract syntax tree, which contains the SQL information;

SQL routing: the database route and the table route together form the SQL route;

SQL rewriting: the SQL statement is rewritten into statements that can be executed correctly;

SQL execution: the rewritten statements are executed concurrently, which improves the execution efficiency of the system;

result set merging: the execution result sets from all sub-database and sub-table operations are merged, and the merged result is the final output of the whole data interaction model.
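By way of a non-limiting illustration, the routing, concurrent execution, and result set merging steps listed above can be sketched as follows. The in-memory `SHARDS` dictionary stands in for real database nodes, and all names (`route`, `execute`, `query`, `t_order`) are hypothetical; parsing and rewriting are omitted for brevity.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq

# Hypothetical sub-tables; each holds rows (id, amount) already sorted by id.
SHARDS = {
    "t_order_0": [(2, 20), (4, 40)],
    "t_order_1": [(1, 10), (3, 30)],
}


def route(logical_table, shard_count):
    """Without a slicing condition, the query must visit every sub-table."""
    return [f"{logical_table}_{i}" for i in range(shard_count)]


def execute(physical_table, min_amount):
    """Stand-in for running the rewritten statement on one node."""
    return [row for row in SHARDS[physical_table] if row[1] >= min_amount]


def query(logical_table, shard_count, min_amount):
    targets = route(logical_table, shard_count)
    # Execute on all target sub-tables concurrently.
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        partials = list(pool.map(lambda t: execute(t, min_amount), targets))
    # Each partial result is ordered by id, so a k-way merge
    # produces the globally ordered result set.
    return list(heapq.merge(*partials))
```

Merging with `heapq.merge` avoids re-sorting the combined rows, which matters when each node already returns its rows in order.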

The invention provides a high-performance data access method of a mimicry defense architecture, which comprises the following steps:

SQL attack detection step: an SQL injection detection algorithm is integrated in order to detect non-bypass SQL attacks that mimicry cannot defend against, and a statement is discarded once it is detected to contain SQL injection;

SQL request processing step: SQL requests are processed, and read-write separation is adopted to reduce the access pressure on a single node;

database request processing step: database slicing is used so that application requirements can be met by subsequently adding nodes.

Preferably, the SQL request processing step employs a read-write separation technique:

the master node stores binlog data into a relay log through a slave node;

Preferably, the SQL request processing step:

the process flow for each SQL request is as follows:

Preferably, the SQL request processing step:

in order to reduce the burden on the database, a queue and an internal judgment step are set in the data middleware, and one credible request is selected and sent to the database, so that each database request sent saves the data volume of (n-1) redundant requests;

Preferably, the database slicing technique uses a horizontal slicing technique;

the database processing step: after the horizontal slicing technique is used, the received SQL statement is processed as follows:

SQL routing: the library route and the table route form an SQL route;

Compared with the prior art, the invention has the following beneficial effects:

1) By introducing the data access middleware between the heterogeneous executors and the database, strong consistency of the back-end data of a large website is ensured, no data synchronization between databases is needed, and the difficulty of the mimicry transformation of applications is reduced.

2) The designed middleware provides a universal data access interface, so the system can conveniently be extended with further heterogeneous executors and database types.

3) Simplifying the n database clusters into one database cluster removes the redundancy of the back-end database. The invention can efficiently merge the data requests of the heterogeneous executors and addresses the problem of highly concurrent access. In addition, the middleware adds a defense mechanism to compensate for the reduction in security caused by reducing the number of database clusters.

Drawings

Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:

FIG. 1 is a schematic diagram of a distributed transaction model provided by the present invention.

Fig. 2 is a schematic view of a scenario in which the application uses read-write separation.

FIG. 3 is a schematic diagram of the CAP theory provided by the present invention.

FIG. 4 is a simplified high availability architecture diagram of data access middleware provided between heterogeneous executors and a database according to the present invention.

FIG. 5 is a block diagram of the overall architecture of the middleware system for data access between heterogeneous executors and databases according to the present invention.

Fig. 6 is a schematic diagram of the SQL request processing module according to the present invention, which employs a read-write separation technique.

Fig. 7 is a schematic diagram of the execution process of the SQL request processing module provided in the present invention for each SQL request.

FIG. 8 is a diagram illustrating the database and table division provided by the present invention.

Fig. 9 is a schematic view of a processing flow of the received SQL statement after the database processing module provided by the present invention uses the horizontal slicing technique.

Detailed Description

The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit the invention in any way. It should be noted that those skilled in the art can make various changes and modifications without departing from the spirit of the invention, all of which fall within the scope of the present invention.

The present invention will be described more specifically with reference to examples.

Example:

For the application of mimicry defense technology to large websites, a data access middleware between the heterogeneous executors and the database is designed; its simplified high-availability architecture is shown in Fig. 4. First, the configuration center centrally manages the configuration information of each database in the database cluster, including IP addresses, ports, and the like. The monitoring service module monitors the database cluster in real time and, if the data information changes, sends the changed information to the configuration center. Meanwhile, the data middleware watches the configuration center in real time and responds accordingly.
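By way of a non-limiting illustration, the interaction between the configuration center and the data middleware can be sketched as a publish-subscribe relationship. The class names, the callback mechanism, and the sample addresses are assumptions made for this example; the embodiment does not specify how the middleware watches the configuration center.

```python
class ConfigCenter:
    """Hypothetical configuration center holding (ip, port) per database."""

    def __init__(self):
        self.nodes = {}      # database name -> (ip, port)
        self.listeners = []  # callbacks invoked on every change

    def subscribe(self, callback):
        self.listeners.append(callback)

    def update(self, name, ip, port):
        """Called by the monitoring service when a database's information changes."""
        self.nodes[name] = (ip, port)
        for callback in self.listeners:
            callback(name, ip, port)


class DataMiddleware:
    """Hypothetical middleware keeping its routing table in step with the center."""

    def __init__(self, config_center):
        self.routes = dict(config_center.nodes)  # initial snapshot
        config_center.subscribe(self.on_change)

    def on_change(self, name, ip, port):
        self.routes[name] = (ip, port)
```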

After a series of distribution decisions, the heterogeneous executors send the client request to the data access middleware. To solve the problems of database redundancy and of data inconsistency among multiple databases, the n database clusters are simplified into one. Because the heterogeneous executors and the database are in a many-to-one relationship, highly concurrent access is inevitable. The high-performance data access middleware of the mimicry defense architecture comprises an SQL attack detection module, an SQL request processing module, and a database request processing module. Together, these three modules merge the data requests so that application access to the database is as efficient as possible. The data access middleware effectively resolves data inconsistency among multiple databases, reduces the difficulty of the mimicry transformation of applications, facilitates the application of mimicry defense technology to large websites, and promotes the development of mimicry defense theory. The overall architecture of the system is shown in Fig. 5.

A. SQL attack detection module: this module mainly targets traditional network attack methods and the vulnerabilities and backdoors of the database. When the application system suffers a bypass attack, the DHR architecture can make its decision normally and promptly take the attacked executor offline. However, if the client is attacked, for example if someone maliciously tampers with the database permissions, the mimicry mechanism may not respond to the attack in time, and the database may suffer destructive damage. An SQL attack detection module is therefore added to the data access middleware to defend against this threat and to compensate for the reduction in security caused by reducing the number of databases. The module is mainly used to detect non-bypass SQL attacks that mimicry cannot defend against: it integrates an SQL injection detection algorithm, and once a statement is detected to contain SQL injection, the statement is discarded.
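By way of a non-limiting illustration, the detect-and-discard behavior can be sketched as follows. The text does not disclose the SQL injection detection algorithm; the small signature list below is a deliberately naive assumption made for this example and is far from a complete detector.

```python
import re

# Hypothetical signatures; a real detector would need a much richer analysis.
INJECTION_PATTERNS = [
    # Tautology such as: OR 1=1 or OR '1'='1'
    re.compile(r"\bor\b\s+'?\d+'?\s*=\s*'?\d+'?", re.IGNORECASE),
    # UNION-based extraction: UNION [ALL] SELECT
    re.compile(r"\bunion\b\s+(all\s+)?select\b", re.IGNORECASE),
    # Stacked query: a second statement after a semicolon
    re.compile(r";\s*(drop|delete|update|insert|alter)\b", re.IGNORECASE),
    # Comment used to truncate the rest of the statement
    re.compile(r"--|/\*"),
]


def contains_injection(sql):
    """Return True if any assumed signature matches the statement."""
    return any(pattern.search(sql) for pattern in INJECTION_PATTERNS)


def filter_statement(sql):
    """Return the statement unchanged, or None if it is discarded."""
    return None if contains_injection(sql) else sql
```

Signature matching of this kind is easy to evade and prone to false positives, so it only illustrates where the check sits in the request path, not how a production detector would decide.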

B. SQL request processing module: for many large websites, "read" requests occur more frequently than "write" requests. In systems whose data volume reaches the ten-thousand level, whose access volume exceeds the hundred-million level, and whose data grows every day, a single executor accessing a single database puts that database under enormous pressure. The traditional data interaction model also requires data synchronization between databases, which is almost impossible to accomplish. For this situation, the SQL request processing module adopts read-write separation to relieve the access pressure on a single node. As shown in Fig. 6, one master node corresponds to several slave nodes, and each slave node keeps a full copy of the data on the master node. "Write" operations are mainly handled by the master node, which also records the written content in its local binary log (binlog). The slave nodes are mainly responsible for "read" operations. Each slave node establishes a connection with the master node through an I/O thread and sends a binlog dump command to the master node; the binlog data returned by the master node is then stored in the slave node's relay log; finally, the slave nodes synchronize the data through their SQL threads.
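By way of a non-limiting illustration, the synchronization sequence can be sketched as follows. The sketch follows the usual MySQL replication flow, in which the slave's I/O thread copies binlog events into a relay log and the slave's SQL thread replays them; this is one reading of the steps described above, and the class and method names are hypothetical.

```python
class Master:
    """Hypothetical master: applies writes and records them in a binlog."""

    def __init__(self):
        self.data = {}
        self.binlog = []  # ordered (key, value) write events

    def write(self, key, value):
        self.data[key] = value
        self.binlog.append((key, value))

    def dump_binlog(self, position):
        """Answer a slave's dump request with the events from `position` on."""
        return self.binlog[position:]


class Slave:
    """Hypothetical slave with an I/O step and an SQL step."""

    def __init__(self, master):
        self.master = master
        self.data = {}
        self.relay_log = []
        self.applied = 0  # number of relay-log events already replayed

    def io_thread_step(self):
        """Fetch new binlog events and append them to the relay log."""
        self.relay_log.extend(self.master.dump_binlog(len(self.relay_log)))

    def sql_thread_step(self):
        """Replay the relay-log events that have not been applied yet."""
        for key, value in self.relay_log[self.applied:]:
            self.data[key] = value
        self.applied = len(self.relay_log)
```

Between the I/O step and the SQL step the slave holds events it has not yet applied, which is why a read served by a slave can briefly lag behind the master.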

In some "read"-intensive scenarios, because the cluster contains a large number of nodes, the speed of I/O operations and network traffic becomes an important factor in system performance. The usual optimization is to upgrade the read-write speed of the hard disks and increase the network bandwidth, but this greatly increases capital expenditure. The execution process for each SQL request is shown in Fig. 7. First, the client initiates the SQL request; the SQL request statement is sent to the SQL executor, which forwards it to the SQL parser. The SQL parser parses the condition information in the statement, obtains the routing information for the SQL request from the SQL router according to the parsing result, and returns it to the SQL executor. The SQL executor sends the SQL statement to the corresponding nodes for execution, and the execution results are returned to the SQL executor. For the n heterogeneous executors selected by policy scheduling at each stage in the mimicry DHR architecture, a user's read-write request is sent to the database through every heterogeneous executor. To reduce the burden on the database, a queue, an internal judgment module, and the like are arranged in the data middleware; one credible request is selected and sent to the database, so that each database request sent saves the data volume of (n-1) redundant requests. In addition, a cache area is set in the data interaction model so that, according to a configured algorithm, as much data as possible is read from the cache.
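By way of a non-limiting illustration, reading from the cache area can be sketched as a read-through cache. The text leaves the caching algorithm open; keying the cache by the SQL string and clearing it after every write are assumptions made for this example, as are the names `CachedReader` and `query_database`.

```python
class CachedReader:
    """Hypothetical read-through cache placed in front of the database."""

    def __init__(self, query_database):
        self.query_database = query_database  # callable: sql -> rows
        self.cache = {}
        self.db_hits = 0  # how many reads actually reached the database

    def read(self, sql):
        if sql not in self.cache:
            self.db_hits += 1
            self.cache[sql] = self.query_database(sql)
        return self.cache[sql]

    def invalidate(self):
        """Call after any write so that stale results are not served."""
        self.cache.clear()
```

Clearing the whole cache on every write is the simplest policy that never serves stale rows; a finer-grained policy would trade that simplicity for a higher hit rate.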

C. Database processing module: traditional large website systems store all tables and data in the same database, but this approach does not suit scenarios in which data grows rapidly. Moreover, when "write" requests are frequent, the master node struggles to withstand the pressure caused by operations such as data synchronization. The data processing module therefore adopts database slicing; the division into sub-databases and sub-tables is shown in Fig. 8. Because the CPU processing speed, I/O throughput, memory size, and so on of a large website system are limited, the vertically sliced nodes cannot support the processing of large amounts of data as the data volume keeps growing. It is therefore necessary to use horizontal slicing, so that application requirements can be met by subsequently adding nodes.

After the horizontal slicing technique is applied, the module processes a received SQL statement in the main steps shown in Fig. 9:

SQL parsing: this step parses the SQL statement to obtain an abstract syntax tree that contains the important SQL information.

SQL routing: the database route and the table route together constitute the SQL route. The database route identifies the number of the sub-database operated on, and the table route identifies the number of the sub-table operated on.

SQL rewriting: this step changes the SQL statement into statements that can be executed correctly. For example, if a single SQL statement inserts 4 records as a batch and the 4 records are to be stored in 4 different tables, the original statement must be rewritten into 4 SQL statements, each of which inserts one record into one table.

SQL execution: this step executes the rewritten statements concurrently, which effectively improves the execution efficiency of the system.

Result set merging: this step merges the execution result sets from the individual sub-database and sub-table operations; the merged result is the final output of the whole data interaction model.
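By way of a non-limiting illustration, the batch-insert example given for the SQL rewriting step can be sketched as follows. The slicing rule (key modulo the number of sub-tables), the `_<index>` table suffix, and the function name are assumptions made for this example. Values are rendered with `repr` only to keep the sketch short; real code should use parameter binding instead of building literals.

```python
def rewrite_batch_insert(table, columns, rows, shard_count, shard_key=0):
    """Split one batch INSERT into one INSERT per target sub-table."""
    by_table = {}
    for row in rows:
        # Assumed slicing rule: slicing key modulo the number of sub-tables.
        index = row[shard_key] % shard_count
        by_table.setdefault(index, []).append(row)

    column_list = ", ".join(columns)
    statements = []
    for index in sorted(by_table):
        values = ", ".join(
            "(" + ", ".join(repr(v) for v in row) + ")"
            for row in by_table[index]
        )
        statements.append(
            f"INSERT INTO {table}_{index} ({column_list}) VALUES {values}"
        )
    return statements
```

For 4 records whose keys fall into 4 different sub-tables, the function returns 4 statements of one record each, matching the example in the text; records that share a sub-table are kept together in one statement.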

In the description of the present application, it is to be understood that the terms "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", and the like indicate orientations or positional relationships based on those shown in the drawings, and are only for convenience in describing the present application and simplifying the description, but do not indicate or imply that the referred device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and thus, should not be construed as limiting the present application.

Those skilled in the art will appreciate that, in addition to implementing the systems, apparatus, and various modules thereof provided by the present invention in purely computer readable program code, the same procedures can be implemented entirely by logically programming method steps such that the systems, apparatus, and various modules thereof are provided in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system, the device and the modules thereof provided by the present invention can be considered as a hardware component, and the modules included in the system, the device and the modules thereof for implementing various programs can also be considered as structures in the hardware component; modules for performing various functions may also be considered to be both software programs for performing the methods and structures within hardware components.

The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The embodiments and features of the embodiments of the present application may be combined with each other arbitrarily without conflict.

Claims

1. A high performance data access system of a mimicry defense architecture, comprising:

2. The high-performance data access middleware of the mimicry defense architecture of claim 1, wherein the SQL request processing module adopts a read-write separation technique:

the master node stores binlog data into a relay log through a slave node;

3. The high-performance data access middleware of the mimicry defense architecture of claim 1, wherein the SQL request processing module:

the process flow for each SQL request is as follows:

4. The high-performance data access middleware of the mimicry defense architecture of claim 1, wherein the SQL request processing module:

5. The high-performance data access middleware of the mimicry defense architecture of claim 1 wherein the database slicing technique uses a horizontal slicing technique;

SQL routing: the library route and the table route form an SQL route;

6. A high-performance data access method of a mimicry defense architecture is characterized by comprising the following steps:

7. The method for accessing high-performance data of a mimicry defense architecture according to claim 6, wherein the SQL request processing step adopts a read-write separation technique:

the master node stores binlog data into a relay log through a slave node;

8. The method for high-performance data access of a mimicry defense architecture of claim 6, wherein the SQL request processing step comprises:

the process flow for each SQL request is as follows:

9. The method for high-performance data access of a mimicry defense architecture of claim 6, wherein the SQL request processing step comprises:

10. The high-performance data access method of the mimicry defense architecture of claim 6, wherein the database slicing technique uses a horizontal slicing technique;

SQL routing: the library route and the table route form an SQL route;