CN117370975B - Sql injection detection method and system based on deep learning - Google Patents
- Publication number
- CN117370975B (application CN202311678372.5A)
- Authority
- CN
- China
- Prior art keywords
- model
- deep learning
- participant
- local
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/554—Detecting local intrusion or implementing counter-measures involving event detection and direct action
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2433—Query languages
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/098—Distributed learning, e.g. federated learning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1416—Event detection, e.g. attack signature detection
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computer Security & Cryptography (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Databases & Information Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Bioethics (AREA)
- Virology (AREA)
- Medical Informatics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a deep learning-based SQL injection detection method and system. The method comprises the following steps: acquiring a data set of SQL query statements; for each SQL query statement, selecting the most relevant feature subset using a sparse group lasso method; each participant building a local deep learning model based on the selected feature subset, wherein the parameters of each deep learning model are regularized using the sparse group lasso method; aggregating the model parameters of all participants by federated learning; each participant training its own deep learning model; and detecting new SQL query statements based on the trained deep learning model. The invention can effectively protect data privacy, reduce the risk of data leakage and attack, and enhance data security; it also improves the interpretability and generalization ability of the model and reduces the risk of overfitting.
Description
Technical Field
The invention belongs to the field of computers, and particularly relates to an SQL injection detection method and system based on deep learning.
Background
With the rapid development of information technology, network security has become a problem that must be faced. Frequent network attacks cause significant losses. Intrusion detection is a common network security defense technique: network traffic is analyzed to identify traffic whose characteristics differ from normal traffic, in particular the attack behavior of malicious programs. Deep learning was proposed to learn features from large amounts of unordered, high-dimensional data; its advantage is that, with reasonable training parameters, a model can be built that selects good features automatically. Much current research applies deep learning to SQL injection detection systems for security defense. Deep learning requires a large data set for training, and model performance generally improves with the size of the training set. The richer the network traffic data set, the higher the accuracy of the final intrusion detection model, but collecting network traffic centrally raises privacy concerns. Existing deep learning-based intrusion detection therefore relies on local network traffic for model training, because different network operators and organizations typically do not pool their traffic to construct a complete intrusion detection data set.
Many deep learning-based SQL injection detection methods exist, and their high detection accuracy demonstrates the feasibility of deep learning for SQL injection detection. However, most of these methods suffer from problems such as insufficient data sets. Common centralized deep learning methods require the network traffic of various organizations to be collected in one place, which raises privacy concerns. Approaches that change the model structure or generate new data sets are complicated and therefore difficult to apply in real network scenarios.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides an SQL injection detection method based on deep learning, which comprises the following steps:
step S101, acquiring a data set of SQL query statements, wherein the data set comprises normal query statements and malicious SQL query statements;
step S103, extracting features related to SQL injection attacks from each SQL query statement, and selecting the most relevant feature subset using a sparse group lasso method;
step S105, each participant building a local deep learning model based on the selected feature subset, wherein the parameters of each deep learning model are regularized using the sparse group lasso method;
step S107, aggregating the model parameters of all participants by federated learning;
step S109, each participant training its own deep learning model;
and step S1011, detecting new SQL query statements based on the trained deep learning model.
The features related to SQL injection attacks comprise at least keywords, special characters and query structures.
Wherein selecting the most relevant feature subset using the sparse group lasso method in step S103 comprises: calculating feature weights using the sparse group lasso method, and selecting the features with the largest weights as the most relevant feature subset.
Calculating the feature weights using the sparse group lasso method and selecting the features with the largest weights as the most relevant feature subset specifically comprises:
organizing the features in the data set into a feature matrix, each row representing a query and each column representing a feature;
for each query, determining the category to which the query belongs, and encoding the category as a target vector;
constructing an optimization problem, and adding a sparse group lasso penalty term to the objective function;
solving the optimization problem using a gradient descent method to obtain a feature weight vector w;
and setting a threshold on the weights, and selecting the features whose weights exceed the threshold as the most relevant features.
Wherein the objective function is expressed as follows:
$$\min_{w}\ \lVert y - Xw \rVert_2^2 + \lambda_1 \sum_{j} w_j^2 + \lambda_2 \sum_{k} \lVert w^{(k)} \rVert_2,$$
wherein $y$ is the target vector; $X$ is the feature matrix; $w$ is the feature weight vector; $\sum_{j} w_j^2$ represents summing over the weight $w_j$ of each feature and applying L2 regularization; $\sum_{k}$ represents summing over the different feature groups, which correspond to the keywords, the special characters and the query structure, $j$ denoting the j-th feature and $k$ the k-th feature group; $\lVert w^{(k)} \rVert_2$ represents the L2 norm within a feature group, which measures the weight of the feature group; and $\lambda_1$ and $\lambda_2$ are hyperparameters that control the strength of the L2 regularization and of the sparse group lasso penalty, and are adjusted as required.
The local deep learning model is an initialized global model that a central server distributes to all participants before federated learning begins.
Wherein, the step S105 includes:
distributing the selected feature subset to each participant;
each participant uses the distributed feature subsets to construct a local deep learning model;
defining a cross entropy loss function for the local model of each participant;
each participant trains the local model using a local training set to optimize the loss function with sparse group lasso penalty terms.
Wherein optimizing the loss function with sparse group lasso penalty term comprises:
adding a sparse group lasso penalty term to the local loss function to regularize the model parameters, wherein the formula is as follows:
$$L = L_{\text{local}} + \lambda_1 \sum_{i} \lvert \theta_i \rvert + \lambda_2 \lVert \theta \rVert_2,$$
wherein $L_{\text{local}}$ represents the local loss function; $\sum_{i} \lvert \theta_i \rvert$ represents summing over the weights $\theta_i$ of the model parameters and applying L1 regularization; $\lVert \theta \rVert_2$ represents the L2 norm of the model parameters, which measures the weight of the model; and $\lambda_1$ and $\lambda_2$ are hyperparameters that control the strength of the L1 and L2 regularization, and can be adjusted as required.
Wherein, the step S107 includes:
after each participant completes the local model training, model parameters are sent back to a central server for aggregation;
after receiving the model parameters of each participant, the central server aggregates the model parameters according to a preset aggregation method to obtain an updated global model;
distributing the updated global model to each participant;
repeating the above three steps until reaching the iteration stop condition.
The invention also provides an SQL injection detection system based on deep learning, comprising:
an acquisition module, used for acquiring a data set of SQL query statements, wherein the data set comprises normal query statements and malicious SQL query statements;
a feature subset extraction module, used for extracting features related to SQL injection attacks from each SQL query statement and selecting the most relevant feature subset using a sparse group lasso method;
a model construction module, used by each participant for building a local deep learning model based on the selected feature subset, wherein the parameters of each deep learning model are regularized using the sparse group lasso method;
a parameter aggregation module, used for aggregating the model parameters of all participants by federated learning;
a model training module, used by each participant for training its own deep learning model;
a detection module, used for detecting new SQL query statements based on the trained deep learning model;
and a central server, used for assisting the above modules in updating the parameters of each participant's deep learning model.
Compared with the prior art, the invention has the following beneficial effects:
1. privacy protection: federated learning allows participants to train models locally without sharing sensitive raw data. Data privacy is thereby effectively protected, which is particularly important for tasks involving sensitive information (such as SQL injection detection);
2. data security: federated learning does not require data to be transmitted to a central server; only model parameters are transmitted. This reduces the risk of data leakage and attack and enhances data security;
3. distributed feature selection: the sparse group lasso method can be used to perform feature selection on each participant's local data. By selecting the features with the largest weights, the most relevant feature subset is obtained. This reduces the feature dimension and improves the interpretability and generalization ability of the model;
4. model generalization ability: the sparse group lasso method regularizes the model parameters, encouraging the model to learn sparse weights. This improves the generalization ability of the model and reduces the risk of overfitting;
5. merging multiparty models: federated learning allows the participants' model parameters to be aggregated on a central server to generate a global model. Combining the multiparty models fuses the knowledge and features of all parties and further improves model performance.
Drawings
The above, as well as additional purposes, features, and advantages of exemplary embodiments of the present disclosure will become readily apparent from the following detailed description when read in conjunction with the accompanying drawings. Several embodiments of the present disclosure are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar or corresponding parts and in which:
fig. 1 is a flowchart showing an sql injection detection method based on deep learning according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail below with reference to the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise; "plurality" generally means at least two.
It should be understood that although the terms first, second, third, etc. may be used to describe … … in embodiments of the present invention, these … … should not be limited to these terms. These terms are only used to distinguish … …. For example, the first … … may also be referred to as the second … …, and similarly the second … … may also be referred to as the first … …, without departing from the scope of embodiments of the present invention.
It should be understood that the term "and/or" as used herein is merely one relationship describing the association of the associated objects, meaning that there may be three relationships, e.g., a and/or B, may represent: a exists alone, A and B exist together, and B exists alone. In addition, the character "/" herein generally indicates that the front and rear associated objects are an "or" relationship.
The word "if", as used herein, may be interpreted as "when … …" or "upon … …" or "in response to a determination" or "in response to a detection", depending on the context. Similarly, the phrase "if determined" or "if detected (stated condition or event)" may be interpreted as "when determined" or "in response to determination" or "when detected (stated condition or event)" or "in response to detection (stated condition or event)", depending on the context.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a product or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such product or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a commodity or device comprising such element.
Alternative embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
Embodiment 1
As shown in Fig. 1, the invention discloses an SQL injection detection method based on deep learning, which comprises the following steps:
step S101, acquiring a data set of SQL query statements, wherein the data set comprises normal query statements and malicious SQL query statements;
step S103, extracting features related to SQL injection attacks from each SQL query statement, and selecting the most relevant feature subset using a sparse group lasso method;
step S105, each participant building a local deep learning model based on the selected feature subset, wherein the parameters of each deep learning model are regularized using the sparse group lasso method;
step S107, aggregating the model parameters of all participants by federated learning;
step S109, each participant training its own deep learning model;
and step S1011, detecting new SQL query statements based on the trained deep learning model.
In one embodiment, the aggregated model is used to detect new SQL query statements. Each participant feeds an incoming query statement to the aggregated model, which outputs a prediction indicating whether the statement is an SQL injection attack.
Embodiment 2
The invention provides an SQL injection detection method based on deep learning, which comprises the following steps:
step S101, acquiring a data set of SQL query statements, wherein the data set comprises normal query statements and malicious SQL query statements;
step S103, extracting features related to SQL injection attacks from each SQL query statement, and selecting the most relevant feature subset using a sparse group lasso method;
step S105, each participant building a local deep learning model based on the selected feature subset, wherein the parameters of each deep learning model are regularized using the sparse group lasso method;
step S107, aggregating the model parameters of all participants by federated learning;
step S109, each participant training its own deep learning model;
and step S1011, detecting new SQL query statements based on the trained deep learning model.
The features related to SQL injection attacks comprise at least keywords, special characters and query structures.
Feature extraction is performed locally by each participant, which extracts the features associated with SQL injection attacks. These features may include the following:
Keywords: keywords commonly used in SQL query statements, such as SELECT, INSERT, UPDATE and DELETE. These keywords frequently appear in SQL injection attacks.
Special characters: special characters in SQL query statements, such as quotation marks (') and semicolons (;). These characters are often used in SQL injection attacks to bypass input verification and inject malicious code.
Query structure: the structure and components of SQL query statements, such as table names, column names, operators and logical expressions. SQL injection attacks typically attempt to modify the query structure to achieve their purpose.
Each participant (e.g., a different organization or device) collects a local data set containing normal and malicious SQL query statements. These query statements should be labeled and vectorized.
Wherein selecting the most relevant feature subset using the sparse group lasso method in step S103 comprises: calculating feature weights using the sparse group lasso method, and selecting the features with the largest weights as the most relevant feature subset.
In one embodiment, feature extraction for SQL injection detection is typically performed as follows:
Keyword feature extraction:
One-hot encoding: each keyword is mapped to a binary vector in which the position corresponding to the keyword is 1 and all other positions are 0.
TF-IDF (term frequency-inverse document frequency): measures the importance of a keyword in a query statement, and is calculated as follows:
TF = (number of occurrences of the keyword in the query statement) / (total number of words in the query statement)
IDF = log((total number of query statements) / (number of query statements containing the keyword))
TF-IDF = TF * IDF
Special character feature extraction:
Occurrence count: the number of times a special character appears in the query statement is used as a feature.
Proportion: the ratio of special characters to total characters in the query statement is used as a feature.
Query structure feature extraction:
N-gram features: the query statement is divided into sub-sequences of N consecutive elements, each of which is taken as a feature.
Syntax parse tree: the grammatical structure of the query statement is parsed, and the nodes and edges of the resulting tree are extracted as features.
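The keyword, special character and N-gram features listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the extraction pipeline of the invention: the vocabularies `KEYWORDS` and `SPECIAL_CHARS`, the regular-expression tokenizer and all function names are hypothetical, the presence vector is the element-wise OR of the one-hot vectors of the keywords found in a query, and N-grams are taken over word tokens. Syntax parse tree features are omitted.

```python
import math
import re
from collections import Counter

# Hypothetical vocabularies chosen for illustration; not specified by the source.
KEYWORDS = ["select", "insert", "update", "delete", "union", "or", "and"]
SPECIAL_CHARS = ["'", ";", "-", "#", "="]


def tokenize(query):
    """Lowercase the query and return its alphabetic word tokens."""
    return re.findall(r"[a-z_]+", query.lower())


def keyword_presence(query):
    """Binary vector with a 1 at the position of every keyword present."""
    tokens = set(tokenize(query))
    return [1 if kw in tokens else 0 for kw in KEYWORDS]


def tf_idf(query, corpus):
    """TF-IDF score of each keyword for one query, using the formulas above."""
    tokens = tokenize(query)
    counts = Counter(tokens)
    total = max(len(tokens), 1)
    scores = []
    for kw in KEYWORDS:
        tf = counts[kw] / total
        containing = sum(1 for q in corpus if kw in tokenize(q))
        # A keyword absent from the whole corpus scores 0 (avoids division by zero).
        idf = math.log(len(corpus) / containing) if containing else 0.0
        scores.append(tf * idf)
    return scores


def special_char_features(query):
    """Per-character occurrence counts and the overall special-character ratio."""
    counts = [query.count(ch) for ch in SPECIAL_CHARS]
    ratio = sum(counts) / max(len(query), 1)
    return counts, ratio


def token_ngrams(query, n=2):
    """All runs of n consecutive word tokens in the query."""
    tokens = tokenize(query)
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Toy corpus: one normal query and one injection attempt.
corpus = [
    "SELECT name FROM users WHERE id = 1",
    "SELECT * FROM users WHERE name = '' OR '1'='1'; --",
]
attack = corpus[1]
presence = keyword_presence(attack)
scores = tf_idf(attack, corpus)
counts, ratio = special_char_features(attack)
bigrams = token_ngrams(attack, n=2)
```

In this toy corpus the keyword SELECT occurs in every query, so its IDF and therefore its TF-IDF score are zero, while OR occurs only in the injection attempt and receives a positive score.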
Calculating the feature weights using the sparse group lasso method and selecting the features with the largest weights as the most relevant feature subset specifically comprises:
organizing the features in the data set into a feature matrix, each row representing a query and each column representing a feature;
for each query, determining the category to which it belongs (normal query or SQL injection query), and encoding the category as a target vector; for example, a normal query may be represented by 0 and an SQL injection query by 1;
constructing an optimization problem, and adding a sparse group lasso penalty term to the objective function;
solving the optimization problem using a gradient descent method to obtain a feature weight vector w;
and setting a threshold on the weights, and selecting the features whose weights exceed the threshold as the most relevant features.
Wherein the objective function is expressed as follows:
$$\min_{w}\ \lVert y - Xw \rVert_2^2 + \lambda_1 \sum_{j} w_j^2 + \lambda_2 \sum_{k} \lVert w^{(k)} \rVert_2,$$
wherein $y$ is the target vector; $X$ is the feature matrix; $w$ is the feature weight vector; $\sum_{j} w_j^2$ represents summing over the weight $w_j$ of each feature and applying L2 regularization; $\sum_{k}$ represents summing over the different feature groups, which correspond to the keywords, the special characters and the query structure, $j$ denoting the j-th feature and $k$ the k-th feature group; $\lVert w^{(k)} \rVert_2$ represents the L2 norm within a feature group, which measures the weight of the feature group; and $\lambda_1$ and $\lambda_2$ are hyperparameters that control the strength of the L2 regularization and of the sparse group lasso penalty, and are adjusted as required.
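The feature selection steps above (feature matrix, 0/1 target vector, penalized objective, gradient descent, thresholding) can be illustrated with NumPy. The sketch below is an assumption-laden example rather than the method of the invention: it assumes a squared-error data term, an L2 penalty on every weight and an L2-norm penalty on every feature group, uses plain (sub)gradient descent with a step size derived from the spectral norm of X, and runs on synthetic data. The function names, the group layout, the hyperparameter values and the threshold are all hypothetical.

```python
import numpy as np


def objective(w, X, y, groups, lam1, lam2):
    """Squared error + L2 penalty on weights + L2-norm penalty per feature group."""
    fit = np.sum((y - X @ w) ** 2)
    ridge = lam1 * np.sum(w ** 2)
    group = lam2 * sum(np.linalg.norm(w[idx]) for idx in groups)
    return fit + ridge + group


def fit_weights(X, y, groups, lam1=1.0, lam2=20.0, n_iter=300):
    """(Sub)gradient descent on the objective; returns the weight vector w."""
    w = np.zeros(X.shape[1])
    # Step size from the Lipschitz constant of the smooth part of the objective.
    step = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2 + 2.0 * lam1)
    for _ in range(n_iter):
        grad = 2.0 * X.T @ (X @ w - y) + 2.0 * lam1 * w
        for idx in groups:
            norm = np.linalg.norm(w[idx])
            if norm > 0:  # the subgradient 0 is used when a group is exactly zero
                grad[idx] += lam2 * w[idx] / norm
        w = w - step * grad
    return w


def select_features(w, threshold):
    """Indices of the features whose absolute weight exceeds the threshold."""
    return np.flatnonzero(np.abs(w) > threshold)


# Synthetic data: 6 features in 3 hypothetical groups (keywords, special
# characters, query structure); only the first group determines the label.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 6))
y = (2.0 * X[:, 0] - 1.5 * X[:, 1] > 0).astype(float)  # 1 = injection, 0 = normal
groups = [[0, 1], [2, 3], [4, 5]]

w = fit_weights(X, y, groups)
selected = select_features(w, threshold=0.1)
start_value = objective(np.zeros(6), X, y, groups, 1.0, 20.0)
final_value = objective(w, X, y, groups, 1.0, 20.0)
```

Because the group penalty is not differentiable at zero, a proximal gradient method would be a common alternative to the plain subgradient step used here; the sketch keeps gradient descent because that is the solver the text names.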
The local deep learning model is an initialized global model that a central server distributes to all participants before federated learning begins.
Common deep learning models, such as Convolutional Neural Networks (CNNs) or Recurrent Neural Networks (RNNs), may be used to learn patterns and relationships between features.
Wherein, the step S105 includes:
distributing the selected feature subset to each participant;
each participant uses the distributed feature subsets to construct a local deep learning model;
defining a cross entropy loss function for the local model of each participant;
each participant trains the local model using a local training set to optimize the loss function with sparse group lasso penalty terms.
Wherein optimizing the loss function with sparse group lasso penalty term comprises:
adding a sparse group lasso penalty term to the local loss function to regularize the model parameters, wherein the formula is as follows:
$$L = L_{\text{local}} + \lambda_1 \sum_{i} \lvert \theta_i \rvert + \lambda_2 \lVert \theta \rVert_2,$$
wherein $L_{\text{local}}$ represents the local loss function; $\sum_{i} \lvert \theta_i \rvert$ represents summing over the weights $\theta_i$ of the model parameters and applying L1 regularization; $\lVert \theta \rVert_2$ represents the L2 norm of the model parameters, which measures the weight of the model; and $\lambda_1$ and $\lambda_2$ are hyperparameters that control the strength of the L1 and L2 regularization, and can be adjusted as required.
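The local training step can be illustrated as follows. To keep the example short and runnable, a single-layer logistic model stands in for the CNN or RNN mentioned above; this substitution is an assumption of the sketch, not the model of the invention. The loss is the cross entropy plus an L1 term on the parameters and an L2-norm term on the parameter vector, minimized by (sub)gradient descent. The function names, learning rate, epoch count and penalty strengths are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def penalized_loss(theta, X, y, lam1, lam2):
    """Cross entropy + lam1 * sum(|theta_i|) + lam2 * ||theta||_2."""
    p = np.clip(sigmoid(X @ theta), 1e-12, 1 - 1e-12)
    cross_entropy = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
    return cross_entropy + lam1 * np.sum(np.abs(theta)) + lam2 * np.linalg.norm(theta)


def train_local(theta, X, y, lam1=0.01, lam2=0.01, lr=0.1, epochs=200):
    """(Sub)gradient descent on the penalized loss, starting from theta."""
    theta = theta.copy()
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ theta) - y) / len(y)  # cross-entropy gradient
        grad += lam1 * np.sign(theta)                   # L1 subgradient
        norm = np.linalg.norm(theta)
        if norm > 0:
            grad += lam2 * theta / norm                 # gradient of the L2 norm
        theta -= lr * grad
    return theta


# Synthetic local training set of one participant (4 selected features).
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
y = (X[:, 0] - X[:, 1] > 0).astype(float)

theta0 = np.zeros(4)  # parameters received from the initialized global model
theta = train_local(theta0, X, y)
loss_before = penalized_loss(theta0, X, y, 0.01, 0.01)
loss_after = penalized_loss(theta, X, y, 0.01, 0.01)
accuracy = np.mean((sigmoid(X @ theta) > 0.5) == (y == 1))
```

In a deep model the same two penalty terms would simply be added to the cross-entropy loss before back-propagation; only the data term changes.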
Wherein, the step S107 includes:
after each participant completes the local model training, model parameters are sent back to a central server for aggregation;
after receiving the model parameters of each participant, the central server aggregates the model parameters according to a preset aggregation method to obtain an updated global model;
distributing the updated global model to each participant;
repeating the above three steps until reaching the iteration stop condition.
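The three repeated steps (local training, aggregation on the central server, redistribution of the global model) can be sketched as a simple loop. The text only refers to a preset aggregation method; the sketch assumes sample-count-weighted averaging in the style of federated averaging, and again uses a logistic model as a stand-in for the local deep learning model. The stop condition (a maximum number of rounds or a small change in the global parameters) and all function names are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def local_update(global_params, X, y, lr=0.5, epochs=5):
    """Stand-in local training: a few gradient steps of logistic regression."""
    theta = global_params.copy()
    for _ in range(epochs):
        theta -= lr * X.T @ (sigmoid(X @ theta) - y) / len(y)
    return theta


def federated_average(param_list, sample_counts):
    """Average participant parameters, weighted by local data set size."""
    weights = np.asarray(sample_counts, dtype=float)
    weights /= weights.sum()
    return sum(wt * p for wt, p in zip(weights, param_list))


def run_federated_rounds(global_params, participants, max_rounds=20, tol=1e-4):
    """Repeat train -> aggregate -> redistribute until the stop condition."""
    for _ in range(max_rounds):
        # Each participant trains locally, starting from the global model.
        local_params = [local_update(global_params, X, y) for X, y in participants]
        counts = [len(y) for _, y in participants]
        # The central server aggregates the returned parameters.
        new_global = federated_average(local_params, counts)
        converged = np.linalg.norm(new_global - global_params) < tol
        global_params = new_global  # redistributed in the next round
        if converged:
            break
    return global_params


# Three participants, each holding a private synthetic data set.
rng = np.random.default_rng(2)
participants = []
for _ in range(3):
    X = rng.normal(size=(200, 4))
    y = (X[:, 0] - X[:, 1] > 0).astype(float)
    participants.append((X, y))

final_params = run_federated_rounds(np.zeros(4), participants)
X_all = np.vstack([X for X, _ in participants])
y_all = np.concatenate([y for _, y in participants])
accuracy = np.mean((sigmoid(X_all @ final_params) > 0.5) == (y_all == 1))
example = federated_average([np.array([1.0, 2.0]), np.array([3.0, 4.0])], [1, 3])
```

Only parameter vectors cross the participant boundary in this loop; the raw `(X, y)` pairs are never passed to `federated_average`.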
In one embodiment, to protect the data privacy of the participants in federated learning, a secure aggregation algorithm, such as encrypted aggregation or differentially private aggregation, is often employed.
Encrypted aggregation:
Encrypted aggregation uses encryption techniques to protect the model parameters of the participants. Common approaches include homomorphic encryption and secure multi-party computation.
Homomorphic encryption: homomorphic encryption is an encryption method that allows computation on data while it remains encrypted. A participant encrypts its model parameters with a homomorphic encryption scheme and sends the ciphertext to the aggregator, which performs the aggregation directly on the encrypted parameters without learning the original parameter values.
Secure multi-party computation: secure multi-party computation (SMPC) is a class of protocols that allow multiple parties to jointly compute a result without revealing their private data. The participants protect their model parameters using an SMPC protocol before sending them to the aggregator, and the aggregator performs the aggregation on the protected parameters under the same protocol.
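The idea that an aggregator can compute a sum without seeing any individual contribution can be illustrated with pairwise additive masking, a simplified relative of SMPC-style secure aggregation. The sketch below is not homomorphic encryption and not the protocol of the invention: it uses no real cryptography, generates all masks from a single seeded generator instead of pairwise key agreement, and ignores participant dropout. All names are hypothetical.

```python
import numpy as np


def pairwise_masks(n_participants, dim, seed=0):
    """Return one mask vector per participant; together the masks sum to zero."""
    rng = np.random.default_rng(seed)
    masks = [np.zeros(dim) for _ in range(n_participants)]
    for i in range(n_participants):
        for j in range(i + 1, n_participants):
            # Stands in for a random vector known only to participants i and j.
            shared = rng.normal(scale=100.0, size=dim)
            masks[i] += shared  # participant i adds the shared vector
            masks[j] -= shared  # participant j subtracts it
    return masks


def mask_parameters(params, mask):
    """What a participant actually sends to the aggregator."""
    return params + mask


def aggregate_masked(masked_list):
    """Average of the masked vectors; the pairwise masks cancel in the sum."""
    return np.sum(masked_list, axis=0) / len(masked_list)


# Three participants with private parameter vectors.
private = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
masks = pairwise_masks(len(private), dim=2)
masked = [mask_parameters(p, m) for p, m in zip(private, masks)]
aggregate = aggregate_masked(masked)
```

Each masked vector on its own looks like noise to the aggregator, yet the average of the masked vectors equals the average of the private vectors up to floating-point rounding.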
Differentially private aggregation:
Differentially private aggregation protects the privacy of the participants by adding noise. Differential privacy is a privacy protection method that protects individual data by introducing a controlled amount of uncertainty into the calculation results. The participants may apply a differential privacy mechanism locally to their model parameters and then send the noisy parameters to the aggregator for aggregation.
The specific aggregation formula depends on the differential privacy mechanism used. Common mechanisms include the Laplace mechanism and the Gaussian mechanism, which achieve privacy protection by adding noise drawn from a Laplace or Gaussian distribution to the model parameters.
Taking the Laplace mechanism as an example, it achieves differential privacy protection by adding Laplace noise to the model parameters. The Laplace noise follows the Laplace distribution with the following probability density function:
$$f(x \mid \mu, b) = \frac{1}{2b}\exp\left(-\frac{|x-\mu|}{b}\right);$$
where x is the noise value, $\mu$ is the center position of the noise, and b is the scale parameter of the noise.
In differentially private aggregation, the participants can apply the Laplace mechanism locally to their model parameters and then send the parameters with Laplace noise to the aggregator for aggregation. The specific calculation formula is as follows:
$$\text{Aggregated\_parameter} = \frac{1}{N}\sum_{i=1}^{N}\left(\text{Private\_parameter}_i + \text{Laplace\_noise}_i\right);$$
where N is the number of participants, $\sum$ represents the summation operation, Private_parameter is the private parameter of a participant, and Laplace_noise is the noise sampled from the Laplace distribution.
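The Laplace-based aggregation can be sketched as follows. This is a minimal illustration, not the implementation of the disclosure: the function names are hypothetical, the noise is centered at zero, and the scale b is set to sensitivity / epsilon, which is the usual calibration of the Laplace mechanism. The `sensitivity` and `epsilon` parameters are assumptions that do not appear in the text above.

```python
import random


def laplace_noise(rng: random.Random, b: float, mu: float = 0.0) -> float:
    """Sample Laplace(mu, b): the difference of two Exp(1) draws is Laplace(0, 1)."""
    return mu + b * (rng.expovariate(1.0) - rng.expovariate(1.0))


def privatize(params, sensitivity, epsilon, rng):
    """Participant side: add Laplace noise to every local model parameter."""
    b = sensitivity / epsilon  # noise scale (assumed calibration)
    return [p + laplace_noise(rng, b) for p in params]


def aggregate(noisy_updates):
    """Aggregator side: average the noisy parameters coordinate by coordinate."""
    n = len(noisy_updates)
    return [sum(column) / n for column in zip(*noisy_updates)]


rng = random.Random(0)  # seeded only to make the sketch reproducible
local_params = [[0.10, -0.40], [0.30, 0.00], [0.20, 0.10]]  # three participants
noisy = [privatize(p, sensitivity=1.0, epsilon=5.0, rng=rng) for p in local_params]
aggregated = aggregate(noisy)
```

Because the noise has zero mean, averaging over more participants reduces its effect on the aggregated parameters, while each individual upload remains perturbed.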
Third embodiment,
The invention also provides an sql injection detection system based on deep learning, which is characterized by comprising:
the acquisition module is used for acquiring a data set of the sql query statement, wherein the data set comprises a normal query statement and a malicious sql query statement;
the feature subset extraction module is used for extracting features related to the sql injection attack for each sql query statement and selecting the most relevant feature subset by using a sparse group lasso method;
the model construction module is used for constructing a local deep learning model by each participant based on the selected feature subset, wherein the parameters of the respective deep learning model are regularized by using a sparse group lasso method;
the parameter aggregation module is used for aggregating model parameters of all the participants in a federal learning mode;
the model training module is used for training the respective deep learning model by each participant;
the detection module is used for detecting a new sql query statement based on a trained deep learning model;
and the central server is used for assisting the modules to update parameters of the deep learning model of each participant.
Fourth embodiment,
The disclosed embodiments provide a non-transitory computer storage medium storing computer executable instructions that perform the method steps described in the embodiments above.
It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer can be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or can be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented by means of software or by means of hardware. In some cases, the names of the units do not constitute a limitation of the units themselves.
The foregoing description of the preferred embodiments of the present invention has been presented for purposes of clarity and understanding, and is not intended to limit the invention to the particular embodiments disclosed, but is intended to cover all modifications, alternatives, and improvements within the spirit and scope of the invention as outlined by the appended claims.
Claims (6)
1. An sql injection detection method based on deep learning, which is characterized by comprising the following steps:
step S101, acquiring a dataset of sql query sentences, wherein the dataset comprises normal query sentences and malicious sql query sentences;
step S103, extracting features related to sql injection attack for each sql query statement, and selecting a most related feature subset by using a sparse group lasso method;
step S105, each participant builds a local deep learning model based on the selected feature subset, wherein the parameters of each deep learning model are regularized by using a sparse group lasso method;
step S107, aggregating model parameters of all the participants in a federal learning mode;
step S109, each participant trains the respective deep learning model;
step S1011, detecting a new sql query statement based on a trained deep learning model;
the local deep learning model is an initialized global model, and is distributed to all participants by a central server before federal learning;
wherein the step S105 includes:
distributing the selected feature subset to each participant;
each participant uses the distributed feature subsets to construct a local deep learning model;
defining a cross entropy loss function for the local model of each participant;
each participant uses a local training set to train the local model, and a loss function with sparse group lasso penalty term is optimized;
wherein optimizing the loss function with sparse group lasso penalty term comprises:
adding a sparse group lasso penalty term to a local loss function, regularizing model parameters, wherein the formula is as follows:
$$L_{\text{total}} = L_{\text{local}} + \lambda_1 \sum_{i} |w_i| + \lambda_2 \|w\|_2;$$
wherein $L_{\text{local}}$ represents the local loss function; $\sum_{i} |w_i|$ represents summing the absolute values of the weights $w_i$ of the model parameters and applying L1 regularization; $\|w\|_2$ represents the L2 norm of the model parameters, used to measure the magnitude of the model weights; and $\lambda_1$ and $\lambda_2$ are hyperparameters that control the L1 and L2 regularization strength and are adjusted as required.
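As a non-limiting illustration of the penalized loss recited in claim 1, the sketch below assumes a binary cross-entropy local loss and a penalty of the form lambda1 * sum(|w_i|) + lambda2 * ||w||_2 applied to all weights as a single group. The function names and the numeric values are hypothetical.

```python
import math


def cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy between labels (0 or 1) and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def sparse_group_lasso_penalty(weights, lambda1, lambda2):
    """L1 term plus (unsquared) L2-norm term, treating all weights as one group."""
    l1 = sum(abs(w) for w in weights)
    l2 = math.sqrt(sum(w * w for w in weights))
    return lambda1 * l1 + lambda2 * l2


def regularized_loss(y_true, y_prob, weights, lambda1, lambda2):
    """Local loss with the penalty term added."""
    return cross_entropy(y_true, y_prob) + sparse_group_lasso_penalty(
        weights, lambda1, lambda2
    )


penalty = sparse_group_lasso_penalty([3.0, -4.0], lambda1=0.1, lambda2=0.5)
loss = regularized_loss([1, 0], [0.9, 0.2], [3.0, -4.0], lambda1=0.1, lambda2=0.5)
```

With several feature groups, the L2 term would normally be computed per group and summed; the single-group form is used here only to mirror the terms named in the claim.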
2. The method of claim 1, wherein the sql injection attack-related features include at least keywords, special characters, and query structures.
3. The method of claim 2, wherein selecting the most relevant feature subset using the sparse group lasso method in step S103 comprises: calculating the feature weights by using the sparse group lasso method, and selecting the features with the largest weights as the most relevant feature subset.
4. A method according to claim 3, wherein the feature weights are calculated using a sparse group lasso method, and the feature with the greatest weight is selected as the most relevant feature subset, comprising in particular:
the features in the dataset are organized into a feature matrix, each row representing a query, and each column representing a feature;
for each query, determining the category to which the query belongs, and encoding the category as a target vector;
constructing an optimization problem, and adding a sparse group lasso penalty term into an objective function;
solving the optimization problem by using a gradient descent method to obtain a characteristic weight vector w;
and setting a threshold according to the weight, and selecting the feature exceeding the threshold as the most relevant feature.
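The steps recited in claim 4 can be sketched as follows, purely as an illustration. The sketch assumes a logistic-regression objective with the L1 plus L2-norm penalty, plain (sub)gradient descent, and a toy feature matrix. The function names, learning rate, step count, threshold and data are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_weights(X, y, lambda1, lambda2, lr=0.5, steps=500):
    """Minimize mean log-loss + lambda1*sum|w| + lambda2*||w||_2 by subgradient descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        preds = [sigmoid(sum(wj * xj for wj, xj in zip(w, row))) for row in X]
        norm = math.sqrt(sum(wj * wj for wj in w))
        grad = []
        for j in range(d):
            g = sum((p - t) * row[j] for p, t, row in zip(preds, y, X)) / n
            g += lambda1 * ((w[j] > 0) - (w[j] < 0))  # subgradient of |w_j|
            if norm > 0:
                g += lambda2 * w[j] / norm            # gradient of ||w||_2
            grad.append(g)
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def select_features(w, threshold):
    """Return the indices of features whose absolute weight exceeds the threshold."""
    return [j for j, wj in enumerate(w) if abs(wj) > threshold]


# Feature matrix: each row is a query, each column a feature.
# Column 0: injection keyword present (informative); column 1: uninformative.
X = [[1.0, 1.0], [1.0, -1.0], [0.0, 1.0], [0.0, -1.0]]
y = [1, 1, 0, 0]  # target vector: 1 = malicious, 0 = normal

w = fit_weights(X, y, lambda1=0.01, lambda2=0.01)
selected = select_features(w, threshold=0.1)
```

Plain subgradient descent rarely drives weights exactly to zero, which is one reason a threshold on the weight magnitude is a natural final step.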
5. The method of claim 1, wherein said step S107 comprises:
after each participant completes the local model training, model parameters are sent back to a central server for aggregation;
after receiving the model parameters of each participant, the central server aggregates the model parameters according to a preset aggregation method to obtain an updated global model;
distributing the updated global model to each participant;
repeating the above three steps until reaching the iteration stop condition.
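The aggregation loop recited in claim 5 can be illustrated with the sketch below. It assumes that the preset aggregation method is an unweighted average of the participants' parameters and that the stop condition is either a maximum number of rounds or a sufficiently small change in the global model. The `local_train` function is a hypothetical stand-in for each participant's local deep learning training.

```python
def local_train(global_params, local_target):
    """Stand-in for local training: move each parameter halfway toward a local optimum."""
    return [g + 0.5 * (t - g) for g, t in zip(global_params, local_target)]


def aggregate(updates):
    """Central server: unweighted average of the participants' parameters (assumed)."""
    n = len(updates)
    return [sum(column) / n for column in zip(*updates)]


def federated_training(initial_params, local_targets, max_rounds=100, tol=1e-6):
    """Repeat local training, aggregation and redistribution until a stop condition."""
    global_params = list(initial_params)
    rounds_run = 0
    for _ in range(max_rounds):
        rounds_run += 1
        # Each participant trains locally and sends its parameters to the server.
        updates = [local_train(global_params, t) for t in local_targets]
        # The server aggregates the parameters into an updated global model.
        new_global = aggregate(updates)
        change = max(abs(a - b) for a, b in zip(new_global, global_params))
        # The updated global model is distributed for the next round.
        global_params = new_global
        if change < tol:
            break
    return global_params, rounds_run


final_params, rounds_run = federated_training([0.0], [[1.0], [3.0]])
```

In this toy setting the global parameter converges toward the average of the participants' local optima.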
6. An sql injection detection system based on deep learning, the system comprising:
the acquisition module is used for acquiring a data set of the sql query statement, wherein the data set comprises a normal query statement and a malicious sql query statement;
the feature subset extraction module is used for extracting features related to the sql injection attack for each sql query statement and selecting the most relevant feature subset by using a sparse group lasso method;
the model construction module is used for constructing a local deep learning model by each participant based on the selected feature subset, wherein the parameters of the respective deep learning model are regularized by using a sparse group lasso method;
the parameter aggregation module is used for aggregating model parameters of all the participants in a federal learning mode;
the model training module is used for training the respective deep learning model by each participant;
the detection module is used for detecting a new sql query statement based on a trained deep learning model;
the central server is used for assisting the modules to update parameters of the deep learning model of each participant;
the local deep learning model is an initialized global model, and is distributed to all participants by a central server before federal learning;
the model construction module is specifically used for:
distributing the selected feature subset to each participant;
each participant uses the distributed feature subsets to construct a local deep learning model;
defining a cross entropy loss function for the local model of each participant;
each participant uses a local training set to train the local model, and a loss function with sparse group lasso penalty term is optimized;
wherein optimizing the loss function with sparse group lasso penalty term comprises:
adding a sparse group lasso penalty term to a local loss function, regularizing model parameters, wherein the formula is as follows:
$$L_{\text{total}} = L_{\text{local}} + \lambda_1 \sum_{i} |w_i| + \lambda_2 \|w\|_2;$$
wherein $L_{\text{local}}$ represents the local loss function; $\sum_{i} |w_i|$ represents summing the absolute values of the weights $w_i$ of the model parameters and applying L1 regularization; $\|w\|_2$ represents the L2 norm of the model parameters, used to measure the magnitude of the model weights; and $\lambda_1$ and $\lambda_2$ are hyperparameters that control the L1 and L2 regularization strength and are adjusted as required.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311678372.5A CN117370975B (en) | 2023-12-08 | 2023-12-08 | Sql injection detection method and system based on deep learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117370975A (en) | 2024-01-09 |
CN117370975B (en) | 2024-03-26 |
Family
ID=89395075
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311678372.5A Active CN117370975B (en) | 2023-12-08 | 2023-12-08 | Sql injection detection method and system based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117370975B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107909077A (en) * | 2017-10-10 | 2018-04-13 | 安徽信息工程学院 | Feature selection approach based on rarefaction theory in the case of semi-supervised |
CN113946869A (en) * | 2021-11-02 | 2022-01-18 | 深圳致星科技有限公司 | Internal security attack detection method and device for federal learning and privacy calculation |
CN117176482A (en) * | 2023-11-03 | 2023-12-05 | 国任财产保险股份有限公司 | Big data network safety protection method and system |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP7172816B2 (en) * | 2019-04-11 | 2022-11-16 | 日本電信電話株式会社 | Data analysis device, data analysis method and data analysis program |
Non-Patent Citations (2)
Title |
---|
False Data Injection Attacks in Smart Grids:State of the Art and Way Forward;Muhammad Irfan et al.;《arXiv:2308.10268v1》;第1-15 * |
新的基于Laplacian的特征选择方法;钱晓亮;左开中;接标;;《计算机工程与应用》(第15期);第83-86+104页 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||