CN107659555B - Network attack detection method and device, terminal equipment and computer storage medium - Google Patents

Publication number
CN107659555B
Authority
CN
China
Prior art keywords
target language
analysis
score
sub
lexical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710656777.7A
Other languages
Chinese (zh)
Other versions
CN107659555A (en
Inventor
刘超
朱文雷
李昌志
吴雷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Pulsar Technology Co Ltd
Original Assignee
Beijing Changting Future Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Changting Future Technology Co ltd
Priority to PCT/CN2017/099556 (WO2018041114A1)
Publication of CN107659555A
Application granted
Publication of CN107659555B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1433 Vulnerability analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441 Countermeasures against malicious traffic
    • H04L63/1466 Active attacks involving interception, injection, modification, spoofing of data unit addresses, e.g. hijacking, packet injection or TCP sequence number attacks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Virology (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

Embodiments of the invention provide a network attack detection method and device, terminal equipment and a computer storage medium, and relate to the technical field of network security. The network attack detection method comprises: determining a target language from the request data according to the type of the target language; performing lexical analysis, syntactic analysis and semantic analysis on the target language; and determining the risk level of the request data according to the results of the lexical analysis, the syntactic analysis and the semantic analysis. Because the target language is determined from the request data according to its type, the extraction step can handle different types of target language and thus serve different detection purposes, which improves the compatibility of network attack detection.

Description

Network attack detection method and device, terminal equipment and computer storage medium
Technical Field
The present invention relates to the field of network security technologies, and in particular, to a method and an apparatus for detecting a network attack, a terminal device, and a computer storage medium.
Background
In recent years, vulnerability attack techniques targeting Web applications have developed rapidly, diversified and evolved, so building an effective defense against different vulnerability attacks is a serious challenge.
Common vulnerability attacks include, for example, SQL (Structured Query Language) injection attacks and XSS (Cross-Site Scripting) attacks.
In an SQL injection attack, a malicious SQL command is inserted into a Web form or a page request so that the server is ultimately tricked into executing it.
In an XSS attack, an attacker inserts malicious script code into a Web page; when a user browses the page, the embedded script code is executed, and the user is thereby attacked.
However, existing attack defense methods usually target only one type of vulnerability attack and therefore often suffer from low compatibility.
Disclosure of Invention
The embodiment of the invention provides a network attack detection method and device, terminal equipment and a computer storage medium, which are used for solving the technical problems in the prior art.
In a first aspect, an embodiment of the present invention provides a method for detecting a network attack.
Specifically, the method comprises the following steps:
determining a target language from the request data according to the type of the target language;
performing lexical analysis, syntactic analysis and semantic analysis on the target language;
and determining the risk level of the request data according to the results of the lexical analysis, the syntactic analysis and the semantic analysis.
As can be seen from the foregoing, different vulnerability attacks tend to rely on different types of computer languages. In the invention, the target language is determined from the request data according to the type of the target language, so the extraction step can handle different types of target language and serve different detection purposes, which improves the compatibility of network attack detection.
With reference to the first aspect, in some embodiments of the invention, determining the target language from the request data according to the type of the target language comprises:
if the type of the target language is a script language, determining the target language from the payload data of the request data through a first automaton, wherein the first automaton is constructed by using the following model: a model built for a wrapper layer outside the target language, the model comprising: the position of occurrence, the form of occurrence and the form of encoding of the target language.
To more effectively determine the target language in the request data, with reference to the first aspect, in some embodiments of the present invention, determining the target language from the request data according to the type of the target language further includes:
identifying whether the target language is encoded;
and if the target language is coded, decoding the target language.
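The identify-then-decode step can be sketched as follows. This is a minimal illustration, assuming that the encodings of interest are URL percent-encoding and HTML entities; the patent does not enumerate encodings, and the function names `looks_encoded` and `decode_target` are hypothetical.

```python
import html
from urllib.parse import unquote


def looks_encoded(text: str) -> bool:
    """Heuristic: the text is treated as encoded if decoding would change it."""
    return unquote(text) != text or html.unescape(text) != text


def decode_target(text: str, max_rounds: int = 5) -> str:
    """Undo URL and HTML-entity encoding repeatedly until the text is stable.

    max_rounds bounds the loop so that nested encodings cannot cause
    unbounded work (an assumption of this sketch, not of the source).
    """
    for _ in range(max_rounds):
        if not looks_encoded(text):
            break
        text = html.unescape(unquote(text))
    return text
```

Decoding in rounds matters because attackers often encode a payload more than once; a single pass would leave a double-encoded `<script>` tag unrecognized.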
With reference to the first aspect, in some embodiments of the invention, determining the target language from the request data according to the type of the target language comprises:
and if the type of the target language is the structured query language, determining the payload data of the request data as the target language.
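The type-dependent determination described above can be sketched as a simple dispatch. This is an assumption-laden illustration: regular expressions stand in for the "first automaton", the wrapper layer is assumed to be HTML, and the names `WRAPPER_PATTERNS` and `determine_target_language` are hypothetical.

```python
import re
from typing import List

# Hypothetical model of the wrapper layer: places where script code can
# appear inside HTML (position and form of occurrence).
WRAPPER_PATTERNS = [
    re.compile(r"<script[^>]*>(.*?)</script\s*>", re.I | re.S),  # script element body
    re.compile(r"\bon\w+\s*=\s*([\"'])(.*?)\1", re.I | re.S),    # event-handler attribute
    re.compile(r"javascript:([^\"'>\s]+)", re.I),                # javascript: URL
]


def determine_target_language(payload: str, language_type: str) -> List[str]:
    """Return the target-language fragments found in the payload."""
    if language_type == "sql":
        # For the structured query language the payload itself is the target.
        return [payload]
    if language_type == "script":
        fragments = []
        for pattern in WRAPPER_PATTERNS:
            for match in pattern.finditer(payload):
                # The last capturing group holds the script text in every pattern.
                fragments.append(match.group(match.lastindex))
        return fragments
    raise ValueError(f"unsupported target language type: {language_type}")
```

For example, `determine_target_language('<img src=x onerror="alert(1)">', "script")` yields the handler body `alert(1)`, while an SQL payload is passed through unchanged.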
With reference to the first aspect, in some embodiments of the invention, the method further comprises:
extracting the payload data from the request data.
With reference to the first aspect, in some embodiments of the invention, extracting the payload data from the request data comprises:
parsing a specified header parameter or a request body from the request data;
decoding the header parameter or the request body to obtain the payload data.
With reference to the first aspect, in some embodiments of the invention, the header parameters include a combination of one or more of:
a Request URL parameter (URL stands for Uniform Resource Locator, also called a network address), a Referer parameter, a cookie parameter, and a User-Agent parameter.
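The payload extraction described above can be sketched with the Python standard library. This is a minimal sketch under several assumptions: the headers arrive as a dictionary with canonical names, the request body is form-encoded, and the function name `extract_payloads` is hypothetical.

```python
from typing import Dict, List
from urllib.parse import parse_qsl, unquote_plus, urlsplit

# Header fields named in the text; canonical capitalization is assumed.
HEADER_FIELDS = ("Referer", "Cookie", "User-Agent")


def extract_payloads(request_url: str, headers: Dict[str, str], body: str = "") -> List[str]:
    """Collect decoded payload candidates from the URL, selected headers and body."""
    payloads: List[str] = []
    # Query-string values of the request URL; parse_qsl percent-decodes them.
    payloads += [value for _, value in parse_qsl(urlsplit(request_url).query)]
    # Selected header parameters, decoded as a whole.
    for name in HEADER_FIELDS:
        if name in headers:
            payloads.append(unquote_plus(headers[name]))
    # Form-encoded request body (an assumption; other content types need other parsers).
    if body:
        payloads += [value for _, value in parse_qsl(body)]
    return payloads
```

Each returned string is one candidate payload that the later analysis stages would inspect independently.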
With reference to the first aspect, in some embodiments of the invention, the lexical analysis of the target language includes:
determining lexical elements in the target language;
and analyzing the lexical elements through a finite state automaton to obtain token (marker) sequences of clauses in the target language.
In order to ensure the accuracy of the lexical analysis result, with reference to the first aspect, in some embodiments of the present invention, the performing lexical analysis on the target language further includes:
performing a disambiguation operation on the target language according to a context of the target language.
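A finite state automaton that turns text into a token sequence can be sketched as below. The states, token types and the keyword subset are assumptions chosen for illustration (a small SQL-like vocabulary); context-based disambiguation is not implemented here.

```python
KEYWORDS = {"select", "union", "from", "where", "or", "and"}  # illustrative subset


def tokenize(text: str) -> list:
    """Scan characters with states START, WORD, NUMBER and STRING."""
    tokens, state, buf = [], "START", ""

    def flush():
        # Emit the token collected so far and return to the START state.
        nonlocal buf, state
        if state == "WORD":
            kind = "KEYWORD" if buf.lower() in KEYWORDS else "IDENT"
            tokens.append((kind, buf))
        elif state == "NUMBER":
            tokens.append(("NUMBER", buf))
        elif state == "STRING":
            tokens.append(("STRING", buf))  # unterminated string at end of input
        buf, state = "", "START"

    for ch in text:
        if state == "STRING":
            if ch == "'":
                flush()        # closing quote ends the string token
            else:
                buf += ch
            continue
        if state == "WORD" and (ch.isalnum() or ch == "_"):
            buf += ch
            continue
        if state == "NUMBER" and ch.isdigit():
            buf += ch
            continue
        flush()                # current character ends any pending token
        if ch.isspace():
            continue
        if ch.isalpha() or ch == "_":
            state, buf = "WORD", ch
        elif ch.isdigit():
            state, buf = "NUMBER", ch
        elif ch == "'":
            state = "STRING"
        else:
            tokens.append(("OP", ch))
    flush()
    return tokens
```

The sequence of token types, for instance `NUMBER KEYWORD KEYWORD IDENT KEYWORD IDENT` for `1 UNION SELECT password FROM users`, is what the later scoring step would count.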
With reference to the first aspect, in some embodiments of the invention, performing syntactic analysis on the target language comprises:
and inputting the token sequence into a second automaton to obtain a syntax analysis result of the clause in the target language, wherein the second automaton is generated according to the syntax standard of the target language.
With reference to the first aspect, in some embodiments of the invention, the method further comprises:
generating a BNF (Backus-Naur Form) file according to the grammatical standard of the target language;
and generating the second automaton according to the BNF file.
Since the second automaton for performing the syntax analysis is generated from the BNF file in the present invention, it is possible to make the result of the syntax analysis more accurate and to improve the execution speed of the syntax analysis.
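The idea of deriving a recognizer from a BNF description can be sketched as follows. Note the substitution: instead of generating an automaton, this sketch interprets the grammar rules directly with a small backtracking recognizer, which is simpler but slower than a generated parser. The toy grammar, the token-type names and the function names are all hypothetical, and the grammar must not be left-recursive.

```python
BNF = """
<injection> ::= <value> <tail>
<tail>      ::= KEYWORD_UNION KEYWORD_SELECT <columns> | KEYWORD_OR <value> OP_EQ <value>
<columns>   ::= IDENT | IDENT OP_COMMA <columns>
<value>     ::= NUMBER | STRING | IDENT
"""


def load_grammar(bnf: str) -> dict:
    """Parse 'head ::= alt | alt' lines into {head: [[symbol, ...], ...]}."""
    grammar = {}
    for line in bnf.strip().splitlines():
        head, body = line.split("::=")
        grammar[head.strip()] = [alt.split() for alt in body.split("|")]
    return grammar


def ends(grammar: dict, symbol: str, tokens: list, pos: int) -> set:
    """Return every position at which `symbol` can end when it starts at `pos`."""
    if symbol not in grammar:  # terminal: must equal the token type at pos
        return {pos + 1} if pos < len(tokens) and tokens[pos] == symbol else set()
    results = set()
    for alternative in grammar[symbol]:
        positions = {pos}
        for part in alternative:
            positions = {e for p in positions for e in ends(grammar, part, tokens, p)}
        results |= positions
    return results


def parse_result(grammar: dict, start: str, tokens: list) -> int:
    """Return 1 if the whole token-type sequence derives from `start`, else 0."""
    return 1 if len(tokens) in ends(grammar, start, tokens, 0) else 0
```

The 0-or-1 return value mirrors the per-clause syntactic analysis result used in the scoring step; a production system would more likely feed the BNF file to a parser generator.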
With reference to the first aspect, in some embodiments of the invention, the semantic analyzing the target language comprises:
and identifying a key function call body and a key feature substructure from the target language.
With reference to the first aspect, in some embodiments of the invention, identifying a key function call body and a key feature substructure from the target language comprises:
and identifying a key function calling body and a key feature substructure from the target language by adopting a bottom-up reduction mode.
Because the semantic analysis is carried out in a bottom-up reduction mode, the automatic layer-by-layer analysis of the semantics can be realized, and the semantic structure of the language can be accurately identified.
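Bottom-up reduction can be illustrated with a small shift-reduce loop that collapses each `name ( args )` group into a call node, innermost first, and counts calls to watched functions. The watch list `KEY_FUNCTIONS` and the flat string-token input are assumptions of this sketch, not details from the patent.

```python
from collections import Counter

KEY_FUNCTIONS = {"sleep", "load_file", "benchmark"}  # hypothetical watch list


def find_key_calls(tokens: list) -> Counter:
    """Shift tokens onto a stack; on ')' reduce 'name ( args )' to one node."""
    stack, hits = [], Counter()
    for tok in tokens:
        if tok != ")":
            stack.append(tok)  # shift
            continue
        # Reduce: pop arguments back to the matching '('.
        args = []
        while stack and stack[-1] != "(":
            args.append(stack.pop())
        if not stack:
            # Unbalanced ')': restore what was popped and move on.
            stack.extend(reversed(args))
            continue
        stack.pop()  # discard '('
        top = stack[-1] if stack else None
        if isinstance(top, str) and top.isidentifier():
            name = stack.pop()
            if name.lower() in KEY_FUNCTIONS:
                hits[name.lower()] += 1
            stack.append(("CALL", name, args[::-1]))
        else:
            stack.append(("GROUP", args[::-1]))
    return hits
```

Because inner parentheses close first, a nested call such as `sleep(5)` inside `if(...)` is reduced and counted before the enclosing call is formed, which is the layer-by-layer behaviour the text describes.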
With reference to the first aspect, in some embodiments of the invention, determining the risk level of the requested data from the results of the lexical analysis, the syntactic analysis, and the semantic analysis includes:
calculating a comprehensive score of the results of the lexical analysis, the syntactic analysis and the semantic analysis;
comparing the comprehensive score with a set threshold range;
determining a risk level of the requested data according to a result of the comparison.
According to the method and the device, the comprehensive results of lexical analysis, syntactic analysis and semantic analysis of the target language are quantized to generate the corresponding comprehensive score, and the comprehensive score is compared with the set threshold range, so that the judgment process of the risk level is more convenient.
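Comparing the composite score with set threshold ranges can be sketched as a lookup over bands. The band boundaries and level names below are illustrative assumptions; the patent does not give concrete values.

```python
# Assumed threshold ranges: (inclusive lower bound, exclusive upper bound, level).
RISK_BANDS = [
    (0.0, 3.0, "low"),
    (3.0, 7.0, "medium"),
    (7.0, float("inf"), "high"),
]


def risk_level(score: float) -> str:
    """Map a composite score to the risk level of the band that contains it."""
    for lower, upper, level in RISK_BANDS:
        if lower <= score < upper:
            return level
    raise ValueError(f"score {score} is outside all configured ranges")
```

Keeping the ranges in a table makes the thresholds a matter of configuration rather than code.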
With reference to the first aspect, in some embodiments of the invention, calculating a composite score of the results of the lexical analysis, the syntactic analysis, and the semantic analysis includes:
calculating a first sub-score, a second sub-score and a third sub-score of the target language according to the results of the lexical analysis, the syntactic analysis and the semantic analysis respectively;
respectively weighting the first sub-score, the second sub-score and the third sub-score;
calculating the composite score based on the weighted first sub-score, second sub-score, and third sub-score.
In the invention, the comprehensive score of the target language is calculated according to the weighted lexical analysis result, the syntactic analysis result and the semantic analysis result, so that the importance degrees of different analysis results can be reflected by different weight parameters, and the accuracy of the comprehensive score is improved.
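The weighting of the three sub-scores can be sketched as follows, assuming that each sub-score is a weighted sum of its analysis results and that the composite score is a weighted sum of the sub-scores. The function and parameter names are hypothetical.

```python
def weighted_sum(values, weights):
    """Sum of value * weight over paired entries."""
    return sum(v * w for v, w in zip(values, weights))


def composite_score(t, wt, s, ws, m, wm, c_t=1.0, c_s=1.0, c_m=1.0):
    """Combine lexical, syntactic and semantic sub-scores.

    t, wt: occurrence counts and weights of token sequences (lexical)
    s, ws: 0/1 syntactic results per clause and their weights
    m, wm: occurrence counts and weights of key calls or substructures
    c_t, c_s, c_m: weights of the three sub-scores in the composite score
    """
    first = weighted_sum(t, wt)    # lexical sub-score
    second = weighted_sum(s, ws)   # syntactic sub-score
    third = weighted_sum(m, wm)    # semantic sub-score
    return c_t * first + c_s * second + c_m * third
```

Raising `c_m` relative to `c_t`, for example, makes a single dangerous function call count for more than several suspicious tokens.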
With reference to the first aspect, in some embodiments of the invention, calculating the first sub-score of the target language from the result of the lexical analysis comprises:
and calculating the first sub-score according to the occurrence times and the weight parameters of the token sequence.
With reference to the first aspect, in some embodiments of the invention, calculating the second sub-score of the target language according to the result of the syntactic analysis comprises:
calculating the second sub-score according to the syntactic analysis result and the weight parameter of the syntactic analysis result.
With reference to the first aspect, in some embodiments of the invention, calculating the third sub-score of the target language according to the result of the semantic analysis includes:
and calculating the third sub-score according to the occurrence frequency and the weight parameter of the key function calling body or the key feature substructure.
With reference to the first aspect, in some embodiments of the invention, the combined score of the results of the lexical analysis, the syntactic analysis and the semantic analysis is calculated according to the following formula:
[Formula: Figure BDA0001369486000000041]
in the above formula:
Score(payload) is the composite score;
$t_i$ is the number of occurrences of the $i$-th token sequence obtained by the lexical analysis;
$w_{t_i}$ is the weight parameter of the $i$-th token sequence;
$s_j$ is the syntactic analysis result of the $j$-th clause obtained by the syntactic analysis, and is 0 or 1;
$w_{s_j}$ is the weight parameter of the $j$-th clause;
$m_k$ is the number of occurrences of the $k$-th key function call body or key feature substructure obtained through the semantic analysis;
$w_{m_k}$ is the weight parameter of the $k$-th key function call body or key feature substructure;
$C_t$, $C_s$ and $C_m$ are the weight parameters of the lexical analysis, the syntactic analysis and the semantic analysis, respectively, in the composite score.
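The formula itself is an image that is not reproduced in this text. One form that is consistent with the symbol definitions above, assuming each sub-score is a plain weighted sum and the three sub-scores are combined linearly, would be:

```latex
\mathrm{Score}(\mathit{payload})
  = C_t \sum_{i} t_i \, w_{t_i}
  + C_s \sum_{j} s_j \, w_{s_j}
  + C_m \sum_{k} m_k \, w_{m_k}
```

This is a reconstruction under that assumption, not a transcription of the original figure.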
To further improve the accuracy of the composite score, in some embodiments of the invention in combination with the first aspect, the method further comprises:
the weight parameters are optimized by machine learning.
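The text does not say which learning method is used. As one possible illustration, the sketch below fits per-feature weights with logistic regression trained by batch gradient descent; the choice of model, the feature layout and the tiny labelled data set are all assumptions made for this example.

```python
import math


def train_weights(samples, labels, epochs=500, lr=0.1):
    """Fit feature weights and a bias by gradient descent on the logistic loss."""
    n = len(samples[0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for x, y in zip(samples, labels):
            z = bias + sum(w * v for w, v in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))   # predicted attack probability
            for i, v in enumerate(x):
                grad_w[i] += (p - y) * v
            grad_b += p - y
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(samples)
    return weights, bias


def predict(weights, bias, x):
    """Probability that feature vector x belongs to an attack."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical features per request:
# [count of UNION tokens, count of sleep() calls, count of plain identifiers]
SAMPLES = [[2, 1, 1], [1, 0, 2], [0, 2, 1], [0, 0, 3], [0, 0, 1], [0, 0, 2]]
LABELS = [1, 1, 1, 0, 0, 0]   # 1 = attack, 0 = benign

WEIGHTS, BIAS = train_weights(SAMPLES, LABELS)
```

The learned entries of `WEIGHTS` play the role of the per-item weight parameters: features that appear only in attack samples end up with positive weights, so requests containing them score higher.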
In a second aspect, the embodiment of the present invention provides a device for detecting a network attack.
Specifically, the apparatus comprises:
the target language determining module is used for determining the target language from the request data according to the type of the target language;
an analysis module comprising: the lexical analysis unit is used for carrying out lexical analysis on the target language, the syntactic analysis unit is used for carrying out syntactic analysis on the target language, and the semantic analysis unit is used for carrying out semantic analysis on the target language;
and the risk level determining module is used for determining the risk level of the request data according to the results of the lexical analysis, the syntactic analysis and the semantic analysis.
According to the invention, the target language is determined from the request data according to the type of the target language, so the extraction step can handle different types of target language and serve different detection purposes, which improves the compatibility of network attack detection.
With reference to the second aspect, in some embodiments of the invention, the target language determination module comprises:
a determining unit, configured to determine a target language from payload data of the request data through a first automaton in a case where a type of the target language is a script language, wherein the first automaton is constructed using the following model: a model built for a wrapper layer outside the target language, the model comprising: the position of occurrence, the form of occurrence and the form of encoding of the target language.
To determine the target language in the request data more effectively, in some embodiments of the present invention, with reference to the second aspect, the target language determination module further includes:
an identifying unit for identifying whether the target language is encoded;
and the target language decoding unit is used for decoding the target language under the condition that the target language is coded.
With reference to the second aspect, in some embodiments of the invention, the target language determination module is configured to determine the target language from the request data according to the type of the target language by: in a case where the type of the target language is a structured query language, determining payload data of the request data as the target language.
With reference to the second aspect, in some embodiments of the invention, the apparatus further comprises:
an extraction module, configured to extract the payload data from the request data.
With reference to the second aspect, in some embodiments of the invention, the extraction module comprises:
a parsing unit, configured to parse the specified header parameter or the request body from the request data;
a header parameter or request body decoding unit, configured to decode the header parameter or request body to obtain the payload data.
With reference to the second aspect, in some embodiments of the invention, the header parameters include a combination of one or more of: a Request URL parameter, a Referer parameter, a cookie parameter, and a User-Agent parameter.
With reference to the second aspect, in some embodiments of the invention, the lexical analysis unit includes:
a determining component for determining lexical elements in the target language;
and the analysis component is used for analyzing the lexical elements through a finite state automaton to obtain a token sequence of the clauses in the target language.
In order to ensure the accuracy of the lexical analysis result, in combination with the second aspect, in some embodiments of the present invention, the lexical analysis unit further includes:
a disambiguation component to perform a disambiguation operation on the target language according to a context of the target language.
With reference to the second aspect, in some embodiments of the invention, the syntactic analysis unit is configured to perform syntactic analysis on the target language by: inputting the token sequence into a second automaton to obtain a syntactic analysis result of the clause in the target language, wherein the second automaton is generated according to the syntax standard of the target language.
With reference to the second aspect, in some embodiments of the invention, the apparatus further comprises:
the file generation module is used for generating a BNF file according to the grammar standard of the target language;
and the automaton generating module is used for generating the second automaton according to the BNF file.
Since the second automaton for performing the syntax analysis is generated from the BNF file in the present invention, it is possible to make the result of the syntax analysis more accurate and to improve the execution speed of the syntax analysis.
With reference to the second aspect, in some embodiments of the present invention, the semantic analysis unit is configured to perform semantic analysis on the target language by: and identifying a key function call body and a key feature substructure from the target language.
With reference to the second aspect, in some embodiments of the present invention, the semantic analysis unit is configured to identify a key function call body and a key feature substructure from the target language by: and identifying a key function calling body and a key feature substructure from the target language by adopting a bottom-up reduction mode.
Because the semantic analysis is carried out in a bottom-up reduction mode, the automatic layer-by-layer analysis of the semantics can be realized, and the semantic structure of the language can be accurately identified.
With reference to the second aspect, in some embodiments of the invention, the risk level determination module comprises:
the calculation submodule is used for calculating the comprehensive score of the results of the lexical analysis, the syntactic analysis and the semantic analysis;
the comparison submodule is used for comparing the comprehensive score with a set threshold range;
and the determining submodule is used for determining the risk level of the request data according to the comparison result.
According to the method and the device, the comprehensive results of lexical analysis, syntactic analysis and semantic analysis of the target language are quantized to generate the corresponding comprehensive score, and the comprehensive score is compared with the set threshold range, so that the judgment process of the risk level is more convenient.
With reference to the second aspect, in some embodiments of the invention, the computation submodule includes:
a sub-score calculating unit comprising: a first calculation component for calculating a first sub-score of the target language based on results of the lexical analysis, a second calculation component for calculating a second sub-score of the target language based on results of the syntactic analysis, and a third calculation component for calculating a third sub-score of the target language based on results of the semantic analysis;
the weighting unit is used for respectively weighting the first sub-score, the second sub-score and the third sub-score;
and the comprehensive score calculating unit is used for calculating the comprehensive score according to the weighted first sub-score, the weighted second sub-score and the weighted third sub-score.
In the invention, the comprehensive score of the target language is calculated according to the weighted lexical analysis result, the syntactic analysis result and the semantic analysis result, so that the importance degrees of different analysis results can be reflected by different weight parameters, and the accuracy of the comprehensive score is improved.
With reference to the second aspect, in some embodiments of the invention, the first calculation component is configured to calculate the first sub-score of the target language from the result of the lexical analysis by: and calculating the first sub-score according to the occurrence times and the weight parameters of the token sequence.
With reference to the second aspect, in some embodiments of the invention, the second calculation component is configured to calculate the second sub-score of the target language from the result of the syntactic analysis by: calculating the second sub-score according to the syntactic analysis result and the weight parameter of the syntactic analysis result.
With reference to the second aspect, in some embodiments of the invention, the third computing component is configured to implement the computing of the third sub-score of the target language from the result of the semantic analysis by: and calculating the third sub-score according to the occurrence frequency and the weight parameter of the key function calling body or the key feature substructure.
With reference to the second aspect, in some embodiments of the invention, the calculation sub-module is configured to calculate a combined score of the results of the lexical analysis, the syntactic analysis and the semantic analysis according to the following formula:
[Formula: Figure BDA0001369486000000081]
in the above formula:
Score(payload) is the composite score;
$t_i$ is the number of occurrences of the $i$-th token sequence obtained by the lexical analysis;
$w_{t_i}$ is the weight parameter of the $i$-th token sequence;
$s_j$ is the syntactic analysis result of the $j$-th clause obtained by the syntactic analysis, and is 0 or 1;
$w_{s_j}$ is the weight parameter of the $j$-th clause;
$m_k$ is the number of occurrences of the $k$-th key function call body or key feature substructure obtained through the semantic analysis;
$w_{m_k}$ is the weight parameter of the $k$-th key function call body or key feature substructure;
$C_t$, $C_s$ and $C_m$ are the weight parameters of the lexical analysis, the syntactic analysis and the semantic analysis, respectively, in the composite score.
To further improve the accuracy of the composite score, in combination with the second aspect, in some embodiments of the invention, the apparatus further comprises:
an optimization module to optimize the weight parameters through machine learning.
In a third aspect, the embodiment of the invention provides a terminal device.
The terminal equipment comprises a memory and a processor; wherein,
the memory is used for storing one or more computer instructions, wherein the one or more computer instructions can realize the detection method of any one of the above network attacks when being executed by the processor.
In a fourth aspect, embodiments of the present invention provide a computer storage medium.
The computer storage medium is used for storing one or more computer instructions, wherein the one or more computer instructions can realize the detection method of any network attack when being executed.
According to the invention, the target language is determined from the request data according to the type of the target language, so the extraction step can handle different types of target language and serve different detection purposes, which improves the compatibility of network attack detection.
These and other aspects of the invention will be more readily apparent from the following description of the embodiments.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a network attack detection method according to method embodiment 1 of the present invention;
FIG. 2 illustrates one embodiment of the process S2 shown in FIG. 1;
FIG. 3 illustrates another embodiment of the process S2 shown in FIG. 1;
FIG. 4 illustrates one embodiment of the process S5 shown in FIG. 1;
FIG. 5 illustrates one embodiment of the process S51 shown in FIG. 4;
FIG. 6 is a schematic structural diagram of a network attack detection apparatus according to embodiment 1 of the present invention;
FIG. 7 is a schematic structural diagram of a network attack detection apparatus according to embodiment 5 of the present invention;
FIG. 8 illustrates one embodiment of the extraction module 4' shown in FIG. 7;
FIG. 9 illustrates one embodiment of the lexical analysis unit 21 shown in FIG. 6;
FIG. 10 illustrates one embodiment of the risk level determination module 3 shown in FIG. 6;
FIG. 11 illustrates one embodiment of the calculation submodule 31 illustrated in FIG. 10.
Detailed Description
Various aspects of the invention are described in detail below with reference to the figures and the detailed description. Well-known processes, program modules, elements and their interconnections, links, communications or operations, among others, are not shown or described in detail herein in various embodiments of the invention.
Furthermore, the described features, architectures, or functions can be combined in any manner in one or more implementations.
Furthermore, it should be understood by those skilled in the art that the following embodiments are illustrative only and are not intended to limit the scope of the present invention. Those of skill would further appreciate that the program modules, elements, or steps of the various embodiments described herein and illustrated in the figures may be combined and designed in a wide variety of different configurations.
Technical terms not specifically described in the present specification should be construed in the broadest sense in the art unless otherwise specifically indicated.
Some of the flows described in this specification, the claims and the above figures include a number of operations that appear in a particular order. It should be clearly understood, however, that these operations may be performed out of the order in which they appear herein, or in parallel. Operation numbers such as 101 and 102 merely distinguish the operations from one another and do not by themselves represent any order of execution. Additionally, the flows may include more or fewer operations, and the operations may be performed sequentially or in parallel. It should be noted that descriptions such as "first" and "second" in this document are used to distinguish different messages, devices, modules and the like; they do not represent a sequential order, nor do they require that the "first" and "second" items be of different types.
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without inventive effort based on the embodiments of the present invention, are within the scope of the present invention.
[ METHOD EMBODIMENT 1]
Fig. 1 is a flowchart of a network attack detection method according to embodiment 1 of the method of the present invention. Referring to fig. 1, in the present embodiment, the method includes:
S1: determining the target language from the request data according to the type of the target language.
S2: performing lexical analysis on the target language.
S3: performing syntactic analysis on the target language.
S4: performing semantic analysis on the target language.
S5: determining the risk level of the request data according to the results of the lexical analysis, the syntactic analysis and the semantic analysis.
To address the problems in the prior art, the invention provides a detection method suitable for different types of network attacks (such as SQL injection attacks, XSS attacks, PHP (Hypertext Preprocessor) code injection attacks and the like).
Among them, an SQL injection attack uses the structured query language (SQL commands), while XSS attacks and PHP code injection attacks use scripting languages (e.g., JavaScript (an interpreted scripting language), VBScript (a lightweight interpreted language in the Microsoft environment), PHP, etc.).
It follows that different types of target languages need to be analyzed for different types of network attacks. Accordingly, in the invention, the target language is determined from the request data according to the type of the target language, so that the extraction of different types of target languages can be handled to suit different detection purposes, which further improves the compatibility of network attack detection.
Specifically, in the present invention, if the type of the target language is a structured query language, it is determined that payload data of the request data is the target language; if the type of the target language is a script language, the target language embedded in the payload data needs to be determined. Determining the target language from the payload data may be achieved, for example, by:
determining the target language from payload data of the request data by a first automaton, wherein the first automaton is constructed using the following model: a model built for a wrapper layer outside the target language, the model comprising: the position of occurrence, the form of occurrence and the form of encoding of the target language.
Wherein the first automaton is, for example, a push-down automaton or a finite state automaton.
In order to more effectively determine the target language in the request data, in the present invention, the implementation process of determining the target language from the payload data may further include the following processing:
identifying whether the target language is encoded (e.g., HTML (HyperText Markup Language) entity encoding may be present in JavaScript); and if the target language is encoded, decoding the target language. That is, if the target language determined from the payload data is identified as an encoded result, the corresponding decoding operation is performed, and the list of the target language can be obtained after decoding.
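The identify-then-decode step described above can be sketched as follows. This is a minimal illustration that assumes HTML entity encoding is the only encoding of interest; the function name `decode_target_language`, the entity pattern and the round limit are assumptions made for this example and do not come from the specification.

```python
import html
import re

# Assumed pattern for spotting HTML character references such as &#97; or &lt;
_ENTITY = re.compile(r"&(#\d+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def decode_target_language(fragment: str, max_rounds: int = 3) -> str:
    """Decode HTML entity encoding, repeating a few times for nested encoding."""
    for _ in range(max_rounds):
        if not _ENTITY.search(fragment):
            break  # nothing that looks encoded is left
        decoded = html.unescape(fragment)
        if decoded == fragment:
            break  # looked like an entity but was not a known one
        fragment = decoded
    return fragment


print(decode_target_language("&#97;lert&#40;1&#41;"))
```

Repeating the decoding a bounded number of times is one possible way to handle payloads that were encoded more than once; other encodings would need their own identification and decoding steps.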
In addition, the network attack detection method provided by the invention abandons the traditional rule-based detection mode. It defines a model of the target language for each vulnerability, so no rules have to be maintained manually, and it can intelligently extract the target language from the request data. It then performs lexical analysis, syntactic analysis and semantic analysis on the target language and integrates the results of the three analyses to identify the target language in depth and judge the risk level of the request data, which further improves the accuracy of network attack detection.
Meanwhile, because the method does not rely on rules for detection, it does not suffer from the slow execution caused by accumulating rules.
In specific implementation, the network attack detection method provided by the invention can be applied to firewalls of various servers, attack-prevention monitoring software, log audit or entity equipment with protection function combining software and hardware, such as various network equipment and the like.
[ METHOD EMBODIMENT 2 ]
The method provided by this embodiment includes all the contents of method embodiment 1, and is not described herein again. In this embodiment, the method further comprises: extracting the payload data (payload) from the request data. Specifically, a specified header parameter or a request body is parsed from the request data, and the header parameter or request body is decoded to obtain the payload data.
Wherein the header parameters include a combination of one or more of: a Request URL parameter, a Referer parameter, a Cookie parameter, and a User-Agent parameter.
Taking the HTTP (HyperText Transfer Protocol) protocol as an example, HTTP headers generally include a general header, a request header, a response header, and an entity header. Each header field consists of three parts: a field name, a colon (:), and a field value. Among these headers, some header parameters may contain malicious attack information input by the user, such as:
a Request URL parameter in the general header, the parameter representing the requested URL address;
a Referer header field, a Cookie header field and a User-Agent header field in the request header; wherein,
the Referer header field indicates the source of the current request, i.e., the browser tells the Web server from which Web page/URL the address in the current request was obtained or clicked. For example, Referer: http://www.ABC.com.
The Cookie header field is sent by the client and is included in the header of the HTTP request. An example is as follows: Cookie: userId=C5bYpXrimdmsiQmsBPnE1Vn8ZQmdWSm3WRlEB3vRwTnRtW.
The User-Agent header field is used to indicate the identity of the client (i.e., which browser, editor, or other user tool it is).
Of course, the present invention is not limited to the above header parameters, and any header parameters that may contain malicious data may be used.
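As an illustration of parsing the specified header parameters and the request body out of the request data, the following sketch splits a raw HTTP/1.x request by hand. The function name `extract_candidates`, the dictionary keys and the sample request are assumptions made for this example; a production system would more likely rely on the parsing already done by its Web server or proxy.

```python
# Header fields named in the text as possible carriers of attack data.
CANDIDATE_HEADERS = ("referer", "cookie", "user-agent")


def extract_candidates(raw_request: str) -> dict:
    """Return the request URL, the selected header values and the body."""
    head, _, body = raw_request.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Request line: METHOD SP request-target SP HTTP-version
    _method, target, _version = lines[0].split(" ", 2)
    fields = {"request-url": target}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        key = name.strip().lower()
        if key in CANDIDATE_HEADERS:
            fields[key] = value.strip()
    if body:
        fields["body"] = body
    return fields


raw = (
    "GET /item?id=1%27%20or%20%271%27=%271 HTTP/1.1\r\n"
    "Host: www.ABC.com\r\n"
    "Referer: http://www.ABC.com\r\n"
    "User-Agent: demo\r\n"
    "\r\n"
)
print(extract_candidates(raw))
```

Each extracted value is still in its transmitted (possibly encoded) form; decoding is a separate step.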
In Web traffic data, payload data (also referred to as a minimum unencoded request unit) is the portion of data in which information is described. In general, when Web data is transmitted, in order to make data transmission more reliable, original data is transmitted in batches, and certain auxiliary information, such as the size of data volume and check bits, is added at the head and tail of each batch of data. The original of these data is the payload data. For example, the user inputs the name and quantity of the commodity to be purchased in the e-commerce webpage, when sending a purchase request to the server, the name and quantity information of the commodity is not transmitted in the network in an exposed manner, but certain auxiliary information is added and encoded and then transmitted in the network, and the server decodes to extract payload data after receiving the purchase request, wherein the name and quantity information of the commodity to be purchased by the user is the payload data itself or is contained in the payload data.
In general, data to be transmitted is encoded and then transmitted through a network according to the requirements of a transmission protocol, and therefore, in the above processing, header parameters or a request body need to be decoded so as to extract payload data contained therein.
Specifically, when the request data includes information about the encoding that was used, the corresponding decoding operation can be performed directly using that information. However, if no such encoding information exists in the request data, several possible encoding methods have to be guessed intelligently and the corresponding decoding operations tried until decoding succeeds. Only if the header parameters or the request body are successfully decoded can a valid attack code be extracted.
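The "use the declared encoding, otherwise guess" strategy can be sketched as below. The two candidate encodings (URL encoding and Base64), their order, the success criteria and the names `decode_payload` and `DECODERS` are all assumptions for illustration; the specification does not enumerate the encodings it tries.

```python
import base64
import binascii
from urllib.parse import unquote_plus


def _try_url(data):
    """URL-decode; report failure (None) if nothing changed."""
    decoded = unquote_plus(data)
    return decoded if decoded != data else None


def _try_base64(data):
    """Base64-decode; accept only results that are printable UTF-8 text."""
    try:
        text = base64.b64decode(data, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError, ValueError):
        return None
    return text if text and text.isprintable() else None


# Hypothetical registry of decoders, tried in this order when guessing.
DECODERS = {"url": _try_url, "base64": _try_base64}


def decode_payload(data, declared=None):
    """Use the declared encoding if known, otherwise try each decoder in turn."""
    if declared in DECODERS:
        result = DECODERS[declared](data)
        return data if result is None else result
    for decoder in DECODERS.values():
        result = decoder(data)
        if result is not None:
            return result
    return data  # nothing decoded successfully; keep the input unchanged


print(decode_payload("1%27%20or%20%271%27%3D%271"))
print(decode_payload("c2VsZWN0IDE=", declared="base64"))
```

Guessing is inherently heuristic: a short plain string can happen to be valid Base64, so a real detector would need stricter plausibility checks than the printable-text test used here.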
[ METHOD EMBODIMENT 3 ]
The method provided by this embodiment includes all the contents of method embodiment 1 or method embodiment 2, and is not described herein again. As shown in fig. 2, in the present embodiment, the process S2 is implemented by:
S21: determining lexical elements in the target language.
S22: analyzing the lexical elements through a finite state automaton to obtain a token sequence of the clauses in the target language.
[ METHOD EMBODIMENT 4 ]
The method provided by this embodiment includes all the contents of method embodiment 1 or method embodiment 2, and is not described herein again. As shown in fig. 3, in the present embodiment, the process S2 is implemented by:
S21': performing a disambiguation operation on the target language according to a context of the target language.
S22': determining lexical elements in the target language.
S23': analyzing the lexical elements through a finite state automaton to obtain a token sequence of the clauses in the target language.
In the process of determining the lexical elements, the type of a lexical element is inferred from its first non-blank character, and the subsequent characters are processed one by one until a character that does not belong to that type appears, so that the boundary and the type of the lexical element are determined. Correctly determining the boundaries of lexical elements is critical to lexical analysis; if they are handled incorrectly, the correctness of the entire lexical analysis result may be affected.
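The character-by-character scan described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the keyword set, the token names and the function `tokenize` are assumptions, and only numbers, words, single-quoted strings and one-character operators are handled.

```python
# Hypothetical keyword set for a tiny SQL subset.
KEYWORDS = {"select", "from", "where", "union", "or", "and"}


def tokenize(text):
    """Infer each token's type from its first character, then scan to its boundary."""
    tokens, i, n = [], 0, len(text)
    while i < n:
        ch = text[i]
        if ch.isspace():
            i += 1
            continue
        start = i
        if ch.isdigit():
            while i < n and text[i].isdigit():
                i += 1
            tokens.append(("type", "number"))
        elif ch.isalpha() or ch == "_":
            while i < n and (text[i].isalnum() or text[i] == "_"):
                i += 1
            word = text[start:i].lower()
            if word in KEYWORDS:
                tokens.append(("keyword", word))
            else:
                tokens.append(("type", "bareword"))
        elif ch == "'":
            i += 1
            while i < n and text[i] != "'":
                i += 1
            i += 1  # step past the closing quote (or past the end if unterminated)
            tokens.append(("type", "string"))
        else:
            i += 1
            tokens.append(("operator", ch))
    return tokens


for token in tokenize("select 1 from users where password = 'admin'"):
    print(token)
```

Each branch corresponds to one state of a small finite state automaton: the first character selects the state, and the inner loop stays in it until a character outside the token type is seen.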
In order to ensure the accuracy of the lexical analysis result, the following processing is added to the method provided by this embodiment: a disambiguation operation is performed on the target language according to the context of the target language. For example, whether a leading quotation mark has to be supplemented is determined in advance. In the SQL injection example select 1 from users where password = 'admin', no leading quotation mark is needed, so lexical analysis is performed directly, and the correct lexical analysis result is:
<keyword select><type number><keyword from><type bareword><keyword where><type bareword><operator =><type string>
However, in the SQL injection example xxxx' or '1'='1, a leading quotation mark is needed: a single quotation mark has to be added before xxxx' for the lexical analysis to be correct; otherwise, xxxx is recognized as a bare word (bareword).
The correct lexical analysis result for xxxx' or '1'='1 after disambiguation is:
<type string><keyword or><type string><operator =><type string>
If the single quotation mark is not supplemented, the input is erroneously analyzed as:
<type bareword><type string><type number><type string><type number>
As can be seen from the above examples, for a target language that may be ambiguous, the result obtained with the disambiguation operation differs greatly from the result obtained without it; therefore, performing the disambiguation operation on the target language improves the accuracy of the lexical analysis result.
In specific implementation, multiple situations or manners (for convenience, referred to as ambiguity generating situations hereinafter) that may generate ambiguity may be predefined, and when the lexical analysis is performed, whether the set ambiguity generating situation occurs in the target language is identified, and if the set ambiguity generating situation occurs in the target language, the ambiguity is resolved first, and then the boundary of each lexical element in the target language is determined step by step according to the above manners.
Situations where ambiguities may arise include, but are not limited to: in SQL injection, ambiguity among multiple database languages; and, in the lexical analysis of XSS, ambiguity as to whether a forward slash (/) is a division sign or the start of a regular expression. These ambiguities need to be resolved according to the context to finally obtain the correct lexical analysis result. Of course, the possible ambiguity situations listed above are merely exemplary, and the present invention is not limited thereto.
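The leading-quote case discussed above can be sketched as a pre-processing step in front of a small lexer. The heuristic used here (a first single quote that directly follows an alphanumeric character is taken to close a string opened outside the payload) is an assumption made for illustration, as are the names `disambiguate`, `lex` and the keyword set; the specification does not state how the leading quote is detected.

```python
import re

KEYWORDS = {"select", "from", "where", "union", "or", "and"}  # hypothetical subset

# One alternative per token type; an unterminated string runs to the end.
TOKEN = re.compile(
    r"(?P<string>'[^']*'?)|(?P<number>\d+)|(?P<word>[A-Za-z_]\w*)|(?P<operator>\S)"
)


def disambiguate(text):
    """Assumed heuristic: supply the missing leading quote when the first
    quote directly follows an alphanumeric character."""
    pos = text.find("'")
    if pos > 0 and text[pos - 1].isalnum():
        return "'" + text
    return text


def lex(text):
    """Return the token sequence as a list of strings such as 'type string'."""
    out = []
    for match in TOKEN.finditer(text):
        kind = match.lastgroup
        if kind == "word":
            word = match.group("word").lower()
            out.append(f"keyword {word}" if word in KEYWORDS else "type bareword")
        elif kind == "operator":
            out.append(f"operator {match.group('operator')}")
        else:
            out.append(f"type {kind}")
    return out


payload = "xxxx' or '1'='1"
print(lex(payload))                # analysis without disambiguation
print(lex(disambiguate(payload)))  # analysis after disambiguation
```

The heuristic deliberately leaves inputs such as select 1 from users where password = 'admin' untouched, because the first quote there follows a space; it does not cover every case (for example a payload that begins with a quote), and other ambiguity situations would need their own context checks.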
[ METHOD EMBODIMENT 5 ]
The method provided by this embodiment includes all of the contents of any one of method embodiment 1 to method embodiment 4, and is not described herein again. In the present embodiment, the process S3 is realized by:
inputting the token sequence into a second automaton to obtain a syntax analysis result of the clauses in the target language, wherein the second automaton is generated according to the syntax standard of the target language, i.e., the second automaton is adapted to the syntax of the target language.
In the present embodiment, the syntax standard of the target language may be, for example, the grammar defined in the standard of that language.
Furthermore, the second automaton is, for example, a push-down automaton or a finite state automaton.
The target language usually comprises one or more clauses (the clauses are separated by, for example, a semicolon ";"). Accordingly, the token sequence of each clause in the target language can be obtained through lexical analysis; the second automaton is then executed with the token sequences of the clauses as its input, and the syntax analysis result of the clauses in the target language is obtained.
Wherein the syntax analysis result is used to indicate whether a clause meets the syntax standard, for example, "0" or "1" may be used to identify whether a clause meets the syntax standard. Thus, the results of the parsing of different clauses in the target language may constitute an array containing 0's or 1's.
For example, when the text select the best components from the class as the term representation (which contains only one clause) is parsed, the clause does not conform to the corresponding syntax rules, so the second automaton outputs the following result: 0.
However, for ' or '1'='1'; select password from users where 1=1; both clauses conform to the corresponding syntax rules, so the second automaton outputs the result [1,1].
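The per-clause 0/1 output can be illustrated with a toy finite state automaton over token types. The transition table below accepts only a tiny, made-up subset of SELECT statements and is an assumption for this example; the second automaton of the specification would be generated from the full syntax standard of the target language and would accept far more than this sketch does.

```python
VALUE_TOKENS = ("number", "bareword", "string")


def build_transitions():
    """Toy grammar: select VALUE from bareword [where VALUE operator VALUE]."""
    table = {
        ("start", "select"): "column",
        ("column_done", "from"): "table",
        ("table", "bareword"): "table_done",
        ("table_done", "where"): "left",
        ("left_done", "operator"): "right",
    }
    for tok in VALUE_TOKENS:
        table[("column", tok)] = "column_done"
        table[("left", tok)] = "left_done"
        table[("right", tok)] = "right_done"
    return table


TRANSITIONS = build_transitions()
ACCEPTING = {"table_done", "right_done"}


def parse_clause(tokens):
    """Return 1 if the token sequence is accepted by the toy automaton, else 0."""
    state = "start"
    for tok in tokens:
        state = TRANSITIONS.get((state, tok))
        if state is None:
            return 0
    return 1 if state in ACCEPTING else 0


def parse(clauses):
    """One 0/1 entry per clause, in clause order."""
    return [parse_clause(clause) for clause in clauses]


sentence = ["select", "bareword", "bareword", "bareword", "from", "bareword"]
query = ["select", "number", "from", "bareword",
         "where", "bareword", "operator", "string"]
print(parse([sentence]), parse([query, ["select", "bareword", "from", "bareword"]]))
```

A natural-language sentence that merely starts with "select" leaves the automaton without a valid transition after the first value, which is what yields the 0 result for that clause.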
[ METHOD EMBODIMENT 6 ]
The method provided by this embodiment includes the entire contents of method embodiment 5, which are not described herein again. In this embodiment, the method further includes the following processes:
(1) Generating a BNF file according to the syntax standard of the target language. That is, for the grammar defined in the standard of the target language, a BNF file corresponding to that grammar is generated.
(2) Generating the second automaton according to the BNF file.
BNF (Backus-Naur Form) is a way to describe the syntax of a given language using formal notation, and a BNF file is a kind of syntax description file. In particular, the corresponding automaton can be generated from a BNF file with the OpenFst tool (a library for constructing, combining, optimizing, and searching weighted finite-state transducers).
Since the second automaton for performing the syntax analysis is generated from the BNF file in the present invention, it is possible to make the result of the syntax analysis more accurate and to improve the execution speed of the syntax analysis.
[ METHOD EMBODIMENT 7 ]
The method provided by this embodiment includes all of the contents of any one of method embodiment 1 to method embodiment 6, and is not described herein again. In the present embodiment, the process S4 is realized by: identifying key function call bodies and key feature substructures from the target language. For example, the key function call bodies and key feature substructures may be identified from the target language by bottom-up reduction.
By adopting a bottom-up reduction mode, semantic structures such as expressions, key sentences, bracket matching relations, function calling relations and the like in the target language can be identified, so that key function calling bodies and key feature substructures contained in the target language can be identified.
For example, an example of SQL injection is as follows:
union select substr(version(),1,1)from users
During analysis by reduction, when version() is analyzed, it is recognized as a function call. On this basis, reducing further upwards, an expression can be recognized locally: E1 (E1 = version, (, )). This expression E1, together with the 1, 1 that follows it, constitutes the parameters of the substr function, on the basis of which the expression E2 (E2 = substr, (, E1, 1, 1, )) can be obtained by reducing further upwards, and the select clause can then be reduced to: select E2 from users. It can be seen that reduction is a process of identifying substructures and gradually obtaining higher-level structures upwards.
Because the semantic analysis is performed by bottom-up reduction, the semantics can be analyzed automatically layer by layer, so that the semantic structure of the language is identified accurately and the accuracy of the semantic analysis is improved. This lays a foundation for the subsequent comprehensive judgment of network attacks based on the combined results of the lexical analysis, the syntactic analysis and the semantic analysis.
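The reduction walk-through above can be sketched with a simple stack: tokens are shifted until a closing parenthesis arrives, at which point the innermost parenthesized group is reduced to a single node. The function names, the node tuples and the keyword set are assumptions for illustration; the sketch recognizes only function calls and parenthesized groups, not the other semantic structures mentioned in the text.

```python
import re

KEYWORDS = {"select", "from", "where", "union", "and", "or"}  # hypothetical subset


def tokenize(sql):
    """Split into identifiers, numbers, parentheses and commas."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[(),]", sql)


def is_identifier(item):
    return (
        isinstance(item, str)
        and (item[0].isalpha() or item[0] == "_")
        and item.lower() not in KEYWORDS
    )


def reduce_calls(tokens):
    """Shift tokens; on ')' reduce the innermost group, bottom-up."""
    stack, calls = [], []
    for tok in tokens:
        if tok != ")":
            stack.append(tok)  # shift
            continue
        args = []
        while stack and stack[-1] != "(":
            args.append(stack.pop())
        if not stack:
            raise ValueError("unbalanced parenthesis")
        stack.pop()  # discard the matching "("
        args.reverse()
        if stack and is_identifier(stack[-1]):
            name = stack.pop()
            node = ("call", name, [a for a in args if a != ","])
            calls.append(name)  # record the function call that was recognized
        else:
            node = ("group", args)
        stack.append(node)  # the reduced structure replaces its parts
    return stack, calls


tree, calls = reduce_calls(tokenize("union select substr(version(),1,1) from users"))
print(calls)
print(tree)
```

The inner call is reduced first and then appears as a single parameter of the outer call, mirroring the E1 and E2 steps of the example; the list of recognized call names is the kind of information a later scoring step could count.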
[ METHOD EMBODIMENT 8 ]
The method provided by this embodiment includes all of the contents of any one of method embodiment 1 to method embodiment 7, and is not described herein again. As shown in fig. 4, in the present embodiment, the process S5 is implemented by:
S51: calculating a comprehensive score of the results of the lexical analysis, the syntactic analysis and the semantic analysis.
S52: comparing the comprehensive score with a set threshold range.
S53: determining a risk level of the requested data according to a result of the comparison.
According to the method and the device, the comprehensive results of lexical analysis, syntactic analysis and semantic analysis of the target language are quantized to generate the corresponding comprehensive score, and the comprehensive score is compared with the set threshold range, so that the judgment process of the risk level is more convenient.
For convenience of judgment, in the present invention, one or more risk levels and a threshold range corresponding to each level may be preset, as shown in table 1:
Risk level            Threshold range
No risk (normal)      n < 10
Risk level 1          10 ≤ n < 20
Risk level 2          20 ≤ n < 50
Risk level 3          n ≥ 50
Table 1
In table 1, n is a composite score, and the risk levels of risk level 1, risk level 2, and risk level 3 are sequentially higher. In particular implementations, risk level 1 may be a lower risk, risk level 2 may be a medium risk, risk level 3 may be a higher risk, and so on.
The threshold ranges and risk levels in table 1 are merely exemplary; a person skilled in the art may also, for example, distinguish only between "no risk" and "at risk".
After the risk level is analyzed, network attacks can be selectively filtered, intercepted or prompted according to various preset strategies.
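The threshold comparison of table 1 can be written compactly with a sorted list of boundaries. The boundaries below are the exemplary values of table 1; the mapping from risk level to action is a hypothetical policy added only to illustrate the filter/intercept/prompt choice and is not taken from the specification.

```python
from bisect import bisect_right

BOUNDS = [10, 20, 50]  # lower bounds of risk levels 1, 2 and 3 (table 1)
LEVELS = ["no risk", "risk level 1", "risk level 2", "risk level 3"]

# Hypothetical policy: which action to take for each level.
ACTIONS = {
    "no risk": "allow",
    "risk level 1": "prompt",
    "risk level 2": "filter",
    "risk level 3": "intercept",
}


def risk_level(score):
    """Map a composite score n to its risk level."""
    return LEVELS[bisect_right(BOUNDS, score)]


for n in (3, 10, 35, 80):
    level = risk_level(n)
    print(n, level, ACTIONS[level])
```

Using `bisect_right` makes each lower bound inclusive, which matches the "10 ≤ n < 20" style of the ranges.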
[ METHOD EMBODIMENT 9 ]
The method provided by this embodiment includes all the contents of method embodiment 8, and is not described herein again. As shown in fig. 5, in the present embodiment, the process S51 is implemented by:
S511: calculating a first sub-score of the target language according to the result of the lexical analysis.
For example, the first sub-score is calculated according to the number of occurrences of the token sequence and a weight parameter.
S512: calculating a second sub-score of the target language according to the result of the syntactic analysis.
For example, the second sub-score is calculated based on the syntax analysis result and a weight parameter of the syntax analysis result.
S513: calculating a third sub-score of the target language according to the result of the semantic analysis.
For example, the third sub-score is calculated according to the number of occurrences of the key function call body or the key feature substructure and a weight parameter.
S514: weighting the first sub-score, the second sub-score and the third sub-score respectively.
S515: calculating the composite score based on the weighted first sub-score, second sub-score, and third sub-score.
In the invention, the comprehensive score of the target language is calculated according to the weighted lexical analysis result, the syntactic analysis result and the semantic analysis result, so that the importance degrees of different analysis results can be reflected by different weight parameters, and the accuracy of the comprehensive score is improved.
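Steps S511 to S515 can be sketched as follows, under the assumption that each sub-score is a weighted sum of its inputs and that the composite score is the sum of the three weighted sub-scores. The function names, parameter names and all numeric weights are made up for this example.

```python
def weighted_sum(values, weights):
    """Sum of value * weight over paired entries."""
    return sum(w * v for v, w in zip(values, weights))


def composite_score(
    token_counts, token_weights,      # lexical analysis: occurrences and weights
    clause_results, clause_weights,   # syntactic analysis: 0/1 results and weights
    feature_counts, feature_weights,  # semantic analysis: occurrences and weights
    c_t=1.0, c_s=1.0, c_m=1.0,        # weights of the three analyses
):
    first = weighted_sum(token_counts, token_weights)       # S511
    second = weighted_sum(clause_results, clause_weights)   # S512
    third = weighted_sum(feature_counts, feature_weights)   # S513
    # S514 and S515: weight each sub-score, then combine (summation assumed).
    return c_t * first + c_s * second + c_m * third


score = composite_score(
    token_counts=[2, 1], token_weights=[1.0, 3.0],
    clause_results=[1, 1], clause_weights=[4.0, 4.0],
    feature_counts=[1], feature_weights=[10.0],
    c_t=0.5, c_s=1.0, c_m=2.0,
)
print(score)
```

Keeping the per-item weights separate from the three top-level weights lets the relative importance of the analyses be tuned without touching the item weights.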
[ METHOD EMBODIMENT 10 ]
The method provided by this embodiment includes all the contents of method embodiment 8 or method embodiment 9, and is not described herein again. In the present embodiment, the process S51 is implemented according to the following formula:
Figure BDA0001369486000000181
In the above formula:
Score(payload) is the composite score;
t_i is the number of occurrences of the i-th token sequence obtained through the lexical analysis;
w_ti is the weight parameter of the i-th token sequence;
s_j is the syntax analysis result of the j-th clause obtained through the syntactic analysis, and is 0 or 1;
w_sj is the weight parameter of the j-th clause;
m_k is the number of occurrences of the k-th key function call body or key feature substructure obtained through the semantic analysis;
w_mk is the weight parameter of the k-th key function call body or key feature substructure;
C_t, C_s and C_m are the weight parameters of the lexical analysis, the syntactic analysis and the semantic analysis in the composite score, respectively.
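Read together, the symbol definitions above are consistent with a weighted sum of the following form. This is a reconstruction from the definitions and from steps S511 to S515, offered as an assumption; it is not a transcription of the formula figure, whose exact layout may differ.

```latex
\mathrm{Score}(\mathit{payload})
  = C_t \sum_{i} w_{t_i}\, t_i
  + C_s \sum_{j} w_{s_j}\, s_j
  + C_m \sum_{k} w_{m_k}\, m_k
```

Here each inner sum corresponds to one sub-score (lexical, syntactic, semantic), and the outer coefficients weight the three sub-scores before they are combined.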
Generally, the weight parameter (i.e., the weight coefficient or weight value) is positively correlated with the degree of importance; that is, the larger the weight parameter, the more important the corresponding data is in the calculation formula of the composite score. For example, setting C_t to 0.5 and C_s to 1.0 indicates that, in the above calculation formula, the result of the syntactic analysis is more important than the result of the lexical analysis.
In order to improve the accuracy of the composite score, the weight parameters can also be optimized through machine learning. For example, for weight parameters such as C_t, C_s, C_m, w_ti, w_sj and w_mk, a model can be obtained from initial values of the parameters (for example, prior weight values obtained from experience) and then trained continuously by machine learning on the basis of big data, so as to finally obtain optimized weight parameters. Moreover, as a large amount of network attack detection data accumulates, this optimization can be carried out dynamically and continuously, so that the weight parameters keep being adjusted and the final judgment result becomes more and more accurate.
The judgment method of the comprehensive score can also eliminate noise caused by some request data which does not conform to the grammar rule but has no aggressivity, so that the accuracy of network attack judgment is improved, and the misjudgment rate is reduced to a certain extent.
Based on the same inventive concept, the embodiment of the invention also provides a network attack detection device, and as the principle adopted by the device is similar to the network attack detection method, the implementation of the device can refer to the network attack detection method, and repeated details are not repeated.
[ DEVICE EMBODIMENT 1]
Fig. 6 is a schematic structural diagram of a network attack detection apparatus according to embodiment 1 of the present invention. Referring to fig. 6, in the present embodiment, a network attack detection apparatus 10 includes: the target language determining module 1, the analyzing module 2 and the risk level determining module 3 specifically:
the target language determining module 1 is used for determining the target language from the request data according to the type of the target language.
The analysis module 2 includes: lexical analysis section 21, syntax analysis section 22, and semantic analysis section 23, specifically:
the lexical analysis unit 21 is configured to perform lexical analysis on the target language determined by the target language determination module 1.
The parsing unit 22 is used for parsing the target language determined by the target language determination module 1.
The semantic analysis unit 23 is used for performing semantic analysis on the target language determined by the target language determination module 1.
The risk level determination module 3 is configured to determine a risk level of the requested data according to results of the lexical analysis unit 21, the syntactic analysis unit 22, and the semantic analysis unit 23.
[ DEVICE EMBODIMENT 2 ]
The apparatus provided in this embodiment includes all the contents of apparatus embodiment 1, and is not described herein again. In the present embodiment, the target language determination module 1 comprises a determination unit, specifically:
the determination unit is used for determining the target language from the payload data of the request data through a first automaton under the condition that the type of the target language is a script language, wherein the first automaton is constructed by using the following model: a model built for a wrapper layer outside the target language, the model comprising: the position of occurrence, the form of occurrence and the form of encoding of the target language.
[ DEVICE EMBODIMENT 3 ]
The apparatus provided in this embodiment includes all the contents of apparatus embodiment 2, and is not described herein again. In this embodiment, the target language determination module 1 further includes: a recognition unit and a target language decoding unit, specifically:
the identification unit is used for identifying whether the target language is coded.
The target language decoding unit is used for decoding the target language under the condition that the identification unit identifies that the target language is coded.
[ DEVICE EMBODIMENT 4 ]
The apparatus provided in this embodiment includes all the contents of apparatus embodiment 1, and is not described herein again. In this embodiment, the target language determining module 1 specifically determines the target language from the request data according to the type of the target language by: in a case where the type of the target language is a structured query language, determining payload data of the request data as the target language.
[ DEVICE EMBODIMENT 5 ]
Fig. 7 is a schematic structural diagram of a network attack detection apparatus according to embodiment 5 of the present invention. Referring to fig. 7, in the present embodiment, a network attack detection apparatus 10' includes: target language determination module 1', analysis module 2', risk level determination module 3 'and extraction module 4', in particular:
the target language determining module 1', the analyzing module 2 ', and the risk level determining module 3 ' are the same as the target language determining module 1, the analyzing module 2, and the risk level determining module 3 in the device embodiment 1, and are not described herein again.
The extraction module 4' is configured to extract the payload data from the request data.
[ DEVICE EMBODIMENT 6 ]
The apparatus provided in this embodiment includes all the contents of apparatus embodiment 5, and is not described herein again. As shown in fig. 8, in the present embodiment, the extraction module 4' includes: parsing unit 41 'and header parameter or request body decoding unit 42', specifically:
the parsing unit 41' is used for parsing out the specified header parameter or request body from the request data.
The header parameter or request body decoding unit 42 'is configured to decode the header parameter or request body parsed by the parsing unit 41' to obtain the payload data.
Wherein the header parameters include a combination of one or more of: a Request URL parameter, a Referer parameter, a Cookie parameter, and a User-Agent parameter.
[ DEVICE EMBODIMENT 7 ]
The apparatus provided in this embodiment includes all the contents of any one of apparatus embodiments 1 to 6, and is not described herein again. As shown in fig. 9, in the present embodiment, the lexical analysis unit 21 includes: the determination component 211 and the analysis component 212, in particular:
the determination component 211 is configured to determine lexical elements in the target language.
The analysis component 212 is configured to analyze the lexical elements determined by the determination component 211 through finite state automata to obtain a token sequence of clauses in the target language.
[ DEVICE EMBODIMENT 8 ]
The apparatus provided in this embodiment includes all of apparatus embodiment 7, and is not described herein again. In this embodiment, the lexical analysis unit 21 further includes a disambiguation component, specifically:
the disambiguation component is to perform a disambiguation operation on the target language according to a context of the target language.
[ DEVICE EMBODIMENT 9 ]
The apparatus provided in this embodiment includes all of the contents of any one of apparatus embodiments 1 to 8, and is not described herein again. In the present embodiment, the parsing unit 22 specifically implements parsing of the target language by: inputting the token sequence into a second automaton to obtain a syntax analysis result of the clauses in the target language, wherein the second automaton is generated according to the syntax standard of the target language.
[ DEVICE EMBODIMENT 10 ]
The apparatus provided in this embodiment includes all the contents of apparatus embodiment 9, and is not described herein again. The network attack detection device provided by this embodiment further includes: the file generation module and the automaton generation module specifically:
The file generation module is used for generating a BNF file according to the syntax standard of the target language.
The automaton generating module is used for generating the second automaton according to the BNF file generated by the file generating module.
[ DEVICE EMBODIMENT 11 ]
The apparatus provided in this embodiment includes all the contents of any one of apparatus embodiments 1 to 10, and is not described herein again. In this embodiment, the semantic analysis unit 23 specifically implements semantic analysis of the target language by: identifying key function call bodies and key feature substructures from the target language, for example, by bottom-up reduction.
[ DEVICE EMBODIMENT 12 ]
The apparatus provided in this embodiment includes all of the contents of apparatus embodiments 1 to 11, which are not described again here. As shown in fig. 10, in this embodiment, the risk level determination module 3 includes a calculation submodule 31, a comparison submodule 32 and a determination submodule 33, specifically:
The calculation submodule 31 is configured to compute a composite score of the results of the lexical analysis, the syntactic analysis and the semantic analysis.
The comparison submodule 32 is used for comparing the composite score calculated by the calculation submodule 31 with a set threshold range.
The determination submodule 33 is used for determining the risk level of the requested data according to the result of the comparison by the comparison submodule 32.
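As a non-limiting illustration, the comparison of a composite score against set threshold ranges can be sketched as a lookup over sorted boundaries. The boundary values in `BOUNDS`, the level names in `LEVELS`, and the function `risk_level` are assumptions for this example; the embodiment does not state concrete thresholds.

```python
import bisect

# Hypothetical boundaries: a score below 10 is "none", 10 up to 30 is "low",
# 30 up to 60 is "medium", and 60 or above is "high".
BOUNDS = [10.0, 30.0, 60.0]
LEVELS = ["none", "low", "medium", "high"]


def risk_level(score):
    """Map a composite score to the risk level whose range contains it."""
    return LEVELS[bisect.bisect_right(BOUNDS, score)]
```

Keeping the boundaries in a sorted list means that adding a risk level only requires one more boundary and one more name.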
[ DEVICE EMBODIMENT 13 ]
The apparatus provided in this embodiment includes all of the contents of apparatus embodiment 10, which are not described again here. As shown in fig. 11, the calculation submodule 31 includes a sub-score calculating unit 311, a weighting unit 312, and a composite score calculating unit 313, specifically:
The sub-score calculating unit 311 includes a first computing component 3111, a second computing component 3112 and a third computing component 3113, specifically:
The first computing component 3111 is configured to compute a first sub-score for the target language based on the result of the lexical analysis.
The second computing component 3112 is configured for computing a second sub-score for the target language based on the result of the parsing.
The third computing component 3113 is configured for computing a third sub-score for the target language based on the result of the semantic analysis.
The weighting unit 312 is configured to weight the first sub-score calculated by the first calculating component 3111, the second sub-score calculated by the second calculating component 3112, and the third sub-score calculated by the third calculating component 3113.
The composite score calculating unit 313 is configured to calculate the composite score according to the first sub-score, the second sub-score, and the third sub-score weighted by the weighting unit 312.
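As a non-limiting illustration, the division of work between the sub-score calculating unit, the weighting unit and the composite score calculating unit can be sketched as follows. It is a minimal sketch that assumes each sub-score is a sum of count times weight and that the composite score is a weighted sum of the three sub-scores. The functions `sub_score` and `composite_score`, the layout of `params`, and all numeric values are assumptions for this example.

```python
def sub_score(counts, weights):
    """Sum count * weight over the features present in `counts`."""
    return sum(n * weights.get(name, 0.0) for name, n in counts.items())


def composite_score(token_counts, clause_results, key_counts, params):
    """Weight the three sub-scores and add them into one composite score."""
    first = sub_score(token_counts, params["w_t"])     # lexical analysis
    second = sub_score(clause_results, params["w_s"])  # syntactic analysis
    third = sub_score(key_counts, params["w_m"])       # semantic analysis
    return (params["C_t"] * first
            + params["C_s"] * second
            + params["C_m"] * third)


# Hypothetical parameters and analysis results, for illustration only.
params = {
    "w_t": {"OR": 2.0, "EQ": 0.5},
    "w_s": {"clause0": 4.0},
    "w_m": {"sleep": 10.0},
    "C_t": 1.0, "C_s": 0.5, "C_m": 2.0,
}
score = composite_score({"OR": 1, "EQ": 2}, {"clause0": 1}, {"sleep": 1}, params)
```

With these example values the three sub-scores are 3.0, 4.0 and 10.0, and the composite score is 1.0 * 3.0 + 0.5 * 4.0 + 2.0 * 10.0 = 25.0.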
[ DEVICE EMBODIMENT 14 ]
The apparatus provided in this embodiment includes all of the contents of apparatus embodiment 13, which are not described again here. In this embodiment, the first computing component 3111 computes the first sub-score of the target language from the result of the lexical analysis by calculating the first sub-score according to the number of occurrences and the weight parameters of the token sequences.
[ DEVICE EMBODIMENT 15 ]
The apparatus provided in this embodiment includes all of the contents of apparatus embodiment 13 or apparatus embodiment 14, which are not described again here. In this embodiment, the second computing component 3112 computes the second sub-score of the target language from the result of the syntax analysis by calculating the second sub-score according to the syntax analysis result and the weight parameter of the syntax analysis result.
[ DEVICE EMBODIMENT 16 ]
The apparatus provided in this embodiment includes all of the contents of apparatus embodiment 13 to apparatus embodiment 15, which are not described again here. In this embodiment, the third computing component 3113 computes the third sub-score of the target language from the result of the semantic analysis by calculating the third sub-score according to the number of occurrences and the weight parameters of the key function call bodies or key feature substructures.
[ DEVICE EMBODIMENT 17 ]
The apparatus provided in this embodiment includes all of the contents of apparatus embodiments 12 to 16, which are not described again here. In this embodiment, the calculation submodule 31 calculates the composite score of the results of the lexical analysis, the syntactic analysis, and the semantic analysis according to the following formula:
[Formula image BDA0001369486000000241 is not reproduced in this text.]
In the above formula:
score(payload) is the composite score;
t_i is the number of occurrences of the i-th token sequence obtained by the lexical analysis;
w_ti is the weight parameter of the i-th token sequence;
s_j is the syntax analysis result of the j-th clause obtained by the syntax analysis, and is 0 or 1;
w_sj is the weight parameter of the j-th clause;
m_k is the number of occurrences of the k-th key function call body or key feature substructure obtained by the semantic analysis;
w_mk is the weight parameter of the k-th key function call body or key feature substructure;
C_t, C_s and C_m are the weight parameters of the lexical analysis, the syntactic analysis and the semantic analysis, respectively, in the composite score.
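The formula itself is an image that is not reproduced in this text. One form consistent with the symbol definitions above, and with the weighted sub-scores of apparatus embodiments 13 to 16, would be the weighted sum below. It is a reconstruction offered under that assumption and not the formula as published; the upper limits n_t, n_s and n_k are the counts of token sequences, clauses, and key function call bodies or key feature substructures, as named in claim 16.

```latex
\mathrm{score}(\mathrm{payload})
  = C_t \sum_{i=1}^{n_t} t_i \, w_{t_i}
  + C_s \sum_{j=1}^{n_s} s_j \, w_{s_j}
  + C_m \sum_{k=1}^{n_k} m_k \, w_{m_k}
```

Under this reading, each sum is one of the three sub-scores, and C_t, C_s and C_m set the relative influence of the lexical, syntactic and semantic analyses.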
[ DEVICE EMBODIMENT 18 ]
The apparatus provided in this embodiment includes all of the contents of apparatus embodiment 17, which are not described again here. The network attack detection device provided by this embodiment further includes an optimization module, specifically:
The optimization module is used for optimizing the weight parameters through machine learning.
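As a non-limiting illustration, optimizing weight parameters from labeled samples can be sketched with a simple learner. The embodiment does not specify the learning method; logistic regression trained by stochastic gradient descent is substituted here purely as an example. The function `train_weights`, the feature layout, and the sample data are assumptions for this sketch.

```python
import math


def train_weights(samples, labels, lr=0.1, epochs=200):
    """Fit per-feature weights and a bias with logistic regression (SGD).

    `samples` holds feature-count vectors (for instance token counts);
    `labels` holds 1 for a known attack and 0 for a benign request.
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            predicted = 1.0 / (1.0 + math.exp(-z))
            error = predicted - y  # gradient of the log loss w.r.t. z
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Hypothetical data: feature 0 appears only in attacks, feature 1 only in
# benign requests.
samples = [[3, 0], [2, 0], [0, 1], [0, 2]]
labels = [1, 1, 0, 0]
weights, bias = train_weights(samples, labels)
```

After training on this toy data, the weight of the attack-only feature is positive and the weight of the benign-only feature is negative, which is the kind of adjustment the optimization module is described as performing on the weight parameters.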
The embodiment of the invention also provides terminal equipment, which comprises a memory and a processor; wherein,
the memory is configured to store one or more computer instructions that, when executed by the processor, are capable of performing the method of any one of method embodiments 1-10.
Furthermore, embodiments of the present invention also provide a computer storage medium for storing one or more computer instructions, wherein the one or more computer instructions, when executed, enable implementation of the method according to any one of method embodiment 1 to method embodiment 10.
In general, the network attack detection scheme provided by the embodiments of the present invention abandons the traditional rule-based detection mode. A model of the target language can be defined for various network vulnerabilities (SQL injection attacks, XSS cross-site scripting attacks, PHP injection attacks, and the like), so no manually maintained rules are needed. The target language can be extracted intelligently from the request data and subjected to lexical analysis, syntactic analysis and semantic analysis, and the results of the three analyses are combined to identify accurately and in depth whether the target language is a network attack, and thus whether the request data is at risk, with high accuracy and few misjudgments. Moreover, because rule-based detection is not used, the problem of detection slowing down as more rules are stacked does not arise, and the running speed is higher.
Those skilled in the art will clearly understand that the present invention may be implemented entirely in software, or by a combination of software and a hardware platform. Based on such understanding, all or part of the contribution of the technical solutions of the present invention over the background art may be embodied in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, a smart phone, a network device, etc.) to execute the methods of the embodiments, or of some parts of the embodiments, of the present invention.
As used herein, the term "software" or the like refers to any type of computer code or set of computer-executable instructions in a general sense that is executed to program a computer or other processor to perform various aspects of the present inventive concepts as discussed above. Furthermore, it should be noted that according to one aspect of the embodiment, one or more computer programs implementing the method of the present invention when executed do not need to be on one computer or processor, but may be distributed in modules in multiple computers or processors to execute various aspects of the present invention.
Computer-executable instructions may take many forms, such as program modules, executed by one or more computers or other devices. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. In particular, the operations performed by the program modules may be combined or separated as desired in various embodiments.
Also, technical solutions of the present invention may be embodied as a method, and at least one example of the method has been provided. The actions may be performed in any suitable order and may be presented as part of the method. Thus, embodiments may be configured such that acts may be performed in an order different than illustrated, which may include performing some acts simultaneously (although in the illustrated embodiments, the acts are sequential).
The definitions given and used herein should be understood with reference to dictionaries, definitions in documents incorporated by reference, and/or their ordinary meanings.
In the claims, as well as in the specification above, all transitional phrases such as "comprising," "having," "containing," "carrying," "involving," "consisting essentially of …," and the like are to be understood to be open-ended, i.e., to mean including but not limited to.
The terms and expressions used in the specification of the present invention have been set forth for illustrative purposes only and are not meant to be limiting. It will be appreciated by those skilled in the art that changes could be made to the details of the above-described embodiments without departing from the underlying principles thereof. The scope of the invention is, therefore, indicated by the appended claims, in which all terms are intended to be interpreted in their broadest reasonable sense unless otherwise indicated.
While various embodiments of the present invention have been described above with particularity, various aspects or features of the teachings of embodiments of the present invention are described below in another form and are not limited to the following series of paragraphs, some or all of which may be assigned alphanumeric characters for the sake of clarity. Each of these paragraphs may be combined with the contents of one or more other paragraphs in any suitable manner. Without limiting examples of some of the suitable combinations, some paragraphs hereinafter make specific reference to and further define other paragraphs.
A1, a method for detecting network attacks, comprising:
decoding request data in Web traffic, and extracting the payload data (payload) therein;
determining a target language in the payload;
performing lexical analysis, syntactic analysis and semantic analysis on the target language;
and judging whether the request data has risks or not according to the results of the lexical analysis, the syntactic analysis and the semantic analysis.
A2, the method as in A1, wherein decoding the request data in the Web traffic and extracting the payload data (payload) therein comprises:
parsing at least one preset header parameter or the HTTP request body from the request data according to the Hypertext Transfer Protocol (HTTP);
decoding the at least one header parameter or the content of the HTTP request body to extract the payload contained therein;
the at least one header parameter comprises one or any combination of the following parameters:
a Request URL parameter, a Referer parameter, a cookie parameter, and a User-Agent parameter.
A3, the method as in A1, wherein determining the target language in the payload comprises:
determining that the payload is itself the target language; or
Extracting the target language embedded in the payload from the payload.
A4, the method as in A3, wherein extracting the target language embedded in the payload from the payload comprises:
modeling the wrapper layer outside the target language, enumerating all possible positions, appearance forms and encoding forms of the target language;
and constructing a push-down automaton or a finite state automaton from the model of the outer wrapper layer to process the outer language, and marking and extracting the target language.
A5, the method as in A1, wherein the lexical analysis of the target language comprises:
sequentially determining the boundary of each lexical element in the target language;
and constructing a finite state automaton, and analyzing the determined lexical elements to form a corresponding token sequence.
A6, the method as in A5, wherein determining the boundary of each lexical element in the target language further comprises:
performing a disambiguation operation on the target language according to a context of the target language.
A7, the method as in A5, wherein parsing the target language comprises:
generating a corresponding push-down automaton or finite state automaton according to the grammar standard of the target language;
and taking the token sequence of the target language obtained by lexical analysis as the input of the push-down automaton or the finite state automaton, and outputting a syntax analysis result whether each clause of the target language meets the syntax standard.
A8, the method as in A7, wherein generating a corresponding push-down automaton or finite state automaton according to the grammar standard of the target language comprises:
generating a Backus-Naur Form (BNF) language description file for the grammar standard of the target language;
and generating a corresponding push-down automaton or finite state automaton according to the BNF file.
A9, the method of any one of A1 to A8, wherein semantically analyzing the target language comprises:
and analyzing the semantic structure of the target language in a bottom-up reduction mode, and identifying a key function calling body and a key feature substructure contained in the semantic structure.
A10, the method of A9, wherein the determining whether the requested data is at risk according to the results of the lexical analysis, the syntactic analysis, and the semantic analysis comprises:
calculating the comprehensive scores of the lexical analysis result, the syntactic analysis result and the semantic analysis result of the target language according to a preset algorithm;
comparing the composite score with a preset threshold range of at least one risk level;
and determining whether the requested data has risks and the risk level to which the requested data belongs when the requested data has risks according to the comparison result.
A11, the method of A10, wherein the comprehensive score of the lexical analysis result, the syntactic analysis result and the semantic analysis result of the target language is calculated according to the following formulas:
[Formula image BDA0001369486000000281 is not reproduced in this text.]
in the above formula:
score(payload) is the composite score for the target language;
t_i is the number of occurrences of the i-th token sequence of the target language obtained by lexical analysis;
w_ti is the weight value corresponding to the i-th token sequence;
s_j is the syntax analysis result indicating whether the j-th clause of the target language obtained by syntax analysis meets the grammar standard, and is 0 or 1;
w_sj is the weight value corresponding to the j-th clause;
m_k is the number of occurrences of the k-th key function call body or key feature substructure of the target language obtained by semantic analysis;
w_mk is the weight value corresponding to the k-th key function call body or key feature substructure;
C_t, C_s and C_m are the weight coefficients corresponding to lexical analysis, syntactic analysis and semantic analysis, respectively, in the composite score.
A12, the method as in A11, wherein C_t, C_s and C_m, and w_ti, w_sj and w_mk, are obtained by continuously training and optimizing a parameter model established from preset initial values of the parameters.
B13, an apparatus for detecting a network attack, comprising:
the extraction module is used for decoding the request data in the Web traffic and extracting the payload data (payload) therein;
the target language determining module is used for determining a target language in the payload;
the lexical analysis module is used for carrying out lexical analysis on the target language;
the grammar analysis module is used for carrying out grammar analysis on the target language;
the semantic analysis module is used for performing semantic analysis on the target language;
and the risk judgment module is used for judging whether the request data has risks or not according to the results of the lexical analysis, the syntactic analysis and the semantic analysis.
B14, the device as described in B13, the extraction module comprising:
the analysis submodule is used for parsing at least one preset header parameter or the HTTP request body from the request data according to the HTTP protocol;
a decoding submodule for decoding the at least one header parameter or the content of the HTTP request body;
an extraction submodule, configured to extract the payload included in the content decoded by the decoding submodule;
the at least one header parameter comprises one or any combination of the following parameters:
a Request URL parameter, a Referer parameter, a cookie parameter, and a User-Agent parameter.
B15, the apparatus of B13, wherein the target language determining module comprises a determining submodule or an extraction submodule; wherein:
the determining submodule is used for determining that the payload is the target language;
and the extraction submodule is used for extracting the target language embedded in the payload from the payload.
B16, the device as described in B15, wherein the extraction submodule is specifically configured to model the wrapper layer outside the target language, enumerating all possible positions, appearance forms and encoding forms of the target language; and to construct a push-down automaton or a finite state automaton from the model of the outer wrapper layer to process the outer language, and to mark and extract the target language.
B17, in the apparatus according to B13, the lexical analysis module is specifically configured to sequentially determine boundaries of each lexical element in the target language; and constructing a finite state automaton, and analyzing the determined lexical elements to form a corresponding token sequence.
B18, the apparatus as in B17, wherein the lexical analysis module is further configured to perform a disambiguation operation on the target language according to a context of the target language when determining boundaries of each lexical element in the target language.
B19, the apparatus as described in B17, the syntax analysis module comprising:
the automaton generation submodule is used for generating a corresponding push-down automaton or finite state automaton according to the grammar standard of the target language;
and the grammar analysis submodule is used for taking the token sequence of the target language obtained by lexical analysis as the input of the push-down automaton or the finite state automaton and outputting a grammar analysis result whether each clause of the target language meets a grammar standard.
B20, in the apparatus according to B19, the automaton generation submodule is specifically configured to generate a Backus-Naur Form (BNF) language description file for the grammar standard of the target language; and to generate a corresponding push-down automaton or finite state automaton according to the BNF file.
B21, the apparatus according to any one of B13 to B20, wherein the semantic analysis module is specifically configured to analyze the semantic structure of the target language in a bottom-up reduction manner, and identify the key function call bodies and the key feature substructures contained therein.
B22, the apparatus of B21, wherein the risk judgment module comprises:
the comprehensive score calculation submodule is used for calculating the comprehensive scores of the lexical analysis result, the syntactic analysis result and the semantic analysis result of the target language according to a preset algorithm;
a comparison submodule for comparing the composite score with a preset threshold range of at least one risk level;
and the judgment submodule is used for determining whether the requested data has risks and the risk level to which the requested data belongs when the requested data has risks according to the comparison result.
B23, in the apparatus according to B22, the comprehensive score calculating sub-module is specifically configured to calculate the comprehensive scores of the lexical analysis result, the syntactic analysis result, and the semantic analysis result of the target language according to the following formulas:
[Formula image BDA0001369486000000301 is not reproduced in this text.]
in the above formula:
score(payload) is the composite score for the target language;
t_i is the number of occurrences of the i-th token sequence of the target language obtained by lexical analysis;
w_ti is the weight value corresponding to the i-th token sequence;
s_j is the syntax analysis result indicating whether the j-th clause of the target language obtained by syntax analysis meets the grammar standard, and is 0 or 1;
w_sj is the weight value corresponding to the j-th clause;
m_k is the number of occurrences of the k-th key function call body or key feature substructure of the target language obtained by semantic analysis;
w_mk is the weight value corresponding to the k-th key function call body or key feature substructure;
C_t, C_s and C_m are the weight coefficients corresponding to lexical analysis, syntactic analysis and semantic analysis, respectively, in the composite score.
B24, the device as described in B23, wherein C_t, C_s and C_m, and w_ti, w_sj and w_mk, are obtained by continuously training and optimizing a parameter model established from preset initial values of the parameters.
C25, an apparatus for detecting cyber attacks, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
decoding request data in Web traffic, and extracting the payload data (payload) therein;
determining a target language in the payload;
performing lexical analysis, syntactic analysis and semantic analysis on the target language;
and judging whether the request data has risks or not according to the results of the lexical analysis, the syntactic analysis and the semantic analysis.
D26, a non-transitory computer readable storage medium having instructions that, when executed by a processor of a network device such as a server, enable the network device to perform the detection of the network attack, comprising:
decoding request data in Web traffic, and extracting the payload data (payload) therein;
determining a target language in the payload;
performing lexical analysis, syntactic analysis and semantic analysis on the target language;
and judging whether the request data has risks or not according to the results of the lexical analysis, the syntactic analysis and the semantic analysis.
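As a non-limiting illustration of the payload extraction step described in paragraphs A2 and B14 above, candidate payloads can be collected from the request URL, the Referer, Cookie and User-Agent header parameters, and the request body, and then decoded. This is a minimal sketch that handles only percent-encoding and assumes a form-encoded body; the function `extract_payloads`, its arguments, and the example request are assumptions for this example, and header names are assumed to be spelled exactly as shown.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qsl, unquote, urlsplit


def extract_payloads(url, headers, body=""):
    """Return decoded candidate payload strings from one HTTP request."""
    payloads = []
    # Query-string values; parse_qsl percent-decodes each value.
    query = urlsplit(url).query
    payloads += [value for _, value in parse_qsl(query, keep_blank_values=True)]
    # Header parameters that may carry attacker-controlled text.
    for name in ("Referer", "User-Agent"):
        if name in headers:
            payloads.append(unquote(headers[name]))
    # Each cookie value is decoded separately.
    cookie = SimpleCookie()
    cookie.load(headers.get("Cookie", ""))
    payloads += [unquote(morsel.value) for morsel in cookie.values()]
    # Body, assumed here to be application/x-www-form-urlencoded.
    payloads += [value for _, value in parse_qsl(body, keep_blank_values=True)]
    return payloads


found = extract_payloads(
    "http://example.test/item?id=1%27%20OR%20%271%27%3D%271",
    {"User-Agent": "Mozilla/5.0", "Cookie": "sid=abc%27--"},
    body="q=%3Cscript%3Ealert(1)%3C%2Fscript%3E",
)
```

Each string in `found` would then be passed to the step that determines the target language. Real traffic may carry nested or non-URL encodings, which this sketch does not attempt to unwrap.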
In summary, the technical solutions provided by the embodiments of the present invention at least have the following beneficial effects:
1) The detection method and the related apparatus for network attacks provided by the embodiments of the present invention abandon the traditional rule-based detection mode. A model of the target language can be defined for various network vulnerabilities (SQL injection attacks, XSS cross-site scripting attacks, PHP injection attacks, and the like), so no manually maintained rules are needed. The target language can be extracted intelligently from the request data and subjected to lexical analysis, syntactic analysis and semantic analysis, and the results of the three analyses are combined to identify accurately and in depth whether the target language is a network attack, and thus whether the request data is at risk, with higher accuracy and fewer misjudgments. Moreover, because rule-based detection is not used, the problem of detection slowing down as more rules are stacked does not arise, and the running speed is higher.
2) In the method and the related apparatus for detecting a network attack provided by the embodiments of the present invention, a BNF file (a lexical or syntactic description definition file) can be converted into a corresponding push-down automaton or finite state automaton to perform lexical analysis or syntactic analysis, so that the result is accurate and execution is fast.
3) In the method and the related apparatus for detecting a network attack provided by the embodiments of the present invention, the results of the lexical analysis, the syntactic analysis and the semantic analysis are quantified as a composite score. The calculation of the composite score uses weight parameters for the results of the three analyses, and these parameters are learned automatically by a machine learning model, so that they are tuned and the calculation result becomes more accurate.
4) In the method and the related apparatus for detecting a network attack provided by the embodiments of the present invention, a bottom-up reduction approach is adopted, so that semantic structures in the target language, such as expressions, key statements, parenthesis matching relationships and function call relationships, can be identified, and the key function call bodies and key feature substructures contained therein are thereby identified. This approach analyzes semantics automatically layer by layer, identifies the semantic structure of the language accurately, and yields more accurate semantic analysis, laying a foundation for the subsequent comprehensive judgment of network attacks based on the combined results of the lexical, syntactic and semantic analyses.
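As a non-limiting illustration of the finite state automaton used for lexical analysis, referred to in item 2) above, the sketch below scans the target language one character at a time and emits a token sequence. It is hand-written rather than generated from a BNF file, and the state names, the token kinds, the keyword set `KEYWORDS`, and the function `tokenize` are assumptions for this example.

```python
# Hypothetical subset of keywords for this sketch.
KEYWORDS = {"select", "from", "where", "or", "and", "union"}


def tokenize(text):
    """Scan `text` character by character; `state` is the automaton state."""
    tokens = []
    state, start = "START", 0
    for pos, ch in enumerate(text + " "):  # sentinel space flushes last token
        if state == "STRING":
            if ch == "'":  # closing quote ends the string literal
                tokens.append(("STRING", text[start:pos + 1]))
                state = "START"
            continue
        if state == "WORD":
            if ch.isalnum() or ch == "_":
                continue
            word = text[start:pos]
            kind = "KEYWORD" if word.lower() in KEYWORDS else "IDENT"
            tokens.append((kind, word))
            state = "START"
        elif state == "NUMBER":
            if ch.isdigit():
                continue
            tokens.append(("NUMBER", text[start:pos]))
            state = "START"
        # START state: decide which state the current character opens.
        if ch.isspace():
            continue
        start = pos
        if ch == "'":
            state = "STRING"
        elif ch.isalpha() or ch == "_":
            state = "WORD"
        elif ch.isdigit():
            state = "NUMBER"
        else:
            tokens.append(("SYMBOL", ch))
    if state == "STRING":  # input ended inside a string literal
        tokens.append(("UNTERMINATED_STRING", text[start:]))
    return tokens
```

An unterminated string is reported as its own token kind because injected fragments often end inside a quote, and a later scoring step may want to weight that case separately.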

Claims (36)

1. A method for detecting a network attack, the method comprising:
determining a target language from the request data according to the type of the target language;
performing lexical analysis, syntactic analysis and semantic analysis on the target language;
determining the risk level of the request data according to the results of the lexical analysis, the syntactic analysis and the semantic analysis;
wherein determining the target language from the request data according to the type of the target language comprises:
if the type of the target language is a script language, determining the target language embedded in the payload data of the request data through a first automaton, wherein the first automaton is constructed by using the following model: a model built for a wrapper layer outside the target language, the model comprising: the appearance position, the appearance form and the coding form of the target language;
if the type of the target language is a structured query language, determining the payload data of the request data as the target language;
wherein the lexical analysis of the target language comprises:
determining lexical elements in the target language;
and analyzing the lexical elements through a finite state automaton to obtain a token sequence of the clauses in the target language.
2. The method of claim 1, wherein determining the target language from the request data based on the type of target language further comprises:
identifying whether the target language is encoded;
and if the target language is coded, decoding the target language.
3. The method of claim 1 or 2, wherein the method further comprises:
extracting the payload data from the request data.
4. The method of claim 3, wherein extracting the payload data from the request data comprises:
parsing a specified header parameter or a request body from the request data;
decoding the header parameter or the request body to obtain the payload data.
5. The method of claim 4, wherein
the header parameter includes a combination of one or more of:
a request network address (Request URL) parameter, a referrer (Referer) parameter, a cookie parameter, and a user agent (User-Agent) parameter.
6. The method of claim 1, wherein lexical analyzing the target language further comprises:
performing a disambiguation operation on the target language according to a context of the target language.
7. The method of claim 1, wherein parsing the target language comprises:
and inputting the token sequence into a second automaton to obtain a syntax analysis result of the clause in the target language, wherein the second automaton is generated according to the syntax standard of the target language.
8. The method of claim 7, wherein the method further comprises:
generating a Backus-Naur Form (BNF) file according to the grammar standard of the target language;
and generating the second automaton according to the BNF file.
9. The method of claim 7, wherein semantically analyzing the target language comprises:
and identifying a key function call body and a key feature substructure from the target language.
10. The method of claim 9, wherein identifying key function call bodies and key feature substructures from the target language comprises:
and identifying a key function calling body and a key feature substructure from the target language by adopting a bottom-up reduction mode.
11. The method of claim 10, wherein determining a risk level for the requested data based on results of the lexical analysis, the syntactic analysis, and the semantic analysis comprises:
calculating a comprehensive score of the results of the lexical analysis, the syntactic analysis and the semantic analysis;
comparing the comprehensive score with a set threshold range;
determining a risk level of the requested data according to a result of the comparison.
12. The method of claim 11, wherein calculating a composite score for the results of the lexical analysis, the syntactic analysis, and the semantic analysis comprises:
calculating a first sub-score, a second sub-score and a third sub-score of the target language according to the results of the lexical analysis, the syntactic analysis and the semantic analysis respectively;
respectively weighting the first sub-score, the second sub-score and the third sub-score;
calculating the composite score based on the weighted first sub-score, second sub-score, and third sub-score.
13. The method of claim 12, wherein calculating a first sub-score for the target language based on the results of the lexical analysis comprises:
and calculating the first sub-score according to the number of occurrences and the weight parameters of the token sequences.
14. The method of claim 13, wherein calculating a second sub-score for the target language based on the results of the parsing comprises:
and calculating the second sub-score according to the grammar analysis result and the weight parameter of the grammar analysis result.
15. The method of claim 14, wherein calculating a third sub-score for the target language based on the results of the semantic analysis comprises:
and calculating the third sub-score according to the occurrence frequency and the weight parameter of the key function calling body or the key feature substructure.
16. The method of claim 15, wherein the integrated score of the results of the lexical analysis, the syntactic analysis, and the semantic analysis is calculated according to the following formula:
Figure FDA0002495815410000041
in the above formula:
Score(payload) is the composite score;
$n_t$ is the number of token sequences obtained by the lexical analysis;
$t_i$ is the number of occurrences of the i-th token sequence obtained by the lexical analysis;
$w_{t_i}$ is the weight parameter of the i-th token sequence;
$n_s$ is the number of clauses obtained by the syntactic analysis;
$s_j$ is the syntactic analysis result of the j-th clause obtained by the syntactic analysis, and is 0 or 1;
$w_{s_j}$ is the weight parameter of the j-th clause;
$n_k$ is the number of key function call bodies or key feature substructures obtained by the semantic analysis;
$m_k$ is the number of occurrences of the k-th key function call body or key feature substructure obtained by the semantic analysis;
$w_{m_k}$ is the weight parameter of the k-th key function call body or key feature substructure;
$C_t$, $C_s$ and $C_m$ are the weight parameters of the lexical analysis, the syntactic analysis and the semantic analysis, respectively, in the composite score.
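The formula of claim 16 survives in this text only as an image reference. One form consistent with the symbol definitions of claim 16 and with the weighted sub-scores of claims 12 to 15 is a weighted sum of three weighted sums. The expression below is a reconstruction under that assumption, not the text of the original figure:

```latex
\mathrm{Score}(\mathit{payload})
  = C_t \sum_{i=1}^{n_t} t_i \, w_{t_i}
  + C_s \sum_{j=1}^{n_s} s_j \, w_{s_j}
  + C_m \sum_{k=1}^{n_k} m_k \, w_{m_k}
```

Under this reading, each inner sum is one of the three sub-scores, and $C_t$, $C_s$ and $C_m$ set how much the lexical, syntactic and semantic evidence each contribute.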
17. The method of claim 16, wherein the method further comprises:
optimizing the weight parameters by machine learning.
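Claim 17 does not name a learning algorithm. As a minimal sketch, assuming the three sub-scores are used as features and labelled requests are available, the top-level weights could be fitted with logistic regression trained by stochastic gradient descent. The function `train_weights`, the sample data and the hyperparameters are all hypothetical and chosen only for illustration.

```python
import math


def train_weights(samples, labels, lr=0.1, epochs=500):
    """Fit one weight per sub-score plus a bias by logistic-regression SGD."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))   # predicted attack probability
            err = p - y                       # gradient of the log loss w.r.t. z
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


# Hypothetical training set: (lexical, syntactic, semantic) sub-scores,
# labelled 0 for benign requests and 1 for attacks.
samples = [(0.1, 0.0, 0.0), (0.2, 1.0, 0.0), (0.9, 1.0, 2.0), (0.8, 1.0, 1.0)]
labels = [0, 0, 1, 1]
weights, bias = train_weights(samples, labels)
```

The learned `weights` play the role of the per-analysis weight parameters; the per-token and per-structure weights could be fitted in the same way with a wider feature vector.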
18. An apparatus for detecting a network attack, the apparatus comprising:
a target language determination module, configured to determine the target language from the request data according to the type of the target language;
an analysis module comprising: a lexical analysis unit configured to perform lexical analysis on the target language, a syntactic analysis unit configured to perform syntactic analysis on the target language, and a semantic analysis unit configured to perform semantic analysis on the target language;
a risk level determination module, configured to determine the risk level of the request data according to the results of the lexical analysis, the syntactic analysis and the semantic analysis;
wherein the target language determination module comprises:
a determination unit configured to determine, in a case where the type of the target language is a script language, the target language embedded in payload data of the request data by means of a first automaton, wherein the first automaton is constructed using a model built for the wrapper layer outside the target language, the model comprising: the position of occurrence, the form of occurrence and the encoding form of the target language;
the target language determination module is further configured to determine the target language from the request data according to the type of the target language by: determining the payload data of the request data as the target language in a case where the type of the target language is a structured query language;
wherein the lexical analysis unit comprises:
a determining component, configured to determine lexical elements in the target language;
an analysis component, configured to analyze the lexical elements through a finite state automaton to obtain a token sequence of the clauses in the target language.
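The analysis component of claim 18 can be pictured as a scanner that moves between a few states as it reads characters. The following is a minimal sketch of such a finite state automaton for an SQL-like payload; the state names, the token types and the keyword subset are assumptions made for illustration and are not taken from the patent.

```python
KEYWORDS = {"select", "union", "from", "where", "or", "and"}  # illustrative subset


def char_class(ch):
    """Map a character to the input class that drives the automaton."""
    if ch.isalpha() or ch == "_":
        return "alpha"
    if ch.isdigit():
        return "digit"
    if ch == "'":
        return "quote"
    if ch.isspace():
        return "space"
    return "other"


def tokenize(text):
    """Scan text with a small finite state automaton and return token types."""
    tokens, state, buf = [], "START", ""

    def emit():
        # Flush the token that the current state has been accumulating.
        nonlocal buf
        if state == "WORD":
            tokens.append("KEYWORD" if buf.lower() in KEYWORDS else "IDENT")
        elif state == "NUMBER":
            tokens.append("NUMBER")
        elif state == "STRING":
            tokens.append("UNTERMINATED_STRING")
        buf = ""

    for ch in text:
        cls = char_class(ch)
        if state == "STRING":
            if cls == "quote":               # closing quote ends the literal
                tokens.append("STRING")
                state = "START"
            continue
        if state == "WORD" and cls in ("alpha", "digit"):
            buf += ch
            continue
        if state == "NUMBER" and cls == "digit":
            buf += ch
            continue
        emit()                               # leaving WORD or NUMBER
        if cls == "alpha":
            state, buf = "WORD", ch
        elif cls == "digit":
            state, buf = "NUMBER", ch
        elif cls == "quote":
            state = "STRING"
        elif cls == "space":
            state = "START"
        else:
            tokens.append("OP")              # any other character is an operator
            state = "START"
    emit()                                   # flush whatever is pending at end of input
    return tokens
```

For the input `id=1 or 1=1` this sketch yields the sequence IDENT, OP, NUMBER, KEYWORD, NUMBER, OP, NUMBER, which is the kind of token sequence the later scoring steps count.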
19. The apparatus of claim 18, wherein the target language determination module further comprises:
an identifying unit for identifying whether the target language is encoded;
a target language decoding unit, configured to decode the target language in a case where the target language is encoded.
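Claim 19 leaves the encoding schemes open. As a minimal sketch, assuming only URL percent-encoding needs to be handled, the identifying and decoding units could look like the following. The function names and the round limit are hypothetical; `urllib.parse.unquote` is the standard-library decoder.

```python
import re
from urllib.parse import unquote

_PERCENT = re.compile(r"%[0-9A-Fa-f]{2}")


def is_percent_encoded(text):
    """Identify whether the text contains at least one %XX escape."""
    return bool(_PERCENT.search(text))


def decode_target(text, max_rounds=3):
    """Percent-decode repeatedly to undo nested encoding, up to max_rounds."""
    for _ in range(max_rounds):
        if not is_percent_encoded(text):
            break
        text = unquote(text)
    return text
```

Decoding in a bounded loop is a design choice of this sketch: attackers often encode a payload twice, and the bound keeps a crafted input from forcing unlimited work.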
20. The apparatus of claim 18 or 19, wherein the apparatus further comprises:
an extraction module, configured to extract the payload data from the request data.
21. The apparatus of claim 20, wherein the extraction module comprises:
a parsing unit, configured to parse a specified header parameter or the request body from the request data;
a header parameter or request body decoding unit, configured to decode the header parameter or the request body to obtain the payload data.
22. The apparatus of claim 21,
the header parameter includes one or a combination of more of:
a Request URL parameter, a Referer parameter, a Cookie parameter, and a User-Agent parameter.
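The extraction described in claims 20 to 22 can be sketched with the Python standard library. This is an illustration under assumptions: the function `extract_payloads`, the plain-dictionary representation of headers and the labelling of each payload by its source are all hypothetical.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qsl, urlsplit


def extract_payloads(request_url, headers):
    """Collect candidate payload strings from the URL and selected headers."""
    payloads = []
    # Query-string values of the request URL; parse_qsl percent-decodes them.
    query = urlsplit(request_url).query
    for name, value in parse_qsl(query, keep_blank_values=True):
        payloads.append(("url:" + name, value))
    # Headers whose whole value is treated as one payload.
    for header in ("Referer", "User-Agent"):
        if header in headers:
            payloads.append((header, headers[header]))
    # Each cookie value is a separate payload.
    cookie = SimpleCookie()
    cookie.load(headers.get("Cookie", ""))
    for name, morsel in cookie.items():
        payloads.append(("cookie:" + name, morsel.value))
    return payloads
```

Keeping the source label next to each value lets a later stage report which parameter carried the suspicious content.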
23. The apparatus of claim 18, wherein the lexical analysis unit further comprises:
a disambiguation component, configured to perform a disambiguation operation on the target language according to the context of the target language.
24. The apparatus of claim 18,
the syntactic analysis unit is configured to perform syntactic analysis on the target language by: inputting the token sequence into a second automaton to obtain a syntactic analysis result of the clause in the target language, wherein the second automaton is generated according to the grammar standard of the target language.
25. The apparatus of claim 24, wherein the apparatus further comprises:
a file generation module, configured to generate a BNF file according to the grammar standard of the target language;
an automaton generation module, configured to generate the second automaton according to the BNF file.
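Generating an automaton from a full BNF grammar normally takes a parser generator. The sketch below shows only the end product for a deliberately tiny, regular grammar fragment, with a hand-written transition table standing in for the generated one. The grammar, the state numbering and the function `parse_clause` are assumptions made for illustration. The result is 0 or 1, matching the per-clause syntactic analysis result used in the scoring claims.

```python
# Assumed toy grammar, written in BNF style:
#   <condition> ::= <operand> OP <operand> | <condition> KEYWORD <condition>
#   <operand>   ::= IDENT | NUMBER | STRING
OPERANDS = {"IDENT", "NUMBER", "STRING"}

# Transition table of the recognizer: (state, input symbol) -> next state.
TRANSITIONS = {
    (0, "operand"): 1,
    (1, "OP"): 2,
    (2, "operand"): 3,
    (3, "KEYWORD"): 0,
}
ACCEPTING = {3}


def parse_clause(tokens):
    """Return 1 if the token sequence is accepted by the automaton, else 0."""
    state = 0
    for tok in tokens:
        symbol = "operand" if tok in OPERANDS else tok
        state = TRANSITIONS.get((state, symbol))
        if state is None:                # no transition: the clause is rejected
            return 0
    return 1 if state in ACCEPTING else 0
```

A clause that parses cleanly as target-language syntax is itself evidence: ordinary user input rarely forms a well-formed SQL condition by accident.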
26. The apparatus of claim 24,
the semantic analysis unit is configured to perform semantic analysis on the target language by: identifying a key function call body and a key feature substructure from the target language.
27. The apparatus of claim 26,
the semantic analysis unit is configured to identify the key function call body and the key feature substructure from the target language by: identifying the key function call body and the key feature substructure from the target language by bottom-up reduction.
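Bottom-up reduction means that inner constructs are recognized first and then folded into the constructs that enclose them. A minimal shift-reduce sketch for function calls follows. It assumes tokens are plain strings, and the watch list `KEY_FUNCTIONS` and the function `find_key_calls` are hypothetical.

```python
KEY_FUNCTIONS = {"sleep", "load_file", "benchmark"}  # hypothetical watch list


def find_key_calls(tokens):
    """Shift tokens onto a stack; on ')' reduce 'name ( args )' to a CALL node."""
    stack, found = [], []
    for tok in tokens:
        if tok != ")":
            stack.append(tok)                # shift
            continue
        args = []
        while stack and stack[-1] != "(":
            args.append(stack.pop())
        if not stack:
            continue                         # unbalanced ')': drop the partial input
        stack.pop()                          # discard the matching '('
        top = stack[-1] if stack else None
        name = stack.pop() if isinstance(top, str) and top.isidentifier() else None
        stack.append(("CALL", name, tuple(reversed(args))))  # reduce
        if name is not None and name.lower() in KEY_FUNCTIONS:
            found.append(name.lower())
    return found
```

Because the innermost call is reduced first, a nested payload such as `benchmark(10, sleep(1))` reports `sleep` before `benchmark`, and the count of each name can feed the semantic sub-score.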
28. The apparatus of claim 27, wherein the risk level determination module comprises:
a calculation submodule, configured to calculate a composite score of the results of the lexical analysis, the syntactic analysis and the semantic analysis;
a comparison submodule, configured to compare the composite score with a set threshold range;
a determining submodule, configured to determine the risk level of the request data according to the comparison result.
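Comparing a score with a set of threshold ranges amounts to finding the interval the score falls into. A minimal sketch using the standard `bisect` module follows; the threshold values and level names are hypothetical, since the claims do not specify them.

```python
import bisect

THRESHOLDS = [10.0, 30.0, 60.0]               # hypothetical range boundaries
LEVELS = ["none", "low", "medium", "high"]    # one more level than boundaries


def risk_level(score):
    """Map a composite score to the risk level of the range it falls into."""
    return LEVELS[bisect.bisect_right(THRESHOLDS, score)]
```

With these assumed boundaries, a score below 10 maps to "none", a score from 10 up to but not including 30 maps to "low", and so on.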
29. The apparatus of claim 28, wherein the calculation submodule comprises:
a sub-score calculating unit comprising: a first calculation component for calculating a first sub-score of the target language based on results of the lexical analysis, a second calculation component for calculating a second sub-score of the target language based on results of the syntactic analysis, and a third calculation component for calculating a third sub-score of the target language based on results of the semantic analysis;
a weighting unit, configured to weight the first sub-score, the second sub-score and the third sub-score respectively;
a composite score calculating unit, configured to calculate the composite score according to the weighted first sub-score, second sub-score and third sub-score.
30. The apparatus of claim 29,
the first calculation component is configured to calculate the first sub-score of the target language according to the result of the lexical analysis by: calculating the first sub-score according to the number of occurrences and the weight parameter of each token sequence.
31. The apparatus of claim 30,
the second calculation component is configured to calculate the second sub-score of the target language according to the result of the syntactic analysis by: calculating the second sub-score according to the syntactic analysis result and the weight parameter of the syntactic analysis result.
32. The apparatus of claim 31,
the third calculation component is configured to calculate the third sub-score of the target language according to the result of the semantic analysis by: calculating the third sub-score according to the number of occurrences and the weight parameter of the key function call body or the key feature substructure.
33. The apparatus of claim 32,
the calculation submodule is configured to calculate the composite score of the results of the lexical analysis, the syntactic analysis and the semantic analysis according to the following formula:
Figure FDA0002495815410000081
in the above formula:
Score(payload) is the composite score;
$n_t$ is the number of token sequences obtained by the lexical analysis;
$t_i$ is the number of occurrences of the i-th token sequence obtained by the lexical analysis;
$w_{t_i}$ is the weight parameter of the i-th token sequence;
$n_s$ is the number of clauses obtained by the syntactic analysis;
$s_j$ is the syntactic analysis result of the j-th clause obtained by the syntactic analysis, and is 0 or 1;
$w_{s_j}$ is the weight parameter of the j-th clause;
$n_k$ is the number of key function call bodies or key feature substructures obtained by the semantic analysis;
$m_k$ is the number of occurrences of the k-th key function call body or key feature substructure obtained by the semantic analysis;
$w_{m_k}$ is the weight parameter of the k-th key function call body or key feature substructure;
$C_t$, $C_s$ and $C_m$ are the weight parameters of the lexical analysis, the syntactic analysis and the semantic analysis, respectively, in the composite score.
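The calculation submodule of claim 33 can be sketched in a few lines. Because the formula is present here only as an image reference, the code assumes the form that the symbol definitions suggest: each sub-score is a weighted sum, and the composite score is a weighted sum of the three sub-scores. The function names and the sample numbers are hypothetical.

```python
def weighted_sum(values, weights):
    """Sum of value * weight over paired entries (one sub-score)."""
    return sum(v * w for v, w in zip(values, weights))


def composite_score(t, wt, s, ws, m, wm, c_t, c_s, c_m):
    """Combine the lexical, syntactic and semantic sub-scores (assumed form)."""
    return (c_t * weighted_sum(t, wt)      # token-sequence occurrences
            + c_s * weighted_sum(s, ws)    # per-clause parse results (0 or 1)
            + c_m * weighted_sum(m, wm))   # key call / substructure occurrences


# Hypothetical inputs: two token sequences, two clauses, one key function call.
score = composite_score(
    t=[2, 1], wt=[1.0, 3.0],
    s=[1, 0], ws=[5.0, 5.0],
    m=[1], wm=[10.0],
    c_t=0.2, c_s=0.3, c_m=0.5,
)
```

With these inputs the three sub-scores are 5, 5 and 10, so the composite score is 0.2 × 5 + 0.3 × 5 + 0.5 × 10 = 7.5.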
34. The apparatus of claim 33, wherein the apparatus further comprises:
an optimization module, configured to optimize the weight parameters through machine learning.
35. A terminal device comprising a memory and a processor; wherein,
the memory is configured to store one or more computer instructions, wherein the one or more computer instructions, when executed by the processor, are capable of implementing the method of any one of claims 1 to 17.
36. A computer storage medium storing one or more computer instructions which, when executed, are capable of implementing the method of any one of claims 1 to 17.
CN201710656777.7A 2016-08-30 2017-08-03 Network attack detection method and device, terminal equipment and computer storage medium Active CN107659555B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/099556 WO2018041114A1 (en) 2016-08-30 2017-08-30 Method and apparatus for detecting network attack, terminal device, and computer storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2016107726564 2016-08-30
CN201610772656 2016-08-30

Publications (2)

Publication Number Publication Date
CN107659555A (en) 2018-02-02
CN107659555B (en) 2020-08-11

Family

ID=61128260

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710656777.7A Active CN107659555B (en) 2016-08-30 2017-08-03 Network attack detection method and device, terminal equipment and computer storage medium

Country Status (2)

Country Link
CN (1) CN107659555B (en)
WO (1) WO2018041114A1 (en)

Also Published As

Publication number Publication date
CN107659555A (en) 2018-02-02
WO2018041114A1 (en) 2018-03-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20190703

Address after: 100024 Beijing Chaoyang District Guanzhuang Dongli (Chaoyang District Non-staple Food Company) 3 1-storey B26

Applicant after: Beijing Pulsar Technology Co., Ltd.

Address before: 100083 Beijing Haidian District College Road No. 5, Building No. 1, Building No. 3, Building No. 1, West 2-007

Applicant before: BEIJING CHAITIN TECH CO., LTD.

CB02 Change of applicant information

Address after: 100024 B26, floor 1, building 3, Guanzhuang Dongli (non staple food company), Chaoyang District, Beijing

Applicant after: Beijing Changting Future Technology Co., Ltd

Address before: 100024 Beijing Chaoyang District Guanzhuang Dongli (Chaoyang District Non-staple Food Company) 3 1-storey B26

Applicant before: Beijing Pulsar Technology Co., Ltd.

GR01 Patent grant