CN113132329A - WEBSHELL detection method, device, equipment and storage medium - Google Patents

Info

Publication number
CN113132329A
Authority
CN
China
Prior art keywords
data
webshell
response
request
detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911421815.6A
Other languages
Chinese (zh)
Inventor
张宏飞
王大伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sangfor Technologies Co Ltd
Original Assignee
Sangfor Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sangfor Technologies Co Ltd
Priority to CN201911421815.6A
Publication of CN113132329A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • H04L63/1425 Traffic logging, e.g. anomaly detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention discloses a WEBSHELL detection method, which comprises obtaining the response data fed back to a client by a server according to request data, and performing WEBSHELL feature identification on the response data. To improve detection accuracy, WEBSHELL feature identification can be performed on the request data and the response data together, realizing bidirectional detection of the interaction between the client and the server. Because the method identifies WEBSHELL by using WEBSHELL response features, it has stronger detection and generalization capability and a lower false alarm rate, and can improve the backdoor detection effect. The invention also provides a WEBSHELL detection device, computer equipment and a readable storage medium, which have the same beneficial effects.

Description

WEBSHELL detection method, device, equipment and storage medium
Technical Field
The present invention relates to the field of network security, and in particular, to a WEBSHELL detection method and apparatus, a computer device, and a readable storage medium.
Background
To ensure the security of a website server, illegal access behavior must be monitored. A WebShell is a command execution environment in the form of a web page file such as asp, php, jsp or cgi. It is an important tool that hackers use to penetrate further into websites and hosts, and may also be referred to as a web backdoor. After invading a website, an illegal user usually mixes asp or php backdoor files with the normal web page files under the WEB directory of the website server, and then accesses the asp or php backdoor with a browser to obtain a command execution environment, thereby performing operations such as file reading and writing, database queries, and intranet sniffing.
The traffic generated by accessing a WebShell backdoor differs in its characteristics from normal service traffic. Using this difference, the industry mainly relies on two techniques to detect WebShell backdoor traffic: 1. Collecting known WebShell tools and extracting fingerprint features. A fingerprint can only be extracted after the WebShell tool has been exposed, so this detection scheme has a vacuum period and always lags behind the attacker. Moreover, as the obfuscation and encryption capabilities of WebShells improve, it is difficult to extract strong features from new WebShell communication, which easily leads to false alarms or missed detections. 2. Extracting features of WebShell request traffic to detect WebShell backdoor communication. However, the request traffic of many WebShell backdoors (especially large, full-featured WebShells, the so-called "big horses", and encrypted WebShells) is similar to normal service traffic and contains no obvious attack features, so detection methods based on request data have a low detection rate and a high false alarm rate.
Therefore, how to improve WEBSHELL recognition accuracy is a technical problem to be solved by those skilled in the art.
Disclosure of Invention
The invention aims to provide a WEBSHELL detection method, which detects WEBSHELL with high accuracy and improves the backdoor detection effect; another object of the present invention is to provide a WEBSHELL detection apparatus, a computer device and a readable storage medium.
In order to solve the above technical problem, the present invention provides a WEBSHELL detection method, including:
after a client sends request data to a server, response data fed back to the client by the server according to the request data is obtained;
and performing WEBSHELL characteristic identification on the service flow according to the response data to generate a WEBSHELL identification result.
Optionally, the WEBSHELL detection method further includes:
acquiring the request data;
correspondingly, performing WEBSHELL characteristic identification on the service flow according to the response data, including: and carrying out WEBSHELL characteristic identification on the service flow according to the request data and the response data.
Optionally, performing WEBSHELL characteristic identification on the service flow according to the request data and the response data includes:
calling a machine learning/deep learning model to perform WEBSHELL feature identification on the request data and the response data to obtain a model identification result;
and generating the WEBSHELL recognition result according to the model recognition result.
Optionally, invoking a machine learning/deep learning model to perform WEBSHELL feature recognition on the request data and the response data, including:
inputting the request data into a request detection model for performing request data characteristic identification to obtain a request identification result;
inputting the response data into a response detection model for response data feature identification to obtain a response identification result;
and taking the request identification result and the response identification result as the model identification result.
Optionally, inputting the request data into a request detection model for performing request data feature identification, including:
inputting the request data into the request detection model;
the request detection model performs WEBSHELL identification on the request data according to preset dangerous request characteristics; wherein the dangerous request characteristics include at least one of: a specified dangerous function call, a specified dangerous command, a specified special character, a characteristic of a specified known backdoor, a specified request traffic information entropy, and a specified access statistics characteristic.
Optionally, inputting the response data into a response detection model for response data feature identification, including:
inputting the response data into the response detection model;
the response detection model performs WEBSHELL identification on the response data according to preset dangerous response characteristics; wherein the dangerous response characteristics include at least one of: specified response packet page view data, a specified operation keyword, specified sensitive file information, and a specified character.
Optionally, the WEBSHELL detection method further includes:
judging whether a high-performance mode is enabled;
if so, performing WEBSHELL characteristic identification on the service flow according to the request data or the response data;
and if not, executing the step of acquiring the request data.
Optionally, the WEBSHELL detection method further includes:
and performing corresponding data interception blocking or data release according to the WEBSHELL identification result.
Optionally, performing corresponding data interception blocking or data release according to the WEBSHELL identification result, including:
when the WEBSHELL identification result contains two or more detection values, determining a working mode; wherein the working mode includes at least one of a high detection mode and a low false alarm mode;
when the working mode is the high detection mode, if a detection value judged as WEBSHELL exists among the detection values, generating a WEBSHELL identification result indicating that a WEBSHELL exists, and performing connection interception and reporting;
when the working mode is the low false alarm mode, if a detection value greater than the interception threshold exists among the detection values, generating a WEBSHELL identification result indicating that a WEBSHELL exists, and performing connection interception and reporting; wherein the interception threshold is greater than 1.
The application discloses a WEBSHELL detection device, including:
the data acquisition unit is used for acquiring response data fed back to the client by the server according to the request data after the client sends the request data to the server;
and the characteristic identification unit is used for carrying out WEBSHELL characteristic identification on the service flow according to the response data and generating a WEBSHELL identification result.
The application discloses a computer device, including:
a memory for storing a program;
and the processor is used for realizing the steps of the WEBSHELL detection method when the program is executed.
A readable storage medium having a program stored thereon, which when executed by a processor, performs the steps of the WEBSHELL detection method.
The applicant has found that, in the response data generated by a server according to request data initiated by a client, WEBSHELL-based illegal access behaviors and legal access behaviors exhibit highly distinguishing characteristics, for example: the operation habits of attackers, the operations supported by WEBSHELL, and the fact that the response contains the results of the attack, such as sensitive file information. The WEBSHELL detection method performs WEBSHELL identification through these WEBSHELL response characteristics, has strong detection and generalization capability and a low false alarm rate, and can improve the backdoor detection effect.
The invention also provides a WEBSHELL detection device, computer equipment and a readable storage medium, which have the same beneficial effects and are not described again here.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a flowchart of a WEBSHELL detection method according to an embodiment of the present invention;
Fig. 2 is a block diagram of a WEBSHELL detection apparatus according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The core of the invention is to provide a WEBSHELL detection method, which detects WEBSHELL with high accuracy and improves the backdoor detection effect; another core of the present invention is to provide a WEBSHELL detection apparatus, a computer device and a readable storage medium.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Research on WEBSHELL-based interaction data between clients and servers shows that, in the response data generated by the server according to request data initiated by the client, WEBSHELL-based illegal access behaviors and legal access behaviors exhibit highly distinguishing characteristics, for example: the operation habits of attackers are similar, so the layout (such as the background) of the returned HTML pages (response packets) is also similar; the operations supported by WebShells are the same (such as file operations and database operations), so the HTML page in the response packet also contains the corresponding keywords; and the response contains the results of the attack, such as sensitive file information. These features exist only in the response data; therefore, the invention provides a WEBSHELL detection method that can improve WEBSHELL detection accuracy.
Example one
Referring to Fig. 1, Fig. 1 is a flowchart of the WEBSHELL detection method according to the present embodiment; the method mainly comprises the following steps:
step s110, after the client sends the request data to the server, obtaining response data fed back to the client by the server according to the request data;
The request data refers to the data in a request packet sent by the client to the server, and includes the request line, request headers, request body, etc. The response data refers to the data in a response packet sent by the server to the client. In this embodiment, the response data is the data fed back to the client according to the request data after the server receives that request data; that is, the request data and the response data in this embodiment are generated in one complete interaction.
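To make the structure of the request data concrete, the following minimal sketch splits a raw HTTP request into its request line, headers, and body. The function name `split_http_request` and the sample message are assumptions introduced for illustration only; they are not part of the disclosed method.

```python
def split_http_request(raw: str):
    """Split a raw HTTP request into (request_line, headers, body)."""
    # The header section and the body are separated by the first blank line.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    request_line = lines[0]
    headers = {}
    for line in lines[1:]:
        # Header names are case-insensitive, so they are stored in lower case.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return request_line, headers, body


# Hypothetical sample message used only to exercise the function.
sample = (
    "POST /index.php HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Length: 7\r\n"
    "\r\n"
    "a=1&b=2"
)
request_line, headers, body = split_http_request(sample)
```

A response packet could be split in the same way, with the first line of the header section treated as the status line instead of the request line.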
It should be noted that, in this embodiment, it is necessary to acquire response data and perform corresponding feature recognition according to the response data, but the present invention is not limited to acquiring only the response data to perform the feature recognition, and in addition to the response data, other data may be further acquired to perform the feature recognition together with the response data.
And step s120, performing WEBSHELL characteristic identification on the service flow according to the response data, and generating a WEBSHELL identification result.
This embodiment does not limit the identification mode or the specific WEBSHELL feature indicators used in the identification process. Optionally, a machine learning/deep learning model may be invoked to perform WEBSHELL feature identification on the response data. Compared with the conventional WEBSHELL fingerprint matching mode, a machine learning/deep learning model has stronger expressive and generalization capability (the adaptability of a machine learning algorithm to new samples), so the detection capability and accuracy for WEBSHELL can be greatly improved.
A WebShell attack response generally contains sensitive information obtained by the attack (such as a user name or a sensitive path); WebShell authors have broadly similar habits in setting up the HTML page of the response packet (for example, the page background is mostly black, along with certain specific page layout features); the operations supported by WebShells are the same (such as file operations and database operations), so the HTML page in the response packet also contains the corresponding keywords; and the response packet of a well-known WebShell backdoor may contain certain specific character features (such as the name of the WebShell family). Based on these characteristics, the response packet can be analyzed and identified according to the response packet page view data, the specified operation keywords, the specified sensitive file information and the specified characters, so that WebShell response packets and normal response packets are accurately distinguished.
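The four kinds of response features described above can be sketched as a simple feature extractor. This is only an illustrative sketch: the keyword lists, regular expressions, and the function name `extract_response_features` are assumptions, and a real system would feed such features (or the raw page) into a trained model rather than rely on fixed lists.

```python
import re

# Hypothetical indicator lists chosen for illustration; not taken from the patent.
OPERATION_KEYWORDS = ("file manager", "upload", "execute", "database", "chmod")
SENSITIVE_PATTERNS = (r"root:[^:]*:0:0:", r"/etc/passwd", r"[A-Za-z]:\\Windows")
FAMILY_NAMES = ("c99", "r57", "b374k")  # examples of well-known WebShell families


def extract_response_features(html: str) -> dict:
    """Extract page-view, keyword, sensitive-file and character features."""
    text = html.lower()
    # Page view data: a black page background declared in CSS or via bgcolor.
    dark_background = bool(
        re.search(r"background(-color)?\s*:\s*(#000(000)?\b|black)", text)
        or re.search(r"bgcolor\s*=\s*[\"']?(#000000|black)", text)
    )
    return {
        "dark_background": dark_background,
        # Number of distinct operation keywords present in the page.
        "operation_keywords": sum(k in text for k in OPERATION_KEYWORDS),
        # Number of sensitive-file patterns that match the page.
        "sensitive_hits": sum(bool(re.search(p, html)) for p in SENSITIVE_PATTERNS),
        # Whether a known family name appears as a standalone word.
        "family_name": any(
            re.search(r"\b" + re.escape(n) + r"\b", text) for n in FAMILY_NAMES
        ),
    }


sample_html = (
    '<html><body style="background-color:#000000;color:#0f0">'
    "<h1>c99 shell</h1><a>File Manager</a> <a>Upload</a>"
    "<pre>root:x:0:0:root:/root:/bin/bash</pre></body></html>"
)
features = extract_response_features(sample_html)
```

The resulting dictionary can be converted to a numeric vector and passed to whichever response detection model is used.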
The embodiment is applicable to Web attack detection and other scenes, and is applied to products such as firewalls, security situation awareness and the like.
Based on the above description, the WEBSHELL detection method provided in this embodiment obtains the response data fed back to the client by the server according to the request data, and performs WEBSHELL feature identification on the service traffic according to the response data. By identifying WEBSHELL through WEBSHELL response features, the method has strong detection and generalization capability and a low false alarm rate, and can improve the backdoor detection effect.
Example two
Because the request data sent by the client to the server also contains certain WEBSHELL feature data, for improving the detection accuracy, WEBSHELL feature identification can be simultaneously carried out according to the request data and the response data, and bidirectional detection during interaction between the client and the server is realized, so that WEBSHELL can be more accurately identified. Accordingly, in addition to the above steps, the following steps may be further performed: acquiring the request data, specifically, step s110 in the first embodiment may be adjusted to: acquiring request data sent by the client to the server and response data fed back to the client by the server according to the request data, and accordingly, step s120 specifically includes: and carrying out WEBSHELL characteristic identification on the service flow according to the request data and the response data. The model identification result includes two parts, i.e., a request identification result and a response identification result.
This embodiment does not limit how WEBSHELL feature identification is performed on the service traffic according to the request data and the response data. Optionally, a machine learning/deep learning model may be invoked to perform WEBSHELL feature identification on the request data and the response data to obtain a model identification result, and a WEBSHELL identification result is then generated according to the model identification result.
Compared with the conventional WebShell fingerprint matching mode, feature identification by a machine learning/deep learning model has stronger expressive and generalization capability (the adaptability of a machine learning algorithm to new samples), so the detection capability and accuracy for WebShell can be greatly improved. It should be noted that, when a machine learning/deep learning model is called, separate models can be established for the request data and the response data. In that case, the process of calling a machine learning/deep learning model to perform WEBSHELL feature identification on the request data and the response data specifically includes:
(1) inputting the request data into a request detection model for performing request data characteristic identification to obtain a request identification result;
(2) inputting the response data into a response detection model to perform response data characteristic identification to obtain a response identification result;
(3) and taking the request identification result and the response identification result as the model identification result.
The method can realize accurate feature identification and judgment aiming at the internal features in the request data and the response data respectively.
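Steps (1) to (3) above can be sketched as follows. The two `*_stub` functions are hypothetical stand-ins for trained request and response detection models, and all names are assumptions made for illustration.

```python
from typing import Callable, Dict

# A detection model is represented here as any callable that maps raw traffic
# text to a score in [0, 1]; trained models would be plugged in instead.
Model = Callable[[str], float]


def request_model_stub(request_data: str) -> float:
    """Hypothetical stand-in for a trained request detection model."""
    return 1.0 if "/etc/passwd" in request_data else 0.0


def response_model_stub(response_data: str) -> float:
    """Hypothetical stand-in for a trained response detection model."""
    return 1.0 if "root:x:0:0:" in response_data else 0.0


def identify(
    request_data: str,
    response_data: str,
    request_model: Model,
    response_model: Model,
) -> Dict[str, float]:
    # Score each direction separately and return both results together
    # as the model identification result.
    return {
        "request": request_model(request_data),
        "response": response_model(response_data),
    }


model_result = identify(
    "GET /shell.php?cmd=cat+/etc/passwd HTTP/1.1",
    "HTTP/1.1 200 OK\r\n\r\nroot:x:0:0:root:/root:/bin/bash",
    request_model_stub,
    response_model_stub,
)
```

Keeping the two models behind the same callable interface makes it straightforward to replace either stub with a trained classifier without changing the surrounding logic.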
Alternatively, the request data and the response data can be combined into a single message that is used as the input of an overall recognition model. In this case, calling a machine learning/deep learning model to perform WEBSHELL feature identification on the request data and the response data specifically includes:
(1) combining the request data and the response data, and taking the combined data as a data interaction message;
(2) inputting the data interaction message into an interaction model for overall data interaction feature recognition, and taking an output result of the interaction model as a model recognition result; the interactive model is a machine learning/deep learning model obtained by training according to the data interactive message sample.
When the overall recognition model is established, accurate recognition and judgment of the association characteristics between the combined request data and the response data can be realized.
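The combined approach can be sketched as follows. The separator string, the function names, and the rule inside `interaction_model_stub` are assumptions made for illustration; the stub stands in for a model trained on data interaction message samples and merely shows the kind of request/response association such a model could exploit.

```python
SEPARATOR = "\n--response--\n"  # hypothetical delimiter between the two parts


def build_interaction_message(request_data: str, response_data: str) -> str:
    """Step (1): combine request and response into one interaction message."""
    return request_data + SEPARATOR + response_data


def interaction_model_stub(message: str) -> float:
    """Step (2): hypothetical stand-in for a trained interaction model.

    It fires only when the request asks for a sensitive file AND the
    response actually contains that file's content, which is an association
    that neither direction shows on its own.
    """
    request_part, _, response_part = message.partition(SEPARATOR)
    asked_for_passwd = "/etc/passwd" in request_part
    got_passwd = "root:x:0:0:" in response_part
    return 1.0 if asked_for_passwd and got_passwd else 0.0


message = build_interaction_message(
    "GET /shell.php?cmd=cat+/etc/passwd HTTP/1.1",
    "HTTP/1.1 200 OK\r\n\r\nroot:x:0:0:root:/root:/bin/bash",
)
score = interaction_model_stub(message)
```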
When the models corresponding to the request data and the response data are respectively established for judgment, judgment of the overall flow (request + response) by the overall recognition model can be further added, that is, analysis of the data is realized through the three models.
Whether for the request detection model, the response detection model, the interaction model, or other models, the model can be implemented with machine learning algorithms such as SVM, GBDT and RF, with deep learning models such as CNN, RNN and LSTM, or with a combination of machine learning and deep learning models; the invention is not limited to a particular algorithm or combination of algorithms.
In addition, this embodiment does not limit the output form of the model identification result, which may be a discrete classification result or a continuous probability result. The WEBSHELL identification result indicates whether the current data is judged, according to the model identification, to contain a WEBSHELL backdoor. Since the WEBSHELL identification result differs with the output form of the model result, the way in which the WEBSHELL identification result is generated from the model identification result is also not limited in this embodiment.
An example of the discrete classification result form:
req_c: the model's determination result for the request traffic; the output is 0 or 1;
res_c: the model's determination result for the response traffic; the output is 0 or 1;
If the model output value (req_c or res_c) is 1, the model judges that a WebShell backdoor exists in the data; conversely, if the model output value (req_c or res_c) is 0, the model judges that no WebShell backdoor exists in the data.
An example of the continuous probability result form:
req_p: the model's determination result for the request traffic, indicating the probability that a WebShell exists;
res_p: the model's determination result for the response traffic, indicating the probability that a WebShell exists;
The model output value represents the probability, as judged by the model, that a WebShell backdoor exists in the data. For example, req_p = 0.9 means that the machine learning model considers the probability that a WebShell backdoor exists in the request-direction data to be 0.9. The model output value lies in the range [0, 1]; the closer it is to 1, the higher the probability that a WebShell backdoor exists, and the closer it is to 0, the lower that probability.
When the judging module judges whether a WebShell backdoor exists in the communication, it can make a comprehensive judgment based on the detection results in both directions. The invention is not limited to a specific scheme; a weighted average is taken as an example below. Let P denote the probability that a WebShell backdoor exists in the communication; P can then be calculated by the following formula:
$$P = \sum_{i=1}^{N} w_i \, p_i$$
where w_i represents the weight of each model, p_i represents the probability with which each model detects a WebShell backdoor, and N represents the number of models.
The final WebShell detection result can be obtained as the weighted average of the detection results of all models. The invention does not limit how the model weights w_i are set: they can be set empirically, learned by a machine learning model, or set equally for all models.
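The weighted-average fusion described above can be sketched as follows. The function name `fuse_probabilities` and the example weights are assumptions; dividing by the sum of the weights is also an assumption, made so that the result stays in [0, 1] even when the weights are not normalized (when the weights already sum to 1, it reduces to the plain weighted sum of the p_i).

```python
from typing import Optional, Sequence


def fuse_probabilities(
    probabilities: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted average of per-model WebShell probabilities."""
    if not probabilities:
        raise ValueError("at least one model probability is required")
    if weights is None:
        # Equal weighting of all models.
        weights = [1.0] * len(probabilities)
    if len(weights) != len(probabilities):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * p for w, p in zip(weights, probabilities)) / total


# req_p and res_p as produced by the request and response models.
req_p, res_p = 0.9, 0.6
p_weighted = fuse_probabilities([req_p, res_p], weights=[0.4, 0.6])
p_equal = fuse_probabilities([req_p, res_p])
```

The same function applies unchanged to a single model or to three or more models, since N is simply the length of the input sequence.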
In addition, this embodiment describes the generation of the WEBSHELL identification result only for the case in which the model identification result contains two results, namely the request result and the response result. The generation of the WEBSHELL identification result from the identification result of a single model, or of three or more models, can refer to the above description and is not repeated here.
In addition, in this embodiment, the specific identified feature types in each model (which may include a request detection model, a response detection model, and an interaction model) are not limited, and may be set correspondingly according to the analyzed features of the request data, the response data, and the overall interaction traffic data, which is not described in detail in this embodiment.
When a plurality of identification methods are configured in the system, such as identification of request data, identification of response data, and identification of interaction data (request + response associated interaction data), before performing identification one by one, the following steps may be further performed:
judging whether a high-performance mode is started or not;
if so, performing WEBSHELL characteristic identification on the service flow according to the request data or the response data;
and if not, executing the step of acquiring the request data.
When multiple identification modes are configured, the high-performance mode does not need to start all detection channels. Taking the two identification modes above as an example, by configuring the high-performance mode the detection system can enable only data detection in the request direction and disable data detection in the response direction, in which case all detection results are determined by the request-direction detection model; or it can enable only data detection in the response direction and disable data detection in the request direction, in which case all detection results are determined by the response-direction detection model. These configurations can be flexibly combined as required. This mode is suitable for scenarios with high performance requirements and can reduce the resources consumed by detection.
Of course, the high performance mode may not be configured, and is not limited herein.
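The channel selection described above can be sketched as a small configuration function. The function name `select_channels` and the `preferred` parameter are assumptions introduced for illustration; the text leaves the choice of the retained direction to the configuration.

```python
from typing import Tuple

ALL_CHANNELS: Tuple[str, ...] = ("request", "response")


def select_channels(high_performance: bool, preferred: str = "response") -> Tuple[str, ...]:
    """Return the detection channels to run for one interaction.

    In high-performance mode only one direction is inspected, so the
    detection result is decided by that direction's model alone.
    Otherwise both directions are inspected.
    """
    if high_performance:
        if preferred not in ALL_CHANNELS:
            raise ValueError("preferred must be 'request' or 'response'")
        return (preferred,)
    return ALL_CHANNELS


channels_fast = select_channels(high_performance=True, preferred="request")
channels_full = select_channels(high_performance=False)
```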
EXAMPLE III
In the second embodiment, the specific identified feature types in the models (which may include the request detection model, the response detection model, and the interaction model) are not limited, and may be set correspondingly according to the analyzed features of the request data, the response data, and the overall interaction traffic data. In order to deepen understanding of the feature recognition process under various models, several implementation manners of feature recognition are described in this embodiment.
Optionally, a request data feature identification manner of the request detection model is as follows:
(1) inputting request data into a request detection model;
(2) the request detection model performs WEBSHELL identification on the request data according to preset dangerous request characteristics; wherein the dangerous request characteristics include at least one of: a specified dangerous function call, a specified dangerous command, a specified special character, a characteristic of a specified known backdoor, a specified request traffic information entropy, and a specified access statistics characteristic.
In this manner, a machine learning/deep learning model identifies features such as dangerous function calls, dangerous commands, suspicious (special) characters, characteristics of known backdoors, request traffic information entropy and access statistics, which can improve identification accuracy.
For a better understanding, dangerous command recognition is described below as an example.
During communication between the client and the server, the client first sends request data to the server, and the request data contains all request behaviors sent by the client to the server. For example, to obtain the contents of sensitive files, an attacker often executes malicious commands such as cat /etc/passwd; these commands are passed to the server for execution through the request traffic. An example of WebShell request traffic data is shown below:
GET /shell.php?cmd=cat+/etc/passwd HTTP/1.1
User-Agent: python-requests/2.18.4
Accept-Encoding: deflate
Accept: */*
……
As can be seen, the malicious instruction cat+/etc/passwd distinguishes the WebShell request traffic data from ordinary request traffic data, so WebShell backdoor traffic can be identified by extracting information such as dangerous operation commands appearing in the request traffic. This embodiment describes only the principle and implementation of identifying a specified malicious command; the identification of other request data features can refer to the above description and is not repeated here.
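Extracting dangerous commands from a request line can be sketched as follows. The pattern list and the function name `find_dangerous_commands` are assumptions made for illustration, and the sample request line assumes the conventional `?` separator between path and query string; a deployed system would use a curated pattern set or learned features.

```python
import re
from urllib.parse import parse_qsl, urlsplit

# Hypothetical patterns for illustration; not an exhaustive or official list.
DANGEROUS_COMMAND_PATTERNS = (
    r"\bcat\s+/etc/(passwd|shadow)\b",
    r"\b(whoami|ifconfig|uname\s+-a)\b",
    r"\bnet\s+user\b",
)


def find_dangerous_commands(request_line: str) -> list:
    """Return dangerous commands found in the query parameters of a request line."""
    _method, target, _version = request_line.split(" ", 2)
    query = urlsplit(target).query
    hits = []
    # parse_qsl percent-decodes values and turns '+' into a space.
    for _name, value in parse_qsl(query, keep_blank_values=True):
        for pattern in DANGEROUS_COMMAND_PATTERNS:
            match = re.search(pattern, value)
            if match:
                hits.append(match.group(0))
    return hits


hits = find_dangerous_commands("GET /shell.php?cmd=cat+/etc/passwd HTTP/1.1")
```

The number or type of hits can then be used as one input feature of the request detection model, alongside the other request features listed above.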
In addition, it should be noted that the model called in the actual feature recognition process is a pre-trained model. To deepen understanding, an implementation of the model training process corresponding to the feature recognition method of this embodiment is described below; it specifically includes the following two steps:
(1) request packet data is collected for training the request detection model. The request data comprises normal business data and request data containing WebShell backdoor operation.
(2) And extracting the request flow characteristics and training a request detection model. The extracted features include, but are not limited to, dangerous function calls, dangerous commands, suspicious (special) characters, features of well-known backdoors, entropy of request traffic information, access statistics, and the like.
After the model training is finished, the request data can be judged according to the request detection model obtained by training, and a judgment result is output.
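As a hedged illustration of step (2), the following Python sketch turns a request payload into a small feature dictionary that a classifier could be trained on. The names request_features, DANGEROUS_FUNCTIONS and SPECIAL_CHARACTERS are assumptions made for this example, the lists are illustrative only, and the training of the model itself (on labelled normal and WebShell samples) is not shown.

```python
import math
from collections import Counter

# Illustrative lists only (assumptions for this sketch).
DANGEROUS_FUNCTIONS = ("eval(", "system(", "exec(", "assert(", "base64_decode(")
SPECIAL_CHARACTERS = set("`$|;&<>{}")


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def request_features(payload: str) -> dict:
    """Extract three of the request features named above from a raw payload."""
    lowered = payload.lower()
    return {
        "dangerous_function_calls": sum(lowered.count(f) for f in DANGEROUS_FUNCTIONS),
        "special_characters": sum(1 for ch in payload if ch in SPECIAL_CHARACTERS),
        "entropy": shannon_entropy(payload),
    }
```

Feature dictionaries of this kind, computed for both normal business requests and WebShell requests, would form the training set for the request detection model.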
Optionally, an implementation manner of inputting the response data into the response detection model for response data feature identification is as follows:
(1) inputting the response data into a response detection model;
(2) the response detection model performs WEBSHELL identification on the response data according to preset dangerous response features; wherein the dangerous response features include at least one of: specified response packet page view data, specified operation keywords, specified sensitive file information, and specified characters.
A WebShell attack response generally contains sensitive information obtained by the attack (such as user names and sensitive paths); WebShell authors tend to style the HTML page in the response packet in similar ways (for example, the page background is mostly black, along with certain specific page layout features); the operations supported by WebShells are largely the same (such as file operations and database operations), and the HTML page in the response packet contains the corresponding keywords; and the response packet of a well-known WebShell backdoor may contain certain specific character features (such as the WebShell family name). In this approach, based on a machine learning/deep learning model, WebShell analysis and identification is performed on the response packet using features including but not limited to specified response packet page view data, specified operation keywords, specified sensitive file information, and specified characters, so that WebShell response packets can be accurately distinguished from normal ones.
To aid understanding, WebShell response packet analysis and recognition based on the keywords corresponding to operations supported by the WebShell is described below as an example.
After receiving the request data from the client, the server sends response data to the client. Because the response contains the sensitive information that the attacker is trying to obtain, it differs considerably from normal traffic. After receiving traffic data containing a malicious instruction such as cat /etc/passwd, the server parses the data packet, executes the corresponding command, and finally returns the execution result (the contents of the /etc/passwd file) as a response packet. Thus, one approach is to extract information in a format like A:B:C:D:E:F:G from the response traffic to identify WebShell backdoor communications.
An example of response packet information in WebShell response traffic is as follows:
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sy
……
It can be seen that information in the format A:B:C:D:E:F:G, such as root:x:0:0:root:/root:/bin/bash, daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin, bin:x:2:2:bin:/bin:/usr/sbin/nologin, and sys:x:3:3:sys:/dev:/usr/sbin/nologin, is present in the WebShell response traffic data and distinguishes it from normal response traffic data, so WebShell backdoor traffic can be identified by extracting the dangerous operation information present in the response traffic. The above takes only the identification of WebShell operation keywords in the response packet as an example; implementations based on other keywords or other rules can refer to the above description and are not repeated here.
In addition, it should be noted that the model called during actual feature recognition is a pre-trained model. To aid understanding, an implementation of the model training process corresponding to the feature recognition method of this embodiment is described below; it specifically includes the following two steps:
(1) Response packet data is collected for training the response detection model. The response packet data comprises normal service data and response data containing WebShell backdoor behaviors.
(2) The response traffic features are extracted and the response detection model is trained. The extracted features include, but are not limited to, sensitive information in the response (such as user names and sensitive paths); hacker habits (such as page background color and page layout); features of well-known WebShell backdoors (such as WebShell family names); and the like.
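The extraction of A:B:C:D:E:F:G formatted information can be sketched in Python with a regular expression over the lines of the response body. This is only one possible realization under stated assumptions: the names PASSWD_LINE, count_passwd_like_lines and looks_like_passwd_dump, and the default threshold of two matching lines, are hypothetical and do not come from this application.

```python
import re

# Seven colon-separated fields: name:password:UID:GID:comment:home:shell.
# UID and GID must be numeric; the last field may be empty.
PASSWD_LINE = re.compile(r"^[^:\s]+:[^:]*:\d+:\d+:[^:]*:[^:]*:[^:\s]*$")


def count_passwd_like_lines(body: str) -> int:
    """Count lines of a response body that match the seven-field format."""
    return sum(1 for line in body.splitlines() if PASSWD_LINE.match(line.strip()))


def looks_like_passwd_dump(body: str, min_lines: int = 2) -> bool:
    """Flag a response once at least min_lines lines match (threshold is an assumption)."""
    return count_passwd_like_lines(body) >= min_lines
```

A truncated line with fewer than seven fields does not match, so a response cut off mid-line is counted only for its complete lines.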
After the training of the response detection model is completed, the response detection model obtained by training can be used for judging the response data and outputting a judgment result.
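The response features listed in step (2) could be extracted along the following lines. This Python sketch is an assumption-laden illustration: the function response_features, the keyword list OPERATION_KEYWORDS, the family-name list FAMILY_NAMES and the DARK_BACKGROUND pattern are all hypothetical choices for the example, and simple substring matching stands in for whatever feature extraction a real system would use.

```python
import re

# Illustrative lists only (assumptions for this sketch).
OPERATION_KEYWORDS = ("file manager", "upload", "execute", "sql query")
FAMILY_NAMES = ("c99", "r57", "b374k")

# Matches a black page background set via bgcolor=... or a CSS background rule.
DARK_BACKGROUND = re.compile(
    r"(bgcolor\s*=\s*[\"']?|background(-color)?\s*:\s*)(#000000|#000|black)\b",
    re.IGNORECASE,
)


def response_features(html: str) -> dict:
    """Extract keyword, family-name and page-style features from a response page."""
    lowered = html.lower()
    return {
        "operation_keywords": sum(1 for k in OPERATION_KEYWORDS if k in lowered),
        "family_names": sum(1 for n in FAMILY_NAMES if n in lowered),
        "dark_background": DARK_BACKGROUND.search(html) is not None,
    }
```

As with the request side, such feature dictionaries computed over labelled normal and WebShell responses would serve as training input for the response detection model.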
The present application does not limit the specific implementation of request data feature extraction. The above implementation is described only as an example; other implementations based on the present application can refer to the description of this embodiment and are not repeated here.
Example four
Based on the above embodiment, after the WEBSHELL identification result is generated, corresponding data interception blocking or data release can be performed according to the WEBSHELL identification result in order to ensure the security of system operation. Specifically, when the identification result shows that the current traffic is WEBSHELL traffic, the data can be intercepted and reported, so that further spread of dangerous data is avoided; when the identification result shows that the current data is normal data, the data is released.
When the WEBSHELL recognition result includes only one detection value (for example, only a model recognition result for the response data, or only a model recognition result for the overall interaction data), the corresponding WEBSHELL determination may be made directly. When the WEBSHELL recognition result includes two or more detection values, a corresponding interception policy may be configured to implement different working modes in order to suit different user needs. Specifically, the process of performing corresponding data interception blocking or data release according to the WEBSHELL recognition result may include: when the WEBSHELL identification result contains two or more detection values, determining a working mode; wherein the working mode includes at least one of a high detection mode and a low false alarm mode (the working modes of the WebShell detection device include, but are not limited to, these two modes);
when the working mode is the high detection mode, if a detection value judged as WEBSHELL exists among the detection values, generating a WEBSHELL identification result indicating that WEBSHELL exists, and performing connection interception and reporting;
when the working mode is the low false alarm mode, if a detection value greater than the interception threshold exists among the detection values, judging that WEBSHELL exists, generating a WEBSHELL identification result indicating that WEBSHELL exists, and performing connection interception and reporting; wherein the interception threshold is greater than 1.
Taking as an example a WEBSHELL recognition result that includes the discrete outputs of the request detection model and the response detection model: when configured in high detection mode, as long as one of req_c (the output result of the request detection model) and res_c (the output result of the response detection model) is 1, the connection is intercepted; that is, as long as a WebShell backdoor is detected in either direction, whether in the request traffic or in the response traffic, the comprehensive judgment module intercepts the connection and reports it.
When configured in low false alarm mode, the connection is intercepted only when req_c (the output result of the request detection model) and res_c (the output result of the response detection model) are both 1; that is, the comprehensive judgment module intercepts the connection and reports it only when a WebShell backdoor is detected in both the request traffic and the response traffic.
The specific working mode can be configured according to customer requirements. For example, customers with high security requirements can switch to the high detection mode, while customers with a low tolerance for false alarms (such as those in the financial industry) can choose to turn on the low false alarm mode. This embodiment is described using only the above cases as examples; other implementations based on the present application can refer to the description of this embodiment and are not repeated here.
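The two working modes amount to a logical OR and a logical AND over the discrete model outputs. The following Python sketch illustrates this for the two-output case described above, with the outputs written req_c and res_c; the names WorkingMode and should_intercept are hypothetical and chosen only for this example.

```python
from enum import Enum


class WorkingMode(Enum):
    HIGH_DETECTION = "high_detection"
    LOW_FALSE_ALARM = "low_false_alarm"


def should_intercept(req_c: int, res_c: int, mode: WorkingMode) -> bool:
    """Decide whether to intercept and report a connection.

    req_c / res_c are the discrete outputs (1 = WebShell detected, 0 = normal)
    of the request detection model and the response detection model.
    """
    if mode is WorkingMode.HIGH_DETECTION:
        # Intercept when a backdoor is detected in either direction.
        return req_c == 1 or res_c == 1
    if mode is WorkingMode.LOW_FALSE_ALARM:
        # Intercept only when a backdoor is detected in both directions.
        return req_c == 1 and res_c == 1
    raise ValueError(f"unsupported working mode: {mode!r}")
```

For a connection where only the request model fires (req_c = 1, res_c = 0), the high detection mode intercepts and the low false alarm mode releases the data.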
Example five
Referring to fig. 2, fig. 2 is a block diagram of the WEBSHELL detection apparatus provided in this embodiment. The apparatus mainly includes a data acquisition unit 210 and a feature identification unit 220. The WEBSHELL detection apparatus provided in this embodiment and the WEBSHELL detection method described above may be read with reference to each other.
The data acquisition unit 210 is mainly configured to acquire the response data that the server feeds back to the client according to the request data after the client sends the request data to the server;
the feature identification unit 220 is mainly configured to perform WEBSHELL feature identification on the service traffic according to the response data, and to generate a WEBSHELL identification result.
The WEBSHELL detection device described in this embodiment can realize accurate detection of WEBSHELL backdoor traffic.
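The division of the apparatus into two units can be pictured with the following Python sketch. It is a structural illustration only: the class names, the connection_id key and the pluggable detector callable are assumptions made for this example, and the detector stands in for the pre-trained response detection model.

```python
from typing import Callable, Dict, Optional


class DataAcquisitionUnit:
    """Stands in for unit 210: stores response data per (hypothetical) connection id."""

    def __init__(self) -> None:
        self._responses: Dict[str, str] = {}

    def record_response(self, connection_id: str, response_data: str) -> None:
        self._responses[connection_id] = response_data

    def get_response(self, connection_id: str) -> Optional[str]:
        return self._responses.get(connection_id)


class FeatureIdentificationUnit:
    """Stands in for unit 220: applies a detector to the response data."""

    def __init__(self, detector: Callable[[str], bool]) -> None:
        self._detector = detector

    def identify(self, response_data: str) -> str:
        return "WEBSHELL" if self._detector(response_data) else "NORMAL"


class WebShellDetectionDevice:
    """Wires the two units together."""

    def __init__(self, detector: Callable[[str], bool]) -> None:
        self.data_acquisition_unit = DataAcquisitionUnit()
        self.feature_identification_unit = FeatureIdentificationUnit(detector)

    def detect(self, connection_id: str) -> str:
        response = self.data_acquisition_unit.get_response(connection_id)
        if response is None:
            raise KeyError(f"no response recorded for {connection_id!r}")
        return self.feature_identification_unit.identify(response)
```

Any callable that maps response data to a boolean can be supplied as the detector, so the same structure accommodates a rule-based check or a trained model.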
Example six
The present embodiment provides a computer device, including: a memory and a processor.
Wherein, the memory is used for storing programs;
the processor is used for implementing the steps of the WEBSHELL detection method when executing the program; reference may be made to the description of the WEBSHELL detection method in the foregoing embodiments, and details are not repeated here.
Referring to fig. 3, which is a schematic structural diagram of the computer device provided in this embodiment, the computer device may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 322 (e.g., one or more processors), a memory 332, and one or more storage media 330 (e.g., one or more mass storage devices) storing applications 342 or data 344. The memory 332 and the storage media 330 may be transient storage or persistent storage. The program stored on the storage medium 330 may include one or more modules (not shown), each of which may include a series of instruction operations on a data processing device. Still further, the central processor 322 may be configured to communicate with the storage medium 330 and to execute, on the computer device 301, the series of instruction operations in the storage medium 330.
The computer device 301 may also include one or more power supplies 326, one or more wired or wireless network interfaces 350, one or more input-output interfaces 358, and/or one or more operating systems 341, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so forth.
The steps in the WEBSHELL detection method described in fig. 1 above can be implemented by the structure of the computer device in this embodiment.
Example seven
The present embodiment discloses a readable storage medium on which a program is stored; when executed by a processor, the program implements the steps of the WEBSHELL detection method, for which reference may be made to the description of the WEBSHELL detection method in the above embodiments.
The readable storage medium may be a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other readable storage medium capable of storing program code.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is brief, and the relevant points can be found in the description of the method.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The WEBSHELL detection method, the WEBSHELL detection device, the computer device and the readable storage medium provided by the present invention are described in detail above. The principles and embodiments of the present invention are explained herein using specific examples, which are presented only to assist in understanding the method and its core concepts. It should be noted that those skilled in the art can make various improvements and modifications to the present invention without departing from its principle, and those improvements and modifications also fall within the scope of the claims of the present invention.

Claims (13)

1. A WEBSHELL detection method, comprising:
after a client sends request data to a server, response data fed back to the client by the server according to the request data is obtained;
and performing WEBSHELL characteristic identification on the service flow according to the response data to generate a WEBSHELL identification result.
2. The WEBSHELL detection method of claim 1, further comprising:
acquiring the request data;
correspondingly, performing WEBSHELL characteristic identification on the service flow according to the response data, including: and carrying out WEBSHELL characteristic identification on the service flow according to the request data and the response data.
3. The WEBSHELL detection method of claim 2, wherein performing WEBSHELL feature identification on the service traffic according to the request data and the response data comprises:
calling a machine learning/deep learning model to perform WEBSHELL feature identification on the request data and the response data to obtain a model identification result;
and generating the WEBSHELL recognition result according to the model recognition result.
4. The WEBSHELL detection method of claim 3, wherein invoking the machine learning/deep learning model to perform WEBSHELL feature identification on the request data and the response data comprises:
combining the request data and the response data, and using the combined data as a data interaction message;
inputting the data interaction message into an interaction model for overall data interaction feature recognition, and taking an output result of the interaction model as a model recognition result; the interactive model is a machine learning/deep learning model obtained by training according to the data interactive message sample.
5. The WEBSHELL detection method of claim 3, wherein invoking the machine learning/deep learning model to perform WEBSHELL feature identification on the request data and the response data comprises:
inputting the request data into a request detection model for performing request data characteristic identification to obtain a request identification result;
inputting the response data into a response detection model for response data feature identification to obtain a response identification result;
and taking the request identification result and the response identification result as the model identification result.
6. The WEBSHELL detection method of claim 5, wherein inputting the request data into a request detection model for request data feature identification comprises:
inputting the request data into the request detection model;
the request detection model performs WEBSHELL identification on the request data according to preset dangerous request features; wherein the dangerous request features include at least one of: specified dangerous function calls, specified dangerous commands, specified special characters, specified features of well-known backdoors, specified request traffic information entropy, and specified access statistics features.
7. The WEBSHELL detection method of claim 5, wherein inputting the response data into a response detection model for response data feature identification comprises:
inputting the response data into the response detection model;
the response detection model performs WEBSHELL identification on the response data according to preset dangerous response features; wherein the dangerous response features include at least one of: specified response packet page view data, specified operation keywords, specified sensitive file information, and specified characters.
8. The WEBSHELL detection method of claim 2, further comprising:
determining whether a high-performance mode is enabled;
if so, performing WEBSHELL characteristic identification on the service flow according to the request data or the response data;
and if not, executing the step of acquiring the request data.
9. The WEBSHELL detection method of any one of claims 1 to 8, further comprising:
and performing corresponding data interception blocking or data release according to the WEBSHELL identification result.
10. The WEBSHELL detection method of claim 9, wherein performing corresponding data interception blocking or data release according to the WEBSHELL identification result comprises:
when the WEBSHELL identification result contains two or more detection values, determining a working mode; wherein the working mode includes at least one of a high detection mode and a low false alarm mode;
when the working mode is the high detection mode, if a detection value judged as WEBSHELL exists among the detection values, generating a WEBSHELL identification result indicating that WEBSHELL exists, and performing connection interception and reporting;
when the working mode is the low false alarm mode, if a detection value greater than the interception threshold exists among the detection values, generating a WEBSHELL identification result indicating that WEBSHELL exists, and performing connection interception and reporting; wherein the interception threshold is greater than 1.
11. A WEBSHELL detection apparatus, comprising:
the data acquisition unit is used for acquiring response data fed back to the client by the server according to the request data after the client sends the request data to the server;
and the characteristic identification unit is used for carrying out WEBSHELL characteristic identification on the service flow according to the response data and generating a WEBSHELL identification result.
12. A computer device, comprising:
a memory for storing a program;
a processor for implementing the steps of the WEBSHELL detection method as claimed in any one of claims 1 to 10 when executing said program.
13. A readable storage medium, characterized in that the readable storage medium has stored thereon a program which, when being executed by a processor, carries out the steps of the WEBSHELL detection method according to any one of claims 1 to 10.
CN201911421815.6A 2019-12-31 2019-12-31 WEBSHELL detection method, device, equipment and storage medium Pending CN113132329A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911421815.6A CN113132329A (en) 2019-12-31 2019-12-31 WEBSHELL detection method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911421815.6A CN113132329A (en) 2019-12-31 2019-12-31 WEBSHELL detection method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113132329A true CN113132329A (en) 2021-07-16

Family

ID=76770126

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911421815.6A Pending CN113132329A (en) 2019-12-31 2019-12-31 WEBSHELL detection method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113132329A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105933268A (en) * 2015-11-27 2016-09-07 中国银联股份有限公司 Webshell detection method and apparatus based on total access log analysis
CN107302586A (en) * 2017-07-12 2017-10-27 深信服科技股份有限公司 A kind of Webshell detection methods and device, computer installation, readable storage medium storing program for executing
CN107689940A (en) * 2016-08-04 2018-02-13 深圳市深信服电子科技有限公司 WebShell detection method and device
CN107888571A (en) * 2017-10-26 2018-04-06 江苏省互联网行业管理服务中心 A kind of various dimensions webshell intrusion detection methods and detecting system based on HTTP daily records
CN107888616A (en) * 2017-12-06 2018-04-06 北京知道创宇信息技术有限公司 The detection method of construction method and Webshell the attack website of disaggregated model based on URI
CN109743311A (en) * 2018-12-28 2019-05-10 北京神州绿盟信息安全科技股份有限公司 A kind of WebShell detection method, device and storage medium

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113765911A (en) * 2021-09-02 2021-12-07 恒安嘉新(北京)科技股份公司 Method, device, equipment and storage medium for detecting webshell encrypted flow
CN113761522A (en) * 2021-09-02 2021-12-07 恒安嘉新(北京)科技股份公司 Method, device, equipment and storage medium for detecting webshell flow
CN114499944A (en) * 2021-12-22 2022-05-13 天翼云科技有限公司 Method, device and equipment for detecting WebShell
CN114499944B (en) * 2021-12-22 2023-08-08 天翼云科技有限公司 Method, device and equipment for detecting WebShell
CN114301697A (en) * 2021-12-29 2022-04-08 山石网科通信技术股份有限公司 Data attack detection method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210716