CN114329467A - Memory WebShell detection method and device and electronic equipment - Google Patents


Info

Publication number
CN114329467A
Authority
CN
China
Prior art keywords
memory
webshell
risk
preset
class
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111577421.7A
Other languages
Chinese (zh)
Inventor
羊昕瑜
罗伟
游江
任家西
郭健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nsfocus Technologies Inc
Nsfocus Technologies Group Co Ltd
Original Assignee
Nsfocus Technologies Inc
Nsfocus Technologies Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nsfocus Technologies Inc, Nsfocus Technologies Group Co Ltd filed Critical Nsfocus Technologies Inc
Priority to CN202111577421.7A priority Critical patent/CN114329467A/en
Publication of CN114329467A publication Critical patent/CN114329467A/en
Pending legal-status Critical Current

Landscapes

  • Storage Device Security (AREA)

Abstract

The application discloses a memory WebShell detection method, a memory WebShell detection device, and electronic equipment. The method comprises: acquiring all classes in the virtual machine corresponding to a target application, and screening out M first classes that carry a memory WebShell risk; determining the code corresponding to the in-memory bytecode of each first class to obtain M code files; performing memory WebShell risk detection on the M code files using K first preset rules to obtain N code files that carry a memory WebShell risk; and marking the classes corresponding to the N code files and taking the marked classes as a first memory WebShell detection result. Because risk detection is performed on code files converted from bytecode taken from memory, the method is suitable for non-text memory WebShells as well as text memory WebShells that have been encrypted or obfuscated, and can improve the accuracy of memory WebShell detection.

Description

Memory WebShell detection method and device and electronic equipment
Technical Field
The application relates to the technical field of information security, in particular to a memory WebShell detection method and device and electronic equipment.
Background
A WebShell is a malicious script file that resides in a Web container as a file; through it, an attacker can make the server execute arbitrary instructions, steal sensitive data or credentials, or use the server as a springboard for attacking intranet hosts. However, as the internet has grown and attack and defense practice has broadened, defensive measures have steadily improved, and attacking with a file-based WebShell in the Web container has become increasingly difficult. Attackers have therefore turned to more covert means, such as attacks based on a memory WebShell. When such an attack holds the privileges of the user that runs the Web container process, it can readily take complete control of the data and code in the address space of that process, and thereby control the server. Traditional protections are inadequate against memory WebShell attacks, so a method for detecting memory WebShells is urgently needed.
Existing methods for detecting memory WebShells are mainly based on detecting WebShell file entities. One is the feature detection method, which extracts malicious features from known WebShell samples and performs pattern matching to detect the memory WebShell. Another is the statistical analysis method, which identifies WebShell files statistically by extracting characteristics such as feature codes, information entropy, longest word, index of coincidence, and compression from the file and performing anomaly detection. A third is the machine learning method, in which a detection model is trained on samples using decision trees, deep learning, and the like, and the model is then used to detect the memory WebShell.
However, detection methods based on WebShell file entities have difficulty detecting non-text WebShells and cannot detect text WebShells that have been obfuscated, encrypted, or similarly processed. Whether the feature detection method, the statistical analysis method, or the machine learning method is used, false alarms and missed detections are therefore likely.
Disclosure of Invention
The present application provides a memory WebShell detection method and device and electronic equipment. After all classes in the virtual machine corresponding to a target application are preliminarily screened, memory WebShell detection and labeling are performed on the code files corresponding to the in-memory bytecode of the screened classes, without needing to compare the bytecode file corresponding to the target application's process information with the original bytecode file.
In a first aspect, the present application provides a memory WebShell detection method, including:
acquiring all classes in a virtual machine corresponding to a target application, and screening M first classes with memory WebShell risk in all the classes, wherein M is an integer greater than or equal to 1;
determining the code corresponding to the in-memory bytecode of each of the M first classes, to obtain M code files;
performing memory WebShell risk detection on the M code files through K first preset rules to obtain N code files with memory WebShell risks, wherein one first preset rule corresponds to one memory WebShell risk, and K and N are integers greater than or equal to 1;
and carrying out risk marking on the classes corresponding to the N code files by using a first preset identifier, and taking the class containing the first preset identifier as a first memory WebShell detection result.
By the method, after all classes in the virtual machine corresponding to the target application are preliminarily screened, the code files corresponding to the byte codes of the screened classes in the memory are further subjected to memory WebShell detection and labeling under the condition that the byte code files corresponding to the process information of the target application do not need to be compared with the original byte code files, so that the false alarm rate is reduced.
Further, the screening of M first classes with WebShell risk in all classes includes:
screening out, from all the classes, a second class that has no local source bytecode file, and taking the second class as a first class with WebShell risk; and/or
screening out, from all the classes, a third class that matches any one of H second preset rules, and taking the third class as a first class with WebShell risk, wherein one second preset rule corresponds to one memory WebShell risk, and H is an integer greater than or equal to 1.
By this method, all classes in the virtual machine corresponding to the target application are preliminarily screened, which narrows the scope of the subsequent code risk scan. At the same time, information on all classes that have no local source bytecode entity file is obtained, so the method is well targeted at memory WebShells that have no entity file, and the precision and accuracy of memory WebShell detection are improved.
Further, the determining the code corresponding to each of the first classes respectively to obtain M code files includes:
extracting from the memory the bytecode corresponding to each of the M first classes;
and converting the bytecode into M code files of a preset type.
By this method, the in-memory bytecode of each extracted class is converted into a code file of the preset type, which makes the subsequent code risk check easier to carry out. The method is suitable for non-text memory WebShells as well as text memory WebShells that have been encrypted or obfuscated, which improves the accuracy of memory WebShell detection.
Further, the performing memory WebShell risk detection on the M code files through K first preset rules to obtain N code files with memory WebShell risks, including:
matching the text corresponding to each of the M code files against preset risk-level character strings;
if the text corresponding to any code file contains a first character string consistent with a preset risk-level character string, determining that the code text containing the first character string has a memory WebShell risk; and/or
detecting the number of preset risk characters in the text corresponding to each of the M code files;
judging whether that number is greater than a preset threshold;
and if there is a first code text whose number of risk characters is greater than the preset threshold, determining that the first code text has a memory WebShell risk.
By this method, the M first classes are checked using K first preset rules that detect different risk types, so that different types of memory WebShell risk can be handled and the accuracy of memory WebShell detection is improved.
In one possible design, before acquiring all classes in the virtual machine corresponding to the target application, the method further includes:
scanning all currently running applications to obtain the basic information and the process information corresponding to each application, wherein the basic information at least comprises the complete class information of the application entry point;
selecting a target application from all the applications according to the basic information corresponding to all the applications;
and loading the basic information corresponding to the target application and the process information corresponding to the target application to a virtual machine corresponding to the target application.
By the method, the information corresponding to the virtual machine when the target application is running is acquired, compared with the method of accessing and analyzing a static file or acquiring the information by an external access mode for the running target application, the acquired information is more comprehensive, and the detection accuracy is improved.
In one possible design, after the risk labeling is performed on the classes corresponding to the N code files by using a first preset identifier, and the class including the first preset identifier is used as a first memory WebShell detection result, the method further includes:
determining a risk type corresponding to the class containing the first preset identifier in the first memory WebShell detection result;
and adding the information corresponding to the risk type into the first memory WebShell detection result to obtain a second memory WebShell detection result.
By the method, the risk type corresponding to the class with the memory WebShell risk is further determined, the risk information corresponding to the memory WebShell in the memory WebShell detection result is perfected, and the memory WebShell detection accuracy is improved.
Further, determining a risk type corresponding to the class containing the first preset identifier in the first memory WebShell detection result includes:
matching a second code file of a preset type corresponding to the class containing the first preset identification in the first memory WebShell detection result with any one first preset rule;
and if the matching is consistent, using the memory WebShell risk corresponding to the first preset rule which is consistent with the matching of the second code file as a risk type.
By the method, the risk type corresponding to the class with the memory WebShell risk is determined to be the string risk and/or the character risk according to different rules, the memory WebShell detection result is refined, and the detection accuracy is improved.
In one possible design, after the risk labeling is performed on the classes corresponding to the N code files by using a first preset identifier, and the class including the first preset identifier is used as a first memory WebShell detection result, the method further includes:
marking the third class with a second preset identifier, wherein the second preset identifier corresponds to a memory WebShell risk;
adding the labeled third class into the first memory WebShell detection result to obtain a third memory WebShell detection result; or
And adding the labeled third class into the second memory WebShell detection result to obtain a third memory WebShell detection result.
By the method, the class matched with the second preset rule is labeled, and the labeled class is added to the memory WebShell detection result, so that the memory WebShell detection result is more comprehensive, and the detection accuracy is improved.
In a second aspect, the present application provides a memory WebShell detection apparatus, where the apparatus includes:
a screening module, configured to acquire all classes in a virtual machine corresponding to a target application and to screen out M first classes with memory WebShell risk from all the classes, wherein M is an integer greater than or equal to 1;
a first determining module, configured to determine codes corresponding to the bytecode of each of the M first classes in the memory, to obtain M code files;
the detection module is used for carrying out memory WebShell risk detection on the M code files through K first preset rules to obtain N code files with memory WebShell risks, wherein one first preset rule corresponds to one memory WebShell risk, and K and N are integers greater than or equal to 1;
and the first marking module is used for carrying out risk marking on the classes corresponding to the N code files by using a first preset identifier, and taking the class containing the first preset identifier as a first memory WebShell detection result.
Further, the screening module is specifically configured to:
screening out, from all the classes, a second class that has no local source bytecode file, and taking the second class as a first class with WebShell risk; and/or
screening out, from all the classes, a third class that matches any one of H second preset rules, and taking the third class as a first class with WebShell risk, wherein one second preset rule corresponds to one memory WebShell risk, and H is an integer greater than or equal to 1.
Further, the first determining module is specifically configured to:
extracting from the memory the bytecode corresponding to each of the M first classes;
and converting the bytecode into M code files of a preset type.
Further, the detection module is specifically configured to:
matching the text corresponding to each of the M code files against preset risk-level character strings;
if the text corresponding to any code file contains a first character string consistent with a preset risk-level character string, determining that the code text containing the first character string has a memory WebShell risk; and/or
detecting the number of preset risk characters in the text corresponding to each of the M code files;
judging whether that number is greater than a preset threshold;
and if there is a first code text whose number of risk characters is greater than the preset threshold, determining that the first code text has a memory WebShell risk.
In one possible design, the apparatus further includes:
a scanning module, configured to scan all currently running applications to obtain the basic information and the process information corresponding to each application, wherein the basic information at least comprises the complete class information of the application entry point;
the selection module is used for selecting a target application from all the applications according to the basic information corresponding to all the applications;
and the loading module is used for loading the basic information corresponding to the target application and the process information corresponding to the target application to the virtual machine corresponding to the target application.
In one possible design, the apparatus further includes:
a second determining module, configured to determine a risk type corresponding to the class including the first preset identifier in the first memory WebShell detection result;
and the first adding module is used for adding the information corresponding to the risk type into the first memory WebShell detection result to obtain a second memory WebShell detection result.
Further, the second determining module is specifically configured to:
matching a second code file of a preset type corresponding to the class containing the first preset identification in the first memory WebShell detection result with any one first preset rule;
and if the matching is consistent, determining the memory WebShell risk corresponding to the first preset rule which is consistent with the matching of the second code file as a risk type.
In one possible design, the apparatus further includes:
a second marking module, configured to mark the third class with a second preset identifier, wherein the second preset identifier corresponds to a memory WebShell risk;
the second adding module is used for adding the labeled third class to the first memory WebShell detection result to obtain a third memory WebShell detection result; or adding the labeled third class to the second memory WebShell detection result to obtain a third memory WebShell detection result.
In a third aspect, the present application provides an electronic device, comprising:
a memory for storing a computer program;
and the processor is used for realizing the steps of the memory WebShell detection method when executing the computer program stored in the memory.
In a fourth aspect, the present application provides a computer-readable storage medium, in which a computer program is stored, and the computer program, when executed by a processor, implements the above-mentioned memory WebShell detection method steps.
Based on the above memory WebShell detection method, after all classes in the virtual machine corresponding to the target application are preliminarily screened, memory WebShell detection and labeling are performed on the code files corresponding to the in-memory bytecode of the screened classes, without needing to compare the bytecode file corresponding to the target application's process information with the original bytecode file, which reduces the false alarm rate. Moreover, because the code files are converted from bytecode extracted from memory, the method is suitable for detecting non-text memory WebShells as well as text memory WebShells that have been encrypted or obfuscated, so the efficiency of memory WebShell detection can be improved.
For each of the second to fourth aspects and possible technical effects of each aspect, reference is made to the above description of the possible technical effects of the first aspect or various possible schemes of the first aspect, and repeated description is omitted here.
Drawings
Fig. 1 is a flowchart of a memory WebShell detection method provided in the present application;
fig. 2 is a schematic diagram of a memory WebShell detection method provided by the present application;
fig. 3 is a schematic structural diagram of a memory WebShell detection device provided in the present application;
fig. 4 is a schematic structural diagram of an electronic device provided in the present application.
Detailed Description
In order to make the objects, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings. The particular methods of operation in the method embodiments may also be applied to the apparatus embodiments or system embodiments. In the description of the present application, "a plurality" is understood as "at least two". "And/or" describes the association relationship of the associated objects and covers three relationships; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. "A is connected with B" covers both the case where A and B are directly connected and the case where A and B are connected through C. In addition, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or order.
The embodiments of the present application will be described in detail below with reference to the accompanying drawings.
Existing methods for detecting memory WebShells are mainly based on detecting WebShell file entities. One is the feature detection method, which extracts malicious features from known WebShell samples and performs pattern matching to detect the memory WebShell. Another is the statistical analysis method, which identifies WebShell files statistically by extracting characteristics such as feature codes, information entropy, longest word, index of coincidence, and compression from the file and performing anomaly detection. A third is the machine learning method, in which a detection model is trained on samples using decision trees, deep learning, and the like, and the model is then used to detect the memory WebShell.
However, detection methods based on WebShell file entities have difficulty detecting non-text WebShells and cannot detect text WebShells that have been obfuscated, encrypted, or similarly processed. Whether the feature detection method, the statistical analysis method, or the machine learning method is used, false alarms and missed detections are therefore likely, and the accuracy of memory WebShell detection is low.
To solve these problems, the application provides a memory forensics method. After all classes in the virtual machine corresponding to a target application are preliminarily screened, memory WebShell detection and labeling are performed on the code files corresponding to the in-memory bytecode of the screened classes. Because the code files are converted from bytecode extracted from memory, the method is suitable for detecting non-text memory WebShells as well as text memory WebShells that have been encrypted or obfuscated, and the efficiency of memory WebShell detection is improved.
The method and the device in the embodiment of the application are based on the same technical concept, and because the principles of the problems solved by the method and the device are similar, the device and the embodiment of the method can be mutually referred, and repeated parts are not repeated.
Fig. 1 is a flowchart of the memory WebShell detection method provided by the present application, which specifically includes the following steps:
S11, acquiring all classes in the virtual machine corresponding to the target application, and screening M first classes with memory WebShell risk in all classes;
In this embodiment of the present application, before memory WebShell detection is performed on a target application, the target application must first be selected from all currently running applications, and the relevant information of the target application must be loaded into the virtual machine corresponding to the target application. Specifically:
All running applications on a host are scanned to obtain the basic information and the process information corresponding to each application. The basic information at least comprises the complete class information of the application entry point, and may also include a name, a version number, permissions, signature digest information, and the like.
Next, according to the basic information corresponding to each application, a target application is selected from all the applications based on user requirements. The basic information and the process information corresponding to the target application are then loaded into the virtual machine corresponding to the target application. For example, if the target application is a JAVA application, an Agent program may be written in a Native manner and by means of the JAVA Instrumentation interface, and the Agent program then loads the process information and the basic information corresponding to the target application, as parameters, into the JAVA Virtual Machine (JVM) corresponding to the target application. The loading may take place before the JVM is started or while the JVM is running.
After loading the basic information corresponding to the target application and the process information corresponding to the target application into the virtual machine corresponding to the target application, acquiring all classes in the virtual machine, and screening M first classes with memory WebShell risk in all classes, where M is an integer greater than or equal to 1, and the specific screening method may be:
For each class among all the classes, the path information of its local source bytecode file is checked. If a class has no local source bytecode file path information, that class has no local source bytecode file. The second classes, which have no local source bytecode file, are screened out from all the classes and taken as first classes with WebShell risk.
In addition to the above method, M first classes with memory WebShell risk may also be screened from all the classes as follows:
H second preset rules are determined, where H is an integer greater than or equal to 1. The second preset rules are set according to memory WebShell features of preset types, and one second preset rule corresponds to one memory WebShell risk. For example, if the preset-type WebShell feature is that no entity file exists, the second preset rule corresponding to that feature is used to detect the memory WebShell risk of having no entity file. After the second preset rules are determined, each class among all the classes is matched against each of the H second preset rules, and a third class that matches any one second preset rule is taken as a first class with WebShell risk.
The two screening methods can be used simultaneously or separately.
By this method, all classes in the virtual machine corresponding to the target application are preliminarily screened, which narrows the scope of the subsequent code risk scan. At the same time, information on all classes that have no local source bytecode entity file is obtained, so the method is well targeted at memory WebShells that have no entity file, and the precision and efficiency of memory WebShell detection are improved.
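The two screening methods can be sketched as follows. This is a minimal illustration only, not the patented implementation: the `LoadedClass` record, its fields, and the example rule are hypothetical stand-ins for whatever class information the agent actually obtains from the virtual machine.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class LoadedClass:
    """Hypothetical record for one class loaded in the virtual machine."""
    name: str
    source_path: Optional[str]  # path of the local source bytecode file; None if absent
    interfaces: Tuple[str, ...] = ()


# A "second preset rule" is modeled here as a predicate over a loaded class.
SecondRule = Callable[[LoadedClass], bool]


def screen_first_classes(
    classes: List[LoadedClass],
    second_rules: List[SecondRule],
    use_path_check: bool = True,
) -> List[LoadedClass]:
    """Return the classes that carry a memory WebShell risk (the "first classes")."""
    first_classes = []
    for cls in classes:
        # Method 1: the class has no local source bytecode file (a "second class").
        no_local_file = use_path_check and cls.source_path is None
        # Method 2: the class matches any second preset rule (a "third class").
        rule_hit = any(rule(cls) for rule in second_rules)
        if no_local_file or rule_hit:
            first_classes.append(cls)
    return first_classes
```

Passing an empty rule list uses only the path check, and `use_path_check=False` uses only the rules, which mirrors the statement that the two methods may be used together or separately.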
S12, determining the code corresponding to the in-memory bytecode of each of the M first classes, to obtain M code files;
In this embodiment of the present application, after the M first classes with memory WebShell risk have been preliminarily screened out from all the classes, the M first classes need further processing. A specific processing method may be as follows:
Using a bytecode enhancement technique, the bytecode corresponding to each of the M first classes is extracted from memory and converted into M code files of a preset type. The preset type may be JAVA or another type of code file, and is determined according to user requirements.
By this processing method, the bytecode in memory is converted into the corresponding code, so that the subsequent code risk check is suitable for non-text memory WebShells as well as WebShells that have been encrypted or obfuscated.
S13, performing memory WebShell risk detection on the M code files through K first preset rules to obtain N code files with memory WebShell risks;
In this embodiment of the present application, after the M first classes have been converted into the corresponding M code files, memory WebShell risk detection is performed on the M code files using K first preset rules, where K is an integer greater than or equal to 1 and one first preset rule corresponds to one memory WebShell risk. The memory WebShell risk corresponding to a first preset rule may differ from the WebShell risk type corresponding to a second preset rule. Specifically, the detection method corresponding to a first preset rule may be:
The text corresponding to each of the M code files is matched against preset risk-level character strings. The preset risk-level character strings are set in advance and may be divided into high-risk, medium-risk, and low-risk character strings. They may equally be divided into level-1, level-2, and level-3 risk character strings, and so on; the specific classification of risk levels is not limited here.
If the text corresponding to any code file contains a first character string consistent with a preset risk-level character string, it is determined that the code text containing the first character string has a memory WebShell risk. For example, when the text corresponding to a code file contains a character string consistent with a high-risk character string, it can be determined that the code file has a memory WebShell risk and that the risk level is high.
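The risk-level string match can be sketched as below. The strings listed under each level are illustrative assumptions chosen for this example; the application does not specify which strings belong to which level.

```python
from typing import Dict, List, Optional

# Hypothetical preset risk-level character strings (assumed for illustration).
RISK_STRINGS: Dict[str, List[str]] = {
    "high": ["Runtime.getRuntime().exec", "defineClass"],
    "medium": ["ProcessBuilder"],
    "low": ["getParameter"],
}

# Levels are checked from most to least severe so the highest match wins.
LEVEL_ORDER = ["high", "medium", "low"]


def match_risk_level(
    code_text: str, risk_strings: Dict[str, List[str]] = RISK_STRINGS
) -> Optional[str]:
    """Return the highest risk level whose preset string occurs in the text, else None."""
    for level in LEVEL_ORDER:
        for risk_string in risk_strings.get(level, []):
            if risk_string in code_text:
                return level
    return None
```

A non-`None` result means the code file is treated as carrying a memory WebShell risk, with the returned value as its risk level.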
In addition, the detection method corresponding to the first preset rule may further be:
The number of preset risk characters in the text corresponding to each of the M code files is detected. For example, if 'a' and 'b' are preset risk characters and the text corresponding to a code file contains 'a' or 'b', the number of preset risk characters is the number of occurrences of 'a' plus the number of occurrences of 'b'.
It is then judged whether that number is greater than a preset threshold.
If there is a first code text whose number of risk characters is greater than the preset threshold, it is determined that the first code text has a memory WebShell risk.
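The character-count rule amounts to summing the occurrences of each preset risk character and comparing the total with the threshold. A minimal sketch, with hypothetical function names:

```python
def count_risk_chars(code_text: str, risk_chars: str) -> int:
    """Sum the occurrences of every preset risk character in the text."""
    # set() guards against counting a character twice if it is listed twice.
    return sum(code_text.count(ch) for ch in set(risk_chars))


def has_char_risk(code_text: str, risk_chars: str, threshold: int) -> bool:
    """True when the risk-character count is strictly greater than the threshold."""
    return count_risk_chars(code_text, risk_chars) > threshold
```

With 'a' and 'b' as the preset risk characters, the text "abcab" contains two of each, so the count is 4; it is flagged at a threshold of 3 but not at a threshold of 4.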
Of course, regular-expression matching rules can also be used to perform memory WebShell risk detection on the M code files; the specific detection mode is not described in detail here.
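Since the regular-expression mode is not detailed in the application, the following is only one possible sketch of such a rule. The class name `RegexRiskRule` and the pattern (a reflective lookup of a method named `defineClass`) are assumptions chosen for illustration and are not taken from the application.

```java
import java.util.regex.Pattern;

public class RegexRiskRule {
    // Hypothetical pattern: getDeclaredMethod( "defineClass" ... with optional whitespace.
    private static final Pattern REFLECTIVE_DEFINE_CLASS =
            Pattern.compile("getDeclaredMethod\\(\\s*\"defineClass\"");

    /** Returns true when the pattern occurs anywhere in the code text. */
    public static boolean matches(String codeText) {
        return REFLECTIVE_DEFINE_CLASS.matcher(codeText).find();
    }

    public static void main(String[] args) {
        String risky = "Method m = ClassLoader.class.getDeclaredMethod( \"defineClass\", byte[].class);";
        System.out.println(matches(risky));
        System.out.println(matches("int x = 1;"));
    }
}
```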
By the method, memory WebShell risk detection is carried out on the M code files, and N code files with memory WebShell risk can be screened out, wherein N is an integer greater than or equal to 1.
And S14, carrying out risk labeling on the classes corresponding to the N code files by using the first preset identification, and taking the class containing the first preset identification as a first memory WebShell detection result.
In the embodiment of the application, after the N code files with the memory WebShell risk are determined, further, the classes corresponding to the N code files are subjected to risk labeling by using the first preset identification, and the class containing the first preset identification is used as a first memory WebShell detection result.
In a possible design, in order to make the information contained in the first in-memory WebShell detection result more comprehensive, the risk types of the classes corresponding to the N code files may be added to the first in-memory WebShell detection result, specifically:
firstly, determining a risk type corresponding to a class containing a first preset identifier in a first memory WebShell detection result, wherein the method for specifically determining the risk type comprises the following steps:
matching the second code file of the preset type, which corresponds to the class containing the first preset identifier in the first memory WebShell detection result, against any one of the first preset rules; and if they match, taking the memory WebShell risk corresponding to the matched first preset rule as the risk type.
For example, matching a second code file of a preset type corresponding to a class including a first preset identifier in a first memory WebShell detection result with a preset risk level character string; and if the second code file has a character string consistent with the preset risk level character string, determining the risk type as the character string risk.
For another example, detecting the number value of the preset risk characters in the second code file, and judging whether the number value is greater than a preset threshold value; and if so, determining the risk type as the character risk.
Further, adding information corresponding to the risk types into the first memory WebShell detection result to obtain a second memory WebShell detection result.
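One way to picture the step from the first to the second detection result is sketched below. It assumes, purely for illustration, that a detection result is a map from class name to a list of risk types and that each first preset rule is a named predicate over the code text; the names `RiskTypeAnnotator` and `annotate` are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class RiskTypeAnnotator {
    /**
     * For every labeled class (key) with its code text (value), records the names of
     * all rules that match, yielding the "second" detection result.
     */
    public static Map<String, List<String>> annotate(
            Map<String, String> labeledClassToCode,
            Map<String, Predicate<String>> rules) {
        Map<String, List<String>> second = new LinkedHashMap<>();
        for (Map.Entry<String, String> cls : labeledClassToCode.entrySet()) {
            List<String> riskTypes = new ArrayList<>();
            for (Map.Entry<String, Predicate<String>> rule : rules.entrySet()) {
                if (rule.getValue().test(cls.getValue())) {
                    riskTypes.add(rule.getKey());
                }
            }
            second.put(cls.getKey(), riskTypes);
        }
        return second;
    }

    public static void main(String[] args) {
        Map<String, Predicate<String>> rules = new LinkedHashMap<>();
        // Hypothetical rules standing in for the string rule and the character-count rule.
        rules.put("character string risk", code -> code.contains("exec"));
        rules.put("character risk", code -> code.chars().filter(c -> c == '$').count() > 2);

        Map<String, String> labeled = new LinkedHashMap<>();
        labeled.put("com.example.EvilFilter", "a$b$c$d.exec(cmd)");
        System.out.println(annotate(labeled, rules));
    }
}
```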
By this method, the risk type corresponding to each class with memory WebShell risk is further determined, which completes the risk information in the memory WebShell detection result and improves the memory WebShell detection accuracy.
In one possible design, a third class that matches any one of the second preset rules in the detection method shown in fig. 1 is labeled, and the labeled third class is added to the first memory WebShell detection result or the second memory WebShell detection result, so as to improve the accuracy of the memory WebShell detection, specifically:
marking a third class by using a second preset identification, wherein the second preset identification corresponds to a memory WebShell risk;
adding the labeled third class into the first memory WebShell detection result to obtain a third memory WebShell detection result; or
Adding the labeled third class into the second memory WebShell detection result to obtain a third memory WebShell detection result;
and taking the third memory WebShell detection result as a final memory WebShell detection result.
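The merging of the labeled third classes into an earlier detection result can be sketched as follows. The representation of a result as a map from class name to a set of identifiers, and the names `ResultMerger`, `merge`, `ID1` and `ID2`, are assumptions for illustration only.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class ResultMerger {
    /**
     * Copies the previous (first or second) result and adds every third class,
     * labeled with the second preset identifier, to obtain the third result.
     */
    public static Map<String, Set<String>> merge(
            Map<String, Set<String>> previous,
            Collection<String> thirdClasses,
            String secondPresetId) {
        Map<String, Set<String>> third = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> e : previous.entrySet()) {
            third.put(e.getKey(), new LinkedHashSet<>(e.getValue()));
        }
        for (String cls : thirdClasses) {
            third.computeIfAbsent(cls, k -> new LinkedHashSet<>()).add(secondPresetId);
        }
        return third;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> first = new LinkedHashMap<>();
        first.put("com.example.A", new LinkedHashSet<>(Arrays.asList("ID1")));
        System.out.println(merge(first, Arrays.asList("com.example.A", "com.example.C"), "ID2"));
    }
}
```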
By the method, the class matched with the second preset rule is labeled, and the labeled class is added to the memory WebShell detection result, so that the memory WebShell detection result is more comprehensive, and the detection accuracy is improved.
According to the memory WebShell detection method, all classes in the virtual machine corresponding to the target application are first preliminarily screened, and memory WebShell detection and labeling are then performed on the code files corresponding to the in-memory bytecode of the screened classes, without comparing the bytecode file corresponding to the process information of the target application with the original bytecode file, which reduces the false alarm rate. Meanwhile, because the code files are obtained by converting bytecode extracted from the memory, the method is suitable for detecting non-text memory WebShells and also adapts to text-type memory WebShells that have been encrypted or obfuscated, so the memory WebShell detection accuracy can be improved.
Further, in order to describe the memory WebShell detection method provided in the embodiment of the present application in more detail, the method provided in the present application is described in detail below through a specific application scenario.
Specifically, referring to fig. 2, before memory WebShell detection is performed on a target application, the JAVA applications running on a host are first scanned, and the basic information and process information of all JAVA applications are listed. The ways of scanning all JAVA applications on the host include, but are not limited to: directly using the jps tool provided by the JDK to obtain all JAVA process information; writing scanning code using the sun.jvmstat.monitor package; and using the platform's own commands for obtaining process information.
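As a hedged illustration of the scanning step, the sketch below does not use jps or the sun.jvmstat.monitor package named above; it swaps in the standard `ProcessHandle` API (Java 9 and later) as a stand-in for platform-level process listing. The class name `JavaProcessScanner` and the file-name heuristic in `looksLikeJava` are assumptions.

```java
import java.util.List;
import java.util.stream.Collectors;

public class JavaProcessScanner {
    /** Heuristic (an assumption): the executable file name is java, java.exe or javaw.exe. */
    public static boolean looksLikeJava(String command) {
        String normalized = command.replace('\\', '/');
        String name = normalized.substring(normalized.lastIndexOf('/') + 1).toLowerCase();
        return name.equals("java") || name.equals("java.exe") || name.equals("javaw.exe");
    }

    /** Lists the visible processes whose executable looks like a Java launcher. */
    public static List<ProcessHandle> scan() {
        return ProcessHandle.allProcesses()
                .filter(p -> p.info().command().map(JavaProcessScanner::looksLikeJava).orElse(false))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (ProcessHandle p : scan()) {
            System.out.println(p.pid() + " " + p.info().command().orElse("?"));
        }
    }
}
```

Unlike jps, this stand-in does not report the main class of each process, so it only covers the process information, not the application entry class information.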
Then, a target JAVA application is selected from all JAVA applications according to the basic information corresponding to each JAVA application, and the basic information and process information corresponding to the target JAVA application are loaded into the target JVM corresponding to the target JAVA application by using JAVA Agent technology.
Further, all classes currently loaded in the target JVM are obtained; the ways of obtaining them include, but are not limited to, using the Instrumentation interface.
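A minimal sketch of obtaining the loaded classes through the Instrumentation interface is given below. It assumes the code is packaged as a Java agent whose manifest names this class in `Agent-Class` and that the agent is attached to the running target JVM; the class name `ClassListingAgent` and the helper `classNames` are hypothetical.

```java
import java.lang.instrument.Instrumentation;
import java.util.ArrayList;
import java.util.List;

public class ClassListingAgent {
    /** Entry point invoked by the JVM when the agent is attached to a running process. */
    public static void agentmain(String agentArgs, Instrumentation inst) {
        for (String name : classNames(inst.getAllLoadedClasses())) {
            System.out.println(name);
        }
    }

    /** Collects the names of the loaded classes, skipping array classes. */
    public static List<String> classNames(Class<?>[] loaded) {
        List<String> names = new ArrayList<>();
        for (Class<?> c : loaded) {
            if (!c.isArray()) {
                names.add(c.getName());
            }
        }
        return names;
    }
}
```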
Next, the class information and bytecode of classes that have a WebShell risk are preliminarily extracted from all classes. In this example, two ways are adopted. One is to extract the class information and bytecode of classes that have no bytecode file locally, that is: the path information of the source bytecode file is searched for in the information of every class; if a class has no such path information, the class has no corresponding bytecode file locally, and the class information of that class and its bytecode in memory are extracted, where bytecode enhancement technology is used to extract the bytecode in memory. The other is to extract classes by using a first custom rule, where the first custom rule includes, but is not limited to, a feature matching rule.
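The check for a missing local bytecode file might be approximated as in the sketch below. The application does not state how the path information is looked up, so resolving the `.class` resource through the class's own class loader is an assumption, as are the names `LocalClassFileCheck` and `hasLocalClassFile`.

```java
import java.net.URL;

public class LocalClassFileCheck {
    /**
     * Returns true when a .class resource for the given class can be located.
     * Classes defined only in memory typically have no such resource.
     */
    public static boolean hasLocalClassFile(Class<?> clazz) {
        String resource = clazz.getName().replace('.', '/') + ".class";
        ClassLoader loader = clazz.getClassLoader();
        // A null loader means the bootstrap loader; fall back to the system lookup.
        URL url = (loader == null)
                ? ClassLoader.getSystemResource(resource)
                : loader.getResource(resource);
        return url != null;
    }

    public static void main(String[] args) {
        Runnable generated = () -> { };
        System.out.println(hasLocalClassFile(String.class));
        // The class behind a lambda is generated at run time and has no class file.
        System.out.println(hasLocalClassFile(generated.getClass()));
    }
}
```

A class loader can override `getResource`, so a result of this kind is a screening hint rather than proof, which is consistent with the later rule-based detection on the decompiled code.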
Further, a first risk judgment and labeling is performed on all extracted classes that have the WebShell risk, where in the class extraction process each extracted class corresponds to a rule, and each rule corresponds to a JAVA application memory WebShell risk.
Further, the in-memory bytecode corresponding to all the extracted classes is converted into JAVA code; this conversion can be realized by using an open-source decompilation technology or a self-implemented decompilation tool.
Further, a second custom rule is used to detect the JAVA code, where the second custom rule may use conventional character string matching, frequency matching, regular-expression matching or the like, and a second risk judgment and labeling is performed on the class according to the detection result. Risks include, but are not limited to: a designated high-risk character string exists; the occurrence frequency of a risk character string exceeds a designated threshold; or a risk character string is matched by a regular expression. In this way, the detection result can be labeled with its risk.
Finally, all classes with memory WebShell risk found in the above process, together with the first and second risk judgment and labeling information, are taken as the memory WebShell detection result of the target JAVA application.
According to the memory WebShell detection method, all classes in the target JVM are preliminarily screened, and a first memory WebShell risk judgment and labeling is performed on the preliminarily screened classes, which narrows the scope of the subsequent code risk scanning. A second memory WebShell detection and labeling is then performed on the code files corresponding to the in-memory bytecode of the preliminarily screened classes. In both rounds of risk judgment and labeling, the bytecode file corresponding to the target application process information does not need to be compared with the original bytecode file, so the false alarm rate can be reduced. Meanwhile, because the code files are converted from the bytecode extracted from the memory, the method is suitable for detecting non-text memory WebShells and also for text-type memory WebShells that have been encrypted or obfuscated, so the accuracy of memory WebShell detection can be improved. Finally, the results of both rounds of risk judgment and labeling are taken as the memory WebShell detection result of the target JAVA application, which can further improve the memory WebShell detection accuracy.
Based on the same inventive concept, an embodiment of the present application further provides a memory WebShell detection apparatus, as shown in fig. 3, which is a schematic structural diagram of the memory WebShell detection apparatus in the present application, and the apparatus includes:
the screening module 31 is configured to acquire all classes in the virtual machine corresponding to the target application, and screen out M first classes having a risk of WebShell in the all classes, where M is an integer greater than or equal to 1;
a first determining module 32, configured to determine codes corresponding to the bytecode of each of the M first classes in the memory, to obtain M code files;
the detection module 33 is configured to perform memory WebShell risk detection on the M code files through K first preset rules to obtain N code files with memory WebShell risks, where one of the first preset rules corresponds to one of the memory WebShell risks, and K and N are integers greater than or equal to 1;
and the first labeling module 34 is configured to perform risk labeling on the classes corresponding to the N code files by using a first preset identifier, and use the class including the first preset identifier as a first memory WebShell detection result.
Further, the screening module 31 is specifically configured to:
screening out a second class which does not have a source byte code file locally in all the classes, and taking the second class as the first class which has the WebShell risk; and/or
And screening out a third class matched with any one of H second preset rules from all the classes, and taking the third class as the first class with the WebShell risk, wherein one second preset rule corresponds to the memory WebShell risk, and H is an integer greater than or equal to 1.
Further, the first determining module 32 is specifically configured to:
extracting the bytecode, in the memory, corresponding to each of the M first classes;
and converting the byte codes into M code files of preset types.
Further, the detection module 33 is specifically configured to:
respectively matching the text corresponding to each code file in the M code files with a preset risk level character string;
if a first character string consistent with the preset risk level character string exists in the text corresponding to any code file, determining that the code file to which the first character string belongs has a memory WebShell risk; and/or
detecting the number of preset risk characters in the text corresponding to each code file in the M code files;
judging whether the number is larger than a preset threshold;
and if there is a first code file whose number of preset risk characters is larger than the preset threshold, determining that the first code file has the memory WebShell risk.
In one possible design, the apparatus further includes:
a scanning module, configured to scan all currently running applications to obtain the basic information corresponding to each of the applications and the process information corresponding to each of the applications, wherein the basic information at least comprises the complete class information of the application entry;
the selection module is used for selecting a target application from all the applications according to the basic information corresponding to all the applications;
and the loading module is used for loading the basic information corresponding to the target application and the process information corresponding to the target application to the virtual machine corresponding to the target application.
In one possible design, the apparatus further includes:
a second determining module, configured to determine a risk type corresponding to the class including the first preset identifier in the first memory WebShell detection result;
and the first adding module is used for adding the information corresponding to the risk type into the first memory WebShell detection result to obtain a second memory WebShell detection result.
Further, the second determining module is specifically configured to:
matching a second code file of a preset type corresponding to the class containing the first preset identification in the first memory WebShell detection result with any one first preset rule;
and if the matching is consistent, determining the memory WebShell risk corresponding to the first preset rule which is consistent with the matching of the second code file as a risk type.
In one possible design, the apparatus further includes:
the second labeling module is used for labeling the third class by using a second preset identification, wherein the second preset identification corresponds to a memory WebShell risk;
the second adding module is used for adding the labeled third class to the first memory WebShell detection result to obtain a third memory WebShell detection result; or adding the labeled third class to the second memory WebShell detection result to obtain a third memory WebShell detection result.
Based on the memory WebShell detection device, after all classes in the virtual machine corresponding to the target application are preliminarily screened, memory WebShell detection and labeling are further performed on the code files corresponding to the in-memory bytecode of the screened classes. Because the code files are converted from the bytecode extracted from the memory, the device is suitable for detecting non-text memory WebShells and also adapts to text-type memory WebShells that have been encrypted or obfuscated, which improves the memory WebShell detection accuracy.
Based on the same inventive concept, an embodiment of the present application further provides an electronic device, where the electronic device may implement the function of the foregoing WebShell detection apparatus, and with reference to fig. 4, the electronic device includes:
at least one processor 41, and a memory 42 connected to the at least one processor 41. In this embodiment, the specific connection medium between the processor 41 and the memory 42 is not limited; fig. 4 illustrates an example in which the processor 41 and the memory 42 are connected through a bus 40. The bus 40 is shown in fig. 4 by a thick line, and the connection manner between other components is merely illustrative and not limiting. The bus 40 may be divided into an address bus, a data bus, a control bus, etc.; it is shown with only one thick line in fig. 4 for ease of illustration, but this does not mean that there is only one bus or one type of bus. Alternatively, the processor 41 may also be referred to as a controller; the name is not limited.
In the embodiment of the present application, the memory 42 stores instructions executable by the at least one processor 41, and the at least one processor 41 can execute the memory WebShell detection method discussed above by executing the instructions stored in the memory 42. The processor 41 may implement the functions of the various modules in the apparatus shown in fig. 3.
The processor 41 is a control center of the apparatus, and may connect various parts of the entire control device by using various interfaces and lines, and perform various functions of the apparatus and process data by operating or executing instructions stored in the memory 42 and calling up data stored in the memory 42, thereby performing overall monitoring of the apparatus.
In one possible design, processor 41 may include one or more processing units, and processor 41 may integrate an application processor, which primarily handles operating systems, user interfaces, application programs, and the like, and a modem processor, which primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 41. In some embodiments, processor 41 and memory 42 may be implemented on the same chip, or in some embodiments, they may be implemented separately on separate chips.
The processor 41 may be a general-purpose processor, such as a Central Processing Unit (CPU), digital signal processor, application specific integrated circuit, field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or the like, that may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the present application. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the memory WebShell detection method disclosed by the embodiment of the application can be directly implemented by a hardware processor, or implemented by combining hardware and software modules in the processor.
Memory 42, which is a non-volatile computer-readable storage medium, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. The memory 42 may include at least one type of storage medium, for example, a flash memory, a hard disk, a multimedia card, a card-type memory, a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Programmable Read Only Memory (PROM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a magnetic memory, a magnetic disk, an optical disk, and the like. The memory 42 may also be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 42 in the embodiments of the present application may also be circuitry or any other device capable of performing a storage function for storing program instructions and/or data.
By programming the processor 41, the code corresponding to the memory WebShell detection method described in the foregoing embodiment may be solidified into the chip, so that the steps of the memory WebShell detection method in the embodiment shown in fig. 1 can be executed when the chip runs. How to program the processor 41 is well known to those skilled in the art and will not be described in detail here.
Based on the same inventive concept, embodiments of the present application further provide a storage medium, where the storage medium stores computer instructions, and when the computer instructions are executed on a computer, the computer executes the memory WebShell detection method discussed above.
In some possible embodiments, the aspects of the memory WebShell detection method provided in this application may also be implemented in the form of a program product, which includes program code for causing the control apparatus to perform the steps in the memory WebShell detection method according to various exemplary embodiments of this application described above in this specification, when the program product is run on a device.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (18)

1. A memory WebShell detection method is characterized by comprising the following steps:
acquiring all classes in a virtual machine corresponding to a target application, and screening M first classes with memory WebShell risk in all the classes, wherein M is an integer greater than or equal to 1;
determining the code corresponding to the in-memory bytecode of each first class of the M first classes, respectively, to obtain M code files;
performing memory WebShell risk detection on the M code files through K first preset rules to obtain N code files with memory WebShell risks, wherein one first preset rule corresponds to one memory WebShell risk, and K and N are integers greater than or equal to 1;
and carrying out risk marking on the classes corresponding to the N code files by using a first preset identifier, and taking the class containing the first preset identifier as a first memory WebShell detection result.
2. The method of claim 1, wherein said screening M first classes among said all classes for risk of WebShell comprises:
screening out a second class which does not have a source byte code file locally in all the classes, and taking the second class as the first class which has the WebShell risk; and/or
And screening out a third class matched with any one of H second preset rules from all the classes, and taking the third class as the first class with the WebShell risk, wherein one second preset rule corresponds to the memory WebShell risk, and H is an integer greater than or equal to 1.
3. The method of claim 1, wherein said determining the code corresponding to each of said first classes, respectively, to obtain M code files, comprises:
extracting the bytecode, in the memory, corresponding to each of the M first classes;
and converting the byte codes into M code files of preset types.
4. The method as claimed in claim 1, wherein the performing memory WebShell risk detection on the M code files through K first preset rules to obtain N code files with memory WebShell risk comprises:
respectively matching the text corresponding to each code file in the M code files with a preset risk level character string;
if a first character string consistent with the preset risk level character string exists in a text corresponding to any code file, determining that the code text to which the first character string belongs has a memory WebShell risk; and/or
Detecting the number of preset risk characters in the text corresponding to each code file in the M code files;
judging whether the numerical values are larger than a preset threshold value or not;
and if the first code text with the risk character number larger than the preset threshold exists, determining that the memory WebShell risk exists in the first code text.
5. The method of claim 1, prior to obtaining all classes in the virtual machine corresponding to the target application, further comprising:
scanning all currently running applications to obtain basic information corresponding to each application in all the applications and process information corresponding to each application, wherein the basic information at least comprises complete class information of an application entrance;
selecting a target application from all the applications according to the basic information corresponding to each application in all the applications;
and loading the basic information corresponding to the target application and the process information corresponding to the target application to a virtual machine corresponding to the target application.
6. The method as claimed in claim 1, wherein after performing risk labeling on the classes corresponding to the N code files by using a first preset identifier and using the class containing the first preset identifier as a first memory WebShell detection result, the method further comprises:
determining a risk type corresponding to the class containing the first preset identifier in the first memory WebShell detection result;
and adding the information corresponding to the risk type into the first memory WebShell detection result to obtain a second memory WebShell detection result.
7. The method as claimed in claim 6, wherein determining the risk type corresponding to the class including the first preset identifier in the first memory WebShell detection result includes:
matching a second code file of a preset type corresponding to the class containing the first preset identification in the first memory WebShell detection result with any one first preset rule;
and if the matching is consistent, determining the memory WebShell risk corresponding to the first preset rule which is consistent with the matching of the second code file as a risk type.
8. The method as claimed in claim 2 or claim 6, wherein after performing risk labeling on the classes corresponding to the N code files by using a first preset identifier and taking the class containing the first preset identifier as a first memory WebShell detection result, the method further comprises:
labeling the third class by using a second preset identification, wherein the second preset identification corresponds to a memory WebShell risk;
adding the labeled third class into the first memory WebShell detection result to obtain a third memory WebShell detection result; or
And adding the labeled third class into the second memory WebShell detection result to obtain a third memory WebShell detection result.
9. A memory WebShell detection device, the device comprising:
a screening module, configured to acquire all classes in a virtual machine corresponding to a target application and screen out M first classes with memory WebShell risk from all the classes, wherein M is an integer greater than or equal to 1;
a first determining module, configured to determine codes corresponding to the bytecode of each of the M first classes in the memory, to obtain M code files;
the detection module is used for carrying out memory WebShell risk detection on the M code files through K first preset rules to obtain N code files with memory WebShell risks, wherein one first preset rule corresponds to one memory WebShell risk, and K and N are integers greater than or equal to 1;
and the first marking module is used for carrying out risk marking on the classes corresponding to the N code files by using a first preset identifier, and taking the class containing the first preset identifier as a first memory WebShell detection result.
10. The apparatus of claim 9, wherein the screening module is specifically configured to:
screening out a second class which does not have a source byte code file locally in all the classes, and taking the second class as the first class which has the WebShell risk; and/or
And screening out a third class matched with any one of H second preset rules from all the classes, and taking the third class as the first class with the WebShell risk, wherein one second preset rule corresponds to the memory WebShell risk, and H is an integer greater than or equal to 1.
11. The apparatus of claim 9, wherein the first determining module is specifically configured to:
extracting the bytecode, in the memory, corresponding to each of the M first classes;
and converting the byte codes into M code files of preset types.
12. The apparatus of claim 9, wherein the detection module is specifically configured to:
respectively matching the text corresponding to each code file in the M code files with a preset risk level character string;
if a first character string consistent with the preset risk level character string exists in a text corresponding to any code file, determining that the code text to which the first character string belongs has a memory WebShell risk; and/or
Detecting the number of preset risk characters in the text corresponding to each code file in the M code files;
judging whether the numerical values are larger than a preset threshold value or not;
and if the first code text with the risk character number larger than the preset threshold exists, determining that the memory WebShell risk exists in the first code text.
13. The apparatus of claim 9, wherein the apparatus further comprises:
a scanning module, configured to scan all currently running applications to obtain basic information and process information respectively corresponding to each application, wherein the basic information at least comprises complete class information of the application entry;
a selection module, configured to select a target application from all the applications according to the basic information corresponding to all the applications;
and a loading module, configured to load the basic information and the process information corresponding to the target application into the virtual machine corresponding to the target application.
14. The apparatus of claim 9, wherein the apparatus further comprises:
a second determining module, configured to determine a risk type corresponding to the class including the first preset identifier in the first memory WebShell detection result;
and the first adding module is used for adding the information corresponding to the risk type into the first memory WebShell detection result to obtain a second memory WebShell detection result.
15. The apparatus of claim 14, wherein the second determining module is specifically configured to:
match a second code file of the preset type, corresponding to a class containing the first preset identifier in the first memory WebShell detection result, against any one of the first preset rules;
and if the second code file matches a first preset rule, determine the memory WebShell risk corresponding to the matched first preset rule as the risk type.
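The risk type determination in claims 14 and 15 can be sketched as below. The sketch assumes each first preset rule is a regular expression keyed by the name of the risk it corresponds to. The rule names and patterns are hypothetical, chosen only to make the example runnable.

```python
import re

# Hypothetical first preset rules: risk type name -> pattern.
FIRST_RULES = {
    "command-execution": re.compile(r"Runtime\.getRuntime\(\)\.exec|ProcessBuilder"),
    "dynamic-class-loading": re.compile(r"defineClass"),
}


def risk_types(text):
    """Names of all first preset rules that match the code text."""
    return [name for name, pattern in FIRST_RULES.items() if pattern.search(text)]


def annotate(first_result, code_texts):
    """Build a second detection result: marked class name -> list of risk types.

    first_result is the list of marked class names; code_texts maps each
    class name to the text of its code file.
    """
    return {cls: risk_types(code_texts[cls]) for cls in first_result}
```

A class may match more than one rule, so the sketch keeps a list of risk types per class rather than a single value.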
16. The apparatus of claim 10 or claim 14, wherein the apparatus further comprises:
a second marking module, configured to mark the third class using a second preset identifier, wherein the second preset identifier corresponds to one memory WebShell risk;
and a second adding module, configured to add the marked third class to the first memory WebShell detection result to obtain a third memory WebShell detection result, or to add the marked third class to the second memory WebShell detection result to obtain a third memory WebShell detection result.
17. An electronic device, comprising:
a memory for storing a computer program;
a processor for implementing the method steps of any one of claims 1-8 when executing the computer program stored on the memory.
18. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of claims 1-8.
CN202111577421.7A 2021-12-22 2021-12-22 Memory WebShell detection method and device and electronic equipment Pending CN114329467A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111577421.7A CN114329467A (en) 2021-12-22 2021-12-22 Memory WebShell detection method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111577421.7A CN114329467A (en) 2021-12-22 2021-12-22 Memory WebShell detection method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN114329467A true CN114329467A (en) 2022-04-12

Family

ID=81055501

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111577421.7A Pending CN114329467A (en) 2021-12-22 2021-12-22 Memory WebShell detection method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN114329467A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination