CN115118452A - Malicious code detection model processing method, detection method and device

Malicious code detection model processing method, detection method and device

Info

Publication number
CN115118452A
Authority
CN
China
Prior art keywords
code
behavior
sample
behavior type
target
Prior art date
Legal status: Pending
Application number
CN202210552298.1A
Other languages
Chinese (zh)
Inventor
赖豪华
蔡晨
郑荣锋
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202210552298.1A
Publication of CN115118452A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 63/00 Network architectures or network communication protocols for network security
    • H04L 63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L 63/1441 Countermeasures against malicious traffic
    • H04L 63/145 Countermeasures against malicious traffic, the attack involving the propagation of malware through the network, e.g. viruses, trojans or worms
    • H04L 41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/16 Arrangements for maintenance, administration or management of data switching networks using machine learning or artificial intelligence
    • H04L 63/20 Network architectures or network communication protocols for network security for managing network security; network security policies in general

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Virology (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a malicious code detection model processing method, a malicious code detection method and a malicious code detection device. The method comprises: with reference to a target behavior type feature set, generating a sample behavior feature group of a code training sample based on the code behavior types of the sample code behaviors recorded in its sample behavior record set; and training a detection model based on the sample behavior feature group and the sample label. Because attack-bearing malicious code behavior is generated only when malicious code is executed, whether a code behavior is malicious directly reflects whether the code that produced it is malicious. Detection performed by a model on the basis of code behavior therefore cannot be bypassed in the way that detection based on the file features of the code can, which improves the training effect of the detection model and the accuracy of subsequent malicious code detection. In addition, the iterative training of the detection model and the incremental storage of detection data into the sample database are deployed automatically, which improves efficiency and saves labor cost.

Description

Malicious code detection model processing method, detection method and device
Technical Field
The present application relates to the field of computer security technologies, and in particular to a malicious code detection model processing method, a malicious code detection method, an apparatus, a computer device, a storage medium, and a computer program product.
Background
With the rapid development of computer technology, computing has penetrated every aspect of life, and computer security has accordingly become an important issue. In particular, malicious code running on computer devices poses a serious challenge to computer security, so such malicious code needs to be detected.
In the related art, malicious code detection is mainly implemented through file feature detection: the structural features of the code file to be detected are determined and looked up in a pre-established malicious code feature library, and if a match is found, the code file is determined to contain malicious code. However, detection based on file features is prone to missed detections, so its accuracy in detecting malicious code is low.
Disclosure of Invention
In view of the foregoing, it is necessary to provide a malicious code detection model processing method, a detection method, an apparatus, a computer device, a storage medium, and a computer program product capable of detecting malicious code with high accuracy.
In one aspect, the present application provides a malicious code detection model processing method, including:
acquiring a sample behavior record set of the code training sample, wherein the sample behavior record set records sample code behaviors generated after the corresponding code training sample runs;
acquiring a target behavior type characteristic set, wherein the target behavior type characteristic set comprises a code behavior type used for training a detection model;
referring to the target behavior type feature set, and determining a sample behavior type belonging to the target behavior type feature set based on the code behavior type of the sample code behavior recorded by the sample behavior record set;
generating a sample behavior characteristic group of the code training sample based on the sample behavior type belonging to the target behavior type characteristic set;
and acquiring a sample label for representing the malicious attribute of the code training sample, and training a detection model based on the sample behavior feature group and the sample label.
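The steps above do not name a concrete model or feature encoding, so the following is only a minimal sketch of the training flow under assumed choices: behavior types are represented as hypothetical strings, the feature group is the bit vector described in the embodiments below, and a scikit-learn RandomForestClassifier stands in for the unspecified detection model.

```python
# Illustrative sketch only. TARGET_BEHAVIOR_TYPES, the dict-shaped behavior
# records and the RandomForestClassifier are assumptions, not part of the text.
from sklearn.ensemble import RandomForestClassifier

# Target behavior type feature set: the code behavior types used for training.
TARGET_BEHAVIOR_TYPES = ["command_execution", "code_injection",
                         "memory_residence", "credential_collection"]

def behavior_feature_group(sample_behavior_records, target_types=TARGET_BEHAVIOR_TYPES):
    """Map a sample behavior record set onto a fixed-length sample behavior feature group.

    Each feature bit corresponds one-to-one to a code behavior type in the
    target behavior type feature set: 1 if that type appears among the
    recorded sample code behaviors, 0 otherwise.
    """
    recorded_types = {record["behavior_type"] for record in sample_behavior_records}
    return [1 if t in recorded_types else 0 for t in target_types]

def train_detection_model(sample_record_sets, sample_labels):
    """Train the detection model from record sets and malicious/non-malicious labels."""
    features = [behavior_feature_group(records) for records in sample_record_sets]
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(features, sample_labels)   # sample_labels: 1 = malicious, 0 = non-malicious
    return model
```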
In one embodiment, the update termination condition includes that a behavior type feature set with a detection effect evaluation value larger than a preset threshold exists in the behavior type feature set population after the update processing of the code behavior type.
In one embodiment, the update process includes: at least one of aggregating between different code behavior types, pruning a code behavior type, adding a code behavior type, disassembling a code behavior type, or altering a code behavior type.
In one embodiment, the feature bits of the sample behavior feature group correspond one-to-one to the code behavior types in the target behavior type feature set; in the sample behavior feature group, the feature bits corresponding to the sample behavior types belonging to the target behavior type feature set take a first value, and the remaining feature bits take a second value.
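A small worked example of the feature-bit assignment described in this embodiment, using hypothetical behavior type names; the first value is taken as 1 and the second value as 0.

```python
# Hypothetical example of the feature-bit assignment described above.
target_types = ["command_execution", "code_injection",
                "memory_residence", "credential_collection"]
sample_behavior_types = {"command_execution", "memory_residence"}  # observed in the record set

feature_group = [1 if t in sample_behavior_types else 0 for t in target_types]
print(feature_group)  # [1, 0, 1, 0] -> 1 is the "first value", 0 the "second value"
```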
On the other hand, the application also provides a malicious code detection model processing device, which comprises:
the first acquisition module is used for acquiring a sample behavior record set of a code training sample, wherein the sample behavior record set records sample code behaviors generated after the corresponding code training sample runs;
the second acquisition module is used for acquiring a target behavior type characteristic set, and the target behavior type characteristic set comprises a code behavior type used for training a detection model;
the determining module is used for determining a sample behavior type belonging to the target behavior type feature set on the basis of the code behavior type of the sample code behavior recorded by the sample behavior recording set by referring to the target behavior type feature set;
the generating module is used for generating a sample behavior characteristic group of the code training sample based on the sample behavior type belonging to the target behavior type characteristic set;
and the training module is used for acquiring a sample label for representing the malicious attribute of the code training sample, and training the detection model based on the sample behavior feature group and the sample label.
In one embodiment, the first acquisition module is used for running a code training sample in a closed behavior perception environment, the behavior perception environment being configured to record sample code behaviors generated after the code training sample is run, and for forming a sample behavior record set of the code training sample based on the sample code behaviors recorded by the behavior perception environment.
In one embodiment, the target behavior type feature set is obtained by the target behavior type feature set construction step; the device also includes:
the first construction module is used for acquiring an initial behavior type feature set and a plurality of sample behavior record sets corresponding to a plurality of code samples; determining the respective occurrence probabilities of malicious code samples and non-malicious code samples among the plurality of code samples, and calculating the information entropy of the code samples from these probabilities; determining the conditional entropy of each code behavior type in the initial behavior type feature set; determining an information gain value of each code behavior type from the information entropy and the conditional entropy of that code behavior type, the information gain value indicating how much the code behavior type contributes to malicious code detection; and selecting, from the initial behavior type feature set, the code behavior types corresponding to the top preset number of information gain values in descending order to construct the target behavior type feature set.
In one embodiment, the first construction module is further configured to, for the target code behavior corresponding to each code behavior type in the initial behavior type feature set, obtain a first number of sample behavior record sets in which the target code behavior is recorded and a second number of sample behavior record sets in which it is not recorded; determine a third number of malicious code samples and a fourth number of non-malicious code samples among the code samples whose sample behavior record sets record the target code behavior; determine a fifth number of malicious code samples and a sixth number of non-malicious code samples among the code samples whose sample behavior record sets do not record the target code behavior; and calculate the conditional entropy of the code behavior type from the first to sixth numbers. A worked sketch of this calculation is given below.
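A sketch of the entropy, conditional-entropy and information-gain calculations in the two embodiments above, assuming the standard ID3-style formulas; the parameter names mirror the "first number" through "sixth number" in the text, and the top-k selection at the end is shown only in comment form.

```python
import math

def entropy(p_malicious, p_non_malicious):
    """Information entropy of the code sample labels."""
    return -sum(p * math.log2(p) for p in (p_malicious, p_non_malicious) if p > 0)

def conditional_entropy(n_with, n_without,
                        mal_with, ben_with, mal_without, ben_without):
    """Conditional entropy of the labels given whether one code behavior type is recorded.

    n_with, n_without        : first and second numbers (record sets with / without the behavior)
    mal_with, ben_with       : third and fourth numbers (labels among record sets with it)
    mal_without, ben_without : fifth and sixth numbers (labels among record sets without it)
    """
    total = n_with + n_without
    h = 0.0
    for n, mal, ben in ((n_with, mal_with, ben_with),
                        (n_without, mal_without, ben_without)):
        if n > 0:
            h += (n / total) * entropy(mal / n, ben / n)
    return h

def information_gain(sample_entropy, behavior_conditional_entropy):
    """Contribution of a code behavior type to malicious code detection."""
    return sample_entropy - behavior_conditional_entropy

# Top-k selection (k is an assumed preset number):
# gains = {t: information_gain(h_samples, cond_h[t]) for t in initial_feature_set}
# target_feature_set = sorted(gains, key=gains.get, reverse=True)[:k]
```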
In one embodiment, the target behavior type feature set is obtained by the target behavior type feature set construction step; the device also includes:
the second construction module is used for acquiring an initial behavior type feature set and a plurality of sample behavior record sets corresponding to a plurality of code samples; for the target code behavior corresponding to each code behavior type in the initial behavior type feature set, referring to the sample code behaviors recorded in each of the plurality of sample behavior record sets and determining, among the plurality of code samples, the code samples whose sample behavior record sets record the target code behavior as target code samples; calculating a first proportion, namely the proportion of the non-malicious code samples among the target code samples to the non-malicious code samples among the plurality of code samples; calculating a second proportion, namely the proportion of the malicious code samples among the target code samples to the malicious code samples among the plurality of code samples; obtaining, from the first proportion and the second proportion, the contribution degree of each code behavior type to malicious code detection; and screening the initial behavior type feature set according to the contribution degree of each code behavior type to malicious code detection to obtain the target behavior type feature set.
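A sketch of the proportion-based contribution degree in this embodiment. The text does not specify how the first and second proportions are combined, so the difference (second minus first) is used here purely as one plausible choice, and the "is_malicious" field on each sample is an assumed representation of the sample label.

```python
def contribution_degree(target_code_samples, all_code_samples):
    """Contribution degree of one code behavior type from the two proportions above.

    target_code_samples: code samples whose record set contains the behavior type.
    Each sample is assumed to be a dict with a boolean "is_malicious" label.
    The combination of the two proportions is left open in the text; the
    difference (second proportion minus first proportion) is used here only
    as one plausible choice.
    """
    non_mal_all = sum(1 for s in all_code_samples if not s["is_malicious"])
    mal_all = sum(1 for s in all_code_samples if s["is_malicious"])
    non_mal_hit = sum(1 for s in target_code_samples if not s["is_malicious"])
    mal_hit = sum(1 for s in target_code_samples if s["is_malicious"])

    first_proportion = non_mal_hit / non_mal_all if non_mal_all else 0.0
    second_proportion = mal_hit / mal_all if mal_all else 0.0
    return second_proportion - first_proportion
```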
In one embodiment, the target behavior type feature set is updated through the target behavior type feature set updating step; the device also includes:
the updating module is used for acquiring a behavior type feature set population, and the behavior type feature set population comprises at least one behavior type feature set; performing iterative update processing of the code behavior type on the behavior type feature set in the behavior type feature set population until an update termination condition is reached, and obtaining a finally updated behavior type feature set population; and screening out the target behavior type feature set from the finally updated behavior type feature set population.
In one embodiment, the updating module is further configured to obtain an initial behavior type feature set and a plurality of sample behavior record sets of a plurality of code samples; for each code behavior type in the initial behavior type feature set, calculating the contribution degree of each code behavior type to malicious code detection according to the distribution of the code behavior corresponding to each code behavior type in a plurality of sample behavior record sets and the distribution of malicious code samples and non-malicious code samples in a plurality of code samples; screening an initial behavior type feature set according to the contribution degree of each code behavior type to malicious code detection, and obtaining screened code behavior types; and combining the screened code behavior types to obtain a behavior type characteristic set population.
In one embodiment, the updating module is further configured to select at least a part of behavior type feature sets in the behavior type feature set population to be updated in the current iteration; updating at least part of code behavior types in each selected behavior type feature set; and detecting each behavior type feature set in the behavior type feature set population subjected to the updating processing of the code behavior type based on the trained detection model, and determining the behavior type feature set population subjected to the iteration updating processing according to a corresponding detection result.
In one embodiment, the updating module is further configured to obtain, for each behavior type feature set in the behavior type feature set population subjected to the update processing of the code behavior type, a corresponding sample behavior feature group set and sample label set, wherein each sample behavior feature group set is generated with reference to the corresponding behavior type feature set; input each sample behavior feature group in each sample behavior feature group set into the trained detection model to obtain a detection result set corresponding to that sample behavior feature group set; and screen out at least part of the behavior type feature sets from the population according to the detection result set and sample label set corresponding to each behavior type feature set.
In one embodiment, the updating module is further configured to obtain, from the detection result set and sample label set corresponding to each behavior type feature set, a detection effect evaluation value of that behavior type feature set when it is used for malicious code detection; and to select, from the behavior type feature set population subjected to the update processing of the code behavior type, the top preset number of behavior type feature sets in descending order of detection effect evaluation value.
In one embodiment, the update termination condition includes that a behavior type feature set with a detection effect evaluation value larger than a preset threshold exists in the behavior type feature set population after the update processing of the code behavior type.
In one embodiment, the update process includes: at least one of aggregating between different code behavior types, pruning a code behavior type, adding a code behavior type, disassembling a code behavior type, or altering a code behavior type.
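A rough sketch of the iterative update of the behavior type feature set population described in the embodiments above. Feature sets are modeled as Python sets of behavior type names, only the prune/add/alter operations of the five listed update operations are implemented, and evaluate() is a stand-in for the detection effect evaluation obtained with the trained detection model; all of these choices are assumptions.

```python
import random

def update_feature_set(feature_set, all_types):
    """Apply one randomly chosen update operation (prune / add / alter) to a feature set."""
    fs = set(feature_set)
    op = random.choice(["prune", "add", "alter"])
    if op in ("prune", "alter") and len(fs) > 1:
        fs.remove(random.choice(sorted(fs)))       # remove one code behavior type
    if op in ("add", "alter"):
        candidates = sorted(all_types - fs)
        if candidates:
            fs.add(random.choice(candidates))      # add a different code behavior type
    return fs

def evolve_population(population, all_types, evaluate, threshold,
                      max_iterations=50, keep=10):
    """Iteratively update the population until the update termination condition is met.

    `evaluate` returns the detection effect evaluation value of a feature set
    (e.g. obtained by testing the trained detection model); `threshold` is the
    preset threshold in the termination condition.
    """
    for _ in range(max_iterations):
        selected = random.sample(population, k=max(1, len(population) // 2))
        population = population + [update_feature_set(fs, all_types) for fs in selected]
        population.sort(key=evaluate, reverse=True)   # rank by detection effect
        population = population[:keep]                # keep the top preset number
        if evaluate(population[0]) > threshold:       # update termination condition
            break
    return population
```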
In one embodiment, the determining module is configured to reconstruct at least a part of sample code behaviors recorded in the sample behavior record set, and obtain a reconstructed sample behavior type; and determining the sample behavior type belonging to the target behavior type feature set in the reconstructed sample behavior type by referring to the target behavior type feature set.
In one embodiment, the determining module is further configured to obtain a behavior reconstruction file, wherein the behavior reconstruction file records the code behavior types that require reconstruction and the reconstruction processing modes used to reconstruct a code behavior into a code behavior type included in the target behavior type feature set; and, when at least a part of the sample code behaviors recorded in the sample behavior record set belong to the code behavior types requiring reconstruction in the behavior reconstruction file, to reconstruct that part of the sample code behaviors according to the corresponding reconstruction processing modes recorded in the behavior reconstruction file. A hypothetical example of such a reconstruction file follows.
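The raw behavior names, target types and processing modes below are invented for illustration only; they show one possible shape of a behavior reconstruction file and of the reconstruction step.

```python
# Hypothetical behavior reconstruction file: raw code behavior type ->
# (target code behavior type, reconstruction processing mode). All names
# and modes here are invented for illustration.
BEHAVIOR_RECONSTRUCTION = {
    "CreateRemoteThread": ("code_injection",   "merge"),
    "WriteProcessMemory": ("code_injection",   "merge"),
    "RegSetValue_Run":    ("memory_residence", "rename"),
}

def reconstruct(sample_code_behaviors):
    """Reconstruct raw sample code behaviors into target behavior types.

    Behaviors whose type appears in the reconstruction file are rewritten
    according to the recorded processing mode; all other behaviors pass
    through unchanged.
    """
    reconstructed = []
    for behavior_type in sample_code_behaviors:
        target_type, _mode = BEHAVIOR_RECONSTRUCTION.get(behavior_type,
                                                         (behavior_type, None))
        reconstructed.append(target_type)
    return reconstructed

# reconstruct(["CreateRemoteThread", "cmd_exec"]) -> ["code_injection", "cmd_exec"]
```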
In one embodiment, the feature bits of the sample behavior feature group correspond one-to-one to the code behavior types in the target behavior type feature set; in the sample behavior feature group, the feature bits corresponding to the sample behavior types belonging to the target behavior type feature set take a first value, and the remaining feature bits take a second value.
In one embodiment, there are a plurality of detection models and a plurality of target behavior type feature sets, each detection model corresponding to one target behavior type feature set; the training module is further configured to test each trained detection model with code samples to obtain a detection effect evaluation value of the corresponding target behavior type feature set, and to screen out, from all the trained detection models, the detection model to be used for malicious code detection according to the detection effect evaluation value of each target behavior type feature set.
On the other hand, the application also provides a computer device, which includes a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps in the detection model processing method for malicious codes when executing the computer program.
In another aspect, the present application further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps in the above method for processing a malicious code detection model.
In another aspect, the present application further provides a computer program product, which includes a computer program, and when the computer program is executed by a processor, the computer program implements the steps of the above malicious code detection model processing method.
According to the above malicious code detection model processing method, apparatus, computer device, storage medium and computer program product, the sample behavior feature group of the code training sample is generated, with reference to the target behavior type feature set, based on the code behavior types of the sample code behaviors recorded in the sample behavior record set, and the detection model is trained based on the sample behavior feature group and the sample label. Because malicious code behavior is generated only while malicious code executes, whether a code behavior is malicious directly reflects whether the code that produced it is malicious. Training the detection model on, and detecting with, the code behaviors generated while the code runs therefore cannot be bypassed in the way detection based on the file features of the code can, which improves the training effect of the detection model and the accuracy of subsequent malicious code detection.
In another aspect, the present application provides a method for detecting malicious code, including:
acquiring a behavior record set of the target code, wherein the behavior record set records code behaviors generated after the target code runs;
acquiring a target behavior type characteristic set, wherein the target behavior type characteristic set comprises a code behavior type required when a trained detection model detects malicious codes;
referring to the target behavior type feature set, and determining a target behavior type belonging to the target behavior type feature set based on the code behavior type of the code behavior recorded by the behavior recording set;
generating a behavior feature group of the target code based on the target behavior type belonging to the target behavior type feature set;
and detecting the malicious codes based on the behavior feature group through the trained detection model to obtain the malicious attributes of the target codes.
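A minimal sketch of the detection flow in these steps, assuming the same bit-vector feature group as in the training sketch earlier and a model exposing a scikit-learn style predict method.

```python
def detect_malicious_code(target_code_record_set, model, target_types):
    """Detect the malicious attribute of the target code from its behavior record set.

    `model` is a trained detection model with a scikit-learn style predict(),
    and `target_types` is the target behavior type feature set (assumed to be
    an ordered list of behavior type names).
    """
    recorded = {record["behavior_type"] for record in target_code_record_set}
    behavior_feature_group = [1 if t in recorded else 0 for t in target_types]
    return bool(model.predict([behavior_feature_group])[0])  # True = malicious
```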
In another aspect, the present application provides an apparatus for detecting malicious code, including:
the first acquisition module is used for acquiring a behavior record set of the target code, and the behavior record set records code behaviors generated after the target code runs;
the second acquisition module is used for acquiring a target behavior type characteristic set, wherein the target behavior type characteristic set comprises a code behavior type required when the trained detection model detects the malicious code;
the determining module is used for determining a target behavior type belonging to the target behavior type feature set based on the code behavior type of the code behavior recorded by the behavior recording set by referring to the target behavior type feature set;
the generating module is used for generating a behavior characteristic group of the target code based on the target behavior type belonging to the target behavior type characteristic set;
and the detection module is used for detecting the malicious codes based on the behavior feature group through the trained detection model to obtain the malicious attributes of the target codes.
In one embodiment, the determining module is configured to reconstruct at least a part of code behaviors recorded in the behavior record set to obtain a reconstructed behavior type; and determining the target behavior type belonging to the target behavior type feature set in the reconstructed behavior type by referring to the target behavior type feature set.
In one embodiment, the detection module is further configured to push, to the outside, verification request information for verifying a malicious attribute of the target code; obtaining a returned checking result based on the checking request information, wherein the checking result is used for determining a sample label when the target code is used as a code training sample; and training the detection model again based on the code training sample as the target code and the corresponding sample label.
On the other hand, the application also provides a computer device, which includes a memory and a processor, where the memory stores a computer program, and the processor implements the steps of the provided malicious code detection method when executing the computer program.
In another aspect, the present application further provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the provided malicious code detection method.
In another aspect, the present application further provides a computer program product, which includes a computer program, and when the computer program is executed by a processor, the computer program implements the steps of the method for detecting malicious code provided above.
According to the above malicious code detection method, apparatus, computer device, storage medium and computer program product, the behavior record set of the target code is obtained, the behavior feature group of the target code is generated, with reference to the target behavior type feature set, based on the code behavior types of the code behaviors recorded in the behavior record set, and the target code is detected by the detection model based on the behavior feature group. Because malicious code behavior is generated only while malicious code executes, whether a code behavior is malicious directly reflects whether the code that produced it is malicious. Using the code behaviors generated while the code runs, both for training the detection model and for detection by the model, cannot be bypassed in the way detection based on the file features of the code can, which improves the training effect of the detection model and the accuracy of subsequent malicious code detection.
Drawings
FIG. 1 is a diagram of an application environment of a malicious code detection model processing method and a malicious code detection method in one embodiment;
FIG. 2 is a flowchart illustrating a malicious code detection model processing method according to an embodiment;
FIG. 3 is a flowchart illustrating a method for malicious code detection according to an embodiment;
FIG. 4 is a flowchart illustrating a malicious code detection model processing method according to another embodiment;
FIG. 5 is an architectural diagram of a detection environment in one embodiment;
FIG. 6 is a schematic diagram of a deployment flow of a detection model in one embodiment;
FIG. 7 is a block flow diagram of an inspection flow and an iterative training flow in one embodiment;
FIG. 8 is a block diagram of an apparatus for processing a malicious code detection model in one embodiment;
FIG. 9 is a block diagram of an apparatus for malicious code detection in one embodiment;
fig. 10 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
First, terms referred to in the embodiments of the present application are briefly explained:
malicious code: refers to code that is not functional but is dangerous, and one of the safest definitions is to treat all unnecessary code as malicious, which has a broader meaning than malicious code, including all software that may conflict with some organizational security policy. Reference to code (including malicious code) in embodiments of the present application generally refers to executable programs in a computer device. Some malicious codes have all functions that a complete program should have, and can be independently propagated and run, and such malicious codes do not need to be hosted in another program, and can be called as independent malicious codes. Some malicious codes are only a section of codes and need to be embedded into a certain complete program to be propagated and run as a component of the program, and such malicious codes can be called as dependent malicious codes, which can cause the host program to embody maliciousness.
Malicious property: generally corresponds to two types of code evaluation results, namely malicious code and non-malicious code.
Honeypot technology: a technique for deceiving attackers. Hosts, network services or information are deployed as bait to induce attacks, so that the attack behavior can be captured and analyzed, the tools and methods used by the attacker can be learned, and the attack intention and motivation can be inferred. This lets defenders clearly understand the security threats they face and strengthen the protection of real systems through technical and management means.
Code behavior: refers to the behavior exhibited by the code after execution, and particularly may be the behavior exhibited by a process or thread created by the code after execution.
Information entropy: for describing the uncertainty of the random variable. In the embodiment of the application, the information entropy is used for describing the uncertainty of whether the code is malicious code or non-malicious code.
Conditional entropy: the uncertainty of a random variable given another known variable. In the embodiments of the present application, the conditional entropy describes the uncertainty of whether a code is malicious or non-malicious, given that the code is known to generate a certain type of code behavior.
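For reference, the standard definitions underlying these two terms, written with Y as the label of a code sample (malicious or non-malicious) and X as the indicator of whether a given code behavior type is observed:

```latex
% Y: label of a code sample (malicious / non-malicious);
% X: indicator of whether a given code behavior type is observed.
H(Y) = -\sum_{y} p(y)\,\log_2 p(y)
\qquad
H(Y \mid X) = -\sum_{x} p(x) \sum_{y} p(y \mid x)\,\log_2 p(y \mid x)
\qquad
\mathrm{Gain}(X) = H(Y) - H(Y \mid X)
```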
In addition, in the embodiment of the present application, the training process of the detection model and the subsequent malicious code detection application process mainly relate to Artificial Intelligence (AI), and are designed based on Machine Learning (ML) technology in the AI. Artificial intelligence is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence.
Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making. The artificial intelligence technology mainly includes computer vision technology, natural language processing technology, machine learning/deep learning and other directions. With the research and progress of artificial intelligence technology, artificial intelligence is researched and applied in a plurality of fields, such as common smart homes, smart customer service, virtual assistants, smart speakers, smart marketing, unmanned driving, automatic driving, robots, smart medical treatment and the like.
Machine learning is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other disciplines. It studies how a computer can simulate or realize human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve its own performance. Compared with data mining, which looks for shared characteristics in big data, machine learning focuses on algorithm design, enabling a computer to learn rules from data automatically and use those rules to predict unknown data.
Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied in every field of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning and inductive learning. Reinforcement Learning (RL), also known as evaluative learning, is one of the paradigms and methodologies of machine learning, used to describe and solve the problem of how an agent maximizes its return or achieves a specific goal by learning strategies while interacting with an environment.
In some embodiments, the malicious code detection model processing method or the malicious code detection method provided in the embodiments of the present application may be applied to an application environment as shown in fig. 1. The terminal 102 may communicate with the server 104 directly or indirectly through a wired or wireless network, which is not particularly limited in the embodiment of the present application. In addition, the terminal 102 or the server 104 may be used alone to execute the malicious code detection model processing method in the embodiment of the present application, or may be used alone to execute the malicious code detection method in the embodiment of the present application; the two methods may be used in cooperation to execute a detection model processing method for malicious codes in the embodiment of the present application, or may be used in cooperation to execute a detection method for malicious codes in the embodiment of the present application.
For the independent execution, one implementation process when the server 104 executes the detection model processing method of the malicious code alone is taken as an example. Specifically, the server 104 may obtain and store a sample behavior record set of the code training sample in advance, refer to a locally stored target behavior type feature set, determine a sample behavior type belonging to the target behavior type feature set based on the code behavior type of the sample code behavior recorded by the sample behavior record set, and thereby generate a sample behavior feature group. The sample behavior feature set can be used as an input item of a locally stored detection model, the sample label of the code training sample can be used as a supervision item, and the detection model can be trained through supervised learning.
Also for the independent execution, one implementation procedure when the terminal 102 independently executes the detection method of the malicious code is taken as an example. Specifically, the terminal 102 may obtain a behavior record set of the target code, and obtain a target behavior type feature set. The terminal 102 refers to the target behavior type feature set, determines a target behavior type belonging to the target behavior type feature set in the code behavior types of the code behaviors recorded in the behavior record set, and generates a behavior feature group of the target code based on the target behavior type feature set. The terminal 102 inputs the behavior feature group into the detection model to obtain the malicious property of the target code.
For cooperative execution, taking one implementation of the malicious code detection model processing method executed cooperatively by the two as an example, the terminal 102 may collect code training samples and may obtain and store the target behavior type feature set issued by the server 104. After collecting a new code training sample, the terminal 102 may, with reference to the locally stored target behavior type feature set, determine the sample behavior types belonging to the target behavior type feature set based on the code behavior types of the sample code behaviors recorded in the sample behavior record set, generate a sample behavior feature group based on those sample behavior types, and upload the sample behavior feature group to the server 104. The server 104 may use the sample behavior feature group as an input of the locally stored detection model and the sample label of the code training sample as a supervision item, and train the detection model through supervised learning. A data storage system may store the code training samples obtained by the server 104 as well as the detection model, which is then trained based on the code training samples. The data storage system may be integrated on the server 104, or on the cloud or another server.
Also for cooperative execution, taking one implementation process of the method for detecting malicious codes cooperatively executed by the two as an example, the terminal 102 may obtain a behavior record set of the target code, and upload the behavior record set to the server 104. The server 104 may locally obtain the target behavior type feature set, and determine, among the code behavior types of the code behaviors recorded in the behavior record set, a target behavior type belonging to the target behavior type feature set with reference to the target behavior type feature set, and generate a behavior feature group based on the target behavior type feature set. The server 104 inputs the behavior feature set into the locally stored detection model to obtain the malicious property of the target code.
It can be understood that, for cooperative execution, the server 104 may be integrated in a cloud end, that is, the malicious code detection model processing method and the malicious code detection method mentioned in the embodiments of the present application may both be implemented by a cloud technology. Cloud technology refers to a hosting technology for unifying serial resources such as hardware, software, network and the like in a wide area network or a local area network to realize calculation, storage, processing and sharing of data.
Cloud technology is a general term for the network, information, integration, management-platform and application technologies used in the cloud computing business model; it can form a resource pool that is used on demand, flexibly and conveniently, and cloud computing technology will become an important support. The background services of technical network systems, such as video websites, picture websites and portal websites, require a large amount of computing and storage resources. As the internet industry develops, each item may have its own identification mark that needs to be transmitted to a background system for logical processing; data of different levels are processed separately, and all kinds of industry data need strong system background support, which can only be realized through cloud computing.
The embodiments of the present application may relate to the field of cloud security in cloud technology, specifically in that malicious code running on different computer devices in a wide area network or local area network can be detected through cloud computing. Cloud Security is the general term for security software, hardware, users, organizations and security cloud platforms applied based on the cloud computing business model. Cloud security integrates emerging technologies and concepts such as parallel processing, grid computing and unknown-virus behavior judgment: through the abnormality monitoring of software behavior by a large number of networked clients, the latest information about trojans and malicious programs on the internet is obtained and sent to the server for automatic analysis and processing, and the virus and trojan solutions are then distributed to every client.
The main research directions of cloud security include: 1. cloud computing security, which studies how to guarantee the security of the cloud and the applications on it, including the security of cloud computer systems, secure storage and isolation of user data, user access authentication, information transmission security, network attack protection and compliance auditing; 2. cloudification of security infrastructure, which studies how to use cloud computing to build and integrate security infrastructure resources and optimize security protection mechanisms, constructing super-large-scale platforms for collecting and processing security events and information so that massive information can be gathered and correlated and the overall capability for controlling network security events and risks is improved; 3. cloud security services, which studies the various security services provided to users based on a cloud computing platform, such as anti-virus services.
The terminal 102 may be, but not limited to, various desktop computers, notebook computers, smart phones, tablet computers, internet of things devices and portable wearable devices, and the internet of things devices may be smart speakers, smart televisions, smart air conditioners, smart car-mounted devices, and the like. The portable wearable device can be a smart watch, a smart bracelet, a head-mounted device, and the like. An application program, such as a video application, an audio application, or the like, may be run on the terminal for presenting the code data. The server 104 may be a background server corresponding to software, a web page, an applet, or a server specially used for detecting a code, which is not limited in this embodiment of the present application. The server 104 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), a big data and artificial intelligence platform, and the like.
In some embodiments, in combination with the above noun explanation, technical explanation and implementation environment description, as shown in fig. 2, a malicious code detection model processing method is provided, which is described by taking an example that the method is applied to a computer device (the computer device may specifically be a terminal or a server in fig. 1), and includes the following steps:
step 202, a sample behavior record set of the code training sample is obtained, and the sample behavior record set records sample code behaviors generated after the corresponding code training sample is operated.
In connection with the above explanation of terms, a code training sample may be an executable program used as a training sample. A program naturally generates a series of code behaviors after it runs, such as command execution, code injection, memory residence, credential collection (mainly of user names and passwords) and information copying. The computer device can therefore record these code behaviors and aggregate them into a record set. Because code behaviors are recorded as they occur, the same type of sample code behavior may appear more than once in the sample behavior record set. For example, a certain code may first produce a command-execution behavior, then a code-injection behavior, and then another command-execution behavior; its behavior record set then contains two "command execution" code behaviors.
It should be noted that a piece of code may run for a long time and therefore generate a large number of code behaviors. In that case, when the computer device acquires the behavior record set of the code, it is impractical to record every code behavior generated over the whole run: recording them all would occupy a large amount of storage, and subsequently processing so many code behaviors would consume considerable system resources. Therefore, when obtaining the sample behavior record set, the computer device can acquire the sample code behaviors of the code training sample during part of the running process, for example during one time period or several time periods, which is not specifically limited in the embodiments of the present application. In addition, the same code training sample may be run multiple times; for a given code training sample, the sample code behaviors recorded in its sample behavior record set may come from one run or from several runs, which is likewise not specifically limited in the embodiments of the present application.
It should be further noted that, because a program executing on a computer device is in fact embodied by a process or thread as the essential execution entity, the code behaviors mentioned in the embodiments of the present application may be generated, during execution, by the process or thread corresponding to the code. In addition, to record code behaviors, the sample code behaviors in the sample behavior record set need a concrete representation. In practice they may be represented by character strings, with different code behavior types corresponding to different strings, or by codes, with all code behavior types encoded in the same way, for example all in binary or all in hexadecimal.
For ease of illustration, the embodiments of the present application use binary coding of code behaviors as an example: the code behavior type "command execution" may be represented by the binary code "001", and the code behavior type "code injection" by the binary code "010". The sample behavior record set is then effectively a collection of binary codes. In this example the binary codes of different code behavior types all have the same length; in practice their lengths may also differ, for example the code behavior type "command execution" may use the three-bit code "001" while "code injection" uses the four-bit code "0010", which is likewise not specifically limited in the embodiments of the present application.
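A sketch of this fixed-length binary encoding; the 3-bit codes beyond the two given in the text are assumptions.

```python
# Hypothetical fixed-length (3-bit) binary encoding of code behavior types;
# the first two codes follow the example in the text, the rest are assumptions.
BEHAVIOR_ENCODING = {
    "command_execution":     "001",
    "code_injection":        "010",
    "memory_residence":      "011",
    "credential_collection": "100",
}

def encode_record_set(sample_code_behaviors):
    """Represent a sample behavior record set as one concatenated coded stream."""
    return "".join(BEHAVIOR_ENCODING[b] for b in sample_code_behaviors)

# encode_record_set(["command_execution", "code_injection", "command_execution"])
# -> "001010001"
```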
As can be seen from the above, the computer device may limit the size of the resulting sample behavior record set by acquiring only the sample code behaviors produced during part of the run, that is, by limiting the amount of data acquired. In practice the computer device may instead limit the size of the sample behavior record set directly. Taking the binary coding mentioned above, with each code behavior using a 3-bit code, if the sample behavior record set is directly limited to 100 sample code behaviors, the set occupies 100 × 3 = 300 bits; its size is thus limited to 300 bits.
In this approach the size of the sample behavior record set is limited by directly limiting its scale, so when the computer device acquires data, that is, acquires the sample code behaviors generated after the code training sample is run, it need not limit the amount of data acquired, but can subsequently trim the acquired data to the scale of the sample behavior record set. Specifically, if a large amount of data is acquired, for example 200 sample code behaviors, the computer device can calculate that, represented in binary coding, they occupy 200 × 3 = 600 bits, while the sample behavior record set is 300 bits. The computer device can therefore prune the acquired code behaviors, for example retaining only 100 of them.
The above is one possible way of reducing the amount of data. Considering that binary codes can be concatenated, if the binary codes of the 200 acquired sample code behaviors are joined end to end into a single coded stream, the computer device can also reduce the data amount by truncating the coded stream. The 200 sample code behaviors form 600 bits; truncating 300 bits at an arbitrary position and then grouping those 300 bits into 3-bit codes in coded-bit order yields exactly 100 binary codes, which matches the limited scale of the sample behavior record set.
However, random truncation may break the integrity of an original 3-bit code and cause adjacent codes formed by the end-to-end concatenation to recombine, so that the sample code behaviors recorded in the truncated record set differ too much from the originally acquired ones. That would amount to constructing the sample behavior record set at random, so recording sample code behaviors would lose its meaning and would no longer reflect how the code actually ran. For this reason, in practice truncation can be performed in units of 3 bits. Continuing the above example, with a 600-bit coded stream the computer device may truncate bits 4 through 303 as the basis for constructing the sample behavior record set; the integrity of each original 3-bit code is then preserved and its original meaning is retained.
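A sketch of truncating the coded stream on 3-bit code boundaries, as described above; the 300-bit limit and 3-bit width follow the running example, and the aligned starting offset is chosen to drop the first code, as in the bits-4-to-303 example.

```python
def truncate_coded_stream(coded_stream, limit=300, width=3):
    """Truncate a coded stream to the record-set limit on whole-code boundaries.

    The starting offset is a multiple of `width`, so no 3-bit code is split and
    adjacent codes cannot recombine into spurious behaviors. Dropping the first
    code (start = width) mirrors the bits-4-to-303 example in the text; any
    other aligned offset would satisfy the same constraint.
    """
    if len(coded_stream) <= limit:
        return coded_stream
    start = width
    return coded_stream[start:start + limit]
```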
The above mainly describes how, when the amount of acquired data is relatively large, the acquired data are trimmed to the scale of the sample behavior record set so as to limit its size. In practice the amount of acquired data may also be relatively small, for example 60 code behaviors. The computer device can calculate that, represented in binary coding, all sample code behaviors occupy 60 × 3 = 180 bits, while the sample behavior record set is 300 bits; the bits occupied by the sample code behaviors have not yet reached the limit size of the record set.
Meanwhile, there is usually more than one code training sample, which raises the question of whether the sample behavior record sets of different code training samples need to have exactly the same size. Because the subsequent processing does not depend on the scale of the sample behavior record set, in practice no limit scale need be set, or a limit scale may be set and used to trim the data only when the amount of acquired data is large. On this basis, the acquired data are preserved as completely as possible.
Of course, it is contemplated that sample behavior record sets of the same size may be handled uniformly, as data records may be consistent in form. Therefore, in the actual implementation process, for a certain code training sample, under the condition that all the acquired code behaviors of the code training sample do not reach the limit scale of the sample behavior record set, the computer device can perform data volume supplementary processing on all the acquired code behaviors. It is to be understood that the supplemental data may be content that has no meaning to refer to, or may otherwise add sample code behavior that the sample code training sample never produced to the set of sample behavior records. In conjunction with the above description of "encoded streams," the way in which the computer device supplements the amount of data may be by padding bits into the encoded streams. Now, the following description is made: if 60 code behaviors are acquired, it can be determined that 180 bits are occupied, and the limit size of the sample behavior record set can be reached by supplementing 120 bits.
Since the sample code behavior corresponds to 3-bit binary codes, if bits are randomly filled, the integrity of the original 3-bit binary codes may be damaged. Based on the same reason that random truncation is not performed, the computer device can perform bit padding on the premise of maintaining the integrity of the original binary code of one group of 3 bits. The specific filling manner may be filling before the encoded stream, or filling after the encoded stream, or filling in the middle of the encoded stream, or combining multiple manners, and the embodiment of the present application does not specifically limit the filling manner.
It should be noted that the above process mentions that "the supplemental data may be content without meaning", so the padding content may not point to any code behavior type. In connection with the above, the code behavior type "command execution" is represented by the binary code "001", the code behavior type "code injection" is represented by the binary code "010", followed by "011" and so on. It follows that, at least in this example, the binary code "000" is not enabled for referring to any code behavior type. Thus, in this example, all padded bits may take the value "0". Of course, in the actual implementation process, the padding manner may also be determined in combination with the binary codes specifically enabled for referring to code behavior types, so as to ensure that the padding content does not point to any code behavior type; the embodiment of the present application does not specifically limit the padding manner.
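For concreteness, the following is a minimal Python sketch of the encoding, 3-bit-aligned truncation, and "000" padding described above. The code table, the 300-bit limit, and all function and variable names are assumptions made for this illustration rather than details prescribed by the embodiment.

```python
# Illustrative sketch only: the 3-bit code table, the 300-bit limit and all
# names below are assumptions for this example, not part of the original scheme.

BEHAVIOR_CODES = {          # hypothetical mapping of code behavior types to 3-bit codes
    "command_execution": "001",
    "code_injection": "010",
    "memory_resident": "011",
}
LIMIT_BITS = 300            # assumed limit size of the sample behavior record set
PAD_CODE = "000"            # "000" is not enabled for any behavior type, so it is safe padding


def build_encoded_stream(behaviors):
    """Concatenate 3-bit codes end to end in generation-time order."""
    return "".join(BEHAVIOR_CODES[b] for b in behaviors)


def fit_to_limit(stream, limit=LIMIT_BITS):
    """Truncate or pad the stream to the limit, always in whole 3-bit groups."""
    if len(stream) > limit:
        # truncate in units of 3 bits so that no 3-bit group is broken apart
        return stream[:limit - limit % 3]
    # pad with meaningless "000" groups until the limit size is reached
    return stream + PAD_CODE * ((limit - len(stream)) // 3)


if __name__ == "__main__":
    behaviors = ["command_execution", "code_injection"] * 30   # 60 behaviors -> 180 bits
    stream = fit_to_limit(build_encoded_stream(behaviors))
    print(len(stream))      # 300: 180 bits of real records plus 120 padding bits
```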
In the above description of representing the sample code behaviors by binary coding, it is mentioned that the sample behavior record set can be represented by binary codes connected end to end to form an "encoded stream". Since the codes are connected end to end, an ordering is usually required, and the code behaviors generated during the running of the code naturally form a generation time sequence. Therefore, in an actual implementation process, the sample behavior record set can be a sequence in which the recorded sample code behaviors are ordered according to the generation time sequence of the sample code behaviors.
As will be understood from the subsequent steps, when the computer device generates the sample behavior feature set, the feature bits in the sample behavior feature set actually correspond to the code behavior type. The set of sample behavior features is generated based on the sample behavior types belonging to the set of target behavior type features, and the sample behavior types belonging to the set of target behavior type features are determined based on the code behavior types of the sample code behaviors recorded by the set of sample behavior records. The above process shows that the code behavior type corresponding to the feature bit in the sample behavior feature group is associated with the sample code behavior recorded by the sample behavior record set. Based on the correlation, the time sequence originally existing in the sample behavior record set can be transmitted to the sample behavior feature set, so that the ordering of the feature bits in the sample behavior feature set is generated. The ordering of the feature bits is actually the ordering between corresponding code behavior types, and the specific implementation process can refer to the description of the subsequent process.
It should be noted that, considering that malicious code exhibits its malicious nature during running, there may indeed exist a certain code behavior generation timing, and such timing may carry certain malicious logic. For example, a malicious behavior of a certain malicious code may be to first copy itself in large quantities, then have the many processes spawned by the copied malicious code reside in memory for a long time, and finally use those long-resident processes to continuously collect credentials. Therefore, the generation time sequence of the sample code behaviors can be introduced into the sample behavior record set in the manner described above and taken as one of the reference factors for training the detection model, which can improve the detection accuracy of malicious code. However, given the obfuscation of present-day malicious code and the freedom of the malicious code author's programming style, the code behavior generation sequence exhibited by malicious code may no longer be associated with malicious logic, and introducing the code behavior generation sequence may then have an adverse effect. Therefore, in practical implementation, whether to introduce a code behavior generation timing into the sample behavior record set may be determined based on actual conditions, which is not specifically limited in the embodiment of the present application.
And 204, acquiring a target behavior type characteristic set, wherein the target behavior type characteristic set comprises a code behavior type for training a detection model.
It is understood that the code behaviors that code can present can all be collected in advance, and among all the collected code behaviors, usually only part of them are malicious. For example, ordinary code generally does not produce behaviors such as rummaging through information, which affect information security and are generally malicious code behaviors. Thus, which code behavior types may be malicious can be known in advance. Based on the above description, in this step, the computer device may determine which code behavior types may be malicious, and then form the target behavior type feature set from those code behavior types.
Like the sample behavior record set, the code behavior types in the target behavior type feature set also need a representation form. According to the content of the above steps, a code behavior type can be represented by a binary code, the sample code behaviors in the sample behavior record set are also represented by binary codes, and each record of such a binary code in the sample behavior record set indicates that a code behavior of the corresponding code behavior type was generated. Therefore, in this step, the code behavior types in the target behavior type feature set can also be represented by binary codes. Of course, other representation forms may be adopted, and the embodiment of the present application is not particularly limited thereto.
And step 206, referring to the target behavior type feature set, and determining the sample behavior type belonging to the target behavior type feature set based on the code behavior type of the sample code behavior recorded by the sample behavior recording set.
It should be noted that the sample behavior feature set in the subsequent step is actually used as an input item of the detection model, while the target behavior type feature set is mainly used as a guide for generating that input item, that is, it determines the code behavior types required by the input item of the detection model. An input item does not necessarily completely cover all code behavior types included in the target behavior type feature set, but the code behavior types it covers are subordinate to the target behavior type feature set. It should be further noted that the code behavior types included in the target behavior type feature set are not fixed: one code behavior type may be broken down into multiple code behavior subtypes, and multiple code behavior types may be aggregated into a new code behavior type.
As can be seen from the above steps, the code behavior types in the target behavior type feature set are actually determined based on the code behaviors that may be malicious, and the target behavior type feature set is mainly used as a guide for generating the input item. Therefore, the detection model essentially judges whether the code under examination is malicious by checking whether it generates code behaviors corresponding to the code behavior types in the target behavior type feature set. Thus, in a certain sense, the process of "determining the sample behavior types belonging to the target behavior type feature set" in this step may be understood as taking the intersection of all code behavior types included in the target behavior type feature set and all code behavior types corresponding to the sample code behaviors recorded in the sample behavior record set.
However, it has been explained above that the code behavior types included in the target behavior type feature set are not invariable, and some code behavior types included in the target behavior type feature set may not be obtainable directly from the recorded code behaviors. For example, suppose all code behavior types corresponding to the sample code behaviors recorded in the sample behavior record set include two code behavior types, A1 and A2, and the target behavior type feature set includes the code behavior type A, where A is aggregated from A1 and A2. Clearly, there is no sample code behavior of code behavior type A in the sample behavior record set, but A1 and A2 are closely associated with A. Since the code behavior types in the target behavior type feature set are determined based on code behaviors that may be malicious, and A is considered a possibly malicious code behavior type, the malicious nature of the two code behavior types A1 and A2 obviously cannot be ignored merely because the sample behavior record set contains no sample code behaviors of code behavior type A.
This is why this step mentions "referring to the target behavior type feature set". The word "referring" mainly means that the code behavior types corresponding to the sample code behaviors recorded in the sample behavior record set may undergo certain changes toward fitting the code behavior types in the target behavior type feature set. In this change, some code behavior types that are not originally covered by the sample behavior record set may be generated, but the change is still "based on the code behavior types of the sample code behaviors recorded by the sample behavior record set". It should be noted that the "sample behavior type" mentioned in this step is mainly an expression used to distinguish it from the code behavior types originally covered by the sample behavior record set; it is still essentially a code behavior type.
It is understood that the "sample behavior types" are all subordinate to the target behavior type feature set, that is, all the code behavior types in the target behavior type feature set. However, some "sample behavior types" may be covered by the sample behavior record set, and the code behavior types covered by the sample behavior record set and the code behavior types included in the target behavior type feature set are "intersected" and have not undergone change. Meanwhile, some "sample behavior types" may not be covered by the sample behavior record set, and are generated by "intersecting" at least a part of the code behavior types covered by the sample behavior record set with the code behavior types included in the target behavior type feature set.
For ease of understanding, suppose the code behavior types covered by the sample behavior record set include A1, A2, B, C, and E, while the code behavior types included in the target behavior type feature set include A, B1, C, and D, where A is aggregated from A1 and A2, and B is aggregated from B1 and B2. By referring to the target behavior type feature set, the code behavior types covered by the sample behavior record set can be changed toward fitting the code behavior types in the target behavior type feature set. Specifically, the code behavior types covered by the changed sample behavior record set may include A, B1, B2, C, and E. Thus, by taking the intersection, the sample behavior types belonging to the target behavior type feature set can be determined to be A, B1, and C. It should be noted that the above-mentioned changes are mainly aggregation and decomposition; other kinds of changes may exist in the practical implementation process, which is not specifically limited in the embodiment of the present application.
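The "change, then intersect" idea in this example can be sketched as follows in Python. The aggregation and decomposition rules and all type names are taken from the hypothetical example above and are not a prescribed implementation.

```python
# Illustrative sketch: the change rules below (aggregation and decomposition)
# are assumptions chosen to reproduce the example in the text.

AGGREGATE = {frozenset({"A1", "A2"}): "A"}   # A is aggregated from A1 and A2
DECOMPOSE = {"B": {"B1", "B2"}}              # B can be broken down into B1 and B2


def change_recorded_types(recorded):
    """Change the recorded code behavior types toward the target feature set."""
    changed = set(recorded)
    for parts, whole in AGGREGATE.items():
        if parts <= changed:                 # all sub-types were recorded -> add the aggregate
            changed -= parts
            changed.add(whole)
    for whole, parts in DECOMPOSE.items():
        if whole in changed:                 # a recorded type maps to several sub-types
            changed.remove(whole)
            changed |= parts
    return changed


recorded_types = {"A1", "A2", "B", "C", "E"}
target_feature_set = {"A", "B1", "C", "D"}

changed = change_recorded_types(recorded_types)     # {"A", "B1", "B2", "C", "E"}
sample_behavior_types = changed & target_feature_set
print(sorted(sample_behavior_types))                # ['A', 'B1', 'C']
```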
And step 208, generating a sample behavior characteristic group of the code training sample based on the sample behavior type belonging to the target behavior type characteristic set.
As can be seen from the explanation of the above steps, the sample behavior feature group is used as an input item of the detection model, and the target behavior type feature set is mainly used as a guide for generating that input item. Therefore, the detection model essentially judges whether the code under examination is malicious by checking whether it generates code behaviors corresponding to the code behavior types in the target behavior type feature set. The sample behavior feature group is mainly used to indicate which code behaviors of the code behavior types in the target behavior type feature set were generated by the code training sample, or, more precisely, which sample code behaviors belonging to code behavior types in the target behavior type feature set are obtained after the sample code behaviors generated by the code training sample are changed.
Thus, the sample behavior feature group is ultimately generated in order to represent which sample code behaviors, belonging to the code behavior types in the target behavior type feature set, the code training sample produced. The sample behavior feature group can have a plurality of feature bits; the number of feature bits can be consistent with the number of code behavior types in the target behavior type feature set; each feature bit refers to one code behavior type in the target behavior type feature set; and the information on each feature bit is one sample behavior feature. The information on a feature bit may be represented as 1 or 0. If the information on a certain feature bit in the sample behavior feature group is 1, it indicates that the code training sample finally generated a sample code behavior corresponding to the code behavior type indicated by that feature bit; if the information on a certain feature bit is 0, it indicates that the code training sample did not finally generate a sample code behavior corresponding to the code behavior type indicated by that feature bit. Of course, in practical implementation, the meanings of 1 and 0 may be reversed, and this is not specifically limited in the embodiment of the present application.
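Continuing the hypothetical example above, a minimal sketch of turning the sample behavior types into a sample behavior feature group of feature bits might look as follows; the ordering of the feature bits and the type names are assumptions for illustration.

```python
# Illustrative sketch: type names and feature-bit ordering are assumptions.

target_feature_set = ["A", "B1", "C", "D"]          # one feature bit per target behavior type
sample_behavior_types = {"A", "B1", "C"}            # determined in the previous step


def build_feature_group(target_types, sample_types):
    """1 on a feature bit means the sample produced behavior of that type, 0 means it did not."""
    return [1 if t in sample_types else 0 for t in target_types]


print(build_feature_group(target_feature_set, sample_behavior_types))   # [1, 1, 1, 0]
```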
When a plurality of code training samples are used to train the detection model in the same batch, the code behavior types indicated by the feature bits at the same position in their respective sample behavior feature groups are the same. For example, if code training samples a and b are both used in the same batch of training, and the first feature bit in the sample behavior feature group of a refers to code behavior type A, then the first feature bit in the sample behavior feature group of b also refers to code behavior type A; that is, the code behavior types referred to by feature bits at the same position are the same. The sample behavior feature group is regarded as one feature as a whole, so its composition needs to be the same across samples, mainly to ensure that the input items in the same batch of training are all features of the same kind. Meanwhile, for a given model, whatever kind of features are used as input items in the final training process, the same kind of features are generally used in the subsequent application process. Therefore, for the sample behavior feature group input during final training and the behavior feature group input during subsequent application, the code behavior types indicated by feature bits at the same position are the same.
And step 210, obtaining a sample label for representing the malicious attribute of the code training sample, and training a detection model based on the sample behavior feature group and the sample label.
The malicious attribute may correspond to two code evaluation results, namely, malicious code and non-malicious code. The computer device may perform supervised training of the detection model based on the sample labels by taking the set of sample behavior features as input. The detection model may be a neural network model for implementing a classification function, such as a recurrent neural network or a convolutional neural network, and the condition for finishing training may be detection model convergence, which is not specifically limited in this embodiment of the present application.
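As one possible illustration of this training step, the sketch below uses scikit-learn's MLPClassifier as a stand-in for the detection model; the embodiment itself does not prescribe a specific library or architecture, and the toy data here are assumptions.

```python
# Illustrative sketch: the toy data and the use of scikit-learn's MLPClassifier
# are assumptions; the embodiment only requires some neural network classifier.
import numpy as np
from sklearn.neural_network import MLPClassifier

# each row is a sample behavior feature group; label 1 = malicious, 0 = non-malicious
X = np.array([[1, 1, 1, 0],
              [0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 0]])
y = np.array([1, 1, 0, 0])

model = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
model.fit(X, y)                       # supervised training on feature groups and sample labels

print(model.predict([[1, 1, 0, 0]]))  # predicted malicious attribute for a new behavior feature group
```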
It should be noted that, in the actual implementation process, all code training samples may be divided into two parts: one part is used in the training process, and the other part may be used to test the trained detection model. It should also be noted that all code training samples may come from a sample database. The sample database can include malicious code samples and non-malicious code samples, and the samples in the sample database can come from three sources: first, code samples shared by open-source parties; second, code samples generated by computer security staff in attack drills; and third, code samples captured in the network (which may include a wide area network or a local area network). Of course, in an actual implementation process, the sample database may also have other code sample sources, which is not specifically limited in this embodiment of the present application.
In the malicious code detection model processing method described above, a sample behavior record set of a code training sample and a target behavior type feature set are obtained; referring to the target behavior type feature set, a sample behavior feature group of the code training sample is generated based on the code behavior types of the sample code behaviors recorded by the sample behavior record set; and the detection model is trained based on the sample behavior feature group and the sample label. Because malicious code behaviors can only be generated during the execution of malicious code, that is, whether code behaviors are malicious directly reflects whether the code that generated them is malicious, the code behaviors generated during the running of code are used both in the training of the detection model and in detection by the detection model, and, unlike the file features of the code, they cannot be bypassed. Therefore, the training effect of the detection model can be improved, and the detection accuracy for subsequent malicious code can be improved.
In addition, the sample behavior feature group is generated by referring to the target behavior type feature set and based on the sample behavior record set, and the target behavior type feature set is formed based on code behavior types that may be malicious. Therefore, the sample behavior feature group can reflect, as far as possible, the degree of maliciousness of the code behaviors in the sample behavior record set, and using the sample behavior feature group as an input item of the detection model can improve the training effect of the detection model and thus the detection accuracy for subsequent malicious code.
Finally, detecting code with the detection model does not cause storage resources to grow continuously due to frequent updates of a malicious code feature library, as file feature detection does; only the detection model needs to be retrained with updated samples, that is, the detection model does not need to occupy too many storage resources, so storage resources can be saved. Meanwhile, as a malicious code feature library gradually grows, the number of lookups in the subsequent detection process gradually increases, which affects detection efficiency and, by occupying too many processing resources, affects system performance. With the detection model, the prediction result can be obtained through a single input and output process, so the detection efficiency can be improved and the influence on system performance can be reduced.
As known from the naming of the sample behavior record set, the sample code behavior of the code training sample needs to be recorded. Thus, in some embodiments, obtaining a sample behavior record set of code training samples comprises: running the code training sample in a closed behavior perception environment, wherein the behavior perception environment is configured to record sample code behaviors generated after the code training sample is run; and forming a sample behavior record set of the code training sample based on the sample code behaviors recorded by the behavior perception environment.
Specifically, the behavior awareness environment is mainly used for inducing an attacker to attack a host, a network service or information as a bait by arranging the host, the network service or the information, so that attack behaviors can be perceived and recorded. Even more, the attack behavior can be analyzed to understand the tools and methods used by the attacker, to infer the attack intention and motivation, and to perform backward tracing. Through the behavior perception environment, a defending party can clearly know the facing security threat, and the security protection capability of the system is enhanced through technical and management means. In practical implementation, the behavior awareness environment may be configured based on honeypot technology.
In addition, the embodiments of the present application emphasize a "closed" behavior-aware environment, primarily because a behavior-aware environment is essentially a security resource whose value lies in being probed, attacked, and compromised, so that evidence of intrusion can be collected. In actual implementation, the behavior-aware environment may also be configured to expose a certain vulnerability for intrusion. However, it can be understood that if the computer device configured with the behavior-aware environment is compromised as a whole, this purpose cannot be achieved, and an intruder may also use the computer device configured with the behavior-aware environment as a springboard to intrude on other computer devices, which may cause more serious consequences. Thus, "closed" in the embodiments of the present application mainly emphasizes that, when configuring a behavior-aware environment, the trapping technology that exposes a vulnerability for intrusion needs to be sufficiently secure; otherwise, code behaviors cannot be recorded effectively.
The computer device runs the code training sample in a closed behavior awareness environment, and the behavior awareness environment can record the generated sample code behavior. For the representation of the recorded time period and the sample code behavior, reference may be made to the description of the above embodiments, which is not repeated herein. The form of the sample behavior record set may be related to the representation form of the sample code behavior, and the embodiment of the present application does not specifically limit the form of the sample behavior record set. For example, if the sample code behavior is represented by binary coding, the sample behavior record set may be constructed in the form of a binary coding set.
In the above embodiment, since code behaviors can be perceived and recorded through a sufficiently secure behavior-aware environment, malicious code behaviors can be perceived and recorded on the premise that the computer device configured with the behavior-aware environment is not compromised as a whole, and the security of the computer device and of the system composed of the computer devices can be enhanced.
In the above embodiments, the target behavior type feature set is formed based on a code behavior type that may be malicious, and an implementation manner for obtaining the target behavior type feature set based on empirical judgment is also provided. It can be understood that, in the actual implementation process, a part of code behavior types can be screened from a plurality of code behavior types based on some quantifiable judgment bases to construct a target behavior type feature set. It should be noted that, the "obtaining a target behavior type feature set" mentioned in the training process of the detection model may be temporarily constructed in the training process, or may be constructed in advance before the training, and is directly obtained and used in the training process, which is not specifically limited in this embodiment of the present application. Based on this description, in some embodiments, the target behavior type feature set is obtained by a target behavior type feature set construction step, which includes:
acquiring an initial behavior type feature set and a plurality of sample behavior record sets corresponding to a plurality of code samples; determining respective occurrence probabilities of malicious code samples and non-malicious code samples in the plurality of code samples, and calculating the information entropy of the code samples according to the respective occurrence probabilities; determining the conditional entropy of each code behavior type in the initial behavior type feature set; determining an information gain value of each code behavior type according to the information entropy and the conditional entropy of each code behavior type; the information gain value is used for indicating the contribution degree of the corresponding code behavior type to the malicious code detection; and selecting code behavior types corresponding to a preset number of information gain values before in descending order according to the information gain values from the initial behavior type feature set to construct a target behavior type feature set.
Specifically, also included in the initial behavior type feature set is the code behavior type. The initial behavior type feature set may be composed of all known code behavior types, or may be composed of code behavior types that may be determined based on experience and have malicious characteristics, as stated in the previous embodiment, and the embodiment of the present application does not specifically limit the manner in which the initial behavior type feature set is composed. The initial behavior type feature set is mainly used as a basis for constructing the target behavior type feature set, and the code behavior types in the target behavior type feature set obtained by subsequent construction are derived from the initial behavior type feature set.
In addition to obtaining the initial behavior type feature set, the computer device may obtain a plurality of sample behavior record sets corresponding to the plurality of code samples. Wherein, each code sample corresponds to a sample behavior record set. It should be noted that the "code sample" is mainly distinguished from the "code training sample" mentioned in the above embodiments by name, and the "code training sample" in the above embodiments mainly refers to a code sample for training a detection model. The two codes may be different in nature, or even the same code sample, and the embodiment of the present application is not particularly limited thereto.
Of the plurality of code samples, which code samples are malicious code samples and which code samples are non-malicious code may be known in advance. Therefore, the computer device can calculate and obtain the occurrence probability of each of malicious code samples and non-malicious code samples in the plurality of code samples according to the distribution of the malicious code samples and the non-malicious code samples in the plurality of code samples. In combination with the definition of the information entropy, whether the code sample is a malicious code sample or not may be set as a random variable, a value of the random variable is a malicious code sample or a non-malicious code sample, and the calculation process of the information entropy of the code sample may refer to the following formula (1):
H(X) = -\sum_{i=1}^{n} p(x_i) \log p(x_i);    (1)
In the above formula (1), X represents a random variable, x_i represents the ith value of the random variable, n represents the total number of values of the random variable, and p(x_i) represents the probability that the random variable takes the ith value. In the embodiment of the present application, the random variable X has two values, and the probabilities of the two values are, respectively, the occurrence probabilities of malicious code samples and non-malicious code samples among the plurality of code samples.
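A minimal sketch of formula (1) in Python is shown below; base-10 logarithms are assumed so that the result matches the worked example given later.

```python
# Minimal sketch of formula (1); base-10 logarithms are an assumption.
import math

def information_entropy(probabilities):
    """Formula (1): H(X) = -sum(p(x_i) * log p(x_i)) over the values of X."""
    return -sum(p * math.log10(p) for p in probabilities if p > 0)

# 2 malicious and 1 non-malicious sample among 3 code samples
print(round(information_entropy([2 / 3, 1 / 3]), 4))   # 0.2764
```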
After determining the information entropy of the code sample, the computer device may determine the uncertainty of whether the code sample is a malicious code sample or a non-malicious code sample under different code behavior types as known conditions, that is, determine the conditional entropy of each code behavior type in the initial behavior type feature set. Where conditional entropy can be calculated with reference to the associated definitions and formulas.
For any code behavior type in the initial behavior type feature set, the computer device takes the difference between the information entropy and the conditional entropy of that code behavior type to obtain its information gain value. The information entropy represents the inherent uncertainty of a random variable, the conditional entropy represents the uncertainty of the random variable under a certain known condition, and the difference between the two represents the degree to which the uncertainty of the random variable is reduced under that known condition; this difference is the information gain value. In view of this principle, it is clear that the larger the information gain value of a certain code behavior type is, the more the uncertainty of whether a code sample is a malicious code sample or a non-malicious code sample is reduced under the known condition of that code behavior type (in fact, the known condition is that the code sample generates code behavior of that code behavior type). This shows that the code behavior type provides greater guidance toward the result when inferring whether a code sample is a malicious code sample or a non-malicious code sample, that is, the code behavior type contributes more to malicious code detection.
As can be seen from the above, the larger the information gain value is, the greater the contribution degree of the corresponding code behavior type to malicious code detection is. Therefore, the computer device can sort the information gain values of the code behavior types in the initial behavior type feature set in descending order from large to small, and select a preset number of information gain values sorted in the top from the information gain values. And constructing a target behavior type feature set based on the code behavior types corresponding to the information gain values. It should be noted that the information gain value may also be obtained by performing a difference calculation in the reverse direction, that is, performing a difference between the conditional entropy of the code behavior type and the information entropy. At this time, the information gain value is set to a reverse meaning, that is, the smaller the information gain value is, the greater the contribution degree of the corresponding code behavior type to the malicious code detection is, and a target behavior type feature set can be constructed through reverse selection. In the practical implementation process, which manner is specifically selected is not specifically limited in this application example.
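A sketch of the descending-order screening by information gain value might look as follows; the gain values and the preset number are hypothetical.

```python
# Illustrative sketch: the gain values and the preset number are assumptions.

def select_target_types(info_gain, preset_number):
    """Keep the code behavior types with the largest information gain values."""
    ranked = sorted(info_gain, key=info_gain.get, reverse=True)
    return ranked[:preset_number]

# hypothetical information gain values for three code behavior types
info_gain = {"command_execution": 0.08, "code_injection": 0.02, "memory_resident": 0.05}
print(select_target_types(info_gain, preset_number=2))   # ['command_execution', 'memory_resident']
```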
In the above embodiment, since the information gain values of different code behavior types can be calculated, the degree of contribution of different code behavior types to malicious code detection can be effectively determined through the information gain values. Therefore, the code behavior types with large contribution degree to malicious code detection can be screened out as much as possible in the initial behavior type feature set, and the code behavior types are used in the training process of the detection model, so that the training effect of the detection model can be improved, and the detection accuracy of the subsequent malicious code is improved.
In some embodiments, determining the conditional entropy for each code behavior type in the initial behavior type feature set comprises:
aiming at a target code behavior corresponding to each code behavior type in the initial behavior type feature set, acquiring a first number of sample behavior record sets in which the target code behaviors are recorded in a plurality of sample behavior record sets and a second number of sample behavior record sets in which the target code behaviors are not recorded; determining a third number of malicious code samples and a fourth number of non-malicious code samples in code samples corresponding to a sample behavior record set in which target code behaviors are recorded in a plurality of sample behavior record sets; determining a fifth number of malicious code samples and a sixth number of non-malicious code samples in code samples corresponding to a sample behavior record set in which target code behaviors are not recorded in a plurality of sample behavior record sets; and calculating the conditional entropy of any code behavior type according to the first number, the second number, the third number, the fourth number, the fifth number and the sixth number.
Specifically, for any code behavior type in the initial behavior type feature set, the code behavior corresponding to the code behavior type may be used as the target code behavior. It will be appreciated that some of the plurality of code samples may or may not produce the object code behavior during runtime. Thus, there may be some sample behavior record sets in the plurality of sample behavior record sets in which the object code behavior is recorded, and there may be some sample behavior record sets in which the object code behavior is not recorded. Therefore, the computer device may first determine which sample behavior record sets of the plurality of sample behavior record sets have the target code behavior recorded therein, and count the first number; it may also be determined which sample behavior record sets have no behavior of the object code recorded therein, and a second quantity may be counted.
For the sample behavior record set in which the target code behavior is recorded in the plurality of sample behavior record sets, the computer device may further determine which code samples are malicious code samples in all code samples corresponding to the sample behavior record sets, and count a third number; it may also be determined which code samples are non-malicious code samples and a fourth number may be counted. For a sample behavior record set in which the target code behavior is not recorded in the plurality of sample behavior record sets, the computer device may further determine which code samples are malicious code samples in all code samples corresponding to the sample behavior record sets, and count a fifth number; it may also be determined which code samples are non-malicious code samples and a sixth number may be counted.
For any code behavior type, the computer device calculates the conditional entropy of the code behavior type according to the first number to the sixth number, and the process comprises the following steps: calculating a first probability that the target code behavior appears and a second probability that the target code behavior does not appear in the plurality of sample behavior record sets according to the first number and the second number; according to the third quantity and the fourth quantity, calculating a third probability that the code sample is a malicious code sample and a fourth probability that the code sample is a non-malicious code sample on the premise that the target code behavior occurs; according to the fifth quantity and the sixth quantity, calculating a fifth probability that the code sample is a malicious code sample and a sixth probability that the code sample is a non-malicious code sample on the premise that the target code behavior does not appear; and calculating the conditional entropy of the code behavior type according to the first probability to the sixth probability.
Wherein, according to the first to sixth probabilities, the process of calculating the conditional entropy of the code behavior type may include: calculating a first information entropy of the code sample on the premise of the target code behavior according to the third probability and the fourth probability; according to the fifth probability and the sixth probability, calculating a second information entropy of the code sample on the premise that the target code behavior does not appear; and calculating the product of the first information entropy and the first probability, calculating the product of the second information entropy and the second probability, and taking the sum of the two products as the conditional entropy of the code behavior type. Specifically, the calculation process of the conditional entropy may refer to the following formula (2):
H(Y|X) = \sum_{x \in X} p(x) H(Y|X=x);    (2)
In the above formula (2), Y in H(Y|X) represents a random variable, X represents the given condition, and the expression as a whole represents the uncertainty of the random variable Y given the random variable X. x represents one specific given condition, x ∈ X indicates that x is one of all the given conditions belonging to the random variable X, p(x) represents the probability that the specific given condition x occurs, and H(Y|X=x) represents the information entropy of the random variable Y under the specific given condition X = x.
For ease of understanding, the sample labels of the malicious attributes of the plurality of code samples and the contents of their sample behavior record sets are taken as shown in Table 1 below as an example:
TABLE 1
Command execution | Code injection | Memory resident | Malicious code sample?
Present           | Present        | Present         | Yes
Absent            | Present        | Absent          | Yes
Present           | Absent         | Absent          | No
Each of the second, third, and fourth rows in the table corresponds to one code sample. Taking the second row as an example, it indicates that three types of sample code behaviors, namely "command execution", "code injection", and "memory residence", are recorded in the sample behavior record set of that code sample, and that the code sample is a malicious code sample.
Based on Table 1 above and formula (1) above, the information entropy of the code samples can be calculated (using base-10 logarithms) as: -(2/3) × log(2/3) - (1/3) × log(1/3) ≈ 0.2764. For the code behavior type "command execution", the random variable X in the above formula (2) takes the value "present" or "absent", which respectively represent the specific given condition that "a code behavior of the command execution type is recorded in the sample behavior record set" and the specific given condition that "no code behavior of the command execution type is recorded in the sample behavior record set".
Since only 2 of the 3 sample behavior record sets corresponding to Table 1 record target code behavior of the type "command execution", the first number is 2 and the second number is 1. Since the second row and the fourth row in Table 1 correspond to the sample behavior record sets in which target code behavior of the type "command execution" is recorded, where the code sample corresponding to the second row is a malicious code sample and the code sample corresponding to the fourth row is a non-malicious code sample, the third number is 1 and the fourth number is 1. Since only the sample behavior record set corresponding to the third row in Table 1 does not record target code behavior of the type "command execution", and the code sample corresponding to the third row is a malicious code sample, the fifth number is 1 and the sixth number is 0.
Based on the respective quantities, the values of the different factors in the above formula (2) can be calculated, where p(x = "present") = first number/(first number + second number) = 2/3, and p(x = "absent") = second number/(first number + second number) = 1/3. Referring to the calculation process of formula (1) above, H(Y|X = "present") = -[third number/(third number + fourth number)] × log[third number/(third number + fourth number)] - [fourth number/(third number + fourth number)] × log[fourth number/(third number + fourth number)] = -(1/2) × log(1/2) - (1/2) × log(1/2) ≈ 0.301.
H(Y|X = "absent") = -[fifth number/(fifth number + sixth number)] × log[fifth number/(fifth number + sixth number)] - [sixth number/(fifth number + sixth number)] × log[sixth number/(fifth number + sixth number)] = -1 × log(1) - 0 = 0. Combining the above values, the conditional entropy of the code behavior type "command execution" can be calculated as (2/3) × 0.301 + (1/3) × 0 ≈ 0.2.
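The worked example above can be reproduced with the following sketch, which computes the conditional entropy directly from the first to sixth numbers (base-10 logarithms assumed).

```python
# Sketch reproducing the worked example from the first..sixth numbers.
import math

def entropy(probabilities):
    return -sum(p * math.log10(p) for p in probabilities if p > 0)

def conditional_entropy(first, second, third, fourth, fifth, sixth):
    """Conditional entropy of one code behavior type from the six counts."""
    p_present = first / (first + second)      # p(x = "present")
    p_absent = second / (first + second)      # p(x = "absent")
    h_present = entropy([third / (third + fourth), fourth / (third + fourth)]) if third + fourth else 0.0
    h_absent = entropy([fifth / (fifth + sixth), sixth / (fifth + sixth)]) if fifth + sixth else 0.0
    return p_present * h_present + p_absent * h_absent

# first..sixth numbers for "command execution" in Table 1: 2, 1, 1, 1, 1, 0
print(round(conditional_entropy(2, 1, 1, 1, 1, 0), 4))   # ~0.2007, i.e. about 0.2
```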
In the above embodiment, the conditional entropies of different code behavior types can be calculated, and the information gain values of different code behavior types can be calculated according to the conditional entropies of different code behavior types, so that the contribution degree of different code behavior types to malicious code detection can be effectively judged through the information gain values. Therefore, the code behavior types with large contribution degree to malicious code detection can be screened out as much as possible in the initial behavior type feature set, and the code behavior types are used in the training process of the detection model, so that the training effect of the detection model can be improved, and the detection accuracy of the subsequent malicious code is improved.
The above mainly describes calculating the information gain value of each code behavior type, screening out the code behavior types with a large contribution degree to malicious code detection based on the information gain values, and constructing the target behavior type feature set from those code behavior types. In the actual implementation process, the target behavior type feature set can also be constructed in other ways. Thus, in some embodiments, the target behavior type feature set is obtained by a target behavior type feature set construction step, the target behavior type feature set construction step comprising:
acquiring an initial behavior type feature set and a plurality of sample behavior record sets corresponding to a plurality of code samples; according to the target code behaviors corresponding to each code behavior type in the initial behavior type feature set, referring to the sample code behaviors recorded in the plurality of sample behavior record sets, and determining the code sample with the target code behaviors recorded in the sample behavior record set as a target code sample in the plurality of code samples; calculating a first proportion of non-malicious code samples in the target code sample to non-malicious code samples in the plurality of code samples; calculating a second percentage of malicious code samples in the target code sample in malicious code samples in the plurality of code samples; according to the first proportion and the second proportion, acquiring the contribution degree of each code behavior type to malicious code detection; and screening the initial behavior type feature set according to the contribution degree of each code behavior type to malicious code detection to obtain a target behavior type feature set.
Specifically, for the related description of the initial behavior type feature set, reference may be made to the contents of the previous embodiments, which are not described herein in detail. For the code behavior type of "command execution", in conjunction with table 1 above, the computer device may determine that the code sample in the sample behavior record set, in which the target code behavior is recorded, is a code sample corresponding to the second line and the fourth line in table 1, and may be used as the target code sample. Wherein, the process of calculating the first ratio can refer to the following formula (3):
R_b = N_b(t) / S_b(t);    (3)
In the above formula (3), N_b(t) represents the number of non-malicious code samples among the target code samples, S_b(t) represents the number of non-malicious code samples among the plurality of code samples, and R_b represents the first proportion. In combination with Table 1 above, N_b(t) has a value of 1 and S_b(t) has a value of 1, so R_b is also 1.
The process of calculating the second proportion may refer to the following formula (4):
R_m = N_m(t) / S_m(t);    (4)
In the above formula (4), N_m(t) represents the number of malicious code samples among the target code samples, S_m(t) represents the number of malicious code samples among the plurality of code samples, and R_m represents the second proportion. In combination with Table 1 above, N_m(t) has a value of 1 and S_m(t) has a value of 2, so R_m is 0.5.
In combination with the above calculation process, the first ratio may reflect the ratio of the non-malicious code samples in which the target code behavior occurs in all the non-malicious code samples, and the second ratio may reflect the ratio of the malicious code samples in which the target code behavior occurs in all the malicious code samples. It can be understood that the larger the second ratio relative to the first ratio, the closer the code behavior type corresponding to the target code behavior is associated with the malicious code, and the target code behavior is easier to appear in the code behavior generated by the malicious code. Therefore, in an actual implementation process, the computer device may calculate a ratio between the second ratio and the first ratio, and use the ratio as a degree of contribution of a code behavior type corresponding to the target code behavior to malicious code detection. Of course, in the actual implementation process, the contribution degree may also be calculated in other manners, for example, negative logarithm values are respectively taken for the second proportion and the first proportion, and a negative logarithm value difference between the two is taken as the contribution degree of the code behavior type to the malicious code detection.
After the contribution degree of each code behavior type to malicious code detection is calculated, the computer device may screen the code behavior types in the initial behavior type feature set. The screening method may be a threshold comparison method, which is not specifically limited in this embodiment of the present application. Taking as an example the case where the ratio of the second proportion to the first proportion is used as the contribution degree of a code behavior type to malicious code detection: for any code behavior type, if the calculated ratio is 3 and the threshold is 2, the code behavior type can be added to the target behavior type feature set because the ratio is greater than the threshold.
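A sketch of this proportion-based screening is given below; the toy sample records, the threshold, and the helper names are assumptions for illustration.

```python
# Illustrative sketch: the toy records, threshold and helper names are assumptions.

def contribution_by_ratio(samples, behavior_type):
    """Contribution degree as the ratio of the second proportion (R_m) to the
    first proportion (R_b) for one code behavior type."""
    target = [s for s in samples if behavior_type in s["behaviors"]]
    non_malicious_total = sum(not s["malicious"] for s in samples)
    malicious_total = sum(s["malicious"] for s in samples)
    r_b = sum(not s["malicious"] for s in target) / non_malicious_total
    r_m = sum(s["malicious"] for s in target) / malicious_total
    return r_m / r_b if r_b else float("inf")   # behavior never seen in non-malicious samples

# toy samples loosely following Table 1
samples = [
    {"behaviors": {"command_execution", "code_injection", "memory_resident"}, "malicious": True},
    {"behaviors": {"code_injection"}, "malicious": True},
    {"behaviors": {"command_execution"}, "malicious": False},
]

THRESHOLD = 2.0
initial_types = ["command_execution", "code_injection", "memory_resident"]
target_set = [t for t in initial_types if contribution_by_ratio(samples, t) > THRESHOLD]
print(target_set)        # ['code_injection', 'memory_resident'] with these toy numbers
```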
In the embodiment, because the contribution degrees of different code behavior types to malicious code detection can be calculated, the code behavior types with larger contribution degrees to the malicious code detection are screened out as much as possible in the initial behavior type feature set based on the contribution degrees of different code behavior types, and the code behavior types are used in the training process of the detection model, so that the training effect of the detection model can be improved, and the detection accuracy of the subsequent malicious code is improved.
As mentioned in the previous embodiments, the code behavior types included in the target behavior type feature set are not invariant. It can be understood that, if the target behavior type feature set can be updated in a direction (toward the direction of producing the code behavior type with the contribution degree to the malicious code detection as large as possible) for many times, the training effect of the detection model can be improved. Based on this description, in some embodiments, the target behavior type feature set is updated by a target behavior type feature set updating step, which includes:
acquiring a behavior type feature set population, wherein the behavior type feature set population comprises at least one behavior type feature set; performing iterative update processing of the code behavior type on the behavior type feature set in the behavior type feature set population until an update termination condition is reached, and obtaining a finally updated behavior type feature set population; and screening out the target behavior type feature set from the finally updated behavior type feature set population.
The behavior type feature set population may be formed by arbitrarily combining known code behavior types, or may be formed by combining code behavior types that are judged, based on experience, to possibly be malicious.
It is understood that, in each iteration of the update process, the computer device may update the code behavior type of a part of the behavior type feature sets in the behavior type feature set population, and may also update the code behavior type of all the behavior type feature sets, which is not specifically limited in this embodiment of the present application. In addition, the "updating of the code behavior type to the behavior type feature set" mainly includes changing the code behavior type included in the behavior type feature set. The change manner may be the polymerization and the disassembly mentioned in the previous embodiment, and for how to change, reference may be made to the content of the previous embodiment, which is not described herein again.
In addition, the update termination condition may be set to yield a type of code behavior that contributes as much to the detection of malicious code as possible. Therefore, the update termination condition may be set in combination with the evaluation index associated with the contribution degree of malicious code detection, for example, the update termination condition may be set in combination with the above-mentioned information gain value, which is not specifically limited in the embodiment of the present application.
It will also be appreciated that the greater the number of iterative update processes, the more likely a new code behavior type will be produced. Under the condition of generating a new code behavior type, the iterative update processing process is equivalent to an outward exploration process, so that the code behavior type covered by the behavior type feature set group is not limited to the existing code behavior type any more, the diversity of the code behavior type is gradually expanded, and the possibility of searching a global optimal solution is provided. Thus, the update termination condition may also be set to enable the update process to be iterated a sufficient number of times, such as 2000 iterations.
After each iterative update processing is completed, the computer device can obtain the behavior type feature set population after that iterative update processing, and the behavior type feature set population obtained by the last iterative update processing can be used as the basis of the next iterative update processing. After the update termination condition is reached, the behavior type feature set population obtained by the computer device in the final iterative update processing is the finally updated behavior type feature set population. The computer device can screen the target behavior type feature set out of the finally updated behavior type feature set population arbitrarily, or the screening may be performed according to the above-mentioned evaluation index associated with the contribution degree to malicious code detection, which is not specifically limited in the embodiment of the present application.
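Since the embodiment does not prescribe the concrete update operators or evaluation index, the following is only a rough sketch of such an iterative update loop, with a dummy fitness function and a simple toggle-style change standing in for aggregation and decomposition.

```python
# Illustrative sketch of the iterative update loop. The mutation rule, the
# fitness() placeholder and all numeric settings are assumptions, not the
# embodiment's prescribed algorithm.
import random

random.seed(0)

ALL_TYPES = ["A", "A1", "A2", "B", "B1", "B2", "C", "D", "E"]

def fitness(feature_set):
    """Placeholder for an evaluation index associated with the contribution
    degree to malicious code detection (e.g. a summed information gain value)."""
    return len(feature_set)               # dummy score, only to make the sketch runnable

def update_feature_set(feature_set):
    """Change the code behavior types of one feature set by toggling one type
    (a crude stand-in for aggregation and decomposition)."""
    changed = set(feature_set)
    changed.symmetric_difference_update({random.choice(ALL_TYPES)})
    return changed

# initial behavior type feature set population (hypothetical)
population = [{"A", "B"}, {"C", "D"}, {"A1", "E"}]

for _ in range(2000):                                     # update termination condition: iteration count
    i = random.randrange(len(population))                 # update at least part of the population
    candidate = update_feature_set(population[i])
    if fitness(candidate) >= fitness(population[i]):      # keep the change only if it is no worse
        population[i] = candidate

target_behavior_type_feature_set = max(population, key=fitness)
print(target_behavior_type_feature_set)                   # screened-out target behavior type feature set
```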
In the above embodiment, the iterative update processing of the code behavior types is performed on the behavior type feature sets in the behavior type feature set population based on the update termination condition. The update termination condition can be set in combination with the evaluation index associated with the contribution degree to malicious code detection, so that a target behavior type feature set containing code behavior types with a larger contribution degree to malicious code detection can subsequently be screened out as far as possible, which can improve the training effect of the detection model and thus the detection accuracy for subsequent malicious code. Meanwhile, the detection performance of the detection model can be continuously improved through multiple iterative update processing passes. In addition, the update termination condition can also be set with the aim of expanding the diversity of code behavior types, so that the screened target behavior type feature set is no longer limited to the existing code behavior types, which provides the possibility of finding a globally optimal solution and can further improve the training effect of the detection model and the detection accuracy for subsequent malicious code.
The above embodiments mention that the combination of types of code behavior that may be judged to be malicious may be based on experience. Thus, in some embodiments, the behavior type feature set population is obtained by a behavior type feature set population construction step, the behavior type feature set population construction step includes:
acquiring an initial behavior type feature set and a plurality of sample behavior record sets of a plurality of code samples; for each code behavior type in the initial behavior type feature set, calculating the contribution degree of each code behavior type to malicious code detection according to the distribution of the code behavior corresponding to each code behavior type in a plurality of sample behavior record sets and the distribution of malicious code samples and non-malicious code samples in a plurality of code samples; screening an initial behavior type feature set according to the contribution degree of each code behavior type to malicious code detection, and obtaining screened code behavior types; and combining the screened code behavior types to obtain a behavior type characteristic set population.
Specifically, based on the distribution of the code behavior corresponding to each code behavior type in the plurality of sample behavior record sets and the distribution of malicious code samples and non-malicious code samples in the plurality of code samples, the computer device may calculate the conditional entropy of each code behavior type. Thus, as described in connection with the correlation calculation regarding the information gain values in the above embodiments, the information gain value of each code behavior type may be used as a degree of contribution to malicious code detection.
In addition, based on the distribution of the code behaviors corresponding to each code behavior type in the plurality of sample behavior record sets and the distribution of the malicious code samples and the non-malicious code samples in the plurality of code samples, the computer device can further calculate a first proportion of the non-malicious code samples with the target code behaviors in all the non-malicious code samples and a second proportion of the malicious code samples with the target code behaviors in all the malicious code samples. Thus, in conjunction with the description of the related calculation of the first and second ratios in the above embodiment, the computer device may calculate the degree of contribution of each code behavior type to malicious code detection. Of course, in the actual implementation process, besides the two ways described above, there may be other ways to calculate the degree of contribution of each code behavior type to malicious code detection, which is not specifically limited in the embodiment of the present application.
For the subsequent screening process, the computer device may select a preset number of code behavior types sorted from large to small according to the contribution degree as the screened code behavior types. In addition, the process of combining the screened code behavior types may be any combination, or may be a combination according to some rules, for example, the number of code behavior types included in each behavior type feature set in the behavior type feature set population is the same, which is not specifically limited in this embodiment of the present application.
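A possible sketch of the screening and combination just described, under the assumption that every behavior type feature set in the population contains the same number of code behavior types and that the combinations are drawn at random from the screened types; all names are illustrative.

```python
import random
from typing import Dict, List, Set

def build_population(contributions: Dict[str, float],
                     keep_top: int,
                     population_size: int,
                     set_size: int,
                     seed: int = 0) -> List[Set[str]]:
    """Screen the code behavior types with the largest contribution degree,
    then combine them into behavior type feature sets of equal size to form
    the behavior type feature set population."""
    rng = random.Random(seed)
    ranked = sorted(contributions, key=contributions.get, reverse=True)
    screened = ranked[:keep_top]                      # screened code behavior types
    return [set(rng.sample(screened, set_size)) for _ in range(population_size)]
```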
In this embodiment, before the iterative update processing is performed on the behavior type feature set population, code behavior types with as large a contribution degree as possible can be screened out according to the contribution degree of each code behavior type to malicious code detection, and the behavior type feature set population is constructed based on the screened code behavior types. Therefore, when the target behavior type feature set is subsequently screened out based on the iterative update processing of the behavior type feature set population, a target behavior type feature set containing code behavior types with a larger contribution degree can be screened out as much as possible, which improves the training effect of the detection model and thus the detection accuracy of subsequent malicious codes.
In some embodiments, the iterative update processing of the code behavior type is performed on the behavior type feature set in the behavior type feature set population, which includes:
selecting at least part of behavior type characteristic sets in the behavior type characteristic set population to be updated in the iteration; updating at least part of code behavior types in each selected behavior type feature set; and detecting each behavior type feature set in the behavior type feature set population subjected to the updating processing of the code behavior type based on the trained detection model, and determining the behavior type feature set population subjected to the iteration updating processing according to a corresponding detection result.
It can be understood that, in each iterative update processing, the computer device may not take every behavior type feature set in the behavior type feature set population as a target of the update processing, but may select at least part of the behavior type feature sets as the targets of this iteration, so that before each iterative update processing the behavior type feature set population always retains some existing code behavior types. Similarly, within each selected behavior type feature set, at least part of the code behavior types are selected for the update processing. For the update processing itself, reference may be made to the previous embodiments, which are not described herein again.
In addition, in the actual implementation process, after each iteration update processing, the computer device can directly use the result of the iteration update processing as the behavior type feature set population after the iteration update processing. Of course, the computer device may further process the result of the iterative update processing, for example, determine the behavior type feature set population after the iterative update processing according to the detection result of each behavior type feature set in the behavior type feature set population.
The "detection result of detecting each behavior type feature set in the behavior type feature set population" may actually be a detection result of detecting based on a code sample with reference to the behavior type feature set in the behavior type feature set population. The detection result can be used as a basis for further processing, because the sample label of the code sample is known, and the degree of contribution of the referenced behavior type feature set to the malicious code detection can be evaluated by comparing the sample label with the detection result. It should be noted that, in the embodiment of the present application, the "trained detection model" is mainly used, and the untrained detection model cannot provide a relatively accurate detection result, and thus cannot be used as an objective basis for evaluating the contribution degree of the behavior type feature set to the malicious code detection.
The computer device can calculate the correct detection ratio corresponding to each behavior type feature set which is referred to, and the correct detection ratio is used as a basis for further processing. Of course, in the actual implementation process, other bases and manners may be further adopted for further processing, and this is not specifically limited in the embodiments of the present application.
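The per-iteration flow described above can be summarized in the following skeleton; `evaluate` (for example, the correct detection ratio measured with the trained detection model) and `update` (the update processing of code behavior types) are placeholders for the steps described in this and the previous embodiments, and selecting half of the population per round is only an illustrative choice.

```python
def iterate_population(population, evaluate, update,
                       max_rounds=50, target_score=0.99, keep=None):
    """Skeleton of the iterative update processing: in each round, part of the
    behavior type feature sets is selected and updated, every feature set is
    scored with the trained detection model (e.g. by its correct detection
    ratio), and only the best ones are retained."""
    keep = keep or len(population)
    for _ in range(max_rounds):
        selected = population[: max(1, len(population) // 2)]  # select at least part of the sets
        updated = [update(feature_set) for feature_set in selected]
        candidates = population + updated
        population = sorted(candidates, key=evaluate, reverse=True)[:keep]
        if evaluate(population[0]) >= target_score:            # an update termination condition
            break
    return population
```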
In the above embodiment, after each iterative update processing, the result of the iterative update processing is further processed according to the contribution degree of each behavior type feature set in that result to malicious code detection, so as to determine the behavior type feature set population after the iterative update processing. Behavior type feature sets with a larger contribution degree to malicious code detection can therefore be retained as much as possible, and the behavior type feature set population is constructed on the basis of them. Consequently, when the target behavior type feature set is subsequently screened out through iterative update processing of the behavior type feature set population, a target behavior type feature set containing code behavior types with a larger contribution degree can be screened out as much as possible, which improves the training effect of the detection model and thus the detection accuracy of subsequent malicious codes. Meanwhile, the detection performance of the detection model can be continuously improved through multiple iterative update processing processes. In addition, because of the further processing in each iterative update, the data volume of the next iterative update processing can be reduced, which saves processing resources of the computer device and improves processing performance.
In some embodiments, detecting each behavior type feature set in the behavior type feature set population after the update processing of the code behavior type based on the trained detection model, and determining the behavior type feature set population after the update processing of the current iteration according to the corresponding detection result includes:
acquiring a sample behavior feature group set and a sample label set corresponding to each behavior type feature set in the behavior type feature set population subjected to the update processing of the code behavior type, wherein each sample behavior feature group set is generated with reference to the corresponding behavior type feature set; inputting each sample behavior feature group in each sample behavior feature group set into the trained detection model to obtain a detection result set corresponding to each sample behavior feature group set; and screening at least part of the behavior type feature sets from the behavior type feature set population subjected to the update processing of the code behavior type according to the detection result set and the sample label set corresponding to each behavior type feature set.
Specifically, for any behavior type feature set, the sample behavior feature set corresponding to the behavior type feature set may be generated based on the code behavior type of the sample code behavior recorded by the sample behavior recording set of the code sample with reference to the behavior type feature set. For a specific implementation process, reference may be made to the contents of the previous embodiments, which are not described herein again.
According to the detection result set and the sample label set corresponding to each behavior type feature set, the computer device can calculate the loss function value corresponding to each behavior type feature set. And according to the loss function value corresponding to each behavior type feature set, screening the behavior type feature set population subjected to the updating processing of the code behavior type. The loss function used for calculating the loss function value may be a regression loss function, a mean square error loss function, or a cross entropy loss function, and the like, which is not specifically limited in this embodiment of the present application. Taking the example that the smaller the loss function value is, the better the detection effect of the detection model is, the loss function values corresponding to each behavior type feature set in the behavior type feature set population can be sorted from small to large, and a preset number of behavior type feature sets are selected as at least part of screened behavior type feature sets.
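A minimal sketch of the loss-based screening described above, taking binary cross entropy as the loss function; the detection result set for each behavior type feature set is assumed to be a list of predicted malicious probabilities aligned with the sample label set, and all names are illustrative.

```python
import math
from typing import Dict, List, Tuple

def cross_entropy(probs: List[float], labels: List[int], eps: float = 1e-12) -> float:
    """Mean binary cross entropy between the detection results (predicted
    malicious probabilities) and the sample labels (1 = malicious)."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)

def screen_by_loss(results: Dict[str, Tuple[List[float], List[int]]], keep: int) -> List[str]:
    """Keep the `keep` behavior type feature sets with the smallest loss;
    keys identify feature sets, values are (detection result set, sample label set)."""
    losses = {set_id: cross_entropy(p, y) for set_id, (p, y) in results.items()}
    return sorted(losses, key=losses.get)[:keep]
```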
In the above embodiment, after each iterative update processing, the result of the iterative update processing is further screened according to the contribution degree of each behavior type feature set in that result to malicious code detection, so as to determine the behavior type feature set population after the iterative update processing. Behavior type feature sets with a larger contribution degree to malicious code detection can therefore be screened out as much as possible, and the behavior type feature set population is constructed on the basis of them. Consequently, when the target behavior type feature set is subsequently screened out through iterative update processing of the behavior type feature set population, a target behavior type feature set containing code behavior types with a larger contribution degree can be screened out as much as possible, which improves the training effect of the detection model and thus the detection accuracy of subsequent malicious codes. Meanwhile, the detection performance of the detection model can be continuously improved through multiple iterative update processing processes. In addition, because of the further screening in each iterative update, the data volume of the next iterative update processing can be reduced, which saves processing resources of the computer device and improves processing performance.
The embodiment mainly provides a process of calculating a loss function value corresponding to each behavior type feature set based on a loss function of a detection model, and screening a behavior type feature set population according to the loss function value. Of course, in the actual implementation process, other modes can be adopted for screening. Therefore, in some embodiments, according to the detection result set and the sample label set corresponding to each behavior type feature set, at least part of behavior type feature sets are screened from the behavior type feature set population subjected to the update processing of the code behavior type, including:
acquiring, according to the detection result set and the sample label set corresponding to each behavior type feature set, a detection effect evaluation value of that behavior type feature set when used for malicious code detection; and screening out, according to the detection effect evaluation value of each behavior type feature set, a preset number of top-ranked behavior type feature sets, sorted in descending order of detection effect evaluation value, from the behavior type feature set population subjected to the update processing of the code behavior type.
Specifically, for any behavior type feature set, according to the detection result set and the sample tag set corresponding to the behavior type feature set, it can be actually determined how many code samples are correctly detected, how many code samples are originally malicious code samples but are falsely detected as non-malicious code samples, and how many code samples are originally non-malicious code samples but are falsely detected as malicious code samples when a plurality of code samples are detected by the detection model on the premise that the behavior type feature set is used as a reference.
It is obvious that each of the above mentioned quantities can be directly or indirectly used for evaluating the detection effect, and can be used as a basis for calculating the evaluation value of the detection effect. For convenience of understanding, for any behavior type feature set, taking an example that a detection result set corresponding to the behavior type feature set is obtained based on a plurality of code samples, the detection effect evaluation value of the behavior type feature set may refer to the following calculation methods:
(1) Precision
The specific calculation process can refer to the following formula (5):
$$\mathrm{Precision} = \frac{N_{TP}}{N_{TP} + N_{FP}} \tag{5}$$
In the above formula (5), \(N_{TP}\) represents the number of code samples, among the plurality of code samples, that are identified as malicious code samples and are in fact malicious code samples, and \(N_{FP}\) represents the number of code samples, among the plurality of code samples, that are identified as malicious code samples but are in fact non-malicious code samples. Precision represents the precision rate and can be directly used as a detection effect evaluation value.
(2) Recall rate
The specific calculation process can refer to the following formula (6):
$$\mathrm{Recall} = \frac{N_{TP}}{N_{TP} + N_{FN}} \tag{6}$$
In the above formula (6), \(N_{TP}\) has the same definition as in formula (5), and \(N_{FN}\) represents the number of code samples, among the plurality of code samples, that are identified as non-malicious code samples but are in fact malicious code samples. Recall represents the recall rate and can be directly used as a detection effect evaluation value.
(3) F1 score
The specific calculation process can refer to the following formula (7):
$$F_1\ \mathrm{score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{7}$$
In the above formula (7), \(F_1\) score represents the F1 score, which can be directly used as a detection effect evaluation value. The F1 score is an index used in statistics to measure the accuracy of a binary classification model; it can be regarded as a harmonic mean of the model's precision and recall, taking both the precision and the recall of the classification model into account.
(4) Selection probability
The specific calculation process can refer to the following formula (8):
$$P(i) = \frac{Fit_i}{\sum_{j=1}^{n} Fit_j} \tag{8}$$
In the above formula (8), \(Fit_i\) represents the F1 score corresponding to the i-th behavior type feature set in the behavior type feature set population, n represents the total number of behavior type feature sets in the population, and P(i) represents the selection probability of the i-th behavior type feature set, which can be directly used as a detection effect evaluation value.
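The four evaluation values above can be computed as in the following sketch, which mirrors formulas (5)-(8); the counts N_TP, N_FP and N_FN are assumed to have already been tallied from the detection result set and the sample label set.

```python
from typing import List, Tuple

def detection_metrics(n_tp: int, n_fp: int, n_fn: int) -> Tuple[float, float, float]:
    """Precision, recall and F1 score as in formulas (5)-(7); zero
    denominators are mapped to 0.0 to keep the sketch total."""
    precision = n_tp / (n_tp + n_fp) if n_tp + n_fp else 0.0
    recall = n_tp / (n_tp + n_fn) if n_tp + n_fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def selection_probabilities(f1_scores: List[float]) -> List[float]:
    """Selection probability of each feature set as in formula (8): its F1
    score divided by the sum of F1 scores over the whole population."""
    total = sum(f1_scores)
    if total == 0:
        return [1.0 / len(f1_scores)] * len(f1_scores)
    return [fit / total for fit in f1_scores]
```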
Note that four types of detection effect evaluation values are provided above; together with the loss function value provided in the previous embodiment, five types of detection effect evaluation values are available in total. In actual implementation, any one of them can be selected for screening. For the four types of detection effect evaluation values provided above, in general, the larger the value, the better the detection effect. Therefore, in the embodiment of the present application, the computer device may sort the detection effect evaluation values in descending order and select a preset number of top-ranked behavior type feature sets.
Of course, in the actual implementation process, the computer device may also combine several of the above five types of detection effect evaluation values for screening, which is not specifically limited in this embodiment of the present application. For example, all five detection effect evaluation values may be selected simultaneously, each detection effect evaluation value normalized, the normalization results corresponding to the five detection effect evaluation values weighted and summed, and the final summation result used as the final detection effect evaluation value.
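A possible sketch of the weighted combination just described, assuming min-max normalization of each evaluation value across the population; the weights are assumed to be chosen according to requirements.

```python
from typing import List

def combined_evaluation(metric_values: List[List[float]], weights: List[float]) -> List[float]:
    """Min-max normalize each evaluation value across the population, then
    take a weighted sum as the final detection effect evaluation value;
    metric_values[k][i] is evaluation value k for the i-th feature set."""
    n_sets = len(metric_values[0])
    combined = [0.0] * n_sets
    for values, weight in zip(metric_values, weights):
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1.0
        for i, value in enumerate(values):
            combined[i] += weight * (value - lo) / span
    return combined
```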
In the above embodiment, after each iterative update processing, the detection effect of each behavior type feature set in the result of the current iterative update processing when used for malicious code detection can be evaluated, and the result of the iterative update processing is further screened accordingly to determine the behavior type feature set population after the iterative update processing. Behavior type feature sets with a good detection effect for malicious code detection can therefore be screened out as much as possible, and the behavior type feature set population is constructed on the basis of them. Consequently, when the target behavior type feature set is subsequently screened out through iterative update processing of the behavior type feature set population, a target behavior type feature set containing code behavior types with a better detection effect can be screened out as much as possible, which improves the training effect of the detection model and thus the detection accuracy of subsequent malicious codes. Meanwhile, the detection performance of the detection model can be continuously improved through multiple iterative update processing processes. In addition, the behavior type feature sets can be screened based on the detection effect evaluation values to reduce the data volume of the next iterative update processing, which saves processing resources of the computer device and improves processing performance.
As can be seen from the description of the foregoing embodiments, the update termination condition may be set in consideration of expanding the diversity of code behavior types, or in combination with an evaluation index associated with the contribution degree to malicious code detection. Based on this description, in some embodiments, the update termination condition includes that a behavior type feature set whose detection effect evaluation value is larger than a preset threshold exists in the behavior type feature set population subjected to the update processing of the code behavior type.
It can be understood that as the iterative update processing process advances, the detection effect corresponding to the behavior type feature set in the behavior type feature set population should be better and better, and the contribution degree to the detection of the malicious code should also be higher and higher. Therefore, in the embodiment of the present application, the update termination condition may be set to that there is a behavior type feature set whose detection effect evaluation value is greater than a preset threshold value in the updated behavior type feature set population. Of course, in the actual implementation process, the update termination condition may also be set to include other items by referring to the relevant descriptions in the embodiment of the present application, and this is not specifically limited in the embodiment of the present application.
In the above embodiment, the iterative update processing of the code behavior type is performed on the behavior type feature set in the behavior type feature set population based on the update termination condition. The updating termination condition can be set in combination with the detection effect, so that the target behavior type characteristic set containing the code behavior type with better detection effect on the malicious code can be screened out as much as possible subsequently, the training effect of the detection model can be improved, and the detection accuracy of the subsequent malicious code can be improved.
In some embodiments, the update processing includes at least one of: aggregating different code behavior types, deleting a code behavior type, adding a code behavior type, disassembling a code behavior type, or altering a code behavior type.
Specifically, for how to aggregate different code behavior types and how to disassemble a code behavior type, reference may be made to the related description in the previous embodiment, which is not repeated here. Deleting a code behavior type refers to directly deleting a code behavior type from the behavior type feature set; in the actual implementation process, the code behavior type to be deleted may be selected randomly from the behavior type feature set. Adding a code behavior type refers to directly adding a code behavior type to the behavior type feature set; in the actual implementation process, the code behavior type to be added may be selected randomly from the known code behavior types, which is not specifically limited in this embodiment of the present application.
Altering a code behavior type may refer to replacing a code behavior type present in the behavior type feature set. As stated in connection with the previous embodiment, a code behavior type may be represented by binary coding. Therefore, when a code behavior type is altered, in addition to direct replacement, a mutation can be applied to the information on the feature bits of the binary code so as to change the original code behavior type. The mutation may affect one or more of the feature bits, which is not specifically limited in this embodiment of the present application.
In addition, which feature bit is mutated may be completely random, or may be chosen by roulette wheel selection, which is not specifically limited in the embodiment of the present application. Finally, it should be considered that if the mutation is completely random, the mutated binary code may not correspond to any known code behavior type, in which case the mutation result is meaningless. Therefore, in the actual implementation process, a mutation boundary limiting condition may also be set, for example, ensuring that a known code behavior type corresponds to the mutation result, which is not specifically limited in the embodiment of the present application.
For example, suppose the binary code of a certain code behavior type is "001" and the information on two feature bits is mutated: the information on the second feature bit from the left is mutated from "0" to "1", and the information on the third feature bit from the left is mutated from "1" to "0". As can be seen from the description of the binary coding example in the previous embodiment, this mutation actually changes the code behavior type from "command execution" to "code injection".
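A minimal sketch of the alteration-by-mutation just described, with the mutation boundary limiting condition enforced by retrying until the mutated binary code corresponds to a known code behavior type; the binary encodings listed are hypothetical and only echo the example above.

```python
import random

# Hypothetical binary encodings of known code behavior types, echoing the
# example above; the real encoding is defined in the earlier embodiments.
KNOWN_CODES = {"001": "command execution", "010": "code injection",
               "100": "memory resident", "011": "credential collection"}

def mutate_code(code: str, n_bits: int = 2, seed: int = 0) -> str:
    """Flip the information on n_bits randomly chosen feature bits, accepting
    only a result that still corresponds to a known code behavior type
    (the mutation boundary limiting condition)."""
    rng = random.Random(seed)
    for _ in range(100):                       # retry until the boundary condition holds
        bits = list(code)
        for pos in rng.sample(range(len(bits)), n_bits):
            bits[pos] = "1" if bits[pos] == "0" else "0"
        candidate = "".join(bits)
        if candidate in KNOWN_CODES:
            return candidate
    return code                                # fall back to the original type

# mutate_code("001") may, for example, yield "010",
# changing "command execution" to "code injection".
```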
In the embodiment, the code behavior types are updated in different updating processing modes, so that the diversity of the code behavior types can be expanded, the screened target behavior type feature set is not limited to the existing code behavior types, the possibility of finding the global optimal solution is provided, the training effect of the detection model can be further improved, and the detection accuracy of the subsequent malicious codes is improved.
In the foregoing embodiment, it is mentioned that the code behavior type corresponding to the sample code behavior recorded in the sample behavior record set may be changed toward a direction conforming to the code behavior type in the target behavior type feature set. Based on the description, in some embodiments, determining a sample behavior type belonging to the target behavior type feature set based on the code behavior type of the sample code behavior recorded by the sample behavior record set with reference to the target behavior type feature set includes:
reconstructing at least a part of sample code behaviors recorded in the sample behavior record set to obtain a reconstructed sample behavior type; and determining the sample behavior type belonging to the target behavior type feature set in the reconstructed sample behavior type by referring to the target behavior type feature set.
Specifically, "reconstructing" mainly means reconstructing a code behavior type corresponding to at least a part of sample code behaviors recorded in the sample behavior record set. Wherein, the data before reconstruction is the code behavior type, and the result after reconstruction is the reconstruction sample behavior type. Data attributes are not changed before and after reconstruction, and both are code behavior types. In addition, the reconfiguration manner may include aggregation, disassembly, and modification of the code behavior types, which is not specifically limited in this embodiment of the present application. The aggregation may not be limited to the aggregation of 2 code behavior types, and may be more, and this is not specifically limited in the embodiment of the present application.
It should be noted that the embodiment of the present application refers to reconstructing "at least a part of" the sample code behaviors recorded in the sample behavior record set because some sample code behaviors in the sample behavior record set may not be "qualified" for reconstruction. In connection with the example in the previous embodiment, suppose the code behavior types covered by the sample behavior record set include A1, A2, B, C, and E, and the code behavior types included in the target behavior type feature set include A, B1, C, and D, where A is formed by aggregating A1 and A2, and B is formed by aggregating B1 and B2. Thus, with reference to the target behavior type feature set, only A1, A2, and B may be reconstructed, while C and E are not eligible for reconstruction, mainly because, with reference to the target behavior type feature set, there is no actual reconstruction path for C and E.
In the above example, the reconstructed sample behavior types of the sample behavior record set are A, B1, B2, C, and E, respectively, and the code behavior types included in the target behavior type feature set include A, B1, C, and D, so that the computer device can determine that, among all the reconstructed sample behavior types, the sample behavior types belonging to the target behavior type feature set are A, B1 and C, respectively.
In the above embodiment, in the process of obtaining the target behavior type feature set, different update processing manners are adopted to update the code behavior type, and in order to ensure consistency between the input item and the target behavior type feature set, at least a part of the sample code behaviors recorded in the sample behavior record set need to be reconstructed. The updating process of the code behavior type is mainly used for expanding the diversity of the code behavior type. Therefore, the reconstruction process can be matched with the updating process of the code behavior type, so that the obtained target behavior type characteristic set is not limited to the existing code behavior type any more, the possibility of finding a global optimal solution is provided, the training effect of the detection model can be further improved, and the detection accuracy of the subsequent malicious codes is improved.
In some embodiments, reconstructing at least a portion of the sample code behavior recorded in the set of sample behavior records comprises:
acquiring a behavior reconstruction file, wherein the behavior reconstruction file is used for recording a code behavior type and a reconstruction processing mode required by reconstruction when the code behavior is reconstructed to the code behavior type included in the target behavior type feature set; and under the condition that at least one part of sample code behaviors recorded in the sample behavior record set belong to code behavior types required by reconstruction in the behavior reconstruction file, reconstructing at least one part of sample code behaviors according to a corresponding reconstruction processing mode recorded in the behavior reconstruction file.
The behavior reconstruction file may be obtained along with the iterative update processing procedure mentioned in the previous embodiment. Specifically, during the iterative update processing, the behavior type feature sets in the behavior type feature set population are subjected to the iterative update processing of the code behavior type, and the update processing may include deletion, addition, aggregation, disassembly, and alteration. Obviously, the deletion and addition update processing modes are irrelevant to reconstruction; reconstruction mainly involves aggregation, disassembly, and alteration.
During the iterative update processing, which code behavior types are aggregated into a new code behavior type may be recorded, together with the "aggregation" update processing mode. Conversely, if a certain code behavior type included in the target behavior type feature set needs to be reconstructed subsequently, the previously recorded content indicates which code behavior types that code behavior type requires and which reconstruction processing mode (corresponding to the previously recorded update processing mode) is needed to reconstruct it. Therefore, a behavior reconstruction file can be formed by recording during the iterative update processing, and each behavior type feature set formed during the iterative update processing corresponds to one behavior reconstruction file. The target behavior type feature set therefore also corresponds to a behavior reconstruction file.
For convenience of understanding, in combination with the example in the previous embodiment, the code behavior types covered by the sample behavior record set include a1, a2, B, C and E, and two reconstruction paths are recorded in the behavior reconstruction file corresponding to the target behavior type feature set. The two reconstruction paths are respectively:
"1, reconstruction processing mode: polymerizing; the required code behavior types include a1 and a2, the reconstruction result is the code type a ", and" 2, reconstruction processing mode: disassembling; the required code behavior types include B, with the reconstruction results being code types B1 and B2 ". The content format in the behavior reconstruction file presented above is only one example, and may be set according to requirements in an actual implementation process, which is not specifically limited in this embodiment of the present application.
Thus, after acquiring the behavior reconstruction file, the computer device can determine that the code behavior types recorded in the sample behavior record set that belong to the code behavior types required for reconstruction in the behavior reconstruction file are A1, A2, and B. The computer device can then aggregate A1 and A2 into A and disassemble B into B1 and B2 according to the corresponding reconstruction processing modes recorded in the behavior reconstruction file.
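A minimal sketch of applying a behavior reconstruction file, using the two reconstruction paths of the example above; the file format shown is hypothetical, as the concrete format is left open.

```python
from typing import List, Set

# Hypothetical behavior reconstruction file with the two reconstruction paths
# of the example above: each entry records the reconstruction processing mode,
# the required code behavior types and the reconstruction result.
RECONSTRUCTION_FILE = [
    {"mode": "aggregate", "required": {"A1", "A2"}, "result": {"A"}},
    {"mode": "disassemble", "required": {"B"}, "result": {"B1", "B2"}},
]

def reconstruct(sample_types: Set[str], reconstruction_file: List[dict]) -> Set[str]:
    """Apply every reconstruction path whose required code behavior types are
    covered by the sample behavior record set; types without a reconstruction
    path (e.g. C and E above) are kept unchanged."""
    result = set(sample_types)
    for path in reconstruction_file:
        if path["required"] <= result:
            result -= path["required"]
            result |= path["result"]
    return result

# reconstruct({"A1", "A2", "B", "C", "E"}, RECONSTRUCTION_FILE)
# -> {"A", "B1", "B2", "C", "E"}
```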
In the above embodiment, in the process of obtaining the target behavior type feature set, different update processing manners are adopted to update the code behavior type, and in order to ensure consistency between the subsequent sample behavior feature set as the input item and the target behavior type feature set, a reconstruction process needs to be performed. The updating processing of the code behavior types is mainly performed to expand the diversity of the code behavior types, so that the reconstruction process can be matched with the updating processing process of the code behavior types, the obtained target behavior type characteristic set is not limited to the existing code behavior types, the possibility of finding the global optimal solution is provided, the training effect of the detection model can be further improved, and the detection accuracy of the subsequent malicious codes is improved.
In some embodiments, the sample behavior feature group has feature bits that correspond one-to-one to the code behavior types in the target behavior type feature set; in the sample behavior feature group, the feature bits corresponding to the sample behavior types belonging to the target behavior type feature set take a first value, and the remaining feature bits, other than those taking the first value, take a second value.
The first value and the second value are mainly used to distinguish the two cases from each other; their specific values in the actual implementation process may be set according to requirements, for example, the first value is 1 and the second value is 0, which is not specifically limited in this embodiment of the present application.
For ease of understanding, assume that the sample behavior feature group has 4 feature bits and that the information on each feature bit is represented by 1 or 0. In the target behavior type feature set, the code behavior type corresponding to the first feature bit is command execution, the code behavior type corresponding to the second feature bit is code injection, the code behavior type corresponding to the third feature bit is memory resident, and the code behavior type corresponding to the fourth feature bit is credential collection.
If, based on the code behavior types of the sample code behaviors recorded in the sample behavior record set, the sample behavior types belonging to the target behavior type feature set are determined to be command execution and credential collection, the computer device may determine that the feature bit corresponding to "command execution", that is, the first feature bit, takes the value 1; that the feature bit corresponding to "credential collection", that is, the fourth feature bit, takes the value 1; and that the remaining feature bits take the value 0. Thus, the finally formed sample behavior feature group is "1001".
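A minimal sketch of forming the sample behavior feature group from the example above; the ordering of feature bits follows the order of code behavior types in the target behavior type feature set, and the names are illustrative.

```python
from typing import List, Set

def build_feature_group(target_feature_set: List[str], sample_types: Set[str]) -> str:
    """One feature bit per code behavior type in the target behavior type
    feature set, in order: '1' (first value) if the sample exhibits that type,
    '0' (second value) otherwise."""
    return "".join("1" if t in sample_types else "0" for t in target_feature_set)

# Following the example above:
# build_feature_group(
#     ["command execution", "code injection", "memory resident", "credential collection"],
#     {"command execution", "credential collection"},
# ) == "1001"
```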
In the above embodiment, the sample behavior feature group including the feature bits can effectively represent which sample code behaviors subordinate to the code behavior type in the target behavior type feature set are finally generated by the code training sample. The sample behavior characteristic group only occupies the storage space of a plurality of bits, so that the occupied storage resources can be reduced as much as possible. Meanwhile, the storage space of a plurality of bits is only occupied, so that the data volume of subsequent processing is reduced, and the processing resource of the computer equipment is saved.
The above embodiments mainly concern the training process of one detection model. In the actual implementation process, multiple detection models may be trained simultaneously, and a detection model with a better detection effect is then selected from all the trained detection models. Based on this description, in some embodiments, the number of detection models and the number of target behavior type feature sets are both multiple, and each detection model corresponds to one target behavior type feature set; after training the detection models based on the sample behavior feature groups and the sample labels, the method further includes:
testing each trained detection model through a code sample to obtain a detection effect evaluation value of a corresponding target behavior type feature set; and screening out a detection model for detecting the malicious codes from all the trained detection models according to the detection effect evaluation value of each target behavior type feature set.
The "code sample" mentioned in the embodiment of the present application may be the same code sample as the "code sample" mentioned in the previous embodiment, or may be a different code sample, and this is not specifically limited in the embodiment of the present application. As can be seen from the above description of the embodiments, each trained detection model corresponds to one target behavior type feature set. Thus, the computer device can calculate the detection effect evaluation value of each target behavior type feature set. As to the calculation process of the detection effect evaluation value, reference may be made to the calculation process of the "detection effect evaluation value of the behavior type feature set when used for detecting the malicious code" in the foregoing embodiment, and details are not described here again. The computer equipment can determine the target behavior type feature set with the best detection effect by comparing the detection effect evaluation value of each target behavior type feature set, and the trained detection model corresponding to the target behavior type feature set is the screened detection model for detecting the malicious codes.
It should be noted that the code training samples used in the previous training process for each of the above-mentioned trained detection models may be the same samples in the same batch, or may be different samples. In addition, the code samples used in the test process mentioned in the embodiment of the present application may be the same code sample or different code samples used for each trained detection model, which is not specifically limited in this embodiment of the present application. Since the code training samples may be generated continuously, the detection model may be iteratively trained periodically in an actual implementation process.
In the above embodiment, because a plurality of detection models can be trained simultaneously, and the detection model for detecting the malicious code can be screened out from all the trained detection models based on the detection effect evaluation value corresponding to each trained detection model, the detection model with the better detection effect can be screened out as much as possible and put into use in the subsequent detection process of the malicious code, and the detection accuracy of the subsequent malicious code can be improved.
The above embodiment is mainly a process of training a detection model of a malicious code, and in an actual implementation process, the detection model can be applied to realize detection of the malicious code. Thus, in some embodiments, as shown in fig. 3, a method for detecting malicious code is provided, which is described by taking an example that the method is applied to a computer device (the computer device may specifically be a terminal or a server in fig. 1), and includes the following steps:
302. and acquiring a behavior record set of the target code, wherein the behavior record set records the code behavior generated after the target code runs.
Specifically, the computer device may run the target code in a closed behavior awareness environment, where the behavior awareness environment is configured to record the code behaviors generated after the target code runs. The computer device may compose the behavior record set of the target code from the code behaviors recorded by the behavior awareness environment.
304. And acquiring a target behavior type characteristic set, wherein the target behavior type characteristic set comprises a code behavior type required when the trained detection model detects the malicious code.
As can be seen from the contents in the related embodiments of the training process, the trained detection models all correspond to a target behavior type feature set. Therefore, when the computer equipment acquires the trained detection model, the corresponding target behavior type characteristic set can be acquired at the same time. How the target behavior type feature set is generated may refer to the content in the related embodiment of the training process, and details are not described here.
306. And determining the target behavior type belonging to the target behavior type feature set based on the code behavior type of the code behavior recorded by the behavior record set by referring to the target behavior type feature set.
308. And generating a behavior feature group of the target code based on the target behavior type belonging to the target behavior type feature group.
Specifically, the implementation process of step 306 may refer to the implementation process of step 206 in the embodiment corresponding to fig. 2, and the implementation process of step 308 may refer to the implementation process of step 208 in the embodiment corresponding to fig. 2, which is not described herein again.
310. And detecting the malicious codes based on the behavior feature group through the trained detection model to obtain the malicious attributes of the target codes.
Specifically, the computer device inputs the behavior feature set to the trained detection model, and then the malicious attribute of the target code can be output. The malicious attribute can have two results, namely malicious code and non-malicious code.
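Putting steps 302-310 together, the following sketch composes the reconstruct() and build_feature_group() helpers sketched earlier; run_in_sandbox stands for the closed behavior awareness environment and is assumed to return the set of code behavior types recorded for the target code, and model.predict stands for the trained detection model.

```python
def detect(target_code, run_in_sandbox, reconstruction_file, target_feature_set, model):
    """End-to-end sketch of steps 302-310."""
    recorded_types = run_in_sandbox(target_code)                            # step 302: behavior record set
    reconstructed = reconstruct(recorded_types, reconstruction_file)        # step 306: reconstructed behavior types
    feature_group = build_feature_group(target_feature_set, reconstructed)  # step 308: behavior feature group
    return model.predict(feature_group)                                     # step 310: malicious attribute
```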
In the malicious code detection method above, a behavior record set of the target code is obtained; with reference to the target behavior type feature set, a behavior feature group of the target code is generated based on the code behavior types of the code behaviors recorded in the behavior record set; and the target code is detected by the detection model based on the behavior feature group. Because malicious code behaviors are generated only while malicious code executes, whether a code behavior is malicious directly reflects whether the code that generates it is malicious. Using the code behaviors generated during code execution both to train the detection model and as the object of detection therefore cannot be bypassed in the way file features of code can, which improves the training effect of the detection model and the detection accuracy of subsequent malicious codes.
In addition, the behavior feature group is generated with reference to the target behavior type feature set and based on the behavior record set, and the target behavior type feature set is formed based on code behavior types that may be malicious. The behavior feature group can therefore reflect, as far as possible, the degree of maliciousness of the code behaviors in the behavior record set, so that using the behavior feature group as an input item of the detection model improves the detection accuracy of malicious codes.
Finally, detecting code with the detection model avoids the continuously growing storage occupied by a frequently updated malicious code feature library, as happens with file feature detection; only the samples need to be updated to retrain the detection model, so the detection model does not occupy too many storage resources, and storage resources can be saved. Meanwhile, as a malicious code feature library gradually grows, the number of lookups in subsequent detection gradually increases, which affects detection efficiency, and the excessive processing resources occupied affect system performance. With the detection model, a prediction result can be obtained through a single input-output process, so detection efficiency can be improved and the influence on system performance reduced.
In some embodiments, determining, with reference to the target behavior type feature set, a target behavior type belonging to the target behavior type feature set based on a code behavior type of a code behavior recorded by the behavior record set includes:
reconstructing at least one part of code behaviors recorded in the behavior record set to obtain a reconstructed behavior type; and determining the target behavior type belonging to the target behavior type feature set in the reconstructed behavior type by referring to the target behavior type feature set.
Specifically, when reconstructing at least a part of code behaviors recorded in the behavior record set, the computer device may be implemented by the following processes: acquiring a behavior reconstruction file, wherein the behavior reconstruction file is used for recording a code behavior type and a reconstruction processing mode required by reconstruction when the code behavior is reconstructed to the code behavior type included in the target behavior type feature set; and under the condition that at least one part of code behaviors recorded in the behavior record set belong to the code behavior type required by reconstruction in the behavior reconstruction file, reconstructing at least one part of code behaviors according to the corresponding reconstruction processing mode recorded in the behavior reconstruction file. The content in the related embodiment of the training process may be referred to in the specific reconstruction process, and is not described herein again.
In the above embodiment, in the process of obtaining the target behavior type feature set, different update processing manners are adopted to update the code behavior type, and in order to ensure consistency between the behavior feature set serving as the input item and the target behavior type feature set, a reconstruction process needs to be performed. The updating processing of the code behavior types is mainly performed to expand the diversity of the code behavior types, so that the reconstruction process can be matched with the updating processing process of the code behavior types, the obtained target behavior type characteristic set is not limited to the existing code behavior types, the possibility of finding the global optimal solution is provided, the training effect of the detection model can be further improved, and the detection accuracy of the subsequent malicious codes is improved.
In some embodiments, the behavior feature group has feature bits that correspond one-to-one to the code behavior types in the target behavior type feature set; in the behavior feature group, the feature bit corresponding to the target behavior type belonging to the target behavior type feature set is a first value, and the remaining feature bits except the feature bit with the first value in the behavior feature group are second values.
Specifically, the obtaining manner of the information on the feature bits in the behavior feature group, the distribution and the meaning of the feature bits in the behavior feature group, and the value manner and the meaning of the first value and the second value may refer to the contents in the related embodiments of the training process, and are not described herein again.
In the above embodiment, the behavior feature group including the feature bits can effectively indicate which code behaviors subordinate to the code behavior type in the target behavior type feature set are finally generated by the target code. The behavior feature group only occupies the storage space of a plurality of bits, so that the occupied storage resources can be reduced as much as possible. Meanwhile, the storage space of a plurality of bits is only occupied, so that the data volume is less, the data volume of subsequent processing can be reduced, and the processing resource of computer equipment is saved.
It is considered that data generated in practical application can also be continuously used as code training samples to train the detection model. Therefore, in some embodiments, after the malicious attribute of the target code is obtained by performing malicious code detection based on the behavior feature group through the trained detection model, the method further includes:
pushing verification request information for verifying the malicious attribute of the target code outwards; obtaining a returned verification result based on the verification request information, wherein the verification result is used for determining the sample label when the target code is used as a code training sample; and training the detection model again based on the target code as a code training sample and the corresponding sample label.
The computer device may push the verification request information to the outside, for example, to other computer devices such as other mobile terminals. The push mode may be a short message, an email, or an application notification; the type of the push target and the push mode are not specifically limited in the embodiment of the present application. The main role of the verification request information is to ask the information receiver to confirm the detection result of the target code. Therefore, the verification request information carries at least the previous detection result of the target code by the detection model, namely the malicious attribute of the target code.
After receiving the verification request information, the information receiver can verify it and return a verification result to the computer device. If, after checking the detection result carried in the verification request information, the information receiver finds that the previous detection result is wrong, the previous detection result can be corrected and the verification result returned to the computer device carries the correction information. Otherwise, the information receiver can confirm the previous detection result, and the verification result returned to the computer device carries the confirmation information.
After obtaining the verification result, the computer device may determine, based on the verification result, the sample label of the target code when it is used as a code training sample. Specifically, if the verification result carries confirmation information, the previous detection result may be directly used as the sample label of the target code; if the verification result carries correction information, the correct detection result indicated by the correction information can be used as the sample label of the target code. After obtaining the target code as a code training sample together with its sample label, the computer device can add them to a sample database by incremental warehousing and use the sample database for training when subsequently needed, for example periodically on a daily, weekly, or monthly basis; the detection model can also be trained again directly in a supervised manner. For the training process, reference may be made to the above related embodiments, and details are not repeated here.
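A minimal sketch of turning a verification result into a new code training sample; the verification-result fields and the list-based sample database are assumptions for illustration only.

```python
def handle_verification_result(target_code, predicted_attribute, verification, sample_db):
    """Turn a returned verification result into a new code training sample:
    a corrected attribute (if present) becomes the sample label, otherwise the
    previous detection result is confirmed; the sample is then added to the
    sample database by incremental warehousing."""
    corrected = verification.get("corrected_attribute")
    label = corrected if corrected is not None else predicted_attribute
    sample_db.append((target_code, label))   # incremental warehousing
    return label
```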
It should be noted that, in addition to using the target code obtained in the actual detection process as the code training sample, in the actual implementation process, the target code used as the code training sample may be obtained in a manner of manual reverse analysis or honeypot detonation, and the target code is pushed outwards by the computer device to obtain the sample label when the target code is used as the code training sample, so that the obtained target code and the corresponding sample label as the code training sample are added to the sample database in an incremental warehousing manner.
It should be further noted that the target behavior type feature set may be updated in real time, and the update may include expanding the target behavior type feature set. Since the target behavior type feature set determines the input items of the detection model, and the code behavior types placed in the target behavior type feature set are generally regarded as malicious code behaviors, it can be understood that the strictness with which code behavior types are set in the target behavior type feature set is related to the frequency of alarms when the detection model subsequently detects code with reference to that target behavior type feature set.
If the code behavior types are set strictly, code behaviors that affect the security of the computer device only slightly may also trigger alarms, so the number and frequency of alarms increase; in practice, as the number of alarms increases, the number of false alarms may increase as well. Therefore, in the actual implementation process, the code behavior types in the target behavior type feature set can be set based on the acceptable degree of false alarms.
In the above embodiment, since verification request information for verifying the malicious attribute of the target code can be pushed outwards, false detections can be corrected. Meanwhile, both the corrected false detections and the confirmed correct detections can be used as code training samples for automatically training the detection model, so that the detection performance of the detection model can be continuously improved in attack-defense confrontation.
For convenience of understanding, as shown in fig. 4, a malicious code detection model processing method is provided, which is described by taking as an example that the method is applied to a computer device (the computer device may specifically be the terminal or the server in fig. 1) and includes the following steps; a consolidated code sketch of the flow is given after the listed steps:
step 402, a sample behavior record set of the code training sample is obtained, and the sample behavior record set records sample code behaviors generated after the corresponding code training sample is operated.
Step 404, obtaining a behavior type feature set population, wherein the behavior type feature set population comprises at least one behavior type feature set.
And 406, performing iterative update processing of the code behavior type on the behavior type feature set in the behavior type feature set population until an update termination condition is reached, and obtaining a finally updated behavior type feature set population.
And step 408, screening a target behavior type feature set from the finally updated behavior type feature set population, wherein the target behavior type feature set comprises a code behavior type for training a detection model.
And step 410, reconstructing at least a part of sample code behaviors recorded in the sample behavior record set to obtain a reconstructed sample behavior type, referring to the target behavior type feature set, and determining the sample behavior type belonging to the target behavior type feature set in the reconstructed sample behavior type.
And step 412, generating a sample behavior characteristic group of the code training sample based on the sample behavior type belonging to the target behavior type characteristic group.
And 414, obtaining a sample label for representing the malicious attribute of the code training sample, and training a detection model based on the sample behavior feature group and the sample label.
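The consolidated sketch announced above strings steps 402-414 together; every callable is a placeholder for the corresponding step (for example, `iterate_population`, `reconstruct` and `build_feature_group` may be the sketches given earlier), and all names are illustrative.

```python
def train_detection_model(sample_record_sets, sample_labels, initial_population,
                          iterate_population, screen_best, reconstruct,
                          reconstruction_file, build_feature_group, fit_model):
    """Sketch of steps 402-414 of the detection model processing method."""
    population = iterate_population(initial_population)                     # step 406
    target_feature_set = screen_best(population)                            # step 408
    bit_order = sorted(target_feature_set)                                  # fixed feature-bit order
    feature_groups = []
    for record_set in sample_record_sets:                                   # steps 410-412
        reconstructed = reconstruct(set(record_set), reconstruction_file)
        feature_groups.append(build_feature_group(bit_order, reconstructed))
    return fit_model(feature_groups, sample_labels)                         # step 414
```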
In the above embodiment, a sample behavior record set of the code training sample and a target behavior type feature set are obtained; with reference to the target behavior type feature set, a sample behavior feature group of the code training sample is generated based on the code behavior types of the sample code behaviors recorded in the sample behavior record set; and the detection model is trained based on the sample behavior feature group and the sample label. Because malicious code behaviors are generated only while malicious code executes, whether a code behavior is malicious directly reflects whether the code that generates it is malicious. Using the code behaviors generated during code execution both to train the detection model and as the object of detection therefore cannot be bypassed in the way file features of code can, which improves the training effect of the detection model and the detection accuracy of subsequent malicious codes.
Secondly, the iterative update processing of the code behavior type is performed on the behavior type feature sets in the behavior type feature set population based on the update termination condition. The update termination condition can be set in combination with an evaluation index associated with the contribution degree to malicious code detection, so that a target behavior type feature set containing code behavior types with a larger contribution degree to malicious code detection can subsequently be screened out as much as possible, which improves the training effect of the detection model and thus the detection accuracy of subsequent malicious codes. Meanwhile, the detection performance of the detection model can be continuously improved through multiple iterative update processing processes. The update termination condition can also be set with the aim of expanding the diversity of code behavior types, so that the screened target behavior type feature set is no longer limited to existing code behavior types, which offers the possibility of finding a globally optimal solution, further improves the training effect of the detection model, and improves the detection accuracy of subsequent malicious codes.
In addition, the sample behavior feature group is generated with reference to the target behavior type feature set and based on the sample behavior record set, and the target behavior type feature set is built from code behavior types that may be malicious, so the sample behavior feature group can reflect, as far as possible, the degree of maliciousness of the code behaviors in the sample behavior record set. Using the sample behavior feature group as an input of the detection model therefore improves the training effect of the detection model and the accuracy of subsequent malicious code detection.
Finally, because the code is detected with the detection model, the occupied storage resources do not keep growing as they do in file feature detection, where the malicious code feature library must be updated frequently; only the detection model needs to be retrained with updated samples, so the detection model does not occupy excessive storage resources, and storage resources are saved. Moreover, as a malicious code feature library gradually grows, the number of lookups in subsequent detection gradually increases, which affects detection efficiency and, by occupying too many processing resources, affects system performance. With the detection model, a prediction result is obtained through a single input-output pass, so detection efficiency is improved and the impact on system performance is reduced.
The embodiment of the application further provides an application scenario to which the malicious code detection model processing method is applied, described by taking a computer device implemented as a server as an example. Specifically, the application of the malicious code detection model processing method in this application scenario is as follows:
Before the method is executed, a malicious code detection environment for the terminals can be constructed in advance, jointly on the terminal side and the server side. The detection environment can comprise a log collection client, a data storage platform, and a machine learning platform. The specific architecture of the detection environment can refer to fig. 5; in fig. 5, "host 1", "host 2", "host 3", and "host 4" all represent terminals on which the log collection client is installed. A terminal collects process behavior logs through the log collection client installed on it and imports the collected process behavior logs into the data storage platform in the server. The data storage platform may clean, integrate, and label the process behavior logs to generate code samples. The machine learning platform in the server can be used to store the detection model and to acquire code samples from the data storage platform for training the detection model. Meanwhile, the machine learning platform in the server can acquire the target code to be detected, so as to detect the target code. It should be noted that, to overcome the difficulty of performing automatic iterative updates after the detection environment is deployed, target code encountered in practical applications may be reused as code samples for continuously and iteratively updating and optimizing the detection model, thereby improving the detection performance of the model.
Based on the above description of the detection environment, the flow of the malicious code detection model processing method may include the following. The data storage platform in the server stores the process behavior logs imported by the terminals, and cleans, integrates, and labels these logs to generate the data required by the training process.
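As an illustration of the cleaning, integration, and labeling step described above, the following Python sketch groups raw process behavior log entries into per-sample behavior record sets and attaches analyst-provided labels. The JSON log format, the field names "sample_id" and "behavior", and the label_lookup mapping are assumptions made for this example and are not prescribed by the present disclosure.

```python
import json
from collections import defaultdict

def build_code_samples(log_lines, label_lookup):
    """Group raw process-behavior log entries into per-sample behavior record sets.

    log_lines   : iterable of JSON strings, each with hypothetical fields
                  'sample_id' and 'behavior'.
    label_lookup: dict mapping sample_id -> 1 (malicious) or 0 (non-malicious),
                  e.g. produced during the labeling step.
    """
    behavior_records = defaultdict(list)
    for line in log_lines:
        line = line.strip()
        if not line:                      # basic cleaning: skip empty entries
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:      # basic cleaning: skip corrupted entries
            continue
        behavior_records[entry["sample_id"]].append(entry["behavior"])

    # Integrate and label: one (behavior record set, label) pair per code sample.
    return [
        {"sample_id": sid, "behaviors": behaviors, "label": label_lookup.get(sid, 0)}
        for sid, behaviors in behavior_records.items()
    ]
```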
The server obtains a sample behavior record set of the code training sample from the data storage platform and obtains a target behavior type characteristic set.
And the server determines the sample behavior type belonging to the target behavior type feature set based on the code behavior type of the sample code behavior recorded by the sample behavior recording set by referring to the target behavior type feature set. And the server generates a sample behavior characteristic group of the code training sample based on the sample behavior type belonging to the target behavior type characteristic set.
The machine learning platform in the server acquires, from the data storage platform, a sample label representing the malicious attribute of the code training sample, and trains the detection model based on the sample behavior feature group and the sample label.
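The present disclosure does not fix a particular model family for the detection model. As one possible sketch, the example below trains a random forest classifier on binary sample behavior feature groups and their sample labels using scikit-learn; the choice of random forest, the 80/20 split, and the hyperparameters are assumptions for illustration only.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

def train_detection_model(sample_feature_groups, sample_labels):
    """Train a detection model on sample behavior feature groups and labels.

    sample_feature_groups: list of equal-length binary vectors, one per code
                           training sample, built from the target behavior
                           type feature set.
    sample_labels        : list of 0/1 labels (1 = malicious).
    """
    x_train, x_test, y_train, y_test = train_test_split(
        sample_feature_groups, sample_labels, test_size=0.2, random_state=0
    )
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(x_train, y_train)
    # Quick check of the training effect on held-out samples.
    print(classification_report(y_test, model.predict(x_test)))
    return model
```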
The embodiment of the application further provides an application scenario to which the malicious code detection method is applied, described by taking a computer device implemented as a server as an example. Specifically, the application of the malicious code detection method in this application scenario is as follows:
before the method is executed, a target behavior type feature set and a trained detection model can be deployed to a detection environment. The detection environment may refer to the relevant description in the above application scenario embodiment. The deployment flow chart of the detection model can refer to fig. 6.
The server acquires a behavior record set of the target code from the terminal side. In fig. 6, the conversational flow represents the code behavior produced by the target code, and the conversational flow may be converted to generate the behavior record set.
The server acquires a target behavior type characteristic set; in fig. 6, the server may obtain the target behavior type feature set from the output of the iterative update processing flow for the code behavior type in the iterative flow, and may obtain the detection model from the iterative training flow for the detection model.
The server determines a target behavior type belonging to the target behavior type feature set based on the code behavior type of the code behavior recorded by the behavior record set by referring to the target behavior type feature set; and generating a behavior characteristic group of the target code based on the target behavior type belonging to the target behavior type characteristic group.
The server inputs the behavior feature group into the detection model and performs malicious code detection based on the behavior feature group through the detection model to obtain a detection result of the target code. The detection result is mainly the malicious attribute of the target code. Detection results over a certain period of time may be as shown in Table 2 below:
TABLE 2

Time | Malicious behavior executed | Detected
October 25, 2021 | Command execution | Yes
October 25, 2021 | Injection + command execution | Yes
October 25, 2021 | Credential collection + command execution | Yes
October 25, 2021 | Persistence + command execution | Yes
November 9, 2021 | Injection + credential collection | Yes
November 9, 2021 | Injection + persistence | Yes
November 9, 2021 | Injection + refresh information | Yes
After the detection process is completed, the detection result of the target code may be verified, and the target code together with its detection result is used as a sample for the iterative training process. The verification flow, the iterative training flow, and the overall framework can refer to fig. 7. The verification process and the iterative training process are now described with reference to fig. 7:
The server pushes, to the terminal side, verification request information for verifying the malicious attribute of the target code; the verification request information may carry the detection result of the target code. Corresponding to fig. 7, the detection result may be recorded in the model file, from which it can be obtained, and is then pushed through an alarm work order to the terminal used by security personnel. The alarm work order may correspond to the verification request information.
The server obtains the verification result returned by the terminal side based on the verification request information, that is, the tagging result returned by the security personnel in fig. 7.
The server generates a labeled sample from the target code, used as a code training sample, and its corresponding sample label, and adds the labeled sample incrementally to the sample database. These samples can subsequently be used for iteratively training the detection model and for the iterative update processing of the code behavior types in the target behavior type feature set. After training of the new model is completed, the newly trained model and the newly generated target behavior type feature set can be deployed in the detection environment.
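A minimal sketch of the incremental labeling and retraining loop described above might look as follows; the dictionary standing in for the sample database and all function names are hypothetical.

```python
def incorporate_verified_detection(sample_db, target_features, target_label, retrain_fn):
    """Append a verified (feature group, label) pair to the sample database and
    retrain the detection model, mirroring the iterative update flow described
    above. `sample_db` is a plain dict acting as the sample database.
    """
    sample_db.setdefault("features", []).append(target_features)
    sample_db.setdefault("labels", []).append(target_label)
    # Retrain on the enlarged sample set; retrain_fn could be
    # train_detection_model from the earlier sketch.
    return retrain_fn(sample_db["features"], sample_db["labels"])
```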
It should be noted that the application scenarios described above are exemplary application scenarios provided to assist understanding of the solution of the present application, and do not limit the actual application scenarios of the present application.
It should be understood that, although the steps in the flowcharts of the above embodiments are shown in the sequence indicated by the arrows, they are not necessarily executed in that sequence. Unless explicitly stated otherwise herein, the execution of these steps is not strictly limited in order, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowcharts of the above embodiments may include multiple sub-steps or stages, which are not necessarily executed at the same moment but may be executed at different moments; their execution order is not necessarily sequential, and they may be executed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Based on the same inventive concept, an embodiment of the present application further provides a malicious code detection model processing apparatus for implementing the malicious code detection model processing method described above. The solution provided by the apparatus is similar to the solution described in the above method, so for the specific limitations in the following embodiments of the malicious code detection model processing apparatus, reference may be made to the above limitations on the malicious code detection model processing method, and details are not repeated here.
In some embodiments, as shown in fig. 8, there is provided a malicious code detection model processing apparatus 800, which may adopt a software module or a hardware module, or a combination of the two, as a part of a computer device, and specifically includes: a first obtaining module 802, a second obtaining module 804, a determining module 806, a generating module 808, and a training module 810, wherein:
a first obtaining module 802, configured to obtain a sample behavior record set of the code training sample, where the sample behavior record set records a sample code behavior generated after the corresponding code training sample runs;
a second obtaining module 804, configured to obtain a target behavior type feature set, where the target behavior type feature set includes a code behavior type used for training a detection model;
a determining module 806, configured to determine, with reference to the target behavior type feature set, a sample behavior type belonging to the target behavior type feature set based on the code behavior type of the sample code behavior recorded in the sample behavior record set;
a generating module 808, configured to generate a sample behavior feature group of the code training sample based on the sample behavior type belonging to the target behavior type feature set;
and the training module 810 is used for acquiring a sample label for representing the malicious attribute of the code training sample, and training the detection model based on the sample behavior feature group and the sample label.
In some embodiments, the first obtaining module 802 is configured to run a code training sample in a closed behavior-aware environment, where the behavior-aware environment is configured to record sample code behaviors generated after the code training sample is run; and forming a sample behavior record set of the code training sample based on the sample code behaviors recorded by the behavior perception environment.
In some embodiments, the target behavior type feature set is obtained by the target behavior type feature set construction step; the device also includes:
the first construction module is used for acquiring an initial behavior type feature set and a plurality of sample behavior record sets corresponding to a plurality of code samples; determining respective occurrence probabilities of malicious code samples and non-malicious code samples in the plurality of code samples, and calculating the information entropy of the code samples according to the respective occurrence probabilities; determining the conditional entropy of each code behavior type in the initial behavior type feature set; determining an information gain value of each code behavior type according to the information entropy and the conditional entropy of each code behavior type, the information gain value indicating the degree of contribution of the corresponding code behavior type to malicious code detection; and selecting, from the initial behavior type feature set, the code behavior types corresponding to a preset number of top-ranked information gain values in descending order of the information gain values, to construct the target behavior type feature set.
In some embodiments, the first constructing module is further configured to, for a target code behavior corresponding to each code behavior type in the initial behavior type feature set, obtain a first number of sample behavior record sets in which target code behaviors are recorded among the plurality of sample behavior record sets, and a second number of sample behavior record sets in which target code behaviors are not recorded; determining a third number of malicious code samples and a fourth number of non-malicious code samples in code samples corresponding to a sample behavior record set in which target code behaviors are recorded in a plurality of sample behavior record sets; determining a fifth number of malicious code samples and a sixth number of non-malicious code samples in code samples corresponding to a sample behavior record set in which target code behaviors are not recorded in a plurality of sample behavior record sets; and calculating the conditional entropy of any code behavior type according to the first number, the second number, the third number, the fourth number, the fifth number and the sixth number.
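The information gain construction described above can be sketched as follows: the information entropy of the code samples is computed from the malicious/non-malicious occurrence probabilities, the conditional entropy of each code behavior type is computed from the six counts enumerated above, and the top-ranked behavior types by information gain form the target behavior type feature set. The data structures assumed here (a set of behavior types per sample and a parallel 0/1 label list) are for illustration only.

```python
import math

def entropy(pos, neg):
    """Shannon entropy of a two-class split (pos = malicious, neg = non-malicious)."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h

def information_gain(record_sets, labels, behavior_type):
    """Information gain of one code behavior type over a collection of code samples.

    record_sets : list of sets, the sample behavior record sets (behavior types per sample).
    labels      : parallel list of 1 (malicious) / 0 (non-malicious) labels.
    """
    total = len(record_sets)
    pos = sum(labels)                       # malicious code samples
    neg = total - pos                       # non-malicious code samples
    base_entropy = entropy(pos, neg)        # information entropy of the code samples

    # Split samples by whether the target code behavior was recorded.
    with_idx = [i for i, rec in enumerate(record_sets) if behavior_type in rec]
    without_idx = [i for i in range(total) if behavior_type not in record_sets[i]]

    first, second = len(with_idx), len(without_idx)
    third = sum(labels[i] for i in with_idx)        # malicious, behavior recorded
    fourth = first - third                          # non-malicious, behavior recorded
    fifth = sum(labels[i] for i in without_idx)     # malicious, behavior not recorded
    sixth = second - fifth                          # non-malicious, behavior not recorded

    cond_entropy = (first / total) * entropy(third, fourth) \
                 + (second / total) * entropy(fifth, sixth)
    return base_entropy - cond_entropy

def build_target_feature_set(record_sets, labels, initial_types, top_k):
    """Keep the top_k code behavior types with the largest information gain."""
    ranked = sorted(initial_types,
                    key=lambda t: information_gain(record_sets, labels, t),
                    reverse=True)
    return ranked[:top_k]
```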
In some embodiments, the target behavior type feature set is obtained by the target behavior type feature set construction step; the device also includes:
the second construction module is used for acquiring an initial behavior type feature set and a plurality of sample behavior record sets corresponding to a plurality of code samples; for the target code behavior corresponding to each code behavior type in the initial behavior type feature set, referring to the sample code behaviors recorded in the plurality of sample behavior record sets and determining, among the plurality of code samples, the code samples whose sample behavior record sets record the target code behavior as target code samples; calculating a first proportion of the non-malicious code samples among the target code samples to the non-malicious code samples in the plurality of code samples; calculating a second proportion of the malicious code samples among the target code samples to the malicious code samples in the plurality of code samples; acquiring the contribution degree of each code behavior type to malicious code detection according to the first proportion and the second proportion; and screening the initial behavior type feature set according to the contribution degree of each code behavior type to malicious code detection to obtain the target behavior type feature set.
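For the proportion-based construction just described, one possible way to turn the first and second proportions into a contribution degree is sketched below; the disclosure leaves the exact combination open, so the difference of the two proportions used here is purely an illustrative assumption.

```python
def proportion_contribution(record_sets, labels, behavior_type):
    """Contribution of a code behavior type from the two proportions described above."""
    target_idx = [i for i, rec in enumerate(record_sets) if behavior_type in rec]

    total_benign = labels.count(0)
    total_malicious = labels.count(1)
    benign_in_target = sum(1 for i in target_idx if labels[i] == 0)
    malicious_in_target = sum(1 for i in target_idx if labels[i] == 1)

    first_ratio = benign_in_target / total_benign if total_benign else 0.0
    second_ratio = malicious_in_target / total_malicious if total_malicious else 0.0

    # The gap between the two proportions is used here only as one plausible score:
    # behavior types seen mostly in malicious samples score high.
    return second_ratio - first_ratio
```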
In some embodiments, the target behavior type feature set is updated by the target behavior type feature set updating step; the device also includes:
the updating module is used for acquiring a behavior type feature set population, and the behavior type feature set population comprises at least one behavior type feature set; performing iterative update processing of the code behavior type on the behavior type feature set in the behavior type feature set population until an update termination condition is reached, and obtaining a finally updated behavior type feature set population; and screening out the target behavior type feature set from the finally updated behavior type feature set population.
In some embodiments, the updating module is further configured to obtain an initial behavior type feature set and a plurality of sample behavior record sets of a plurality of code samples; for each code behavior type in the initial behavior type feature set, calculating the contribution degree of each code behavior type to malicious code detection according to the distribution of the code behavior corresponding to each code behavior type in a plurality of sample behavior record sets and the distribution of malicious code samples and non-malicious code samples in a plurality of code samples; screening an initial behavior type feature set according to the contribution degree of each code behavior type to malicious code detection, and obtaining screened code behavior types; and combining the screened code behavior types to obtain a behavior type characteristic set population.
In some embodiments, the updating module is further configured to select at least a part of the behavior type feature sets in the behavior type feature set population to be updated in the current iteration; updating at least part of code behavior types in each selected behavior type feature set; and detecting each behavior type feature set in the behavior type feature set population subjected to the updating processing of the code behavior type based on the trained detection model, and determining the behavior type feature set population subjected to the iterative updating processing according to a corresponding detection result.
In some embodiments, the updating module is further configured to acquire, for each behavior type feature set in the behavior type feature set population subjected to the update processing of the code behavior type, a sample behavior feature group set and a corresponding sample label set, where each sample behavior feature group set is generated with reference to the corresponding behavior type feature set; input each sample behavior feature group in each sample behavior feature group set into the trained detection model to obtain a detection result set corresponding to each sample behavior feature group set; and screen out at least part of the behavior type feature sets from the behavior type feature set population subjected to the update processing of the code behavior type according to the detection result set and the sample label set corresponding to each behavior type feature set.
In some embodiments, the updating module is further configured to obtain, according to the detection result set and the sample label set corresponding to each behavior type feature set, a detection effect evaluation value of each behavior type feature set when that behavior type feature set is used for detecting malicious code; and to screen out, from the behavior type feature set population subjected to the update processing of the code behavior type, a preset number of top-ranked behavior type feature sets in descending order of their detection effect evaluation values.
In some embodiments, the update termination condition includes that a behavior type feature set with a detection effect evaluation value larger than a preset threshold exists in the behavior type feature set population after the update processing of the code behavior type.
In some embodiments, the update process comprises: at least one of aggregating between different code behavior types, pruning a code behavior type, adding a code behavior type, disassembling a code behavior type, or altering a code behavior type.
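The iterative update processing of the behavior type feature set population can be sketched as a genetic-style search, as below. The evaluation function, the termination threshold, the subset of operators shown (only adding, pruning, and altering a code behavior type; aggregation and disassembly are omitted), and the elitist survivor selection are all illustrative assumptions rather than the prescribed procedure.

```python
import random

def iterate_population(population, all_types, evaluate, threshold, max_rounds=50):
    """Genetic-style iterative update of a population of behavior type feature sets.

    population : list of sets of code behavior types.
    all_types  : pool of candidate code behavior types.
    evaluate   : function(feature_set) -> detection effect evaluation value,
                 e.g. a score from a model trained with that feature set.
    threshold  : the update terminates once any feature set scores above it.
    """
    for _ in range(max_rounds):
        if any(evaluate(fs) > threshold for fs in population):
            break                                    # update termination condition reached
        # Select part of the population and apply one update operation per selected set.
        for fs in random.sample(population, max(1, len(population) // 2)):
            op = random.choice(("add", "prune", "alter"))
            if op == "add" or not fs:
                fs.add(random.choice(all_types))
            elif op == "prune" and len(fs) > 1:
                fs.discard(random.choice(sorted(fs)))
            else:                                    # alter: swap one type for another
                fs.discard(random.choice(sorted(fs)))
                fs.add(random.choice(all_types))
        # Keep copies of the better half of the population for the next round (elitism).
        population.sort(key=evaluate, reverse=True)
        survivors = population[: max(1, len(population) // 2)]
        population = [set(fs) for fs in survivors] + [set(fs) for fs in survivors]
    return max(population, key=evaluate)             # screened-out target feature set
```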
In some embodiments, the determining module 806 is configured to reconstruct at least a portion of sample code behaviors recorded in the sample behavior record set to obtain a reconstructed sample behavior type; and determining the sample behavior type belonging to the target behavior type characteristic set in the reconstructed sample behavior type by referring to the target behavior type characteristic set.
In some embodiments, the determining module 806 is further configured to obtain a behavior reconstruction file, where the behavior reconstruction file records the code behavior types that require reconstruction and the corresponding reconstruction processing manners used when reconstructing a code behavior into a code behavior type included in the target behavior type feature set; and, when at least a part of the sample code behaviors recorded in the sample behavior record set belong to the code behavior types requiring reconstruction in the behavior reconstruction file, to reconstruct the at least a part of the sample code behaviors according to the corresponding reconstruction processing manners recorded in the behavior reconstruction file.
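A minimal sketch of applying a behavior reconstruction file is shown below; the rule names and the use of a plain dictionary to stand in for the reconstruction file are hypothetical.

```python
RECONSTRUCTION_RULES = {
    # Hypothetical reconstruction file content: maps a recorded code behavior
    # type onto a behavior type used in the target behavior type feature set.
    "cmd_exec_via_shell": "command_execution",
    "cmd_exec_via_powershell": "command_execution",
    "read_lsass_memory": "credential_collection",
}

def reconstruct_behaviors(recorded_behaviors, rules=RECONSTRUCTION_RULES):
    """Map raw recorded behaviors onto the behavior types of the target feature
    set; behaviors without a reconstruction rule are kept unchanged."""
    return [rules.get(behavior, behavior) for behavior in recorded_behaviors]
```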
In some embodiments, the sample behavior feature group has feature bits that correspond one-to-one to the code behavior types in the target behavior type feature set; in the sample behavior feature group, each feature bit corresponding to a sample behavior type that belongs to the target behavior type feature set takes a first value, and the remaining feature bits in the sample behavior feature group take a second value.
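The feature-bit encoding described above can be sketched directly, with the first value and second value assumed here to default to 1 and 0:

```python
def build_feature_group(sample_behavior_types, target_feature_set,
                        first_value=1, second_value=0):
    """One feature bit per code behavior type in the target behavior type feature
    set: first_value if the sample exhibited that type, second_value otherwise."""
    observed = set(sample_behavior_types)
    return [first_value if behavior_type in observed else second_value
            for behavior_type in target_feature_set]

# Usage: with target_feature_set = ["injection", "command_execution", "persistence"]
# and sample_behavior_types = ["command_execution"], the feature group is [0, 1, 0].
```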
In some embodiments, there are a plurality of detection models and a plurality of target behavior type feature sets, each detection model corresponding to one target behavior type feature set; the training module 810 is further configured to test each trained detection model with code samples to obtain a detection effect evaluation value of the corresponding target behavior type feature set, and to screen out, from all the trained detection models, the detection model used for detecting malicious code according to the detection effect evaluation value of each target behavior type feature set.
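When several detection models are trained, each with its own target behavior type feature set, the selection step can be sketched as follows; the use of the F1 score as the detection effect evaluation value is an assumption, since the disclosure does not name a specific metric.

```python
from sklearn.metrics import f1_score

def select_detection_model(models_with_feature_sets, test_records, test_labels, encode):
    """Pick the trained model whose target behavior type feature set gives the
    best detection effect on held-out code samples.

    models_with_feature_sets: list of (model, target_feature_set) pairs.
    encode: function(record_set, feature_set) -> binary feature group, e.g.
            build_feature_group from the earlier sketch.
    """
    best_model, best_score = None, -1.0
    for model, feature_set in models_with_feature_sets:
        features = [encode(rec, feature_set) for rec in test_records]
        score = f1_score(test_labels, model.predict(features))
        if score > best_score:
            best_model, best_score = model, score
    return best_model
```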
The malicious code detection model processing apparatus obtains a sample behavior record set and a target behavior type feature set of a code training sample; with reference to the target behavior type feature set, it generates a sample behavior feature group of the code training sample based on the code behavior types of the sample code behaviors recorded in the sample behavior record set, and trains the detection model based on the sample behavior feature group and the sample label. Because malicious code behaviors are generated only while malicious code is executing, whether a code behavior is malicious directly reflects whether the code that produced it is malicious. Using the code behaviors generated during code execution both for training the detection model and for detection by the detection model therefore cannot be bypassed in the way that file features of code can be, which improves the training effect of the detection model and the accuracy of subsequent malicious code detection.
In addition, the sample behavior feature group is generated with reference to the target behavior type feature set and based on the sample behavior record set, and the target behavior type feature set is built from code behavior types that may be malicious, so the sample behavior feature group can reflect, as far as possible, the degree of maliciousness of the code behaviors in the sample behavior record set. Using the sample behavior feature group as an input of the detection model therefore improves the training effect of the detection model and the accuracy of subsequent malicious code detection.
Finally, because the code is detected with the detection model, the occupied storage resources do not keep growing as they do in file feature detection, where the malicious code feature library must be updated frequently; only the detection model needs to be retrained with updated samples, so the detection model does not occupy excessive storage resources, and storage resources are saved. Moreover, as a malicious code feature library gradually grows, the number of lookups in subsequent detection gradually increases, which affects detection efficiency and, by occupying too many processing resources, affects system performance. With the detection model, a prediction result is obtained through a single input-output pass, so detection efficiency is improved and the impact on system performance is reduced.
For specific limitations of the malicious code detection model processing apparatus, reference may be made to the above limitations of the malicious code detection model processing method, which are not repeated here. Each module in the malicious code detection model processing apparatus may be implemented wholly or partially by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke and execute the operations corresponding to the modules.
Based on the same inventive concept, an embodiment of the application further provides a malicious code detection apparatus for implementing the malicious code detection method described above. The solution provided by the apparatus is similar to the solution described in the above method, so for the specific limitations in the following embodiments of the malicious code detection apparatus, reference may be made to the above limitations on the malicious code detection method, and details are not repeated here.
In some embodiments, as shown in fig. 9, there is provided an apparatus 900 for detecting malicious code, which may be a part of a computer device using a software module or a hardware module, or a combination of the two, and specifically includes: a first acquisition module 902, a second acquisition module 904, a determination module 906, a generation module 908, and a detection module 910, wherein:
a first obtaining module 902, configured to obtain a behavior record set of a target code, where the behavior record set records a code behavior generated after the target code runs;
a second obtaining module 904, configured to obtain a target behavior type feature set, where the target behavior type feature set includes a code behavior type required when a trained detection model detects a malicious code;
a determining module 906, configured to determine, with reference to the target behavior type feature set, a target behavior type belonging to the target behavior type feature set based on the code behavior type of the code behavior recorded in the behavior recording set;
a generating module 908, configured to generate a behavior feature group of the target code based on the target behavior type belonging to the target behavior type feature group;
the detection module 910 is configured to perform malicious code detection based on the behavior feature group through the trained detection model, so as to obtain a malicious attribute of the target code.
In some embodiments, the determining module 906 is configured to reconstruct at least a part of code behaviors recorded in the behavior record set, and obtain a reconstructed behavior type; and determining the target behavior type belonging to the target behavior type feature set in the reconstructed behavior type by referring to the target behavior type feature set.
In some embodiments, the detecting module 910 is further configured to push verification request information for verifying the malicious attribute of the target code; obtain a returned verification result based on the verification request information, where the verification result is used to determine a sample label when the target code is used as a code training sample; and retrain the detection model based on the target code serving as a code training sample and the corresponding sample label.
The malicious code detection apparatus obtains a behavior record set of the target code; with reference to a target behavior type feature set, it generates a behavior feature group of the target code based on the code behavior types of the code behaviors recorded in the behavior record set, and detects the target code through the detection model based on the behavior feature group. Because malicious code behaviors are generated only while malicious code is executing, whether a code behavior is malicious directly reflects whether the code that produced it is malicious. Using the code behaviors generated during code execution both for training the detection model and for detection by the detection model therefore cannot be bypassed in the way that file features of code can be, which improves the training effect of the detection model and the accuracy of subsequent malicious code detection.
In addition, the behavior feature group is generated with reference to the target behavior type feature set and based on the behavior record set, and the target behavior type feature set is built from code behavior types that may be malicious, so the behavior feature group can reflect, as far as possible, the degree of maliciousness of the code behaviors in the behavior record set. Using the behavior feature group as an input of the detection model therefore improves the detection accuracy of malicious code.
Finally, because the code is detected with the detection model, the occupied storage resources do not keep growing as they do in file feature detection, where the malicious code feature library must be updated frequently; only the detection model needs to be retrained with updated samples, so the detection model does not occupy excessive storage resources, and storage resources are saved. Moreover, as a malicious code feature library gradually grows, the number of lookups in subsequent detection gradually increases, which affects detection efficiency and, by occupying too many processing resources, affects system performance. With the detection model, a prediction result is obtained through a single input-output pass, so detection efficiency is improved and the impact on system performance is reduced.
For specific limitations of the malicious code detection apparatus, reference may be made to the above limitations of the malicious code detection method, which are not repeated here. Each module in the malicious code detection apparatus may be implemented wholly or partially by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a terminal or a server, and its internal structure diagram may be as shown in fig. 10. The computer device includes a processor, a memory, an Input/Output interface (I/O for short), and a communication interface. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface is connected to the system bus through the input/output interface. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing training data. The input/output interface of the computer device is used for exchanging information between the processor and an external device. The communication interface of the computer device is used for connecting and communicating with an external terminal through a network. The computer program is executed by a processor to implement a malicious code detection model processing method or a malicious code detection method.
Those skilled in the art will appreciate that the architecture shown in fig. 10 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer devices to which the solution is applied; a particular computer device may include more or fewer components than those shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is further provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the above method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, in which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
In an embodiment, a computer program product is provided, comprising a computer program which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, Resistive Random Access Memory (ReRAM), Magnetic Random Access Memory (MRAM), Ferroelectric Random Access Memory (FRAM), Phase Change Memory (PCM), graphene memory, and the like. Volatile memory can include Random Access Memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of relational and non-relational databases. Non-relational databases may include, but are not limited to, blockchain-based distributed databases and the like. The processors referred to in the embodiments provided herein may be general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, data processing logic devices based on quantum computing, and the like, without limitation. The technical features of the above embodiments can be combined arbitrarily; for brevity, not all possible combinations of the technical features in the above embodiments are described, but as long as there is no contradiction between the combinations of these technical features, they should be considered to fall within the scope of this specification.
The above-mentioned embodiments express only several implementations of the present application, and their descriptions are relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of the present patent application shall be subject to the appended claims.

Claims (20)

1. A malicious code detection model processing method is characterized by comprising the following steps:
acquiring a sample behavior record set of a code training sample, wherein the sample behavior record set records sample code behaviors generated after the corresponding code training sample runs;
acquiring a target behavior type characteristic set, wherein the target behavior type characteristic set comprises a code behavior type used for training a detection model;
referring to the target behavior type feature set, and determining a sample behavior type belonging to the target behavior type feature set based on a code behavior type of a sample code behavior recorded by the sample behavior recording set;
generating a sample behavior feature group of the code training sample based on the sample behavior type belonging to the target behavior type feature group;
and acquiring a sample label for representing the malicious attribute of the code training sample, and training the detection model based on the sample behavior feature group and the sample label.
2. The method of claim 1, wherein obtaining the set of sample behavior records of the code training sample comprises:
running the code training sample in a closed behavior awareness environment, wherein the behavior awareness environment is configured to record sample code behaviors generated after the code training sample is run;
forming a sample behavior record set of the code training sample based on the sample code behaviors recorded by the behavior awareness environment.
3. The method according to claim 1, wherein the target behavior type feature set is obtained by a target behavior type feature set construction step comprising:
acquiring an initial behavior type feature set and a plurality of sample behavior record sets corresponding to a plurality of code samples;
determining respective occurrence probabilities of malicious code samples and non-malicious code samples in the plurality of code samples, and calculating an information entropy of the code samples according to the respective occurrence probabilities;
determining conditional entropy of each code behavior type in the initial behavior type feature set;
determining an information gain value of each code behavior type according to the information entropy and the conditional entropy of each code behavior type; the information gain value is used for indicating the contribution degree of the corresponding code behavior type to the malicious code detection;
and selecting, from the initial behavior type feature set, code behavior types corresponding to a preset number of top-ranked information gain values in descending order of the information gain values, to construct the target behavior type feature set.
4. The method of claim 3, wherein determining the conditional entropy for each code behavior type in the initial behavior type feature set comprises:
acquiring a first number of sample behavior record sets in which the target code behaviors are recorded in the plurality of sample behavior record sets and a second number of sample behavior record sets in which the target code behaviors are not recorded aiming at the target code behaviors corresponding to each code behavior type in the initial behavior type feature set;
determining a third number of malicious code samples and a fourth number of non-malicious code samples in code samples corresponding to a sample behavior record set in which the target code behavior is recorded in the plurality of sample behavior record sets;
determining a fifth number of malicious code samples and a sixth number of non-malicious code samples in code samples corresponding to a sample behavior record set which does not record the target code behavior in the plurality of sample behavior record sets;
and calculating the conditional entropy of any code behavior type according to the first number, the second number, the third number, the fourth number, the fifth number and the sixth number.
5. The method according to claim 1, wherein the target behavior type feature set is obtained by a target behavior type feature set construction step, the target behavior type feature set construction step comprising:
acquiring an initial behavior type feature set and a plurality of sample behavior record sets corresponding to a plurality of code samples;
for a target code behavior corresponding to each code behavior type in the initial behavior type feature set, referring to sample code behaviors recorded in the plurality of sample behavior record sets, and determining a code sample in which the target code behavior is recorded in a sample behavior record set in the plurality of code samples and using the code sample as a target code sample;
calculating a first percentage of non-malicious code samples in the target code sample to non-malicious code samples in the plurality of code samples;
calculating a second percentage of malicious code samples in the target code sample in malicious code samples in the plurality of code samples;
according to the first proportion and the second proportion, acquiring the contribution degree of each code behavior type to malicious code detection;
and screening the initial behavior type feature set according to the contribution degree of each code behavior type to malicious code detection to obtain the target behavior type feature set.
6. The method according to any one of claims 1 to 5, wherein the target behavior type feature set is updated by a target behavior type feature set updating step, the target behavior type feature set updating step comprising:
acquiring a behavior type feature set population, wherein the behavior type feature set population comprises at least one behavior type feature set;
performing iterative update processing of the code behavior type on the behavior type feature set in the behavior type feature set population until an update termination condition is reached, and obtaining a finally updated behavior type feature set population;
and screening out the target behavior type feature set from the finally updated behavior type feature set population.
7. The method according to claim 6, wherein the behavior type feature set population is obtained by a behavior type feature set population construction step, and the behavior type feature set population construction step comprises:
acquiring an initial behavior type feature set and a plurality of sample behavior record sets of a plurality of code samples;
for each code behavior type in the initial behavior type feature set, calculating the contribution degree of each code behavior type to malicious code detection according to the distribution of the code behavior corresponding to each code behavior type in the plurality of sample behavior record sets and the distribution of malicious code samples and non-malicious code samples in the plurality of code samples;
screening the initial behavior type feature set according to the contribution degree of each code behavior type to malicious code detection, and obtaining screened code behavior types;
and combining the screened code behavior types to obtain a behavior type characteristic set population.
8. The method according to claim 6, wherein the iteratively updating the code behavior type of the behavior type feature set in the behavior type feature set population comprises:
selecting at least part of behavior type characteristic sets in the behavior type characteristic set population to be updated in the iteration;
updating at least part of code behavior types in each selected behavior type feature set;
and detecting each behavior type feature set in the behavior type feature set population subjected to the updating processing of the code behavior type based on the trained detection model, and determining the behavior type feature set population subjected to the iteration updating processing according to a corresponding detection result.
9. The method according to claim 8, wherein the detecting each behavior type feature set in the behavior type feature set population after the update processing of the code behavior type based on the trained detection model, and determining the behavior type feature set population after the update processing of the current iteration according to the corresponding detection result comprises:
for each behavior type feature set in the behavior type feature set population subjected to the update processing of the code behavior type, acquiring a sample behavior feature group set and a corresponding sample label set, wherein each sample behavior feature group set is generated with reference to the corresponding behavior type feature set;
inputting each sample behavior feature group in each sample behavior feature group set to the trained detection model to obtain a detection result set corresponding to each sample behavior feature group set;
and screening at least part of the behavior type feature sets from the behavior type feature set populations subjected to the updating processing of the code behavior types according to the detection result set and the sample label set corresponding to each behavior type feature set.
10. The method according to claim 9, wherein the step of screening at least part of behavior type feature sets from the behavior type feature set population subjected to the update processing of the code behavior type according to the detection result set and the sample label set corresponding to each behavior type feature set comprises:
acquiring a detection effect evaluation value of each behavior type feature set when the behavior type feature set is used for detecting malicious codes according to a detection result set and a sample label set corresponding to each behavior type feature set;
and screening out, from the behavior type feature set population subjected to the update processing of the code behavior type, a preset number of top-ranked behavior type feature sets in descending order of the detection effect evaluation values of the behavior type feature sets.
11. The method according to any one of claims 1 to 5, wherein the determining, with reference to the target behavior type feature set, a sample behavior type belonging to the target behavior type feature set based on a code behavior type of a sample code behavior recorded by the sample behavior recording set comprises:
reconstructing at least a part of sample code behaviors recorded in the sample behavior record set to obtain a reconstructed sample behavior type;
and referring to the target behavior type feature set, and determining a sample behavior type belonging to the target behavior type feature set in the reconstructed sample behavior type.
12. The method of claim 11, wherein reconstructing at least a portion of the sample code behavior recorded in the set of sample behavior records comprises:
acquiring a behavior reconstruction file, wherein the behavior reconstruction file is used for recording a code behavior type and a reconstruction processing mode required by reconstruction when a code behavior is reconstructed to the code behavior type included in the target behavior type feature set;
and under the condition that at least one part of sample code behaviors recorded in the sample behavior record set belong to the code behavior type required by reconstruction in the behavior reconstruction file, reconstructing the at least one part of sample code behaviors according to the corresponding reconstruction processing mode recorded in the behavior reconstruction file.
13. The method according to any one of claims 1 to 5, wherein the number of the detection models and the number of the target behavior type feature sets are multiple, and each detection model corresponds to one target behavior type feature set; after training the detection model based on the sample behavior feature set and the sample labels, the method further includes:
testing each trained detection model through a code sample to obtain a detection effect evaluation value of a corresponding target behavior type feature set;
and screening out a detection model for detecting the malicious codes from all the trained detection models according to the detection effect evaluation value of each target behavior type feature set.
14. A method for detecting malicious code, the method comprising:
acquiring a behavior record set of a target code, wherein the behavior record set records code behaviors generated after the target code runs;
acquiring a target behavior type characteristic set, wherein the target behavior type characteristic set comprises a code behavior type required when a trained detection model detects malicious codes;
referring to the target behavior type feature set, and determining a target behavior type belonging to the target behavior type feature set based on a code behavior type of a code behavior recorded by the behavior recording set;
generating a behavior feature group of the target code based on the target behavior type belonging to the target behavior type feature group;
and detecting malicious codes based on the behavior feature group through the trained detection model to obtain the malicious attributes of the target codes.
15. The method according to claim 14, wherein the determining, with reference to the target behavior type feature set, a target behavior type belonging to the target behavior type feature set based on a code behavior type of a code behavior recorded in the behavior record set comprises:
reconstructing at least one part of code behaviors recorded in the behavior record set to obtain a reconstructed behavior type;
and determining a target behavior type belonging to the target behavior type feature set in the reconstructed behavior type by referring to the target behavior type feature set.
16. The method of claim 15, wherein after obtaining malicious attributes of the target code by performing malicious code detection based on the behavior feature set through the trained detection model, the method further comprises:
pushing verification request information for verifying the malicious attribute of the target code;
obtaining a returned verification result based on the verification request information, wherein the verification result is used for determining a sample label when the target code is used as a code training sample;
and retraining the detection model based on the code training sample as the target code and the corresponding sample label.
17. An apparatus for processing a malicious code detection model, the apparatus comprising:
the system comprises a first acquisition module, a second acquisition module and a third acquisition module, wherein the first acquisition module is used for acquiring a sample behavior record set of a code training sample, and the sample behavior record set records sample code behaviors generated after a corresponding code training sample runs;
the second acquisition module is used for acquiring a target behavior type characteristic set, wherein the target behavior type characteristic set comprises a code behavior type used for training a detection model;
the determining module is used for determining a sample behavior type belonging to the target behavior type feature set on the basis of the code behavior type of the sample code behavior recorded by the sample behavior record set by referring to the target behavior type feature set;
the generating module is used for generating a sample behavior characteristic group of the code training sample based on the sample behavior type belonging to the target behavior type characteristic set;
and the training module is used for acquiring a sample label for representing the malicious attribute of the code training sample and training the detection model based on the sample behavior feature group and the sample label.
18. An apparatus for detecting malicious code, the apparatus comprising:
the first acquisition module is used for acquiring a behavior record set of the target code, wherein the behavior record set records code behaviors generated after the target code runs;
the second acquisition module is used for acquiring a target behavior type characteristic set, wherein the target behavior type characteristic set comprises a code behavior type required when the trained detection model detects the malicious code;
the determining module is used for determining a target behavior type belonging to the target behavior type feature set on the basis of the code behavior type of the code behavior recorded by the behavior recording set by referring to the target behavior type feature set;
the generating module is used for generating a behavior feature group of the target code based on the target behavior type belonging to the target behavior type feature set;
and the detection module is used for detecting malicious codes based on the behavior feature group through the trained detection model to obtain the malicious attributes of the target codes.
19. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 16.
20. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 16.
CN202210552298.1A 2022-05-20 2022-05-20 Malicious code detection model processing method, detection method and device Pending CN115118452A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210552298.1A CN115118452A (en) 2022-05-20 2022-05-20 Malicious code detection model processing method, detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210552298.1A CN115118452A (en) 2022-05-20 2022-05-20 Malicious code detection model processing method, detection method and device

Publications (1)

Publication Number Publication Date
CN115118452A true CN115118452A (en) 2022-09-27

Family

ID=83326709

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210552298.1A Pending CN115118452A (en) 2022-05-20 2022-05-20 Malicious code detection model processing method, detection method and device

Country Status (1)

Country Link
CN (1) CN115118452A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination