CN117786685A - Virus identification method, device, computer equipment and storage medium - Google Patents

Info

Publication number
CN117786685A
Authority
CN
China
Prior art keywords
behavior
behaviors
operating system
network interaction
entropy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311611129.1A
Other languages
Chinese (zh)
Inventor
刘庆功
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinan Inspur Data Technology Co Ltd
Original Assignee
Jinan Inspur Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinan Inspur Data Technology Co Ltd filed Critical Jinan Inspur Data Technology Co Ltd
Priority to CN202311611129.1A priority Critical patent/CN117786685A/en
Publication of CN117786685A publication Critical patent/CN117786685A/en
Pending legal-status Critical Current

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The invention relates to the technical field of computers, and discloses a virus identification method, a device, computer equipment and a storage medium. The method comprises the following steps: obtaining the total number of behaviors of each of a plurality of system behaviors of an operating system; acquiring the number of times a network interaction behavior occurs in the current period, wherein the network interaction behavior is one of the plurality of system behaviors; determining the network interaction frequency of the operating system according to the number of times the network interaction behavior occurs in the current period and a preset number of times threshold; determining the system behavior entropy of the operating system according to the total number of behaviors of each of the plurality of system behaviors, wherein the system behavior entropy indicates the degree of behavior confusion of the operating system; determining the behavior complexity of the operating system according to the network interaction frequency and the system behavior entropy; and identifying whether a virus exists in the operating system according to the behavior complexity. The invention can discover viruses in time and avoid problems such as data loss.

Description

Virus identification method, device, computer equipment and storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method and apparatus for identifying viruses, a computer device, and a storage medium.
Background
In the field of computer technology, antivirus software is generally used to monitor for viruses and to remove them once they are detected.
However, antivirus software relies on a virus library to prevent virus intrusion. Viruses are often updated quickly, and the virus library has difficulty recording new viruses or virus characteristics in time, so viruses cannot be found in time, which causes problems such as data loss.
Disclosure of Invention
In view of the above, the present invention provides a virus identification method, apparatus, computer device and storage medium, so as to solve the problems of data loss caused by failure to find viruses in time.
In a first aspect, the present invention provides a method for identifying a virus, comprising:
acquiring the total number of behaviors of each of a plurality of system behaviors of an operating system;
acquiring the number of times a network interaction behavior occurs in a current period, wherein the network interaction behavior is one of the plurality of system behaviors;
determining the network interaction frequency of the operating system according to the number of times the network interaction behavior occurs in the current period and a preset number of times threshold;
determining the system behavior entropy of the operating system according to the total number of behaviors of each of the plurality of system behaviors, wherein the system behavior entropy is used for indicating the degree of behavior confusion of the operating system;
determining the behavior complexity of the operating system according to the network interaction frequency and the system behavior entropy;
and identifying whether viruses exist in the operating system according to the behavior complexity.
The virus identification method provided by the invention has the following advantages:
Most viruses invade computer devices in order to obtain benefits, for example by encrypting files and demanding payment from the user before decrypting them, or by directly stealing files. To achieve this purpose, the virus operates on files through frequent system behaviors and transmits the acquired information through frequent network interactions; that is, the degree of behavior confusion rises and the network interaction frequency rises. Therefore, the real-time system behavior characteristic (the behavior complexity) can be determined from the real-time network interaction frequency and the system behavior entropy of the computer device, and whether a virus has invaded can then be identified accurately from that characteristic. In addition, although viruses are updated quickly, the behavior characteristics of a system after virus invasion differ greatly from those of a system in a normal working state, and those post-invasion characteristics do not change with virus updates. The scheme is therefore designed around this key point and does not need to account for virus updates.
In an alternative embodiment, the determining the system behavior entropy of the operating system according to the total number of behaviors of each of the plurality of system behaviors includes:
determining the total system behavior times of the operating system according to the total number of behaviors of each of the plurality of system behaviors, wherein the total system behavior times is $C=\sum_{i=1}^{k} C(i)$, C(i) is the total number of behaviors corresponding to the i-th system behavior, i is a positive integer greater than or equal to 1, and k is a positive integer greater than or equal to 2;
determining a system call frequency corresponding to each system behavior according to the total number of behaviors of each system behavior and the total system behavior times, wherein the system call frequency corresponding to the i-th system behavior is $P(i)=C(i)/C$;
determining a subsystem behavior entropy corresponding to a first system behavior according to the system call frequency corresponding to the first system behavior, wherein the subsystem behavior entropy corresponding to the i-th system behavior is $S_i=P(i)\log P(i)$, and the first system behavior is any one of the plurality of system behaviors;
determining the system behavior entropy according to the subsystem behavior entropies respectively corresponding to the plurality of system behaviors, wherein the system behavior entropy is $S=-\sum_{i=1}^{k} S_i$.
Specifically, because a virus invades the computer device to acquire information, it frequently encrypts, decrypts, reads and modifies files, and performs frequent network connection, process creation and network interaction to transmit that information. As a result, the system behaviors performed by the computer device are more disordered than during normal operation. Using the system behavior entropy as one of the conditions for judging whether a virus exists therefore makes the identification result more accurate.
In an alternative embodiment, the determining the behavior complexity of the operating system according to the network interaction frequency and the system behavior entropy includes:
determining a behavior mode index of the operating system according to the network interaction frequency and the system behavior entropy;
and determining the behavior complexity according to the behavior mode index and the system behavior entropy.
Specifically, different viruses differ slightly in how they operate on data in the computer device, but whichever virus invades, network interaction behavior is bound to be very frequent. In addition, the network interaction frequency reflects the network interaction mode of the operating system, and the system behavior entropy reflects the current overall behavior mode of the operating system. The behavior pattern of the operating system can therefore be represented accurately by combining the network interaction characteristic, which is the most indispensable among the system behaviors, with the overall system behavior characteristic. Using this behavior pattern as one of the conditions for judging whether a virus exists makes the identification result more accurate.
In an alternative embodiment, the identifying whether the virus exists in the operating system according to the behavior complexity includes:
determining whether the behavior complexity is greater than or equal to a preset behavior complexity threshold;
when the behavior complexity is determined to be greater than or equal to a preset behavior complexity threshold, identifying that viruses exist in the operating system;
or,
and when the behavior complexity is determined to be smaller than a preset behavior complexity threshold, identifying that no virus exists in the operating system.
Specifically, after invading the computer device, most viruses operate on files through frequent system behaviors and transmit the acquired information through frequent network interactions; that is, the degree of behavior confusion is high and the network interaction frequency is high. This differs markedly from the system behavior characteristics of a computer device in a normal working state. Therefore, the real-time system behavior characteristic (the behavior complexity) can be determined from the real-time network interaction frequency and the system behavior entropy of the computer device, and whether the operating system has a virus can then be identified by comparing the calculated behavior complexity with a preset behavior complexity threshold.
In an optional implementation manner, the network interaction frequency of the operating system is determined according to the number of times of the network interaction behavior in the current period and a preset number of times threshold, and the following expression is adopted:
NFI=ΔN/N……(1)
wherein NFI is the network interaction frequency, ΔN is the number of times the network interaction behavior occurs in the current period, and N is the preset number of times threshold.
Specifically, after a virus invades the computer device, the target end (the initiating end of the invading virus) interacts frequently with the computer device over the network in order to steal information; that is, the frequency of network interaction behavior rises after the invasion. Using the network interaction frequency as one of the conditions for judging whether a virus exists therefore makes the identification result more accurate.
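As a minimal sketch of expression (1), the Python function below divides the number of network interactions observed in the current period by the preset threshold. The function name `network_interaction_frequency` and the input checks are illustrative assumptions and are not part of the disclosure.

```python
def network_interaction_frequency(delta_n: int, n_threshold: int) -> float:
    """Expression (1): NFI = delta_n / n_threshold.

    delta_n     -- network interactions counted in the current period
    n_threshold -- preset number of times threshold
    """
    if n_threshold <= 0:
        # Assumption: the source does not say how an invalid threshold is handled.
        raise ValueError("n_threshold must be positive")
    if delta_n < 0:
        raise ValueError("delta_n must be non-negative")
    return delta_n / n_threshold


# Hypothetical values: 300 interactions this period against a threshold of 600.
nfi = network_interaction_frequency(300, 600)
```

A value above 1 would mean the current period alone already exceeds the count regarded as normal for the reference duration.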
In an alternative embodiment, the behavior pattern index of the operating system is determined according to the network interaction frequency and the system behavior entropy using the following expression:
NBD=α*NFI+β*S……(2)
wherein NBD is the behavior pattern index, α is a first preset coefficient, NFI is the network interaction frequency, β is a second preset coefficient, and S is the system behavior entropy.
Specifically, different viruses differ slightly in how they operate on data in the computer device, but whichever virus invades, network interaction behavior is bound to be very frequent. In addition, the network interaction frequency reflects the network interaction mode of the operating system, and the system behavior entropy reflects the current overall behavior mode of the operating system. The behavior pattern of the operating system can therefore be represented accurately by combining the network interaction characteristic, which is the most indispensable among the system behaviors, with the overall system behavior characteristic. Using this behavior pattern as one of the conditions for judging whether a virus exists makes the identification result more accurate.
In an alternative embodiment, the behavior complexity is determined according to the behavior pattern index and the system behavior entropy using the following expression:
RVAS=NBD+γ*NBD*S……(3)
wherein RVAS is the behavior complexity, γ is a third preset coefficient, NBD is the behavior pattern index, and S is the system behavior entropy.
Specifically, in the above formula, the behavior complexity consists of two parts: the independent influence (NBD) of the two indexes, namely the network interaction frequency and the system behavior entropy, and the mutual influence (γ*NBD*S) between those two indexes. Combining the independent influence of each index with the mutual influence between the indexes allows the behavior characteristics of the operating system to be determined more accurately.
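Expressions (2) and (3) can be sketched together in Python as follows. The coefficient values and inputs are hypothetical placeholders chosen only to exercise the formulas; the source does not give values for the three preset coefficients, and the function names are illustrative.

```python
def behavior_pattern_index(nfi: float, s: float, alpha: float, beta: float) -> float:
    """Expression (2): NBD = alpha * NFI + beta * S."""
    return alpha * nfi + beta * s


def behavior_complexity(nbd: float, s: float, gamma: float) -> float:
    """Expression (3): RVAS = NBD + gamma * NBD * S."""
    return nbd + gamma * nbd * s


# Hypothetical inputs and coefficients (not taken from the source).
nbd = behavior_pattern_index(nfi=2.0, s=1.5, alpha=0.5, beta=0.5)
rvas = behavior_complexity(nbd, s=1.5, gamma=0.2)
```

Because the second term multiplies NBD by S, a rise in entropy increases the complexity twice over: once through NBD and once through the interaction term.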
In a second aspect, the present invention provides a virus identification device comprising:
an acquisition module, configured to acquire the total number of behaviors of each of a plurality of system behaviors of an operating system, and to acquire the number of times a network interaction behavior occurs in a current period, wherein the network interaction behavior is one of the plurality of system behaviors;
a determining module, configured to determine the network interaction frequency of the operating system according to the number of times the network interaction behavior occurs in the current period and a preset number of times threshold; to determine the system behavior entropy of the operating system according to the total number of behaviors of each of the plurality of system behaviors, wherein the system behavior entropy is used for indicating the degree of behavior confusion of the operating system; and to determine the behavior complexity of the operating system according to the network interaction frequency and the system behavior entropy;
and an identification module, configured to identify whether a virus exists in the operating system according to the behavior complexity.
In a third aspect, the present invention provides a computer device comprising a memory and a processor that are communicatively connected, wherein the memory stores computer instructions and the processor executes the computer instructions so as to perform the virus identification method of the first aspect or any of its corresponding embodiments.
In a fourth aspect, the present invention provides a computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the virus identification method of the first aspect or any of its corresponding embodiments.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of an operating system according to an embodiment of the present invention;
FIG. 2 is a flow chart of a virus identification method according to an embodiment of the present invention;
FIG. 3 is a flow chart of another virus identification method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the structure of a virus identification device according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a hardware structure of a computer device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In the field of computer technology, an operating system, for example a Linux system or a Windows system, is generally installed on a computer device. The Linux system may include an application layer, a kernel layer, and the like, as shown in FIG. 1. The kernel layer may be used to control hardware resources of the computer device, for example allocating memory resources and coordinating the processing resources of the central processing unit (CPU). The application layer may be configured to respond to and directly process operation instructions of a user. The application layer and the kernel layer interact through system call functions. The method provided by the embodiment of the invention is mainly executed by the kernel layer of the Linux system.
Viruses generally encrypt or steal files stored in a data center, and the problems of secret leakage, data loss and the like can be caused. Therefore, the computer equipment needs to monitor whether viruses exist in real time, and intercept processing is performed in time when the viruses are monitored.
The embodiment of the invention provides a virus identification method, which can accurately identify viruses and avoid the problem of data loss by determining whether viruses exist or not according to the behavior characteristics of an operating system.
According to an embodiment of the present invention, a virus identification method embodiment is provided. It should be noted that the steps shown in the flowcharts of the figures may be performed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps may be performed in an order different from that shown or described herein.
In this embodiment, a virus identification method is provided, which may be used in the above-mentioned computer device, such as a server, a terminal, etc., and fig. 2 is a flowchart of a virus identification method according to an embodiment of the present invention, and as shown in fig. 2, the flowchart includes the following steps:
Step S201, the total number of behaviors of each of a plurality of system behaviors of an operating system is obtained.
The various system actions may include, among other things, opening, reading, modifying, deleting files stored in the computer device, as well as network connection actions, process creation actions, network interaction actions, etc.
Specifically, a technician may add hooks to related system behavior functions in an operating system in advance, where the hooks are Hook functions in kernel code, and may capture event information.
For example, for network interaction behavior, all network packets may be sent through the ndo_start_xmit() function and the dev_queue_xmit() function in sequence, and all network packets may be received through the napi_gro_receive() function, so a technician may add hooks to these functions to capture network interaction behavior. For file open behavior, the technician may add a hook in the sys_open() function in entry_SYSCALL_64 of the arch/x86/entry/common.c file; for network connection behavior, the technician may add a hook in the sys_connect() function in entry_SYSCALL_64 of the arch/x86/entry/common.c file; for process creation behavior, the technician may add a hook in the do_fork() function in entry_SYSCALL_64 of the arch/x86/entry/common.c file.
In addition, a technician may set a plurality of global variables in a kernel layer of the operating system, where each global variable in the plurality of global variables is used to record the number of behaviors of a specific system behavior. Alternatively, the technician may set a global array in the kernel layer of the operating system, where the global array may record the correspondence between the system behavior and the behavior times. The computer device increments the number of actions corresponding to the system action by one each time the system action is captured in the system action function by the hook. For example, corresponding to the network interaction behavior example described above, each time the computer device captures a network interaction behavior through any of the three functions described above, the global variable corresponding to the network interaction behavior is incremented.
Each time the computer device is powered on, it may reset the global variables or the global array to the initial state (that is, reset every behavior count to zero).
Thus, each time a periodic trigger moment is reached, the computer device may obtain, for each system behavior, the corresponding total number of behaviors from the global variable for that system behavior. Alternatively, each time a periodic trigger moment is reached, the computer device may obtain the total numbers of behaviors for all system behaviors from the global array.
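The counter bookkeeping described above can be modelled in user space as follows. This is an illustrative Python sketch rather than kernel code: the class `BehaviorCounters`, its method names and the behavior names are hypothetical, and a real implementation would sit behind kernel hooks with suitable atomic counters.

```python
from collections import Counter
import threading


class BehaviorCounters:
    """User-space stand-in for the global array of per-behavior counts."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()
        self._lock = threading.Lock()

    def on_behavior(self, name: str) -> None:
        """Called by a hook each time a system behavior is captured."""
        with self._lock:
            self._counts[name] += 1

    def reset(self) -> None:
        """Restore the initial state, for example after power-on."""
        with self._lock:
            self._counts.clear()

    def snapshot(self) -> dict:
        """Return the total count of every behavior at a periodic trigger."""
        with self._lock:
            return dict(self._counts)


counters = BehaviorCounters()
for event in ["file_open", "network_interaction", "network_interaction"]:
    counters.on_behavior(event)
totals = counters.snapshot()
```

Returning a copy from `snapshot` keeps the periodic reader from observing counts that change while it is computing.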
Step S202, the behavior times of the network interaction behavior in the current period are obtained.
Specifically, for the network interaction behavior, a technician may set a timer that triggers once every first preset duration; for example, the first preset duration may be one minute. Starting from each trigger time, the computer device may record the number of network interaction behaviors that occur within the following first preset duration.
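One possible way to obtain the per-period count is to take the difference of a cumulative counter at successive timer triggers. The sketch below assumes such a cumulative count of network interaction behaviors is available; the class `PeriodCounter`, its method names and the reset handling are hypothetical, and the source does not prescribe this particular mechanism.

```python
class PeriodCounter:
    """Derives the count for each period from a cumulative total."""

    def __init__(self, initial_total: int = 0) -> None:
        self._last_total = initial_total

    def on_timer(self, current_total: int) -> int:
        """Return the number of behaviors seen since the previous trigger."""
        if current_total < self._last_total:
            # Assumption: a smaller total means the cumulative counter was
            # reset (for example after a reboot), so counting restarts at zero.
            self._last_total = 0
        delta = current_total - self._last_total
        self._last_total = current_total
        return delta


period = PeriodCounter()
# Hypothetical cumulative totals read at three consecutive timer triggers.
deltas = [period.on_timer(total) for total in (120, 450, 30)]
```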
Step S203, determining the network interaction frequency of the operating system according to the behavior times of the network interaction behavior in the current period and a preset number of times threshold.
Specifically, the preset number of times threshold may be the number of network interactions that the computer device performs in a normal working state within a second preset duration. The second preset duration is longer than the first preset duration; for example, the second preset duration may be one hour. The ratio of the number of times the network interaction behavior occurs in the current period to the preset number of times threshold is then determined as the network interaction frequency of the operating system.
In addition, the number of network interaction behaviors in a normal working state differs between situations, for example between time periods, computer device models, networking states (such as connection to a subnet or to an external network) and operating systems, so the preset number of times threshold differs as well. A technician therefore needs to carry out testing to determine the correspondence between the time period, the computer device model, the networking state, the operating system type and the preset number of times threshold. While executing the method, the computer device can select the corresponding preset number of times threshold from this correspondence according to its current time period, its model, its networking state and its operating system type. Selecting the threshold according to the characteristics and real-time conditions of the computer device allows the network interaction frequency of the operating system to be determined more accurately.
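The correspondence described above behaves like a lookup table keyed by the four conditions. The Python sketch below illustrates this under stated assumptions: the key labels, the threshold values and the function name `select_threshold` are hypothetical placeholders, since the source leaves the actual values to be determined by testing.

```python
from typing import Dict, Tuple

# Key: (time period, device model, networking state, operating system type).
# All entries are hypothetical placeholders; real values come from testing.
ThresholdKey = Tuple[str, str, str, str]

THRESHOLDS: Dict[ThresholdKey, int] = {
    ("daytime", "model-a", "external", "linux"): 6000,
    ("night", "model-a", "external", "linux"): 1500,
    ("daytime", "model-a", "subnet", "linux"): 2000,
}


def select_threshold(period: str, model: str, net_state: str, os_type: str) -> int:
    """Look up the preset number of times threshold for the current conditions."""
    key = (period, model, net_state, os_type)
    try:
        return THRESHOLDS[key]
    except KeyError:
        # Assumption: an uncalibrated combination is treated as an error.
        raise LookupError(f"no threshold calibrated for {key}") from None


n = select_threshold("night", "model-a", "external", "linux")
```

Failing loudly on an uncalibrated combination is one design choice; falling back to a default threshold would be another, but the source does not say which is intended.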
Step S204, determining the system behavior entropy of the operating system according to the total number of behaviors of each of the plurality of system behaviors.
The system behavior entropy is used for indicating the behavior confusion of the operating system.
Specifically, the computer device may first determine the proportion of each system behavior among all system behaviors according to the total number of behaviors of each system behavior. The system behavior entropy of the operating system is then determined according to the proportion corresponding to each system behavior.
Step S205, determining the behavior complexity of the operating system according to the network interaction frequency and the system behavior entropy.
Specifically, the computer device may directly sum the network interaction frequency and the system behavior entropy to obtain the behavior complexity of the operating system.
Step S206, identifying whether viruses exist in the operating system according to the behavior complexity.
Specifically, after calculating the behavior complexity, the computer device may determine whether the behavior complexity falls within a preset range. If so, a virus is present in the operating system, and the computer device performs a virus interception operation. The virus interception operation may include interrupting the network connection and other ongoing processes, and scanning all disk data of the computer device in order to delete virus files. If not, no virus is present in the operating system. In addition, after intercepting a virus, the computer device may display a pop-up window to alert the user that a virus has been intercepted.
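Steps S205 and S206 of this embodiment can be sketched in Python as follows. The function name `identify_virus`, the inclusive range bounds and the `intercept` callback are assumptions made for illustration; the source states only that the complexity is the direct sum of the two quantities and that a value inside the preset range indicates a virus.

```python
from typing import Callable, Tuple


def identify_virus(
    nfi: float,
    entropy: float,
    preset_range: Tuple[float, float],
    intercept: Callable[[], None],
) -> bool:
    """Sum NFI and entropy, then test the result against a preset range.

    Treating both bounds of preset_range as inclusive is an assumption.
    """
    complexity = nfi + entropy
    low, high = preset_range
    found = low <= complexity <= high
    if found:
        # Stand-in for interception: cut connections, stop processes, scan disk.
        intercept()
    return found


calls = []
# Hypothetical values: complexity 4.5 against a range starting at 4.0.
result = identify_virus(3.0, 1.5, (4.0, float("inf")), lambda: calls.append("intercept"))
```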
In the virus identification method provided in this embodiment, most viruses invade computer devices in order to obtain benefits, for example by encrypting files and demanding payment from the user before decrypting them, or by directly stealing files. To achieve this purpose, the virus operates on files through frequent system behaviors and transmits the acquired information through frequent network interactions; that is, the degree of behavior confusion rises and the network interaction frequency rises. Therefore, the real-time system behavior characteristic (the behavior complexity) can be determined from the real-time network interaction frequency and the system behavior entropy of the computer device, and whether a virus has invaded can then be identified accurately from that characteristic. In addition, although viruses are updated quickly, the behavior characteristics of a system after virus invasion differ greatly from those of a system in a normal working state, and those post-invasion characteristics do not change with virus updates. The scheme is therefore designed around this key point and does not need to account for virus updates.
In this embodiment, a virus identification method is provided, which may be used in the above-mentioned computer device, such as a server, a terminal, etc., and fig. 3 is a flowchart of a virus identification method according to an embodiment of the present invention, and as shown in fig. 3, the flowchart includes the following steps:
Step S301, obtaining the total number of behaviors of each of the multiple system behaviors of the operating system.
Step S302, the behavior times of the network interaction behavior in the current period are obtained.
The specific processing of step S301 to step S302 may be similar to that of step S201 to step S202, and will not be described here.
Step S303, determining the network interaction frequency of the operating system according to the behavior times of the network interaction behavior in the current period and a preset number of times threshold.
Specifically, the above step S303 may employ the following expression:
NFI=ΔN/N……(1)
wherein NFI is the network interaction frequency, ΔN is the number of times the network interaction behavior occurs in the current period, and N is the preset number of times threshold.
After a virus invades the computer device, the target end (the initiating end of the invading virus) interacts frequently with the computer device over the network in order to steal information; that is, the frequency of network interaction behavior rises after the invasion. The network interaction frequency can therefore serve as one of the conditions for determining whether a virus is present on the computer device.
Step S304, determining the system behavior entropy of the operating system according to the total number of behaviors of each system behavior in the plurality of system behaviors.
Specifically, the step S304 includes:
Step S3041, determining the total system behavior times of the operating system according to the total number of behaviors of each of the plurality of system behaviors.
Specifically, the computer device may sum the total numbers of behaviors corresponding to all the system behaviors to obtain the total system behavior times.
Step S3042, determining the system call frequency corresponding to each system behavior according to the total number of behaviors of that system behavior and the total system behavior times.
Specifically, for each system behavior, the computer device may determine the ratio of the total number of behaviors corresponding to the system behavior to the total system behavior times as the system call frequency corresponding to the system behavior.
Step S3043, determining the subsystem behavior entropy corresponding to the first system behavior according to the system call frequency corresponding to the first system behavior.
Wherein the first system behavior is any one of a plurality of system behaviors.
Specifically, for each system behavior, the computer device may first calculate a logarithm of a system call frequency corresponding to the system behavior, and further determine a product of the logarithm and the system call frequency as a subsystem behavior entropy corresponding to the system behavior.
Step S3044, determining the system behavior entropy according to the subsystem behavior entropy corresponding to the various system behaviors.
Specifically, the computer device may sum the subsystem behavior entropies corresponding to all the system behaviors respectively to obtain the system behavior entropies of the operating system.
The above step S304 may employ the following expressions:
$C=\sum_{i=1}^{k} C(i)$……(2)
$P(i)=C(i)/C$……(3)
$S_i=P(i)\log P(i)$……(4)
$S=-\sum_{i=1}^{k} S_i$……(5)
wherein C is the total system behavior times, C(i) is the total number of behaviors corresponding to the i-th system behavior, i is a positive integer greater than or equal to 1, k is a positive integer greater than or equal to 2, P(i) is the system call frequency corresponding to the i-th system behavior, S_i is the subsystem behavior entropy corresponding to the i-th system behavior, and S is the system behavior entropy.
For example, consider a group of data containing three system behaviors A, B, and C, where the total number of behaviors corresponding to system behavior A is 20, that corresponding to system behavior B is 30, that corresponding to system behavior C is 50, and the total number of system behaviors is 100. Substituting into the above expressions and taking the logarithm to base 2 yields: S=-(20/100)*log(20/100)-(30/100)*log(30/100)-(50/100)*log(50/100)≈1.49.
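As an illustration only, steps S3041 to S3044 can be sketched in Python as follows. The function name system_behavior_entropy, the base-2 logarithm, and the handling of zero counts are assumptions made for this sketch; the embodiment does not specify the base of the logarithm.

```python
import math


def system_behavior_entropy(behavior_totals, base=2.0):
    """Return the system behavior entropy S for per-behavior totals C(i)."""
    total = sum(behavior_totals)            # S3041: C is the sum of all C(i)
    if total <= 0:
        raise ValueError("total number of system behaviors must be positive")
    entropy = 0.0
    for count in behavior_totals:
        if count == 0:
            continue  # assumption: a behavior that never occurred contributes 0
        p = count / total                   # S3042: P(i) = C(i) / C
        s_i = p * math.log(p, base)         # S3043: S_i = P(i) * log P(i)
        entropy -= s_i                      # S3044: S = -(sum of all S_i)
    return entropy


# Counts taken from the example above: behaviors A, B and C.
print(round(system_behavior_entropy([20, 30, 50]), 2))
```

A more even distribution of counts gives a larger value, which is why S can serve as a measure of how chaotic the system behavior is.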
A virus invades the computer device in order to acquire relevant information. It therefore frequently encrypts and decrypts files, reads files, and modifies files, and it frequently performs network connection, process creation, network interaction, and the like in order to transmit that information. At this time, the system behavior of the computer device is more chaotic than during normal operation. In summary, the degree of confusion in the system behavior of the operating system may be used as one of the conditions for determining whether a virus exists in the computer device.
Step S305, determining the behavior complexity of the operating system according to the network interaction frequency and the system behavior entropy.
Specifically, the step S305 includes:
step S3051, determining a behavior mode index of the operating system according to the network interaction frequency and the system behavior entropy.
In some possible implementations, the above step S3051 may use the following expression:
NBD=α*NFI+β*S……(6)
wherein NBD is a behavior mode index, alpha is a first preset coefficient, NFI is a network interaction frequency, beta is a second preset coefficient, and S is a system behavior entropy.
Different viruses differ slightly in how they operate on the data in the computer device; however, whichever virus invades, the network interaction behavior is necessarily very frequent. In addition, the network interaction frequency may reflect the network interaction mode of the operating system, and the system behavior entropy may reflect the current overall behavior mode of the operating system. Therefore, the behavior mode of the operating system can be accurately represented by the network interaction behavior characteristics, which are indispensable among the various system behaviors, together with the overall system behavior characteristics. Further, using the behavior mode as one of the conditions for judging whether a virus exists makes the result of virus identification more accurate.
Because different computer devices exhibit different behavior characteristics when processing services, technicians can determine, through multiple tests, a first preset coefficient and a second preset coefficient that match the characteristics of each model of computer device. Therefore, by adopting different first and second preset coefficients for different situations, the behavior characteristics of the operating system can be represented more accurately.
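The weighted combination in formula (6), with coefficients chosen per device model, can be sketched as follows. The model names and coefficient values in COEFFICIENTS are hypothetical placeholders, not values from the embodiment; in practice they would come from the tests described above.

```python
# Hypothetical (alpha, beta) pairs per device model; placeholders only.
COEFFICIENTS = {
    "model-a": (0.6, 0.4),
    "model-b": (0.5, 0.5),
}


def behavior_mode_index(nfi, entropy, model):
    """Return NBD = alpha * NFI + beta * S for the given device model."""
    alpha, beta = COEFFICIENTS[model]
    return alpha * nfi + beta * entropy


# Example call with made-up inputs: NFI = 1.5, S = 1.2.
print(behavior_mode_index(1.5, 1.2, "model-a"))
```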
Step S3052, determining the behavior complexity according to the behavior mode index and the system behavior entropy.
In some possible implementations, the above step S3052 may use the following expression:
RVAS=NBD+γ*NBD*S……(7)
wherein RVAS is the behavior complexity, gamma is a third preset coefficient, NBD is the behavior mode index, and S is the system behavior entropy.
In the above formula (7), the behavior complexity is mainly composed of the independent influence of the two indexes, namely the network interaction frequency and the system behavior entropy (NBD, see formula (6)), and the mutual influence of the two indexes (γ*NBD*S). Therefore, by combining the independent influence of each index with the mutual influence between the indexes, the behavior characteristics of the operating system can be determined more accurately.
Because different computer devices exhibit different behavior characteristics when processing services, technicians can determine, through multiple tests, a third preset coefficient that matches the characteristics of each model of computer device. Therefore, by adopting different third preset coefficients for different situations, the behavior characteristics of the operating system can be represented more accurately.
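Formula (7) can be sketched in the same way. The function name and the sample values below are assumptions for illustration; gamma would be determined by testing as described above.

```python
def behavior_complexity(nbd, entropy, gamma):
    """Return RVAS = NBD + gamma * NBD * S.

    The first term is the independent influence of the two indexes (NBD);
    the second term is their mutual influence (gamma * NBD * S).
    Algebraically this equals NBD * (1 + gamma * S).
    """
    return nbd + gamma * nbd * entropy


# Made-up inputs: NBD = 2.0, S = 1.5, gamma = 0.5.
print(behavior_complexity(2.0, 1.5, 0.5))
```

Because the entropy S appears both inside NBD and in the interaction term, a rise in behavior confusion increases RVAS faster than a purely linear combination would.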
Step S306, whether viruses exist in the operating system or not is identified according to the behavior complexity.
Specifically, the step S306 includes:
step S3061, determining whether the behavior complexity is greater than or equal to a preset behavior complexity threshold.
The preset behavior complexity threshold may be the lowest behavior complexity of the operating system determined during virus intrusion tests.
Specifically, the computer device may compare the behavior complexity calculated in the above process with the preset behavior complexity threshold to determine whether a virus exists in the operating system.
Because different models of computer devices exhibit different behavior characteristics when processing services, technicians can determine, through multiple tests, a preset behavior complexity threshold that matches the characteristics of each computer device. Therefore, by adopting different preset behavior complexity thresholds for different situations, whether a virus exists in the operating system can be determined more accurately.
Step S3062, when the behavior complexity is determined to be greater than or equal to a preset behavior complexity threshold, identifying that viruses exist in the operating system.
Specifically, when the behavior complexity calculated in the above process is determined to be greater than or equal to the preset behavior complexity threshold, the behavior characteristics of the current operating system differ greatly from those of the normal working state and most likely reflect behavior after a virus intrusion. At this time, it can be determined that a virus exists in the operating system, and a virus interception operation may be performed. The virus interception operation may include interrupting the network connection and other ongoing processes, and scanning all disk data of the computer device in order to delete virus files.
Step S3063, when the behavior complexity is determined to be smaller than the preset behavior complexity threshold, identifying that no virus exists in the operating system.
Specifically, when the behavior complexity calculated in the above process is determined to be smaller than the preset behavior complexity threshold, the difference between the behavior characteristics of the current operating system and those of the normal working state is small, that is, the operating system is working normally, and it can be determined that no virus exists in the operating system.
The virus identification method provided by this embodiment relies on the fact that most viruses, after invading the computer device, operate on files through frequent system behaviors and transmit the acquired information through frequent network interactions, so that both the behavior confusion degree and the network interaction frequency are high. This differs markedly from the system behavior characteristics of a computer device in the normal working state. Therefore, the real-time system behavior characteristics (the behavior complexity) can be determined from the real-time network interaction frequency and system behavior entropy of the computer device, and whether a virus exists in the operating system can then be identified by comparing the calculated behavior complexity with the preset behavior complexity threshold.
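The comparison in steps S3061 to S3063 reduces to a single threshold test, sketched below. The function name and the threshold value 3.0 are hypothetical; the embodiment obtains the threshold from virus intrusion tests.

```python
def identify_virus(complexity, threshold):
    """Return True when RVAS >= the preset threshold (S3062), else False (S3063)."""
    return complexity >= threshold


# Made-up behavior complexity values checked against a made-up threshold.
for rvas in (2.0, 3.0, 3.5):
    verdict = "virus present" if identify_virus(rvas, threshold=3.0) else "no virus"
    print(rvas, verdict)
```

Note that a value exactly equal to the threshold is treated as a virus, matching the "greater than or equal to" wording of step S3062.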
In some possible implementations, the specific process of identifying whether the operating system has a virus may further include:
Multiple virus intrusion tests are carried out with computer devices of the same model, operating systems of the same type, and the same networking state. During each virus intrusion test, the computer device may calculate the corresponding behavior complexity according to the above steps, and then take the model of the computer device, the type of the operating system, the networking state, the network interaction frequency, the system behavior entropy, and the behavior complexity of that test as one piece of data. In this way, a plurality of pieces of data can be obtained. Further, the plurality of pieces of data are input into a feature extraction model to obtain the behavior features of the operating system after virus intrusion for that model of computer device, type of operating system, and networking state. Different models of computer devices, different types of operating systems, and different networking states may correspond to different behavior features. A behavior feature library can thus be obtained during the tests. The behavior feature library may include the correspondence among the model of the computer device, the type of the operating system, the networking state, and the behavior features.
During implementation of the scheme, the computer device collects one piece of real-time data (likewise comprising the model of the computer device, the type of the operating system, the networking state, the network interaction frequency, the system behavior entropy, and the behavior complexity) and obtains real-time behavior features from the feature extraction model. In addition, the computer device can select the corresponding behavior features from the behavior feature library according to its model, the type of the operating system, and the real-time networking state. Finally, the computer device can calculate the similarity between the behavior features corresponding to the real-time data and the selected behavior features, identify that a virus exists in the operating system when the similarity is greater than or equal to a preset similarity threshold, and identify that no virus exists in the operating system when the similarity is less than the preset similarity threshold.
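A minimal sketch of the library lookup and similarity comparison is given below. The embodiment does not specify the feature extraction model or the similarity measure, so this sketch assumes that behavior features are already available as numeric vectors and uses cosine similarity; the library contents and key names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (an assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_by_similarity(library, key, realtime_features, threshold):
    """Return True/False for virus present/absent, or None if no entry matches.

    `key` is (device model, operating system type, networking state).
    """
    reference = library.get(key)
    if reference is None:
        return None
    return cosine_similarity(realtime_features, reference) >= threshold


# Hypothetical library entry built from intrusion tests.
library = {("model-a", "linux", "online"): [1.0, 2.0, 3.0]}
print(identify_by_similarity(library, ("model-a", "linux", "online"),
                             [1.1, 2.0, 2.9], threshold=0.9))
```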
In this embodiment, a virus identification device is further provided. The device is used to implement the foregoing embodiments and preferred embodiments, and what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. While the device described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
The present embodiment provides a virus identification device, as shown in fig. 4, including:
an obtaining module 401, configured to obtain a total number of behaviors of each of a plurality of system behaviors of an operating system; and to obtain the number of behaviors of the network interaction behavior in the current period, wherein the network interaction behavior is one of the plurality of system behaviors;
a determining module 402, configured to determine a network interaction frequency of the operating system according to a number of behavior times of the network interaction behavior in a current period and a preset number of times threshold; determining system behavior entropy of the operating system according to the total number of behaviors of each system behavior in a plurality of system behaviors, wherein the system behavior entropy is used for indicating the behavior confusion of the operating system; determining the behavior complexity of an operating system according to the network interaction frequency and the system behavior entropy;
the identifying module 403 is configured to identify whether a virus exists in the operating system according to the behavior complexity.
In an alternative embodiment, the determining module 402 is configured to:
determining the total number of system behaviors of the operating system according to the total number of behaviors of each of the plurality of system behaviors, wherein the total number of system behaviors is C=\sum_{i=1}^{k} C(i), C(i) is the total number of behaviors corresponding to the i-th system behavior, i is a positive integer greater than or equal to 1, and k is a positive integer greater than or equal to 2;
determining a system call frequency corresponding to each system behavior according to the total number of behaviors of each system behavior and the total number of system behaviors, wherein the system call frequency corresponding to the i-th system behavior is P(i)=C(i)/C;
determining a subsystem behavior entropy corresponding to a first system behavior according to a system call frequency corresponding to the first system behavior, wherein the subsystem behavior entropy corresponding to the i-th system behavior is S_i=P(i)*\log P(i), and the first system behavior is any one of the plurality of system behaviors;
determining the system behavior entropy according to the subsystem behavior entropies respectively corresponding to the plurality of system behaviors, wherein the system behavior entropy is S=-\sum_{i=1}^{k} S_i.
In an alternative embodiment, the determining module 402 is configured to:
determining a behavior mode index of an operating system according to the network interaction frequency and the system behavior entropy;
and determining the behavior complexity according to the behavior mode index and the system behavior entropy.
In an alternative embodiment, the identification module 403 is configured to:
determining whether the behavior complexity is greater than or equal to a preset behavior complexity threshold;
when the behavior complexity is determined to be greater than or equal to a preset behavior complexity threshold, identifying that viruses exist in the operating system;
Or,
and when the behavior complexity is determined to be smaller than the preset behavior complexity threshold, identifying that no virus exists in the operating system.
In an alternative embodiment, the network interaction frequency of the operating system is determined according to the number of behaviors of the network interaction behavior in the current period and a preset number threshold, using the following expression:
NFI=ΔN/N……(1)
wherein NFI is the network interaction frequency, ΔN is the number of behaviors of the network interaction behavior in the current period, and N is the preset number threshold.
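Formula (1) is a simple ratio, sketched below. The function name and the sample numbers are assumptions for illustration; the embodiment does not state how the preset number threshold N is chosen.

```python
def network_interaction_frequency(delta_n, n_threshold):
    """Return NFI = delta_N / N for the current period."""
    if n_threshold <= 0:
        raise ValueError("the preset number threshold must be positive")
    return delta_n / n_threshold


# Made-up inputs: 150 network interactions in the current period, N = 100.
print(network_interaction_frequency(150, 100))
```

A value above 1 means that the number of network interactions in the current period exceeds the preset number threshold.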
In an alternative embodiment, the behavior mode index of the operating system is determined according to the network interaction frequency and the system behavior entropy,
the following expression is used:
NBD=α*NFI+β*S……(2)
wherein NBD is the behavior mode index, alpha is a first preset coefficient, NFI is the network interaction frequency, beta is a second preset coefficient, and S is the system behavior entropy.
In an alternative embodiment, the behavior complexity is determined based on the behavior pattern index and the system behavior entropy,
the following expression is used:
RVAS=NBD+γ*NBD*S……(3)
wherein RVAS is the behavior complexity, gamma is a third preset coefficient, NBD is the behavior mode index, and S is the system behavior entropy.
Further functional descriptions of the above modules and units are the same as those of the corresponding embodiments above and are not repeated here.
The virus identification device in this embodiment is presented in the form of functional units, where a unit refers to an ASIC (Application Specific Integrated Circuit), a processor and memory executing one or more software or firmware programs, and/or other devices that can provide the above-described functionality.
The embodiment of the invention also provides computer equipment, which is provided with the virus identification device shown in the figure 4.
Referring to fig. 5, fig. 5 is a schematic structural diagram of a computer device according to an alternative embodiment of the present invention. As shown in fig. 5, the computer device includes: one or more processors 10, a memory 20, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are communicatively coupled to each other using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the computer device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device (such as a display device coupled to an interface). In some alternative embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories. Also, multiple computer devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 10 is taken as an example in fig. 5.
The processor 10 may be a central processor, a network processor, or a combination thereof. The processor 10 may further include a hardware chip. The hardware chip may be an application specific integrated circuit, a programmable logic device, or a combination thereof. The programmable logic device may be a complex programmable logic device, a field programmable gate array, a generic array logic, or any combination thereof.
The memory 20 stores instructions executable by the at least one processor 10 to cause the at least one processor 10 to implement the methods shown in the above embodiments.
The memory 20 may include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created according to the use of the computer device, etc. In addition, the memory 20 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some alternative embodiments, the memory 20 may optionally include memory located remotely from the processor 10, which may be connected to the computer device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Memory 20 may include volatile memory, such as random access memory; the memory may also include non-volatile memory, such as flash memory, hard disk, or solid state disk; the memory 20 may also comprise a combination of the above types of memories.
The computer device further comprises an input device 30 and an output device 40. The processor 10, the memory 20, the input device 30, and the output device 40 may be connected by a bus or in other manners; connection by a bus is taken as an example in fig. 5.
The input device 30, such as a touch screen, a keypad, a mouse, a touch pad, or a pointing stick, may receive input numeric or character information and generate key signal inputs related to user settings and function control of the computer device. The output device 40 may comprise a display device or the like. Such display devices include, but are not limited to, liquid crystal displays, light-emitting diode displays, and plasma displays. In some alternative implementations, the display device may be a touch screen.
The embodiments of the present invention also provide a computer-readable storage medium. The method according to the above embodiments of the present invention may be implemented in hardware or firmware, or as computer code that can be recorded on a storage medium, or as computer code that is originally stored on a remote storage medium or a non-transitory machine-readable storage medium, downloaded through a network, and stored on a local storage medium, so that the method described herein can be carried out by such software stored on a storage medium using a general-purpose computer, a special-purpose processor, or programmable or special-purpose hardware. The storage medium can be a magnetic disk, an optical disk, a read-only memory, a random access memory, a flash memory, a hard disk, a solid state disk, or the like; further, the storage medium may also comprise a combination of memories of the kinds described above. It will be appreciated that a computer, processor, microprocessor controller, or programmable hardware includes a storage element that can store or receive software or computer code that, when accessed and executed by the computer, processor, or hardware, implements the methods illustrated by the above embodiments.
Although embodiments of the present invention have been described in connection with the accompanying drawings, various modifications and variations may be made by those skilled in the art without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope of the invention as defined by the appended claims.

Claims (10)

1. A method of virus identification, the method comprising:
acquiring the total number of behaviors of each of a plurality of system behaviors of an operating system;
acquiring the behavior times of network interaction behaviors in a current period, wherein the network interaction behaviors are one of a plurality of system behaviors;
determining the network interaction frequency of the operating system according to the behavior times of the network interaction behavior in the current period and a preset time threshold;
determining system behavior entropy of the operating system according to the total number of behaviors of each of a plurality of system behaviors, wherein the system behavior entropy is used for indicating the behavior confusion of the operating system;
determining the behavior complexity of the operating system according to the network interaction frequency and the system behavior entropy;
and identifying whether viruses exist in the operating system according to the behavior complexity.
2. The method of claim 1, wherein determining the system behavior entropy of the operating system based on the total number of behaviors of each of the plurality of system behaviors comprises:
determining the total number of system behaviors of the operating system according to the total number of behaviors of each of the plurality of system behaviors, wherein the total number of system behaviors is C=\sum_{i=1}^{k} C(i), C(i) is the total number of behaviors corresponding to the i-th system behavior, i is a positive integer greater than or equal to 1, and k is a positive integer greater than or equal to 2;
determining a system call frequency corresponding to each system behavior according to the total number of behaviors of each system behavior and the total number of system behaviors, wherein the system call frequency corresponding to the i-th system behavior is P(i)=C(i)/C;
determining a subsystem behavior entropy corresponding to a first system behavior according to the system call frequency corresponding to the first system behavior, wherein the subsystem behavior entropy corresponding to the i-th system behavior is S_i=P(i)*\log P(i), and the first system behavior is any one of the plurality of system behaviors;
determining the system behavior entropy according to the subsystem behavior entropies respectively corresponding to the plurality of system behaviors, wherein the system behavior entropy is S=-\sum_{i=1}^{k} S_i.
3. The method according to claim 1 or 2, wherein said determining the behavioral complexity of the operating system based on the network interaction frequency and the system behavioral entropy comprises:
determining a behavior mode index of the operating system according to the network interaction frequency and the system behavior entropy;
and determining the behavior complexity according to the behavior mode index and the system behavior entropy.
4. The method according to claim 1 or 2, wherein said identifying whether a virus is present in the operating system based on the behavioral complexity comprises:
determining whether the behavior complexity is greater than or equal to a preset behavior complexity threshold;
when the behavior complexity is determined to be greater than or equal to a preset behavior complexity threshold, identifying that viruses exist in the operating system;
or,
and when the behavior complexity is determined to be smaller than a preset behavior complexity threshold, identifying that no virus exists in the operating system.
5. The method according to claim 1 or 2, wherein the network interaction frequency of the operating system is determined according to the number of behaviors of the network interaction behavior in the current period and the preset number threshold,
the following expression is used:
NFI=ΔN/N……(1)
wherein NFI is the network interaction frequency, ΔN is the number of behaviors of the network interaction behavior in the current period, and N is the preset number threshold.
6. The method of claim 3, wherein said determining a behavior pattern index of said operating system based on said network interaction frequency and said system behavior entropy,
the following expression is used:
NBD=α*NFI+β*S……(2)
wherein NBD is the behavior mode index, alpha is a first preset coefficient, NFI is the network interaction frequency, beta is a second preset coefficient, and S is the system behavior entropy.
7. The method of claim 6, wherein said determining said behavioral complexity is based on said behavioral pattern indicators and said system behavioral entropy,
the following expression is used:
RVAS=NBD+γ*NBD*S……(3)
RVAS is the behavior complexity, gamma is a third preset coefficient, NBD is the behavior mode index, and S is the system behavior entropy.
8. A virus identification device, the device comprising:
an acquisition module, configured to acquire the total number of behaviors of each of a plurality of system behaviors of an operating system; and to acquire the number of behaviors of a network interaction behavior in a current period, wherein the network interaction behavior is one of the plurality of system behaviors;
a determining module, configured to determine the network interaction frequency of the operating system according to the number of behaviors of the network interaction behavior in the current period and a preset number threshold; to determine the system behavior entropy of the operating system according to the total number of behaviors of each of the plurality of system behaviors, wherein the system behavior entropy is used for indicating the behavior confusion degree of the operating system; and to determine the behavior complexity of the operating system according to the network interaction frequency and the system behavior entropy;
an identification module, configured to identify whether a virus exists in the operating system according to the behavior complexity.
9. A computer device, comprising:
a memory and a processor, the memory and the processor being communicatively connected to each other, the memory having stored therein computer instructions, the processor executing the computer instructions to perform the virus identification method of any one of claims 1 to 7.
10. A computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the virus identification method of any one of claims 1 to 7.
CN202311611129.1A 2023-11-28 2023-11-28 Virus identification method, device, computer equipment and storage medium Pending CN117786685A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311611129.1A CN117786685A (en) 2023-11-28 2023-11-28 Virus identification method, device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117786685A true CN117786685A (en) 2024-03-29

Family

ID=90395302

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311611129.1A Pending CN117786685A (en) 2023-11-28 2023-11-28 Virus identification method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117786685A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination