CN114254321A - Information detection method and device, electronic equipment and computer readable storage medium - Google Patents

Information detection method and device, electronic equipment and computer readable storage medium

Info

Publication number
CN114254321A
Authority
CN
China
Prior art keywords
algorithm
simulation result
sample file
target
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111585926.8A
Other languages
Chinese (zh)
Inventor
梅凯
白淳升
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Antiy Technology Group Co Ltd
Original Assignee
Antiy Technology Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Antiy Technology Group Co Ltd filed Critical Antiy Technology Group Co Ltd
Priority to CN202111585926.8A priority Critical patent/CN114254321A/en
Publication of CN114254321A publication Critical patent/CN114254321A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55Detecting local intrusion or implementing counter-measures
    • G06F21/56Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562Static detection
    • G06F21/563Static detection by source code analysis

Abstract

The invention provides an information detection method and apparatus, an electronic device, and a computer-readable storage medium. The method comprises the following steps: identifying a target algorithm in a sample file; acquiring, according to the algorithm address of the target algorithm in the sample file, the algorithm parameters and parameter addresses of the target algorithm at that algorithm address; simulating the running of the target algorithm based on the algorithm type, algorithm parameters and parameter addresses of the target algorithm to obtain a simulation result; and detecting whether the simulation result is abnormal by adopting a preset detection mode. The technical solution of the invention can quickly and effectively detect the security of a sample file, facilitates accurate identification of malicious code, and protects computer network security.

Description

Information detection method and device, electronic equipment and computer readable storage medium
[ technical field ]
The present invention relates to the field of computer network security technologies, and in particular, to an information detection method and apparatus, an electronic device, and a computer-readable storage medium.
[ background of the invention ]
With the rapid development of the Internet, network security incidents emerge one after another, and computer network security is at risk. Malicious code may be detected by identifying malicious code signatures and/or by heuristic detection techniques. However, in complicated situations, such as when malicious code dynamically acquires function addresses or deliberately obfuscates function names, it is difficult to identify the malicious code effectively. In addition, to avoid detection, an attacker often disguises malicious code with an encryption algorithm so that it evades conventional security detection. Detecting such cases comprehensively usually requires manual intervention, which is time-consuming and labor-intensive.
Therefore, how to comprehensively and effectively detect malicious codes becomes a technical problem to be solved urgently at present.
[ summary of the invention ]
The embodiment of the invention provides an information detection method and apparatus, an electronic device and a computer-readable storage medium, aiming to solve the technical problem that malicious code cannot be comprehensively and effectively detected by the malicious-code detection approaches of the related art.
In a first aspect, an embodiment of the present invention provides an information detection method, including: identifying a target algorithm in the sample file; acquiring algorithm parameters and parameter addresses of the target algorithm at the algorithm addresses according to the algorithm addresses of the target algorithm in the sample file; simulating and operating the target algorithm based on the algorithm type, the algorithm parameters and the parameter address of the target algorithm to obtain a simulation result; and detecting whether the simulation result is abnormal or not by adopting a preset detection mode.
In the above embodiment of the present invention, optionally, the target algorithm in the sample file is identified based on a preset algorithm feature library, where the algorithm feature library includes YARA rules, function entry feature values of the preset algorithm in an open source algorithm library, and association codes of the preset algorithm.
In the above embodiment of the present invention, optionally, before the identifying the target algorithm in the sample file based on the preset algorithm feature library, the method further includes: for each preset algorithm, generating YARA rules corresponding to the preset algorithm according to the algorithm characteristic value of the preset algorithm, the algorithm characteristics of the preset algorithm used by a known malicious software family and machine codes generated by the preset algorithm under different compiling conditions when the preset algorithm is open.
In the above embodiment of the present invention, optionally, the step of identifying the target algorithm in the sample file based on the preset algorithm feature library includes: detecting whether the sample file conforms to YARA rules in the algorithm feature library; when the sample file conforms to any YARA rule in the algorithm feature library, determining that the target algorithm in the sample file is a preset algorithm corresponding to the YARA rule; and/or; the step of identifying the target algorithm in the sample file based on the preset algorithm feature library comprises the following steps: detecting whether the sample file has a target characteristic value matched with the function entry characteristic value in the algorithm characteristic library; when a target characteristic value matched with the function entry characteristic value is detected, determining the preset algorithm to which the function entry characteristic value belongs as the target algorithm in the sample file; and/or; the step of identifying the target algorithm in the sample file based on the preset algorithm feature library comprises the following steps: simulating and executing the associated codes of the preset algorithm in the algorithm feature library on the sample file; and if the result obtained by executing the association code is valid data with identifiable content, determining the preset algorithm to which the association code belongs as the target algorithm in the sample file.
In the foregoing embodiment of the present invention, optionally, the detecting, by using a predetermined detection manner, whether the simulation result is abnormal includes: if the data type of the simulation result comprises a character string type, when the simulation result is detected to have a sensitive field, or when the weighted sum of a plurality of sensitive fields in the simulation result is detected to be greater than or equal to a specified threshold value, determining that the simulation result is abnormal.
In the foregoing embodiment of the present invention, optionally, the detecting, by using a predetermined detection manner, whether the simulation result is abnormal includes: if the data type of the simulation result comprises a code type, determining that the simulation result is abnormal when the simulation result is detected to comprise a preset malicious code feature code; if the data type of the simulation result comprises a code type, when the simulation result is detected to comprise a preset malicious code feature code and the simulation result comprises an encryption algorithm, executing the step of identifying the target algorithm in the sample file, so as to identify the encryption algorithm in the simulation result.
In the foregoing embodiment of the present invention, optionally, the step of detecting whether the simulation result is abnormal by using a predetermined detection manner includes: and if the data type of the simulation result comprises the PE type, taking the simulation result as a sample file, and executing the step of identifying the target algorithm in the sample file.
In the foregoing embodiment of the present invention, optionally, the step of detecting whether the simulation result is abnormal by using a predetermined detection manner includes: and if the data type of the simulation result comprises an address type, jumping to a target address indicated by the simulation result, and executing the step of identifying a target algorithm in the sample file by taking the data at the target address as the sample file.
In a second aspect, an embodiment of the present invention provides an information detecting apparatus, including: the target algorithm identification unit is used for identifying a target algorithm in the sample file; the disassembling unit is used for acquiring algorithm parameters and parameter addresses of the target algorithm at the algorithm addresses according to the algorithm addresses of the target algorithm in the sample file; the simulation detection unit is used for simulating and operating the target algorithm based on the algorithm type, the algorithm parameters and the parameter address of the target algorithm to obtain a simulation result; and the abnormity judging unit is used for detecting whether the simulation result is abnormal or not by adopting a preset detection mode.
In the above embodiment of the present invention, optionally, the target algorithm in the sample file is identified based on a preset algorithm feature library, where the algorithm feature library includes YARA rules, function entry feature values of the preset algorithm in an open source algorithm library, and association codes of the preset algorithm.
In the above embodiment of the present invention, optionally, the method further includes: and the YARA rule generating unit is used for generating YARA rules corresponding to the preset algorithms according to the algorithm characteristic values of the preset algorithms, the algorithm characteristics of the preset algorithms used by known malicious software families and machine codes generated by the preset algorithms under different compiling conditions when the preset algorithms are open source, for each preset algorithm before the target algorithm identifying unit identifies the target algorithms in the sample file.
In the above embodiment of the present invention, optionally, the target algorithm identifying unit is configured to: detecting whether the sample file conforms to YARA rules in the algorithm feature library; when the sample file conforms to any YARA rule in the algorithm feature library, determining that the target algorithm in the sample file is a preset algorithm corresponding to the YARA rule; and/or; detecting whether the sample file has a target characteristic value matched with the function entry characteristic value in the algorithm characteristic library; when a target characteristic value matched with the function entry characteristic value is detected, determining the preset algorithm to which the function entry characteristic value belongs as the target algorithm in the sample file; and/or; simulating and executing the associated codes of the preset algorithm in the algorithm feature library on the sample file; and if the result obtained by executing the association code is valid data with identifiable content, determining the preset algorithm to which the association code belongs as the target algorithm in the sample file.
In the foregoing embodiment of the present invention, optionally, the abnormality determining unit is configured to: if the data type of the simulation result comprises a character string type, when the simulation result is detected to have a sensitive field, or when the weighted sum of a plurality of sensitive fields in the simulation result is detected to be greater than or equal to a specified threshold value, determining that the simulation result is abnormal.
In the foregoing embodiment of the present invention, optionally, the abnormality determining unit is configured to: if the data type of the simulation result comprises a code type, determining that the simulation result is abnormal when the simulation result is detected to comprise a preset malicious code feature code; if the data type of the simulation result comprises a code type, when the simulation result is detected to comprise a preset malicious code feature code and the simulation result comprises an encryption algorithm, executing the target algorithm identification unit, and identifying the encryption algorithm in the simulation result through the target algorithm identification unit.
In the foregoing embodiment of the present invention, optionally, the abnormality determining unit is configured to: and if the data type of the simulation result comprises the PE type, the simulation result is used as a sample file, the target algorithm identification unit is executed, and the target algorithm in the sample file is identified through the target algorithm identification unit.
In the foregoing embodiment of the present invention, optionally, the abnormality determining unit is configured to: and if the data type of the simulation result comprises an address type, jumping to a target address indicated by the simulation result, executing the target algorithm identification unit by taking the data at the target address as a sample file, and identifying the target algorithm in the sample file through the target algorithm identification unit.
In a third aspect, an embodiment of the present invention provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being arranged to perform the method of any of the first aspects above.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing computer-executable instructions for performing the method flow described in any one of the first aspect.
According to the technical scheme, aiming at the technical problem that malicious code cannot be comprehensively and effectively detected by the detection approaches of the related art, the algorithm used by the malicious code can be identified from the sample file and run in simulation outside the sample file to obtain a simulation result, so that whether the sample file is abnormal and whether it includes malicious code can be determined by judging the simulation result in subsequent steps. Therefore, regardless of whether or how the algorithm and related data in the sample file are encrypted, the security of the sample file can be detected by extracting the algorithm and running it in simulation. This avoids the situation in which malicious code cannot be effectively detected due to the limitations of the sample file itself, allows the security of the sample file to be detected quickly and effectively, facilitates accurate identification of malicious code, and protects computer network security.
[ description of the drawings ]
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 shows a flow diagram of an information detection method according to one embodiment of the invention;
FIG. 2 shows a block diagram of an information detection apparatus according to an embodiment of the present invention;
FIG. 3 shows a block diagram of an electronic device according to an embodiment of the invention.
[ detailed description of the embodiments ]
For better understanding of the technical solutions of the present invention, the following detailed descriptions of the embodiments of the present invention are provided with reference to the accompanying drawings.
It should be understood that the described embodiments are only some embodiments of the invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the examples of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
Fig. 1 shows a flow chart of an information detection method according to an embodiment of the invention.
As shown in fig. 1, a flow of an information detection method according to an embodiment of the present invention includes:
Step 102, identifying a target algorithm in the sample file.
In this embodiment, the target algorithm in the sample file is identified based on a preset algorithm feature library. The feature library stores the algorithm features of multiple algorithms that are easily abused by malicious code and difficult to identify with conventional malicious-code detection methods. These algorithms include, but are not limited to, encryption algorithms such as AES, RC4 and Blowfish.
Matching the content of the sample file against the algorithm features in the feature library helps identify whether the sample file uses the algorithm corresponding to those features. If the sample file is found to include an algorithm feature from the feature library, it is determined that the sample file uses the corresponding algorithm and may therefore carry malicious code, so a further detection step can be performed to check the security of the sample file.
The algorithm feature library comprises algorithm features such as YARA rules, function entry feature values of the preset algorithm in the open source algorithm library, and associated codes of the preset algorithm. Of course, before step 102, algorithm features corresponding to a plurality of preset algorithms may be set for the algorithm feature library.
In one possible design, the YARA rules are set for the algorithm feature library in a manner that includes: for each preset algorithm, generating YARA rules corresponding to the preset algorithm according to the algorithm characteristic value of the preset algorithm, the algorithm characteristics of the preset algorithm used by a known malicious software family and machine codes generated by the preset algorithm under different compiling conditions when the preset algorithm is open.
YARA rules are used to describe rules within an algorithm, while algorithms that are susceptible to being used by malicious code often have their own unique YARA rules, so taking YARA rules of algorithms that are susceptible to being used by malicious code as algorithm features helps to identify whether a sample file relates to the algorithm, thereby facilitating determination of whether the sample file contains malicious code.
For any preset algorithm to be stored in the algorithm feature library, the YARA rule of that preset algorithm may be generated from data that reflects the actual characteristics of the algorithm: the algorithm feature value of the preset algorithm, the algorithm features observed where known malware families use the preset algorithm, and the machine code generated by the preset algorithm under different compilation conditions when its source is open. The final YARA rule consists of a series of strings and a Boolean expression.
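As a purely illustrative sketch (not taken from this disclosure), such a YARA rule could key on well-known constants of a preset algorithm, for example the leading bytes of the standard AES S-box, and be compiled with the yara-python package; the rule name and byte pattern below are examples only, and a production rule would also encode machine-code patterns observed under different compilers and in known malware families:

import yara

# Hypothetical rule for the algorithm feature library: it fires on the first
# bytes of the standard AES S-box, a constant most AES implementations embed.
AES_RULE = r"""
rule aes_sbox_constants
{
    strings:
        $sbox = { 63 7c 77 7b f2 6b 6f c5 30 01 67 2b }
    condition:
        $sbox
}
"""

rules = yara.compile(source=AES_RULE)  # the feature library would hold many such rules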
On the basis, the step 102 comprises: detecting whether the sample file conforms to YARA rules in the algorithm feature library; and when the sample file conforms to any YARA rule in the algorithm feature library, determining that the target algorithm in the sample file is a preset algorithm corresponding to the YARA rule.
When the target algorithm in the sample file is identified based on the preset algorithm feature library, the YARA rules can serve as one of the detection criteria. Specifically, it is detected whether the sample file conforms to the YARA rules in the algorithm feature library; if the sample file conforms to any YARA rule in the library, this indicates that the sample file includes the preset algorithm corresponding to that YARA rule, and that preset algorithm is determined to be the target algorithm. In this way, the YARA rules in the algorithm feature library can effectively identify target algorithms in the sample file that are easily used by malicious code, so that whether the sample file is abnormal and whether it contains malicious code can be detected further.
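Continuing the illustrative sketch above, matching a sample file against every compiled rule in the feature library and mapping each hit back to its preset algorithm might look as follows; the function name and the treatment of rule names as algorithm identifiers are assumptions for illustration, not the patented implementation:

def identify_target_algorithms(sample_path, compiled_rules):
    """Return the names of all preset algorithms whose YARA rules match the sample."""
    with open(sample_path, "rb") as f:
        data = f.read()
    matches = compiled_rules.match(data=data)  # yara-python returns one Match per rule hit
    return [m.rule for m in matches]           # each rule name identifies a preset algorithm

# e.g. target_algorithms = identify_target_algorithms("sample.bin", rules)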
In one possible design, when the target algorithm in the sample file is identified based on a preset algorithm feature library, a function entry feature value of the preset algorithm in the open source algorithm library may be used as one of the detection criteria.
Specifically, step 102 includes: detecting whether the sample file has a target characteristic value matched with the function entry characteristic value in the algorithm characteristic library; when a target characteristic value matched with the function entry characteristic value is detected, determining the preset algorithm to which the function entry characteristic value belongs as the target algorithm in the sample file.
If a target feature value matching a function entry feature value is detected, this indicates that the sample file contains the algorithm feature recorded in the algorithm feature library and therefore includes the corresponding target algorithm. At this point, a target algorithm in the sample file that is easily used by malicious code has been effectively identified, so that whether the sample file is abnormal and whether it contains malicious code can be detected further.
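One possible reading of the function entry feature value is a short byte signature taken from the function prologue of an open-source implementation; the sketch below scans a sample for such signatures, and every signature and name shown is a placeholder rather than a real fingerprint:

# Hypothetical entry signatures; real values would come from the open source algorithm library.
ENTRY_SIGNATURES = {
    "rc4_set_key":       bytes.fromhex("5589e583ec10"),
    "blowfish_encrypt":  bytes.fromhex("4157415641554154"),
}

def match_entry_signatures(data):
    """Return {function_name: offset} for every entry feature value found in the sample."""
    hits = {}
    for name, signature in ENTRY_SIGNATURES.items():
        offset = data.find(signature)
        if offset != -1:
            hits[name] = offset   # the owning preset algorithm becomes the target algorithm
    return hits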
In one possible design, when the target algorithm in the sample file is identified based on a preset algorithm feature library, the associated code of the preset algorithm may be used as one of the detection criteria.
Specifically, step 102 includes: simulating and executing the associated codes of the preset algorithm in the algorithm feature library on the sample file; and if the result obtained by executing the association code is valid data with identifiable content, determining the preset algorithm to which the association code belongs as the target algorithm in the sample file.
The associated code of a preset algorithm refers to code in the preset algorithm that performs a specific function, and such specific functions are often easily exploited by malicious code; for example, the associated code can be set to the code used to perform a majority operation. Therefore, when the associated code of a preset algorithm is detected in the sample file, that associated code can be executed; if the result is not garbled data but valid data with identifiable content, this indicates that the associated code is actually used in the sample file and that the sample file does contain the preset algorithm corresponding to the associated code. This in turn indicates that the sample file may be exploited by malicious code to perform that specific function of the preset algorithm, so it is necessary to further detect whether the sample file is abnormal.
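A minimal sketch of simulated execution of an associated code fragment, using the Unicorn CPU emulator as one possible emulation back end; the memory layout, the 32-bit x86 assumption, and the printable-bytes heuristic standing in for "valid data with identifiable content" are all assumptions made for illustration:

from unicorn import Uc, UC_ARCH_X86, UC_MODE_32, UcError
from unicorn.x86_const import UC_X86_REG_ESP

CODE_BASE  = 0x400000   # where the associated code fragment is mapped
STACK_BASE = 0x200000   # scratch stack for the emulated fragment

def emulate_fragment(code, out_addr, out_len):
    """Emulate an associated-code fragment and read back its output buffer (None on failure)."""
    try:
        mu = Uc(UC_ARCH_X86, UC_MODE_32)
        mu.mem_map(CODE_BASE, 0x10000)
        mu.mem_map(STACK_BASE, 0x10000)
        mu.mem_write(CODE_BASE, code)
        mu.reg_write(UC_X86_REG_ESP, STACK_BASE + 0x8000)
        mu.emu_start(CODE_BASE, CODE_BASE + len(code), timeout=1000, count=100000)
        return bytes(mu.mem_read(out_addr, out_len))   # out_addr must fall in a mapped region
    except UcError:
        return None

def looks_like_valid_data(buf):
    """Crude stand-in for 'valid data with identifiable content': mostly printable bytes."""
    return buf is not None and len(buf) > 0 and \
           sum(32 <= b < 127 for b in buf) > 0.8 * len(buf)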
In the above, any one of algorithm features such as the function entry feature value of the preset algorithm in the YARA rule, the open source algorithm library, and the association code of the preset algorithm may be used as a condition for identifying the target algorithm in the sample file, or two or three of the algorithm features may be combined to be used as a condition for identifying the target algorithm in the sample file, so as to obtain a more accurate target algorithm identification result.
Step 104, acquiring, according to the algorithm address of the target algorithm in the sample file, the algorithm parameters and parameter addresses of the target algorithm at that algorithm address.
In this embodiment, the algorithm parameters and parameter addresses of the target algorithm are acquired at the algorithm address by using a disassembly engine. The disassembly engine reverses machine code back into assembly instructions, which is effectively a decoding step. Therefore, after the target algorithm has been located in the sample file, the content at its algorithm address can be processed by the disassembly engine, and the algorithm parameters and parameter addresses of the target algorithm indicated by that content can be parsed out.
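The following is a simplified sketch of parameter recovery with the Capstone disassembler, restricted for illustration to 32-bit x86 code in which arguments are pushed on the stack immediately before a call to the target algorithm; real samples would require register-passing conventions and many more cases, and the function name is hypothetical:

from capstone import Cs, CS_ARCH_X86, CS_MODE_32

def collect_call_arguments(code, base_addr, algorithm_addr):
    """Return operands pushed before a call to algorithm_addr (candidate parameters/addresses)."""
    md = Cs(CS_ARCH_X86, CS_MODE_32)
    pushes, arguments = [], []
    for insn in md.disasm(code, base_addr):
        if insn.mnemonic == "push":
            pushes.append(insn.op_str)              # immediate value or address of a parameter
        elif insn.mnemonic == "call":
            if insn.op_str == hex(algorithm_addr):
                arguments = list(reversed(pushes))  # cdecl: last push is the first argument
            pushes = []                             # pushed values do not carry across calls
    return arguments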
Step 106, simulating and running the target algorithm based on the algorithm type, algorithm parameters and parameter addresses of the target algorithm to obtain a simulation result.
Step 108, detecting whether the simulation result is abnormal by adopting a preset detection mode corresponding to the data type of the simulation result.
At this time, the algorithm type of the target algorithm is known, and its algorithm parameters and parameter addresses are also known, so the target algorithm can be simulated and run to obtain a simulation result representing what the target algorithm in the sample file would produce.
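As a concrete illustration only: if the target algorithm were identified as RC4 and its key and input buffer recovered from the parameter addresses, the simulation could run an independent RC4 implementation on that data, entirely outside the sample file, to obtain the simulation result; the function below is a standard RC4 sketch under that assumption, not code from this disclosure:

def rc4(key, data):
    """Independent RC4 implementation used to simulate the identified algorithm on recovered parameters."""
    S = list(range(256))
    j = 0
    for i in range(256):                          # key-scheduling algorithm
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    out, i, j = bytearray(), 0, 0
    for byte in data:                             # pseudo-random generation + XOR
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(byte ^ S[(S[i] + S[j]) % 256])
    return bytes(out)

# simulation_result = rc4(recovered_key, recovered_ciphertext)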
For algorithms that are easily used by malicious code and difficult to identify with conventional malicious-code detection methods, especially encryption algorithms, the technical scheme identifies the algorithm from the sample file and runs it in simulation outside the sample file to obtain a simulation result, so that whether the sample file is abnormal and whether it includes malicious code can be determined by judging the simulation result in subsequent steps. Therefore, regardless of whether or how the algorithm and related data in the sample file are encrypted, the security of the sample file can be detected by extracting the algorithm and running it in simulation. This avoids the situation in which malicious code cannot be effectively detected due to the limitations of the sample file itself, allows the security of the sample file to be detected quickly and effectively, facilitates accurate identification of malicious code, and protects computer network security.
In one possible design, if the data type of the simulation result includes a character string type, when the simulation result is detected to have a sensitive field, or when the weighted sum of a plurality of sensitive fields in the simulation result is detected to be greater than or equal to a specified threshold, it is determined that the simulation result is abnormal.
In this case, fields frequently used by malicious code, or fields produced after malicious code executes, may be designated as sensitive fields.
If a sensitive field is detected in the simulation result, it can be determined that the simulation result comes from malicious code; the simulation result is then judged abnormal, the sample file is determined to contain malicious code, and its security is deemed insufficient.
In addition, because fields are diverse, in some cases the mere presence of individual sensitive fields cannot establish that the simulation result is definitely abnormal. In that case, higher weights can be assigned to sensitive fields used more frequently by malicious code, and the sensitive fields found in the simulation result are weighted and summed; if the sum is greater than or equal to a specified threshold, the simulation result is determined to be abnormal, the sample file is determined to contain malicious code, and its security is deemed insufficient. The specified threshold is the lowest value of the weighted sum of sensitive fields at which the simulation result is considered abnormal.
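A small sketch of the weighted-sum check for string-type simulation results; the sensitive fields, weights and threshold below are invented examples, and a field whose presence alone should flag the result can simply be given a weight at or above the threshold:

# Hypothetical sensitive fields and weights; real values would be tuned from malware corpora.
SENSITIVE_FIELD_WEIGHTS = {
    "cmd.exe /c":          3.0,
    "powershell -enc":     3.0,
    "CreateRemoteThread":  4.0,
    "http://":             1.0,
}
SCORE_THRESHOLD = 5.0   # lowest weighted sum at which the simulation result counts as abnormal

def string_result_is_abnormal(result):
    score = sum(weight for field, weight in SENSITIVE_FIELD_WEIGHTS.items() if field in result)
    return score >= SCORE_THRESHOLD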
In a possible design, if the data type of the simulation result includes a code type, the simulation result is determined to be abnormal when it is detected to include a preset malicious code feature code. A preset malicious code feature code is a signature that appears within malicious code, or that is inevitably carried by the output of malicious code after it executes. If the simulation result includes such a feature code, this indicates that the simulation result either is malicious code or was produced by executing malicious code; the simulation result is therefore determined to be abnormal, the sample file is determined to contain malicious code, and its security is deemed insufficient.
In another possible design, if the data type of the simulation result includes a code type, when it is detected that the simulation result includes a preset malicious code feature code and the simulation result includes an encryption algorithm, returning to the step of identifying a target algorithm in the sample file, so as to identify the encryption algorithm in the simulation result based on the preset algorithm feature library.
In this case, if the simulation result includes a preset malicious code feature code, this indicates that the simulation result may be malicious code or may have been produced by executing malicious code. A further judgment step then determines whether the simulation result includes an encryption algorithm; if it does, the flow returns to step 102 and the technical scheme is reapplied to that encryption algorithm, which is equivalent to verifying the simulation result as a new sample file, until a conclusion is reached as to whether the simulation result is abnormal.
In a possible design, if the data type of the simulation result includes a PE (Portable Executable) type, the simulation result is used as a sample file and the flow returns to the step of identifying the target algorithm in the sample file.
A PE (Portable Executable) file is an executable file that can continue to execute and be migrated, which makes it attractive to malicious code; since it is likely to be exploited by malicious code, it may contain various algorithms, such as encryption algorithms, that perform the functions the malicious code requires. Therefore, when the data type of the simulation result includes the PE type, the flow returns to step 102 and the simulation result is verified as a new sample file until a conclusion is reached as to whether it is abnormal.
In a possible design, if the data type of the simulation result includes an address type, jumping to a target address indicated by the simulation result, taking data at the target address as a sample file, and returning to the step of identifying a target algorithm in the sample file.
The flow jumps to the target address indicated by the simulation result; a new file may be stored at that target address, and the data at the target address is then verified as a new sample file until a conclusion is reached as to whether the simulation result is abnormal.
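The recursive handling of PE-type and address-type simulation results could be dispatched roughly as sketched below; the "MZ" header check, the length-based address heuristic, and the read_at_address helper are simplifying assumptions, and detect_sample is a placeholder for the whole flow from step 102 onward:

def handle_simulation_result(result, read_at_address):
    """Dispatch PE-type and address-type results back into the detection flow."""
    if result[:2] == b"MZ":                        # PE type: treat the result itself as a new sample file
        detect_sample(result)
    elif len(result) in (4, 8):                    # address type: jump to the indicated target address
        target = int.from_bytes(result, "little")
        detect_sample(read_at_address(target))     # data at the target address becomes the new sample
    # string-type and code-type results are handled by the checks described earlier

def detect_sample(sample):
    """Placeholder for the full flow: identify the target algorithm, disassemble, simulate, judge."""
    ...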
In conclusion, the security of a sample file can be detected by extracting its content and running it in simulation. This avoids the situation in which malicious code cannot be effectively detected due to the limitations of the sample file itself, allows the security of the sample file to be detected quickly and effectively, facilitates accurate identification of malicious code, and protects computer network security.
Fig. 2 shows a block diagram of an information detection apparatus according to an embodiment of the present invention.
As shown in fig. 2, an information detecting apparatus 200 according to an embodiment of the present invention includes: a target algorithm identifying unit 202 for identifying a target algorithm in the sample file; a disassembling unit 204, configured to obtain, according to the algorithm address of the target algorithm in the sample file, an algorithm parameter and a parameter address of the target algorithm at the algorithm address; a simulation detection unit 206, configured to perform simulation operation on the target algorithm based on the algorithm type, the algorithm parameter, and the parameter address of the target algorithm to obtain a simulation result; an anomaly determination unit 208, configured to detect whether the simulation result is abnormal by using a predetermined detection method.
In the above embodiment of the present invention, optionally, the target algorithm in the sample file is identified based on a preset algorithm feature library, where the algorithm feature library includes YARA rules, function entry feature values of the preset algorithm in an open source algorithm library, and association codes of the preset algorithm.
In the above embodiment of the present invention, optionally, the method further includes: a YARA rule generating unit, configured to, for each preset algorithm, generate a YARA rule corresponding to the preset algorithm according to an algorithm feature value of the preset algorithm, an algorithm feature of the preset algorithm used by a known malware family, and machine codes generated by the preset algorithm under different compiling conditions when the preset algorithm is open, before the target algorithm identifying unit 202 identifies a target algorithm in a sample file; and/or; detecting whether the sample file conforms to YARA rules in the algorithm feature library; when the sample file conforms to any YARA rule in the algorithm feature library, determining that the target algorithm in the sample file is a preset algorithm corresponding to the YARA rule; and/or; detecting whether the sample file has a target characteristic value matched with the function entry characteristic value in the algorithm characteristic library; when a target characteristic value matched with the function entry characteristic value is detected, determining the preset algorithm to which the function entry characteristic value belongs as the target algorithm in the sample file.
In the above embodiment of the present invention, optionally, the target algorithm identifying unit 202 is configured to: simulating and executing the associated codes of the preset algorithm in the algorithm feature library on the sample file; and if the result obtained by executing the association code is valid data with identifiable content, determining the preset algorithm to which the association code belongs as the target algorithm in the sample file.
In the foregoing embodiment of the present invention, optionally, the abnormality determining unit 208 is configured to: if the data type of the simulation result comprises a character string type, when the simulation result is detected to have a sensitive field, or when the weighted sum of a plurality of sensitive fields in the simulation result is detected to be greater than or equal to a specified threshold value, determining that the simulation result is abnormal.
In the foregoing embodiment of the present invention, optionally, the abnormality determining unit 208 is configured to: if the data type of the simulation result comprises a code type, determining that the simulation result is abnormal when the simulation result is detected to comprise a preset malicious code feature code; if the data type of the simulation result comprises a code type, when the simulation result is detected to comprise a preset malicious code feature code and the simulation result comprises an encryption algorithm, returning to the target algorithm identification unit, and identifying the encryption algorithm in the simulation result through the target algorithm identification unit.
In the foregoing embodiment of the present invention, optionally, the abnormality determining unit 208 is configured to: if the data type of the simulation result comprises the PE type, the simulation result is used as a sample file, the sample file is returned to the target algorithm identification unit, and the target algorithm in the sample file is identified through the target algorithm identification unit.
In the foregoing embodiment of the present invention, optionally, the abnormality determining unit 208 is configured to: and if the data type of the simulation result comprises an address type, jumping to a target address indicated by the simulation result, taking the data at the target address as a sample file, returning to the target algorithm identification unit, and identifying the target algorithm in the sample file through the target algorithm identification unit.
The information detecting apparatus 200 uses the solution described in any of the above embodiments, and therefore, has all the technical effects described above, and is not described herein again.
FIG. 3 shows a block diagram of an electronic device of one embodiment of the invention.
As shown in FIG. 3, an electronic device 300 of one embodiment of the invention includes at least one memory 302 and a processor 304 communicatively coupled to the at least one memory 302, wherein the memory stores instructions executable by the processor 304, the instructions being configured to perform the scheme described in any of the above embodiments. Therefore, the electronic device 300 has the same technical effects as any of the above embodiments, which are not repeated here.
The electronic device of embodiments of the present invention exists in a variety of forms, including but not limited to:
(1) Mobile communication devices, which are characterized by mobile communication capabilities and are primarily aimed at providing voice and data communication. Such terminals include smart phones (e.g., iPhones), multimedia phones, feature phones, and low-end phones, among others.
(2) Ultra-mobile personal computer devices, which belong to the category of personal computers, have computing and processing functions, and generally also support mobile Internet access. Such terminals include PDA, MID, and UMPC devices, such as iPads.
(3) Portable entertainment devices, which can display and play multimedia content. Such devices include audio and video players (e.g., iPods), handheld game consoles, e-book readers, as well as smart toys and portable car navigation devices.
(4) The server is similar to a general computer architecture, but has higher requirements on processing capability, stability, reliability, safety, expandability, manageability and the like because of the need of providing highly reliable services.
(5) And other electronic devices with data interaction functions.
In addition, an embodiment of the present invention provides a computer-readable storage medium, which stores computer-executable instructions for executing the method flow described in any of the above embodiments.
The technical solution of the invention has been described in detail above with reference to the drawings. The target algorithm can be identified from the sample file and run in simulation outside the sample file to obtain a simulation result, so that whether the sample file is abnormal and whether it includes malicious code can be determined by judging the simulation result in subsequent steps. Therefore, regardless of whether or how the algorithm and related data in the sample file are encrypted, the security of the sample file can be detected by extracting the algorithm and running it in simulation. This avoids the situation in which malicious code cannot be effectively detected due to the limitations of the sample file itself, allows the security of the sample file to be detected quickly and effectively, facilitates accurate identification of malicious code, and protects computer network security.
It should be understood that the term "and/or" as used herein is merely one type of association that describes an associated object, meaning that three relationships may exist, e.g., a and/or B may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship.
The word "if" as used herein may be interpreted as "at … …" or "when … …" or "in response to a determination" or "in response to a detection", depending on the context. Similarly, the phrases "if determined" or "if detected (a stated condition or event)" may be interpreted as "when determined" or "in response to a determination" or "when detected (a stated condition or event)" or "in response to a detection (a stated condition or event)", depending on the context.
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions in actual implementation, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) or a Processor (Processor) to execute some steps of the methods according to the embodiments of the present invention. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (11)

1. An information detection method, comprising:
identifying a target algorithm in the sample file;
acquiring algorithm parameters and parameter addresses of the target algorithm at the algorithm addresses according to the algorithm addresses of the target algorithm in the sample file;
simulating and operating the target algorithm based on the algorithm type, the algorithm parameters and the parameter address of the target algorithm to obtain a simulation result;
and detecting whether the simulation result is abnormal or not by adopting a preset detection mode.
2. The information detection method according to claim 1, wherein the target algorithm in the sample file is identified based on a preset algorithm feature library;
the algorithm feature library comprises YARA rules, function entry feature values of the preset algorithm in the open source algorithm library and associated codes of the preset algorithm.
3. The information detection method according to claim 2, wherein before the identifying the target algorithm in the sample file based on the preset algorithm feature library, the method further comprises:
for each preset algorithm, generating YARA rules corresponding to the preset algorithm according to the algorithm characteristic value of the preset algorithm, the algorithm characteristics of the preset algorithm used by a known malicious software family and machine codes generated by the preset algorithm under different compiling conditions when the preset algorithm is open.
4. The information detection method according to claim 3, wherein the step of identifying the target algorithm in the sample file based on the preset algorithm feature library comprises:
detecting whether the sample file conforms to YARA rules in the algorithm feature library;
when the sample file conforms to any YARA rule in the algorithm feature library, determining that the target algorithm in the sample file is a preset algorithm corresponding to the YARA rule;
and/or;
detecting whether the sample file has a target characteristic value matched with the function entry characteristic value in the algorithm characteristic library;
when a target characteristic value matched with the function entry characteristic value is detected, determining the preset algorithm to which the function entry characteristic value belongs as the target algorithm in the sample file;
and/or;
simulating and executing the associated codes of the preset algorithm in the algorithm feature library on the sample file;
and if the result obtained by executing the association code is valid data with identifiable content, determining the preset algorithm to which the association code belongs as the target algorithm in the sample file.
5. The information detection method according to claim 1, wherein the detecting whether the simulation result is abnormal by using a predetermined detection method includes:
if the data type of the simulation result comprises a character string type, when the simulation result is detected to have a sensitive field, or when the weighted sum of a plurality of sensitive fields in the simulation result is detected to be greater than or equal to a specified threshold value, determining that the simulation result is abnormal.
6. The information detection method according to claim 1, wherein the detecting whether the simulation result is abnormal by using a predetermined detection method includes:
if the data type of the simulation result comprises a code type, determining that the simulation result is abnormal when the simulation result is detected to comprise a preset malicious code feature code;
if the data type of the simulation result comprises a code type, when the simulation result is detected to comprise a preset malicious code feature code and the simulation result comprises an encryption algorithm, executing the step of identifying the target algorithm in the sample file, so as to identify the encryption algorithm in the simulation result.
7. The information detection method according to claim 1, wherein the detecting whether the simulation result is abnormal by using a predetermined detection method includes:
and if the data type of the simulation result comprises the PE type, taking the simulation result as a sample file, and executing the step of identifying the target algorithm in the sample file.
8. The information detection method according to claim 1, wherein the detecting whether the simulation result is abnormal by using a predetermined detection method includes:
and if the data type of the simulation result comprises an address type, jumping to a target address indicated by the simulation result, and executing the step of identifying a target algorithm in the sample file by taking the data at the target address as the sample file.
9. An information detecting apparatus, characterized by comprising:
the target algorithm identification unit is used for identifying a target algorithm in the sample file;
the disassembling unit is used for acquiring algorithm parameters and parameter addresses of the target algorithm at the algorithm addresses according to the algorithm addresses of the target algorithm in the sample file;
the simulation detection unit is used for simulating and operating the target algorithm based on the algorithm type, the algorithm parameters and the parameter address of the target algorithm to obtain a simulation result;
and the abnormity judging unit is used for detecting whether the simulation result is abnormal or not by adopting a preset detection mode.
10. An electronic device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor, the instructions being arranged to perform the method of any of the preceding claims 1 to 8.
11. A computer-readable storage medium having stored thereon computer-executable instructions for performing the method flow of any of claims 1-8.
CN202111585926.8A 2021-12-20 2021-12-20 Information detection method and device, electronic equipment and computer readable storage medium Pending CN114254321A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111585926.8A CN114254321A (en) 2021-12-20 2021-12-20 Information detection method and device, electronic equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111585926.8A CN114254321A (en) 2021-12-20 2021-12-20 Information detection method and device, electronic equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN114254321A true CN114254321A (en) 2022-03-29

Family

ID=80797009

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111585926.8A Pending CN114254321A (en) 2021-12-20 2021-12-20 Information detection method and device, electronic equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN114254321A (en)

Similar Documents

Publication Publication Date Title
US11126717B2 (en) Techniques for identifying computer virus variant
RU2614557C2 (en) System and method for detecting malicious files on mobile devices
US8479296B2 (en) System and method for detecting unknown malware
EP3899770B1 (en) System and method for detecting data anomalies by analysing morphologies of known and/or unknown cybersecurity threats
US9798981B2 (en) Determining malware based on signal tokens
CN106326737B (en) System and method for detecting the harmful file that can be executed on virtual stack machine
CN109344611B (en) Application access control method, terminal equipment and medium
US11475133B2 (en) Method for machine learning of malicious code detecting model and method for detecting malicious code using the same
EP3028211A1 (en) Determining malware based on signal tokens
US20160196427A1 (en) System and Method for Detecting Branch Oriented Programming Anomalies
CN107247902A (en) Malware categorizing system and method
Du et al. A static Android malicious code detection method based on multi‐source fusion
WO2018177602A1 (en) Malware detection in applications based on presence of computer generated strings
CN107070845B (en) System and method for detecting phishing scripts
US10103890B2 (en) Membership query method
JP6777612B2 (en) Systems and methods to prevent data loss in computer systems
CN111027065B (en) Leucavirus identification method and device, electronic equipment and storage medium
CN111062035B (en) Lesu software detection method and device, electronic equipment and storage medium
CN114254321A (en) Information detection method and device, electronic equipment and computer readable storage medium
CN110611675A (en) Vector magnitude detection rule generation method and device, electronic equipment and storage medium
WO2016127037A1 (en) Method and device for identifying computer virus variants
CN108875363B (en) Method and device for accelerating virtual execution, electronic equipment and storage medium
KR20130077184A (en) Homepage infected with a malware detecting device and method
CN111353155B (en) Detection method, device, equipment and medium for process injection
CN112380530B (en) Homologous APK detection method, terminal device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination