Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for detecting a malicious file, an electronic device, and a storage medium, which can perform security detection on a shortcut file.
In a first aspect, an embodiment of the present invention provides a method for detecting a malicious file, where the method includes: acquiring a shortcut file in a system; extracting a script file included in the shortcut file; acquiring address information included in the script file; and when the address information included by the script file is not stored in an address database, determining that the shortcut file is a malicious file, wherein legal address information is stored in the address database.
Optionally, acquiring the shortcut file in the system includes: traversing files in the system, and acquiring shortcut files in the system according to file header information of the files.
Optionally, acquiring the address information included in the script file includes: determining the type of the script file; preprocessing the script file according to its type to obtain an original script file; analyzing the syntactic structure of the original script file to determine an operation rule of the original script file; processing the original script file according to the operation rule to obtain a processed original script file; and extracting address information from the processed original script file through a regular expression.
Optionally, determining the type of the script file includes: performing keyword matching on the script file; and when the script file includes a pre-stored keyword, determining the type of the script file according to the keyword it includes, wherein the pre-stored keywords correspond to script file types.
Optionally, analyzing the syntactic structure of the original script file and determining the operation rule of the original script file includes: performing syntactic analysis and/or semantic analysis on the original script file to determine that a string addition (concatenation) operation can be performed in the original script file; and processing the original script file according to the operation rule to obtain a processed original script file includes: merging, according to the string addition operation, the character strings in the original script file on which string addition can be performed, to obtain the processed original script file.
Optionally, the method further includes: when the address information included in the script file is stored in the address database, determining that the shortcut file is a legal file.
Optionally, the address information specifically includes a URL address, an IP address, or a domain name address.
In a second aspect, an embodiment of the present invention provides an apparatus for detecting a malicious file, applied to an electronic device, the apparatus including: a first obtaining unit, configured to obtain a shortcut file in a system; an extracting unit, configured to extract a script file included in the shortcut file; a second obtaining unit, configured to obtain address information included in the script file; and a first determining unit, configured to determine that the shortcut file is a malicious file if the address information included in the script file is not stored in an address database, where legal address information is stored in the address database.
Optionally, the first obtaining unit is specifically configured to traverse files in the system, and obtain shortcut files in the system according to file header information of the files.
Optionally, the second obtaining unit includes: the type determining module is used for determining the type of the script file; the first processing module is used for preprocessing the script file according to the type of the script file to obtain an original script file; the second processing module is used for carrying out syntactic structure analysis on the original script file and determining an operation rule for the original script file; the third processing module is used for correspondingly processing the original script file according to the operation rule to obtain a processed original script file; and the address information extraction module is used for extracting address information from the processed original script file through a regular expression.
Optionally, the type determining module includes: the matching sub-module is used for performing keyword matching processing on the script file; and the type determining sub-module is used for determining the type of the script file according to the keywords included in the script file when the script file includes the pre-stored keywords, and the pre-stored keywords and the type of the script file have corresponding relations.
Optionally, the second processing module is specifically configured to perform syntactic analysis and/or semantic analysis on the original script file, and determine that a string addition (concatenation) operation can be performed in the original script file; the third processing module is specifically configured to merge, according to the string addition operation, the character strings in the original script file on which string addition can be performed, to obtain a processed original script file.
Optionally, the first determining unit is further configured to determine that the shortcut file is a legal file when the address information included in the script file is stored in the address database.
Optionally, the address information specifically includes a URL address, an IP address, or a domain name address.
In a third aspect, an embodiment of the present invention provides an electronic device, where the electronic device includes: a housing, a processor, a memory, a circuit board and a power supply circuit, wherein the circuit board is arranged inside a space enclosed by the housing, and the processor and the memory are arranged on the circuit board; the power supply circuit is used for supplying power to each circuit or device of the electronic device; the memory is used for storing executable program code; and the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to perform the method described in any one of the foregoing implementations.
In a fourth aspect, embodiments of the invention provide a computer readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement a method as described in any of the preceding.
With the method, the apparatus, the electronic device and the storage medium for detecting a malicious file provided by the embodiments of the present invention, the script file included in a shortcut file acquired from the system is extracted, and the address information included in the script file is acquired. When the address information included in the script file is not stored in the address database, the shortcut file is determined to be a malicious file, so that security detection can be performed on shortcut files.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
It should be understood that the described embodiments are only some embodiments of the invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The following describes a scheme provided in an embodiment of the present invention in detail with reference to fig. 1, where fig. 1 is a flowchart of a method for detecting a malicious file provided in an embodiment of the present invention, and an implementation subject in the embodiment of the present invention is an electronic device. The electronic device may be a terminal device, such as a personal computer, a desktop computer, or the like. The electronic device may also be a server. As shown in fig. 1, the method of this embodiment specifically includes the following steps:
Step 110, acquiring a shortcut file in the system.
In this embodiment, a file in the system may be a file to be detected that has been loaded onto the electronic device but is not running. The electronic device acquires at least one file in the system.
It can be understood that the electronic device does not run a file to be detected until that file has been detected.
In this embodiment, each file in the system includes file header information, which can be used to determine the type of the file. After acquiring a file in the system, the electronic device reads the file header information from the file and determines from it whether the file is a shortcut file. If the file is a shortcut file, step 120 is performed.
If the file is not a shortcut file, the electronic device continues to acquire the next file for determination, i.e., step 110 is executed again.
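The header check in step 110 can be sketched as follows. This is a minimal illustration that assumes the shortcut files are Windows Shell Link (.lnk) files, which the text does not state explicitly; under that assumption the header begins with a 4-byte size field of 0x4C followed by a fixed 16-byte class identifier. The function names `is_shortcut_header` and `find_shortcuts` are hypothetical.

```python
import struct
from pathlib import Path

# Assumption: Windows Shell Link (.lnk) layout. The header starts with a
# 4-byte little-endian HeaderSize of 0x4C, followed by the 16-byte LinkCLSID
# 00021401-0000-0000-C000-000000000046 in its on-disk byte order.
LNK_HEADER_SIZE = 0x0000004C
LNK_CLSID = bytes.fromhex("0114020000000000c000000000000046")


def is_shortcut_header(data: bytes) -> bool:
    """Return True if the leading bytes look like a Shell Link header."""
    if len(data) < 20:
        return False
    (header_size,) = struct.unpack_from("<I", data, 0)
    return header_size == LNK_HEADER_SIZE and data[4:20] == LNK_CLSID


def find_shortcuts(root: str):
    """Traverse files under root and yield those whose header marks a shortcut."""
    for path in Path(root).rglob("*"):
        try:
            if path.is_file():
                with path.open("rb") as handle:
                    if is_shortcut_header(handle.read(20)):
                        yield path
        except OSError:
            continue  # unreadable entries are simply skipped in this sketch
```

Checking the header rather than the file extension means a renamed shortcut is still recognized, which matches the text's reliance on file header information.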
Step 120, extracting the script file included in the shortcut file.
In this embodiment, a malicious shortcut file generally includes a script file. If the determination in step 110 shows that the file in the system is a shortcut file, the electronic device extracts the script file from the shortcut file. If the electronic device cannot extract a script file from the shortcut file, it determines that the shortcut file is a normal, that is, non-malicious, shortcut file.
Step 130, acquiring address information included in the script file.
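One way to picture the extraction in step 120 is shown below. The sketch rests on two assumptions that are not in the text: that the shortcut is a Windows Shell Link file, and that the embedded script is carried in the link's command-line arguments string. The flag values follow the Shell Link binary layout; `extract_arguments` is a hypothetical name, and truncated input is not handled.

```python
import struct
from typing import Optional

# LinkFlags bits of the Shell Link header (assumed file format).
HAS_ID_LIST, HAS_LINK_INFO = 0x01, 0x02
HAS_NAME, HAS_REL_PATH, HAS_WORK_DIR = 0x04, 0x08, 0x10
HAS_ARGUMENTS, IS_UNICODE = 0x20, 0x80


def extract_arguments(data: bytes) -> Optional[str]:
    """Return the command-line arguments string, or None if the link has none."""
    (flags,) = struct.unpack_from("<I", data, 20)
    if not flags & HAS_ARGUMENTS:
        return None
    offset = 76  # size of the fixed header
    if flags & HAS_ID_LIST:
        # IDListSize does not count its own two bytes.
        (id_list_size,) = struct.unpack_from("<H", data, offset)
        offset += 2 + id_list_size
    if flags & HAS_LINK_INFO:
        # LinkInfoSize covers the whole structure, including the size field.
        (link_info_size,) = struct.unpack_from("<I", data, offset)
        offset += link_info_size
    char_width = 2 if flags & IS_UNICODE else 1
    # String entries appear in a fixed order; skip those before the arguments.
    for flag in (HAS_NAME, HAS_REL_PATH, HAS_WORK_DIR):
        if flags & flag:
            (count,) = struct.unpack_from("<H", data, offset)
            offset += 2 + count * char_width
    (count,) = struct.unpack_from("<H", data, offset)
    raw = data[offset + 2 : offset + 2 + count * char_width]
    encoding = "utf-16-le" if char_width == 2 else "cp1252"
    return raw.decode(encoding, errors="replace")
```

A return value of `None` corresponds to the case in the text where no script file can be extracted and the shortcut is treated as normal.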
Step 140, when the address information included in the script file is not stored in the address database, the electronic device determines that the shortcut file is a malicious file.
In this embodiment, the script file may include address information. The electronic device acquires the address information from the script file and judges whether it is the same as any address information stored in the address database. If it is not, the electronic device judges that the address information included in the script file is not stored in the address database, and determines that the script file including the address information is a malicious script file and that the shortcut file including the script file is a malicious shortcut file. The address database stores legal address information.
If the shortcut file is instead determined to be legal and a user subsequently loads and starts it, the electronic device can load and start it directly.
It should be noted that, in this step, the address information specifically includes any one of a Uniform Resource Locator (URL) address, an IP address, or a domain name address.
The address database in this step is a whitelist address database, that is, the URL addresses, IP addresses, or domain name addresses stored in the address database are legal address information. If the address information included in the script file is not stored in the address database, the electronic device determines that the address information is a malicious address, that the script file including the address information is a malicious script file, and that the shortcut file including the script file is a malicious shortcut file.
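The whitelist decision of step 140 can be sketched as a simple membership test. This is an illustration only: the address database is modeled as an in-memory set, the comparison is an exact string match, and the name `classify_shortcut` is hypothetical. Treating an empty address list as legal is an assumption of the sketch; the text does not address that case.

```python
def classify_shortcut(addresses, whitelist):
    """Return 'malicious' if any extracted address is absent from the whitelist.

    addresses: address strings extracted from the script file.
    whitelist: collection of legal addresses (stands in for the address database).
    """
    legal = set(whitelist)
    for address in addresses:
        if address not in legal:
            return "malicious"
    # Assumption: a script with no extracted addresses is not flagged.
    return "legal"
```

For instance, with a whitelist of `{"update.example.com"}`, a script that contacts `175.223.4.34` would be classified as malicious because that address is not listed.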
In this embodiment, the electronic device extracts the script file included in the shortcut file according to the obtained shortcut file in the system, and obtains the address information included in the script file. And when the address information included in the script file is not stored in the address database storing legal address information, determining that the shortcut file is a malicious file, thereby being capable of carrying out security detection on the shortcut file.
The malicious file detection method provided by the invention can detect malicious shortcut files and thereby covers a blind spot of existing security software. Compared with existing detection based on pattern matching, the embodiment of the invention can accurately extract the script file embedded in a shortcut file. The legality of the shortcut file is judged by identifying the address information it contains, which removes many interference items and improves detection efficiency compared with existing detection approaches.
Fig. 2 is a flowchart of another malicious file detection method provided in the embodiment of the present invention, and an implementation subject in the embodiment of the present invention is an electronic device. The electronic device may be a terminal device, such as a personal computer, a desktop computer, or the like. The electronic device may also be a server. As shown in fig. 2, the method of this embodiment specifically includes the following steps:
Step 200, the electronic device acquires files in the system.
Step 201, according to the file header information of the files in the system, the electronic device determines whether the files in the system are shortcut files.
Step 202, if the file is the shortcut file, the electronic device obtains a script file included in the shortcut file.
In this embodiment, the implementation process of steps 200 to 202 is similar to that of steps 110 to 120 of the above method embodiment, and is not described here again.
Step 203, the electronic device performs keyword matching on the script file.
In this embodiment, the electronic device pre-stores common keywords for identifying script files. For example, the keywords may specifically be PowerShell, cmd, vbs, and the like. The pre-stored keywords correspond to script file types. The keyword matching performed by the electronic device on the script file is specifically as follows: the electronic device identifies whether the script file includes a pre-stored keyword. If the script file includes a pre-stored keyword, step 204 is performed; if the script file does not include any pre-stored keyword, step 205 is performed.
Step 204, when the script file includes a pre-stored keyword, the electronic device determines the type of the script file according to the keyword included in the script file.
In this embodiment, according to the judgment in step 203, if the script file includes the pre-stored keyword, the electronic device determines the type of the script file according to the keyword included in the script file.
In one example, if the script file includes a pre-stored keyword that is PowerShell, the electronic device determines that the type of the script file is PowerShell.
Step 205, when the script file does not include the pre-stored keyword, the electronic device determines that the shortcut file is a legal file.
In this embodiment, according to the judgment of step 203, if the script file does not include the pre-stored keyword, the electronic device determines that the script file may not exist in the shortcut file. At this time, the electronic device determines that the shortcut file is a legitimate file.
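Steps 203 to 205 amount to a lookup from keyword to script type. The sketch below is illustrative: the table `KEYWORD_TO_TYPE`, the type labels, and the case-insensitive substring match are assumptions, since the text only lists PowerShell, cmd and vbs as example keywords.

```python
# Hypothetical keyword table: pre-stored keyword -> script type.
KEYWORD_TO_TYPE = {
    "powershell": "PowerShell",
    "cmd": "cmd",
    "vbs": "VBScript",
}


def detect_script_type(script: str):
    """Return the script type for the first matching keyword, else None."""
    lowered = script.lower()  # assumption: matching ignores case
    for keyword, script_type in KEYWORD_TO_TYPE.items():
        if keyword in lowered:
            return script_type
    return None  # step 205: no keyword found, the shortcut is treated as legal
```

A plain substring test is easy to evade and can also over-match (for example, "cmd" inside an unrelated word), so a real implementation would likely need stricter matching.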
Step 206, the electronic device preprocesses the script file according to the type of the script file to obtain an original script file.
In this embodiment, after the electronic device determines the type of the script file, different script files are preprocessed according to different types to obtain an original script file.
Further, after the electronic device determines the type of the script file, the electronic device may also determine the language type of the script file. For example, if the type of the script file is PowerShell, the language type of the script file is also PowerShell.
The preprocessing in the embodiment of the present invention generally includes decrypting the encrypted script file to restore the original script file, and may also include decompressing the compressed script file to restore the original script file.
In one example, the electronic device determines that the type of the script file is PowerShell. The preprocessing operation corresponding to this type is to decode the content in the script file that was encoded with Base64 and is restored in the script through FromBase64String, so as to obtain the original script file.
It is understood that the correspondence between the type of the script file and the preprocessing operation is prestored in the electronic device. That is, a certain type of script file corresponds one-to-one to a preprocessing operation.
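The PowerShell preprocessing example can be sketched as locating Base64 literals passed to FromBase64String and decoding them. The regular expression, the function name `decode_base64_literals`, and the choice of UTF-8 for the decoded bytes are assumptions of this sketch; scripts that build the argument dynamically or use another text encoding would need additional handling.

```python
import base64
import binascii
import re
from typing import List

# Matches FromBase64String('<literal>') with single or double quotes.
B64_CALL = re.compile(
    r"FromBase64String\(\s*['\"]([A-Za-z0-9+/=]+)['\"]\s*\)", re.IGNORECASE
)


def decode_base64_literals(script: str, encoding: str = "utf-8") -> List[str]:
    """Decode every Base64 literal passed to FromBase64String in the script."""
    decoded = []
    for match in B64_CALL.finditer(script):
        try:
            raw = base64.b64decode(match.group(1), validate=True)
        except (binascii.Error, ValueError):
            continue  # not valid Base64; leave it for other handlers
        decoded.append(raw.decode(encoding, errors="replace"))
    return decoded
```

The decoded strings play the role of the original script file that the later steps analyze.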
Step 207, the electronic device analyzes the syntactic structure of the original script file and determines an operation rule of the original script file.
In this embodiment, as an example, the electronic device may perform lexical and semantic analysis on the original script file to obtain an operation rule for the original script file.
In one example, the PowerShell original script file is specifically:
$ip1='175.2'
$ip2='23.4.34'
$ip=$ip1+$ip2
Through lexical analysis of the original script file, the variables obtained from the code are $ip1, $ip2 and $ip, and the values obtained are '175.2' and '23.4.34'.
And performing semantic analysis on the original script file to obtain an operation rule of the original script file, namely the operation of combining the character strings.
Step 208, according to the operation rule, the electronic device processes the original script file accordingly to obtain a processed original script file.
In this embodiment, according to the operation rule, the electronic device performs corresponding processing on the original script file to obtain a processed original script file.
According to the foregoing example, if the operation rule obtained by the electronic device is an operation for merging character strings, the electronic device merges the character strings according to the operation rule, and the obtained processed original script file is:
$ip='175.223.4.34'
It is to be understood that the operation rules are not limited to merging operations on strings. For example, the operation rule may also include a replacement operation on a character string, a deletion operation, and the like. The specific operation rule is determined according to the original script file.
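The string merging of steps 207 and 208 can be illustrated with a small folding routine. This is a deliberately narrow sketch, not a PowerShell parser: it only understands lines of the form `$name = term + term ...` where each term is a single-quoted literal or a previously assigned variable, and it would mis-split a literal that itself contains a plus sign. The name `fold_string_additions` is hypothetical.

```python
import re

ASSIGN = re.compile(r"^\s*(\$\w+)\s*=\s*(.+?)\s*$")
LITERAL = re.compile(r"^'([^']*)'$")


def fold_string_additions(script: str) -> dict:
    """Resolve simple string concatenations and return variable -> value."""
    values = {}
    for line in script.splitlines():
        match = ASSIGN.match(line)
        if not match:
            continue
        name, expression = match.group(1), match.group(2)
        parts = []
        for term in expression.split("+"):
            term = term.strip()
            literal = LITERAL.match(term)
            if literal:
                parts.append(literal.group(1))
            elif term in values:
                parts.append(values[term])
            else:
                parts = None  # unknown term: leave this assignment unresolved
                break
        if parts is not None:
            values[name] = "".join(parts)
    return values
```

Applied to the three-line example from the text, the routine resolves `$ip` to `175.223.4.34`, which is the merged value that the address extraction step then sees.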
It should be noted that some types of script files do not need to be preprocessed. In this case, after determining the type of the script file, the electronic device directly performs step 209 without performing steps 206 to 208.
In one example, when the type of the script file is VBScript or JavaScript, the corresponding script language currently has no built-in encryption and decryption functions, and any encryption and decryption is mostly implemented by the script itself. Therefore, step 209 can be performed directly without performing steps 206 to 208.
Step 209, the electronic device extracts address information from the processed original script file through a regular expression, and stores the address information.
In this embodiment, after obtaining the processed original script file, the electronic device extracts address information from the processed original script file through the regular expression, and stores the address information.
In one example, the regular expression used to extract IP address information is specifically:
((25[0-5])|(2[0-4]\d)|(1\d\d)|([1-9]\d)|\d)(\.((25[0-5])|(2[0-4]\d)|(1\d\d)|([1-9]\d)|\d)){3}
In one example, the regular expression used to extract domain name address information is specifically: [a-zA-Z0-9][-a-zA-Z0-9]{0,62}(\.[a-zA-Z0-9][-a-zA-Z0-9]{0,62})+\.?
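The extraction in step 209 can be sketched with patterns modeled on the two examples above. The non-capturing groups, the digit-boundary lookarounds on the IP pattern, the omission of an optional trailing dot, and the function name `extract_addresses` are choices of this sketch rather than part of the described method.

```python
import re

OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]\d|\d)"
# Lookarounds keep the match from starting or ending inside a longer number.
IP_RE = re.compile(r"(?<!\d)" + OCTET + r"(?:\." + OCTET + r"){3}(?!\d)")
DOMAIN_RE = re.compile(
    r"[a-zA-Z0-9][-a-zA-Z0-9]{0,62}(?:\.[a-zA-Z0-9][-a-zA-Z0-9]{0,62})+"
)


def extract_addresses(text: str) -> dict:
    """Return IP addresses and domain names found in the processed script."""
    ips = IP_RE.findall(text)
    # A dotted IP also satisfies the domain pattern, so filter those out.
    domains = [d for d in DOMAIN_RE.findall(text) if not IP_RE.fullmatch(d)]
    return {"ip": ips, "domain": domains}
```

Note that the domain pattern is permissive: a file name such as `a.ps1` also matches it, so results from this sketch would need further filtering before being compared with the address database.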
Step 210, when the address information included in the script file is not stored in the address database, the electronic device determines that the shortcut file is an illegal file, or is called a malicious file.
In this embodiment, the electronic device obtains the address information from the script file, and determines whether it is the same as any address information already stored in the address database. If it is not, the electronic device judges that the address information included in the script file is not stored in the address database, and determines that the script file including the address information is an illegal script file and that the shortcut file including the script file is an illegal file, that is, a malicious file.
Step 211, when the address information included in the script file is stored in the address database, the electronic device determines that the shortcut file is a legal file.
In this embodiment, the electronic device extracts the script file included in the shortcut file according to the obtained shortcut file in the system, and obtains the address information included in the script file. And when the address information included in the script file is not stored in the address database storing legal address information, determining that the shortcut file is a malicious file, thereby being capable of carrying out security detection on the shortcut file.
The malicious file detection method provided by the invention can detect malicious shortcut files and thereby covers a blind spot of existing security software. Compared with existing detection based on pattern matching, the embodiment of the invention can accurately extract the script file embedded in a shortcut file. The legality of the shortcut file is judged by identifying the address information it contains, which removes many interference items and improves detection efficiency compared with existing detection approaches.
Fig. 3 is a schematic structural diagram of a malicious file detection apparatus according to an embodiment of the present invention. As shown in fig. 3, the malicious file detection apparatus of this embodiment is applied to an electronic device and may include: a first obtaining unit 310, an extracting unit 320, a second obtaining unit 330, and a first determining unit 340.
The first obtaining unit 310 is configured to obtain a shortcut file in a system; an extracting unit 320 for extracting a script file included in the shortcut file; a second obtaining unit 330, configured to obtain address information included in the script file; a first determining unit 340, configured to determine that the shortcut file is a malicious file when the address information included in the script file is not stored in an address database, where legal address information is stored in the address database.
The apparatus of this embodiment may be used to implement the technical solution of the method embodiment shown in fig. 1, and the implementation principle and the technical effect are similar, which are not described herein again.
Fig. 4 is a schematic structural diagram of another malicious file detection apparatus according to an embodiment of the present invention. As shown in fig. 4, on the basis of the apparatus structure shown in fig. 3, the first obtaining unit 310 of this embodiment is specifically configured to traverse files in the system and obtain shortcut files in the system according to file header information of the files.
Optionally, the second obtaining unit 330 includes: a type determining module 3301, a first processing module 3302, a second processing module 3303, a third processing module 3304, and an address information extracting module 3305.
The type determining module 3301 is configured to determine a type of the script file; the first processing module 3302 is configured to pre-process the script file according to the type of the script file to obtain an original script file; the second processing module 3303 is configured to perform syntactic structure analysis on the original script file, and determine an operation rule for the original script file; the third processing module 3304 is configured to perform corresponding processing on the original script file according to the operation rule, so as to obtain a processed original script file; the address information extraction module 3305 is configured to extract address information from the processed original script file through a regular expression.
Optionally, the type determining module 3301 may further include: a matching sub-module and a type determination sub-module. The matching sub-module is used for performing keyword matching processing on the script file; the type determining sub-module is used for determining the type of the script file according to the keywords included in the script file when the script file includes the pre-stored keywords, and the pre-stored keywords and the type of the script file have corresponding relations.
Optionally, the second processing module 3303 is specifically configured to perform syntactic analysis and/or semantic analysis on the original script file, and determine that a string addition (concatenation) operation can be performed in the original script file; the third processing module 3304 is specifically configured to merge, according to the string addition operation, the character strings in the original script file on which string addition can be performed, to obtain a processed original script file.
Optionally, the first determining unit 340 is further configured to determine that the shortcut file is a legal file when the address information included in the script file is stored in the address database.
Optionally, the address information specifically includes a URL address, an IP address, or a domain name address.
The apparatus of this embodiment may be used to implement the technical solutions of the method embodiments shown in fig. 1 and fig. 2, and the implementation principles and technical effects are similar, which are not described herein again.
Correspondingly, the detection device for the malicious file provided by the embodiment of the invention can also be realized by another structure. Fig. 5 is a schematic structural diagram of an embodiment of an electronic device provided by the present invention, which can implement the processes of the embodiments shown in fig. 1-2 of the present invention, and as shown in fig. 5, the electronic device may include: a housing 41, a processor 42, a memory 43, a circuit board 44, and a power circuit 45. Wherein, the circuit board 44 is arranged inside the space enclosed by the housing 41, and the processor 42 and the memory 43 are arranged on the circuit board 44; a power supply circuit 45 for supplying power to each circuit or device of the electronic apparatus; the memory 43 is used for storing executable program code; the processor 42 executes a program corresponding to the executable program code by reading the executable program code stored in the memory 43, for executing the method described in the foregoing embodiment.
The specific execution process of the above steps by the processor 42 and the steps further executed by the processor 42 by running the executable program code may refer to the description of the embodiment shown in fig. 1-2 of the present invention, and are not described herein again.
The electronic device is a device that provides computing services. It includes a processor, a hard disk, a memory, a system bus and the like, and its architecture is similar to that of a general-purpose computer. However, because it needs to provide highly reliable services, it has high requirements for processing capability, stability, reliability, security, scalability, manageability and the like.
Embodiments of the present invention also provide a computer-readable storage medium storing one or more programs, which are executable by one or more processors to implement the method described in the foregoing embodiments.
It should be noted that, in this document, terms such as "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
The embodiments in the present specification are described in a correlated manner; for the same or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments.
In particular, as for the apparatus embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof.
In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be implemented by a program instructing related hardware. The program may be stored in a computer-readable storage medium, and when executed, the program performs one or a combination of the steps of the method embodiments.
For convenience of description, the above apparatus is described as being divided by function into various units/modules. Of course, when the invention is implemented, the functions of the units/modules may be implemented in one or more pieces of software and/or hardware.
From the above description of the embodiments, it is clear to those skilled in the art that the present invention can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which may be stored in a storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments.