CN116611068B - File scanning method based on confusion path, electronic equipment and storage medium - Google Patents

File scanning method based on confusion path, electronic equipment and storage medium

Info

Publication number
CN116611068B
Authority
CN
China
Prior art keywords
path
file
confusion
folder
characteristic information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310900470.2A
Other languages
Chinese (zh)
Other versions
CN116611068A (en
Inventor
张东旭
奚乾悦
孙洪伟
肖新光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Antiy Network Technology Co Ltd
Original Assignee
Beijing Antiy Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Antiy Network Technology Co Ltd
Priority to CN202310900470.2A
Publication of CN116611068A
Application granted
Publication of CN116611068B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database

Abstract

The application provides a file scanning method based on a confusion path, an electronic device and a storage medium, and relates to the field of malicious file detection. The method comprises the following steps: obtaining a confusion path list B; loading B into a memory space; restoring, in the memory space, an original path list A from B, so that A and B exist in the memory space at the same time; sending the current path D to be scanned to the memory space to judge, in the memory space, whether D belongs to A or B; and if $D \in A$, skipping the security detection of the file corresponding to D. The method avoids leakage of the original paths and keeps the confusion paths from being recognized as such, which in turn reduces the computing resources occupied during file scanning and improves the execution efficiency of file scanning.

Description

File scanning method based on confusion path, electronic equipment and storage medium
Technical Field
The present application relates to the field of malicious file detection, and in particular, to a method for scanning a file based on a confusion path, an electronic device, and a storage medium.
Background
At present, most antivirus software sets a trusted folder on a user terminal, a user places some trusted files in the trusted folder, and when the antivirus software scans viruses on the terminal, the files in the trusted folder can be skipped, so that the scanning efficiency of the antivirus software is improved.
However, the path information of the trusted folder is generally stored on the user terminal in plaintext. If an attacker obtains this path information, the attacker can place a malicious file in the trusted folder of the user terminal and thereby evade scanning by the antivirus software.
Disclosure of Invention
Based on the above, the present application provides a file scanning method based on a confusion path, an electronic device and a storage medium, so as to solve the problem that the path information of a trusted folder is easily obtained by an attacker, who can then place a malicious file in the trusted folder of the user terminal to evade scanning by antivirus software. The following technical scheme is adopted:
According to a first aspect of the present application, there is provided a file scanning method based on a confusion path, the method comprising the following steps:
obtaining a confusion path list $B = (B_1, B_2, \dots, B_i, \dots, B_n)$, $i = 1, 2, \dots, n$; wherein $B_i$ is the i-th confusion path and n is the number of confusion paths; $B_i$ is obtained from the original path corresponding to $B_i$;
loading B into the memory space;
in the memory space, restoring the original path list $A = (A_1, A_2, \dots, A_i, \dots, A_n)$ from B, so that A and B exist in the memory space at the same time; wherein $A_i$ is the original path corresponding to $B_i$;
sending the current path D to be scanned to the memory space, so as to judge in the memory space whether D belongs to A or B;
and if $D \in A$, skipping the security detection of the file corresponding to D.
Optionally, B is obtained by the following steps:
obtaining the original path list $A = (A_1, A_2, \dots, A_i, \dots, A_n)$;
traversing A and performing confusion processing on $A_i$ to obtain the confusion path list $B = (B_1, B_2, \dots, B_i, \dots, B_n)$; wherein $B_i$ is the confusion path generated by performing confusion processing on $A_i$.
Optionally, $B_i$ is obtained by the following steps:
splitting $A_i$ according to a preset character to obtain the substring list $Z_i = (Z_i^1, Z_i^2, \dots, Z_i^h, \dots, Z_i^{g(i)})$ corresponding to $A_i$, $h = 1, 2, \dots, g(i)$; wherein $Z_i^h$ is the h-th substring of $A_i$ and $g(i)$ is the number of substrings of $A_i$; $g(i) = Q_i + 1$, where $Q_i$ is the number of preset characters in $A_i$;
if $Z_i^1$ is identical to the preset character string, determining the number of characters $\mathrm{NUM}_i^{a+1}$ in $Z_i^{a+1}$; wherein $a = 1, 2, \dots, g(i)-1$ and $Z_i^{a+1}$ is the (a+1)-th substring;
according to the preset replacement character string list $L_1$ corresponding to the preset character string, replacing $Z_i^{a+1}$ with the $\mathrm{NUM}_i^{a+1}$-th preset replacement character string in column a of $L_1$ to obtain $B_i$; $L_1$ comprises r columns, each column comprising a plurality of preset replacement character strings; $r = g(i) - 1$;
if $Z_i^1$ is different from the preset character string, replacing the first character of $Z_i^1$ with the preset replacement character that corresponds to it in the preset replacement character mapping table to obtain $B_i$.
Optionally, the method further comprises:
creating a camouflage folder $C_i$ under the path $B_i$;
creating several camouflage files in $C_i$ to obtain the camouflage file list $H_i = (H_{i,1}, H_{i,2}, \dots, H_{i,j}, \dots, H_{i,f(i)})$ corresponding to $C_i$, $j = 1, 2, \dots, f(i)$; wherein $H_{i,j}$ is the j-th camouflage file in $C_i$, and $f(i)$ is the number of camouflage files in $C_i$;
acquiring a first characteristic information list $R = (R_1, R_2, \dots, R_i, \dots, R_n)$ and a second characteristic information list $G_i = (G_{i,1}, G_{i,2}, \dots, G_{i,j}, \dots, G_{i,f(i)})$; wherein $R_i$ is the first characteristic information of $C_i$, obtained by extracting features from $C_i$, and $G_{i,j}$ is the second characteristic information of $H_{i,j}$, obtained by extracting features from $H_{i,j}$; the first characteristic information is used for distinguishing the camouflage folders, and the second characteristic information is used for distinguishing the camouflage files.
Optionally, the method further comprises:
if $D \in B$, acquiring the number DM of files in the folder corresponding to D;
if DM = DP, extracting features of the folder corresponding to D to obtain the first characteristic information $R_D$ of that folder; otherwise, executing an alarm action; wherein DP is the number of camouflage files under the confusion path corresponding to D;
if $R_D \notin R$, executing an alarm action.
Optionally, the method further comprises:
if $R_D \in R$, obtaining third characteristic information of all files in the folder corresponding to D to obtain a third characteristic information list $F = (F_1, F_2, \dots, F_k, \dots, F_{DM})$, $k = 1, 2, \dots, DM$; wherein $F_k$ is the third characteristic information of the k-th file in the folder corresponding to D; the third characteristic information is of the same type as the second characteristic information;
traversing F and comparing $F_k$ with the second characteristic information of each camouflage file in the camouflage folder corresponding to D;
if $F_k$ differs from the second characteristic information of every camouflage file in the camouflage folder corresponding to D, judging that the file corresponding to $F_k$ is an abnormal file.
Optionally, the first characteristic information is an MD5 value or a hash value; the second characteristic information is an MD5 value or a hash value.
Optionally, $f(i) = \lceil c \times i + d \rceil$, where c is a preset scaling factor, d is a preset base number of camouflage files, and $\lceil \cdot \rceil$ is the round-up (ceiling) function.
According to another aspect of the present application, there is also provided a non-transitory computer readable storage medium having stored therein at least one instruction or at least one program, the at least one instruction or the at least one program being loaded and executed by a processor to implement the above-described obfuscated path-based file scanning method.
According to another aspect of the present application, there is also provided an electronic device comprising a processor and the above-described non-transitory computer-readable storage medium.
The application has at least the following beneficial effects:
In the file scanning method based on a confusion path provided by the present application, only the confusion path list is stored on the user terminal. Each confusion path is derived from an original path, and the original path list (that is, the list of truly trusted file paths) is not stored on the user terminal. An attacker who tries to obtain the trust paths from the user terminal can therefore obtain only the confusion paths, so the truly trusted file paths (the original paths) are not leaked.
Furthermore, when files are scanned, the confusion paths are loaded into the memory space and then restored to the original paths, so that the original paths and the confusion paths exist in the memory space at the same time. Because an attacker cannot obtain the data in the memory space, the original paths are not leaked. Moreover, because the attacker does not know that the original paths exist, the attacker cannot infer that the confusion paths were generated from them by confusion processing, so the confusion paths are not recognized as such.
Further, when the current path D to be scanned is processed, D is first sent to the memory space, where it is judged whether D belongs to A or B. Because A and B exist in plaintext in the memory space, whether D is a trust path can be judged without performing confusion processing on D and without compromising data security, which reduces the computing resources occupied during file scanning and improves its execution efficiency. Meanwhile, if D belongs to the original paths, the files corresponding to D are trusted files, and because the original paths cannot be obtained by an attacker from the user terminal, the security detection of all files corresponding to D is skipped, which further reduces the computing resources occupied during file scanning and improves its execution efficiency.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a method for scanning a file based on a confusion path according to an embodiment of the present application;
FIG. 2 is an application scenario diagram of a method for scanning a file based on a confusion path according to an embodiment of the present application;
FIG. 3 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to fall within the scope of the application.
It is noted that various aspects of the embodiments are described below within the scope of the following claims. It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is merely illustrative. Based on the present disclosure, one skilled in the art will appreciate that one aspect described herein may be implemented independently of any other aspect, and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented and/or a method practiced using any number of the aspects set forth herein. In addition, such apparatus may be implemented and/or such methods practiced using other structure and/or functionality in addition to one or more of the aspects set forth herein.
Referring to FIG. 1, which is a flowchart of the file scanning method based on a confusion path provided in this embodiment, the method includes the following steps:
S100, obtaining a confusion path list $B = (B_1, B_2, \dots, B_i, \dots, B_n)$, $i = 1, 2, \dots, n$; wherein $B_i$ is the i-th confusion path and n is the number of confusion paths; $B_i$ is obtained from the original path corresponding to $B_i$.
In this embodiment, a user terminal may have a plurality of original paths. The original paths are trusted paths: when antivirus software scans files, it skips every file in the folders corresponding to the trusted paths. It should be noted that the confusion paths and the confusion path list actually exist on the user terminal, whereas the original path list does not. An attacker therefore cannot obtain the original path information from the user terminal and can obtain only the confusion path list, so the trust path information on the user terminal is not leaked.
S200, loading the B into the memory space.
In this embodiment, before the anti-virus software on the user terminal performs security detection on the path to be scanned, B needs to be loaded into the memory space.
S300, in the memory space, restoring the original path list $A = (A_1, A_2, \dots, A_i, \dots, A_n)$ from B, so that A and B exist in the memory space at the same time; wherein $A_i$ is the original path corresponding to $B_i$.
In this embodiment, the storage space of the user terminal does not store the original path list A; instead, A is restored from B in the memory space. Because an attacker cannot obtain the data in the memory space, A is not leaked while it resides there.
S400, the current path D to be scanned is sent to the memory space, so that whether the path D belongs to A or B is judged in the memory space.
In this embodiment, the acquired path D to be scanned is sent to the memory space, where it is judged whether D belongs to A or B. Because the comparison against A is executed in the memory space, the data used throughout the judgment is not leaked, which keeps A secure. Meanwhile, because A and B exist in plaintext in the memory space, D can be judged without operations such as encryption or feature extraction on D and without compromising data security, which reduces the computing resources occupied during file scanning and improves its execution efficiency.
S500, if $D \in A$, skipping the security detection of the file corresponding to D.
In this embodiment, whether D belongs to A may be determined by comparing D with each original path in A one by one. If $D \in A$, D is an original path, that is, a true trust path; all files corresponding to D can then be judged to be trusted files, and the security detection process can be skipped for them. This reduces the computing resources occupied by file scanning and improves its execution efficiency.
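The decision flow of steps S200 to S500 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the `demo_restore` rule (drive letter E mapped back to D), the exact string comparison, and the "scan" outcome for paths in neither list are all hypothetical.

```python
from typing import Callable, Iterable


def classify_path(d: str, confusion_paths: Iterable[str],
                  restore: Callable[[str], str]) -> str:
    """Decide how path D is handled, following steps S200 to S500."""
    b = list(confusion_paths)          # S200: B is loaded into memory
    a = [restore(p) for p in b]        # S300: A is restored from B, in memory only
    if d in a:                         # S500: D is a truly trusted path
        return "skip"
    if d in b:                         # D matches a confusion path
        return "verify"
    return "scan"                      # assumption: other paths are scanned normally


def demo_restore(path: str) -> str:
    """Hypothetical restore rule: drive letter E maps back to D."""
    return "D" + path[1:] if path.startswith("E:") else path


B = [r"E:\Tools\Trusted"]
results = [classify_path(p, B, demo_restore)
           for p in (r"D:\Tools\Trusted", r"E:\Tools\Trusted", r"D:\Downloads")]
print(results)
```

In this sketch A exists only as a local list inside `classify_path`, which mirrors the requirement that the original path list is never written to storage.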
In an exemplary embodiment, B is obtained by:
S110, acquiring an original path list $A = (A_1, A_2, \dots, A_i, \dots, A_n)$; wherein $A_i$ is the i-th original path and n is the number of original paths.
In this embodiment, a user terminal may have a plurality of original paths. The original paths are trusted paths: when antivirus software scans files, it skips every file in the folders corresponding to the trusted paths. It should be noted that each trust path actually exists on the user terminal, but the original path list A is not stored on the user terminal, which ensures that the trust path information on the user terminal is not leaked.
S120, traversing A and performing confusion processing on $A_i$ to obtain the confusion path list $B = (B_1, B_2, \dots, B_i, \dots, B_n)$; wherein $B_i$ is the confusion path generated by performing confusion processing on $A_i$.
In this embodiment, confusion processing is performed on $A_i$ according to the path information of $A_i$, that is, the real original path is changed into a preset path. It should be noted that the confusion path also actually exists on the user terminal, and the folder under the confusion path can be found by following the confusion path; this folder can be regarded as controllable. Even if an attacker obtains the confusion path information, the attacker finds only the folder under the confusion path and cannot find the folder under the original path.
Further, $B_i$ is obtained by the following steps:
S111, splitting $A_i$ according to a preset character to obtain the substring list $Z_i = (Z_i^1, Z_i^2, \dots, Z_i^h, \dots, Z_i^{g(i)})$ corresponding to $A_i$, $h = 1, 2, \dots, g(i)$; wherein $Z_i^h$ is the h-th substring of $A_i$ and $g(i)$ is the number of substrings of $A_i$; $g(i) = Q_i + 1$, where $Q_i$ is the number of preset characters in $A_i$.
In this embodiment, the preset character may be "\". For example, splitting C:\Program Files (x86)\Microsoft.NET\RedistList yields "C:", "Program Files (x86)", "Microsoft.NET" and "RedistList", so $Z_i^1$ is "C:", $Z_i^2$ is "Program Files (x86)", $Z_i^3$ is "Microsoft.NET" and $Z_i^4$ is "RedistList". The value of $g(i)$ depends on $Q_i$; typically $g(i) = Q_i + 1$.
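The split in S111 can be illustrated with the example path from the text. The helper name `split_path` is hypothetical; the sketch only shows that splitting on the preset character "\" yields $g(i) = Q_i + 1$ substrings.

```python
def split_path(path: str, sep: str = "\\") -> list[str]:
    """Split an original path A_i on the preset character into substrings Z_i."""
    return path.split(sep)


original = r"C:\Program Files (x86)\Microsoft.NET\RedistList"
substrings = split_path(original)
q = original.count("\\")   # Q_i: number of preset characters in A_i

print(substrings)
print(len(substrings), q + 1)
```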
S112, if $Z_i^1$ is identical to the preset character string, determining the number of characters $\mathrm{NUM}_i^{a+1}$ in $Z_i^{a+1}$; wherein $a = 1, 2, \dots, g(i)-1$ and $Z_i^{a+1}$ is the (a+1)-th substring.
In this embodiment, the preset character string is "C:". If $Z_i^1$ is "C:", the folder corresponding to $A_i$ is very likely a system folder, and system folders generally have distinctive naming characteristics. In this case the camouflage folder corresponding to $A_i$ should not be placed on another disk, which prevents that camouflage folder from being discovered.
S113, according to the preset replacement character string list $L_1$ corresponding to the preset character string, replacing $Z_i^{a+1}$ with the $\mathrm{NUM}_i^{a+1}$-th preset replacement character string in column a of $L_1$ to obtain $B_i$; $L_1$ comprises r columns, each column comprising a plurality of preset replacement character strings; $r = g(i) - 1$.
In this embodiment, each column of $L_1$ corresponds to one level of $A_i$, starting from the second level, where a level of $A_i$ is the position that a substring occupies within $A_i$; for example, "C:" corresponds to one level and "Program Files (x86)" corresponds to another. First, the column of preset replacement character strings corresponding to $Z_i^{a+1}$ is determined to be column a; then the $\mathrm{NUM}_i^{a+1}$-th character string in column a is taken as the preset replacement character string for $Z_i^{a+1}$, and $Z_i^{a+1}$ is replaced with it to obtain $B_i$.
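A minimal sketch of the table lookup in S112 and S113 follows. The contents of $L_1$ are not given in the text, so `build_table` fills a hypothetical table with placeholder names; the function names are likewise assumptions. The sketch covers only the forward replacement, because this passage does not spell out how a length-indexed replacement is reversed.

```python
PRESET_PREFIX = "C:"   # preset character string named in the text
SEP = "\\"             # preset character used for splitting


def build_table(columns: int, rows: int) -> list[list[str]]:
    """Hypothetical L_1: table[a-1][k-1] is the k-th replacement string of column a."""
    return [[f"lvl{a}_{k:03d}" for k in range(1, rows + 1)]
            for a in range(1, columns + 1)]


def obfuscate_system_path(path: str, table: list[list[str]]) -> str:
    """Replace each substring after the first by a table entry chosen by its length."""
    parts = path.split(SEP)
    if parts[0] != PRESET_PREFIX:
        raise ValueError("first substring is not the preset character string")
    if len(parts) - 1 > len(table):
        raise ValueError("table has fewer columns than the path has levels")
    out = [parts[0]]
    for a, sub in enumerate(parts[1:], start=1):   # a = 1 .. g(i) - 1
        num = len(sub)                             # NUM_i^{a+1}
        column = table[a - 1]
        if num == 0 or num > len(column):
            raise ValueError(f"no replacement string for length {num} in column {a}")
        out.append(column[num - 1])                # NUM-th string of column a
    return SEP.join(out)


L1 = build_table(columns=3, rows=32)
obfuscated = obfuscate_system_path(
    r"C:\Program Files (x86)\Microsoft.NET\RedistList", L1)
print(obfuscated)
```

Here the three substrings after "C:" have 19, 13 and 10 characters, so the sketch picks entry 19 of column 1, entry 13 of column 2 and entry 10 of column 3.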
In this embodiment, B is generated once confusion processing has been performed on all original paths. After B is generated, A may be deleted from the storage space of the user terminal, so that the original paths are not leaked through an attacker obtaining a copy of A stored on the user terminal.
S114, if $Z_i^1$ is different from the preset character string, replacing the first character of $Z_i^1$ with the preset replacement character that corresponds to it in the preset replacement character mapping table to obtain $B_i$.
In this embodiment, the preset character string may be "C:". If $Z_i^1$ is different from the preset character string, the folder corresponding to $A_i$ is not a system folder; in this case only the first character of $Z_i^1$ is replaced, using the corresponding preset replacement character in the preset replacement character mapping table. For example, if the mapping table maps D to E and the first character of $Z_i^1$ is D, then D is replaced by E and the substrings after $Z_i^1$ are left unchanged, thereby generating $B_i$.
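The drive-letter branch in S114 can be sketched as below. The mapping table `DRIVE_MAP` is a hypothetical example (the text only gives D to E), and the restore function assumes the table is one-to-one so that it can be inverted.

```python
# Hypothetical preset replacement character mapping table; only D -> E comes from the text.
DRIVE_MAP = {"D": "E", "E": "F", "F": "D"}
REVERSE_MAP = {v: k for k, v in DRIVE_MAP.items()}   # valid because the table is one-to-one


def swap_first_character(path: str, mapping: dict[str, str]) -> str:
    """Replace the first character of the first substring; keep the rest unchanged."""
    first, sep, rest = path.partition("\\")
    if first == "C:":
        raise ValueError("paths starting with the preset string use the table branch")
    return mapping[first[0]] + first[1:] + sep + rest


def obfuscate_non_system_path(path: str) -> str:
    return swap_first_character(path, DRIVE_MAP)


def restore_non_system_path(path: str) -> str:
    return swap_first_character(path, REVERSE_MAP)


confused = obfuscate_non_system_path(r"D:\Tools\Trusted")
restored = restore_non_system_path(confused)
print(confused, restored)
```

None of the replacement characters in this example table is C, in line with the requirement that the first character be replaced by a character other than C.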
In this embodiment, when the folder corresponding to $A_i$ is not a system folder, it is sufficient to replace the first character of $Z_i^1$ with a character other than C to achieve confusion; the substrings after $Z_i^1$ need not be replaced, so the confusion process requires little computation and the confusion operation executes efficiently. In addition, even if an attacker determines that a $B_i$ generated in this way is a confusion path, $A_i$ was obfuscated according to the preset replacement character mapping table, $L_1$ and the preset replacement rule for $Z_i^{a+1}$; an attacker who obtains $B_i$ without also holding the mapping table, $L_1$ and the replacement rule still cannot restore $B_i$. This prevents $B_i$ from being restored and keeps the original path secure.
Further, after the step S100 and before the step S200, the method further includes the following steps:
S130, creating a camouflage folder $C_i$ under the path $B_i$.
In this embodiment, after obtaining each confusion path, generating a corresponding disguised folder according to information of each confusion path; after the disguised folder is generated, when an attacker performs path analysis and verification on the confusion path, the attacker can inquire the disguised folder under the confusion path, namely, the confusion path corresponds to the folder which exists actually, and the attacker cannot find that the confusion path is false, so that the confusion path is prevented from being identified.
Furthermore, on the premise that the confusing path is not recognized, an attacker cannot perceive the fact that the original path exists, so that the establishment of the disguised folder can also ensure the safety of the original path.
S140, creating several camouflage files in $C_i$ to obtain the camouflage file list $H_i = (H_{i,1}, H_{i,2}, \dots, H_{i,j}, \dots, H_{i,f(i)})$ corresponding to $C_i$, $j = 1, 2, \dots, f(i)$; wherein $H_{i,j}$ is the j-th camouflage file in $C_i$, and $f(i)$ is the number of camouflage files in $C_i$.
In this embodiment, the number of camouflage files differs between camouflage folders and depends on i, for example $f(i) = \lceil c \times i + d \rceil$, where c is a preset scaling factor, d is a preset base number of camouflage files, and $\lceil \cdot \rceil$ is the round-up (ceiling) function.
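The count formula can be evaluated directly. The values c = 1.5 and d = 3 below are illustrative assumptions, not values from the text.

```python
import math


def camouflage_file_count(i: int, c: float, d: float) -> int:
    """f(i) = ceil(c * i + d): number of camouflage files in the i-th camouflage folder."""
    return math.ceil(c * i + d)


counts = [camouflage_file_count(i, c=1.5, d=3) for i in range(1, 5)]
print(counts)
```

With these values the first four folders receive 5, 6, 8 and 9 files. Note that for a small scaling factor (c below 1) two neighbouring folders can round up to the same count, so c has to be chosen with that in mind if distinct counts are required.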
When an attacker counts the number of files in the disguised folders, the number of disguised files in each disguised folder is different, so that the disguised files in the disguised folders are more in line with the actual situation, the authenticity of the confusion path is further improved, and the difficulty of recognizing the confusion path is improved; the masquerading file is a preset file, such as a text file, an executable file and the like.
Creating a preset number of camouflage files in each camouflage folder serves two purposes. First, every camouflage folder contains files, so an attacker analyzing a camouflage folder does not become suspicious because the folder is empty. Second, the number of camouflage files in each camouflage folder does not change under normal conditions, so this number can be used as a reference value: whether a malicious file has been implanted in a camouflage folder can be judged from whether the number of files in that folder has changed.
S150, obtaining a first characteristic information list $R = (R_1, R_2, \dots, R_i, \dots, R_n)$ and a second characteristic information list $G_i = (G_{i,1}, G_{i,2}, \dots, G_{i,j}, \dots, G_{i,f(i)})$; wherein $R_i$ is the first characteristic information of $C_i$, obtained by extracting features from $C_i$, and $G_{i,j}$ is the second characteristic information of $H_{i,j}$, obtained by extracting features from $H_{i,j}$; the first characteristic information is used for distinguishing the camouflage folders, and the second characteristic information is used for distinguishing the camouflage files.
In this embodiment, the number and types of camouflage files differ considerably between camouflage folders, so the first characteristic information (such as a hash value or an MD5 value) extracted from the camouflage folder under each confusion path is different; extracting the first characteristic information from the camouflage folder under each confusion path yields R. The second characteristic information (such as a hash value or an MD5 value) of each camouflage file is also different; extracting the second characteristic information from each camouflage file in each camouflage folder yields $G_i$. Because the first and second characteristic information do not change under normal conditions, each camouflage folder can be distinguished by its first characteristic information, and each camouflage file can be distinguished by its second characteristic information.
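One way the two kinds of characteristic information could be computed is sketched below. The per-file MD5 follows the text; how a folder-level value is derived is not specified, so hashing the sorted file names together with the per-file digests is an assumption made for illustration, as are the function and file names.

```python
import hashlib
import tempfile
from pathlib import Path


def file_feature(path: Path) -> str:
    """Second characteristic information: MD5 digest of the file contents."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def folder_feature(folder: Path) -> str:
    """First characteristic information (assumed scheme): MD5 over sorted names and file digests."""
    digest = hashlib.md5()
    for f in sorted(p for p in folder.iterdir() if p.is_file()):
        digest.update(f.name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(file_feature(f).encode("ascii"))
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "notes.txt").write_text("camouflage", encoding="utf-8")
    (folder / "readme.txt").write_text("placeholder", encoding="utf-8")
    before = folder_feature(folder)
    (folder / "notes.txt").write_text("tampered", encoding="utf-8")  # same count, new content
    after = folder_feature(folder)

print(before != after)
```

Because the folder value depends on every file digest, replacing one file changes it even though the number of files stays the same.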
Further, after the step S500, the method further includes the steps of:
S600: if D ∈ B, acquire the number DM of files in the folder corresponding to D.
In this embodiment, an attacker may obtain the confusion path list B and, having obtained it, treat each confusion path in B as a genuine trusted path. Therefore, when D ∈ B, security detection must be performed on the folder corresponding to D, starting with acquiring the number DM of files in that folder.
S610: if DM = DP, extract features from the folder corresponding to D to obtain the first characteristic information R_D of that folder; otherwise, perform an alarm action. Here, DP is the number of camouflage files under the confusion path corresponding to D.
In this embodiment, when an attacker implants a malicious file into a folder corresponding to D, two situations exist:
First, the attacker implants the malicious file directly into the folder corresponding to D, which changes the number of files in that folder. Therefore, when DM ≠ DP, it can be determined that the attacker has implanted a malicious file in the folder corresponding to D, and an alarm action is executed, such as displaying a pop-up or reporting a detection for the folder corresponding to D. In this case the implantation is detected from the file count alone, without the more computationally expensive characteristic information extraction, which improves the execution efficiency of file scanning.
Second, the attacker replaces one of the files in the folder corresponding to D with a malicious file, so the number of files in the folder is unchanged, that is, DM = DP. Because the implanted malicious file differs from the camouflage file it replaces, the replacement changes the first characteristic information of the folder corresponding to D. The first characteristic information R_D of that folder can therefore be obtained and used to judge whether the folder is abnormal. Compared with acquiring the characteristic information of every file in the folder corresponding to D, this greatly reduces the computing power required and improves execution efficiency.
S620: if R_D ∉ R, perform an alarm action.
In this embodiment, R_D ∉ R indicates that the files in the folder corresponding to D have changed, so the folder can be judged to be an abnormal folder. When the folder corresponding to D is abnormal, the whole folder can be reported as detected without scanning every file in it; this still achieves the purpose of clearing the abnormal files while reducing the computing power required and improving execution efficiency.
Further, after the folder corresponding to D is judged to be abnormal, security detection can be performed on each file in that folder to determine which files are abnormal, so that the abnormal files are identified precisely.
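Steps S600 to S620 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: `folder_feature` uses an assumed folder hash (MD5 over sorted file names and file digests), which the embodiment does not specify, and `check_camouflage_folder` with its string return values is a hypothetical interface.

```python
import hashlib
from pathlib import Path


def folder_feature(folder: Path) -> str:
    """Assumed first characteristic information: MD5 over the sorted
    file names and per-file MD5 digests of the folder's files."""
    digest = hashlib.md5()
    for entry in sorted(p for p in folder.iterdir() if p.is_file()):
        digest.update(entry.name.encode("utf-8"))
        digest.update(hashlib.md5(entry.read_bytes()).digest())
    return digest.hexdigest()


def check_camouflage_folder(folder: Path, dp: int, known_features: set[str]) -> str:
    """Return "alarm" or "folder_ok" for the folder corresponding to D."""
    # S600: count the files currently in the folder (DM).
    dm = sum(1 for p in folder.iterdir() if p.is_file())
    # S610: a changed file count means a file was added or removed.
    if dm != dp:
        return "alarm"
    # S610/S620: same count, so fall back to the folder-level feature R_D.
    r_d = folder_feature(folder)
    if r_d not in known_features:
        return "alarm"
    return "folder_ok"
```

The cheap count comparison runs first, so the hash is computed only when the count alone cannot reveal a change.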
Further, after the step S620, the method further includes the steps of:
S630: if R_D ∈ R, obtain the third characteristic information of every file in the folder corresponding to D to obtain a third characteristic information list F = (F_1, F_2, …, F_k, …, F_DM), k = 1, 2, …, DM, where F_k is the third characteristic information of the k-th file in the folder corresponding to D. The third characteristic information is of the same type as the second characteristic information.
In this embodiment, if R_D ∈ R, the folder is not considered an abnormal folder; however, to ensure that no malicious file exists in the folder corresponding to D, further security detection is performed on all files in that folder. Both the second and the third characteristic information may be MD5 values or hash values, so the two are characteristic information of the same type.
S640: traverse F and compare F_k with the second characteristic information of each camouflage file in the camouflage folder corresponding to D.
In this embodiment, each item of third characteristic information in F corresponds uniquely to one file to be detected. Each item in F is compared with the second characteristic information of each camouflage file in the camouflage folder corresponding to D, to determine whether that item belongs to the second characteristic information list corresponding to D.
S650: if F_k differs from the second characteristic information of every camouflage file in the camouflage folder corresponding to D, judge the file corresponding to F_k to be an abnormal file.
In this embodiment, if F_k differs from the second characteristic information of every camouflage file in the camouflage folder corresponding to D, then F_k does not belong to the second characteristic information list corresponding to D; that is, the file corresponding to F_k is not a pre-established camouflage file and can be judged to be an abnormal file. Otherwise, all files in the folder corresponding to D can be judged to be camouflage files, meaning the attacker has not implanted any malicious file in that folder. Malicious files in the folder corresponding to D can therefore be detected in this way.
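The per-file comparison of S630 to S650 amounts to a set-membership test, sketched below in Python. MD5 is used for the third characteristic information because the embodiment names it as one option; the function name `find_abnormal_files` and the use of a `set` for the second characteristic information list are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def find_abnormal_files(folder: Path, camouflage_features: set[str]) -> list[Path]:
    """Return the files whose feature F_k matches no known camouflage file."""
    abnormal = []
    for path in sorted(p for p in folder.iterdir() if p.is_file()):
        # S630: third characteristic information of the k-th file.
        f_k = hashlib.md5(path.read_bytes()).hexdigest()
        # S640/S650: F_k differs from every stored second characteristic value.
        if f_k not in camouflage_features:
            abnormal.append(path)
    return abnormal
```

An empty result corresponds to the case where every file in the folder is a pre-established camouflage file.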
Optionally, the preset character string is 'C:'.
Optionally, the first characteristic information is an MD5 value or a hash value; the second characteristic information is an MD5 value or a hash value.
Optionally, f(i) = ⌈c×i + d⌉, where c is a preset scaling factor, d is a preset base number of camouflage files, and ⌈ ⌉ is the round-up (ceiling) function. This ensures that the number of camouflage files differs from one camouflage folder to another, which helps prevent the camouflage folders from being recognized.
Further, when the path D to be scanned is scanned, the number DP of camouflage files in the camouflage folder under the corresponding confusion path in B must be determined. To determine DP, it is only necessary to determine the position of that confusion path in B, that is, the value of i; DP then follows directly from f(i) = ⌈c×i + d⌉. This is more efficient than determining DP by traversing the folder and counting its files.
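A short Python sketch of this lookup follows, assuming the ceiling form f(i) = ⌈c×i + d⌉ and 1-based indexing of the confusion paths in B. The function names and the example values of c and d are hypothetical.

```python
import math


def camouflage_file_count(i: int, c: float, d: int) -> int:
    """f(i) = ceil(c * i + d) for the i-th confusion path (1-based)."""
    return math.ceil(c * i + d)


def expected_count_for_path(path: str, confusion_paths: list[str],
                            c: float, d: int) -> int:
    """Derive DP from the position of the path in B instead of counting files."""
    i = confusion_paths.index(path) + 1  # position of D in B, 1-based
    return camouflage_file_count(i, c, d)
```

With c = 1.5 and d = 3, folders 1 to 3 would hold 5, 6 and 8 camouflage files. The counts are guaranteed to be pairwise distinct when c ≥ 1; with a smaller factor such as c = 0.5 and d = 3, folders 1 and 2 would both hold 4 files, so the choice of c matters for the distinctness property the embodiment describes.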
Furthermore, although the steps of the methods in the present disclosure are depicted in a particular order in the drawings, this does not require or imply that the steps must be performed in that particular order or that all illustrated steps be performed in order to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step to perform, and/or one step decomposed into multiple steps to perform, etc.
Embodiments of the present application also provide a non-transitory computer readable storage medium that may be disposed in an electronic device to store at least one instruction or at least one program for implementing one of the method embodiments, the at least one instruction or the at least one program being loaded and executed by a processor to implement the methods provided by the embodiments described above.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium would include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random Access Memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. Referring to FIG. 2, the program code may execute entirely on the user's computing device, partly on the user's device as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. Where a remote computing device is involved, it may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., via the Internet using an Internet service provider).
Referring to fig. 3, an embodiment of the present application also provides an electronic device including a processor and the aforementioned non-transitory computer-readable storage medium.
An electronic device according to this embodiment of the application is described below. The electronic device is merely an example and should not impose any limitation on the functionality or scope of use of the embodiments of the present application.
The electronic device is in the form of a general purpose computing device. Components of the electronic device may include, but are not limited to: at least one processor, at least one memory, and a bus connecting the various system components, including the memory and the processor.
The memory stores program code that is executable by the processor to cause the processor to perform steps according to various exemplary embodiments of the application described in the "exemplary methods" section of this specification.
The storage may include readable media in the form of volatile storage, such as Random Access Memory (RAM) and/or cache memory, and may further include Read Only Memory (ROM).
The storage may also include a program/utility having a set (at least one) of program modules including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
The bus may be one or more of several types of bus structures including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor, or a local bus using any of a variety of bus architectures.
The electronic device may also communicate with one or more external devices (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device, and/or with any device (e.g., router, modem, etc.) that enables the electronic device to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface. And, the electronic device may also communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through a network adapter. The network adapter communicates with other modules of the electronic device via a bus. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with an electronic device, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, including several instructions to cause a computing device (may be a personal computer, a server, a terminal device, or a network device, etc.) to perform the method according to the embodiments of the present disclosure.
Embodiments of the present application also provide a computer program product comprising program code for causing an electronic device to carry out the steps of the method according to the various exemplary embodiments of the application as described in the specification, when said program product is run on the electronic device.
While certain specific embodiments of the application have been described in detail by way of example, it will be appreciated by those skilled in the art that the above examples are for illustration only and are not intended to limit the scope of the application. Those skilled in the art will also appreciate that many modifications may be made to the embodiments without departing from the scope and spirit of the application. The scope of the application is defined by the appended claims.

Claims (9)

1. A method for scanning a document based on a confusion path, the method comprising the steps of:
obtaining a confusion path list B = (B_1, B_2, …, B_i, …, B_n), i = 1, 2, …, n; wherein B_i is the i-th confusion path and n is the number of confusion paths; B_i is obtained from the original path corresponding to B_i;
creating a camouflage folder C_i under the path B_i;
building several camouflage files in C_i to obtain the camouflage file list H_i = (H_{i,1}, H_{i,2}, …, H_{i,j}, …, H_{i,f(i)}) corresponding to C_i, j = 1, 2, …, f(i); wherein H_{i,j} is the j-th camouflage file in C_i, and f(i) is the number of camouflage files in C_i;
acquiring a first characteristic information list R = (R_1, R_2, …, R_i, …, R_n) and a second characteristic information list G_i = (G_{i,1}, G_{i,2}, …, G_{i,j}, …, G_{i,f(i)}); wherein R_i is the first characteristic information corresponding to C_i, obtained by extracting features from C_i, and G_{i,j} is the second characteristic information corresponding to H_{i,j}, obtained by extracting features from H_{i,j}; the first characteristic information is used for distinguishing each camouflage folder, and the second characteristic information is used for distinguishing each camouflage file; the first characteristic information and the second characteristic information are used for judging whether malicious files exist in the folder corresponding to D;
loading B into the memory space;
in the memory space, restoring the original path list A = (A_1, A_2, …, A_i, …, A_n) according to B, so that A and B exist in the memory space at the same time; wherein A_i is the original path corresponding to B_i;
transmitting the current path D to be scanned to the memory space to judge, in the memory space, whether D belongs to A or B;
if D ∈ A, skipping the security detection of the file corresponding to D;
and if D ∈ B, performing security detection on the folder corresponding to D.
2. The method of claim 1, wherein B is obtained by:
obtaining an original path list A = (A_1, A_2, …, A_i, …, A_n);
traversing A and performing confusion processing on A_i to obtain the confusion path list B = (B_1, B_2, …, B_i, …, B_n); wherein B_i is the confusion path generated by applying confusion processing to A_i.
3. The method of claim 2, wherein B_i is obtained by the following steps:
splitting A_i according to a preset character to obtain the substring list Z_i = (Z_i^1, Z_i^2, …, Z_i^h, …, Z_i^{g(i)}) corresponding to A_i, h = 1, 2, …, g(i); wherein Z_i^h is the h-th substring corresponding to A_i, and g(i) is the number of substrings corresponding to A_i; g(i) = Q_i + 1, where Q_i is the number of preset characters in A_i;
if Z_i^1 is identical to the preset character string, determining the number of characters NUM_i^{a+1} in Z_i^{a+1}; wherein a = 1, 2, …, g(i) − 1, and Z_i^{a+1} is the (a+1)-th substring;
according to the preset replacement character string list L_1 corresponding to the preset character string, replacing Z_i^{a+1} with the NUM_i^{a+1}-th preset replacement character string in column a of L_1 to obtain B_i; L_1 comprises r columns, each column comprising a plurality of preset replacement character strings; r = g(i) − 1;
if Z_i^1 is different from the preset character string, replacing the first character of Z_i^1 with the preset replacement character corresponding to that character in the preset replacement character mapping table, to obtain B_i.
4. The method of obfuscated path-based file scanning according to claim 1, further comprising:
if D ∈ B, acquiring the number DM of files in the folder corresponding to D;
if DM = DP, extracting features from the folder corresponding to D to obtain the first characteristic information R_D of the folder corresponding to D; otherwise, executing an alarm action; wherein DP is the number of camouflage files under the confusion path corresponding to D;
if R_D ∉ R, executing an alarm action.
5. The method of obfuscated path based file scanning according to claim 4, further comprising:
if R_D ∈ R, obtaining the third characteristic information of all files in the folder corresponding to D to obtain a third characteristic information list F = (F_1, F_2, …, F_k, …, F_DM), k = 1, 2, …, DM; wherein F_k is the third characteristic information of the k-th file in the folder corresponding to D, and the third characteristic information is of the same type as the second characteristic information;
traversing F and comparing F_k with the second characteristic information of each camouflage file in the camouflage folder corresponding to D;
if F_k differs from the second characteristic information of every camouflage file in the camouflage folder corresponding to D, judging the file corresponding to F_k to be an abnormal file.
6. The method according to claim 4 or 5, wherein the first characteristic information is an MD5 value or a hash value; the second characteristic information is an MD5 value or a hash value.
7. The method of claim 1, wherein f(i) = ⌈c×i + d⌉, wherein c is a preset scaling factor, d is a preset base number of camouflage files, and ⌈ ⌉ is the round-up (ceiling) function.
8. A non-transitory computer readable storage medium having stored therein at least one instruction or at least one program, wherein the at least one instruction or the at least one program is loaded and executed by a processor to implement the obfuscated path-based file scanning method according to any of claims 1-7.
9. An electronic device comprising a processor and the non-transitory computer readable storage medium of claim 8.
CN202310900470.2A 2023-07-21 2023-07-21 File scanning method based on confusion path, electronic equipment and storage medium Active CN116611068B (en)

Publications (2)

Publication Number Publication Date
CN116611068A CN116611068A (en) 2023-08-18
CN116611068B true CN116611068B (en) 2023-09-29

Family

ID=87684057

Country Status (1)

Country Link
CN (1) CN116611068B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant