WO2013017004A1 - File scanning method, system, client, and server - Google Patents

Info

Publication number
WO2013017004A1
Authority
WO
WIPO (PCT)
Prior art keywords
file
scanned
scanning
scan
attribute value
Prior art date
Application number
PCT/CN2012/078387
Other languages
English (en)
French (fr)
Inventor
梅书慧
梁安武
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Priority to US14/130,665 (patent US9069956B2)
Priority to EP12820533.3A (patent EP2741227B1)
Priority to RU2014102898/08A (patent RU2581560C2)
Priority to BR112014002425-1A (patent BR112014002425B1)
Publication of WO2013017004A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562 Static detection
    • G06F21/564 Static detection by virus signature recognition
    • G06F21/565 Static detection by checking file integrity
    • G06F21/567 Computer malware detection or handling, e.g. anti-virus arrangements using dedicated hardware
    • G06F21/60 Protecting data
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21 Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2101 Auditing as a secondary aspect
    • G06F2221/2139 Recurrent verification
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1433 Vulnerability analysis
    • H04L63/1441 Countermeasures against malicious traffic
    • H04L63/145 Countermeasures against malicious traffic the attack involving the propagation of malware through the network, e.g. viruses, trojans or worms

Definitions

  • The present invention relates to data processing technologies, and in particular to a file scanning method, system, client, and server.
  • The files people use may be downloaded from the Internet, copied from removable storage media, or exchanged by establishing a connection with other users. For the user, therefore, files obtained through these channels, and terminal devices such as computers and mobile phones, are quite likely to contain harmful suspicious files. Virus program files and Trojan files consequently spread widely among suspicious files and seriously endanger the files users work with.
  • The client engine is the antivirus engine, which uses the virus feature codes stored in the local virus database to find suspicious files.
  • The feature codes in the local virus database are limited, while the number of virus program files and Trojan files hidden among all kinds of files is growing rapidly, far outpacing the update speed of the local virus database, so the local virus database can only passively speed up its updates.
  • A method for scanning a file, comprising the following steps:
  • forming a correspondence between the file to be scanned, the attribute value, and the category according to the feature code that is consistent with the attribute value and the category to which that feature code belongs, and recording the correspondence in the first scan result.
  • a file scanning system including a client and a server
  • the client includes:
  • An enumeration module for enumerating files to be scanned
  • An attribute value obtaining module configured to acquire an attribute value of the file to be scanned one by one from the enumerated file to be scanned, and transmit the attribute value to the server;
  • the server includes:
  • a database for storing a signature code and a category to which the signature code belongs
  • a comparison module configured to compare the attribute value with the stored feature code, to obtain a feature code consistent with the attribute value and a category to which the feature code belongs;
  • a correspondence forming module configured to form a correspondence between the file to be scanned, the attribute value, and the category according to the feature code that is consistent with the attribute value and the category to which that feature code belongs, and to record the correspondence in the first scan result.
  • a scanning client for files including:
  • An enumeration module for enumerating files to be scanned
  • an attribute value obtaining module configured to acquire attribute values of the file to be scanned one by one from the enumerated files to be scanned, and transmit the attribute values to the server.
  • a scanning server for files including:
  • a database for storing a signature code and a category to which the signature code belongs
  • a comparison module configured to compare the attribute value of the file to be scanned with the stored feature code, to obtain a feature code consistent with the attribute value and a category to which the feature code belongs;
  • a correspondence forming module configured to form a correspondence between the file to be scanned, the attribute value, and the category according to the feature code that is consistent with the attribute value and the category to which that feature code belongs, and to record the correspondence in the first scan result.
  • The above file scanning method, system, client, and server transmit the attribute value of the file to be scanned to the server and compare it with the feature codes and categories stored on the server, in order to identify whether the file is safe or dangerous.
  • Because the server is not bound by the storage limit of the client, it can store a large number of feature codes and update them quickly and promptly, so the feature codes on the server are comprehensive, which greatly improves file scanning efficiency.
  • FIG. 1 is a flow chart of a method for scanning a file in an embodiment
  • FIG. 2 is a flow chart of a method of scanning a file in another embodiment
  • FIG. 3 is a flow chart of a method of scanning a file in another embodiment
  • FIG. 4 is a flow chart of a method of scanning a file in another embodiment
  • FIG. 5 is a flowchart of a method for determining a file to be scanned for performing local scanning according to a first scan result in FIG. 4;
  • FIG. 6 is a schematic structural diagram of a scanning system of a file in an embodiment
  • FIG. 7 is a schematic structural diagram of a client in an embodiment
  • FIG. 8 is a schematic structural diagram of a scan file determining module in FIG. 7;
  • FIG. 9 is a schematic structural diagram of a client in another embodiment.
  • FIG. 10 is a schematic structural diagram of a client in another embodiment
  • FIG. 11 is a schematic structural diagram of a client in another embodiment.
  • FIG. 1 shows a method flow of file scanning in an embodiment, including the following steps:
  • Step S110: enumerate the files to be scanned.
  • When the scan engine of the antivirus or anti-Trojan software is started, the user generates a scan request through the scan engine's scan page. The IPC (inter-process communication) module sends the generated scan request to the underlying hardware of the system, which forwards it to the server. The scan engine and the server obtain the files to be scanned from the received scan request, so that the files are scanned in a targeted manner according to the scan request.
  • The IPC module sits between the scan engine's scan page and the underlying hardware and implements communication between the two, thereby providing network connectivity between the scan engine and the server.
  • The scan request includes a task ID, a scan level, and the way folders are enumerated. The scan level depends on whether the user selects a quick scan, a full scan, or a custom scan on the scan page; for example, in quick-scan mode the scan is faster but shallower.
  • The files that the user has specified for scanning are obtained.
  • The files specified by the user are taken as the files to be scanned. These files are enumerated according to the set queue length and distributed into an enumeration queue of a specific length to wait for scanning.
  • The queue length is 20000.
  • Step S130: obtain the attribute value of each enumerated file to be scanned, one by one, and transmit the attribute values to the server.
  • The attribute value of the file to be scanned is obtained. The attribute value uniquely identifies the file to be scanned and can be used to verify the integrity of the file.
  • the attribute value of the file to be scanned may be an MD5 value.
  • the attribute value of each file to be scanned is obtained one by one from the plurality of files to be scanned, and a query request including information of the attribute value, the file name of the file to be scanned, and the like is generated, and the generated query request is transmitted to the server.
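As a minimal sketch of step S130, the following Python computes an MD5 attribute value for one file and wraps it in a query request. The function names and the dictionary field names (`file_name`, `attribute_value`) are assumptions made for illustration; the source does not specify the request format.

```python
import hashlib
from pathlib import Path


def md5_attribute_value(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_query_request(path):
    """Bundle the file name and attribute value (hypothetical field names)."""
    return {
        "file_name": Path(path).name,
        "attribute_value": md5_attribute_value(path),
    }
```

Reading in chunks keeps memory use flat for large files; any change to the file's bytes yields a different digest, which is what lets the value double as an integrity check.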
  • the server can be a cloud platform built by multiple servers. The number of servers in the cloud platform can be arbitrarily increased or decreased as needed, or it can be a large server cluster.
  • the set time can be 100 milliseconds.
  • the scanning method of the above file runs independently on the client where the scan engine is located, and is used to implement file scanning in the client.
  • Step S150: compare the attribute value with the feature codes stored on the server to obtain the feature code that is consistent with the attribute value and the category to which that feature code belongs.
  • The attribute value may be an MD5 value or another hash value computed over the file to be scanned, and the attribute value corresponding to each file to be scanned is unique. If the file to be scanned is incomplete, its attribute value changes and no longer matches the attribute value of the complete file.
  • the server stores a large number of signatures and the categories to which the signatures belong. There is a corresponding relationship between the feature code and the category stored by the server, and each feature code has a corresponding category.
  • The category is the category to which the attribute value of the file to be scanned belongs, and indicates whether the file to be scanned is a normal file, a virus program file, or a Trojan file.
  • For example, the feature code of a virus program file belongs to the blacklist category, and a file whose category is blacklist is a virus program file or a Trojan file.
  • The feature code of a normal file belongs to the whitelist category. A file whose category is whitelist is determined to contain no virus program or Trojan and can be run safely.
  • The feature code of a suspicious file belongs to the graylist category. A file whose category is graylist cannot be confirmed as a virus program file or a Trojan file, but it is active in the virus-sensitive parts of the system.
  • Comparing the attribute value with the feature codes stored on the server yields the feature code consistent with the attribute value, and the category of that feature code indicates whether the file corresponding to the attribute value is a virus program file or Trojan file, a normal file, or a suspicious file. If no feature code on the server matches the attribute value, the attribute value misses the feature codes stored on the server, and the corresponding file is classified into the miss list.
  • Step S170: form a correspondence between the file to be scanned, the attribute value, and the category according to the feature code that is consistent with the attribute value and the category to which that feature code belongs, and record the correspondence in the first scan result.
  • Comparing the attribute value with the feature codes yields the category of the file to be scanned and hence the scan result of the file, which is returned to the user.
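Steps S150 and S170 can be sketched as a lookup followed by a record. In this illustration the server's database is stood in for by a plain dictionary mapping feature codes to categories; the category strings (`blacklist`, `whitelist`, `graylist`, `miss`), the function name, and the record fields are assumptions, not the patent's actual data format.

```python
def server_scan(query_requests, feature_codes):
    """Look up each attribute value and record (file, value, category).

    feature_codes: dict mapping a feature code to its category.
    An attribute value with no matching feature code goes to the miss list.
    """
    first_scan_result = []
    for request in query_requests:
        value = request["attribute_value"]
        category = feature_codes.get(value, "miss")
        first_scan_result.append({
            "file_name": request["file_name"],
            "attribute_value": value,
            "category": category,
        })
    return first_scan_result
```

For example, a request whose attribute value is stored under `blacklist` comes back with that category, while an unknown value comes back as `miss`.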
  • Step S210: determine the files to be scanned locally according to the first scan result.
  • The scan engine can also be used to scan files locally.
  • Local scanning of files should be fully combined with the file scanning performed by the server.
  • The first scan result returned by the server shows which files to be scanned are suspicious files and for which files no feature code matching the attribute value was found on the server. To ensure the accuracy of the scan result, the files recorded in the first scan result as suspicious, and the files with no matching feature code on the server, are taken as the files to be scanned locally.
  • the set time is 100 milliseconds.
  • Step S230: perform a local scan on the determined files to be scanned to obtain a second scan result.
  • The attribute value is obtained from each determined file to be scanned, and the local virus database is searched for an identical feature code and the category to which it belongs. The category found shows whether the corresponding file is a normal file, a virus program file, or a Trojan file.
  • Step S250: integrate the second scan result and the first scan result to form a third scan result.
  • If the category recorded in the first scan result is blacklist or whitelist, the third scan result of the file is formed from the first scan result. If the category recorded in the first scan result is graylist or miss list, the file must be scanned locally to obtain a second scan result, and the second scan result is used as the third scan result of the file.
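The integration in step S250 can be illustrated with a small merge function. This is a sketch under the assumption that each scan result is a dictionary from file name to category; the names below are hypothetical.

```python
LOCAL_SCAN_CATEGORIES = {"graylist", "miss"}


def merge_scan_results(first_result, second_result):
    """Form the third scan result from the first and second results.

    A file whose first-scan category is graylist or miss takes its category
    from the local (second) scan when one exists; every other file keeps
    the category assigned by the server.
    """
    third_result = {}
    for file_name, category in first_result.items():
        if category in LOCAL_SCAN_CATEGORIES and file_name in second_result:
            third_result[file_name] = second_result[file_name]
        else:
            third_result[file_name] = category
    return third_result
```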
  • The third scan result may be presented to the user. Based on the third scan result, the user is notified of any blacklisted files, and those files are removed.
  • Step S310: remove the scanned files from the enumeration queue according to the corresponding file items in the third scan result.
  • Scanned files should be removed from the enumerated files to be scanned. The file items in the third scan result are the files whose scan is complete, so these files are removed from the enumerated files to be scanned.
  • Step S330: determine whether there is a vacancy in the enumeration queue. If yes, proceed to step S350; if not, the process ends.
  • Because the enumerated files to be scanned form an enumeration queue of a specific length, not all files to be scanned are in the queue. The queue is therefore searched for a vacancy so that files to be scanned that are not yet in the queue can be added to it.
  • Files to be scanned that remain in the enumeration queue stay in their original positions; removing one file does not cause other files to move or be repositioned. For example, if the file in the first position of the enumeration queue is scanned and removed from the queue, the file in the second position does not move forward to fill the vacancy in the first position. The enumeration pointer searches for a vacancy starting from the first position of the enumeration queue.
  • When a vacancy is found, a file to be scanned that is not yet in the enumeration queue is added to it. If no vacancy is found, the search continues until a vacancy appears, and the file to be scanned is then added to the enumeration queue.
  • Step S350: add a file to be scanned that is not in the enumeration queue to the enumeration queue.
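The queue behaviour in steps S310 to S350 (fixed length, no shifting on removal, vacancy search from the first position) can be sketched as follows. The class and method names are hypothetical, and the sketch returns `False` instead of waiting when the queue is full.

```python
class EnumerationQueue:
    """Fixed-length queue whose entries never shift when one is removed."""

    def __init__(self, length):
        self.slots = [None] * length  # None marks a vacancy

    def find_vacancy(self):
        """Scan from the first position and return the first empty index."""
        for index, item in enumerate(self.slots):
            if item is None:
                return index
        return None

    def add(self, file_name):
        """Place a file in the first vacancy; report False if the queue is full."""
        index = self.find_vacancy()
        if index is None:
            return False
        self.slots[index] = file_name
        return True

    def remove(self, file_name):
        """Clear the slot of a scanned file; other files keep their positions."""
        self.slots[self.slots.index(file_name)] = None
```

With a queue of length 3 holding `a`, `b`, `c`, removing `a` leaves `[None, 'b', 'c']`, and the next added file lands in the first position.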
  • The above file scanning method includes the following steps:
  • Step S401: enumerate the files to be scanned.
  • Step S402: determine whether the number of enumerated files to be scanned reaches the first threshold. If yes, proceed to step S403; if not, return to step S401.
  • When the first threshold is reached, the server is triggered to scan the files.
  • The first threshold may be 50; that is, scanning by the server is triggered as soon as 50 files to be scanned have been enumerated.
  • Step S403: search the enumerated files to be scanned for the files that meet the preset condition.
  • The files that can be scanned by the server are selected from the enumerated files to be scanned.
  • The preset condition may be that the file is a PE (Portable Executable) file smaller than 3 megabytes. The preset condition can be modified flexibly according to actual processing capacity and user needs.
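One way to test the example preset condition is sketched below. It assumes that "3 megabytes" means 3 × 1024 × 1024 bytes and recognises a PE file only by its `MZ` header and `PE\0\0` signature; the function names are hypothetical, and a production scanner would likely validate more of the file structure.

```python
import os
import struct

MAX_SIZE_BYTES = 3 * 1024 * 1024  # assumed reading of "3 megabytes"


def looks_like_pe(path):
    """Check the DOS 'MZ' magic and the 'PE\\0\\0' signature it points to."""
    with open(path, "rb") as handle:
        header = handle.read(0x40)
        if len(header) < 0x40 or header[:2] != b"MZ":
            return False
        # The 4-byte little-endian field at offset 0x3C locates the PE signature.
        (pe_offset,) = struct.unpack_from("<I", header, 0x3C)
        handle.seek(pe_offset)
        return handle.read(4) == b"PE\0\0"


def meets_preset_condition(path, max_size=MAX_SIZE_BYTES):
    """True for PE files strictly smaller than the size limit."""
    return os.path.getsize(path) < max_size and looks_like_pe(path)
```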
  • Step S405: obtain the attribute value of each enumerated file to be scanned, one by one, and transmit the attribute values to the server.
  • Step S406: compare the attribute value with the feature codes stored on the server to obtain the feature code that is consistent with the attribute value and the category to which that feature code belongs.
  • Step S407: form a correspondence between the file to be scanned, the attribute value, and the category according to the feature code that is consistent with the attribute value and the category to which that feature code belongs, and record the correspondence in the first scan result.
  • Step S408: determine whether the number of enumerated files to be scanned reaches the second threshold. If yes, proceed to step S409; if not, return to step S401.
  • The second threshold may be 5000; that is, a local scan may be triggered when 5,000 files to be scanned have been enumerated.
  • The second threshold should be greater than the first threshold because file scanning by the server requires a network connection and data transmission and therefore takes more time than local scanning. Because the server stores the most comprehensive set of feature codes, basing the final scan result on the first scan result produced by the server improves the accuracy of the scan and helps reduce the total scan time.
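The two-threshold trigger can be sketched as a simple check on the number of enumerated files. The function name and the returned labels are assumptions; "reaches" is read here as greater than or equal to.

```python
FIRST_THRESHOLD = 50     # example value from the text: triggers the server scan
SECOND_THRESHOLD = 5000  # example value from the text: triggers the local scan


def triggered_scans(enumerated_count,
                    first_threshold=FIRST_THRESHOLD,
                    second_threshold=SECOND_THRESHOLD):
    """Return which scans have been triggered for a given enumeration count."""
    scans = []
    if enumerated_count >= first_threshold:
        scans.append("server")
    if enumerated_count >= second_threshold:
        scans.append("local")
    return scans
```

Because the first threshold is lower, the slower server scan starts early and runs while enumeration continues toward the second threshold.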
  • Step S409: determine the files to be scanned locally according to the first scan result.
  • Step S405 further includes marking each file to be scanned whose attribute value is transmitted.
  • That is, the files to be scanned that are scanned by the server are marked.
  • The specific process of step S409 is as follows:
  • Step S501: obtain, according to the first scan result, the file items among the enumerated files to be scanned that require a second scan.
  • The file items that have been scanned by the server and their categories are obtained from the first scan result. If the category of a file item in the first scan result is graylist or miss list, the file item may be a suspicious file, or no feature code identical to its attribute value was found among the feature codes stored on the server. The file item cannot be determined, so it requires a second scan.
  • Step S503: select the unmarked files to be scanned from the enumerated files to be scanned, and form the set of files to be scanned locally from the file items requiring a second scan and the unmarked files to be scanned.
  • The files among the enumerated files to be scanned that were not scanned by the server are also scanned by the local scan engine.
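Steps S501 and S503 can be sketched as two filters over the enumerated files. The sketch assumes the first scan result is a dictionary from file name to category and that marked files are kept in a set; these representations and names are illustrative only.

```python
def files_for_local_scan(enumerated, marked, first_result):
    """Combine second-scan items (S501) with unmarked files (S503).

    enumerated:   list of enumerated file names
    marked:       set of file names whose attribute value was sent to the server
    first_result: dict mapping file name to the category returned by the server
    """
    second_scan_items = [
        name for name in enumerated
        if first_result.get(name) in ("graylist", "miss")
    ]
    unmarked = [name for name in enumerated if name not in marked]
    return second_scan_items + unmarked
```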
  • Step S410: perform a local scan on the determined files to be scanned to obtain a second scan result.
  • Specifically, the file items requiring a second scan and the unmarked files to be scanned are scanned in sequence according to the set priority.
  • The file items requiring a second scan are scanned locally first. After they are finished, the files to be scanned that are not PE files are scanned, and finally the PE files that do not meet the preset condition are scanned locally.
  • The scan priority can be adjusted flexibly.
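The default priority described for step S410 can be sketched as a three-stage ordering. The `is_pe` callback is a hypothetical stand-in for whatever PE check the client uses, and the function name is an assumption.

```python
def local_scan_order(second_scan_items, unmarked_files, is_pe):
    """Order files for local scanning by the described priority.

    1. file items that need a second scan
    2. unmarked files that are not PE files
    3. unmarked PE files (those that did not meet the preset condition)
    """
    non_pe = [name for name in unmarked_files if not is_pe(name)]
    pe = [name for name in unmarked_files if is_pe(name)]
    return list(second_scan_items) + non_pe + pe
```

For instance, with an extension-based stand-in such as `lambda name: name.endswith(".exe")`, documents are scanned before oversized executables.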
  • Step S411: integrate the second scan result and the first scan result to form a third scan result.
  • Step S412: obtain, from the second scan result, the category corresponding to each file item that underwent the second scan.
  • The second scan result may be searched to obtain the category corresponding to a file item that underwent the second scan; the category shows whether that file item is a virus program file or a Trojan file.
  • Step S413: determine, according to the obtained category, whether the file item that underwent the second scan is dangerous. If yes, proceed to step S414; if not, proceed to step S415.
  • The obtained category shows whether the file item that underwent the second scan is dangerous. For example, if the category is blacklist, the file item contains a virus program or a Trojan and is dangerous. Because this category was obtained by local scanning, the large number of feature codes stored on the server is not comprehensive enough and needs to be updated. The attribute value of the file item that underwent the second scan and has this category is then stored as a feature code.
  • Step S414: upload the attribute value corresponding to the file item that underwent the second scan.
  • Step S415: scan the file item that underwent the second scan to obtain its suspiciousness.
  • A file item that underwent the second scan may still be a suspicious file, so it needs to be scanned again to obtain its suspiciousness.
  • Step S416: determine whether the suspiciousness exceeds the suspicion threshold. If yes, return to step S414; if not, the process ends.
  • Whether a suspicious file is likely to be safe may be judged against the set suspicion threshold.
  • For example, the suspicion threshold may be set to 30%. If the suspiciousness exceeds 30%, the suspicious file is regarded as a virus program file or a Trojan file whose feature code is not stored on the server. The feature code of the suspicious file therefore needs to be uploaded and classified into the blacklist.
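The decision in steps S413 to S416 can be condensed into one predicate. This is a sketch that assumes suspiciousness is expressed as a fraction between 0 and 1; the function name and the category string are hypothetical.

```python
SUSPICION_THRESHOLD = 0.30  # the 30% example from the text


def should_upload(category, suspiciousness=None, threshold=SUSPICION_THRESHOLD):
    """Decide whether a file's attribute value is uploaded to the server.

    A locally blacklisted file is uploaded directly (S413 to S414). Any other
    file is uploaded only if its suspiciousness exceeds the threshold (S416).
    """
    if category == "blacklist":
        return True
    return suspiciousness is not None and suspiciousness > threshold
```

The comparison is strict, matching "exceeds": a file at exactly 30% is not uploaded.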
  • FIG. 6 shows a scanning system for files in one embodiment, including a client 10 and a server 30.
  • the client 10 includes an enumeration module 110 and an attribute value acquisition module 120.
  • the enumeration module 110 is configured to enumerate files to be scanned.
  • When the scan engine of the antivirus or anti-Trojan software is started, the user generates a scan request through the scan engine's scan page. The IPC (inter-process communication) module sends the generated scan request to the underlying hardware of the system, which forwards it to the server. The scan engine and the server obtain the files to be scanned from the received scan request, so that the files are scanned in a targeted manner according to the scan request.
  • The IPC module sits between the scan engine's scan page and the underlying hardware and implements communication between the two, thereby providing network connectivity between the scan engine and the server.
  • The scan request includes a task ID, a scan level, and the way folders are enumerated. The scan level depends on whether the user selects a quick scan, a full scan, or a custom scan on the scan page; for example, in quick-scan mode the scan is faster but shallower.
  • The files that the user has specified for scanning are obtained.
  • The files specified by the user are taken as the files to be scanned. The enumeration module 110 enumerates these files according to the set queue length and distributes them into an enumeration queue of a specific length to wait for scanning.
  • The queue length is 20000.
  • The attribute value obtaining module 120 is configured to obtain the attribute value of each enumerated file to be scanned, one by one, and transmit the attribute values to the server.
  • The attribute value obtaining module 120 obtains the attribute value of the file to be scanned. The attribute value uniquely identifies the file to be scanned and can be used to verify the integrity of the file.
  • The attribute value of the file to be scanned may be an MD5 value.
  • The attribute value obtaining module 120 obtains the attribute value of each file to be scanned one by one, generates a query request containing the attribute value, the file name of the file to be scanned, and similar information, and transmits the generated query request to the server.
  • The server can be a cloud platform built from multiple servers, in which the number of servers can be increased or decreased as needed, or it can be a large server cluster.
  • the set time can be 100 milliseconds.
  • the server 30 includes a database 310, a comparison module 320, and a correspondence forming module 330.
  • the database 310 is configured to store the signature and the category to which the signature belongs.
  • the comparison module 320 is configured to compare the attribute value with the stored feature code to obtain a feature code that is consistent with the attribute value and a category to which the feature code belongs.
  • The attribute value may be an MD5 value or another hash value computed over the file to be scanned, and the attribute value corresponding to each file to be scanned is unique. If the file to be scanned is incomplete, its attribute value changes and no longer matches the attribute value of the complete file.
  • the server stores a large number of signatures and the categories to which the signatures belong. There is a corresponding relationship between the feature code and the category stored by the server, and each feature code has a corresponding category.
  • the matching module 320 performs a search in the server according to the attribute value of the file to be scanned to obtain a feature code that is consistent with the attribute value of the file to be scanned, and then obtains the feature code according to the correspondence between the feature code and the category.
  • the category belongs to the category to which the attribute value of the file to be scanned belongs, indicating whether the file to be scanned is a normal file or a virus program file or a Trojan file. For example, for the signature of the virus program file, the category to which it belongs is a blacklist, and the file whose category is blacklist is a virus program file or a Trojan program file; for the signature of a normal file, the category to which it belongs is a whitelist, and the category is a whitelist.
  • the file is determined to be a file that does not contain a virus program file or a Trojan file. You can safely run the file.
  • the category to which it belongs is graylisted. Files with graylisted categories cannot be identified as virus program files or Trojans. Files, but files that are active in the virus-sensitive parts of the system.
  • the matching module 320 obtains the feature code that is consistent with the attribute value in the process of comparing the attribute value with the feature code stored by the server, and further obtains the corresponding category by the feature code that is consistent with the attribute value. Indicates that the file corresponding to the attribute value is a virus program file or a Trojan file, or a normal file, or a suspicious file. If there is no signature in the server that matches the attribute value, the server is not in the server. A large number of signatures can be used to classify files corresponding to the attribute values into the miss list.
  • the correspondence relationship forming module 330 is configured to form a correspondence between the file to be scanned, the attribute value, and the category according to the feature code that is consistent with the attribute value and the category to which the feature code belongs, and record the correspondence in the first scan result.
  • the correspondence relationship forming module 330 can obtain the category of the file to be scanned by the comparison process of the attribute value and the feature code, thereby obtaining the scan result of the file, and returning the scan result to the user.
  • In addition to the enumeration module 110 and the attribute value obtaining module 120, the client 10 includes a scan file determining module 130, a scan module 140, and a result integration module 150.
  • The scan file determining module 130 is configured to determine, according to the first scan result, the files to be scanned locally.
  • On top of the file scanning performed by the server, the scan engine can also be used to scan files locally.
  • To further improve the efficiency and accuracy of file scanning, the local scan should be closely combined with the file scanning performed by the server.
  • From the first scan result returned by the server, the scan file determining module 130 can tell which files to be scanned are suspicious files and for which files no feature code matching the attribute value was found on the server.
  • To ensure the accuracy of the scan result, files whose first scan result is "suspicious" and files for which no matching feature code was found on the server are taken as files to be scanned locally.
  • The scan file determining module 130 also needs to scan locally those files to be scanned that were not scanned by the server, so that every file is scanned and has a corresponding scan result.
  • After a local scan has been triggered, if no file to be scanned locally is found, the search may be retried after a set time. In a preferred embodiment, the set time is 100 milliseconds.
  • The client further includes a marking module for marking the files to be scanned whose attribute values have been transmitted.
  • After the attribute values are transmitted to the server, the marking module marks the files to be scanned that are scanned by the server.
  • The scan file determining module 130 described above includes a secondary scanning unit 131 and a selecting unit 133.
  • The secondary scanning unit 131 is configured to obtain, according to the first scan result, the file items among the enumerated files to be scanned that will undergo a secondary scan.
  • The secondary scanning unit 131 obtains from the first scan result the file items that have completed the server-side scan and their categories. If the category recorded for a file item in the first scan result is the graylist or the miss list, the file item may be a suspicious file, or no feature code equal to its attribute value was found among the feature codes stored on the server. The file item cannot be judged, so it needs a secondary scan.
  • The selecting unit 133 is configured to select the unmarked files to be scanned from the enumerated files, and to form the files to be scanned locally from the file items awaiting a secondary scan and the unmarked files to be scanned.
  • Those enumerated files to be scanned that were not scanned by the server also need to be scanned by the local scan engine.
  • The scan module 140 is configured to perform a local scan on the determined files to be scanned to obtain a second scan result.
  • The scan module 140 obtains the attribute value of each determined file to be scanned, looks up in the locally stored virus database the feature code equal to that attribute value and the category it belongs to, and learns from the category found whether the file is a normal file, a virus program file, or a Trojan program file.
  • The scan module 140 is further configured to scan, in a set order of priority, the file items awaiting a secondary scan and the unmarked files to be scanned.
  • The scan module 140 first performs a local scan on the file items awaiting a secondary scan. Once these are done, it scans the files to be scanned that are not PE files, and finally it locally scans the PE files that do not meet the preset condition.
  • The scan priority can be adjusted flexibly according to the actual scanning process.
  • The result integration module 150 is configured to integrate the second scan result and the first scan result to form a third scan result.
  • After the server-side scan and the local scan of the files to be scanned are complete, the result integration module 150 integrates the first and second scan results, taking both fully into account, to obtain the third scan result.
  • The client 10 further includes a removing module 160 and an adding module 170.
  • The removing module 160 is configured to remove files to be scanned from the enumeration queue according to the corresponding file items in the third scan result.
  • After the server-side scan and the local scan, the removing module 160 should remove the files whose scan is complete from the enumerated files to be scanned. That is, it identifies the scanned files from the corresponding file items in the third scan result and removes them from the enumerated files.
  • The adding module 170 is configured to determine whether the enumeration queue has an empty slot and, if so, to add a file to be scanned that is not yet in the enumeration queue to the queue.
  • Because the enumerated files to be scanned form an enumeration queue of a specific length, not all files to be scanned are in the queue. The adding module 170 therefore looks for empty slots in the queue so that files to be scanned that are not yet in the queue can be added.
  • After a scanned file is removed from the enumeration queue, the remaining files to be scanned stay in their original positions and are not moved or repositioned because of the removal. For example, if the file in the first position of the enumeration queue has been scanned and removed, the file in the second position does not move forward to fill the empty slot at the head of the queue.
  • The enumeration pointer therefore searches for an empty slot starting from the first position of the enumeration queue.
  • When an empty slot is found, the adding module 170 adds a file that needs scanning but is not yet in the enumeration queue to the queue. If no empty slot is found for the time being, the search continues, and the file is added once an empty slot appears.
  • The client further includes an enumeration determination module 180 and a search module 190.
  • The enumeration determination module 180 is configured to determine whether the number of enumerated files to be scanned has reached a first threshold and, if so, to notify the search module 190.
  • While files to be scanned are enumerated one by one to form the enumeration queue, the enumeration determination module 180 determines whether the length of the queue has reached the first threshold and, if so, triggers the server-side file scan.
  • The first threshold may be 50, that is, the server-side file scan is triggered as soon as 50 files to be scanned have been enumerated.
  • The search module 190 is configured to search the enumerated files to be scanned for those that meet a preset condition.
  • After the server-side scan has been triggered, the search module 190 searches the enumerated files to be scanned for those that can be scanned by the server.
  • The preset condition may be that the file is a PE file smaller than 3 MB. The preset condition can be modified flexibly according to actual processing capacity and user needs.
  • The enumeration determination module 180 is further configured to determine whether the number of enumerated files to be scanned has reached a second threshold and, if so, to notify the scan file determining module 130.
  • While files to be scanned are enumerated one by one to form the enumeration queue, the enumeration determination module 180 determines whether the length of the queue has reached the set second threshold and, if so, triggers the local scan.
  • The second threshold may be 5000, that is, a local scan can be triggered once 5,000 files to be scanned have been enumerated.
  • The second threshold should be greater than the first threshold, because the server-side file scan requires a network connection and data transmission and therefore takes considerably longer than a local scan. In addition, because the feature codes stored on the server are the most comprehensive, forming the final scan result on the basis of the first scan result produced by the server improves scanning accuracy and helps reduce the total scan time.
  • The client 10 further includes a category obtaining module 200, a risk determination module 210, an uploading module 230, and a suspicion determination module 240.
  • The category obtaining module 200 is configured to acquire the category recorded in the second scan result for a file item that has undergone a secondary scan.
  • Because the second scan result also records the correspondence among file name, attribute value, and category, the category obtaining module 200 can look up in the second scan result the category of a file item that has undergone a secondary scan. From this category it is known whether the file item is a virus program file or a Trojan program file.
  • The risk determination module 210 is configured to determine, according to the acquired category, whether the file item that has undergone a secondary scan is dangerous. If so, it notifies the uploading module 230; if not, it notifies the scan module 140.
  • The risk determination module 210 can tell from the category whether the file item that has undergone a secondary scan is dangerous. For example, if the category is the blacklist, the file item contains a virus or Trojan program file and is dangerous. Because this category was obtained by the local scan, the feature codes stored on the server are not yet comprehensive enough and need to be updated. In this case, the attribute value corresponding to the file item is uploaded and stored as a feature code.
  • The uploading module 230 is configured to upload the attribute value corresponding to the file item that has undergone a secondary scan.
  • The scan module 140 is further configured to scan the file item that has undergone a secondary scan to obtain its suspicion level.
  • When the category of the file item is judged not to be dangerous, the file item may still be a suspicious file, so the scan module 140 needs to scan it again to obtain its suspicion level.
  • The suspicion determination module 240 is configured to determine whether the suspicion level exceeds a suspicion threshold and, if so, to notify the uploading module 230.
  • Whether a suspicious file is likely to be safe can be judged against the set suspicion threshold.
  • The suspicion threshold may be set to 30%. If the suspicion determination module 240 determines that the suspicion level exceeds 30%, the suspicious file should be regarded as a virus program file or a Trojan program file whose feature code is not stored on the server. The feature code of the suspicious file therefore needs to be uploaded and classified into the blacklist.
  • With the file scanning method, system, client, and server described above, the attribute value of a file to be scanned is transmitted to the server and compared with the feature codes and categories stored there to identify whether the file is safe or dangerous. Because the server is not bound by the client's storage limits, it can store a large number of feature codes and update them as quickly and promptly as possible, so the feature codes on the server are comparatively comprehensive, which greatly improves file scanning efficiency.
  • Local scanning is combined with the feature code comparison performed by the server, which improves the accuracy of file scanning.
  • File items from the secondary scan that are dangerous or whose suspicion level exceeds the suspicion threshold are uploaded, so the feature codes stored on the server are continuously updated and enriched, which improves the efficiency of server-side file scanning.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Virology (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

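A file scanning method, system, client, and server are provided. The method includes: enumerating the files to be scanned (S110); acquiring, one by one, the attribute value of each enumerated file to be scanned and transmitting the attribute values to a server (S130); comparing each attribute value with the feature codes stored on the server to obtain the feature code that matches the attribute value and the category to which that feature code belongs (S150); and forming, from the matching feature code and its category, a correspondence among the file to be scanned, the attribute value, and the category, and recording the correspondence in a first scan result (S170). With this method, system, client, and server, the attribute value of a file to be scanned is transmitted to the server and compared with the feature codes and categories stored there to identify whether the file is safe or dangerous. Because the server is not bound by the client's storage limits, it can store a large number of feature codes and update them quickly and promptly, so the feature codes on the server are comparatively comprehensive and file scanning efficiency is improved.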

Description

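File scanning method, system, client, and server
[Technical Field]
The present invention relates to data processing technology, and in particular to a file scanning method, system, client, and server.
[Background]
With the continuing development of computer technology, more and more people use files of all kinds for work and entertainment. These files may be downloaded from the Internet, obtained from removable storage media, or received over a connection established with another user. For the user, therefore, it is highly likely that files obtained through these channels, and the terminal devices the user relies on such as computers and mobile phones, harbor harmful suspicious files. As a result, the virus program files and Trojan program files among such suspicious files proliferate and seriously harm the files the user works with.
However, scanning files for suspicious content relies only on a locally installed client engine, which is an antivirus engine, and a local virus database. The number of virus feature codes that the local virus database can store for finding suspicious files is limited, while the number of virus program files and Trojan program files is growing rapidly and far outpaces the update rate of the local virus database, which can only passively speed up its updates.
Because the virus feature codes that the local virus database can store do not cover all virus program files and Trojan program files, the suspicious-file scanning performed by the client engine suffers from low scanning efficiency.
[Summary of the Invention]
In view of this, it is necessary to provide a file scanning method that can improve the detection and removal rate.
In addition, it is necessary to provide a file scanning system that can improve the detection and removal rate.
It is further necessary to provide a file scanning client that can improve the detection and removal rate.
It is also necessary to provide a file scanning server that can improve the detection and removal rate.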
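A file scanning method includes the following steps:
enumerating files to be scanned;
acquiring, one by one, the attribute value of each file to be scanned from the enumerated files, and transmitting the attribute value to a server;
comparing the attribute value with the feature codes stored on the server to obtain the feature code that matches the attribute value and the category to which the feature code belongs;
forming, according to the matching feature code and the category to which it belongs, a correspondence among the file to be scanned, the attribute value, and the category, and recording the correspondence in a first scan result.
A file scanning method includes the following steps:
enumerating files to be scanned;
acquiring, one by one, the attribute value of each file to be scanned from the enumerated files, and transmitting the attribute value to a server.
A file scanning method includes:
comparing the attribute value of a file to be scanned with the feature codes stored on a server to obtain the feature code that matches the attribute value and the category to which the feature code belongs;
forming, according to the matching feature code and the category to which it belongs, a correspondence among the file to be scanned, the attribute value, and the category, and recording the correspondence in a first scan result.
A file scanning system includes a client and a server;
the client includes:
an enumeration module, configured to enumerate files to be scanned;
an attribute value obtaining module, configured to acquire, one by one, the attribute value of each file to be scanned from the enumerated files, and to transmit the attribute value to the server;
the server includes:
a database, configured to store feature codes and the category to which each feature code belongs;
a comparison module, configured to compare the attribute value with the stored feature codes to obtain the feature code that matches the attribute value and the category to which the feature code belongs;
a correspondence forming module, configured to form, according to the matching feature code and the category to which it belongs, a correspondence among the file to be scanned, the attribute value, and the category, and to record the correspondence in a first scan result.
A file scanning client includes:
an enumeration module, configured to enumerate files to be scanned;
an attribute value obtaining module, configured to acquire, one by one, the attribute value of each file to be scanned from the enumerated files, and to transmit the attribute value to a server.
A file scanning server includes:
a database, configured to store feature codes and the category to which each feature code belongs;
a comparison module, configured to compare the attribute value of a file to be scanned with the stored feature codes to obtain the feature code that matches the attribute value and the category to which the feature code belongs;
a correspondence forming module, configured to form, according to the matching feature code and the category to which it belongs, a correspondence among the file to be scanned, the attribute value, and the category, and to record the correspondence in a first scan result.
With the file scanning method, system, client, and server described above, the attribute value of a file to be scanned is transmitted to the server and compared with the feature codes and categories stored there to identify whether the file is safe or dangerous. Because the server is not bound by the client's storage limits, it can store a large number of feature codes and update them as quickly and promptly as possible, so the feature codes on the server are comparatively comprehensive, which greatly improves file scanning efficiency.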
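[Brief Description of the Drawings]
FIG. 1 is a flowchart of a file scanning method in one embodiment;
FIG. 2 is a flowchart of a file scanning method in another embodiment;
FIG. 3 is a flowchart of a file scanning method in another embodiment;
FIG. 4 is a flowchart of a file scanning method in another embodiment;
FIG. 5 is a flowchart of the method in FIG. 4 for determining, according to the first scan result, the files to be scanned locally;
FIG. 6 is a schematic structural diagram of a file scanning system in one embodiment;
FIG. 7 is a schematic structural diagram of a client in one embodiment;
FIG. 8 is a schematic structural diagram of the scan file determining module in FIG. 7;
FIG. 9 is a schematic structural diagram of a client in another embodiment;
FIG. 10 is a schematic structural diagram of a client in another embodiment;
FIG. 11 is a schematic structural diagram of a client in another embodiment.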
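[Detailed Description]
FIG. 1 shows the flow of a file scanning method in one embodiment, which includes the following steps:
Step S110: enumerate the files to be scanned.
In this embodiment, when the scan engine of virus-removal or Trojan-removal software is started, the user generates a scan request through the scan page of the scan engine. The IPC (inter-process communication) module sends the generated scan request to the underlying hardware of the system, which in turn sends it to the server. From the received scan request, the scan engine and the server learn which files need to be scanned, so that the files to be scanned can be scanned in a targeted way. The IPC module sits between the scan page of the scan engine and the underlying hardware and implements communication between them, thereby providing network connectivity between the scan engine and the server.
Specifically, the scan request contains a task ID, a scan depth, and the way folders are enumerated. The scan depth depends on whether the user selects quick scan, full scan, or custom scan on the scan page. For example, a quick scan is faster but shallower.
To scan files, the files that the user designates for scanning are obtained from the user's operations on the scan page and are taken as the files to be scanned. The files to be scanned are enumerated according to a set queue length and distributed into an enumeration queue of that length to await scanning. In a preferred embodiment, the queue length is 20,000.
Step S130: acquire, one by one, the attribute value of each file to be scanned from the enumerated files, and transmit the attribute value to the server.
In this embodiment, the attribute value of the file to be scanned is acquired. The attribute value uniquely identifies the file to be scanned and can be used to verify its integrity. In a preferred embodiment, the attribute value of the file to be scanned may be an MD5 value.
The attribute value of each file to be scanned is acquired one by one from the enumerated files, a query request containing the attribute value, the file name of the file to be scanned, and similar information is generated, and the generated query request is transmitted to the server. The server may be a cloud platform built from multiple servers, in which the number of servers can be increased or decreased as needs change, or it may be a large server cluster.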
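As a minimal sketch of step S130, the following Python code computes an MD5 attribute value for each file and packs it, together with the file name, into a query payload. The function names, the JSON layout, and the field names `file_name` and `md5` are assumptions made for illustration; the patent does not specify a wire format.

```python
import hashlib
import json
import os


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_query_request(paths):
    """Build one query payload (hypothetical JSON layout) for the given files."""
    entries = [
        {"file_name": os.path.basename(path), "md5": md5_of_file(path)}
        for path in paths
    ]
    return json.dumps({"files": entries})
```

Reading in fixed-size chunks keeps memory use flat regardless of file size, which matters when thousands of files are enumerated.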
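After the server-side scan has been triggered, if the set of enumerated files awaiting server-side scanning is empty, that is, no file to be scanned by the server is found, the attempt is repeated after waiting for a set time. The set time may be 100 milliseconds.
The steps above run independently on the client where the scan engine resides and implement file scanning on the client.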
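Step S150: compare the attribute value with the feature codes stored on the server to obtain the feature code that matches the attribute value and the category to which that feature code belongs.
In this embodiment, the attribute value may be an MD5 value or another hash value computed over the file to be scanned, and the attribute value corresponding to each file to be scanned is unique. If a file to be scanned is incomplete, its attribute value changes and no longer matches the attribute value of the complete file. The server stores a large number of feature codes together with the category to which each belongs; every stored feature code has a corresponding category. The server is searched using the attribute value of the file to be scanned to find the feature code that matches it, and the category of that feature code is then obtained from the correspondence between feature codes and categories. This category is the category to which the attribute value of the file to be scanned belongs, and it indicates whether the file is a normal file, a virus program file, or a Trojan program file. For example, the feature code of a virus program file belongs to the blacklist, and a blacklisted file is a virus program file or a Trojan program file; the feature code of a normal file belongs to the whitelist, and a whitelisted file is one confirmed not to contain a virus or Trojan program file and can be run safely; a suspicious file belongs to the graylist, and a graylisted file is one that cannot be confirmed as a virus or Trojan program file but is active in virus-sensitive parts of the system.
While the attribute value is compared with the feature codes stored on the server, the matching feature code is found and the corresponding category is obtained from it. This category indicates whether the file corresponding to the attribute value is a virus or Trojan program file, a normal file, or a suspicious file. If no feature code on the server matches the attribute value, none of the stored feature codes has been hit, and the file corresponding to the attribute value can be classified into the miss list.
Step S170: form, according to the matching feature code and the category to which it belongs, a correspondence among the file to be scanned, the attribute value, and the category, and record the correspondence in the first scan result.
In this embodiment, the category of the file to be scanned is obtained from the comparison of the attribute value with the feature codes, which yields the scan result for the file, and the scan result is returned to the user.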
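Steps S150 and S170 amount to a keyed lookup followed by recording one row per file. The sketch below assumes the feature code store is an in-memory dictionary mapping lowercase MD5 strings to categories; the `Category` enum, the function names, and the record fields are illustrative and not taken from the patent.

```python
from enum import Enum


class Category(Enum):
    BLACK = "blacklist"   # virus or Trojan program file
    WHITE = "whitelist"   # confirmed normal file
    GRAY = "graylist"     # suspicious file
    MISS = "miss"         # no matching feature code on the server


def classify(attribute_value, signature_db):
    """Return the category of the matching feature code, or MISS if none matches."""
    return signature_db.get(attribute_value.lower(), Category.MISS)


def build_first_scan_result(query_entries, signature_db):
    """Record the file / attribute value / category correspondence for each entry."""
    return [
        {
            "file_name": entry["file_name"],
            "md5": entry["md5"],
            "category": classify(entry["md5"], signature_db),
        }
        for entry in query_entries
    ]
```

A production server would presumably back this lookup with a database or distributed key-value store rather than a dictionary, but the mapping from attribute value to category is the same.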
Steps S150 and S170 above run independently on the server and implement file scanning on the server.
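In another embodiment, as shown in FIG. 2, the following steps are performed after step S170:
Step S210: determine, according to the first scan result, the files to be scanned locally.
In this embodiment, on top of the file scanning performed by the server, the scan engine can also be used to scan files locally. To further improve the efficiency and accuracy of file scanning, the local scan should be closely combined with the file scanning performed by the server.
Specifically, the first scan result returned by the server shows which files to be scanned are suspicious files and for which files no feature code matching the attribute value was found on the server. To ensure the accuracy of the scan result, files whose first scan result is "suspicious" and files for which no matching feature code was found on the server are taken as the files to be scanned locally.
In addition, the files to be scanned that were not scanned by the server also need to be scanned locally, so that every file is scanned and has a corresponding scan result.
After a local scan has been triggered, if no file to be scanned locally is found, the search for such files may be retried after waiting for a set time. In a preferred embodiment, the set time is 100 milliseconds.
Step S230: perform a local scan on the determined files to be scanned to obtain a second scan result.
In this embodiment, the attribute value of each determined file to be scanned is obtained, the locally stored virus database is searched for the feature code equal to that attribute value and the category it belongs to, and the category found indicates whether the file is a normal file, a virus program file, or a Trojan program file.
Step S250: integrate the second scan result and the first scan result to form a third scan result.
In this embodiment, after the server-side scan and the local scan of the files to be scanned are complete, the first and second scan results are integrated, taking both fully into account, to obtain the third scan result.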
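Table: first scan result | second scan result | third scan result (of the cell values, only "miss" entries are legible; the remaining cells are missing)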
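Specifically, as shown in the table above, for a given file, if the category recorded in the first scan result is the blacklist, the third scan result for the file is formed from the first scan result. If the category recorded in the first scan result is the graylist or the miss list, the file must be scanned locally to obtain the second scan result, and the second scan result is used as the third scan result for the file.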
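The integration rule can be sketched as a small merge function. The text only states the blacklist, graylist, and miss cases; keeping a whitelist verdict from the server, and falling back to the local result for files the server never scanned, are assumptions of this sketch. Categories are plain strings here for brevity.

```python
def merge_results(first, second):
    """Combine server (first) and local (second) verdicts into a third result.

    Both arguments map file names to one of "black", "white", "gray", "miss".
    """
    third = {}
    for name in set(first) | set(second):
        server_verdict = first.get(name)
        # Gray, miss, or absent server verdicts defer to the local scan
        # whenever a local verdict exists for the file.
        if server_verdict in ("gray", "miss", None) and name in second:
            third[name] = second[name]
        else:
            # Assumption: any other server verdict (black, white) is kept as-is.
            third[name] = server_verdict
    return third
```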
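In the file scanning method above, once the third scan result has been formed it can be shown to the user. Based on the third scan result, the user is warned that blacklisted files pose a risk, and the blacklisted files are cleaned up.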
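In another embodiment, as shown in FIG. 3, the following steps are performed after step S250:
Step S310: remove files to be scanned from the enumeration queue according to the corresponding file items in the third scan result.
In this embodiment, after the server-side scan and the local scan, the files whose scan is complete should be removed from the enumerated files to be scanned. That is, the scanned files are identified from the corresponding file items in the third scan result and are then removed from the enumerated files.
Step S330: determine whether the enumeration queue has an empty slot. If so, go to step S350; if not, end.
In this embodiment, because the enumerated files to be scanned form an enumeration queue of a specific length, not all files to be scanned are in the queue. Empty slots are therefore sought in the queue so that files to be scanned that are not yet in the queue can be added.
Specifically, after a scanned file is removed from the enumeration queue, the remaining files to be scanned stay in their original positions and are not moved or repositioned because of the removal. For example, if the file in the first position of the queue has been scanned and removed, the file in the second position does not move forward to fill the empty slot at the head of the queue. The enumeration pointer therefore searches for an empty slot starting from the first position of the queue. When an empty slot is found, a file that needs scanning but is not yet in the queue is added. If no empty slot is found for the time being, the search continues, and the file is added once an empty slot appears.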
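The fixed-position behavior described above can be illustrated with a slot array in which removal clears a slot without shifting its neighbors. The class and method names below are hypothetical; the patent describes the behavior but not a data structure.

```python
class EnumerationQueue:
    """Fixed-length queue whose entries never shift when another entry is removed."""

    def __init__(self, capacity):
        self.slots = [None] * capacity

    def add(self, path):
        """Put path in the first empty slot, searching from position 0.

        Returns the slot index, or None if the queue is currently full.
        """
        for index, item in enumerate(self.slots):
            if item is None:
                self.slots[index] = path
                return index
        return None

    def remove(self, path):
        """Clear the slot holding path; other entries keep their positions."""
        index = self.slots.index(path)
        self.slots[index] = None
        return index
```

A caller that receives `None` from `add` would retry later, matching the "continue searching until an empty slot appears" behavior in the text.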
Step S350: add a file to be scanned that is not yet in the enumeration queue to the queue.
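In another embodiment, as shown in FIG. 4, the file scanning method includes the following steps:
Step S401: enumerate the files to be scanned.
Step S402: determine whether the number of enumerated files to be scanned has reached a first threshold. If so, go to step S403; if not, return to step S401.
In this embodiment, while files to be scanned are enumerated one by one to form the enumeration queue, it is determined whether the length of the queue has reached the first threshold and, if so, the server-side file scan is triggered. In a preferred embodiment, the first threshold may be 50, that is, the server-side file scan is triggered as soon as 50 files to be scanned have been enumerated.
Step S403: search the enumerated files to be scanned for those that meet a preset condition.
In this embodiment, after the server-side scan has been triggered, the enumerated files to be scanned are searched for those that can be scanned by the server. In a preferred embodiment, the preset condition may be that the file is a PE (Portable Executable) file smaller than 3 MB. The preset condition can be modified flexibly according to actual processing capacity and user needs.
Step S405: acquire, one by one, the attribute value of each file to be scanned from the enumerated files, and transmit the attribute value to the server.
Step S406: compare the attribute value with the feature codes stored on the server to obtain the feature code that matches the attribute value and the category to which that feature code belongs.
Step S407: form, according to the matching feature code and the category to which it belongs, a correspondence among the file to be scanned, the attribute value, and the category, and record the correspondence in the first scan result.
Step S408: determine whether the number of enumerated files to be scanned has reached a second threshold. If so, go to step S409; if not, return to step S401.
In this embodiment, while files to be scanned are enumerated one by one to form the enumeration queue, it is determined whether the length of the queue has reached the set second threshold and, if so, the local scan is triggered. In a preferred embodiment, the second threshold may be 5000, that is, a local scan can be triggered once 5,000 files to be scanned have been enumerated.
In a preferred embodiment, the second threshold should be greater than the first threshold, because the server-side file scan requires a network connection and data transmission and therefore takes considerably longer than a local scan. In addition, because the feature codes stored on the server are the most comprehensive, forming the final scan result on the basis of the first scan result produced by the server improves scanning accuracy and helps reduce the total scan time.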
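The two-threshold trigger can be sketched as follows, using the preferred values of 50 and 5000 from the text. The function name and the string labels are assumptions, as is treating "reaches" as a greater-than-or-equal comparison.

```python
FIRST_THRESHOLD = 50      # preferred value: triggers the server-side scan
SECOND_THRESHOLD = 5000   # preferred value: triggers the local scan


def scans_to_trigger(enumerated_count, first=FIRST_THRESHOLD, second=SECOND_THRESHOLD):
    """Return which scans are triggered for the current enumeration queue length."""
    if second <= first:
        raise ValueError("the second threshold should be greater than the first")
    triggered = []
    if enumerated_count >= first:
        triggered.append("server")
    if enumerated_count >= second:
        triggered.append("local")
    return triggered
```

Because the server scan starts after only 50 files while the local scan waits for 5,000, the slower network round trips get a head start and their results are available when local scanning begins.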
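Step S409: determine, according to the first scan result, the files to be scanned locally.
In a specific embodiment, step S405 is followed by a step of marking the files to be scanned whose attribute values have been transmitted.
In this embodiment, after the attribute values are transmitted to the server, the files to be scanned that are scanned by the server are marked.
As shown in FIG. 5, step S409 proceeds as follows:
Step S501: obtain, according to the first scan result, the file items among the enumerated files to be scanned that will undergo a secondary scan.
In this embodiment, the file items that have completed the server-side scan and their categories are obtained from the first scan result. If the category recorded for a file item in the first scan result is the graylist or the miss list, the file item may be a suspicious file, or no feature code equal to its attribute value was found among the feature codes stored on the server. The file item cannot be judged, so it needs a secondary scan.
Step S503: select the unmarked files to be scanned from the enumerated files, and form the files to be scanned locally from the file items awaiting a secondary scan and the unmarked files to be scanned.
In this embodiment, those enumerated files to be scanned that were not scanned by the server also need to be scanned by the local scan engine.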
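Steps S501 and S503 can be sketched as two filters over the enumerated files. The function name, the set of marked files, and the string categories are assumptions made for this illustration.

```python
def select_local_scan_files(enumerated, marked, first_result):
    """Return (rescan_items, unmarked_files) that together form the local scan set.

    enumerated:   list of file names in enumeration order
    marked:       set of file names whose attribute values were sent to the server
    first_result: dict mapping file name to "black", "white", "gray", or "miss"
    """
    # S501: server verdicts of gray or miss require a secondary (local) scan.
    rescan_items = [
        name for name in enumerated if first_result.get(name) in ("gray", "miss")
    ]
    # S503: files never sent to the server must also be scanned locally.
    unmarked_files = [name for name in enumerated if name not in marked]
    return rescan_items, unmarked_files
```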
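Step S410: perform a local scan on the determined files to be scanned to obtain a second scan result.
In this embodiment, the local scan of the determined files proceeds as follows: the file items awaiting a secondary scan and the unmarked files to be scanned are scanned in turn in a set order of priority.
When the determined files are scanned locally in the set order of priority, the file items awaiting a secondary scan are scanned first. Once these are done, the files to be scanned that are not PE files are scanned, and finally the PE files that do not meet the preset condition are scanned locally. The scan priority can be adjusted flexibly according to the actual scanning process.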
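The priority order described above (secondary-scan items, then non-PE files, then PE files that failed the preset condition) maps naturally onto a sort key. The `PendingFile` record and its fields are hypothetical; how a file is recognized as PE is outside this sketch.

```python
from dataclasses import dataclass


@dataclass
class PendingFile:
    name: str
    is_pe: bool
    needs_rescan: bool = False  # server verdict was gray or miss


def local_scan_priority(item):
    """Lower numbers are scanned first."""
    if item.needs_rescan:
        return 0          # file items awaiting a secondary scan
    if not item.is_pe:
        return 1          # files that are not PE files
    return 2              # PE files that did not meet the preset condition


def order_for_local_scan(files):
    """Sort by priority; sorted() is stable, so enumeration order is kept within a tier."""
    return sorted(files, key=local_scan_priority)
```

Changing the priority only requires changing the key function, which fits the statement that the priority can be adjusted flexibly.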
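Step S411: integrate the second scan result and the first scan result to form a third scan result.
Step S412: acquire the category recorded in the second scan result for each file item that has undergone a secondary scan.
In this embodiment, because the second scan result also records the correspondence among file name, attribute value, and category, the category of a file item that has undergone a secondary scan can be looked up in the second scan result. From this category it is known whether the file item is a virus program file or a Trojan program file.
Step S413: determine, according to the acquired category, whether the file item that has undergone a secondary scan is dangerous. If so, go to step S414; if not, go to step S415.
In this embodiment, the category shows whether the file item that has undergone a secondary scan is dangerous. For example, if the category is the blacklist, the file item contains a virus or Trojan program file and is dangerous. Because this category was obtained by the local scan, the feature codes stored on the server are not yet comprehensive enough and need to be updated. In this case, the attribute value corresponding to the file item is uploaded and stored as a feature code.
Step S414: upload the attribute value corresponding to the file item that has undergone a secondary scan.
Step S415: scan the file item that has undergone a secondary scan to obtain its suspicion level.
In this embodiment, when the category of the file item is judged not to be dangerous, the file item may still be a suspicious file, so it needs to be scanned again to obtain its suspicion level.
Step S416: determine whether the suspicion level exceeds the suspicion threshold. If so, return to step S414; if not, end.
In this embodiment, whether a suspicious file is likely to be safe can be judged against the set suspicion threshold. For example, the suspicion threshold may be set to 30%. If the suspicion level exceeds 30%, the suspicious file should be regarded as a virus program file or a Trojan program file whose feature code is not stored on the server, so the feature code of the suspicious file needs to be uploaded and classified into the blacklist.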
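The decision in steps S413 to S416 reduces to a short predicate. The function name and the representation of the suspicion level as a fraction between 0 and 1 are assumptions; the 30% figure is the example threshold from the text, and "exceeds" is taken as a strict comparison.

```python
SUSPICION_THRESHOLD = 0.30  # example value from the text (30%)


def should_upload(category, suspicion_level=None, threshold=SUSPICION_THRESHOLD):
    """Decide whether a rescanned file's attribute value is uploaded to the server.

    category:        local (second) scan verdict, e.g. "black", "white", "gray"
    suspicion_level: score from the extra scan in step S415, or None if not computed
    """
    if category == "black":
        return True  # S413 -> S414: dangerous, upload immediately
    # S415 / S416: otherwise upload only if the suspicion level exceeds the threshold.
    return suspicion_level is not None and suspicion_level > threshold
```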
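FIG. 6 shows a file scanning system in one embodiment, which includes a client 10 and a server 30.
The client 10 includes an enumeration module 110 and an attribute value obtaining module 120.
The enumeration module 110 is configured to enumerate the files to be scanned.
In this embodiment, when the scan engine of virus-removal or Trojan-removal software is started, the user generates a scan request through the scan page of the scan engine. The IPC (inter-process communication) module sends the generated scan request to the underlying hardware of the system, which in turn sends it to the server. From the received scan request, the scan engine and the server learn which files need to be scanned, so that the files to be scanned can be scanned in a targeted way. The IPC module sits between the scan page of the scan engine and the underlying hardware and implements communication between them, thereby providing network connectivity between the scan engine and the server.
Specifically, the scan request contains a task ID, a scan depth, and the way folders are enumerated. The scan depth depends on whether the user selects quick scan, full scan, or custom scan on the scan page. For example, a quick scan is faster but shallower.
To scan files, the files that the user designates for scanning are obtained from the user's operations on the scan page and are taken as the files to be scanned. The enumeration module 110 enumerates the files to be scanned according to a set queue length and distributes them into an enumeration queue of that length to await scanning. In a preferred embodiment, the queue length is 20,000.
The attribute value obtaining module 120 is configured to acquire, one by one, the attribute value of each file to be scanned from the enumerated files, and to transmit the attribute value to the server.
In this embodiment, the attribute value obtaining module 120 acquires the attribute value of the file to be scanned. The attribute value uniquely identifies the file to be scanned and can be used to verify its integrity. In a preferred embodiment, the attribute value of the file to be scanned may be an MD5 value.
The attribute value obtaining module 120 acquires the attribute value of each file to be scanned one by one from the enumerated files, generates a query request containing the attribute value, the file name of the file to be scanned, and similar information, and transmits the generated query request to the server. The server may be a cloud platform built from multiple servers, in which the number of servers can be increased or decreased as needs change, or it may be a large server cluster.
After the server-side scan has been triggered, if the set of enumerated files awaiting server-side scanning is empty, that is, no file to be scanned by the server is found, the attempt is repeated after waiting for a set time. The set time may be 100 milliseconds.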
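The server 30 includes a database 310, a comparison module 320, and a correspondence forming module 330.
The database 310 is configured to store feature codes and the category to which each feature code belongs.
The comparison module 320 is configured to compare the attribute value with the stored feature codes to obtain the feature code that matches the attribute value and the category to which that feature code belongs.
In this embodiment, the attribute value may be an MD5 value or another hash value computed over the file to be scanned, and the attribute value corresponding to each file to be scanned is unique. If a file to be scanned is incomplete, its attribute value changes and no longer matches the attribute value of the complete file. The server stores a large number of feature codes together with the category to which each belongs; every stored feature code has a corresponding category. The comparison module 320 searches the server using the attribute value of the file to be scanned to find the feature code that matches it, and then obtains the category of that feature code from the correspondence between feature codes and categories. This category is the category to which the attribute value of the file to be scanned belongs, and it indicates whether the file is a normal file, a virus program file, or a Trojan program file. For example, the feature code of a virus program file belongs to the blacklist, and a blacklisted file is a virus program file or a Trojan program file; the feature code of a normal file belongs to the whitelist, and a whitelisted file is one confirmed not to contain a virus or Trojan program file and can be run safely; a suspicious file belongs to the graylist, and a graylisted file is one that cannot be confirmed as a virus or Trojan program file but is active in virus-sensitive parts of the system.
While comparing the attribute value with the feature codes stored on the server, the comparison module 320 finds the matching feature code and obtains the corresponding category from it. This category indicates whether the file corresponding to the attribute value is a virus or Trojan program file, a normal file, or a suspicious file. If no feature code on the server matches the attribute value, none of the stored feature codes has been hit, and the file corresponding to the attribute value can be classified into the miss list.
The correspondence forming module 330 is configured to form, according to the matching feature code and the category to which it belongs, a correspondence among the file to be scanned, the attribute value, and the category, and to record the correspondence in the first scan result.
In this embodiment, from the comparison of the attribute value with the feature codes, the correspondence forming module 330 obtains the category of the file to be scanned and hence the scan result for the file, and returns the scan result to the user.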
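In another embodiment, as shown in FIG. 7, in addition to the enumeration module 110 and the attribute value obtaining module 120, the client 10 includes a scan file determining module 130, a scan module 140, and a result integration module 150.
The scan file determining module 130 is configured to determine, according to the first scan result, the files to be scanned locally.
In this embodiment, on top of the file scanning performed by the server, the scan engine can also be used to scan files locally. To further improve the efficiency and accuracy of file scanning, the local scan should be closely combined with the file scanning performed by the server.
Specifically, from the first scan result returned by the server, the scan file determining module 130 can tell which files to be scanned are suspicious files and for which files no feature code matching the attribute value was found on the server. To ensure the accuracy of the scan result, files whose first scan result is "suspicious" and files for which no matching feature code was found on the server are taken as the files to be scanned locally.
In addition, the scan file determining module 130 also needs to have the files to be scanned that were not scanned by the server scanned locally, so that every file is scanned and has a corresponding scan result.
After a local scan has been triggered, if no file to be scanned locally is found, the search for such files may be retried after waiting for a set time. In a preferred embodiment, the set time is 100 milliseconds.
In a specific embodiment, the client further includes a marking module, which is configured to mark the files to be scanned whose attribute values have been transmitted.
In this embodiment, after the attribute values are transmitted to the server, the marking module marks the files to be scanned that are scanned by the server.
As shown in FIG. 8, the scan file determining module 130 includes a secondary scanning unit 131 and a selecting unit 133.
The secondary scanning unit 131 is configured to obtain, according to the first scan result, the file items among the enumerated files to be scanned that will undergo a secondary scan.
In this embodiment, the secondary scanning unit 131 obtains from the first scan result the file items that have completed the server-side scan and their categories. If the category recorded for a file item in the first scan result is the graylist or the miss list, the file item may be a suspicious file, or no feature code equal to its attribute value was found among the feature codes stored on the server. The file item cannot be judged, so it needs a secondary scan.
The selecting unit 133 is configured to select the unmarked files to be scanned from the enumerated files, and to form the files to be scanned locally from the file items awaiting a secondary scan and the unmarked files to be scanned.
In this embodiment, those enumerated files to be scanned that were not scanned by the server also need to be scanned by the local scan engine.
The scan module 140 is configured to perform a local scan on the determined files to be scanned to obtain a second scan result.
In this embodiment, the scan module 140 obtains the attribute value of each determined file to be scanned, looks up in the locally stored virus database the feature code equal to that attribute value and the category it belongs to, and learns from the category found whether the file is a normal file, a virus program file, or a Trojan program file.
Specifically, the scan module 140 is further configured to scan, in a set order of priority, the file items awaiting a secondary scan and the unmarked files to be scanned.
When the determined files are scanned locally in the set order of priority, the scan module 140 first scans the file items awaiting a secondary scan. Once these are done, it scans the files to be scanned that are not PE files, and finally it locally scans the PE files that do not meet the preset condition. The scan priority can be adjusted flexibly according to the actual scanning process.
The result integration module 150 is configured to integrate the second scan result and the first scan result to form a third scan result.
In this embodiment, after the server-side scan and the local scan of the files to be scanned are complete, the result integration module 150 integrates the first and second scan results, taking both fully into account, to obtain the third scan result.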
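In another embodiment, as shown in FIG. 9, the client 10 further includes a removing module 160 and an adding module 170.
The removing module 160 is configured to remove files to be scanned from the enumeration queue according to the corresponding file items in the third scan result.
In this embodiment, after the server-side scan and the local scan, the removing module 160 should remove the files whose scan is complete from the enumerated files to be scanned. That is, it identifies the scanned files from the corresponding file items in the third scan result and removes them from the enumerated files.
The adding module 170 is configured to determine whether the enumeration queue has an empty slot and, if so, to add a file to be scanned that is not yet in the enumeration queue to the queue.
In this embodiment, because the enumerated files to be scanned form an enumeration queue of a specific length, not all files to be scanned are in the queue. The adding module 170 therefore looks for empty slots in the queue so that files to be scanned that are not yet in the queue can be added.
Specifically, after a scanned file is removed from the enumeration queue, the remaining files to be scanned stay in their original positions and are not moved or repositioned because of the removal. For example, if the file in the first position of the queue has been scanned and removed, the file in the second position does not move forward to fill the empty slot at the head of the queue. The enumeration pointer therefore searches for an empty slot starting from the first position of the queue. When an empty slot is found, the adding module 170 adds a file that needs scanning but is not yet in the queue. If no empty slot is found for the time being, the search continues, and the file is added once an empty slot appears.
In another embodiment, as shown in FIG. 10, the client further includes an enumeration determination module 180 and a search module 190.
The enumeration determination module 180 is configured to determine whether the number of enumerated files to be scanned has reached a first threshold and, if so, to notify the search module 190.
In this embodiment, while files to be scanned are enumerated one by one to form the enumeration queue, the enumeration determination module 180 determines whether the length of the queue has reached the first threshold and, if so, triggers the server-side file scan. In a preferred embodiment, the first threshold may be 50, that is, the server-side file scan is triggered as soon as 50 files to be scanned have been enumerated.
The search module 190 is configured to search the enumerated files to be scanned for those that meet a preset condition.
In this embodiment, after the server-side scan has been triggered, the search module 190 searches the enumerated files to be scanned for those that can be scanned by the server. In a preferred embodiment, the preset condition may be that the file is a PE file smaller than 3 MB. The preset condition can be modified flexibly according to actual processing capacity and user needs.
The enumeration determination module 180 is further configured to determine whether the number of enumerated files to be scanned has reached a second threshold and, if so, to notify the scan file determining module 130.
In this embodiment, while files to be scanned are enumerated one by one to form the enumeration queue, the enumeration determination module 180 determines whether the length of the queue has reached the set second threshold and, if so, triggers the local scan. In a preferred embodiment, the second threshold may be 5000, that is, a local scan can be triggered once 5,000 files to be scanned have been enumerated.
In a preferred embodiment, the second threshold should be greater than the first threshold, because the server-side file scan requires a network connection and data transmission and therefore takes considerably longer than a local scan. In addition, because the feature codes stored on the server are the most comprehensive, forming the final scan result on the basis of the first scan result produced by the server improves scanning accuracy and helps reduce the total scan time.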
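In another embodiment, as shown in FIG. 11, the client 10 further includes a category obtaining module 200, a risk determination module 210, an uploading module 230, and a suspicion determination module 240.
The category obtaining module 200 is configured to acquire the category recorded in the second scan result for a file item that has undergone a secondary scan.
In this embodiment, because the second scan result also records the correspondence among file name, attribute value, and category, the category obtaining module 200 can look up in the second scan result the category of a file item that has undergone a secondary scan. From this category it is known whether the file item is a virus program file or a Trojan program file.
The risk determination module 210 is configured to determine, according to the acquired category, whether the file item that has undergone a secondary scan is dangerous. If so, it notifies the uploading module 230; if not, it notifies the scan module 140.
In this embodiment, the risk determination module 210 can tell from the category whether the file item that has undergone a secondary scan is dangerous. For example, if the category is the blacklist, the file item contains a virus or Trojan program file and is dangerous. Because this category was obtained by the local scan, the feature codes stored on the server are not yet comprehensive enough and need to be updated. In this case, the attribute value corresponding to the file item is uploaded and stored as a feature code.
The uploading module 230 is configured to upload the attribute value corresponding to the file item that has undergone a secondary scan.
The scan module 140 is further configured to scan the file item that has undergone a secondary scan to obtain its suspicion level.
In this embodiment, when the category of the file item is judged not to be dangerous, the file item may still be a suspicious file, so the scan module 140 needs to scan it again to obtain its suspicion level.
The suspicion determination module 240 is configured to determine whether the suspicion level exceeds the suspicion threshold and, if so, to notify the uploading module 230.
In this embodiment, whether a suspicious file is likely to be safe can be judged against the set suspicion threshold. For example, the suspicion threshold may be set to 30%. If the suspicion determination module 240 determines that the suspicion level exceeds 30%, the suspicious file should be regarded as a virus program file or a Trojan program file whose feature code is not stored on the server, so the feature code of the suspicious file needs to be uploaded and classified into the blacklist.
With the file scanning method, system, client, and server described above, the attribute value of a file to be scanned is transmitted to the server and compared with the feature codes and categories stored there to identify whether the file is safe or dangerous. Because the server is not bound by the client's storage limits, it can store a large number of feature codes and update them as quickly and promptly as possible, so the feature codes on the server are comparatively comprehensive, which greatly improves file scanning efficiency.
In the file scanning method, system, client, and server described above, local scanning is combined with the feature code comparison performed by the server, which improves the accuracy of file scanning.
In the file scanning method, system, client, and server described above, file items from the secondary scan that are dangerous or whose suspicion level exceeds the suspicion threshold are uploaded, so the feature codes stored on the server are continuously updated and enriched, which improves the efficiency of server-side file scanning.
The embodiments above represent only several implementations of the present invention. Although they are described specifically and in detail, they should not be construed as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art can make various modifications and improvements without departing from the concept of the invention, and all of these fall within its scope of protection. The scope of protection of this patent is therefore defined by the appended claims.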

Claims (34)

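  1. A file scanning method, comprising the following steps:
    enumerating files to be scanned;
    acquiring, one by one, the attribute value of each file to be scanned from the enumerated files, and transmitting the attribute value to a server;
    comparing the attribute value with the feature codes stored on the server to obtain the feature code that matches the attribute value and the category to which the feature code belongs;
    forming, according to the matching feature code and the category to which it belongs, a correspondence among the file to be scanned, the attribute value, and the category, and recording the correspondence in a first scan result.
  2. The file scanning method according to claim 1, characterized in that, after the step of recording the correspondence in the first scan result, the method further comprises:
    determining, according to the first scan result, the files to be scanned locally;
    performing a local scan on the determined files to be scanned to obtain a second scan result;
    integrating the second scan result and the first scan result to form a third scan result.
  3. The file scanning method according to claim 2, characterized in that, after the step of recording the correspondence in the first scan result, the method further comprises:
    removing the file to be scanned from the enumeration queue according to the corresponding file item in the third scan result.
  4. The file scanning method according to claim 3, characterized in that, after the step of removing the file to be scanned from the enumerated files to be scanned according to the corresponding file item in the third scan result, the method further comprises:
    determining whether the enumeration queue has an empty slot and, if so, adding a file to be scanned that is not in the enumeration queue to the enumeration queue.
  5. The file scanning method according to claim 2, characterized in that, before the step of acquiring, one by one, the attribute value of each file to be scanned from the enumerated files, the method further comprises:
    determining whether the number of enumerated files to be scanned has reached a first threshold and, if so,
    searching the enumerated files to be scanned for those that meet a preset condition, and proceeding to the step of acquiring, one by one, the attribute value of each file to be scanned from the enumerated files;
    and before the step of determining, according to the first scan result, the files to be scanned locally, the method further comprises:
    determining whether the number of enumerated files to be scanned has reached a second threshold and, if so, proceeding to the step of determining, according to the first scan result, the files to be scanned locally.
  6. The file scanning method according to claim 2, characterized in that, after the step of transmitting the attribute value to the server, the method further comprises:
    marking the files to be scanned whose attribute values have been transmitted;
    and the step of determining, according to the first scan result, the files to be scanned locally comprises:
    obtaining, according to the first scan result, the file items among the enumerated files to be scanned that will undergo a secondary scan;
    selecting the unmarked files to be scanned from the enumerated files, and forming the files to be scanned locally from the file items awaiting a secondary scan and the unmarked files to be scanned.
  7. The file scanning method according to claim 6, characterized in that the step of performing a local scan on the determined files to be scanned to obtain a second scan result comprises:
    scanning, in a set order of priority, the file items awaiting a secondary scan and the unmarked files to be scanned.
  8. The file scanning method according to claim 2, characterized in that, after the step of integrating the second scan result and the first scan result to form a third scan result, the method further comprises:
    acquiring the category recorded in the second scan result for the file item that has undergone a secondary scan;
    determining, according to the acquired category, whether the file item that has undergone a secondary scan is dangerous and, if so, uploading the attribute value corresponding to the file item; if not,
    scanning the file item that has undergone a secondary scan to obtain its suspicion level;
    determining whether the suspicion level exceeds a suspicion threshold and, if so, uploading the attribute value corresponding to the file item that has undergone a secondary scan.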
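  9. A file scanning method, comprising the following steps:
    enumerating files to be scanned;
    acquiring, one by one, the attribute value of each file to be scanned from the enumerated files, and transmitting the attribute value to a server.
  10. The file scanning method according to claim 9, characterized in that, after the step of acquiring, one by one, the attribute value of each file to be scanned from the enumerated files and transmitting the attribute value to the server, the method further comprises:
    determining, according to a first scan result returned by the server, the files to be scanned locally;
    performing a local scan on the determined files to be scanned to obtain a second scan result;
    integrating the second scan result and the first scan result to form a third scan result.
  11. The file scanning method according to claim 10, characterized in that, after the step of integrating the second scan result and the first scan result to form a third scan result, the method further comprises:
    removing the file to be scanned from the enumeration queue according to the corresponding file item in the third scan result.
  12. The file scanning method according to claim 11, characterized in that, after the step of removing the file to be scanned from the enumeration queue according to the corresponding file item in the third scan result, the method further comprises:
    determining whether the enumeration queue has an empty slot and, if so, adding a file to be scanned that is not in the enumeration queue to the enumeration queue.
  13. The file scanning method according to claim 10, characterized in that, before the step of acquiring, one by one, the attribute value of each file to be scanned from the enumerated files, the method further comprises:
    determining whether the number of enumerated files to be scanned has reached a first threshold and, if so,
    searching the enumerated files to be scanned for those that meet a preset condition, and proceeding to the step of acquiring, one by one, the attribute value of each file to be scanned from the enumerated files;
    and before the step of determining, according to the first scan result, the files to be scanned locally, the method further comprises:
    determining whether the number of enumerated files to be scanned has reached a second threshold and, if so, proceeding to the step of determining, according to the first scan result, the files to be scanned locally.
  14. The file scanning method according to claim 10, characterized in that, after the step of transmitting the attribute value to the server, the method further comprises:
    marking the files to be scanned whose attribute values have been transmitted;
    and the step of determining, according to the first scan result, the files to be scanned locally comprises:
    obtaining, according to the first scan result, the file items among the enumerated files to be scanned that will undergo a secondary scan;
    selecting the unmarked files to be scanned from the enumerated files, and forming the files to be scanned locally from the file items awaiting a secondary scan and the unmarked files to be scanned.
  15. The file scanning method according to claim 14, characterized in that the step of performing a local scan on the determined files to be scanned to obtain a second scan result comprises:
    scanning, in a set order of priority, the file items awaiting a secondary scan and the unmarked files to be scanned.
  16. The file scanning method according to claim 10, characterized in that, after the step of integrating the second scan result and the first scan result to form a third scan result, the method further comprises:
    acquiring the category recorded in the second scan result for the file item that has undergone a secondary scan;
    determining, according to the acquired category, whether the file item that has undergone a secondary scan is dangerous and, if so, uploading the attribute value corresponding to the file item; if not,
    scanning the file item that has undergone a secondary scan to obtain its suspicion level;
    determining whether the suspicion level exceeds a suspicion threshold and, if so, uploading the attribute value corresponding to the file item that has undergone a secondary scan.
  17. A file scanning method, characterized by comprising:
    comparing the attribute value of a file to be scanned with the feature codes stored on a server to obtain the feature code that matches the attribute value and the category to which the feature code belongs;
    forming, according to the matching feature code and the category to which it belongs, a correspondence among the file to be scanned, the attribute value, and the category, and recording the correspondence in a first scan result.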
  18. 一种文件的扫描系统,其特征在于,包括客户端以及服务端;
    所述客户端包括:
    枚举模块,用于枚举待扫描文件;
    属性值获取模块,用于从所述枚举的待扫描文件中逐一获取所述待扫描文件的属性值,并向服务端传输所述属性值;
    所述服务端包括:
    数据库,用于存储特征码以及所述特征码所属的类别;
    比对模块,用于将所述属性值与存储的特征码进行比对,得到与所述属性值相一致的特征码以及所述特征码所属的类别;
    对应关系形成模块,用于根据与所述属性值一致的特征码以及所述特征码所属的类别形成所述待扫描文件、属性值以及类别之间的对应关系,并将所述对应关系记录于第一扫描结果中。
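  19. The file scanning system according to claim 18, characterized in that the client further comprises:
    a scan file determination module, configured to determine, according to the first scan result, the files to be scanned that are to undergo a local scan;
    a scanning module, configured to perform the local scan on the determined files to be scanned to obtain a second scan result;
    a result integration module, configured to integrate the second scan result and the first scan result to form a third scan result.
  20. The file scanning system according to claim 19, characterized in that the client further comprises:
    a removal module, configured to remove the files to be scanned from an enumeration queue according to the corresponding file items in the third scan result.
  21. The file scanning system according to claim 20, characterized in that the client further comprises:
    an adding module, configured to determine whether there is a free slot in the enumeration queue, and if so, to add files to be scanned that are not yet in the enumeration queue to the enumeration queue.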
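  22. The file scanning system according to claim 19, characterized in that the client further comprises:
    an enumeration judgment module, configured to determine whether the length of the enumerated files to be scanned reaches a first threshold, and if so, to notify a search module;
    the search module is configured to search the enumerated files to be scanned for files to be scanned that meet a preset condition;
    the enumeration judgment module is further configured to determine whether the length of the enumerated files to be scanned reaches a second threshold, and if so, to notify the scan file determination module.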
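  23. The file scanning system according to claim 19, characterized in that the client further comprises:
    a marking module, configured to mark the files to be scanned whose attribute values have been transmitted;
    the scan file determination module comprises:
    a secondary scan unit, configured to obtain, according to the first scan result, the file items among the enumerated files to be scanned that are to undergo a secondary scan;
    a selection unit, configured to select the unmarked files to be scanned from the enumerated files to be scanned, the file items to undergo the secondary scan and the unmarked files to be scanned together forming the files to be scanned that are to undergo the local scan.
  24. The file scanning system according to claim 23, characterized in that the scanning module is further configured to scan the file items to undergo the secondary scan and the unmarked files to be scanned in turn according to a set priority.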
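  25. The file scanning system according to claim 19, characterized in that the client further comprises:
    a category obtaining module, configured to obtain the category that corresponds, in the second scan result, to the file item subjected to the secondary scan;
    a danger judgment module, configured to determine, according to the obtained category, whether the file item subjected to the secondary scan is dangerous, and if so, to notify an upload module, and if not, to notify the scanning module;
    the upload module is configured to upload the attribute value corresponding to the file item subjected to the secondary scan;
    the scanning module is further configured to scan the file item subjected to the secondary scan to obtain a corresponding degree of suspicion;
    a suspicion judgment module, configured to determine whether the corresponding degree of suspicion exceeds a suspicion threshold, and if so, to notify the upload module.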
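  26. A file scanning client, characterized by comprising:
    an enumeration module, configured to enumerate files to be scanned;
    an attribute value obtaining module, configured to obtain the attribute values of the files to be scanned one by one from the enumerated files to be scanned, and to transmit the attribute values to a server.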
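  27. The file scanning client according to claim 26, characterized by further comprising:
    a scan file determination module, configured to determine, according to a first scan result returned by the server, the files to be scanned that are to undergo a local scan;
    a scanning module, configured to perform the local scan on the determined files to be scanned to obtain a second scan result;
    a result integration module, configured to integrate the second scan result and the first scan result to form a third scan result.
  28. The file scanning client according to claim 27, characterized by further comprising:
    a removal module, configured to remove the files to be scanned from an enumeration queue according to the corresponding file items in the third scan result.
  29. The file scanning client according to claim 28, characterized by further comprising:
    an adding module, configured to determine whether there is a free slot in the enumeration queue, and if so, to add files to be scanned that are not yet in the enumeration queue to the enumeration queue.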
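  30. The file scanning client according to claim 27, characterized by further comprising:
    an enumeration judgment module, configured to determine whether the length of the enumerated files to be scanned reaches a first threshold, and if so, to notify a search module;
    the search module is configured to search the enumerated files to be scanned for files to be scanned that meet a preset condition;
    the enumeration judgment module is further configured to determine whether the length of the enumerated files to be scanned reaches a second threshold, and if so, to notify the scan file determination module.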
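  31. The file scanning client according to claim 27, characterized by further comprising:
    a marking module, configured to mark the files to be scanned whose attribute values have been transmitted;
    the scan file determination module comprises:
    a secondary scan unit, configured to obtain, according to the first scan result, the file items among the enumerated files to be scanned that are to undergo a secondary scan;
    a selection unit, configured to select the unmarked files to be scanned from the enumerated files to be scanned, the file items to undergo the secondary scan and the unmarked files to be scanned together forming the files to be scanned that are to undergo the local scan.
  32. The file scanning client according to claim 31, characterized in that the scanning module is further configured to scan the file items to undergo the secondary scan and the unmarked files to be scanned in turn according to a set priority.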
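  33. The file scanning client according to claim 27, characterized by further comprising:
    a category obtaining module, configured to obtain the category that corresponds, in the second scan result, to the file item subjected to the secondary scan;
    a danger judgment module, configured to determine, according to the obtained category, whether the file item subjected to the secondary scan is dangerous, and if so, to notify an upload module, and if not, to notify the scanning module;
    the upload module is configured to upload the attribute value corresponding to the file item subjected to the secondary scan;
    the scanning module is further configured to scan the file item subjected to the secondary scan to obtain a corresponding degree of suspicion;
    a suspicion judgment module, configured to determine whether the corresponding degree of suspicion exceeds a suspicion threshold, and if so, to notify the upload module.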
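  34. A file scanning server, characterized by comprising:
    a database, configured to store signatures and the categories to which the signatures belong;
    a comparison module, configured to compare the attribute value of a file to be scanned with the stored signatures to obtain the signature that matches the attribute value and the category to which that signature belongs;
    a correspondence forming module, configured to form, according to the signature that matches the attribute value and the category to which that signature belongs, a correspondence among the file to be scanned, the attribute value and the category, and to record the correspondence in a first scan result.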
PCT/CN2012/078387 2011-08-04 2012-07-09 Method, system, client and server for scanning file WO2013017004A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
US14/130,665 US9069956B2 (en) 2011-08-04 2012-07-09 Method for scanning file, client and server thereof
EP12820533.3A EP2741227B1 (en) 2011-08-04 2012-07-09 Method, system, client and server for scanning file
RU2014102898/08A RU2581560C2 (ru) 2011-08-04 2012-07-09 Method for scanning files, client computer and server
BR112014002425-1A BR112014002425B1 (pt) 2011-08-04 2012-07-09 Method for scanning files

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2011102227389A CN102915421B (zh) 2011-08-04 2011-08-04 File scanning method and system
CN201110222738.9 2011-08-04

Publications (1)

Publication Number Publication Date
WO2013017004A1 (zh) 2013-02-07

Family

ID=47613784

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2012/078387 WO2013017004A1 (zh) 2011-08-04 2012-07-09 Method, system, client and server for scanning file

Country Status (6)

Country Link
US (1) US9069956B2 (zh)
EP (1) EP2741227B1 (zh)
CN (1) CN102915421B (zh)
BR (1) BR112014002425B1 (zh)
RU (1) RU2581560C2 (zh)
WO (1) WO2013017004A1 (zh)

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
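CN103248666A (zh) * 2012-02-14 2013-08-14 深圳市腾讯计算机系统有限公司 System, method and device for offline downloading of resources
CN102799811B (zh) * 2012-06-26 2014-04-16 腾讯科技(深圳)有限公司 Scanning method and device
CN103390130B (zh) * 2013-07-18 2017-04-05 北京奇虎科技有限公司 Method, device and server for detecting and removing malicious programs based on cloud security
CN103605743A (zh) * 2013-11-20 2014-02-26 中国科学院深圳先进技术研究院 Method and device for deleting empty folders on a mobile terminal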
WO2015128612A1 (en) 2014-02-28 2015-09-03 British Telecommunications Public Limited Company Malicious encrypted traffic inhibitor
US9383989B1 (en) 2014-06-16 2016-07-05 Symantec Corporation Systems and methods for updating applications
CN104268288B (zh) * 2014-10-21 2018-06-19 福州瑞芯微电子股份有限公司 NTFS-based media library scanning method and device
EP3241140B1 (en) 2014-12-30 2021-08-18 British Telecommunications public limited company Malware detection in migrated virtual machines
WO2016107754A1 (en) * 2014-12-30 2016-07-07 British Telecommunications Public Limited Company Malware detection
US10075453B2 (en) * 2015-03-31 2018-09-11 Juniper Networks, Inc. Detecting suspicious files resident on a network
US10931689B2 (en) 2015-12-24 2021-02-23 British Telecommunications Public Limited Company Malicious network traffic identification
US10839077B2 (en) 2015-12-24 2020-11-17 British Telecommunications Public Limited Company Detecting malicious software
WO2017108575A1 (en) 2015-12-24 2017-06-29 British Telecommunications Public Limited Company Malicious software identification
EP3394783B1 (en) 2015-12-24 2020-09-30 British Telecommunications public limited company Malicious software identification
WO2017109129A1 (en) 2015-12-24 2017-06-29 British Telecommunications Public Limited Company Software security
EP3437290B1 (en) 2016-03-30 2020-08-26 British Telecommunications public limited company Detecting computer security threats
WO2017167545A1 (en) 2016-03-30 2017-10-05 British Telecommunications Public Limited Company Network traffic threat identification
EP3500970B8 (en) 2016-08-16 2021-09-22 British Telecommunications Public Limited Company Mitigating security attacks in virtualised computing environments
US11562076B2 (en) 2016-08-16 2023-01-24 British Telecommunications Public Limited Company Reconfigured virtual machine to mitigate attack
US10771483B2 (en) 2016-12-30 2020-09-08 British Telecommunications Public Limited Company Identifying an attacked computing device
EP3602999B1 (en) 2017-03-28 2021-05-19 British Telecommunications Public Limited Company Initialisation vector identification for encrypted malware traffic detection
CN108881120B (zh) 2017-05-12 2020-12-04 创新先进技术有限公司 Blockchain-based data processing method and device
EP3623980B1 (en) 2018-09-12 2021-04-28 British Telecommunications public limited company Ransomware encryption algorithm determination
EP3623982B1 (en) 2018-09-12 2021-05-19 British Telecommunications public limited company Ransomware remediation
ES2879907T3 (es) * 2018-12-28 2021-11-23 Advanced New Technologies Co Ltd Parallel execution of transactions in a blockchain network based on smart contract whitelists
CN111506747B (zh) * 2020-04-16 2023-09-08 Oppo(重庆)智能科技有限公司 File parsing method and apparatus, electronic device, and storage medium
CN112149115A (zh) * 2020-08-28 2020-12-29 杭州安恒信息技术股份有限公司 Virus database updating method and device, electronic device, and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
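CN101127061B (zh) * 2006-08-16 2010-05-26 珠海金山软件股份有限公司 Computer virus prevention and removal apparatus capable of progress estimation, and progress estimation method
CN101808102A (zh) * 2010-04-23 2010-08-18 潘燕辉 Cloud-computing-based operation record tracking system and method
CN101827096A (zh) * 2010-04-09 2010-09-08 潘燕辉 Cloud-computing-based multi-user collaborative security protection system and method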

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6021510A (en) * 1997-11-24 2000-02-01 Symantec Corporation Antivirus accelerator
US6931548B2 (en) * 2001-01-25 2005-08-16 Dell Products L.P. System and method for limiting use of a software program with another software program
US7062490B2 (en) * 2001-03-26 2006-06-13 Microsoft Corporation Serverless distributed file system
US6993132B2 (en) * 2002-12-03 2006-01-31 Matsushita Electric Industrial Co., Ltd. System and method for reducing fraud in a digital cable network
US7257842B2 (en) * 2003-07-21 2007-08-14 Mcafee, Inc. Pre-approval of computer files during a malware detection
US7475427B2 (en) * 2003-12-12 2009-01-06 International Business Machines Corporation Apparatus, methods and computer programs for identifying or managing vulnerabilities within a data processing network
EP1549012A1 (en) * 2003-12-24 2005-06-29 DataCenterTechnologies N.V. Method and system for identifying the content of files in a network
US7971257B2 (en) * 2006-08-03 2011-06-28 Symantec Corporation Obtaining network origins of potential software threats
CN101039177A (zh) * 2007-04-27 2007-09-19 珠海金山软件股份有限公司 Online virus scanning device and method
CN101621511A (zh) * 2009-06-09 2010-01-06 北京安天电子设备有限公司 Multi-level detection method and system without a local virus database
RU2420791C1 (ru) * 2009-10-01 2011-06-10 ЗАО "Лаборатория Касперского" Method of assigning a previously unknown file to a file collection depending on the degree of similarity
CN101795267B (zh) * 2009-12-30 2012-12-19 成都市华为赛门铁克科技有限公司 Virus detection method, device and gateway apparatus
RU103201U1 (ru) * 2010-11-01 2011-03-27 Закрытое акционерное общество "Лаборатория Касперского" System for optimizing the use of computer resources during antivirus scanning
CN102024113B (zh) * 2010-12-22 2012-08-01 北京安天电子设备有限公司 Method and system for rapid detection of malicious code

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2741227A4 *

Also Published As

Publication number Publication date
RU2014102898A (ru) 2015-09-10
EP2741227B1 (en) 2017-01-04
EP2741227A4 (en) 2015-07-22
US20140157408A1 (en) 2014-06-05
CN102915421A (zh) 2013-02-06
BR112014002425B1 (pt) 2021-06-01
US9069956B2 (en) 2015-06-30
EP2741227A1 (en) 2014-06-11
BR112014002425A2 (pt) 2017-02-21
RU2581560C2 (ru) 2016-04-20
CN102915421B (zh) 2013-10-23

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12820533

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 14130665

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

REEP Request for entry into the european phase

Ref document number: 2012820533

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2012820533

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2014102898

Country of ref document: RU

Kind code of ref document: A

REG Reference to national code

Ref country code: BR

Ref legal event code: B01A

Ref document number: 112014002425

Country of ref document: BR

ENP Entry into the national phase

Ref document number: 112014002425

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20140131