CN102609515B - Quick file scanning method and quick file scanning system - Google Patents

Publication number
CN102609515B
Authority
CN
China
Prior art keywords
file
characteristic information
information
scanning
filename
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN2012100267600A
Other languages
Chinese (zh)
Other versions
CN102609515A (en)
Inventor
邹贵强
付旻
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qizhi Business Consulting Co ltd
Beijing Qihoo Technology Co Ltd
360 Digital Security Technology Group Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd, Qizhi Software Beijing Co Ltd
Priority to CN2012100267600A
Publication of CN102609515A
Priority to PCT/CN2013/071383 (WO2013117151A1)
Priority to US14/377,014 (US9355250B2)
Application granted
Publication of CN102609515B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Storage Device Security (AREA)

Abstract

The invention provides a quick file scanning method and a quick file scanning system, and relates to the technical field of networks. The method includes: acquiring a data packet, wherein the data packet comprises safe-file characteristic information used for determining whether files in a system are safe files; and scanning the file characteristic information of the files in the system one by one, and, if the currently scanned file characteristic information matches safe-file characteristic information in the data packet, skipping the antivirus scan of the current file and continuing with the next file. Because the data packet comprises safe-file characteristic information used for determining whether files in the system are safe, when a new user scans files for the first time and encounters a file whose characteristic information matches an entry in the data packet, that file, which is safe and would take a long time to scan, can be skipped, and the first scanning time is accordingly shortened.

Description

Quick file scanning method and system
Technical field
The application relates to the field of network technology, and in particular to a quick file scanning method and system.
Background technology
With the popularization of computers, essentially every client needs antivirus software installed to scan the files on the computer. An antivirus scan requires a large amount of CPU computation and disk operations, so the scanning process is long and slows the system down. Yet many of the files on different computers are identical, such as Windows system files and the installation packages, help files and compressed files of many software products.
In the prior art, when the antivirus software on a client scans for the first time, it scans all files on the entire hard disk and examines the full content of each file. For files with large content, scanning takes a considerable amount of time, so the first scan is very long. For a compressed package, for example, the prior art follows the normal scanning procedure, decompressing and scanning the items in the package one by one to make sure the file is safe. Scanning a single compressed package therefore takes a long time, which prolongs the whole scan; and for the user, the CPU computation and disk operations occupied by the long scan reduce system speed and interfere with the use of the computer.
Summary of the invention
The technical problem to be solved by the application is to provide a quick file scanning method and system that solve the problem of a long first scan on a new client.
To solve the above problem, the application discloses a quick file scanning method, comprising:
Obtaining a data packet; the data packet comprises safe-file characteristic information used for determining whether a file in the system is a safe file;
Scanning the file characteristic information of the files in the system one by one; if the currently scanned file characteristic information matches safe-file characteristic information in the data packet that marks a file as safe, skipping the antivirus scan of the current file and continuing with the next file.
Wherein, the safe-file characteristic information is obtained in a processing center server by collecting statistics on the safe-file characteristic information sent by each client, comprising:
Receiving the characteristic information of safe files sent by each client; the characteristic information comprises the characteristic information of files determined to be safe when a client performs a complete file scan;
For identical characteristic information, counting its number of repetitions;
Extracting the characteristic information whose repetition count is greater than, or greater than or equal to, a quantity threshold;
Wherein, when a client performs a complete file scan:
When the scanned file is safe, obtaining the characteristic information of the file, the characteristic information comprising the file name, file size, file modification time and file content description information;
Performing a CRC calculation on the file name to obtain a file name CRC value;
Performing a message digest algorithm calculation on the file content description information to obtain content matching information;
Sending characteristic information comprising the file size, file modification time, file name CRC value and content matching information.
Preferably, for identical characteristic information, counting its number of repetitions comprises:
Sorting the received characteristic information by total repetition count;
For each group of identical characteristic information, deduplicating the identical characteristic information sent by the same client;
For each group of identical characteristic information after deduplication, counting the number of repetitions of each item of characteristic information.
Preferably, after extracting the characteristic information whose repetition count is greater than, or greater than or equal to, the quantity threshold, the method comprises:
Storing the extracted characteristic information in a data file, and generating the data packet from the data file.
Preferably, before the data packet is generated, the method further comprises:
Receiving the characteristic information of an unsafe file sent by a client, and either not storing this characteristic information in the data file or deleting the identical characteristic information from the data packet.
Preferably, after the previous data packet has been generated, once characteristic information of new safe files has been counted, it is updated into the previous data packet.
Preferably, the characteristic information of the file is matched against the characteristic information in the data packet as follows:
In the cache, matching begins with the characteristic information that has the highest matching efficiency.
Preferably, when the file size and file modification time match an item of characteristic information in the database, the CRC calculation is performed on the file name to obtain a file name CRC value, and this file name CRC value is matched against the file name CRC value of that item of characteristic information;
When the file name CRC value matches the file name CRC value of that item, the message digest algorithm calculation is performed on the file content description information to obtain content matching information, and this content matching information is matched against the content matching information of that item of characteristic information.
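The staged comparison described above can be illustrated with a minimal Python sketch. It assumes the data packet has been loaded into memory as a list of records; the names `Feature`, `build_index` and `staged_match` are hypothetical and do not come from the application, and `zlib.crc32` and `hashlib.md5` merely stand in for the CRC and message digest calculations, whose exact variants the text does not fix. The point is only the ordering: the cheap comparison of size and modification time runs first, and the CRC and the digest are computed only for files that pass it.

```python
import hashlib
import zlib
from typing import NamedTuple


class Feature(NamedTuple):
    """One record of the data packet (field names are illustrative)."""
    file_size: int
    modified: str
    name_crc: int
    content_md5: str


def build_index(features):
    """Group packet records by (size, modification time) for a cheap first lookup."""
    index = {}
    for feature in features:
        index.setdefault((feature.file_size, feature.modified), []).append(feature)
    return index


def staged_match(index, file_size, modified, file_name, descriptor: bytes) -> bool:
    """Return True if the file matches a packet record, cheapest checks first."""
    # Stage 1: size and modification time, no hashing needed.
    candidates = index.get((file_size, modified))
    if not candidates:
        return False
    # Stage 2: CRC of the file name, computed only after stage 1 succeeds.
    name_crc = zlib.crc32(file_name.encode("utf-8"))
    candidates = [c for c in candidates if c.name_crc == name_crc]
    if not candidates:
        return False
    # Stage 3: digest of the content description information.
    digest = hashlib.md5(descriptor).hexdigest()
    return any(c.content_md5 == digest for c in candidates)
```

Indexing by size and modification time is one possible way to make the first stage cheap; the application itself only states the order of the comparisons.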
Preferably, when the first client performs its first scan, the first client is prompted to choose whether to use quick scanning; if quick scanning is chosen, the characteristic information of the files is scanned and the data packet is called for matching.
Preferably, when the first client scans, the characteristic information of the files found safe in the current scan result is stored in the safe-file information list of the data packet; on the next scan, the first client scans according to the record left by the previous scan.
Correspondingly, the application also discloses a quick file scanning system, comprising:
A first client, the first client comprising:
An acquisition module, used for obtaining a data packet; the data packet comprises safe-file characteristic information used for determining whether a file in the system is a safe file;
A scan matching module, used for scanning the file characteristic information of the files in the system one by one and, if the currently scanned file characteristic information matches safe-file characteristic information in the data packet that marks a file as safe, skipping the antivirus scan of the current file and continuing with the next file;
Wherein, the quick file scanning system further comprises: a second client group and a processing center server; the processing center server is used for collecting statistics on the safe-file characteristic information sent to it by each client of the second client group, to obtain the characteristic information in the data packet; each client of the second client group is used for sending the characteristic information of safe files; the processing center server comprises:
A characteristic information module, used for receiving the characteristic information of safe files sent by each client of the second client group; the characteristic information comprises the characteristic information of files determined to be safe when a client performs a complete file scan;
A statistics module, used for counting, for identical characteristic information, its number of repetitions;
An extraction module, used for extracting the characteristic information whose repetition count is greater than, or greater than or equal to, a quantity threshold;
Wherein, each client of the second client group comprises:
A characteristic acquisition unit, used for obtaining the characteristic information of a file when the scanned file is safe, the characteristic information comprising the file name, file size, file modification time and file content description information;
A file name calculation unit, used for performing a CRC calculation on the file name to obtain a file name CRC value;
A content description information calculation unit, used for performing a message digest algorithm calculation on the file content description information to obtain content matching information;
A characteristic information sending unit, used for sending characteristic information comprising the file size, file modification time, file name CRC value and content matching information.
Preferably, the statistics module comprises: a sorting unit, used for sorting the received characteristic information by total repetition count;
A deduplication unit, used for deduplicating, for each group of identical characteristic information, the identical characteristic information sent by the same client;
A counting unit, used for counting, for each group of identical characteristic information after deduplication, the number of repetitions of each item of characteristic information.
Preferably, after the extraction module, the system further comprises:
A generation unit, used for storing the extracted characteristic information in a data file and generating the data packet from the data file.
Preferably, before the data packet is generated, the system further comprises:
A removal unit, used for receiving the characteristic information of an unsafe file sent by a client, and either not storing this characteristic information in the data file or deleting the identical characteristic information from the data packet.
Preferably, the system further comprises:
An update module, used for updating the previous data packet, after it has been generated, once characteristic information of new safe files has been counted.
Preferably, the characteristic information of the file is matched against the characteristic information in the data packet as follows:
In the cache, matching begins with the characteristic information that has the highest matching efficiency.
Preferably, when the file size and file modification time match an item of characteristic information in the database, the CRC calculation is performed on the file name to obtain a file name CRC value, and this file name CRC value is matched against the file name CRC value of that item of characteristic information;
When the file name CRC value matches the file name CRC value of that item, the message digest algorithm calculation is performed on the file content description information to obtain content matching information, and this content matching information is matched against the content matching information of that item of characteristic information.
Preferably, the system further comprises:
A prompting module, used for prompting the first client, at the first scan, to choose whether to use quick scanning; if quick scanning is chosen, the characteristic information of the files is scanned and the data packet is called for matching.
Preferably, the system further comprises: an update recording module, used for storing, when the first client scans, the characteristic information of the files found safe in the current scan result in the safe-file information list of the data packet; on the next scan, the first client scans according to the safe-file record left by the previous scan.
Compared with the prior art, the application has the following advantages:
By using a data packet that comprises safe-file characteristic information for determining whether a file in the system is a safe file, the application lets a new user, when scanning for the first time, skip any file whose characteristic information is identical to an entry in the data packet; such files are safe and would take a long time to scan, so the first scan is shortened.
Description of drawings
Fig. 1 is a schematic flow chart of a quick file scanning method of the application;
Fig. 2 is a schematic flow chart of a preferred data packet generation method of the application;
Fig. 3 is an example of the header data of a compressed package;
Fig. 4 is a schematic structural diagram of a quick file scanning system of the application;
Fig. 5 is a schematic structural diagram of a preferred quick file scanning system of the application.
Embodiment
To make the above objects, features and advantages of the application clearer and easier to understand, the application is described in further detail below with reference to the drawings and specific embodiments.
In practice, when a number of clients above a certain order of magnitude (for example 10^5) have all scanned a file with the same features (for example file name, file size, file modification time, file content description information, content information and so on) and found the file to be safe, a file with the same features used by other users is essentially also safe. Based on this property, the application collects, from a very large number of users, the characteristic information of files that are safe and that take a long time to scan completely, and then generates from that characteristic information a data packet to be matched against (an upgrade patch or the like). After obtaining the data packet, a user can scan the characteristic information of files and match it against the characteristic information in the data packet; files that match, which would take a long time to scan normally, can be skipped, thereby saving scanning time.
With reference to Fig. 1, which shows a schematic flow chart of a quick file scanning method of the application, the method comprises:
Step 110, obtain a data packet; the data packet comprises safe-file characteristic information used for determining whether a file in the system is a safe file.
The new client first obtains the data packet containing the safe-file characteristic information and can then scan according to this data packet to shorten its first scan. In this application the new client is described as the first client. The first client mainly includes clients on which no antivirus software has been installed and clients on which antivirus software has been installed but no global file scan has been performed; it may also include clients on which antivirus software has been installed and a global file scan has been performed, but on which new, unscanned files have appeared since the last global file scan.
The first client can obtain the data packet by installing an installation package that contains it, or by upgrading antivirus software that is already installed. In practice, the first client may never have performed a complete scan, or may have acquired many new files during use that have not been completely scanned; when such a client wants to perform a quick scan, it can do so with the data packet of the application.
Wherein, the characteristic information in the data packet for determining that a file in the system is a safe file can be obtained by collecting statistics on the complete scan results of individual clients. For example, given the complete scan results of a plurality of users, for a file that has the same characteristic information on each client (such as file size, file modification time, file name CRC value and content matching information), if every client's scan result for the file is safe, the characteristic information can be stored in the data packet and used to determine that a file in a system with this characteristic information is a safe file.
Preferably, the characteristic information is obtained in the processing center server by collecting statistics on the safe-file characteristic information sent by each client. That is, the processing center server automatically collects and analyzes the characteristic information of files confirmed safe by each client's complete scan, and generates the corresponding data packet from the characteristic information whose count exceeds a certain threshold. In this application the data packet must first be generated. With reference to Fig. 2, which shows a schematic flow chart of a preferred data packet generation method of the application, the method comprises:
Step 210, receive the characteristic information of safe files sent by each client; the characteristic information comprises the characteristic information of files determined to be safe when a client performs a complete file scan.
In practice there is a second client group comprising a large number of clients, each of which sends to the processing center server the qualifying characteristic information obtained after completely scanning the files in its own system. That is, when a client of the second client group chooses to completely scan the files on its computer, it can send to the processing center server the characteristic information of safe files whose scanning time is greater than, or greater than or equal to, a certain time threshold.
In practice, the application collects statistics, over the normal scans of a very large number of clients, on the characteristic information of files that take a long time to scan, and then generates the data packet to be matched against from the characteristic information whose count is greater than, or greater than or equal to, the threshold. When a new user scans for the first time and encounters a file whose characteristic information is identical to an entry in the data packet, the file, which is safe and would take a long time to scan, can be skipped, which shortens the first scan.
Preferably, when a client performs a complete file scan:
The complete scan is an antivirus scan of the full content of each file in the client system.
Step S11, when the scanned file is safe, obtain the characteristic information of the file, the characteristic information comprising the file name, file size, file modification time and file content description information.
When a client chooses a complete file scan, the scanning engine scans each file on the client computer one by one. For many files this is slow. For a compressed package, for example, the scanning engine must first decompress the package inside the engine, as the package format requires, and then run an antivirus scan on each file contained in it; in general this takes a considerable amount of time. Likewise, for a software installation package, the scanning engine must extract the contents of the installation package before scanning them, which also takes a considerable amount of time. Therefore, when the time a client spends scanning a whole file is greater than, or greater than or equal to, a threshold, the characteristic information of the current file can be obtained; the characteristic information comprises the file name, file size, file modification time and file content description information.
That is, when the scanning time is greater than, or greater than or equal to, the time threshold and the file is safe, the characteristic information of the file is obtained. To calculate the time a client spends scanning a file, the system API function GetTickCount can be called at the time point when scanning of the file begins and again at the time point when it ends, and the scanning time is the difference between the two. For example, if scanning of a file begins at 21:50:30.000 on Tuesday, January 17, 2012 and ends at 21:51:30.000 on the same day, the client's scanning time for the current file is 1 minute. If the time threshold is set to 30 seconds, the client obtains the characteristic information of this file, comprising the file name, file size, file modification time and file content description information.
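The timing rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the application's implementation: Python's `time.monotonic` stands in for the Windows `GetTickCount` call named in the text, the function name `timed_scan` and its `scan` callback are hypothetical, and the sketch uses the "greater than or equal to" variant of the threshold comparison. In the worked example from the text, the elapsed time is 60 seconds, which exceeds the 30-second threshold, so the file would be reported.

```python
import time


def timed_scan(scan, path, threshold_seconds=30.0):
    """Run `scan(path)` and decide whether the file qualifies for reporting.

    `scan` is a caller-supplied function that returns True when the file is safe.
    Returns (is_safe, elapsed_seconds, should_report).
    """
    start = time.monotonic()  # time point when scanning begins
    is_safe = scan(path)
    elapsed = time.monotonic() - start  # time point when scanning ends, minus start
    # Only safe files whose scan took at least the threshold are reported.
    should_report = is_safe and elapsed >= threshold_seconds
    return is_safe, elapsed, should_report
```

A monotonic clock is used so that adjustments to the wall clock during a scan cannot distort the measured duration.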
The file content description information accounts for only a very small part of the whole file, so the time a client spends scanning the content description information is far less than the time needed to scan the whole file. For a compressed file, for example, the file content description information is in the header data of the compressed package; during scanning, based on the number of bytes that the header data indicates the content description information occupies, the client only needs to scan the byte address range where the header data of the current file is located. With reference to Fig. 3, which is an example of the header data of a compressed package, the header data describes the files in the package (file name, size, data check value and so on); as long as these key data have not changed, the content of the compressed package has not changed. As another example, for a software installation package the content description information is in the tail data of the installation package; during scanning, the corresponding byte address range can be scanned according to the number of bytes that the file indicates the tail data occupies.
Step S12, perform a CRC calculation on the file name to obtain a file name CRC value.
Because the file name touches on the user's privacy, the application performs a CRC (Cyclic Redundancy Check) calculation on the file name to obtain a file name CRC value that is not human-readable.
Step S13, perform a message digest algorithm calculation on the file content description information to obtain content matching information.
Although the file content description information is very small relative to the whole file, matching on the description information itself could still take relatively long when the description information is large. The application therefore performs a message digest algorithm (Message-Digest Algorithm) calculation on the file content description information to obtain content matching information. The application generally uses MD5 (Message-Digest Algorithm 5), so the content matching information obtained is an MD5 value. Matching by MD5 value greatly reduces matching time, which keeps the comparison fast while still ensuring the safety of the file.
Step S14, send characteristic information comprising the file size, file modification time, file name CRC value and content matching information.
After the above characteristic information comprising the file size, file modification time, file name CRC value and content matching information has been obtained, it can be sent to the processing center server to await processing.
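Steps S11 to S14 can be sketched as follows. This is a minimal example under assumptions: the record layout `Feature` and the function `make_feature` are hypothetical, `zlib.crc32` is used as one possible CRC (the application does not specify the polynomial or width), and the content description information is assumed to be available as bytes. Note that the file name itself never enters the record, only its CRC value.

```python
import hashlib
import zlib
from typing import NamedTuple


class Feature(NamedTuple):
    """The four values sent to the processing center server (names are illustrative)."""
    file_size: int
    modified: str
    name_crc: int
    content_md5: str


def make_feature(file_name: str, file_size: int, modified: str, descriptor: bytes) -> Feature:
    """Build the characteristic information for one safe file."""
    # Step S12: CRC of the file name, so the readable name is not transmitted.
    name_crc = zlib.crc32(file_name.encode("utf-8"))
    # Step S13: MD5 digest of the content description information.
    content_md5 = hashlib.md5(descriptor).hexdigest()
    # Step S14: the record that would be sent.
    return Feature(file_size, modified, name_crc, content_md5)
```

Because both calculations are deterministic, two clients holding the same file produce identical records, which is what allows the server to count them.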
The application environment of the application comprises a processing center server that provides data such as the relevant antivirus software installation packages and upgrade packages, and a large number of clients connected to the processing center server over a network; the above steps can therefore be performed for every complete file scan carried out by an online user.
Step 220, for identical characteristic information, count its number of repetitions.
In this step, the processing center server can deduplicate the data it receives, that is, it deduplicates multiple data items with the same characteristic information sent by the same client so that they count as 1.
Preferably, for identical characteristic information, counting its number of repetitions comprises:
S21, sort the received characteristic information by total repetition count.
First, the received characteristic information is sorted by total repetition count. For example, consider the characteristic information (m, 100kb, 2012/1/11/21:50:30:10, n) of a file on some client, where m is the file name CRC value and n is the content matching information of the file content description information, that is, the MD5 value. Each item of characteristic information is associated with the client that sent it; for client A, for example, the information it sends to the processing center server can be represented in the form A-(m, 100kb, 2012/1/11/21:50:30:10, n). For all received items with the identical characteristic information (m, 100kb, 2012/1/11/21:50:30:10, n), the processing center server counts the total number of repetitions and then sorts by that total. During this sorting, the data items with the same characteristic information sent by the same client can easily be found.
S22, for each group of identical characteristic information, deduplicate the identical characteristic information sent by the same client.
For multiple data items with the same characteristic information sent by the same client, for example if client A has sent the characteristic information (m, 100kb, 2012/1/11/21:50:30:10, n) 10 times, the items are deduplicated so that the processing center server records the characteristic information repeatedly sent by this client as 1 occurrence. This guarantees that the number of users counted for a given item of characteristic information is accurate, and thus the validity of the application.
S23, for each group of identical characteristic information after deduplication, count the number of repetitions of each item of characteristic information.
After deduplication, the number of repetitions of each item of characteristic information can be counted; this number equals the number of clients whose scans produced that characteristic information.
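The effect of steps S21 to S23 can be shown with a short Python sketch. It assumes reports arrive as `(client_id, feature)` pairs, and the function name `count_clients_per_feature` is hypothetical. The sketch deduplicates with a set instead of sorting first, which is one of the "other methods" of deduplication rather than the sorting procedure described above; the resulting counts are the same.

```python
from collections import Counter


def count_clients_per_feature(reports):
    """Count, for each feature, how many distinct clients reported it.

    `reports` is an iterable of (client_id, feature) pairs; a client that
    sends the same feature several times is counted once.
    """
    unique_reports = set(reports)  # S22: drop repeats from the same client
    # S23: one count per remaining (client, feature) pair
    return Counter(feature for _client, feature in unique_reports)
```

With client A sending the same characteristic information 10 times and client B sending it once, the count for that characteristic information is 2, matching the number of clients that reported it.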
The application may also deduplicate and count the repetitions of each item of file characteristic information by other methods; the application does not limit this.
Step 230, extract the characteristic information whose repetition count is greater than, or greater than or equal to, the quantity threshold.
After the repetition count of the characteristic information is obtained, it is compared with the quantity threshold; if the repetition count is greater than the threshold, the characteristic information is stored in the data file. For example, if the repetition count of the aforementioned characteristic information (m, 100kb, 2012/1/11/21:50:30:10, n) is 150,000 and the threshold is set to 100,000, this characteristic information can be extracted.
Step 240, store the extracted characteristic information in a data file, and generate the data packet from the data file.
In this application the data file can take the form of a data table in which each item of characteristic information has multiple dimensions; for (m, 100kb, 2012/1/11/21:50:30:10, n), for example, the item has four dimensions in the table: file size, file modification time, file name CRC value and content matching information. Subsequent scans then match on these four dimensions.
The generated data file can be bundled into the installation package of the antivirus software, or issued in the form of an upgrade patch.
In addition, before the data packet is generated, the method further comprises:
Step S210, receive the characteristic information of an unsafe file sent by a client, and either do not store this characteristic information in the data file or delete the identical characteristic information from the data packet.
When a client finds a virus in a file whose complete scanning time is greater than the threshold, it can also mark the characteristic information of this unsafe file as unsafe and send it to the processing center server; the processing center server then processes none of the data items with this characteristic information and does not store them in the data file. For example, suppose 50,000 clients have sent data with the aforementioned characteristic information (m, 100kb, 2012/1/11/21:50:30:10, n), but the data sent by one of those clients is marked unsafe, meaning a virus was found. The processing center server then does not process the data with the characteristic information (m, 100kb, 2012/1/11/21:50:30:10, n), and does not store that characteristic information in the data packet.
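Steps 230 and S210 together amount to a filter over the per-feature client counts, sketched below in Python. The function name `select_safe_features` and its parameters are hypothetical, and the sketch picks the "greater than or equal to" variant of the threshold comparison; the text allows either. A single unsafe report is enough to exclude a feature, however many clients reported it as safe.

```python
def select_safe_features(client_counts, unsafe_features, quantity_threshold):
    """Choose the features that go into the data packet.

    `client_counts` maps each feature to the number of distinct clients that
    reported it as safe; `unsafe_features` is the set of features that any
    client reported as unsafe.
    """
    return {
        feature
        for feature, count in client_counts.items()
        # Step 230: keep features reported by enough clients.
        if count >= quantity_threshold
        # Step S210: drop any feature that was ever reported unsafe.
        and feature not in unsafe_features
    }
```

For instance, with a threshold of 100,000, a feature counted 150,000 times is kept, one counted 50,000 times is dropped, and one counted 120,000 times but flagged unsafe by any client is also dropped.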
In practice, when the processing center server compiles statistics on the characteristic information sent by the clients, it generally does so over a period of fixed length and generates the data packet from the secure-file characteristic information counted within that period. For example, the processing center server may compile statistics on the characteristic information sent by the clients with one day as the time unit and then generate a data packet, which can be provided to the first client for use in the next period.
Step 120: scan the file characteristic information of the files in the system one by one; if the characteristic information of the file currently being scanned matches secure-file characteristic information in the data packet that marks a file as secure, skip the antivirus scan of the current file and continue with the next file.
Here the antivirus scan is, for example, a scan for malicious code or viruses contained in the file content.
After the data packet has been obtained, rapid scanning can be carried out:
Step S121: the system of the first client scans the characteristic information of the files in the system one by one;
Step S122: match the characteristic information of the file currently being scanned against the secure-file characteristic information in the data packet that marks a file as secure. If they match, proceed to step S123: skip the current file and continue with the next file. If they do not match, proceed to step S124: perform a complete antivirus scan of the current file, that is, scan the entire content of the current file.
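Steps S121 to S124 amount to a loop of the following shape. This is a simplified sketch: the characteristic information is assumed to be already computed for every file and compared as a whole, and `quick_scan`, `full_scan` and the sample paths are hypothetical names.

```python
def quick_scan(files, safe_features, full_scan):
    """Scan files one by one; skip those whose characteristic information is in the packet.

    files: iterable of (path, feature) pairs.
    Returns the paths that received a complete antivirus scan.
    """
    fully_scanned = []
    for path, feature in files:          # S121: walk the files one by one
        if feature in safe_features:     # S122: match against the data packet
            continue                     # S123: skip, go on to the next file
        full_scan(path)                  # S124: complete scan of the file content
        fully_scanned.append(path)
    return fully_scanned

safe_features = {("m", "100kb", "2012/1/11/21:50:30:10", "n")}
files = [
    ("a.dll", ("m", "100kb", "2012/1/11/21:50:30:10", "n")),
    ("b.exe", ("x", "5kb", "2012/1/12/09:00:00:00", "y")),
]
scanned = quick_scan(files, safe_features, full_scan=lambda path: None)
```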
In practice, before scanning, the first client also confirms whether rapid scanning has been selected; if so, it scans the characteristic information of the files and calls the data packet for matching.
That is, the first client can choose between a rapid scan and a complete scan of the files. If rapid scanning is selected, the client scans the characteristic information of each file and calls the data packet to match the characteristic information. When the characteristic information of a file matches characteristic information in the data packet, the current file is skipped and scanning continues with the next file.
In this application, at the first scan the first client can be prompted to choose whether to use rapid scanning; if it is selected, the characteristic information of the files is scanned and the data packet is called for matching.
If the first client selects rapid scanning, then during scanning it first obtains the characteristic information of its files and matches on that, without scanning the full content of the files.
When the first client matches the characteristic information of a file against the characteristic information in the data packet:
Matching is performed in the cache, starting from the characteristic information with the highest matching efficiency.
Because the characteristic information in the data packet is multidimensional, the comparison can start from the dimensions that are cheapest to check, namely those that require no additional calculation, such as the file size and the file modification time. Both are system data that are available while traversing the files and require no extra computation, so proceeding in this way improves the efficiency of matching against the preset cache.
Preferably, matching proceeds as follows:
Step S31: when the file size and file modification time match a piece of characteristic information in the database, perform the CRC calculation on the file name to obtain the file name CRC value, and match this value against the file name CRC value of that characteristic information. The CRC calculation on the file name is an in-memory operation on a very small amount of data.
Matching starts with the dimensions that are most efficient to match. Among the dimensions of the characteristic information in the data packet, the file size and the file modification time require no additional calculation and can be matched directly. When a client scans, it therefore first compares the file size and then the file modification time of the characteristic information it has obtained. If both the file size and the file modification time of the file being scanned are identical, the dimension with the next smallest amount of calculation is compared: the file name CRC value is computed by a CRC operation and matched against the file name CRC value of the characteristic information. If it does not match, a complete scan is performed; if it matches, matching proceeds to the dimension with the larger amount of calculation, that is, to step S32.
Step S32: when the file name CRC value matches the file name CRC value of the characteristic information, perform the message digest algorithm calculation on the file content description information to obtain the content matching information, and match it against the content matching information of that characteristic information.
When the file size, the file modification time and the file name CRC value all match, the message digest algorithm, generally MD5, is applied to the file content description information to obtain the content matching information, which is then matched against the content matching information of the characteristic information. If they match, the current file is skipped and scanning moves on to the next file.
In this application, if any one dimension of the characteristic information fails to match during matching, the file is regarded as not matching and an antivirus scan can be performed on it. Take the four dimensions of the aforementioned characteristic information: file size, file modification time, file name CRC value and content matching information. They are matched in this order: 1, file size; 2, file modification time; 3, file name CRC value; 4, content matching information. The file size of file A is first matched against the first dimension of the characteristic information in the data packet; if the file size does not match, dimensions 2, 3 and 4 need not be matched, and file A requires an antivirus scan. If the file size of file A matches, the file modification time of file A is matched against the second dimension of that characteristic information; if the file modification time does not match, dimensions 3 and 4 need not be matched, and file A requires an antivirus scan. The remaining cases follow by analogy.
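The cheapest-first ordering can be sketched as below. Several details are assumptions made for illustration and are not stated in the text: CRC-32 via `zlib.crc32` as the CRC variant, the first 4096 bytes of the file as the "file content description information", whole-second modification times, and an index keyed by (size, modification time). All function names are hypothetical.

```python
import hashlib
import os
import tempfile
import zlib

def name_crc(path):
    """CRC of the file name only; an in-memory operation on a few bytes."""
    return zlib.crc32(os.path.basename(path).encode("utf-8"))

def content_digest(path, limit=4096):
    """MD5 over an assumed content descriptor: the first `limit` bytes of the file."""
    with open(path, "rb") as fh:
        return hashlib.md5(fh.read(limit)).hexdigest()

def is_known_safe(path, index):
    """Match dimension by dimension, stopping at the first mismatch."""
    st = os.stat(path)
    # Dimensions 1 and 2: size and modification time come free with the traversal.
    candidates = index.get((st.st_size, int(st.st_mtime)))
    if not candidates:
        return False
    # Dimension 3: file name CRC, computed only if size and time matched.
    crc = name_crc(path)
    digests = [digest for c, digest in candidates if c == crc]
    if not digests:
        return False
    # Dimension 4: the content digest is the most expensive step, so it comes last.
    return content_digest(path) in digests

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.bin")
    with open(path, "wb") as fh:
        fh.write(b"hello")
    st = os.stat(path)
    key = (st.st_size, int(st.st_mtime))
    index = {key: [(name_crc(path), content_digest(path))]}
    known = is_known_safe(path, index)
    unknown = is_known_safe(path, {})
    wrong_name = is_known_safe(path, {key: [(name_crc("other.bin"), content_digest(path))]})
```

A file that fails any stage returns `False` without the later, more expensive stages being computed, and would then receive a complete antivirus scan.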
In addition, preferably, each time the first client scans, the characteristic information of the files found secure in the current scan result is stored in the secure file information list of the data packet; at the next scan, the first client scans according to the record left by the previous scan.
When the first client scans with the current data packet, a file whose characteristic information is not contained in the data packet but which is found secure in the current scan can have its characteristic information stored in the secure file information list of the data packet. At its next scan, the first client can then scan faster on the basis of the previous scan result. Conversely, when a client performs a complete scan of a file and finds it unsafe although its characteristic information is in the data packet, that characteristic information can be deleted from the secure file information list in the data packet.
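The bookkeeping after each complete scan can be sketched as follows, assuming the secure file information list is held as a set; `update_safe_list` and the sample values are hypothetical.

```python
def update_safe_list(safe_list, feature, is_safe):
    """Record the outcome of a complete scan in the local secure file information list."""
    if is_safe:
        safe_list.add(feature)      # newly verified file: skip it next time
    else:
        safe_list.discard(feature)  # file turned out unsafe: stop trusting this record

safe_list = {("m", "100kb", "2012/1/11/21:50:30:10", "n")}

new_feature = ("x", "5kb", "2012/1/12/09:00:00:00", "y")  # hypothetical record
update_safe_list(safe_list, new_feature, is_safe=True)
update_safe_list(safe_list, ("m", "100kb", "2012/1/11/21:50:30:10", "n"), is_safe=False)
```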
Correspondingly, with reference to Fig. 4, the application also discloses a schematic structural diagram of a rapid file scanning system, comprising:
A first client 410, the first client comprising:
An acquisition module S411, configured to obtain a data packet; the data packet comprises secure-file characteristic information for determining whether a file in the system is a secure file;
A scan matching module S412, configured to scan the file characteristic information of the files in the system one by one and, if the characteristic information of the file currently being scanned matches secure-file characteristic information in the data packet that marks a file as secure, to skip the antivirus scan of the current file and continue with the next file.
With reference to Fig. 5, which shows a schematic structural diagram of a preferred rapid file scanning system of the application, the system comprises:
A first client 510, a second client group 520 and a processing center server 530;
The first client 510 comprises:
An acquisition module S511, configured to obtain a data packet; the data packet comprises secure-file characteristic information for determining whether a file in the system is a secure file;
A scan matching module S512, configured to scan the file characteristic information of the files in the system one by one and, if the characteristic information of the file currently being scanned matches secure-file characteristic information in the data packet that marks a file as secure, to skip the antivirus scan of the current file and continue with the next file;
The processing center server 530 is configured to compile statistics on the secure-file characteristic information that each client sends to the processing center server, so as to obtain the characteristic information in the data packet;
The second client group 520 is configured to send the characteristic information of secure files.
Preferably, the processing center server comprises:
A characteristic information module, configured to receive the characteristic information of secure files sent by each client; the characteristic information comprises the characteristic information of files determined to be secure when a client performs a complete scan of the files;
A statistics module, configured to count the repetitions of identical characteristic information;
An extraction module, configured to extract the characteristic information whose repetition count is greater than, or greater than or equal to, a count threshold.
Preferably, each client of the second client group comprises:
A characteristic acquisition unit, configured to obtain the characteristic information of a file when the scanned file is secure, the characteristic information comprising the file name, the file size, the file modification time and the file content description information;
A file name calculation unit, configured to perform a CRC calculation on the file name to obtain the file name CRC value;
A content description information calculation unit, configured to perform a message digest algorithm calculation on the file content description information to obtain the content matching information;
A characteristic information sending unit, configured to send the characteristic information comprising the file size, the file modification time, the file name CRC value and the content matching information.
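The units listed above could cooperate roughly as in this sketch. CRC-32 (`zlib.crc32`) and MD5 (`hashlib.md5`) are assumed as the concrete algorithms (the text names MD5 only as the usual choice), the content of the descriptor bytes is left to the caller because the text does not define it here, and `Feature` and `build_feature` are hypothetical names.

```python
import hashlib
import zlib
from typing import NamedTuple

class Feature(NamedTuple):
    """The four dimensions that are sent to the processing center server."""
    file_size: int
    modified_time: str
    name_crc: int
    content_match: str

def build_feature(file_name, file_size, modified_time, descriptor):
    """Turn the raw attributes of a file found secure into its characteristic information."""
    name_crc = zlib.crc32(file_name.encode("utf-8"))      # file name calculation unit
    content_match = hashlib.md5(descriptor).hexdigest()   # content description calculation unit
    return Feature(file_size, modified_time, name_crc, content_match)

# Hypothetical sample values.
feature = build_feature(
    "example.dll", 102400, "2012/1/11/21:50:30:10", b"example descriptor bytes"
)
```

Only the CRC of the file name is sent rather than the name itself, which keeps each record small and of fixed size.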
Preferably, the statistics module comprises: a sorting unit, configured to sort the received characteristic information by total repetition count;
A deduplication unit, configured to deduplicate, for each group of identical characteristic information, the identical characteristic information records sent by the same client;
A counting unit, configured to count the repetitions of each piece of characteristic information after deduplication.
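The deduplicate-then-count step can be sketched as follows. It assumes each report arrives as a (client id, characteristic information) pair, which the text does not spell out; `count_repetitions` and the sample data are hypothetical.

```python
from collections import Counter

def count_repetitions(reports):
    """Count how many distinct clients reported each piece of characteristic information."""
    unique = set(reports)  # the same client reporting the same record counts once
    return Counter(feature for _client, feature in unique)

reports = [
    ("client-a", "f1"),
    ("client-a", "f1"),  # duplicate from the same client, dropped by deduplication
    ("client-b", "f1"),
    ("client-b", "f2"),
]
counts = count_repetitions(reports)
ranked = counts.most_common()  # sorted by repetition count, highest first
```

Counting distinct clients rather than raw reports keeps a single client that rescans often from pushing a record over the threshold on its own.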
Preferably, when a client performs a complete scan of a file:
The system API function GetTickCount is called at the time points when scanning of the file starts and when it finishes, and the scan time from the start to the end of scanning the file is calculated from these two values.
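The timing pattern is sketched below. GetTickCount is a Windows API; the sketch substitutes Python's `time.monotonic()` as a portable stand-in, which is an assumption of this example and not what the application calls. `timed_scan` is a hypothetical helper.

```python
import time

def timed_scan(scan, path):
    """Run a complete scan and return its result with the elapsed time in milliseconds."""
    start = time.monotonic()                         # time point when scanning starts
    result = scan(path)
    elapsed_ms = (time.monotonic() - start) * 1000.0  # time point when scanning finishes
    return result, elapsed_ms

# Hypothetical scan function that reports every file as secure.
result, elapsed_ms = timed_scan(lambda path: "secure", "example.dll")
```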
Preferably, the system further comprises, after the extraction module:
A generation unit, configured to store the extracted characteristic information in a data file and to generate the data packet from the data file.
Preferably, the system further comprises, before the data packet is generated:
A removal unit, configured to receive the characteristic information of an unsafe file sent by a client, and either not to store this characteristic information in the data file or to delete the identical characteristic information from the data packet.
Preferably, the characteristic information of a file is matched against the characteristic information in the data packet as follows:
Matching is performed in the cache, starting from the characteristic information with the highest matching efficiency.
Preferably, when the file size and file modification time match a piece of characteristic information in the database, the CRC calculation is performed on the file name to obtain the file name CRC value, and this value is matched against the file name CRC value of that characteristic information;
When the file name CRC value matches the file name CRC value of the characteristic information, the message digest algorithm calculation is performed on the file content description information to obtain the content matching information, which is matched against the content matching information of that characteristic information.
Preferably, the system further comprises:
A prompting module, configured to prompt the first client at the first scan to choose whether to use rapid scanning; if it is selected, the characteristic information of the files is scanned and the data packet is called for matching.
Preferably, the system further comprises: an update recording module, configured to store, when the first client scans, the characteristic information of the files found secure in the current scan result in the secure file information list of the data packet; at the next scan, the first client scans according to the security record left by the previous scan.
Since the system embodiment is basically similar to the method embodiment, it is described relatively briefly; for the relevant parts, refer to the description of the method embodiment.
The embodiments in this specification are described progressively: each embodiment emphasizes its differences from the other embodiments, and for the parts that are identical or similar the embodiments may be referred to one another.
The rapid file scanning method and system provided by the application have been described in detail above. Specific examples have been used to explain the principle and the embodiments of the application, and the description of the embodiments is intended only to help in understanding the method of the application and its core idea. A person of ordinary skill in the art may, following the idea of the application, make changes to the specific embodiments and to the scope of application. In summary, the content of this specification should not be construed as limiting the application.

Claims (18)

1. A rapid file scanning method, characterized by comprising:
Obtaining a data packet; the data packet comprises secure-file characteristic information for determining whether a file in the system is a secure file;
Scanning the file characteristic information of the files in the system one by one; if the characteristic information of the file currently being scanned matches secure-file characteristic information in the data packet that marks a file as secure, skipping the antivirus scan of the current file and continuing with the next file;
Wherein the secure-file characteristic information is obtained in a processing center server by compiling statistics on the secure-file characteristic information sent by each client, comprising:
Receiving the characteristic information of secure files sent by each client; the characteristic information comprises the characteristic information of files determined to be secure when a client performs a complete scan of the files;
Counting the repetitions of identical characteristic information;
Extracting the characteristic information whose repetition count is greater than, or greater than or equal to, a count threshold;
Wherein, when a client performs a complete scan of a file:
When the scanned file is secure, the characteristic information of the file is obtained, the characteristic information comprising the file name, the file size, the file modification time and the file content description information;
A CRC calculation is performed on the file name to obtain the file name CRC value;
A message digest algorithm calculation is performed on the file content description information to obtain the content matching information;
The characteristic information comprising the file size, the file modification time, the file name CRC value and the content matching information is sent.
2. The method according to claim 1, characterized in that counting the repetitions of identical characteristic information comprises:
Sorting the received characteristic information by total repetition count;
For each group of identical characteristic information, deduplicating the identical characteristic information records sent by the same client;
Counting the repetitions of each piece of characteristic information after deduplication.
3. The method according to claim 1, characterized in that, after extracting the characteristic information whose repetition count is greater than, or greater than or equal to, the count threshold, the method comprises:
Storing the extracted characteristic information in a data file, and generating the data packet from the data file.
4. The method according to claim 1, characterized in that, before the data packet is generated, the method further comprises:
Receiving the characteristic information of an unsafe file sent by a client, and either not storing this characteristic information in the data file or deleting the identical characteristic information from the data packet.
5. The method according to claim 1, characterized in that:
After a previous data packet has been generated, the previous data packet is updated once characteristic information of new secure files has been counted.
6. The method according to claim 1, characterized in that the characteristic information of a file is matched against the characteristic information in the data packet as follows:
Matching is performed in the cache, starting from the characteristic information with the highest matching efficiency.
7. The method according to claim 6, characterized in that:
When the file size and file modification time match a piece of characteristic information in the database, the CRC calculation is performed on the file name to obtain the file name CRC value, and this value is matched against the file name CRC value of that characteristic information;
When the file name CRC value matches the file name CRC value of the characteristic information, the message digest algorithm calculation is performed on the file content description information to obtain the content matching information, which is matched against the content matching information of that characteristic information.
8. The method according to claim 1 or 6, characterized in that:
When the first client performs its first scan, the first client is prompted to choose whether to use rapid scanning; if it is selected, the characteristic information of the files is scanned and the data packet is called for matching.
9. The method according to claim 1, characterized in that:
When the first client scans, the characteristic information of the files found secure in the current scan result is stored in the secure file information list of the data packet; at the next scan, the first client scans according to the record left by the previous scan.
10. A rapid file scanning system, characterized by comprising:
A first client, the first client comprising:
An acquisition module, configured to obtain a data packet; the data packet comprises secure-file characteristic information for determining whether a file in the system is a secure file;
A scan matching module, configured to scan the file characteristic information of the files in the system one by one and, if the characteristic information of the file currently being scanned matches secure-file characteristic information in the data packet that marks a file as secure, to skip the antivirus scan of the current file and continue with the next file;
Wherein the rapid file scanning system further comprises a second client group and a processing center server; the processing center server is configured to compile statistics on the secure-file characteristic information that each client of the second client group sends to the processing center server, so as to obtain the characteristic information in the data packet; each client of the second client group is configured to send the characteristic information of secure files; the processing center server comprises:
A characteristic information module, configured to receive the characteristic information of secure files sent by each client of the second client group; the characteristic information comprises the characteristic information of files determined to be secure when a client performs a complete scan of the files;
A statistics module, configured to count the repetitions of identical characteristic information;
An extraction module, configured to extract the characteristic information whose repetition count is greater than, or greater than or equal to, a count threshold;
Wherein each client of the second client group comprises:
A characteristic acquisition unit, configured to obtain the characteristic information of a file when the scanned file is secure, the characteristic information comprising the file name, the file size, the file modification time and the file content description information;
A file name calculation unit, configured to perform a CRC calculation on the file name to obtain the file name CRC value;
A content description information calculation unit, configured to perform a message digest algorithm calculation on the file content description information to obtain the content matching information;
A characteristic information sending unit, configured to send the characteristic information comprising the file size, the file modification time, the file name CRC value and the content matching information.
11. The system according to claim 10, characterized in that the statistics module comprises: a sorting unit, configured to sort the received characteristic information by total repetition count;
A deduplication unit, configured to deduplicate, for each group of identical characteristic information, the identical characteristic information records sent by the same client;
A counting unit, configured to count the repetitions of each piece of characteristic information after deduplication.
12. The system according to claim 10, characterized by further comprising, after the extraction module:
A generation unit, configured to store the extracted characteristic information in a data file and to generate the data packet from the data file.
13. The system according to claim 10, characterized by further comprising, before the data packet is generated:
A removal unit, configured to receive the characteristic information of an unsafe file sent by a client, and either not to store this characteristic information in the data file or to delete the identical characteristic information from the data packet.
14. The system according to claim 10, characterized by further comprising:
An update module, configured to update the previous data packet, after a previous data packet has been generated, once characteristic information of new secure files has been counted.
15. The system according to claim 10, characterized in that the characteristic information of a file is matched against the characteristic information in the data packet as follows:
Matching is performed in the cache, starting from the characteristic information with the highest matching efficiency.
16. The system according to claim 15, characterized in that:
When the file size and file modification time match a piece of characteristic information in the database, the CRC calculation is performed on the file name to obtain the file name CRC value, and this value is matched against the file name CRC value of that characteristic information;
When the file name CRC value matches the file name CRC value of the characteristic information, the message digest algorithm calculation is performed on the file content description information to obtain the content matching information, which is matched against the content matching information of that characteristic information.
17. The system according to claim 10, characterized by further comprising:
A prompting module, configured to prompt the first client at the first scan to choose whether to use rapid scanning; if it is selected, the characteristic information of the files is scanned and the data packet is called for matching.
18. The system according to claim 10, characterized by further comprising:
An update recording module, configured to store, when the first client scans, the characteristic information of the files found secure in the current scan result in the secure file information list of the data packet; at the next scan, the first client scans according to the security record left by the previous scan.
CN2012100267600A 2012-02-07 2012-02-07 Quick file scanning method and quick file scanning system Active CN102609515B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN2012100267600A CN102609515B (en) 2012-02-07 2012-02-07 Quick file scanning method and quick file scanning system
PCT/CN2013/071383 WO2013117151A1 (en) 2012-02-07 2013-02-05 Method and system for rapidly scanning files
US14/377,014 US9355250B2 (en) 2012-02-07 2013-02-05 Method and system for rapidly scanning files

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2012100267600A CN102609515B (en) 2012-02-07 2012-02-07 Quick file scanning method and quick file scanning system

Publications (2)

Publication Number Publication Date
CN102609515A CN102609515A (en) 2012-07-25
CN102609515B true CN102609515B (en) 2013-10-16

Family

ID=46526887

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012100267600A Active CN102609515B (en) 2012-02-07 2012-02-07 Quick file scanning method and quick file scanning system

Country Status (1)

Country Link
CN (1) CN102609515B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9355250B2 (en) * 2012-02-07 2016-05-31 Beijing Qihoo Technology Company Limited Method and system for rapidly scanning files
CN102970420B (en) * 2012-11-07 2014-01-22 广东欧珀移动通信有限公司 Picture filter method and system in Android system
CN104063660B (en) * 2013-03-20 2016-06-22 腾讯科技(深圳)有限公司 A kind of virus scan method, device and terminal
CN103198253B (en) * 2013-03-29 2016-03-30 北京奇虎科技有限公司 The method and system of operating file
CN103473350B (en) * 2013-09-24 2016-10-05 北京奇虎科技有限公司 Document handling method and equipment
CN104573512B (en) * 2013-10-23 2019-02-05 腾讯科技(深圳)有限公司 A kind of method and terminal of feature detection
CN103714269A (en) * 2013-12-02 2014-04-09 百度国际科技(深圳)有限公司 Virus identification method and device
CN103678692B (en) * 2013-12-26 2018-04-27 北京奇虎科技有限公司 A kind of security sweep method and device for downloading file
CN104866289A (en) * 2014-02-21 2015-08-26 北京奇虎科技有限公司 Method and device for software identification
CN106502582A (en) * 2016-09-30 2017-03-15 维沃移动通信有限公司 A kind of directory scan method and mobile terminal
WO2018058517A1 (en) * 2016-09-30 2018-04-05 北京小米移动软件有限公司 Secure scanning method and apparatus, and electronic device
CN109101644A (en) * 2018-08-21 2018-12-28 上海新炬网络信息技术股份有限公司 A kind of sound state journal file scanning collecting method
CN110929110B (en) * 2019-11-13 2023-02-21 北京北信源软件股份有限公司 Electronic document detection method, device, equipment and storage medium
CN111061656A (en) * 2019-11-13 2020-04-24 杭州安恒信息技术股份有限公司 Secondary rapid disinfection method with low resource consumption
CN113360904A (en) * 2021-05-17 2021-09-07 杭州美创科技有限公司 Unknown virus detection method and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1964357A (en) * 2006-12-04 2007-05-16 北京金山软件有限公司 A method to process file and information processing device
CN101308533A (en) * 2008-06-30 2008-11-19 华为技术有限公司 Method, apparatus and system for virus checking and killing
CN101639880A (en) * 2008-07-31 2010-02-03 华为技术有限公司 File test method and device
CN101996259A (en) * 2010-12-12 2011-03-30 成都东方盛行电子有限责任公司 Method for deeply analyzing data based on white list mechanism
CN102222201A (en) * 2011-06-03 2011-10-19 奇智软件(北京)有限公司 File scanning method and device thereof

Also Published As

Publication number Publication date
CN102609515A (en) 2012-07-25

Similar Documents

Publication Publication Date Title
CN102609515B (en) Quick file scanning method and quick file scanning system
CN102609653B (en) File quick-scanning method and file quick-scanning system
CN102594809B (en) Method and system for rapidly scanning files
CN112019575B (en) Data packet processing method and device, computer equipment and storage medium
CN104640092A (en) Spam short message identifying method, client end, cloud server and system
EP3771171A1 (en) Website detection method and system
CN105516196A (en) HTTP message data-based parallelization network anomaly detection method and system
CN106789849B (en) CC attack identification method, node and system
CN103379099A (en) Hostile attack identification method and system
CN106790085B (en) Vulnerability scanning method, device and system
CN111147489B (en) Link camouflage-oriented fishfork attack mail discovery method and device
CN105447113A (en) Big data based informatiion analysis method
CN111092902A (en) Attachment camouflage-oriented fishfork attack mail discovery method and device
CN102750476B (en) Method and system for identifying file security
CN102090039B (en) A method of performing data mediation, and an associated computer program product, data mediation device and information system
CN102882988A (en) Method, device and equipment for acquiring address information of resource information
CN104978257A (en) Computer device elastic scoring method and computer device elastic scoring device
CN114143060A (en) Information security prediction method based on artificial intelligence prediction and big data security system
CN101997830A (en) Distributed intrusion detection method, device and system
CN102299869A (en) Method, client and system for storing network link in instant messaging
CN104715197A (en) Quick file scanning method and system
CN105516114B (en) Method and device for scanning vulnerability based on webpage hash value and electronic equipment
CN105430623A (en) Monitoring method, device and system for RCS junk message
CN105205390A (en) Security check system and security check method of mobile terminal
CN107959662B (en) Website security detection method and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
ASS Succession or assignment of patent right

Owner name: BEIJING QIHOO TECHNOLOGY CO., LTD.

Free format text: FORMER OWNER: QIZHI SOFTWARE (BEIJING) CO., LTD.

Effective date: 20121025

Owner name: QIZHI SOFTWARE (BEIJING) CO., LTD.

Effective date: 20121025

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 100016 CHAOYANG, BEIJING TO: 100088 XICHENG, BEIJING

TA01 Transfer of patent application right

Effective date of registration: 20121025

Address after: 100088 Room 112, Block D, No. 28 Xinjiekouwai Street, Xicheng District, Beijing (Desheng Park)

Applicant after: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Applicant after: Qizhi Software (Beijing) Co.,Ltd.

Address before: 100016 Unit on the 4th floor, Building C, No. 14 Jiuxianqiao Road, Chaoyang District, Beijing

Applicant before: Qizhi Software (Beijing) Co.,Ltd.

C14 Grant of patent or utility model
GR01 Patent grant
C53 Correction of patent of invention or patent application
CB03 Change of inventor or designer information

Inventor after: Zhou Hongyi

Inventor after: Zou Guiqiang

Inventor after: Fu Min

Inventor before: Zou Guiqiang

Inventor before: Fu Min

COR Change of bibliographic data

Free format text: CORRECT: INVENTOR; FROM: ZOU GUIQIANG; FU MIN TO: ZHOU HONGYI; ZOU GUIQIANG; FU MIN

CP01 Change in the name or title of a patent holder

Address after: 100088 Room 112, Block D, No. 28 Xinjiekouwai Street, Xicheng District, Beijing (Desheng Park)

Patentee after: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Patentee after: Beijing Qizhi Business Consulting Co.,Ltd.

Address before: 100088 Room 112, Block D, No. 28 Xinjiekouwai Street, Xicheng District, Beijing (Desheng Park)

Patentee before: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Patentee before: Qizhi Software (Beijing) Co.,Ltd.

TR01 Transfer of patent right

Effective date of registration: 20220329

Address after: 100016 1773, 15 / F, 17 / F, building 3, No.10, Jiuxianqiao Road, Chaoyang District, Beijing

Patentee after: 360 Digital Security Technology Group Co.,Ltd.

Address before: 100088 Room 112, Block D, No. 28 Xinjiekouwai Street, Xicheng District, Beijing (Desheng Park)

Patentee before: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Patentee before: Beijing Qizhi Business Consulting Co.,Ltd.