CN103914655A - Downloaded file security detection method and device - Google Patents


Info

Publication number
CN103914655A
Authority
CN
China
Prior art keywords
file
download
characteristic
cloud server
compression
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410098964.4A
Other languages
Chinese (zh)
Inventor
魏志江
孙晓骏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd and Qizhi Software Beijing Co Ltd
Priority to CN201410098964.4A
Publication of CN103914655A
Priority to PCT/CN2014/095951 (WO2015139507A1)
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562 Static detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/30 Network architectures or network communication protocols for network security for supporting lawful interception, monitoring or retaining of communications or communication related information

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • General Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computing Systems (AREA)
  • Technology Law (AREA)
  • Information Transfer Between Computers (AREA)
  • Storage Device Security (AREA)

Abstract

The invention discloses a method and device for detecting the security of downloaded files. The method comprises the following steps: a client collects the download scene features and file features of a downloaded file and uploads them to a cloud server; the cloud server matches the download scene features and file features against cloud rules in a rule engine to obtain a corresponding matching result, the cloud rules being obtained by comprehensively analyzing the download scene features and file features of file samples; and the cloud server delivers the matching result to the client. Compared with the prior art, the method and device are effective at detecting downloaded files whose feature codes are not recorded in a security feature code library, as well as other unknown files, and are also effective at detecting sudden-outbreak virus files.

Description

A method and device for detecting the security of downloaded files
Technical field
The present invention relates to the field of network security technology, and in particular to a method and device for detecting the security of downloaded files.
Background art
At present, with social progress and the development of technology, people increasingly use Internet-connected terminals to obtain information, including browsing information and downloading files. Downloading virus-carrying files from the network has become the main route by which computers are infected, so the security of downloaded files is receiving more and more attention.
To prevent virus-carrying files from intruding into a system, one prior-art detection method quickly determines the security of a downloaded executable file by analyzing its MD5 (Message Digest Algorithm 5) feature code. Specifically, when the security control server performs a security analysis of a downloaded file in response to a client request, it first scans the feature code of the executable file and then judges whether the feature code is in the server's security feature code library. If it is, the executable file has not been tampered with and is an original, safe executable file, and the relevant security information is returned directly to the client. If it is not, the security control server analyzes the executable file, which may include checking whether the file might steal user privacy, resist uninstallation, restrict the use of other software, connect to the network automatically, send SMS or MMS messages automatically, slow the system down, or even be a virus or Trojan. If so, the server determines the risk posed by the executable file and returns the determination result to the client; if not, it returns to the client a message that the security status is unknown.
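The prior-art lookup described above can be sketched roughly as follows. This is a minimal illustration only: the library contents, the function names and the "needs_analysis" return value are assumptions made for the example, and the server's follow-up behavioral analysis is not modeled.

```python
import hashlib

# Hypothetical security feature code library: MD5 digests of known-good executables.
SAFE_FEATURE_CODES = {hashlib.md5(b"known good program").hexdigest()}

def md5_feature_code(data: bytes) -> str:
    """Return the MD5 digest that serves as the file's feature code."""
    return hashlib.md5(data).hexdigest()

def prior_art_check(data: bytes) -> str:
    """Return 'safe' if the feature code is in the library, else 'needs_analysis'."""
    if md5_feature_code(data) in SAFE_FEATURE_CODES:
        return "safe"
    # A file missing from the library would go on to further server-side analysis.
    return "needs_analysis"
```

The sketch makes the limitation visible: any file whose digest has not yet been collected falls through to the slow path, however dangerous it is.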
This detection method can quickly determine the security of executable files whose feature codes are in the security feature code library. However, collection into the library lags behind: newly emerging virus files cannot be indexed in the library in time, so the ability to detect executable files whose feature codes are not in the library is very limited. In addition, the method applies only to security detection of executable files; its ability to detect files in formats such as compressed packages, *.doc and *.txt is limited.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a method and device for detecting the security of downloaded files that overcome the above problems or at least partially solve them.
According to one aspect of the present invention, a method for detecting the security of downloaded files is provided, comprising:
A client collects the download scene features and file features of a downloaded file and uploads them to a cloud server;
The cloud server matches the download scene features and file features against cloud rules in a rule engine to obtain a corresponding matching result, wherein the cloud rules are obtained by comprehensively analyzing the download scene features and file features of file samples;
The cloud server delivers the matching result to the client.
Optionally, the download scene features comprise one or more of the following: the download link, the type of download tool or instant messaging tool, and whether the client is in an online shopping or payment mode.
Optionally, the file samples are collected in one or more of the following ways:
Actively crawling download links;
Obtaining file samples reported by users;
Obtaining file samples provided by cooperating third-party websites.
Optionally, the downloaded file comprises a compressed package file, and the file features of the compressed package file comprise: the file header format, the file format before and after compression, the file name before and after compression, and the file size before and after compression;
The cloud rules comprise rules generated by comparing the file format, file name and file size of compressed package file samples before and after compression.
Optionally, the step of uploading to the cloud server is as follows: the client encapsulates the collected download scene features and file features of the downloaded file in a data packet in the form of a string, and uploads the data packet to the cloud server;
The step in which the cloud server matches the download scene features and file features against the cloud rules in the rule engine to obtain a corresponding matching result comprises:
The cloud server parses the data packet to obtain the corresponding string;
The string is input into at least one decision machine corresponding to the cloud rules and into training models equal in number to the decision machines, and a corresponding discrimination result is output.
According to another aspect of the present invention, a system for detecting the security of downloaded files is provided, comprising a client and a cloud server;
Wherein the client comprises:
A collection module, for collecting the download scene features and file features of a downloaded file; and
An upload module, for uploading the collected download scene features and file features of the downloaded file to the cloud server;
The cloud server comprises:
A matching module, for matching the download scene features and file features against cloud rules in a rule engine to obtain a corresponding matching result, wherein the cloud rules are obtained by comprehensively analyzing the download scene features and file features of file samples; and
A delivery module, for delivering the matching result to the client.
Optionally, the download scene features comprise one or more of the following: the download link, the type of download tool or instant messaging tool, and whether the client is in an online shopping or payment mode.
Optionally, the file samples are collected in one or more of the following ways:
Actively crawling download links;
Obtaining file samples reported by users;
Obtaining file samples provided by cooperating third-party websites.
Optionally, the downloaded file is a compressed package file, and the file features comprise: the file header format, the file format before and after compression, the file name before and after compression, and the file size before and after compression;
The cloud rules comprise rules generated by comparing the file format, file name and file size of compressed package file samples before and after compression.
Optionally, the upload module is specifically configured to encapsulate the collected download scene features and file features of the downloaded file in a data packet in the form of a string, and to upload the data packet to the cloud server;
The matching module comprises:
A parsing submodule, with which the cloud server parses the data packet to obtain the corresponding string;
A discrimination submodule, for inputting the string into at least one decision machine corresponding to the cloud rules and into training models equal in number to the decision machines, and outputting a corresponding discrimination result.
With the method and device for detecting the security of downloaded files according to the present invention, the client collects the download scene features and file features of a downloaded file and uploads them to the cloud server, and the cloud server matches the uploaded download scene features and file features against cloud rules obtained by comprehensively analyzing the download scene features and file features of file samples, to obtain a corresponding matching result. On the one hand, whereas the prior art performs security analysis using only the MD5 feature code, the embodiments of the present invention take both download scene features and file features into account during security analysis, and therefore detect well those downloaded files whose feature codes are not included in the prior-art security feature code library, as well as other unknown files. On the other hand, the advantages of the cloud server (real-time updating, fast updates and strong computing power) can be exploited, giving good detection of sudden-outbreak virus files.
The above is only an overview of the technical solution of the present invention. So that the technical means of the present invention can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features and advantages of the present invention become more apparent, specific embodiments of the present invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art on reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered a limitation of the present invention. Throughout the drawings, the same reference symbols denote the same parts. In the drawings:
Fig. 1 shows a flowchart of a method for detecting the security of downloaded files according to an embodiment of the invention;
Fig. 2 shows a flowchart of a method for detecting the security of downloaded files according to an embodiment of the invention; and
Fig. 3 shows a structural diagram of a device for detecting the security of downloaded files according to an embodiment of the invention.
Detailed description of the embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the disclosure can be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and so that its scope can be fully conveyed to those skilled in the art.
Referring to Fig. 1, a flowchart of a method for detecting the security of downloaded files according to an embodiment of the invention is shown. The method may specifically comprise:
Step 101: the client collects the download scene features and file features of a downloaded file and uploads them to the cloud server;
In a specific implementation, the download scene features may comprise one or more of the following: the download link, the type of download tool or instant messaging tool, and whether the client is in an online shopping or payment mode.
In some preferred embodiments of the present invention, after the above download information is obtained, it can further be saved in a preset download log so that it can be queried later. When the download information is stored in the download log, the download information corresponding to one downloaded file can be stored as one storage entry, with each piece of download information stored as a separate information item in that entry. Subsequent queries can then be made on any information item.
In addition, once all of a user's download behavior has been saved in the download log, an association chain between downloaded files can be built from the log, and malicious files can be intercepted according to this chain. For example, for a file obtained by downloading, querying the download log reveals whether it was transmitted through a chat tool or downloaded with a particular download tool, and also identifies the download chain to which the file belongs. Here, a download chain is the linked list formed by the files downloaded before and after this file; from it, an association chain that facilitates defense can be built. For example, suppose that in a Taobao shopping scenario the user receives a PE file through a chat tool. The PE file can then be closely monitored according to the download log, for instance by tracking where it has been moved and which sensitive operations it has performed (such as running directly or editing the registry). As soon as the file performs a sensitive operation it is intercepted, which makes the interception of malicious files more targeted. The download log also gives the download source of a downloaded file, so that a file deleted by mistake can be retrieved. Furthermore, if the download log shows that the user often downloads files from a certain website, that download site can be bookmarked for the user's convenience.
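The download log and download chain described above might be modeled as in the following sketch. All names (`DownloadEntry`, the `tool` and `scene` values) are hypothetical, and recognizing a PE file by its extension is a simplification made only for illustration.

```python
from dataclasses import dataclass

@dataclass
class DownloadEntry:
    file_name: str
    tool: str             # e.g. "chat_tool" or "download_tool" (assumed labels)
    scene: str            # e.g. "online_shopping" or "browsing" (assumed labels)
    download_time: float  # seconds since the epoch

def download_chain(log, file_name):
    """Return (files downloaded before, files downloaded after) the given file."""
    ordered = sorted(log, key=lambda e: e.download_time)
    names = [e.file_name for e in ordered]
    i = names.index(file_name)
    return names[:i], names[i + 1:]

def needs_key_monitoring(entry):
    """Flag a PE file received through a chat tool while shopping online."""
    is_pe = entry.file_name.lower().endswith((".exe", ".dll"))  # simplification
    return is_pe and entry.tool == "chat_tool" and entry.scene == "online_shopping"
```

A file flagged this way would then be watched for sensitive operations such as direct execution or registry edits; that monitoring is outside the scope of the sketch.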
The download link mainly refers to the URL (Uniform Resource Locator) address of the downloaded file. A file downloaded from the network normally has its own unique URL address; even if the address is redirected, it ultimately points to that unique URL address. If the file at a URL address has been shown to be dangerous, then the file downloaded from that URL address is dangerous for anyone who downloads it. It cannot be ruled out that the file at a URL address is later replaced by a virus-free file, but this is very unlikely: an attacker's aim is to infect the victim, whether with a Trojan or a virus, so this situation hardly ever occurs. In summary, using the URL address to determine whether a downloaded file is safe is feasible.
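As a rough illustration of judging a download by its final URL address, the sketch below follows redirections through a lookup table and checks the end point against a set of known-dangerous addresses. The table, the set and the example hosts are invented for the example; a real client would learn the redirections from the network rather than from a dictionary.

```python
# Hypothetical data: URL addresses already shown to serve dangerous files,
# and a table of redirections (re-pointed addresses).
DANGEROUS_URLS = {"http://bad.example/setup.exe"}
REDIRECTS = {"http://short.example/x": "http://bad.example/setup.exe"}

def final_url(url, redirects, max_hops=10):
    """Follow redirections until the address no longer changes."""
    seen = set()
    while url in redirects and url not in seen and len(seen) < max_hops:
        seen.add(url)  # guards against redirect loops
        url = redirects[url]
    return url

def url_is_dangerous(url):
    """True if the address the download finally points to is known to be dangerous."""
    return final_url(url, REDIRECTS) in DANGEROUS_URLS
```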
A download tool is software that can quickly download information resources such as text, images, video, audio and animation from the Internet. It uses multi-point connection technology to make full use of spare network bandwidth, and resumable (breakpoint) transfer so that a download can continue at any time from where it last stopped, which avoids repeated work and greatly reduces the user's download time. In practice, download tools may include Xunlei (Thunder), FlashGet, browsers, various download websites and so on; the embodiments of the present invention do not limit the specific download tool.
The downloaded files corresponding to the instant messaging tool mainly include files received from other people through instant messaging tools such as QQ and MSN.
In some preferred embodiments of the present invention, the download tool can support function options such as "arrange by download time", "arrange by file type" and "arrange by download tool".
Suppose the user clicks the "arrange by download time" option. This action causes the option to send the background program an instruction to display each downloaded file, together with its management operation entries, in order of download time. On receiving the instruction, the background program queries the stored download log, sorts all storage entries in the log by the "download time" information item, and then displays each downloaded file and its management operation entries according to the sorted result. The user can thus browse all downloaded files from newest to oldest (or from oldest to newest), which makes it easy to find a particular downloaded file by time. Moreover, each downloaded file is displayed with its corresponding management operation entries. For example, a downloaded file at the safe level has entries such as open, back up and delete, while a downloaded file at the unknown level has entries such as isolate, open, back up and delete. The user can therefore also manage any downloaded file that is found with ease.
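A minimal sketch of the "arrange by download time" behavior follows. The entry fields and the operation names are assumptions for the example; only the two levels mentioned in the text are mapped.

```python
# Management operation entries per security level, as described in the text.
MANAGEMENT_ENTRIES = {
    "safe": ["open", "back up", "delete"],
    "unknown": ["isolate", "open", "back up", "delete"],
}

def arrange_by_download_time(entries, newest_first=True):
    """Sort download log entries by their 'download_time' information item."""
    return sorted(entries, key=lambda e: e["download_time"], reverse=newest_first)

def display_rows(entries, newest_first=True):
    """Pair each file name with the management entries for its level."""
    return [
        (e["file_name"], MANAGEMENT_ENTRIES.get(e["level"], []))
        for e in arrange_by_download_time(entries, newest_first)
    ]
```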
In the embodiments of the present invention, the scope of a downloaded file can be narrowed according to the type of download tool or instant messaging tool, and the downloaded file can be compared with the file samples already collected within the same scope to obtain a corresponding security result.
Because online shopping involves a payment process, it is easily attacked by Trojans. For example, a fraudster using an online shopping Trojan poses as a buyer, talks to a seller, and takes the opportunity to send the seller a compressed file, supposedly containing product pictures, through a chat tool. The seller is infected with the Trojan after opening it, and the fraudster then uses the Trojan to steal the seller's account and password and obtain administrative rights over the shop. Next, the fraudster can impersonate the seller to defraud real buyers. The cheated buyers blame the shopping website or the real seller, which has caused many disputes. Whether the client is in an online shopping or payment mode is therefore an important download scene feature for security detection.
Common file features may include the file name, file format and file size; that is, the present invention can still detect the security of a downloaded file even when its MD5 feature cannot be obtained.
Step 102: the cloud server matches the download scene features and file features against cloud rules in a rule engine to obtain a corresponding matching result, wherein the cloud rules are obtained by analyzing the download scene features and file features of file samples;
In a specific implementation, the file samples may be collected in one or more of the following ways:
Actively crawling download links. For example, the download links on a certain download tool can be actively crawled. Since actively crawling download links is a common technique in the art, it is not described further here.
Obtaining file samples reported by users;
Obtaining file samples provided by cooperating third-party websites.
In one embodiment of the present invention, when the download scene features and file features from a client hit a certain cloud rule, the cloud server can also request the corresponding downloaded file from the client and analyze it further to obtain corresponding cloud rules. In another embodiment of the invention, existing unknown files can also be used as file samples.
In practice, a database can be used to store the collected file samples. The database can also record, in the form of a log, the download scene features and file features of the downloaded files uploaded by each client, together with the corresponding matching results.
In short, the cloud server can collect file samples in various ways, such as manually or through user reports, and analyze them so as to detect new viruses quickly.
In practical applications, the file samples may include safe samples, dangerous samples, risky samples, suspicious samples and so on. The corresponding cloud rules can be obtained by analyzing the download scene features and file features of the file samples, and each cloud rule can comprise at least one download scene feature and at least one file feature.
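One way to picture a cloud rule that combines at least one download scene feature with at least one file feature is the following sketch. The `CloudRule` structure, the exact-equality matching and the feature names are assumptions for illustration; the patent does not specify how rules are represented.

```python
from dataclasses import dataclass

@dataclass
class CloudRule:
    scene_conditions: dict  # at least one download scene feature
    file_conditions: dict   # at least one file feature
    level: str              # security level reported when the rule is hit

def rule_hits(rule, scene, file):
    """A rule is hit when every one of its conditions equals the uploaded feature."""
    return (all(scene.get(k) == v for k, v in rule.scene_conditions.items())
            and all(file.get(k) == v for k, v in rule.file_conditions.items()))

def match_rules(rules, scene, file):
    """Return the security levels of all rules hit by the uploaded features."""
    return [r.level for r in rules if rule_hits(r, scene, file)]
```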
Each cloud rule can have a corresponding security level; if a cloud rule is hit, the corresponding matching result is that rule's security level. In one application example of the present invention, the security levels may include a safe level, a suspicious/highly suspicious level, a risk level and a danger level. As for the thresholds, a matching result of 10-29 can be set as the safe level, 30-49 as the risk level, 50-69 as the suspicious/highly suspicious level, and 70 or above as the malicious level. The present invention does not limit the specific division of security levels or the correspondence between matching results and security levels.
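The example thresholds above translate directly into a small mapping function. The text does not say how a matching result below 10 is treated, so the "unclassified" value returned in that case is an assumption of this sketch.

```python
def security_level(matching_result: int) -> str:
    """Map a numeric matching result to the example security levels."""
    if matching_result >= 70:
        return "malicious"
    if matching_result >= 50:
        return "suspicious/highly suspicious"
    if matching_result >= 30:
        return "risk"
    if matching_result >= 10:
        return "safe"
    return "unclassified"  # assumption: values below 10 are not defined in the text
```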
In a preferred embodiment of the present invention, the step of uploading to the cloud server may specifically be as follows: the client encapsulates the collected download scene features and file features of the downloaded file in a data packet in the form of a string, and uploads the data packet to the cloud server;
In one application example of the present invention, a regular expression can be used to verify whether or not the string conforms to the download scene features and file features of a specified security level.
For example, in one application example of the present invention, the download scene features and file features in string form may be: www.abc.com: 1.txt: download tool, where www.abc.com represents the download link, 1.txt represents the file name, and download tool represents the corresponding download tool.
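The string form in this example can be produced, parsed and checked with a regular expression along the following lines. The ": " separator is read off the example string; the `RISKY_PATTERN` rule (an .exe file delivered through a chat tool) is purely hypothetical and stands in for whatever patterns a real rule set would define.

```python
import re

FIELD_SEPARATOR = ": "  # taken from the example "www.abc.com: 1.txt: download tool"

def encode_features(download_link, file_name, tool):
    """Join the features into the string form used in the example."""
    return FIELD_SEPARATOR.join([download_link, file_name, tool])

def parse_features(feature_string):
    """Split the string back into named features."""
    link, name, tool = feature_string.split(FIELD_SEPARATOR)
    return {"download_link": link, "file_name": name, "tool": tool}

# Hypothetical pattern for one specified security level:
# any .exe file delivered through a chat tool.
RISKY_PATTERN = re.compile(r"^[^ ]+: [^ ]+\.exe: chat tool$")

def conforms_to_risky_level(feature_string):
    return RISKY_PATTERN.match(feature_string) is not None
```

The parser assumes that no field itself contains the separator; a production format would need escaping or a structured encoding.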
In another application example of the present invention, the download scene features and file features in string form can record the file name, the file type, the corresponding download tool, the storage location and the download time. The file name can be taken directly from the name of the downloaded file; the file type can be obtained from information such as the file name suffix (or by other type analysis methods); the corresponding download tool is the tool used when the file was downloaded; the storage location can be obtained from the download path; and the download time can be obtained from the system time when the file was downloaded.
It should be noted that, in the embodiments of the present invention, the step of uploading the download scene features and file features to the cloud server can be triggered as soon as a download by a download tool or a transfer by an instant messaging tool is detected, without accessing related information such as the registry entries of each download tool. The security of the downloaded file can therefore be detected in time.
In practice, the strings of one or more downloaded files can be encapsulated in a TCP (Transmission Control Protocol) or UDP (User Datagram Protocol) packet and uploaded.
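The text does not define the packet layout, so the following sketch assumes a simple one: each feature string is UTF-8 encoded and prefixed with a 2-byte big-endian length. The resulting bytes could serve as the payload of a TCP or UDP packet; actually sending them is not shown.

```python
import struct

def pack_strings(strings):
    """Encapsulate one or more feature strings in a single payload (assumed layout)."""
    payload = b""
    for s in strings:
        data = s.encode("utf-8")
        payload += struct.pack("!H", len(data)) + data  # 2-byte length prefix
    return payload

def unpack_strings(payload):
    """Server side: parse the payload back into the original strings."""
    strings, offset = [], 0
    while offset < len(payload):
        (length,) = struct.unpack_from("!H", payload, offset)
        offset += 2
        strings.append(payload[offset:offset + length].decode("utf-8"))
        offset += length
    return strings
```

With a 2-byte prefix each string is limited to 65535 bytes, which is a limit of this sketch and not of the method described.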
In another application example of the present invention, the step in which the cloud server matches the download scene features and file features against the cloud rules in the rule engine to obtain a corresponding matching result may specifically comprise:
Step S100: the cloud server parses the data packet to obtain the corresponding string;
Step S101: the string is input into at least one decision machine corresponding to the cloud rules and into training models equal in number to the decision machines, and a corresponding discrimination result is output.
Different decision machines train on the features in the same or different ways. The training process may include training with a support vector machine decision machine or training with a decision tree decision machine. A training model can be an encoded training model or a compressed training model.
Taking a PE file as an example, k decision machines can be used according to the feature categories of the PE file's structural features, together with k training models corresponding to the k decision machines. After the PE file is analyzed, its structural features are extracted and placed in a corresponding feature vector, and the extracted features are classified, for example into PE file header features, PE standard header features, PE optional header features, data directory features and common section table features. According to the classification result, different decision machines are trained on the feature vectors and security attributes of the different categories of program files to obtain the corresponding training models. The security attribute here is also the discrimination result that is finally output, and may comprise multiple security levels, including at least a dangerous level and a safe level. Preferably, the security levels can be divided more finely so that the hazard index of a given downloaded file can be determined accurately. For example, the security levels can be further divided, in descending order of hazard index, into four levels: dangerous, suspicious, unknown and safe.
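The pairing of k decision machines with k training models can be illustrated with a deliberately simple stand-in. Here each "decision machine" is a toy threshold counter, not a support vector machine or decision tree, and each "training model" is a hand-written table of thresholds rather than something learned from samples. The feature names, the thresholds and the choice to report the most hazardous individual result are all assumptions of the sketch.

```python
# Security levels in ascending order of hazard, as in the four-level example.
LEVELS = ["safe", "unknown", "suspicious", "dangerous"]

def threshold_machine(vector, model):
    """Toy decision machine: count the features that exceed the model's thresholds."""
    hits = sum(1 for name, limit in model.items() if vector.get(name, 0) > limit)
    return LEVELS[min(hits, len(LEVELS) - 1)]

# k = 2 hypothetical feature categories, each with its own machine and model.
MACHINES = {
    "pe_file_header": (threshold_machine, {"num_sections": 8}),
    "section_table": (threshold_machine, {"max_entropy": 7.0, "writable_exec_sections": 0}),
}

def discriminate(features_by_category):
    """Run every decision machine on its category and keep the most hazardous level."""
    results = [
        machine(features_by_category.get(category, {}), model)
        for category, (machine, model) in MACHINES.items()
    ]
    return max(results, key=LEVELS.index)
```

In a real system each machine would be a trained classifier and the models would come from training on labeled file samples; only the one-machine-per-category structure is taken from the text.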
In a word, can save a large amount of manpowers by the use of above-mentioned training pattern and decision machine, improve the recognition efficiency to virus document; And based on magnanimity program being carried out on the basis of data mining, the download scene characteristic based on download file and file characteristic can be found the inherent law of virus document, the virus document of burst is prevented.
Described matching result is handed down to described client by step 103, Cloud Server.
Client can be resolved and obtain corresponding safety results the matching result of receiving, and sends corresponding information, prompting as corresponding in safe class, suspicious/highly suspicious grade, risk class and danger classes etc.
In a word, the embodiment of the present invention is gathered download scene characteristic and the file characteristic of download file by client, and be uploaded to Cloud Server, the cloud rule being obtained by download scene characteristic and the file characteristic of comprehensive analysis paper sample by Cloud Server utilization is mated download scene characteristic and the file characteristic of the described download file of uploading, and obtains corresponding matching result; On the one hand, only utilize MD5 condition code to carry out safety analysis with respect to prior art, the embodiment of the present invention has considered download scene characteristic and file characteristic in the process of safety analysis, therefore, do not include to the file such as download file and other unknown file in security feature code storehouse and there is good detection effect for condition code in prior art; On the other hand, can bring into play Cloud Server real-time update, updating speed is fast, computing power is strong advantage, for burst virus document have good detection effect.
Referring to Fig. 2, a flowchart of a method for detecting the security of a downloaded file according to an embodiment of the present invention is shown, which may specifically include:
Step 201: the client collects the download scene features and file features of a compressed package file and uploads them to the cloud server; the file features may specifically include: the file header format, the file format before and after compression, the file name before and after compression, and the file size before and after compression;
Step 202: the cloud server matches the download scene features and file features against the cloud rules in the rule engine and obtains a corresponding matching result; the cloud rules are obtained by comprehensively analyzing the download scene features and file features of file samples, and may specifically include rules generated by comparing the file format, file name, and file size of compressed package file samples before and after compression;
Step 203: the cloud server delivers the matching result to the client.
Compared with the embodiment shown in Fig. 1, this embodiment is directed specifically at detecting the security of compressed package files. In particular, corresponding cloud rules can be generated by comparing the file format, file name, and file size of compressed package file samples before and after compression, and these cloud rules are then used to detect compressed package files. The cloud rules can detect whether the file format or file name has changed; if so, the security grade is lowered. The cloud rules can also detect the difference in file size before and after compression; if the difference is excessive (for example, a file of 1 GB before compression and 10 MB after compression), the security grade will also be very low. In short, on top of the cloud rules based on download scene features, this embodiment flexibly generates cloud rules by comparing the file format, file name, and file size of compressed package file samples before and after compression, uses them to detect the security of compressed package files, and can therefore detect virus-carrying compressed package files in time. The compressed package may be in any of a variety of formats, such as ace, winrar, ar, zip, tar, cab, uue, jar, iso, z, 7-zip, lzh, arj, gzip, and bz2.
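A compressed-package rule of this kind might look like the sketch below. The scoring weights, the `max_ratio` threshold of 50, and the mapping from score to grade are assumptions chosen only to make the example concrete; the description states merely that a changed format or name lowers the grade and that an excessive size difference lowers it sharply.

```python
import os

# Grades from best to worst; the index is the (capped) penalty score.
GRADES = ["safe", "unknown", "suspicious", "dangerous"]


def grade_archive_entry(name_before, name_after, size_before, size_after,
                        max_ratio=50.0):
    """Grade one archive entry by comparing its name, format, and size
    before and after compression. Weights and threshold are hypothetical."""
    stem_before, ext_before = os.path.splitext(os.path.basename(name_before))
    stem_after, ext_after = os.path.splitext(os.path.basename(name_after))

    score = 0
    if ext_before.lower() != ext_after.lower():
        score += 1  # file format changed
    if stem_before != stem_after:
        score += 1  # file name changed
    if size_after > 0 and size_before / size_after > max_ratio:
        score += 2  # size difference is excessive
    return GRADES[min(score, len(GRADES) - 1)]
```

With these assumed weights, the 1 GB to 10 MB example from the text (a ratio of about 102) yields "suspicious" on its own and "dangerous" when combined with a changed name and format.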
Of course, besides compressed package files, the formats of downloaded files to be detected in embodiments of the present invention may also include *.doc, *.docx, *.txt, *.BMP, *.JPG, *.AI, *.xlsx, and so on, or may include MHT (aggregated HTML documents), scripts, and the like, for example samples in JS format, samples in HTML format, or samples of VBS type. Embodiments of the present invention place no limitation on the format of the downloaded file to be detected.
It should be noted that, in addition to the approaches mentioned in the above embodiments, in one implementation the file feature information of the downloaded file may further include the URL addresses corresponding to the file content of the downloaded file. For a web page file in MHT format, the URL addresses corresponding to the file content generally refer to the one or more URL addresses contained in the file content (that is, the document body of the web page file); these URL addresses may take the form of clickable hyperlinks or of plain text that can be copied. For web page files in other formats, the URL addresses corresponding to the file content include the one or more URL addresses contained in the file content and may also include the URL address of the web page file itself (the URL address that appears in the address bar for this web page file and through which the file can be opened). Correspondingly, the information items stored in the local feature information database may include URL information items for multiple security grades, where the security grades include at least a dangerous grade and a safe grade. Preferably, the security grades can be divided more finely so that the danger index of a given downloaded file can be determined accurately. For example, the security grades can be further divided, in descending order of danger index, into four grades: dangerous, suspicious, unknown, and safe, and each grade may correspond to one or more URL information items. Each URL information item of a grade may be either a complete URL address or a fragment contained in a URL address. Specifically, the URL information items for each grade can be obtained by analyzing a predetermined number of samples with a machine learning algorithm.
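As a rough illustration of matching the URL addresses in file content against graded URL information items, consider the sketch below. The regular expression, the example information items (all on reserved example domains), and the substring-matching policy are assumptions; the description only says that an item may be a full URL address or a fragment of one.

```python
import re

# Simplified URL pattern for illustration; real URL extraction is more involved.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")

# Hypothetical URL information items, checked from most to least dangerous.
# An item may be a complete URL address or a fragment of one.
URL_ITEMS = [
    ("dangerous", ["phish.example.net", "/win-prize"]),
    ("suspicious", ["free-gift.example.org"]),
    ("safe", ["https://www.example.com/"]),
]


def extract_urls(content):
    """Collect URL addresses from hyperlinks or plain text in the file content."""
    return URL_PATTERN.findall(content)


def grade_by_urls(content):
    """Return the first (most dangerous) grade with an item matching any URL."""
    urls = extract_urls(content)
    for grade, items in URL_ITEMS:
        if any(item in url for url in urls for item in items):
            return grade
    return "unknown"
```

Checking grades in descending order of danger means a file containing both a known-safe and a known-dangerous URL is graded as dangerous.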
In a second implementation, the file feature information of the downloaded file may further include the plaintext character strings contained in the file content of the downloaded file. For example, Chinese words, English words, and the like that appear in plaintext in the file content can all serve as plaintext character strings. Specifically, these plaintext character strings can be obtained simply by performing word segmentation on the file content. Correspondingly, the information items stored in the local feature information database may include plaintext string sets for multiple security grades. The security grades can be divided in the same way as described above, and each grade corresponds to one or more plaintext string sets. For example, the set formed by the two plaintext strings "lucky user" and "prize-winning" can serve as a plaintext string set for the dangerous grade; if a downloaded file contains this plaintext string set, the file is very likely a "phishing file". A "phishing file" is one in which an attacker uses various means to counterfeit the URL address and page content of a genuine website, or exploits a vulnerability in a genuine website's server program to insert dangerous HTML code into some of the site's pages, in order to fraudulently obtain private data such as users' bank or credit card account numbers and passwords. Specifically, the plaintext string sets for each grade can also be obtained by analyzing a predetermined number of samples with a machine learning algorithm.
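The plaintext string set check can be sketched as below. Two simplifications are assumed: case-insensitive substring search stands in for the word segmentation step, and the string sets themselves (other than the "lucky user" / "prize-winning" pair taken from the text) are invented for illustration.

```python
# Hypothetical plaintext string sets per grade. A set matches only when
# every string in it appears in the file content.
STRING_SETS = {
    "dangerous": [frozenset({"lucky user", "prize-winning"})],
    "suspicious": [frozenset({"verify your account"})],
}


def grade_by_plaintext(content):
    """Grade file content by the plaintext string sets it contains.

    Substring search replaces word segmentation here (a simplification).
    """
    text = content.lower()
    for grade in ("dangerous", "suspicious"):
        for string_set in STRING_SETS[grade]:
            if all(term in text for term in string_set):
                return grade
    return "unknown"
```

Requiring the whole set rather than any single string keeps an innocuous page that merely mentions "prize-winning" from being graded as dangerous.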
In a third implementation, the file feature information of the downloaded file may further include the file page elements corresponding to the file content. The file page elements mentioned here mainly include content such as pictures, text features, and web page links. The main difference between the third and second implementations is that the concept of a file page element covers richer content than the concept of a plaintext character string, and can therefore reflect the characteristics of a file more comprehensively. For example, in this embodiment the file page elements can be represented by a Document Object Model (DOM) tree, which clearly reflects the page structure and page content of a document. Before the structure of the DOM tree is introduced, the usual structure of a web page file is described first: a web page file contains multiple blocks of content, for example displayed text content (such as a textual description of the page's subject), URL content, displayed picture content, video content, and so on. Each block of content corresponds to a page component, and each page component has its own data content, which records the structure and style with which the page component is displayed on the page. Taking picture content as an example, the data content of its page component includes the size of the picture as displayed on the page, the position of the picture title relative to the picture, and the text format of the picture title, where the text format includes font size, color, font type, and so on. The module list contains the data content of each page component module, expressed in forms such as HyperText Markup Language (HTML), Cascading Style Sheets (CSS), and JavaScript scripts; the page component modules in the module list may be arranged in tabular form, or each page component module may be represented graphically. A DOM tree is a way of describing the above web page file structure as a tree. To build a DOM tree, the document is analyzed to obtain the root element and each element within it, which establishes the structure of the whole document; the root element can be identified by html, and the elements can be identified by tags such as head, body, and title. The text content corresponding to each element, including pictures, links, and so on, is then obtained, which establishes the content the whole document expresses. It can thus be seen that representing file page elements by a DOM tree comprehensively reflects the content contained in a document, so that malicious information such as phishing content is not overlooked and the goal of a comprehensive scan is achieved.
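Building a simple tree of page elements and collecting links, pictures, and text from it can be sketched with Python's standard `html.parser` module. The `Node` class, the `VOID_TAGS` list, and the `collect` helper are hypothetical names introduced for this example, and the tree is a simplified stand-in for a full DOM implementation.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so the parser must not descend into them.
VOID_TAGS = {"img", "br", "hr", "meta", "link", "input"}


class Node:
    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []
        self.text = ""


class DomBuilder(HTMLParser):
    """Builds a simplified element tree rooted at a synthetic '#document' node."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Climb to the nearest open element with this tag, then close it.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        stripped = data.strip()
        if stripped:
            self.current.text += stripped


def collect(node, out=None):
    """Walk the tree depth-first, gathering links, pictures, and text."""
    if out is None:
        out = {"links": [], "images": [], "text": []}
    if node.tag == "a" and "href" in node.attrs:
        out["links"].append(node.attrs["href"])
    if node.tag == "img" and "src" in node.attrs:
        out["images"].append(node.attrs["src"])
    if node.text:
        out["text"].append(node.text)
    for child in node.children:
        collect(child, out)
    return out
```

The collected links and text could then be passed to URL or plaintext checks, which is one way a tree-based scan covers more of the page than plaintext strings alone.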
Referring to Fig. 3, a structural diagram of a device for detecting the security of a downloaded file according to an embodiment of the present invention is shown, which may specifically include a client 301 and a cloud server 302;
The client 301 may specifically include:
An acquisition module 311, configured to collect the download scene features and file features of a downloaded file; and
An upload module 312, configured to upload the collected download scene features and file features of the downloaded file to the cloud server;
The cloud server 302 may specifically include:
A matching module 321, configured to match the download scene features and file features against the cloud rules in the rule engine and obtain a corresponding matching result, where the cloud rules are obtained by comprehensively analyzing the download scene features and file features of file samples; and
A delivery module 322, configured to deliver the matching result to the client.
In a preferred embodiment of the present invention, the download scene features may specifically include one or more of the following: the download link, the type of download tool or instant messaging tool, and whether the client is in an online shopping or payment mode.
In another preferred embodiment of the present invention, the file samples may specifically be collected in one or more of the following ways:
Actively crawling download links;
Obtaining file samples reported by users;
Obtaining file samples provided by cooperating third-party websites.
In yet another preferred embodiment of the present invention, the downloaded file is a compressed package file, and the file features may specifically include: the file header format, the file format before and after compression, the file name before and after compression, and the file size before and after compression;
The cloud rules may specifically include rules generated by comparing the file format, file name, and file size of compressed package file samples before and after compression.
In embodiments of the present invention, preferably, the upload module 312 may be specifically configured to encapsulate the collected download scene features and file features of the downloaded file into a data packet in the form of a character string, and to upload the data packet to the cloud server;
The matching module 321 may specifically include:
A parsing submodule, configured to parse the data packet on the cloud server to obtain the corresponding character string;
A discrimination submodule, configured to input the character string into at least one decision machine corresponding to the cloud rules and into the same number of training models as decision machines, and to output the corresponding discrimination result.
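The string-based encapsulation and parsing can be illustrated with a small sketch. JSON encoded as UTF-8 is used here purely as an assumed string format, and the feature keys in the usage note are hypothetical; the description does not specify how the character string or data packet is laid out.

```python
import json


def pack_features(scene_features, file_features):
    """Client side: encapsulate both feature groups as one string in a packet.

    JSON over UTF-8 is an assumed wire format chosen for illustration.
    """
    payload = {"scene": scene_features, "file": file_features}
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def unpack_features(packet):
    """Cloud side: parse the packet back into the two feature groups."""
    payload = json.loads(packet.decode("utf-8"))
    return payload["scene"], payload["file"]
```

For example, `unpack_features(pack_features({"tool": "im"}, {"ext": ".exe"}))` returns the two dictionaries unchanged, which the discrimination step could then turn into inputs for the decision machines.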
The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such systems is apparent. Moreover, the present invention is not directed to any particular programming language. It should be understood that the content of the invention described herein can be implemented in a variety of programming languages, and that the description above of a specific language is given in order to disclose the best mode of the invention.
Numerous specific details are set forth in the specification provided herein. It will be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques are not shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid understanding of one or more of the various inventive aspects, the features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single disclosed embodiment. The claims following the detailed description are therefore expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will appreciate that the modules in the device of an embodiment can be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units, or components of an embodiment can be combined into a single module, unit, or component, and can furthermore be divided into multiple submodules, subunits, or subcomponents. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.
Furthermore, those skilled in the art will understand that, although some embodiments described herein include certain features that are included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
The component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the method and device for detecting the security of a downloaded file according to embodiments of the present invention. The present invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for carrying out part or all of the method described herein. Such a program implementing the present invention may be stored on a computer-readable medium or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means can be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any order; these words can be interpreted as names.

Claims (10)

1. A method for detecting the security of a downloaded file, comprising:
A client collecting download scene features and file features of a downloaded file and uploading them to a cloud server;
The cloud server matching the download scene features and file features against cloud rules in a rule engine to obtain a corresponding matching result, wherein the cloud rules are obtained by comprehensively analyzing download scene features and file features of file samples;
The cloud server delivering the matching result to the client.
2. The method of claim 1, characterized in that the download scene features comprise one or more of the following: the download link, the type of download tool or instant messaging tool, and whether the client is in an online shopping or payment mode.
3. The method of claim 1, characterized in that the file samples are collected in one or more of the following ways:
Actively crawling download links;
Obtaining file samples reported by users;
Obtaining file samples provided by cooperating third-party websites.
4. The method of claim 1, characterized in that the downloaded file comprises a compressed package file, and the file features of the compressed package file comprise: the file header format, the file format before and after compression, the file name before and after compression, and the file size before and after compression;
The cloud rules comprise rules generated by comparing the file format, file name, and file size of compressed package file samples before and after compression.
5. The method of claim 1, characterized in that the step of the client uploading to the cloud server comprises the client encapsulating the collected download scene features and file features of the downloaded file into a data packet in the form of a character string and uploading the data packet to the cloud server;
The step of the cloud server matching the download scene features and file features against the cloud rules in the rule engine to obtain a corresponding matching result comprises:
The cloud server parsing the data packet to obtain the corresponding character string;
Inputting the character string into at least one decision machine corresponding to the cloud rules and into the same number of training models as decision machines, and outputting the corresponding discrimination result.
6. A system for detecting the security of a downloaded file, comprising a client and a cloud server;
Wherein the client comprises:
An acquisition module, configured to collect download scene features and file features of a downloaded file; and
An upload module, configured to upload the collected download scene features and file features of the downloaded file to the cloud server;
The cloud server comprises:
A matching module, configured to match the download scene features and file features against cloud rules in a rule engine to obtain a corresponding matching result, wherein the cloud rules are obtained by comprehensively analyzing download scene features and file features of file samples; and
A delivery module, configured to deliver the matching result to the client.
7. The system of claim 6, characterized in that the download scene features comprise one or more of the following: the download link, the type of download tool or instant messaging tool, and whether the client is in an online shopping or payment mode.
8. The system of claim 6, characterized in that the file samples are collected in one or more of the following ways:
Actively crawling download links;
Obtaining file samples reported by users;
Obtaining file samples provided by cooperating third-party websites.
9. The system of claim 6, characterized in that the downloaded file is a compressed package file, and the file features comprise: the file header format, the file format before and after compression, the file name before and after compression, and the file size before and after compression;
The cloud rules comprise rules generated by comparing the file format, file name, and file size of compressed package file samples before and after compression.
10. The system of claim 6, characterized in that the upload module is specifically configured to encapsulate the collected download scene features and file features of the downloaded file into a data packet in the form of a character string and to upload the data packet to the cloud server;
The matching module comprises:
A parsing submodule, configured to parse the data packet on the cloud server to obtain the corresponding character string;
A discrimination submodule, configured to input the character string into at least one decision machine corresponding to the cloud rules and into the same number of training models as decision machines, and to output the corresponding discrimination result.
CN201410098964.4A 2014-03-17 2014-03-17 Downloaded file security detection method and device Pending CN103914655A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201410098964.4A CN103914655A (en) 2014-03-17 2014-03-17 Downloaded file security detection method and device
PCT/CN2014/095951 WO2015139507A1 (en) 2014-03-17 2014-12-31 Method and apparatus for detecting security of a downloaded file

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410098964.4A CN103914655A (en) 2014-03-17 2014-03-17 Downloaded file security detection method and device

Publications (1)

Publication Number Publication Date
CN103914655A true CN103914655A (en) 2014-07-09

Family

ID=51040328

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410098964.4A Pending CN103914655A (en) 2014-03-17 2014-03-17 Downloaded file security detection method and device

Country Status (2)

Country Link
CN (1) CN103914655A (en)
WO (1) WO2015139507A1 (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104102861A (en) * 2014-07-16 2014-10-15 中山大学 JPEG (joint photographic experts group) image primitiveness detection method based on file header and compressed parameter
CN104462601A (en) * 2014-12-31 2015-03-25 北京奇虎科技有限公司 File scanning method, device and system
CN104462400A (en) * 2014-12-10 2015-03-25 北京奇虎科技有限公司 Method, device and browser client for downloading files in mobile terminals
WO2015139507A1 (en) * 2014-03-17 2015-09-24 北京奇虎科技有限公司 Method and apparatus for detecting security of a downloaded file
CN106101086A (en) * 2016-06-02 2016-11-09 北京奇虎科技有限公司 The cloud detection method of optic of program file and system, client, cloud server
CN106454393A (en) * 2016-11-23 2017-02-22 天脉聚源(北京)传媒科技有限公司 Video caching method and device
CN107465654A (en) * 2016-08-31 2017-12-12 哈尔滨广播电视台 For content distribution between each business subnet of TV and Radio Service and the system of safe killing
CN109558548A (en) * 2017-09-25 2019-04-02 北京国双科技有限公司 A kind of method and Related product for eliminating CSS style redundancy
CN110844852A (en) * 2019-11-26 2020-02-28 北谷电子有限公司上海分公司 Scissor aerial work platform and method for automatically remotely and virtually realizing full-load calibration curve
CN112580057A (en) * 2020-12-17 2021-03-30 光通天下网络科技股份有限公司 Attack vulnerability detection method, device, equipment and medium for ZIP encrypted compressed packet
CN114185610A (en) * 2021-11-18 2022-03-15 福建省天奕网络科技有限公司 Client function configuration method and server
CN115309785A (en) * 2022-08-08 2022-11-08 北京百度网讯科技有限公司 File rule engine library generation method, file information detection method, device and equipment

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111078538B (en) * 2019-11-29 2023-06-20 杭州安恒信息技术股份有限公司 JMH-based rule automation test method
CN113821796A (en) * 2020-06-18 2021-12-21 深信服科技股份有限公司 File virus checking and killing method and device, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20010011281A1 (en) * 2001-03-12 2001-08-02 Fry Randolph A. Instant random display of electronic file through machine-readable codes on printed documents
CN1889773A (en) * 2006-07-18 2007-01-03 毛兴鹏 Mobile phone virtus examining and protecting method and system based on base station
CN101924760A (en) * 2010-08-17 2010-12-22 优视科技有限公司 Method and system for downloading executable file securely

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009151591A (en) * 2007-12-21 2009-07-09 Duaxes Corp File access control device
CN102469146B (en) * 2010-11-19 2015-11-25 北京奇虎科技有限公司 A kind of cloud security downloading method
CN102332071B (en) * 2011-09-30 2014-07-30 奇智软件(北京)有限公司 Methods and devices for discovering suspected malicious information and tracking malicious file
CN103646062A (en) * 2013-12-02 2014-03-19 北京奇虎科技有限公司 Scanning method and device for downloaded file
CN103914655A (en) * 2014-03-17 2014-07-09 北京奇虎科技有限公司 Downloaded file security detection method and device

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015139507A1 (en) * 2014-03-17 2015-09-24 北京奇虎科技有限公司 Method and apparatus for detecting security of a downloaded file
CN104102861A (en) * 2014-07-16 2014-10-15 中山大学 JPEG (joint photographic experts group) image primitiveness detection method based on file header and compressed parameter
CN104462400A (en) * 2014-12-10 2015-03-25 北京奇虎科技有限公司 Method, device and browser client for downloading files in mobile terminals
CN104462601A (en) * 2014-12-31 2015-03-25 北京奇虎科技有限公司 File scanning method, device and system
WO2016107309A1 (en) * 2014-12-31 2016-07-07 北京奇虎科技有限公司 File scanning method, device and system
CN106101086A (en) * 2016-06-02 2016-11-09 北京奇虎科技有限公司 The cloud detection method of optic of program file and system, client, cloud server
CN107465654B (en) * 2016-08-31 2020-07-31 哈尔滨广播电视台 System for distributing and safely searching and killing contents among service subnets of broadcast station
CN107465654A (en) * 2016-08-31 2017-12-12 哈尔滨广播电视台 For content distribution between each business subnet of TV and Radio Service and the system of safe killing
CN106454393A (en) * 2016-11-23 2017-02-22 天脉聚源(北京)传媒科技有限公司 Video caching method and device
CN109558548A (en) * 2017-09-25 2019-04-02 北京国双科技有限公司 A kind of method and Related product for eliminating CSS style redundancy
CN109558548B (en) * 2017-09-25 2021-05-25 北京国双科技有限公司 Method for eliminating CSS style redundancy and related product
CN110844852A (en) * 2019-11-26 2020-02-28 北谷电子有限公司上海分公司 Scissor aerial work platform and method for automatically remotely and virtually realizing full-load calibration curve
CN112580057A (en) * 2020-12-17 2021-03-30 光通天下网络科技股份有限公司 Attack vulnerability detection method, device, equipment and medium for ZIP encrypted compressed packet
CN114185610A (en) * 2021-11-18 2022-03-15 福建省天奕网络科技有限公司 Client function configuration method and server
CN115309785A (en) * 2022-08-08 2022-11-08 北京百度网讯科技有限公司 File rule engine library generation method, file information detection method, device and equipment

Also Published As

Publication number Publication date
WO2015139507A1 (en) 2015-09-24

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20140709
