CN110210218A - Method and related apparatus for virus detection - Google Patents

Publication number
CN110210218A
Authority
CN
China
Prior art keywords
rule
pattern detection
sample
detected
feature vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810402154.1A
Other languages
Chinese (zh)
Other versions
CN110210218B (en
Inventor
雷经纬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201810402154.1A
Publication of CN110210218A
Application granted
Publication of CN110210218B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50: Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55: Detecting local intrusion or implementing counter-measures
    • G06F21/56: Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562: Static detection

Abstract

An embodiment of the invention discloses a virus detection method, comprising: obtaining a target feature vector of a file to be detected; matching the target feature vector against a sample detection rule set to generate a target matching result, wherein the sample detection rule set includes a first sample detection rule and a second sample detection rule, the first sample detection rule represents the correspondence between a safe type and path information, the second sample detection rule represents the correspondence between a virus type and path information, and the path information indicates the occurrence probability of behavior identifiers; and determining a virus detection result for the file to be detected according to the target matching result. An embodiment of the invention also provides a virus detection apparatus. On the one hand, the embodiments remove the need to extract signatures (feature codes) manually; on the other hand, they can accurately identify the type of the file to be detected, which improves the security of the scheme.

Description

Method and related apparatus for virus detection
Technical Field
The present invention relates to the field of information security, and in particular to a virus detection method and related apparatus.
Background
With the development of computer and network technology, the variety of viruses keeps growing, and highly destructive, well-concealed viruses persist over long periods. A virus is a program or a piece of executable code that, like a biological virus, can replicate itself, spread from host to host, and reactivate. It can attach itself to files of various types, and when a file is copied or passed from one user to another, the virus spreads along with it.
At present, viruses are generally detected as follows: manually labeled virus samples are analyzed, binary fragments are extracted from them as signatures (feature codes), and a file to be detected is considered to carry a virus if it hits one of these signatures.
However, this approach has a drawback: because the signatures are determined in advance, a newly emerging virus is difficult to detect. In other words, the existing scheme cannot detect unknown viruses, which is detrimental to information security.
Summary of the Invention
Embodiments of the present invention provide a virus detection method and related apparatus that, on the one hand, remove the need to extract signatures manually and, on the other hand, accurately identify the type of the file to be detected, which improves the security of the scheme.
In view of this, a first aspect of the present invention provides a virus detection method, comprising:
obtaining a target feature vector of a file to be detected;
matching the target feature vector against a sample detection rule set to generate a target matching result, wherein the sample detection rule set includes a first sample detection rule and a second sample detection rule, the first sample detection rule represents the correspondence between a safe type and path information, the second sample detection rule represents the correspondence between a virus type and path information, and the path information indicates the occurrence probability of behavior identifiers;
determining a virus detection result for the file to be detected according to the target matching result.
A second aspect of the present invention provides a virus detection apparatus, comprising:
an obtaining module, configured to obtain a target feature vector of a file to be detected;
a generation module, configured to match the target feature vector obtained by the obtaining module against a sample detection rule set to generate a target matching result, wherein the sample detection rule set includes a first sample detection rule and a second sample detection rule, the first sample detection rule represents the correspondence between a safe type and path information, the second sample detection rule represents the correspondence between a virus type and path information, and the path information indicates the occurrence probability of behavior identifiers;
a determining module, configured to determine a virus detection result for the file to be detected according to the target matching result generated by the generation module.
A third aspect of the present invention provides a virus detection apparatus, comprising: a memory, a transceiver, a processor, and a bus system;
wherein the memory is configured to store a program;
the processor is configured to execute the program in the memory, including the following steps:
obtaining a target feature vector of a file to be detected;
matching the target feature vector against a sample detection rule set to generate a target matching result, wherein the sample detection rule set includes a first sample detection rule and a second sample detection rule, the first sample detection rule represents the correspondence between a safe type and path information, the second sample detection rule represents the correspondence between a virus type and path information, and the path information indicates the occurrence probability of behavior identifiers;
determining a virus detection result for the file to be detected according to the target matching result;
the bus system is configured to connect the memory and the processor so that the memory and the processor can communicate.
A fourth aspect of the present invention provides a computer-readable storage medium storing instructions that, when run on a computer, cause the computer to perform the methods described in the above aspects.
As can be seen from the above technical solutions, the embodiments of the present invention have the following advantages:
An embodiment of the present invention provides a virus detection method. A target feature vector of a file to be detected is first obtained and then matched against a sample detection rule set to generate a target matching result. The sample detection rule set includes a first sample detection rule and a second sample detection rule: the first represents the correspondence between a safe type and path information, the second represents the correspondence between a virus type and path information, and the path information indicates the occurrence probability of behavior identifiers. Finally, the virus detection result for the file to be detected is determined according to the target matching result. In this way, on the one hand, signatures no longer need to be extracted manually: matching directly against the sample detection rule set yields a matching result that indicates whether the file to be detected is safe. On the other hand, the sample detection rule set includes at least rules for detecting the safe type and the virus type, so the type of the file to be detected can be identified accurately, which improves the security of the scheme.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of the architecture of the virus detection system in an embodiment of the present invention;
Fig. 2 is a schematic diagram of the call relationships of the virus detection system in an embodiment of the present invention;
Fig. 3 is a schematic diagram of one embodiment of the virus detection method in an embodiment of the present invention;
Fig. 4 is a flow diagram of obtaining the target feature vector in an embodiment of the present invention;
Fig. 5 is a flow diagram of generating the decision tree model file in an embodiment of the present invention;
Fig. 6 is a flow diagram of generating sample detection rules in an embodiment of the present invention;
Fig. 7 is a schematic diagram of the decision tree model in an embodiment of the present invention;
Fig. 8 is a flow diagram of testing a file to be detected in an embodiment of the present invention;
Fig. 9 is a flow diagram of virus detection in an application scenario of the present invention;
Fig. 10 is a schematic diagram of one embodiment of the virus detection apparatus in an embodiment of the present invention;
Fig. 11 is a schematic structural diagram of the virus detection apparatus in an embodiment of the present invention.
Detailed Description
Embodiments of the present invention provide a virus detection method and related apparatus that, on the one hand, remove the need to extract signatures manually and, on the other hand, accurately identify the type of the file to be detected, which improves the security of the scheme.
The terms "first", "second", "third", "fourth", and the like (if present) in the description, the claims, and the above drawings are used to distinguish similar objects and do not describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described herein can, for example, be implemented in an order other than the one illustrated or described. In addition, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion: a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units explicitly listed, but may include other steps or units that are not explicitly listed or that are inherent to the process, method, product, or device.
It should be understood that the present invention is mainly applicable to the detection of Android viruses, but it can also be applied to other kinds of virus detection, such as computer virus detection, iOS (iPhone operating system) virus detection, and Microsoft Windows virus detection. This scheme is described using Android virus detection as an example. Android is released together with a set of core application packages, including a client, a Short Message Service (SMS) program, a calendar, maps, a browser, and a contacts manager.
At the same time, the Android system is exposed to Android viruses such as the "Hundred-Brain Worm" trojan (which can infect promotion-type applications), the "Lizard's Tail" trojan (which can infect system library files, replace system files, inject into system processes, steal user information, and monitor calls and text messages), and "Permission Killer" (which can fight security software, monitor text messages, display advertisements, push promotions, and inflate traffic). This scheme can detect not only these known Android viruses but also other, unknown Android viruses.
Referring to Fig. 1, which is a schematic diagram of the architecture of the virus detection system in an embodiment of the present invention, the virus detection apparatus in this scheme can be deployed on a server. After the server obtains a virus detection result, it sends the result to a terminal device, so that the user can view the virus detection result for the file to be detected on the display interface of the terminal device. Optionally, the virus detection apparatus can also be deployed on the terminal device, in which case the terminal device detects the file directly and shows the virus detection result on its front-end display interface.
The virus detection apparatus of the present invention may include four logic modules, each implementing a corresponding function. Referring to Fig. 2, which is a schematic diagram of the call relationships of the virus detection system in an embodiment of the present invention, the four logic modules are a behavior data extraction module S1, a decision tree model training module S2, a detection flow control module S3, and a rule extraction module S4. The behavior data extraction module S1 is called by both the decision tree model training module S2 and the detection flow control module S3. It can be understood that S1 may be an independent module, or may be integrated into S2 and S3 respectively. The decision tree model training module S2 takes as input a batch of Android virus samples and Android safe samples and calls S1 to obtain the vector representation of the training samples. The rule extraction module S4 then takes the output of the decision tree model and selects decision tree paths and sample types according to a preset rule generation condition, thereby generating the sample detection rule set. The detection flow control module S3 calls S1 to obtain the vector representation of the sample to be detected, and finally applies the sample detection rule set generated by S4 to obtain the safety status of the sample to be detected.
The virus detection method of the present invention is described below from the perspective of the virus detection apparatus. Referring to Fig. 3, one embodiment of the virus detection method in an embodiment of the present invention includes:
101. Obtain a target feature vector of a file to be detected.
In this embodiment, the virus detection apparatus first receives a virus detection instruction that carries an identifier of the file to be detected, from which the file can be determined. It then extracts behavior identifiers from the file and generates the target feature vector from the extraction result.
102. Match the target feature vector against a sample detection rule set to generate a target matching result, wherein the sample detection rule set includes a first sample detection rule and a second sample detection rule, the first sample detection rule represents the correspondence between a safe type and path information, the second sample detection rule represents the correspondence between a virus type and path information, and the path information indicates the occurrence probability of behavior identifiers.
In this embodiment, the virus detection apparatus matches the target feature vector against at least one rule in the sample detection rule set. Specifically, the sample detection rule set includes a first sample detection rule and a second sample detection rule. The first sample detection rule captures the correspondence between the safe type and path information; for example, the path information corresponding to the safe type is "includes behavior identifier 1, does not include behavior identifier 2, includes behavior identifier 4, and includes behavior identifier 5". The second sample detection rule captures the correspondence between the virus type and path information; for example, the path information corresponding to the virus type is "does not include behavior identifier 1, includes behavior identifier 3, does not include behavior identifier 5, and does not include behavior identifier 6". It can be understood that both correspondences are only illustrations and should not be construed as limiting this application.
The path information includes the occurrence probability of behavior identifiers: "1" can denote that a behavior identifier occurs, and "0" that it does not.
The behavior identifiers contained in the target feature vector are matched against the behavior identifiers indicated by a sample detection rule (the first or the second sample detection rule). Specifically, suppose the first sample detection rule is: safe type, with the path information "includes behavior identifier 1, does not include behavior identifier 2, includes behavior identifier 4, and includes behavior identifier 5". Suppose the target feature vector is [10011]. For ease of understanding, Table 1 below shows the behavior identifiers of the target feature vector.
Table 1
As shown in Table 1, the behavior identifiers in the target feature vector are compared with the behavior identifiers defined in the first sample detection rule. The target feature vector includes behavior identifier 1, does not include behavior identifier 2, includes behavior identifier 4, and includes behavior identifier 5. The target feature vector can therefore be considered to match the first sample detection rule, and the corresponding target matching result can be generated.
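The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the dictionary encoding of a rule (1-based behavior identifier mapped to the required value) and the names `SAFE_RULE` and `matches` are assumptions made for the example.

```python
# Hypothetical encoding (an assumption, not from the source): a rule maps a
# 1-based behavior identifier to the value required at that position,
# where 1 means "must occur" and 0 means "must not occur".
SAFE_RULE = {1: 1, 2: 0, 4: 1, 5: 1}


def matches(rule, feature_vector):
    """Return True if every condition in the rule holds for the vector."""
    return all(
        feature_vector[behavior_id - 1] == required
        for behavior_id, required in rule.items()
    )


# The example target feature vector [10011] from the text.
target_feature_vector = [1, 0, 0, 1, 1]
is_match = matches(SAFE_RULE, target_feature_vector)
```

Behavior identifier 3 is not mentioned by the rule, so its value in the vector does not affect the outcome.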
103. Determine the virus detection result for the file to be detected according to the target matching result.
In this embodiment, the virus detection apparatus determines the virus detection result for the file to be detected according to the target matching result and can send the result to a client, through which the user can learn whether the file to be detected is safe.
The virus detection result falls into one of three cases: first, the safe type, when the file matches the first sample detection rule; second, the virus type, when the file matches the second sample detection rule; and third, the unknown type, when the file matches neither the first nor the second sample detection rule.
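The three outcomes can be illustrated with the following sketch. The rule encoding, the function names, and in particular the choice to check safe rules before virus rules are assumptions for the example; the text does not say how a vector matching both kinds of rule would be handled.

```python
def matches(rule, feature_vector):
    """A rule maps a 1-based behavior identifier to its required value (1 or 0)."""
    return all(feature_vector[i - 1] == required for i, required in rule.items())


def detect(feature_vector, safe_rules, virus_rules):
    """Return 'safe', 'virus', or 'unknown' for a feature vector.

    Checking safe rules first is an arbitrary choice made for this sketch.
    """
    if any(matches(rule, feature_vector) for rule in safe_rules):
        return "safe"
    if any(matches(rule, feature_vector) for rule in virus_rules):
        return "virus"
    return "unknown"


# Example rules taken from the path information quoted in the text.
safe_rules = [{1: 1, 2: 0, 4: 1, 5: 1}]
virus_rules = [{1: 0, 3: 1, 5: 0, 6: 0}]

result_safe = detect([1, 0, 0, 1, 1, 0], safe_rules, virus_rules)
result_virus = detect([0, 0, 1, 0, 0, 0], safe_rules, virus_rules)
result_unknown = detect([1, 1, 1, 1, 1, 1], safe_rules, virus_rules)
```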
An embodiment of the present invention thus provides a virus detection method. A target feature vector of a file to be detected is first obtained and then matched against a sample detection rule set to generate a target matching result. The sample detection rule set includes a first sample detection rule and a second sample detection rule: the first represents the correspondence between a safe type and path information, the second represents the correspondence between a virus type and path information, and the path information indicates the occurrence probability of behavior identifiers. Finally, the virus detection result for the file to be detected is determined according to the target matching result. In this way, on the one hand, signatures no longer need to be extracted manually: matching directly against the sample detection rule set yields a matching result that indicates whether the file to be detected is safe. On the other hand, the sample detection rule set includes at least rules for detecting the safe type and the virus type, so the type of the file to be detected can be identified accurately, which improves the security of the scheme.
Optionally, on the basis of the embodiment corresponding to Fig. 3 above, in a first optional embodiment of the virus detection method provided by an embodiment of the present invention, obtaining the target feature vector of the file to be detected may include:
obtaining log information of the file to be detected, wherein the log information includes N behavior identifiers and N trigger times, N being an integer greater than or equal to 1;
counting the occurrence probability of each of the N behavior identifiers;
generating the target feature vector of the file to be detected according to the N trigger times and the occurrence probability of each behavior identifier.
This embodiment describes how to obtain the target feature vector of the file to be detected. The target feature vector consists of behavior identifiers arranged in order of trigger time, from earliest to latest.
Specifically, referring to Fig. 4, which is a flow diagram of obtaining the target feature vector in an embodiment of the present invention: in step 201, the file to be detected is obtained; it may be a picture file, a video file, a document file, an audio file, an application, or the like. In step 202, the file to be detected is sent to a simulator and run. The simulator may specifically be an Android simulator, which is a runtime environment with a logging function enabled. When the file runs in the simulator and triggers the execution of a function, log information is output; the log information contains two fields, a behavior identifier field and a trigger time field. In step 203, the virus detection apparatus extracts the log information produced by the run in the simulator. Finally, in step 204, the log information is converted into the target feature vector.
The following uses Table 2 to illustrate how the log information is converted into the target feature vector.
Table 2
Taking Table 2 as an example, the occurrence of each of the 11 behavior identifiers (that is, N is 11) is counted: a behavior identifier that occurs is recorded as 1, and otherwise as 0. Arranging the values from first to last yields a feature vector; the target feature vector of the file to be detected indicated by Table 2 is [1 10110111 0].
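As a rough illustration of the conversion, the sketch below turns log entries into a 0/1 vector. It rests on several assumptions that are not in the source: the `LogEntry` type, the sample behavior names, and a fixed ordered list `known_behaviors` that gives each behavior identifier a stable position so that vectors from different files are comparable. The sketch does not use the trigger times for ordering, because the passage does not fully specify how they determine the arrangement.

```python
from collections import namedtuple

# Hypothetical log record with the two fields named in the text.
LogEntry = namedtuple("LogEntry", ["behavior_id", "trigger_time"])


def build_feature_vector(log_entries, known_behaviors):
    """Record 1 for each known behavior identifier that occurs in the log, else 0."""
    seen = {entry.behavior_id for entry in log_entries}
    return [1 if behavior in seen else 0 for behavior in known_behaviors]


# Illustrative data only; these behavior names are made up for the example.
known_behaviors = ["read_contacts", "open_camera", "send_sms"]
log_entries = [
    LogEntry("send_sms", 3.0),
    LogEntry("read_contacts", 1.0),
    LogEntry("send_sms", 5.0),
]
feature_vector = build_feature_vector(log_entries, known_behaviors)
```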
It should be noted that this scheme can generate the feature vectors of safe samples and of virus samples in the same way, which is not repeated here.
Secondly, in this embodiment of the present invention, the virus detection apparatus can obtain the log information of the file to be detected, count the occurrence probability of each of the N behavior identifiers, and finally generate the target feature vector of the file according to the N trigger times and the occurrence probability of each behavior identifier. In this way, the feature vector is generated from the occurrence probabilities and trigger times of the behavior identifiers, which ties the feature vector to the behavior identifiers and provides a reliable basis for the subsequent rule matching, thereby improving the feasibility of the scheme.
Optionally, on the basis of the embodiment corresponding to Fig. 3 above, in a second optional embodiment of the virus detection method provided by an embodiment of the present invention, before the target feature vector is matched against the sample detection rule set, the method may further include:
obtaining feature vectors of safe samples and feature vectors of virus samples;
obtaining, through a decision tree model, the first sample detection rule corresponding to the feature vectors of the safe samples, wherein the decision tree model outputs path information and a sample type, the sample type being either the safe type or the virus type;
obtaining, through the decision tree model, the second sample detection rule corresponding to the feature vectors of the virus samples.
This embodiment describes how to generate the sample detection rules (the first and the second sample detection rule). Referring to Fig. 5, which is a flow diagram of generating the decision tree model file in an embodiment of the present invention: in step 301, a batch of positive samples and negative samples is obtained, where the positive samples are assumed to be safe samples and the negative samples virus samples. Then, in step 302, the feature vector of each safe sample and of each virus sample is generated using the feature vector generation method provided in the first optional embodiment corresponding to Fig. 3.
In step 303, a decision tree model is trained on the feature vectors of the safe samples and of the virus samples. Each sample has a set of attributes and a class, and the classes are determined in advance; learning then yields a classifier that can assign the correct class to newly appearing objects. The decision tree model includes judgment conditions, path information, and decision results. In step 304, a decision tree model library file is generated from the path information (for example, sample label 1: path information 1; sample label 2: path information 2). The decision tree model library file is used to generate sample detection rules: if the input is a safe sample, the rule derived from the library file may be a first sample detection rule; if the input is a virus sample, it may be a second sample detection rule.
Secondly, in this embodiment of the present invention, the virus detection apparatus can generate the first and second sample detection rules by first obtaining the feature vectors of the safe samples and of the virus samples, feeding both into the decision tree model, and determining the path information from the decision results that the model outputs. Using a decision tree model also has the following advantages. First, a decision tree is easy to understand and implement and directly reflects the characteristics of the data. Second, a decision tree can produce feasible, well-performing results on large amounts of data in a relatively short time. Third, the model is easy to evaluate through static testing, which allows its confidence to be measured.
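To make the link between a trained tree and its "sample label: path information" pairs concrete, here is a small self-contained sketch. The source does not name a particular decision tree algorithm, so the Gini-based binary splitting below, the nested-dictionary tree layout, and all function names are assumptions chosen for brevity. Feature positions are 0-based indices into the feature vector.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def build_tree(samples, features):
    """samples: list of (feature_vector, label). Returns a nested dict or a leaf label."""
    labels = [label for _, label in samples]
    if len(set(labels)) == 1 or not features:
        return majority(labels)

    def split_cost(feature):
        parts = [[lab for vec, lab in samples if vec[feature] == bit] for bit in (0, 1)]
        return sum(len(part) * gini(part) for part in parts if part)

    best = min(features, key=split_cost)
    remaining = [f for f in features if f != best]
    branches = {}
    for bit in (0, 1):
        subset = [(vec, lab) for vec, lab in samples if vec[best] == bit]
        branches[bit] = build_tree(subset, remaining) if subset else majority(labels)
    return {"feature": best, "branches": branches}


def extract_rules(tree, path=()):
    """Walk every root-to-leaf path and return (sample_label, path_info) pairs."""
    if not isinstance(tree, dict):
        return [(tree, dict(path))]
    rules = []
    for bit, child in tree["branches"].items():
        rules.extend(extract_rules(child, path + ((tree["feature"], bit),)))
    return rules


# Toy training data: two safe samples and two virus samples.
training = [([1, 0, 1], "safe"), ([1, 1, 1], "safe"),
            ([0, 1, 0], "virus"), ([0, 0, 0], "virus")]
rules = extract_rules(build_tree(training, [0, 1, 2]))
```

Each extracted pair corresponds to one leaf: the label plays the role of the sample type, and the dictionary of required feature values plays the role of the path information.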
Optionally, on the basis of the second embodiment corresponding to Fig. 3 above, in a third optional embodiment of the virus detection method provided by an embodiment of the present invention, obtaining, through the decision tree model, the first sample detection rule corresponding to the feature vectors of the safe samples may include:
inputting the feature vectors of the safe samples into the decision tree model to obtain X candidate first sample detection rules, wherein X is an integer greater than or equal to 1;
selecting, from the X candidate first sample detection rules, Y first sample detection rules that satisfy a preset rule generation condition, wherein Y is an integer greater than or equal to 1 and less than or equal to X;
and obtaining, through the decision tree model, the second sample detection rule corresponding to the feature vectors of the virus samples may include:
inputting the feature vectors of the virus samples into the decision tree model to obtain Q candidate second sample detection rules, wherein Q is an integer greater than or equal to 1;
selecting, from the Q candidate second sample detection rules, P second sample detection rules that satisfy the preset rule generation condition, wherein P is an integer greater than or equal to 1 and less than or equal to Q.
This embodiment describes how to generate the first and second sample detection rules. Specifically, the feature vectors of all safe samples are input into the decision tree model, which can output X candidate first sample detection rules. Not all of these candidates are necessarily suitable: for example, the path information of some candidates is very short, or their effective node ratio is very low. In that case, the candidates that satisfy the preset rule generation condition are selected from the X candidate first sample detection rules, and the selected candidates are the Y first sample detection rules.
Similarly, the feature vectors of all virus samples are input into the decision tree model, which can output Q candidate second sample detection rules. Not all of these candidates are necessarily suitable either: for example, the path information of some candidates is very short, or their effective node ratio is very low. In that case, the candidates that satisfy the preset rule generation condition are selected from the Q candidate second sample detection rules, and the selected candidates are the P second sample detection rules.
For ease of description, refer to Fig. 6, which is a schematic flowchart of generating sample detection rules in an embodiment of the present invention. As shown in the figure:
In step 401, the decision tree model file is obtained;
In step 402, the decision tree model file may be filtered according to the following two conditions. The first condition filters by path length: a path whose length is too short is treated as not satisfying the rule generation condition;
In step 403, the second condition filters by positive node ratio: a path whose positive node ratio is too low is treated as not satisfying the rule generation condition;
In step 404, the sample detection rule set (including the first sample detection rules and the second sample detection rules) is generated from the filtered decision tree model file (including sample labels and path information).
Furthermore, in this embodiment of the present invention, not all path information is suitable for building sample detection rules, so a "threshold" must also be set for generating them. In this way the reliability of the sample detection rules is improved, so that the type of the file to be detected can be identified accurately, which helps improve the security of the solution.
Optionally, on the basis of the third embodiment corresponding to Fig. 3 above, in a fourth optional embodiment of the virus detection method provided in the embodiments of the present invention, selecting, from the X candidate first sample detection rules, the Y first sample detection rules that satisfy the preset rule generation condition may include:
selecting, from the X candidate first sample detection rules, Y first sample detection rules whose path length is greater than a preset length threshold;
Selecting, from the Q candidate second sample detection rules, the P second sample detection rules that satisfy the preset rule generation condition includes:
selecting, from the Q candidate second sample detection rules, P second sample detection rules whose path length is greater than the preset length threshold.
This embodiment describes one method of selecting sample detection rules. For ease of description, refer to Fig. 7, which is a schematic diagram of a decision tree model in an embodiment of the present invention. As shown in the figure, Fig. 7 is a decision tree model of depth 6. Each route from the root node to a sample label (virus or safe) is one path, and there are 10 paths in total. Take a first sample detection rule as an example (the safe-label path shown in the shaded part): from the root to the label corresponding to this first sample detection rule, the path passes through 6 decision nodes (that is, the path information), which test, respectively, behavior identifier [4] > 0.35, behavior identifier [2] > 0.235, behavior identifier [1] > 0.35, behavior identifier [7] > 0.76, behavior identifier [5] > 0.65, and behavior identifier [72] > 0.75. Because each element of the feature vector is either 0 or 1, no case of unknown safety arises.
The first sample detection rule generated from these 6 decision nodes is: behavior identifier 4 is present, behavior identifier 2 is present, behavior identifier 1 is present, behavior identifier 7 is present, behavior identifier 5 is present, and behavior identifier 72 is absent.
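The conversion from a root-to-leaf path to a rule can be sketched as follows. This is a minimal illustration, not the patent's implementation: the names `PathStep`, `FIG7_SAFE_PATH`, `path_to_rule` and `describe` are hypothetical, and the branch taken at each node (in particular the "not greater" branch at behavior identifier 72) is an assumption inferred from the rule stated above.

```python
from typing import Dict, List, Tuple

# One step on a root-to-leaf path: (behavior identifier, threshold, branch taken).
# branch_taken is True when the path follows the "feature > threshold" side.
PathStep = Tuple[int, float, bool]

# The shaded safe-label path of Fig. 7; the branch directions are assumed.
FIG7_SAFE_PATH: List[PathStep] = [
    (4, 0.35, True),
    (2, 0.235, True),
    (1, 0.35, True),
    (7, 0.76, True),
    (5, 0.65, True),
    (72, 0.75, False),  # assumed: the path takes the "not greater" branch here
]


def path_to_rule(path: List[PathStep]) -> Dict[int, bool]:
    """Map each behavior identifier to True (must be present) or False (must be absent)."""
    # Features are 0 or 1 and every threshold lies strictly between them, so
    # "feature > threshold" holds exactly when the behavior is present.
    return {behavior_id: taken for behavior_id, _threshold, taken in path}


def describe(rule: Dict[int, bool]) -> str:
    """Render a rule in the wording used by the description."""
    return ", ".join(
        f"behavior identifier {bid} {'present' if required else 'absent'}"
        for bid, required in rule.items()
    )
```

Representing a rule as a mapping from behavior identifier to a required presence flag keeps the later steps (length filtering, positive-node counting, matching) down to simple dictionary operations.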
When filtering by path length, the path length may be required to be greater than or equal to 2/3 of the depth of the tree. Assuming the depth of the decision tree model is 30, the preset length threshold chosen in this solution is 30 × 2/3 = 20. It should be noted that the preset length threshold may be 2/3 of the tree depth or another reasonable value; this is only an illustration and should not be construed as a limitation of the present invention.
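The threshold arithmetic above can be checked with a short sketch. The function names are hypothetical, and the comparison uses "greater than or equal to" as in this paragraph; the stricter "greater than" wording of the optional embodiment would correspond to replacing `>=` with `>`.

```python
from fractions import Fraction


def length_threshold(tree_depth: int, fraction: Fraction = Fraction(2, 3)) -> Fraction:
    """Preset length threshold as a fraction of the tree depth (2/3 by default)."""
    return tree_depth * fraction


def passes_length_filter(path_length: int, tree_depth: int,
                         fraction: Fraction = Fraction(2, 3)) -> bool:
    """True when the path is long enough to be kept as a candidate rule."""
    # Exact rational arithmetic avoids floating-point rounding at the boundary.
    return path_length >= length_threshold(tree_depth, fraction)
```

With a tree of depth 30, `length_threshold(30)` evaluates to 20, so a path of length 20 is kept and a path of length 19 is discarded.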
Further, in this embodiment of the present invention, the sample detection rules that meet the requirement, namely the first sample detection rules and the second sample detection rules, can be selected from the candidate sample detection rules according to path length. The sample detection rule set generated in this way is more reliable: the path length is required to be greater than the preset length threshold, and path information that fails this requirement is treated as unqualified and produces no sample detection rule, which improves the feasibility and practicability of the solution.
Optionally, on the basis of the third or fourth embodiment corresponding to Fig. 3 above, in a fifth optional embodiment of the virus detection method provided in the embodiments of the present invention, selecting, from the X candidate first sample detection rules, the Y first sample detection rules that satisfy the preset rule generation condition may include:
selecting, from the X candidate first sample detection rules, Y first sample detection rules whose positive node ratio is greater than a preset ratio threshold, where the positive node ratio is the proportion of positive nodes among the total number of nodes, and a positive node is a node that contains a behavior identifier;
Selecting, from the Q candidate second sample detection rules, the P second sample detection rules that satisfy the preset rule generation condition includes:
selecting, from the Q candidate second sample detection rules, P second sample detection rules whose positive node ratio is greater than the preset ratio threshold.
Based on the third optional embodiment corresponding to Fig. 3, this embodiment provides another method of selecting sample detection rules. Specifically, for a decision tree, path information is composed of a number of conditions of the form "contains a certain behavior identifier" or "does not contain a certain behavior identifier", and a positive node is a node that contains a behavior identifier. This solution requires the positive node ratio to be greater than or equal to the preset ratio threshold. Assuming the length of a path is 20 and the preset ratio threshold is 4/5, the minimum number of positive nodes is 20 × 4/5 = 16.
It should be noted that the preset ratio threshold may be 4/5 or another reasonable value; this is only an illustration and should not be construed as a limitation of the present invention.
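The positive node ratio can be computed as in the following sketch. It assumes, for illustration only, that a rule is held as a mapping from behavior identifier to a flag that is True for a positive node ("contains the behavior identifier") and False otherwise; the function names are hypothetical.

```python
import math
from fractions import Fraction
from typing import Dict


def positive_node_ratio(rule: Dict[int, bool]) -> Fraction:
    """Share of nodes on the path that require a behavior identifier to be present."""
    if not rule:
        raise ValueError("a path must contain at least one node")
    return Fraction(sum(rule.values()), len(rule))


def min_positive_nodes(path_length: int, ratio: Fraction = Fraction(4, 5)) -> int:
    """Smallest number of positive nodes that reaches the preset ratio threshold."""
    return math.ceil(path_length * ratio)


def passes_ratio_filter(rule: Dict[int, bool], ratio: Fraction = Fraction(4, 5)) -> bool:
    """True when the rule's positive node ratio is at least the preset threshold."""
    return positive_node_ratio(rule) >= ratio
```

For a path of length 20, `min_positive_nodes(20)` gives 16, matching the figure in the text. A six-node path with five positive nodes has a ratio of 5/6 and passes the 4/5 threshold.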
Still further, in this embodiment of the present invention, the sample detection rules that meet the requirement, namely the first sample detection rules and the second sample detection rules, can be selected from the candidate sample detection rules according to the positive node ratio. The sample detection rule set generated in this way is more reliable: the positive node ratio is required to be greater than the preset ratio threshold, and path information that fails this requirement is treated as unqualified and produces no sample detection rule, which improves the feasibility and practicability of the solution.
Optionally, on the basis of the embodiment corresponding to Fig. 3 above, in a sixth optional embodiment of the virus detection method provided in the embodiments of the present invention, matching the target feature vector using the sample detection rule set to generate the target matching result may include:
determining whether the target feature vector satisfies the first sample detection rule, and if the target feature vector satisfies the first sample detection rule, generating a first matching result;
if the target feature vector does not satisfy the first sample detection rule, determining whether the target feature vector satisfies the second sample detection rule, and if the target feature vector satisfies the second sample detection rule, generating a second matching result;
if the target feature vector does not satisfy the second sample detection rule, generating a third matching result.
In this embodiment, the virus detection apparatus can match the target feature vector against the sample detection rules one after another. Assume the sample detection rule set includes a first sample detection rule and a second sample detection rule. The virus detection apparatus first determines whether the target feature vector satisfies the first sample detection rule. If it does, the first matching result is generated directly. Otherwise, the apparatus goes on to the next rule, that is, it determines whether the target feature vector satisfies the second sample detection rule, and generates the second matching result if it does. If the target feature vector satisfies neither the first sample detection rule nor the second sample detection rule, the third matching result is generated.
It can be understood that, in practical applications, the order in which the sample detection rules in the sample detection rule set are matched is not limited: the second sample detection rule may be matched first and the first sample detection rule afterwards, or vice versa.
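The sequential matching described above can be sketched as follows. This is an illustration under stated assumptions rather than the patented implementation: the target feature vector is reduced to the set of behavior identifiers that are present, each rule is a mapping from behavior identifier to a required presence flag, and the names `MatchResult`, `satisfies` and `match` are hypothetical.

```python
from enum import Enum
from typing import Dict, Iterable, Set

Rule = Dict[int, bool]  # behavior identifier -> must be present (True) or absent (False)


class MatchResult(Enum):
    FIRST = 1   # a first (safe-type) sample detection rule is satisfied
    SECOND = 2  # a second (virus-type) sample detection rule is satisfied
    THIRD = 3   # no rule is satisfied


def satisfies(present: Set[int], rule: Rule) -> bool:
    """True when every condition of the rule agrees with the observed behaviors."""
    return all((bid in present) == required for bid, required in rule.items())


def match(present: Set[int], first_rules: Iterable[Rule],
          second_rules: Iterable[Rule]) -> MatchResult:
    """Try the first rules, then the second rules, then fall through to THIRD."""
    if any(satisfies(present, rule) for rule in first_rules):
        return MatchResult.FIRST
    if any(satisfies(present, rule) for rule in second_rules):
        return MatchResult.SECOND
    return MatchResult.THIRD
```

Swapping the two `if` blocks gives the alternative order mentioned above, in which the second sample detection rules are tried before the first.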
A process for detecting the type of a file to be detected is described below with reference to Fig. 8, which is a schematic flowchart of checking a file to be detected in an embodiment of the present invention. As shown in the figure:
In step 501, a batch of positive samples and negative samples is obtained, where a positive sample may be a safe sample and a negative sample may be a virus sample. It should be noted that, in practical applications, the positive samples may instead be set as virus samples and the negative samples as safe samples; this depends on how the user has defined positive and negative samples in advance;
In step 502, the safe samples and the virus samples are each sent into the simulator, and the feature vectors of the safe samples and the feature vectors of the virus samples are generated from the log information output by the simulator;
In step 503, the feature vectors of the safe samples and the feature vectors of the virus samples are input into the decision tree model for training;
In step 504, a model library file, namely the decision tree model file, is obtained after model training. The decision tree model file can be stored and copied so that subsequent detection can call it, and can be understood here as a kind of configuration file;
In step 505, each piece of path information is filtered according to the decision tree model file. The filtering may be as follows: if the path length is greater than or equal to 2/3 of the decision tree depth, and the number of positive nodes is greater than or equal to 4/5 of the path length, the path information can be accepted as a sample detection rule;
In step 506, the sample detection rules generated in step 505 are classified to obtain the sample detection rule set;
In step 507, the file to be detected is obtained;
In step 508, the target feature vector of the file to be detected is extracted and matched against each sample detection rule contained in the sample detection rule set;
In step 509, it is determined whether the target feature vector matches a sample detection rule contained in the sample detection rule set: if it matches a virus sample detection rule, the process proceeds to step 511; if it does not match a virus sample detection rule, the process proceeds to step 510;
In step 510, it is determined whether the target feature vector matches a sample detection rule contained in the sample detection rule set: if it matches a safe sample detection rule, the process proceeds to step 513; if it does not match a safe sample detection rule, the process proceeds to step 512;
In step 511, the file to be detected is determined to be a virus file;
In step 512, the safety of the file to be detected cannot be determined, or the file to be detected is regarded as a safe file;
In step 513, the file to be detected is determined to be a safe file.
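Steps 505 and 506 above, filtering the labelled paths and sorting the survivors into a rule set, can be sketched as follows. The data layout is an assumption made for illustration: each path is a pair of a label ("safe" or "virus") and a mapping from behavior identifier to a flag that is True for a positive node. The name `build_rule_set` is hypothetical.

```python
from fractions import Fraction
from typing import Dict, Iterable, List, Tuple

Rule = Dict[int, bool]           # behavior identifier -> positive node (True) or not (False)
LabelledPath = Tuple[str, Rule]  # (sample label, path conditions); label is "safe" or "virus"


def build_rule_set(paths: Iterable[LabelledPath], tree_depth: int,
                   min_length: Fraction = Fraction(2, 3),
                   min_positive: Fraction = Fraction(4, 5)) -> Dict[str, List[Rule]]:
    """Keep paths that pass both filters of step 505 and group them by label (step 506)."""
    rule_set: Dict[str, List[Rule]] = {"safe": [], "virus": []}
    for label, rule in paths:
        length = len(rule)
        positives = sum(rule.values())
        long_enough = length >= tree_depth * min_length
        positive_enough = positives >= length * min_positive
        if long_enough and positive_enough:
            rule_set[label].append(rule)
    return rule_set
```

With a tree of depth 6, a six-node safe path with five positive nodes is kept (6 ≥ 4 and 5 ≥ 4.8), while a three-node path is dropped for being too short and a five-node path with only two positive nodes is dropped for its low ratio.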
Secondly, in this embodiment of the present invention, the virus detection apparatus can match the target feature vector against the rules in the sample detection rule set. If a rule does not match, matching continues with the next rule until a matching result is obtained or it is determined that no rule matches. In this way the matching result of the file to be detected can be obtained accurately, which improves the reliability of the solution.
Optionally, on the basis of the sixth embodiment corresponding to Fig. 3 above, in a seventh optional embodiment of the virus detection method provided in the embodiments of the present invention, determining the virus detection result of the file to be detected according to the target matching result may include:
if the target matching result is the first matching result, determining that the file to be detected is a safe file;
if the target matching result is the second matching result, determining that the file to be detected is a virus file;
if the target matching result is the third matching result, determining that the file to be detected is a file of unknown safety.
In this embodiment, the virus detection apparatus can learn the type of the file to be detected from the virus detection result generated with the sample detection rule set.
Specifically, if the target feature vector of the file to be detected matches the first sample detection rule, the target matching result is determined to be the first matching result, and the file to be detected can be determined to be of the safe type, which facilitates subsequent operations. If the target feature vector of the file to be detected matches the second sample detection rule, the target matching result is determined to be the second matching result, and the file to be detected can be determined to be of the virus type; files of the virus type usually need to be quarantined. If the target feature vector of the file to be detected matches neither the first sample detection rule nor the second sample detection rule, the file to be detected is considered to be of unknown safety, that is, it is treated as a suspicious file.
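The mapping from matching result to detection result can be written as a small lookup table. The sketch below is illustrative only; the enumeration names and the helper `should_quarantine` are hypothetical, and the quarantine policy simply mirrors the remark above that virus-type files usually need to be quarantined.

```python
from enum import Enum


class MatchResult(Enum):
    FIRST = 1
    SECOND = 2
    THIRD = 3


class Verdict(Enum):
    SAFE = "safe file"
    VIRUS = "virus file"
    UNKNOWN = "file of unknown safety"


# First result -> safe, second -> virus, third -> unknown (suspicious).
VERDICT_FOR_RESULT = {
    MatchResult.FIRST: Verdict.SAFE,
    MatchResult.SECOND: Verdict.VIRUS,
    MatchResult.THIRD: Verdict.UNKNOWN,
}


def verdict_for(result: MatchResult) -> Verdict:
    """Translate a target matching result into a virus detection result."""
    return VERDICT_FOR_RESULT[result]


def should_quarantine(verdict: Verdict) -> bool:
    """Only files determined to be of the virus type are quarantined in this sketch."""
    return verdict is Verdict.VIRUS
```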
Furthermore, in this embodiment of the present invention, the virus detection apparatus determines the type of the file to be detected according to the target matching result: the first matching result indicates that the file to be detected is a safe file, the second matching result indicates that it is a virus file, and the third matching result indicates that it is a file of unknown safety. In this way the type of the file to be detected can be identified accurately: not only can the virus type be determined, but the safe type and the unknown-safety type can also be distinguished, which improves the practicability and security of the solution.
For ease of understanding, the virus detection process is described below with reference to Fig. 9, which is a schematic flowchart of virus detection in an application scenario of the present invention. As shown in the figure:
In step 601, virus detection starts;
In step 602, a batch of safe samples and virus samples is selected for generating the sample detection rule set (the first sample detection rules and the second sample detection rules);
In step 603, a file to be detected is selected;
Step 604 may be divided into four sub-steps. In step 6041, the safe samples, the virus samples and the file to be detected are obtained. In step 6042, the safe samples, the virus samples and the file to be detected are sent into the simulator and run. In step 6043, the log information of the safe samples, the log information of the virus samples and the log information of the file to be detected are extracted from the simulator. Finally, in step 6044, the log information of the safe samples is converted into the feature information of the safe samples, the log information of the virus samples is converted into the feature information of the virus samples, and the log information of the file to be detected is converted into the target feature vector of the file to be detected;
In step 605, the feature information of the virus samples and the feature information of the safe samples are input into the decision tree model for training;
In step 606, the decision tree model library file obtained by training is filtered; the main purpose of filtering is to select the sample detection rules that satisfy the rule generation condition. The decision tree model library file can be stored and copied so that subsequent virus detection can call it, and can be understood here as a kind of configuration file;
In step 607, the target feature vector of the file to be detected is extracted;
In step 608, the target feature vector of the file to be detected is matched against the sample detection rules;
In step 609, it is determined whether the target feature vector matches a sample detection rule contained in the sample detection rule set: if it matches a virus sample detection rule, the process proceeds to step 611; if it does not match a virus sample detection rule, the process proceeds to step 610;
In step 610, it is determined whether the target feature vector matches a sample detection rule contained in the sample detection rule set: if it matches a safe sample detection rule, the process proceeds to step 613; if it does not match a safe sample detection rule, the process proceeds to step 612;
In step 611, the file to be detected is determined to be a virus file;
In step 612, the safety of the file to be detected cannot be determined;
In step 613, the file to be detected is determined to be a safe file.
The virus detection apparatus of the present invention is described in detail below. Refer to Fig. 10, which is a schematic diagram of one embodiment of the virus detection apparatus in an embodiment of the present invention. The virus detection apparatus 70 includes:
an obtaining module 701, configured to obtain the target feature vector of the file to be detected;
a generation module 702, configured to match, using the sample detection rule set, the target feature vector obtained by the obtaining module 701, to generate a target matching result, where the sample detection rule set contains a first sample detection rule and a second sample detection rule, the first sample detection rule represents the correspondence between the safe type and path information, the second sample detection rule represents the correspondence between the virus type and path information, and the path information indicates the occurrence probability of behavior identifiers;
a determining module 703, configured to determine the virus detection result of the file to be detected according to the target matching result generated by the generation module 702.
In this embodiment, the obtaining module 701 obtains the target feature vector of the file to be detected; the generation module 702 matches the target feature vector obtained by the obtaining module 701 using the sample detection rule set, to generate a target matching result, where the sample detection rule set contains a first sample detection rule and a second sample detection rule, the first sample detection rule represents the correspondence between the safe type and path information, the second sample detection rule represents the correspondence between the virus type and path information, and the path information indicates the occurrence probability of behavior identifiers; and the determining module 703 determines the virus detection result of the file to be detected according to the target matching result generated by the generation module 702.
This embodiment of the present invention provides a virus detection apparatus. The apparatus first obtains the target feature vector of the file to be detected, then matches the target feature vector using the sample detection rule set to generate a target matching result, where the sample detection rule set contains a first sample detection rule and a second sample detection rule, the first sample detection rule represents the correspondence between the safe type and path information, the second sample detection rule represents the correspondence between the virus type and path information, and the path information indicates the occurrence probability of behavior identifiers. Finally, it determines the virus detection result of the file to be detected according to the target matching result. In this way, on the one hand, the manual extraction of signatures can be dispensed with: the matching result of the file to be detected is obtained directly by matching against the sample detection rule set, and this result indicates the safety of the file. On the other hand, the sample detection rule set contains at least rules for detecting the safe type and the virus type, so the type of the file to be detected can be identified accurately, which helps improve the security of the solution.
Optionally, on the basis of the embodiment corresponding to Fig. 10 above, in another embodiment of the virus detection apparatus provided in the embodiments of the present invention,
the obtaining module 701 is specifically configured to obtain the log information of the file to be detected, where the log information includes N behavior identifiers and N trigger times, and N is an integer greater than or equal to 1;
count the occurrence probability of each behavior identifier among the N behavior identifiers;
and generate the target feature vector of the file to be detected according to the N trigger times and the occurrence probability of each behavior identifier.
Secondly, in this embodiment of the present invention, the virus detection apparatus can obtain the log information of the file to be detected, then count the occurrence probability of each behavior identifier among the N behavior identifiers, and finally generate the target feature vector of the file to be detected according to the N trigger times and the occurrence probability of each behavior identifier. In this way the feature vector is generated from the occurrence probabilities and trigger times of the behavior identifiers, so that the feature vector is tied to the behavior identifiers. This gives the subsequent rule matching a reliable basis and improves the feasibility of the solution.
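Counting the occurrence probability of each behavior identifier can be sketched as follows. The log layout (a list of behavior identifier and trigger time pairs), the `vocabulary` parameter and the function names are assumptions made for illustration. This passage does not say how the trigger times enter the feature vector, so the sketch carries them in the log entries but leaves them out of the vector.

```python
from collections import Counter
from typing import Dict, List, Tuple

LogEntry = Tuple[int, float]  # (behavior identifier, trigger time); layout is assumed


def occurrence_probabilities(log: List[LogEntry]) -> Dict[int, float]:
    """Relative frequency of each behavior identifier among the N log entries."""
    if not log:
        raise ValueError("the log must contain at least one entry (N >= 1)")
    counts = Counter(behavior_id for behavior_id, _time in log)
    total = len(log)
    return {behavior_id: count / total for behavior_id, count in counts.items()}


def feature_vector(log: List[LogEntry], vocabulary: List[int]) -> List[float]:
    """One slot per known behavior identifier; unseen identifiers get 0.0."""
    probabilities = occurrence_probabilities(log)
    return [probabilities.get(behavior_id, 0.0) for behavior_id in vocabulary]
```

For a log in which behavior identifier 4 appears twice and identifiers 2 and 7 once each, the probabilities are 0.5, 0.25 and 0.25, and any identifier in the vocabulary that never fires contributes 0.0.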
Optionally, on the basis of the embodiment corresponding to Fig. 10 above, in another embodiment of the virus detection apparatus provided in the embodiments of the present invention,
the obtaining module 701 is further configured to, before the generation module 702 matches the target feature vector using the sample detection rule set, obtain the feature vector of the safe sample and the feature vector of the virus sample;
obtain, through the decision tree model, the first sample detection rule corresponding to the feature vector of the safe sample, where the decision tree model is used to output the path information and the sample type, and the sample type includes the safe type and the virus type;
and obtain, through the decision tree model, the second sample detection rule corresponding to the feature vector of the virus sample.
Secondly, in this embodiment of the present invention, the virus detection apparatus may generate the first sample detection rule and the second sample detection rule as follows: it first obtains the feature vector of the safe sample and the feature vector of the virus sample, then inputs both into the decision tree model and determines the path information from the output decision result, thereby generating the sample detection rules. Using a decision tree model in this way also has the following advantages. First, a decision tree is easy to understand and implement, and directly reflects the characteristics of the data. Second, a decision tree can produce feasible and effective results on large amounts of data in a relatively short time. Third, the model is easy to evaluate through static testing, so its credibility can be assessed.
Optionally, on the basis of the embodiment corresponding to Fig. 10 above, in another embodiment of the virus detection apparatus provided in the embodiments of the present invention,
the obtaining module 701 is specifically configured to input the feature vector of the safe sample into the decision tree model to obtain X candidate first sample detection rules, where X is an integer greater than or equal to 1;
select, from the X candidate first sample detection rules, Y first sample detection rules that satisfy the preset rule generation condition, where Y is an integer greater than or equal to 1 and less than or equal to X;
input the feature vector of the virus sample into the decision tree model to obtain Q candidate second sample detection rules, where Q is an integer greater than or equal to 1;
and select, from the Q candidate second sample detection rules, P second sample detection rules that satisfy the preset rule generation condition, where P is an integer greater than or equal to 1 and less than or equal to Q.
Furthermore, in this embodiment of the present invention, not all path information is suitable for building sample detection rules, so a "threshold" must also be set for generating them. In this way the reliability of the sample detection rules is improved, so that the type of the file to be detected can be identified accurately, which helps improve the security of the solution.
Optionally, on the basis of the embodiment corresponding to Fig. 10 above, in another embodiment of the virus detection apparatus provided in the embodiments of the present invention,
the obtaining module 701 is specifically configured to select, from the X candidate first sample detection rules, the Y first sample detection rules whose path length is greater than a preset length threshold;
and select, from the Q candidate second sample detection rules, the P second sample detection rules whose path length is greater than the preset length threshold.
Further, in this embodiment of the present invention, the sample detection rules that meet the requirement, namely the first sample detection rules and the second sample detection rules, can be selected from the candidate sample detection rules according to path length. The sample detection rule set generated in this way is more reliable: the path length is required to be greater than the preset length threshold, and path information that fails this requirement is treated as unqualified and produces no sample detection rule, which improves the feasibility and practicability of the solution.
Optionally, on the basis of the embodiment corresponding to Fig. 10 above, in another embodiment of the virus detection apparatus provided in the embodiments of the present invention,
the obtaining module 701 is specifically configured to select, from the X candidate first sample detection rules, the Y first sample detection rules whose positive node ratio is greater than a preset ratio threshold, where the positive node ratio is the proportion of positive nodes among the total number of nodes, and a positive node is a node that contains a behavior identifier;
and select, from the Q candidate second sample detection rules, the P second sample detection rules whose positive node ratio is greater than the preset ratio threshold.
Still further, in this embodiment of the present invention, the sample detection rules that meet the requirement, namely the first sample detection rules and the second sample detection rules, can be selected from the candidate sample detection rules according to the positive node ratio. The sample detection rule set generated in this way is more reliable: the positive node ratio is required to be greater than the preset ratio threshold, and path information that fails this requirement is treated as unqualified and produces no sample detection rule, which improves the feasibility and practicability of the solution.
Optionally, on the basis of the embodiment corresponding to Fig. 10 above, in another embodiment of the virus detection apparatus provided in the embodiments of the present invention,
the generation module 702 is configured to determine whether the target feature vector satisfies the first sample detection rule, and if the target feature vector satisfies the first sample detection rule, generate a first matching result;
if the target feature vector does not satisfy the first sample detection rule, determine whether the target feature vector satisfies the second sample detection rule, and if the target feature vector satisfies the second sample detection rule, generate a second matching result;
and if the target feature vector does not satisfy the second sample detection rule, generate a third matching result.
Secondly, in this embodiment of the present invention, the virus detection apparatus can match the target feature vector against the rules in the sample detection rule set. If a rule does not match, matching continues with the next rule until a matching result is obtained or it is determined that no rule matches. In this way the matching result of the file to be detected can be obtained accurately, which improves the reliability of the solution.
Optionally, on the basis of the embodiment corresponding to Fig. 10 above, in another embodiment of the virus detection apparatus provided in the embodiments of the present invention,
the determining module 703 is configured to determine that the file to be detected is a safe file if the target matching result is the first matching result;
determine that the file to be detected is a virus file if the target matching result is the second matching result;
and determine that the file to be detected is a file of unknown safety if the target matching result is the third matching result.
Furthermore, in this embodiment of the present invention, the virus detection apparatus determines the type of the file to be detected according to the target matching result: the first matching result indicates that the file to be detected is a safe file, the second matching result indicates that it is a virus file, and the third matching result indicates that it is a file of unknown safety. In this way the type of the file to be detected can be identified accurately: not only can the virus type be determined, but the safe type and the unknown-safety type can also be distinguished, which improves the practicability and security of the solution.
Fig. 11 is a schematic structural diagram of a server provided in an embodiment of the present invention. The server 800 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPU) 822 (for example, one or more processors), memory 832, and one or more storage media 830 (for example, one or more mass storage devices) storing application programs 842 or data 844. The memory 832 and the storage medium 830 may be transient storage or persistent storage. The program stored in the storage medium 830 may include one or more modules (not marked in the figure), and each module may include a series of instruction operations on the server. Further, the central processing unit 822 may be configured to communicate with the storage medium 830 and to execute, on the server 800, the series of instruction operations in the storage medium 830.
The server 800 may also include one or more power supplies 826, one or more wired or wireless network interfaces 850, one or more input/output interfaces 858, and/or one or more operating systems 841, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
The steps performed by the server in the above embodiments may be based on the server structure shown in Fig. 11.
The CPU 822 is configured to execute the following steps:
Obtain a target feature vector of a file to be detected;
Match the target feature vector using a sample detection rule set to generate a target matching result, wherein the sample detection rule set includes a first sample detection rule and a second sample detection rule, the first sample detection rule represents a correspondence between a safe type and path information, the second sample detection rule represents a correspondence between a virus type and path information, and the path information indicates the occurrence probability of a behavior identifier;
Determine a virus detection result of the file to be detected according to the target matching result.
Optionally, the CPU 822 is specifically configured to execute the following steps:
Obtain log information of the file to be detected, wherein the log information includes N behavior identifiers and N trigger times, N being an integer greater than or equal to 1;
Count the occurrence probability of each behavior identifier among the N behavior identifiers;
Generate the target feature vector of the file to be detected according to the N trigger times and the occurrence probability of each behavior identifier.
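The feature-vector steps above can be illustrated with a minimal Python sketch. The passage does not say how the trigger times enter the vector, so the layout used here (one occurrence probability per behavior identifier, followed by one first-trigger offset per behavior identifier) is an assumption, as are the function name `build_feature_vector`, the fixed `vocabulary` of behavior identifiers, and the example behavior names.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def build_feature_vector(log: Sequence[Tuple[str, float]],
                         vocabulary: Sequence[str]) -> List[float]:
    """Turn N (behavior identifier, trigger time) records into a feature vector.

    Assumed layout: occurrence probabilities for every identifier in
    `vocabulary`, then the offset of each identifier's first trigger from the
    earliest record (-1.0 if the identifier never occurs).
    """
    if not log:
        raise ValueError("the log must contain at least one record (N >= 1)")
    n = len(log)
    counts = Counter(behavior for behavior, _ in log)
    # Occurrence probability = occurrences of the identifier / N.
    probabilities = [counts[b] / n for b in vocabulary]
    start = min(t for _, t in log)
    first_seen = {}
    for behavior, t in log:
        first_seen[behavior] = min(t, first_seen.get(behavior, t))
    offsets = [first_seen[b] - start if b in first_seen else -1.0
               for b in vocabulary]
    return probabilities + offsets


# Hypothetical log of N = 4 records.
example_log = [("create_file", 0.0), ("write_registry", 1.5),
               ("create_file", 2.0), ("connect_network", 4.0)]
example_vocabulary = ["create_file", "write_registry",
                      "connect_network", "delete_file"]
example_vector = build_feature_vector(example_log, example_vocabulary)
```

Fixing the vocabulary order keeps vectors from different files comparable position by position, which a rule learned over probabilities would need.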
Optionally, the CPU 822 is further configured to execute the following steps:
Obtain the feature vector of a safe sample and the feature vector of a virus sample;
Obtain, through a decision tree model, the first sample detection rule corresponding to the feature vector of the safe sample, wherein the decision tree model is used to output the path information and a sample type, and the sample type includes the safe type and the virus type;
Obtain, through the decision tree model, the second sample detection rule corresponding to the feature vector of the virus sample.
Optionally, the CPU 822 is specifically configured to execute the following steps:
Input the feature vector of the safe sample into the decision tree model to obtain X candidate first sample detection rules, where X is an integer greater than or equal to 1;
Select, from the X candidate first sample detection rules, Y first sample detection rules that satisfy a preset rule generation condition, where Y is an integer greater than or equal to 1 and less than or equal to X;
Input the feature vector of the virus sample into the decision tree model to obtain Q candidate second sample detection rules, where Q is an integer greater than or equal to 1;
Select, from the Q candidate second sample detection rules, P second sample detection rules that satisfy the preset rule generation condition, where P is an integer greater than or equal to 1 and less than or equal to Q.
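One way to picture how candidate rules could come out of a decision tree is to read every root-to-leaf path as one rule. The Python sketch below does this for a small hand-built tree; it is not the patent's model. The `Node` class, the threshold-on-probability tests, the `candidate_rules` function, and the behavior names are all hypothetical, and the tree is assumed to be already trained.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# One condition on a path: (behavior identifier, threshold, True if the
# occurrence probability must exceed the threshold).
Condition = Tuple[str, float, bool]


@dataclass
class Node:
    behavior: Optional[str] = None   # identifier tested here; None at a leaf
    threshold: float = 0.0
    low: Optional["Node"] = None     # branch for probability <= threshold
    high: Optional["Node"] = None    # branch for probability > threshold
    label: Optional[str] = None      # "safe" or "virus" at a leaf


def candidate_rules(node: Node, prefix: Tuple[Condition, ...] = ()):
    """Return (path conditions, sample type) for every root-to-leaf path."""
    if node.label is not None:
        return [(list(prefix), node.label)]
    low_step = (node.behavior, node.threshold, False)
    high_step = (node.behavior, node.threshold, True)
    return (candidate_rules(node.low, prefix + (low_step,))
            + candidate_rules(node.high, prefix + (high_step,)))


# Hypothetical tree: registry writes followed by network access look viral.
tree = Node("write_registry", 0.2,
            low=Node(label="safe"),
            high=Node("connect_network", 0.1,
                      low=Node(label="safe"),
                      high=Node(label="virus")))

rules = candidate_rules(tree)
safe_candidates = [path for path, label in rules if label == "safe"]    # X rules
virus_candidates = [path for path, label in rules if label == "virus"]  # Q rules
```

In this toy tree X = 2 and Q = 1; the selection of Y and P rules from these candidates is a separate filtering step.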
Optionally, the CPU 822 is specifically configured to execute the following steps:
Select, from the X candidate first sample detection rules, the Y first sample detection rules whose path length is greater than a preset length threshold;
Select, from the Q candidate second sample detection rules, the P second sample detection rules whose path length is greater than the preset length threshold.
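The path-length condition can be sketched as a simple filter. The example below assumes that a rule is stored as a list of path conditions and that "path length" means the number of conditions on the path; both are illustrative assumptions, as are the name `select_by_path_length` and the sample rules.

```python
from typing import List, Tuple

Condition = Tuple[str, float, bool]  # (behavior identifier, threshold, exceeds?)
Rule = List[Condition]


def select_by_path_length(candidates: List[Rule], length_threshold: int) -> List[Rule]:
    """Keep only candidate rules whose path is longer than the threshold."""
    return [rule for rule in candidates if len(rule) > length_threshold]


# Hypothetical candidates with path lengths 1, 2 and 3.
candidates = [
    [("create_file", 0.3, True)],
    [("write_registry", 0.2, True), ("connect_network", 0.1, True)],
    [("write_registry", 0.2, True), ("connect_network", 0.1, True),
     ("delete_file", 0.05, False)],
]
selected = select_by_path_length(candidates, length_threshold=1)
```

The same function would serve both the X candidate first rules and the Q candidate second rules, since the passage applies one preset length threshold to both.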
Optionally, the CPU 822 is specifically configured to execute the following steps:
Select, from the X candidate first sample detection rules, the Y first sample detection rules whose positive node ratio is greater than a preset ratio threshold, wherein the positive node ratio is the proportion of positive nodes in the total number of nodes, and a positive node is a node that contains a behavior identifier;
Select, from the Q candidate second sample detection rules, the P second sample detection rules whose positive node ratio is greater than the preset ratio threshold.
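The positive-node-ratio condition can be sketched in the same style. Here a node on a rule's path is treated as positive when its condition requires the behavior identifier to be present (probability above the node's threshold); that reading of "a node that contains a behavior identifier", along with the names `positive_node_ratio` and `select_by_positive_ratio`, is an assumption made for illustration.

```python
from typing import List, Tuple

Condition = Tuple[str, float, bool]  # (behavior identifier, threshold, present?)
Rule = List[Condition]


def positive_node_ratio(rule: Rule) -> float:
    """Share of nodes on the path that require the behavior to be present."""
    positive = sum(1 for _, _, present in rule if present)
    return positive / len(rule)


def select_by_positive_ratio(candidates: List[Rule], ratio_threshold: float) -> List[Rule]:
    """Keep non-empty candidate rules whose positive node ratio exceeds the threshold."""
    return [rule for rule in candidates
            if rule and positive_node_ratio(rule) > ratio_threshold]


# Hypothetical candidates with ratios 1.0, 0.5 and 0.0.
candidates = [
    [("write_registry", 0.2, True), ("connect_network", 0.1, True)],
    [("write_registry", 0.2, True), ("delete_file", 0.05, False)],
    [("create_file", 0.3, False)],
]
selected = select_by_positive_ratio(candidates, ratio_threshold=0.5)
```

Because the comparison is strict, a rule whose ratio equals the threshold (the second candidate here) is not selected.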
Optionally, the CPU 822 is specifically configured to execute the following steps:
Judge whether the target feature vector satisfies the first sample detection rule, and if the target feature vector satisfies the first sample detection rule, generate the first matching result;
If the target feature vector does not satisfy the first sample detection rule, judge whether the target feature vector satisfies the second sample detection rule, and if the target feature vector satisfies the second sample detection rule, generate the second matching result;
If the target feature vector does not satisfy the second sample detection rule, generate the third matching result.
Optionally, the CPU 822 is specifically configured to execute the following steps:
If the target matching result is the first matching result, determine that the file to be detected is a safe file;
If the target matching result is the second matching result, determine that the file to be detected is a virus file;
If the target matching result is the third matching result, determine that the file to be detected is a file of unknown safety.
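The matching cascade and the verdict mapping described above can be sketched together in Python. The sketch assumes the target feature vector is available as a mapping from behavior identifier to occurrence probability and that a rule is a list of threshold conditions; the names `MatchResult`, `satisfies`, and `match`, and the example rules, are hypothetical.

```python
from enum import Enum
from typing import Dict, List, Tuple

Condition = Tuple[str, float, bool]  # (behavior identifier, threshold, exceeds?)
Rule = List[Condition]


class MatchResult(Enum):
    # The value of each member is the verdict it maps to.
    FIRST = "safe file"
    SECOND = "virus file"
    THIRD = "file of unknown safety"


def satisfies(features: Dict[str, float], rule: Rule) -> bool:
    """True if every condition on the rule's path holds for the features."""
    return all((features.get(behavior, 0.0) > threshold) == exceeds
               for behavior, threshold, exceeds in rule)


def match(features: Dict[str, float],
          safe_rules: List[Rule], virus_rules: List[Rule]) -> MatchResult:
    # Safe rules are tried first; virus rules only if no safe rule matches.
    if any(satisfies(features, rule) for rule in safe_rules):
        return MatchResult.FIRST
    if any(satisfies(features, rule) for rule in virus_rules):
        return MatchResult.SECOND
    return MatchResult.THIRD


# Hypothetical rule set and target features.
safe_rules = [[("write_registry", 0.2, False)]]
virus_rules = [[("write_registry", 0.2, True), ("connect_network", 0.1, True)]]
result = match({"write_registry": 0.5, "connect_network": 0.3},
               safe_rules, virus_rules)
verdict = result.value
```

The order of the checks matters: a file that satisfies a first sample detection rule is reported as safe without being tested against the second sample detection rules, and a file that satisfies neither kind of rule falls through to the unknown-safety verdict.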
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system, device, and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided by the present invention, it should be understood that the disclosed system, device, and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements of some of the technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (15)

1. A virus detection method, characterized by comprising:
obtaining a target feature vector of a file to be detected;
matching the target feature vector using a sample detection rule set to generate a target matching result, wherein the sample detection rule set includes a first sample detection rule and a second sample detection rule, the first sample detection rule represents a correspondence between a safe type and path information, the second sample detection rule represents a correspondence between a virus type and path information, and the path information indicates the occurrence probability of a behavior identifier;
determining a virus detection result of the file to be detected according to the target matching result.
2. The method according to claim 1, characterized in that obtaining the target feature vector of the file to be detected comprises:
obtaining log information of the file to be detected, wherein the log information includes N behavior identifiers and N trigger times, N being an integer greater than or equal to 1;
counting the occurrence probability of each behavior identifier among the N behavior identifiers;
generating the target feature vector of the file to be detected according to the N trigger times and the occurrence probability of each behavior identifier.
3. The method according to claim 1, characterized in that, before matching the target feature vector using the sample detection rule set, the method further comprises:
obtaining the feature vector of a safe sample and the feature vector of a virus sample;
obtaining, through a decision tree model, the first sample detection rule corresponding to the feature vector of the safe sample, wherein the decision tree model is used to output the path information and a sample type, and the sample type includes the safe type and the virus type;
obtaining, through the decision tree model, the second sample detection rule corresponding to the feature vector of the virus sample.
4. The method according to claim 3, characterized in that obtaining, through the decision tree model, the first sample detection rule corresponding to the feature vector of the safe sample comprises:
inputting the feature vector of the safe sample into the decision tree model to obtain X candidate first sample detection rules, where X is an integer greater than or equal to 1;
selecting, from the X candidate first sample detection rules, Y first sample detection rules that satisfy a preset rule generation condition, where Y is an integer greater than or equal to 1 and less than or equal to X;
and obtaining, through the decision tree model, the second sample detection rule corresponding to the feature vector of the virus sample comprises:
inputting the feature vector of the virus sample into the decision tree model to obtain Q candidate second sample detection rules, where Q is an integer greater than or equal to 1;
selecting, from the Q candidate second sample detection rules, P second sample detection rules that satisfy the preset rule generation condition, where P is an integer greater than or equal to 1 and less than or equal to Q.
5. The method according to claim 4, characterized in that selecting, from the X candidate first sample detection rules, the Y first sample detection rules that satisfy the preset rule generation condition comprises:
selecting, from the X candidate first sample detection rules, the Y first sample detection rules whose path length is greater than a preset length threshold;
and selecting, from the Q candidate second sample detection rules, the P second sample detection rules that satisfy the preset rule generation condition comprises:
selecting, from the Q candidate second sample detection rules, the P second sample detection rules whose path length is greater than the preset length threshold.
6. The method according to claim 4 or 5, characterized in that selecting, from the X candidate first sample detection rules, the Y first sample detection rules that satisfy the preset rule generation condition comprises:
selecting, from the X candidate first sample detection rules, the Y first sample detection rules whose positive node ratio is greater than a preset ratio threshold, wherein the positive node ratio is the proportion of positive nodes in the total number of nodes, and a positive node is a node that contains a behavior identifier;
and selecting, from the Q candidate second sample detection rules, the P second sample detection rules that satisfy the preset rule generation condition comprises:
selecting, from the Q candidate second sample detection rules, the P second sample detection rules whose positive node ratio is greater than the preset ratio threshold.
7. The method according to claim 1, characterized in that matching the target feature vector using the sample detection rule set to generate the target matching result comprises:
if the target feature vector satisfies the first sample detection rule, generating a first matching result;
if the target feature vector does not satisfy the first sample detection rule, judging whether the target feature vector satisfies the second sample detection rule, and if the target feature vector satisfies the second sample detection rule, generating a second matching result;
if the target feature vector does not satisfy the second sample detection rule, generating a third matching result.
8. The method according to claim 7, characterized in that determining the virus detection result of the file to be detected according to the target matching result comprises:
if the target matching result is the first matching result, determining that the file to be detected is a safe file;
if the target matching result is the second matching result, determining that the file to be detected is a virus file;
if the target matching result is the third matching result, determining that the file to be detected is a file of unknown safety.
9. A virus detection device, characterized by comprising:
an obtaining module, configured to obtain a target feature vector of a file to be detected;
a generation module, configured to match, using a sample detection rule set, the target feature vector obtained by the obtaining module to generate a target matching result, wherein the sample detection rule set includes a first sample detection rule and a second sample detection rule, the first sample detection rule represents a correspondence between a safe type and path information, the second sample detection rule represents a correspondence between a virus type and path information, and the path information indicates the occurrence probability of a behavior identifier;
a determining module, configured to determine a virus detection result of the file to be detected according to the target matching result generated by the generation module.
10. The virus detection device according to claim 9, characterized in that
the obtaining module is specifically configured to obtain log information of the file to be detected, wherein the log information includes N behavior identifiers and N trigger times, N being an integer greater than or equal to 1;
count the occurrence probability of each behavior identifier among the N behavior identifiers;
and generate the target feature vector of the file to be detected according to the N trigger times and the occurrence probability of each behavior identifier.
11. The virus detection device according to claim 9, characterized in that
the obtaining module is further configured to obtain, before the generation module matches the target feature vector using the sample detection rule set, the feature vector of a safe sample and the feature vector of a virus sample;
obtain, through a decision tree model, the first sample detection rule corresponding to the feature vector of the safe sample, wherein the decision tree model is used to output the path information and a sample type, and the sample type includes the safe type and the virus type;
and obtain, through the decision tree model, the second sample detection rule corresponding to the feature vector of the virus sample.
12. A virus detection device, characterized in that the virus detection device comprises: a memory, a transceiver, a processor, and a bus system;
wherein the memory is configured to store a program;
the processor is configured to execute the program in the memory, including the following steps:
obtaining a target feature vector of a file to be detected;
matching the target feature vector using a sample detection rule set to generate a target matching result, wherein the sample detection rule set includes a first sample detection rule and a second sample detection rule, the first sample detection rule represents a correspondence between a safe type and path information, the second sample detection rule represents a correspondence between a virus type and path information, and the path information indicates the occurrence probability of a behavior identifier;
determining a virus detection result of the file to be detected according to the target matching result;
the bus system is configured to connect the memory and the processor so that the memory and the processor can communicate.
13. The virus detection device according to claim 12, characterized in that the processor is specifically configured to execute the following steps:
obtaining log information of the file to be detected, wherein the log information includes N behavior identifiers and N trigger times, N being an integer greater than or equal to 1;
counting the occurrence probability of each behavior identifier among the N behavior identifiers;
generating the target feature vector of the file to be detected according to the N trigger times and the occurrence probability of each behavior identifier.
14. The virus detection device according to claim 12, characterized in that the processor is further configured to execute the following steps:
obtaining the feature vector of a safe sample and the feature vector of a virus sample;
obtaining, through a decision tree model, the first sample detection rule corresponding to the feature vector of the safe sample, wherein the decision tree model is used to output the path information and a sample type, and the sample type includes the safe type and the virus type;
obtaining, through the decision tree model, the second sample detection rule corresponding to the feature vector of the virus sample.
15. A computer-readable storage medium comprising instructions which, when run on a computer, cause the computer to execute the method according to any one of claims 1 to 8.
CN201810402154.1A 2018-04-28 2018-04-28 Virus detection method and related device Active CN110210218B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810402154.1A CN110210218B (en) 2018-04-28 2018-04-28 Virus detection method and related device

Publications (2)

Publication Number Publication Date
CN110210218A true CN110210218A (en) 2019-09-06
CN110210218B CN110210218B (en) 2023-04-14

Family

ID=67778796

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810402154.1A Active CN110210218B (en) 2018-04-28 2018-04-28 Virus detection method and related device

Country Status (1)

Country Link
CN (1) CN110210218B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8401982B1 (en) * 2010-01-14 2013-03-19 Symantec Corporation Using sequencing and timing information of behavior events in machine learning to detect malware
CN103839003A (en) * 2012-11-22 2014-06-04 腾讯科技(深圳)有限公司 Malicious file detection method and device
CN103150509A (en) * 2013-03-15 2013-06-12 长沙文盾信息技术有限公司 Virus detection system based on virtual execution
CN103577756A (en) * 2013-11-05 2014-02-12 北京奇虎科技有限公司 Virus detection method and device based on script type judgment

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110795638A (en) * 2019-11-13 2020-02-14 北京百度网讯科技有限公司 Method and apparatus for outputting information
CN111310179A (en) * 2020-01-22 2020-06-19 腾讯科技(深圳)有限公司 Method and device for analyzing computer virus variants and computer equipment
CN111753290A (en) * 2020-05-26 2020-10-09 郑州启明星辰信息安全技术有限公司 Software type detection method and related equipment
CN113032742A (en) * 2021-01-26 2021-06-25 北京安华金和科技有限公司 Data desensitization method and device, storage medium and electronic device
CN117152260A (en) * 2023-11-01 2023-12-01 张家港长三角生物安全研究中心 Method and system for detecting residues of disinfection apparatus
CN117152260B (en) * 2023-11-01 2024-02-06 张家港长三角生物安全研究中心 Method and system for detecting residues of disinfection apparatus

Also Published As

Publication number Publication date
CN110210218B (en) 2023-04-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant