CN110210218A - Method and related apparatus for virus detection - Google Patents
Method and related apparatus for virus detection
- Publication number: CN110210218A (application CN201810402154.1A)
- Authority: CN (China)
- Prior art keywords: rule, pattern detection, sample, detected, feature vector
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/562—Static detection
Abstract
An embodiment of the invention discloses a virus detection method, comprising: obtaining a target feature vector of a file to be detected; matching the target feature vector against a sample detection rule set to generate a target matching result, wherein the sample detection rule set includes a first sample detection rule and a second sample detection rule, the first sample detection rule represents the correspondence between the safe type and path information, the second sample detection rule represents the correspondence between the virus type and path information, and the path information indicates the probability of occurrence of behavior identifiers; and determining a virus detection result for the file to be detected according to the target matching result. An embodiment of the invention also provides a virus detection apparatus. On the one hand, the embodiments eliminate the manual extraction of signatures; on the other hand, they can accurately identify the type of the file to be detected, which helps improve the security of the solution.
Description
Technical field
The present invention relates to the field of information security technology, and in particular to a method and related apparatus for virus detection.
Background
With the development of computer and network technology, viruses are becoming more varied, and highly destructive, well-concealed viruses persist over the long term. A virus is a program or a piece of executable code that, like a biological virus, can replicate itself, spread from host to host, and be reactivated. Viruses can attach themselves to files of many types, and whenever a file is copied or transferred from one user to another, the virus spreads along with it.
Currently, viruses are generally detected as follows. Manually labeled virus samples are analyzed, and binary fragments are extracted from them as signatures. If a file to be detected hits a signature, the file is considered to carry a virus.
However, judging whether a file carries a virus in this way has the following problem: because signatures are determined in advance, a newly emerged virus is hard to detect. In other words, the existing scheme cannot detect unknown viruses, which is detrimental to information security.
Summary of the invention
Embodiments of the invention provide a method and related apparatus for virus detection that, on the one hand, eliminate the manual extraction of signatures and, on the other hand, can accurately identify the type of the file to be detected, which helps improve the security of the solution.
In view of this, a first aspect of the invention provides a virus detection method, comprising:
obtaining a target feature vector of a file to be detected;
matching the target feature vector against a sample detection rule set to generate a target matching result, wherein the sample detection rule set includes a first sample detection rule and a second sample detection rule, the first sample detection rule represents the correspondence between the safe type and path information, the second sample detection rule represents the correspondence between the virus type and path information, and the path information indicates the probability of occurrence of behavior identifiers;
determining a virus detection result for the file to be detected according to the target matching result.
A second aspect of the invention provides a virus detection apparatus, comprising:
an obtaining module, configured to obtain a target feature vector of a file to be detected;
a generation module, configured to match the target feature vector obtained by the obtaining module against a sample detection rule set to generate a target matching result, wherein the sample detection rule set includes a first sample detection rule and a second sample detection rule, the first sample detection rule represents the correspondence between the safe type and path information, the second sample detection rule represents the correspondence between the virus type and path information, and the path information indicates the probability of occurrence of behavior identifiers;
a determining module, configured to determine a virus detection result for the file to be detected according to the target matching result generated by the generation module.
A third aspect of the invention provides a virus detection apparatus, comprising: a memory, a transceiver, a processor, and a bus system;
wherein the memory is configured to store a program;
the processor is configured to execute the program in the memory, including the following steps:
obtaining a target feature vector of a file to be detected;
matching the target feature vector against a sample detection rule set to generate a target matching result, wherein the sample detection rule set includes a first sample detection rule and a second sample detection rule, the first sample detection rule represents the correspondence between the safe type and path information, the second sample detection rule represents the correspondence between the virus type and path information, and the path information indicates the probability of occurrence of behavior identifiers;
determining a virus detection result for the file to be detected according to the target matching result;
the bus system is configured to connect the memory and the processor so that the memory and the processor can communicate.
A fourth aspect of the invention provides a computer-readable storage medium storing instructions which, when run on a computer, cause the computer to perform the method described in the above aspects.
As can be seen from the above technical solutions, the embodiments of the invention have the following advantages:
The embodiments of the invention provide a virus detection method. A target feature vector of the file to be detected is obtained first. The target feature vector is then matched against a sample detection rule set to generate a target matching result, where the sample detection rule set includes a first sample detection rule and a second sample detection rule, the first sample detection rule represents the correspondence between the safe type and path information, the second sample detection rule represents the correspondence between the virus type and path information, and the path information indicates the probability of occurrence of behavior identifiers. Finally, the virus detection result for the file to be detected is determined according to the target matching result. In this way, on the one hand, the manual extraction of signatures is eliminated: the matching result for the file to be detected is obtained directly by matching against the sample detection rule set, and that result indicates whether the file is safe. On the other hand, the sample detection rule set includes at least rules for detecting the safe type and the virus type, so the type of the file to be detected can be identified accurately, which helps improve the security of the solution.
Brief description of the drawings
Fig. 1 is an architecture diagram of the virus detection system in an embodiment of the invention;
Fig. 2 is a schematic diagram of the call relationships in the virus detection system in an embodiment of the invention;
Fig. 3 is a schematic diagram of an embodiment of the virus detection method in an embodiment of the invention;
Fig. 4 is a flow diagram of obtaining the target feature vector in an embodiment of the invention;
Fig. 5 is a flow diagram of generating the decision tree model file in an embodiment of the invention;
Fig. 6 is a flow diagram of generating sample detection rules in an embodiment of the invention;
Fig. 7 is a schematic diagram of the decision tree model in an embodiment of the invention;
Fig. 8 is a flow diagram of detecting a file to be detected in an embodiment of the invention;
Fig. 9 is a flow diagram of virus detection in an application scenario of the invention;
Fig. 10 is a schematic diagram of an embodiment of the virus detection apparatus in an embodiment of the invention;
Fig. 11 is a structural diagram of the virus detection apparatus in an embodiment of the invention.
Detailed description
Embodiments of the invention provide a method and related apparatus for virus detection that, on the one hand, eliminate the manual extraction of signatures and, on the other hand, can accurately identify the type of the file to be detected, which helps improve the security of the solution.
The terms "first", "second", "third", "fourth", and the like (if present) in the description, the claims, and the above drawings are used to distinguish similar objects and do not necessarily describe a particular order or sequence. It should be understood that data so used are interchangeable under appropriate circumstances, so that the embodiments described herein can, for example, be implemented in an order other than those illustrated or described herein. Furthermore, the terms "include" and "have" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
It should be understood that the invention is mainly applicable to the detection of Android viruses, and can also be applied to other types of virus detection, such as Computer parallel processing, Apple system (iPhone operating system, iOS) virus detection, and Microsoft system (Windows) virus detection. This solution is described using Android virus detection as an example. Android is released together with a set of core applications, including a client, a Short Message Service (SMS) program, calendar, maps, browser, and contact manager.
At the same time, the Android system is exposed to Android viruses such as the "Hundred-Brain Worm" Trojan (which infects promotional applications), the "Lizard's Tail" Trojan (which infects system library files, replaces system files, injects into system processes, steals user information, and monitors calls and text messages), and "Permission Killer" (which fights security software, monitors text messages, plays advertisements, and performs promotion and traffic inflation). This solution can detect not only these known Android viruses but also other, unknown Android viruses.
Referring to Fig. 1, Fig. 1 is an architecture diagram of the virus detection system in an embodiment of the invention. As shown, the virus detection apparatus in this solution can be deployed on a server. After the server obtains the virus detection result, it sends the result to a terminal device so that the user can view the virus detection result for the file to be detected on the terminal device's display interface. Optionally, the virus detection apparatus can instead be deployed on the terminal device, which then detects the file to be detected directly and presents the virus detection result in its front-end display interface.
The virus detection apparatus in the invention may include four logic modules, each implementing a corresponding function. Referring to Fig. 2, Fig. 2 is a schematic diagram of the call relationships in the virus detection system in an embodiment of the invention. As shown, the four logic modules are the behavior data extraction module S1, the decision tree model training module S2, the detection flow control module S3, and the rule extraction module S4. The behavior data extraction module S1 is called by both the decision tree model training module S2 and the detection flow control module S3. It can be understood that S1 may be a standalone module, or may be integrated into each of S2 and S3. The input to the decision tree model training module S2 is a batch of Android virus samples and Android safe samples. S2 calls S1 to obtain the vector representation of the training samples. The rule extraction module S4 then takes the result output by the decision tree model and, according to the preset rule generation condition, selects decision tree paths and sample types to generate the sample detection rule set. The detection flow control module S3 calls S1 to obtain the vector representation of the sample to be detected, and finally applies the sample detection rule set generated by S4 to obtain the safety status of the sample to be detected.
The virus detection method of the invention is described below from the perspective of the virus detection apparatus. Referring to Fig. 3, an embodiment of the virus detection method in an embodiment of the invention includes:
101. Obtain the target feature vector of the file to be detected.
In this embodiment, the virus detection apparatus first receives a virus detection instruction that carries the identifier of the file to be detected, from which the file to be detected can be determined. Behavior identifiers are then extracted from the file to be detected, and the target feature vector is generated from the extraction result.
102. Match the target feature vector against the sample detection rule set to generate a target matching result, wherein the sample detection rule set includes a first sample detection rule and a second sample detection rule, the first sample detection rule represents the correspondence between the safe type and path information, the second sample detection rule represents the correspondence between the virus type and path information, and the path information indicates the probability of occurrence of behavior identifiers.
In this embodiment, the virus detection apparatus matches the target feature vector against at least one rule in the sample detection rule set. Specifically, the sample detection rule set includes a first sample detection rule and a second sample detection rule. The first sample detection rule captures the correspondence between the safe type and path information; for example, the path information corresponding to the safe type is "contains behavior identifier 1, does not contain behavior identifier 2, contains behavior identifier 4, and contains behavior identifier 5". It can be understood that this correspondence between the safe type and path information is only an illustration and should not be construed as limiting this application. The second sample detection rule captures the correspondence between the virus type and path information; for example, the path information corresponding to the virus type is "does not contain behavior identifier 1, contains behavior identifier 3, does not contain behavior identifier 5, and does not contain behavior identifier 6". It can be understood that this correspondence between the virus type and path information is likewise only an illustration and should not be construed as limiting this application.
The path information includes the probability of occurrence of behavior identifiers: "1" can indicate that a behavior identifier appears, and "0" that it does not.
The behavior identifiers contained in the target feature vector are matched against the behavior identifiers specified by a sample detection rule (the first or the second sample detection rule). Specifically, suppose the first sample detection rule is: safe type: contains behavior identifier 1, does not contain behavior identifier 2, contains behavior identifier 4, and contains behavior identifier 5, where the "path information" is "contains behavior identifier 1, does not contain behavior identifier 2, contains behavior identifier 4, and contains behavior identifier 5". Suppose the target feature vector is [1 0 0 1 1]. For ease of understanding, Table 1 below lists the behavior identifiers of the target feature vector.
Table 1
As shown in Table 1, comparing the behavior identifiers in the target feature vector with those defined in the first sample detection rule shows that the target feature vector contains behavior identifier 1, does not contain behavior identifier 2, contains behavior identifier 4, and contains behavior identifier 5. The target feature vector can therefore be considered to match the first sample detection rule, and the corresponding target matching result can be generated.
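The comparison just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary encoding of a rule, the name `matches`, and the convention that vector position i holds the flag for behavior identifier i + 1 are all hypothetical and are not taken from the patent.

```python
from typing import Dict, Sequence

# Hypothetical rule encoding: behavior identifier -> required flag
# (1 = must appear, 0 = must not appear). Identifiers that the rule
# does not mention are left unconstrained.
SAFE_RULE: Dict[int, int] = {1: 1, 2: 0, 4: 1, 5: 1}


def matches(rule: Dict[int, int], vector: Sequence[int]) -> bool:
    """Return True if every constraint in the rule holds for the vector.

    Assumption: position i of the vector (0-based) stores the flag for
    behavior identifier i + 1.
    """
    return all(vector[behavior_id - 1] == required
               for behavior_id, required in rule.items())


# The example vector from the text: identifiers 1, 4 and 5 appear.
target = [1, 0, 0, 1, 1]
print(matches(SAFE_RULE, target))
```

Behavior identifier 3 is not mentioned by the example rule, so its flag does not affect the outcome in this sketch.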
103. Determine the virus detection result for the file to be detected according to the target matching result.
In this embodiment, the virus detection apparatus determines the virus detection result for the file to be detected according to the target matching result and can send the result to the client, where the user can see whether the file to be detected is safe.
The virus detection result can be one of three cases: the first is the safe type, which matches the first sample detection rule; the second is the virus type, which matches the second sample detection rule; and the third is the unknown type, which matches neither the first nor the second sample detection rule.
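The three outcomes can be illustrated with a short, self-contained Python sketch. The rule encoding, the function names, and in particular the choice to test safe rules before virus rules are assumptions made for this example; the text does not say how a vector that satisfies both kinds of rule should be handled.

```python
from typing import Dict, List, Sequence

Rule = Dict[int, int]  # hypothetical: behavior identifier -> required flag


def rule_matches(rule: Rule, vector: Sequence[int]) -> bool:
    """True if the vector satisfies every constraint in the rule.

    Assumption: position i (0-based) holds behavior identifier i + 1.
    """
    return all(vector[bid - 1] == flag for bid, flag in rule.items())


def detect(vector: Sequence[int],
           safe_rules: List[Rule],
           virus_rules: List[Rule]) -> str:
    """Classify a feature vector as 'safe', 'virus' or 'unknown'."""
    if any(rule_matches(rule, vector) for rule in safe_rules):
        return "safe"      # matched a first sample detection rule
    if any(rule_matches(rule, vector) for rule in virus_rules):
        return "virus"     # matched a second sample detection rule
    return "unknown"       # matched neither kind of rule


# Example rules taken from the illustrative path information in the text.
SAFE_RULES = [{1: 1, 2: 0, 4: 1, 5: 1}]
VIRUS_RULES = [{1: 0, 3: 1, 5: 0, 6: 0}]

print(detect([1, 0, 0, 1, 1, 0], SAFE_RULES, VIRUS_RULES))
print(detect([0, 0, 1, 0, 0, 0], SAFE_RULES, VIRUS_RULES))
print(detect([1, 1, 0, 0, 0, 0], SAFE_RULES, VIRUS_RULES))
```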
The embodiments of the invention provide a virus detection method. A target feature vector of the file to be detected is obtained first. The target feature vector is then matched against a sample detection rule set to generate a target matching result, where the sample detection rule set includes a first sample detection rule and a second sample detection rule, the first sample detection rule represents the correspondence between the safe type and path information, the second sample detection rule represents the correspondence between the virus type and path information, and the path information indicates the probability of occurrence of behavior identifiers. Finally, the virus detection result for the file to be detected is determined according to the target matching result. In this way, on the one hand, the manual extraction of signatures is eliminated: the matching result for the file to be detected is obtained directly by matching against the sample detection rule set, and that result indicates whether the file is safe. On the other hand, the sample detection rule set includes at least rules for detecting the safe type and the virus type, so the type of the file to be detected can be identified accurately, which helps improve the security of the solution.
Optionally, on the basis of the embodiment corresponding to Fig. 3, in a first optional embodiment of the virus detection method provided by the embodiments of the invention, obtaining the target feature vector of the file to be detected may include:
obtaining log information of the file to be detected, wherein the log information contains N behavior identifiers and N trigger times, N being an integer greater than or equal to 1;
counting the probability of occurrence of each of the N behavior identifiers;
generating the target feature vector of the file to be detected according to the N trigger times and the probability of occurrence of each behavior identifier.
This embodiment describes how the target feature vector of the file to be detected is obtained. The target feature vector consists of the behavior identifiers arranged in order of trigger time.
Specifically, referring to Fig. 4, Fig. 4 is a flow diagram of obtaining the target feature vector in an embodiment of the invention. As shown, in step 201 the file to be detected is obtained; it may be a picture file, video file, document file, audio file, application program, or the like. In step 202, the file to be detected is sent to an emulator and run. The emulator may specifically be an Android emulator, which is a runtime environment in which a function-execution logging facility is enabled: whenever running the file to be detected in the emulator triggers the execution of some function, log information is output. The log information contains two fields, the behavior identifier field and the trigger time field. In step 203, the virus detection apparatus extracts the log information produced in the emulator, and finally, in step 204, the log information is converted into the target feature vector.
The following uses Table 2 to illustrate how log information is converted into the target feature vector.
Table 2
Taking Table 2 as an example, the 11 behavior identifiers (that is, N is 11) and the probability of occurrence of each are counted: an identifier that appears is recorded as 1, and one that does not appear is recorded as 0. Arranging the values from first to last yields a feature vector; the target feature vector of the file to be detected shown in Table 2 is [1 10110111 0].
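A minimal Python sketch of this conversion follows. It assumes each log entry is a (behavior identifier, trigger time) pair and that the vector positions follow a fixed, known list of behavior identifiers so that vectors from different files are comparable. The identifier names and the function `to_feature_vector` are hypothetical, and the sketch only records presence or absence; it does not attempt to reproduce how the trigger-time ordering is applied.

```python
from typing import List, Sequence, Tuple

# Hypothetical log entry: (behavior identifier, trigger time in seconds).
LogEntry = Tuple[str, float]


def to_feature_vector(log: Sequence[LogEntry],
                      known_behaviors: Sequence[str]) -> List[int]:
    """Mark each known behavior as 1 if it appears in the log, else 0."""
    seen = {behavior for behavior, _trigger_time in log}
    return [1 if behavior in seen else 0 for behavior in known_behaviors]


# Illustrative identifiers only; a real emulator log would define its own.
KNOWN = ["send_sms", "read_contacts", "open_network"]
log = [("send_sms", 0.5), ("open_network", 1.2), ("send_sms", 2.0)]

print(to_feature_vector(log, KNOWN))
```

Repeated log entries for the same behavior collapse to a single 1 in this sketch, which mirrors the "appears or does not appear" encoding described above.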
It should be noted that this solution can generate the feature vectors of safe samples and of virus samples in the same way, which is not repeated here.
Secondly, in this embodiment, the virus detection apparatus can obtain the log information of the file to be detected, then count the probability of occurrence of each of the N behavior identifiers, and finally generate the target feature vector of the file to be detected from the N trigger times and those probabilities. In this way the feature vector is generated from the occurrence probabilities and trigger times of the behavior identifiers, so that the feature vector is tied to the behavior identifiers. This provides a reliable basis for subsequent rule matching and thereby improves the feasibility of the solution.
Optionally, on the basis of the embodiment corresponding to Fig. 3, in a second optional embodiment of the virus detection method provided by the embodiments of the invention, before the target feature vector is matched against the sample detection rule set, the method may further include:
obtaining the feature vectors of safe samples and the feature vectors of virus samples;
obtaining, through a decision tree model, the first sample detection rule corresponding to the feature vectors of the safe samples, wherein the decision tree model outputs path information and a sample type, the sample type being either the safe type or the virus type;
obtaining, through the decision tree model, the second sample detection rule corresponding to the feature vectors of the virus samples.
This embodiment describes how the sample detection rules (the first and the second sample detection rule) are generated, with reference to Fig. 5. Fig. 5 is a flow diagram of generating the decision tree model file in an embodiment of the invention. As shown, in step 301 a batch of positive and negative samples is obtained, where the positive samples are assumed to be safe samples and the negative samples virus samples. Then, in step 302, the feature vector of each safe sample and of each virus sample is generated using the feature vector generation method provided in the first optional embodiment corresponding to Fig. 3.
In step 303, a decision tree model is trained on the feature vectors of the safe samples and of the virus samples. Each sample has a set of attributes and a class, and the classes are determined in advance; learning then yields a classifier that can assign the correct class to newly encountered objects. The decision tree model comprises decision conditions, path information, and decision results. In step 304, a decision tree model library file is generated from the path information (for example: sample label 1, path information 1; sample label 2, path information 2). The decision tree model library file is used to generate the sample detection rules: if the input is a safe sample, the output derived from the decision tree model library file may be a first sample detection rule; if the input is a virus sample, the output may be a second sample detection rule.
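To make the "sample label, path information" pairs concrete, the following Python sketch walks an already-built decision tree and lists every root-to-leaf path together with its leaf label. Training is not shown. The tuple encoding of the tree, the toy tree itself, and the function name `extract_rules` are hypothetical and serve only to illustrate what the entries of such a library file might look like.

```python
from typing import Any, List, Tuple

# Hypothetical tree encoding: a leaf is a label string ("safe" or
# "virus"); an internal node is a tuple
# (behavior_id, subtree_if_absent, subtree_if_present).
Tree = Any
Path = List[Tuple[int, int]]  # list of (behavior identifier, required flag)


def extract_rules(tree: Tree, prefix: Tuple = ()) -> List[Tuple[str, Path]]:
    """Return one (label, path) pair per leaf of the tree."""
    if isinstance(tree, str):          # reached a leaf: emit the path
        return [(tree, list(prefix))]
    behavior_id, absent, present = tree
    return (extract_rules(absent, prefix + ((behavior_id, 0),))
            + extract_rules(present, prefix + ((behavior_id, 1),)))


# A toy tree of depth 2, invented purely for illustration.
TOY_TREE = (1, (3, "safe", "virus"), (2, "safe", "virus"))

for label, path in extract_rules(TOY_TREE):
    print(label, path)
```

Paths ending in a "safe" leaf would correspond to candidate first sample detection rules, and paths ending in a "virus" leaf to candidate second sample detection rules.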
Secondly, in this embodiment, the virus detection apparatus can generate the first and second sample detection rules by first obtaining the feature vectors of the safe samples and of the virus samples, then feeding both into the decision tree model and determining the path information from the decision results it outputs, thereby generating the sample detection rules. Using a decision tree model also brings the following advantages. First, a decision tree is easy to understand and implement and directly reflects the characteristics of the data. Second, a decision tree can produce feasible, good results on large amounts of data in a relatively short time. Third, the model is easy to evaluate through static testing, which allows its confidence to be measured.
Optionally, on the basis of the second embodiment corresponding to Fig. 3, in a third optional embodiment of the virus detection method provided by the embodiments of the invention, obtaining through the decision tree model the first sample detection rule corresponding to the feature vectors of the safe samples may include:
inputting the feature vectors of the safe samples into the decision tree model to obtain X first candidate sample detection rules, where X is an integer greater than or equal to 1;
selecting, from the X first candidate sample detection rules, Y first sample detection rules that satisfy a preset rule generation condition, where Y is an integer greater than or equal to 1 and less than or equal to X.
Obtaining through the decision tree model the second sample detection rule corresponding to the feature vectors of the virus samples may include:
inputting the feature vectors of the virus samples into the decision tree model to obtain Q second candidate sample detection rules, where Q is an integer greater than or equal to 1;
selecting, from the Q second candidate sample detection rules, P second sample detection rules that satisfy the preset rule generation condition, where P is an integer greater than or equal to 1 and less than or equal to Q.
This embodiment describes how the first and second sample detection rules are generated. Specifically, inputting the feature vectors of all safe samples into the decision tree model can output X first candidate sample detection rules, but not all of them are necessarily suitable. For example, the path information of some first candidate sample detection rules is very short, or their proportion of effective nodes is very low. In that case, the candidates that satisfy the preset rule generation condition must be selected from the X first candidate sample detection rules; the candidates that satisfy the condition are the Y first sample detection rules.
Similarly, inputting the feature vectors of all virus samples into the decision tree model can output Q second candidate sample detection rules, but not all of them are necessarily suitable. For example, the path information of some second candidate sample detection rules is very short, or their proportion of effective nodes is very low. In that case, the candidates that satisfy the preset rule generation condition must be selected from the Q second candidate sample detection rules; the candidates that satisfy the condition are the P second sample detection rules.
For ease of description, refer to Fig. 6, which is a flow diagram of generating sample detection rules in an embodiment of the invention. As shown, specifically:
In step 401, the decision tree model file is obtained.
In step 402, the decision tree model file can be filtered according to the following two conditions. The first condition filters by path length: paths that are too short are treated as not satisfying the rule generation condition.
In step 403, the second condition filters by positive node ratio: paths with a low positive node ratio are treated as not satisfying the rule generation condition.
In step 404, the filtered decision tree model file (containing sample labels and path information) is used to generate the sample detection rule set (containing the first and second sample detection rules).
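The two filters can be sketched in Python as follows. The text does not define "positive node ratio" precisely, so this sketch assumes it means the fraction of conditions on a path that require a behavior identifier to be present. That definition, the threshold values, and the name `filter_rules` are all assumptions made for illustration.

```python
from typing import List, Tuple

Path = List[Tuple[int, int]]   # (behavior identifier, required flag)
LabeledRule = Tuple[str, Path]  # (sample label, path information)


def filter_rules(rules: List[LabeledRule],
                 length_threshold: int,
                 min_positive_ratio: float) -> List[LabeledRule]:
    """Keep rules whose path is long enough and 'positive' enough."""
    kept = []
    for label, path in rules:
        # Condition 1: path length must exceed the preset threshold.
        if len(path) <= length_threshold:
            continue
        # Condition 2 (assumed definition): share of conditions that
        # require the behavior to be present.
        positive = sum(1 for _bid, flag in path if flag == 1)
        if positive / len(path) < min_positive_ratio:
            continue
        kept.append((label, path))
    return kept


candidates = [
    ("safe", [(1, 1), (2, 0), (4, 1), (5, 1)]),   # long, ratio 0.75
    ("virus", [(1, 0)]),                          # too short
    ("virus", [(1, 0), (3, 0), (5, 0), (6, 0)]),  # long, ratio 0.0
]

print(filter_rules(candidates, length_threshold=2, min_positive_ratio=0.5))
```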
Again, in this embodiment of the present invention, not all path information is suitable for building sample detection rules, so a "threshold" must also be set for generating them. In this way, the reliability of the sample detection rules is improved, the type of the file to be detected can be determined accurately, and the security of the solution is improved.
Optionally, on the basis of the third embodiment corresponding to Fig. 3 above, in a fourth optional embodiment of the virus detection method provided by the embodiments of the present invention, selecting, from the X first candidate sample detection rules, the Y first sample detection rules that satisfy the preset rule generation condition may include:
selecting, from the X first candidate sample detection rules, the Y first sample detection rules whose path length is greater than a preset length threshold;
and selecting, from the Q second candidate sample detection rules, the P second sample detection rules that satisfy the preset rule generation condition includes:
selecting, from the Q second candidate sample detection rules, the P second sample detection rules whose path length is greater than the preset length threshold.
This embodiment introduces a method for selecting sample detection rules. For ease of description, refer to Fig. 7, which is a schematic diagram of a decision tree model in an embodiment of the present invention. The figure shows a decision tree model of depth 6. Each route from the root to a sample label (virus or safe) is one path, and there are 10 paths in total. Take the first sample detection rule as an example (the safe-label path shown as the shaded part): going from the root to the label corresponding to this rule passes through 6 judgment nodes (i.e. the path information), namely behavior identifier [4] > 0.35, behavior identifier [2] > 0.235, behavior identifier [1] > 0.35, behavior identifier [7] > 0.76, behavior identifier [5] > 0.65, and behavior identifier [72] > 0.75. Because every element of the feature vector is either 0 or 1, no unknown-safety case arises.
The first sample detection rule generated from these 6 judgment nodes is: behavior identifier 4 is present, behavior identifier 2 is present, behavior identifier 1 is present, behavior identifier 7 is present, behavior identifier 5 is present, and behavior identifier 72 is absent.
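The mapping from judgment nodes to a rule can be sketched as follows. This is a minimal illustration rather than the implementation described in the patent: `PathNode`, `path_to_rule`, and `describe` are hypothetical names, and the last node is assumed to follow the "not greater than" branch because the resulting rule states that behavior identifier 72 is absent.

```python
from typing import Dict, List, NamedTuple


class PathNode(NamedTuple):
    """One judgment node on a root-to-label path (hypothetical structure)."""
    behavior_id: int       # index of the behavior identifier tested at the node
    threshold: float       # split threshold stored in the tree
    greater_branch: bool   # True if the path follows the "value > threshold" branch


def path_to_rule(path: List[PathNode]) -> Dict[int, int]:
    """Turn a path into {behavior_id: required 0/1 value}.

    Feature values are assumed to be 0 or 1 and thresholds to lie strictly
    between them, so "> threshold" means present and the other branch absent.
    """
    return {node.behavior_id: int(node.greater_branch) for node in path}


def describe(rule: Dict[int, int]) -> List[str]:
    """Render a rule in the wording used in the text."""
    return [
        f"behavior identifier {bid} is {'present' if required else 'absent'}"
        for bid, required in rule.items()
    ]


# The six judgment nodes of the example; branch directions are assumptions.
safe_path = [
    PathNode(4, 0.35, True),
    PathNode(2, 0.235, True),
    PathNode(1, 0.35, True),
    PathNode(7, 0.76, True),
    PathNode(5, 0.65, True),
    PathNode(72, 0.75, False),
]
safe_rule = path_to_rule(safe_path)
```

Storing the rule as a mapping from identifier to required value keeps later matching against a 0/1 feature vector to a simple equality check per node.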
When filtering by path length, the path length can be required to be greater than or equal to 2/3 of the depth of the tree. Assuming the depth of the decision tree model is 30, the preset length threshold chosen in this solution is 30 × 2/3 = 20. It should be noted that the preset length threshold may be 2/3 of the tree depth or any other reasonable value; this is only an illustration and should not be construed as a limitation on the present invention.
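The path-length filter can be sketched as below. The function names and the representation of a path as a list of node labels are assumptions made for illustration; the comparison uses "greater than or equal to", following the description above, and the 2/3 factor is only the example value.

```python
from fractions import Fraction
from typing import List, Sequence


def length_threshold(tree_depth: int, factor: Fraction = Fraction(2, 3)) -> Fraction:
    """Preset length threshold as a fraction of the tree depth."""
    return tree_depth * factor


def filter_by_path_length(
    candidates: Sequence[List[str]],
    tree_depth: int,
    factor: Fraction = Fraction(2, 3),
) -> List[List[str]]:
    """Keep only candidate paths that are at least as long as the threshold."""
    limit = length_threshold(tree_depth, factor)
    return [path for path in candidates if len(path) >= limit]


# Hypothetical candidate paths of lengths 12, 20 and 27 in a tree of depth 30.
candidates = [["node"] * 12, ["node"] * 20, ["node"] * 27]
kept = filter_by_path_length(candidates, tree_depth=30)
```

Using `Fraction` keeps the threshold exact, so a path of length 20 is not rejected because of floating-point rounding of 30 × 2/3.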
Further, in this embodiment of the present invention, sample detection rules that meet the requirement can be selected from the candidate sample detection rules according to path length, thereby generating the first sample detection rules and the second sample detection rules. In this way, the generated sample detection rule set is more reliable: the path length is required to exceed the preset length threshold; otherwise the path information is deemed unqualified and no corresponding sample detection rule is generated, which improves the feasibility and practicability of the solution.
Optionally, on the basis of the third or fourth embodiment corresponding to Fig. 3 above, in a fifth optional embodiment of the virus detection method provided by the embodiments of the present invention, selecting, from the X first candidate sample detection rules, the Y first sample detection rules that satisfy the preset rule generation condition may include:
selecting, from the X first candidate sample detection rules, the Y first sample detection rules whose positive node ratio is greater than a preset ratio threshold, wherein the positive node ratio is the proportion of positive nodes among all nodes, and a positive node is a node that includes a behavior identifier;
and selecting, from the Q second candidate sample detection rules, the P second sample detection rules that satisfy the preset rule generation condition includes:
selecting, from the Q second candidate sample detection rules, the P second sample detection rules whose positive node ratio is greater than the preset ratio threshold.
In this embodiment, based on the third optional embodiment corresponding to Fig. 3, another method for selecting sample detection rules is provided. Specifically, for a decision tree, path information consists of a number of conditions, each stating that some behavior identifier is included or not included; a positive node is one that includes a behavior identifier. This solution requires the positive node ratio to be greater than or equal to the preset ratio threshold. Assuming a path has length 20 and the preset ratio threshold is 4/5, the minimum number of positive nodes is 20 × 4/5 = 16.
It should be noted that the preset ratio threshold may be 4/5 or any other reasonable value; this is only an illustration and should not be construed as a limitation on the present invention.
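The positive node ratio check can be sketched as follows. A path is represented here as a list of 0/1 flags (1 for a node that includes a behavior identifier); this representation and the function names are assumptions, and 4/5 is only the example threshold.

```python
import math
from fractions import Fraction
from typing import List, Sequence


def min_positive_nodes(path_length: int, ratio: Fraction = Fraction(4, 5)) -> int:
    """Smallest number of positive nodes that reaches the preset ratio."""
    return math.ceil(path_length * ratio)


def positive_ratio(path: Sequence[int]) -> Fraction:
    """Share of nodes on the path that include a behavior identifier."""
    return Fraction(sum(path), len(path))


def filter_by_positive_ratio(
    candidates: Sequence[Sequence[int]],
    ratio: Fraction = Fraction(4, 5),
) -> List[Sequence[int]]:
    """Keep non-empty candidate paths whose positive node ratio reaches the threshold."""
    return [p for p in candidates if p and positive_ratio(p) >= ratio]


# Two hypothetical paths of length 20: one with 16 positive nodes, one with 15.
path_ok = [1] * 16 + [0] * 4
path_low = [1] * 15 + [0] * 5
kept = filter_by_positive_ratio([path_ok, path_low])
```

With a path length of 20 and a threshold of 4/5, the first path meets the minimum of 16 positive nodes and is kept, while the second falls one node short.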
Still further, in this embodiment of the present invention, sample detection rules that meet the requirement can be selected from the candidate sample detection rules according to positive node ratio, thereby generating the first sample detection rules and the second sample detection rules. In this way, the generated sample detection rule set is more reliable: the positive node ratio is required to exceed the preset ratio threshold; otherwise the path information is deemed unqualified and no corresponding sample detection rule is generated, which improves the feasibility and practicability of the solution.
Optionally, on the basis of the embodiment corresponding to Fig. 3 above, in a sixth optional embodiment of the virus detection method provided by the embodiments of the present invention, matching the target feature vector using the sample detection rule set to generate the target matching result may include:
judging whether the target feature vector satisfies the first sample detection rule, and if it does, generating a first matching result;
if the target feature vector does not satisfy the first sample detection rule, judging whether the target feature vector satisfies the second sample detection rule, and if it does, generating a second matching result;
if the target feature vector does not satisfy the second sample detection rule, generating a third matching result.
In this embodiment, the virus detection device may match the target feature vector against the sample detection rules in turn. Assume the sample detection rule set includes a first sample detection rule and a second sample detection rule. First, the virus detection device judges whether the target feature vector satisfies the first sample detection rule; if so, it directly generates the first matching result. Otherwise, it goes on to the next rule and judges whether the target feature vector satisfies the second sample detection rule; if so, it generates the second matching result. If the target feature vector satisfies neither the first nor the second sample detection rule, the third matching result is generated.
It can be understood that, in practical applications, the order in which the sample detection rules in the set are matched is not limited: the second sample detection rule may be matched first and the first sample detection rule second, or vice versa.
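The sequential matching described above can be sketched as follows. The rule format (a mapping from behavior identifier to a required 0/1 value), the sparse feature vector, and all names are assumptions for illustration; the sketch checks the first sample detection rules before the second ones, which is only one of the permitted orders.

```python
from enum import Enum
from typing import Dict, Iterable


class MatchResult(Enum):
    FIRST = "first matching result"    # a first (safe) rule is satisfied
    SECOND = "second matching result"  # a second (virus) rule is satisfied
    THIRD = "third matching result"    # no rule is satisfied


Rule = Dict[int, int]     # behavior identifier -> required value (0 or 1)
Vector = Dict[int, int]   # sparse 0/1 feature vector; missing entries count as 0


def satisfies(vector: Vector, rule: Rule) -> bool:
    """A vector satisfies a rule when every node condition on the path holds."""
    return all(vector.get(bid, 0) == required for bid, required in rule.items())


def match(vector: Vector, first_rules: Iterable[Rule], second_rules: Iterable[Rule]) -> MatchResult:
    """Check first rules, then second rules, and fall through to the third result."""
    if any(satisfies(vector, rule) for rule in first_rules):
        return MatchResult.FIRST
    if any(satisfies(vector, rule) for rule in second_rules):
        return MatchResult.SECOND
    return MatchResult.THIRD


# Hypothetical rule sets and feature vectors.
first_rules = [{4: 1, 2: 1, 72: 0}]
second_rules = [{9: 1, 3: 1}]
safe_like = {4: 1, 2: 1}
virus_like = {9: 1, 3: 1}
unmatched = {4: 1}
```

Swapping the two `if` blocks in `match` gives the alternative order in which the second sample detection rules are tried first.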
A process for detecting the type of a file to be detected is introduced below with reference to Fig. 8, which is a schematic flowchart of checking a file to be detected in an embodiment of the present invention. As shown in the figure, specifically:
In step 501, a batch of positive samples and negative samples is obtained, where a positive sample may be a safe sample and a negative sample may be a virus sample. It should be noted that, in practical applications, positive samples may instead be set to virus samples and negative samples to safe samples, depending on how the user has defined positive and negative samples in advance;
In step 502, the safe samples and the virus samples are fed into the simulator separately, and the feature vectors of the safe samples and of the virus samples are generated from the log information output by the simulator;
In step 503, the feature vectors of the safe samples and of the virus samples are input to the decision tree model for training;
In step 504, after model training a model library file, i.e. the decision tree model file, is obtained. This file can be stored and copied for subsequent detection to call, and can be understood as a kind of configuration file;
In step 505, each piece of path information is filtered according to the decision tree model file. The filtering may be as follows: if the path length is greater than or equal to 2/3 of the decision tree depth and the number of positive nodes is greater than or equal to 4/5 of the path length, the path information can be accepted as a sample detection rule;
In step 506, the sample detection rules generated in step 505 are collected to obtain the sample detection rule set;
In step 507, a file to be detected is obtained;
In step 508, the target feature vector of the file to be detected is extracted and matched against each sample detection rule in the sample detection rule set;
In step 509, it is judged whether the target feature vector matches a sample detection rule in the sample detection rule set: if it matches a virus sample detection rule, proceed to step 511; if it does not match any virus sample detection rule, proceed to step 510;
In step 510, it is judged whether the target feature vector matches a sample detection rule in the sample detection rule set: if it matches a safe sample detection rule, proceed to step 513; if it does not match any safe sample detection rule, proceed to step 512;
In step 511, the file to be detected is determined to be a virus file;
In step 512, the security status of the file to be detected cannot be determined, or the file to be detected is considered a safe file;
In step 513, the file to be detected is determined to be a safe file.
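Steps 505 to 513 can be sketched end to end as below. This is a simplified illustration under stated assumptions, not the patented implementation: a candidate rule is a hypothetical pair of a label and a mapping from behavior identifier to required 0/1 value, the feature vector is sparse, and step 512 is modeled as returning "unknown" rather than treating the file as safe.

```python
from fractions import Fraction
from typing import Dict, List, Tuple

Rule = Tuple[str, Dict[int, int]]  # (label "safe" or "virus", node conditions)


def keep_rule(conditions: Dict[int, int], tree_depth: int) -> bool:
    """Step 505: length >= 2/3 of tree depth and positive nodes >= 4/5 of length."""
    length = len(conditions)
    positives = sum(conditions.values())
    return (length >= tree_depth * Fraction(2, 3)
            and positives >= length * Fraction(4, 5))


def build_rule_set(candidates: List[Rule], tree_depth: int) -> List[Rule]:
    """Step 506: collect the candidate rules that pass the filter."""
    return [(label, cond) for label, cond in candidates if keep_rule(cond, tree_depth)]


def matches(vector: Dict[int, int], conditions: Dict[int, int]) -> bool:
    return all(vector.get(bid, 0) == required for bid, required in conditions.items())


def classify(vector: Dict[int, int], rule_set: List[Rule]) -> str:
    """Steps 509 to 513: try virus rules first, then safe rules."""
    if any(label == "virus" and matches(vector, cond) for label, cond in rule_set):
        return "virus"    # step 511
    if any(label == "safe" and matches(vector, cond) for label, cond in rule_set):
        return "safe"     # step 513
    return "unknown"      # step 512


# Hypothetical candidates from a tree of depth 6 (length threshold 4).
candidates = [
    ("virus", {1: 1, 2: 1, 3: 1, 4: 1, 5: 1}),   # kept
    ("safe", {6: 1, 7: 1, 8: 1, 9: 1, 10: 0}),   # kept: 4 of 5 nodes positive
    ("safe", {1: 1, 2: 0}),                      # dropped: path too short
    ("virus", {1: 0, 2: 0, 3: 0, 4: 1}),         # dropped: too few positive nodes
]
rule_set = build_rule_set(candidates, tree_depth=6)
```

Checking virus rules before safe rules follows the order of steps 509 and 510; the earlier embodiment notes that the order is not fixed.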
Secondly, in this embodiment of the present invention, the virus detection device can match the target feature vector against the rules in the sample detection rule set. If a rule does not match, matching continues with the next rule until a matching result is obtained or it is determined that no rule matches. In this way, the matching result of the file to be detected can be obtained accurately, which improves the reliability of the solution.
Optionally, on the basis of the sixth embodiment corresponding to Fig. 3 above, in a seventh optional embodiment of the virus detection method provided by the embodiments of the present invention, determining the virus detection result of the file to be detected according to the target matching result may include:
if the target matching result is the first matching result, determining that the file to be detected is a safe file;
if the target matching result is the second matching result, determining that the file to be detected is a virus file;
if the target matching result is the third matching result, determining that the file to be detected is a file of unknown safety.
In this embodiment, the virus detection device can learn the type of the file to be detected from the virus detection result generated with the sample detection rule set.
Specifically, if the target feature vector of the file to be detected matches the first sample detection rule, the target matching result is the first matching result, the file to be detected is determined to belong to the security type, and subsequent operations can proceed. If the target feature vector of the file to be detected matches the second sample detection rule, the target matching result is the second matching result, and the file to be detected is determined to belong to the virus type; files of the virus type usually need to be isolated. If the target feature vector of the file to be detected matches neither the first sample detection rule nor the second sample detection rule, the file to be detected is considered a file of unknown safety, i.e. a suspicious file.
Again, in this embodiment of the present invention, the virus detection device determines the type of the file to be detected according to the target matching result: the first matching result indicates that the file to be detected is a safe file, the second matching result indicates that it is a virus file, and the third matching result indicates that it is a file of unknown safety. In this way, the type of the file to be detected can be determined accurately: not only can the virus type be identified, but the security type and the unknown-safety type can also be distinguished, which improves the practicability and security of the solution.
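The mapping from target matching result to virus detection result can be sketched as a simple lookup. The string keys, the function names, and the "allow" follow-up for safe files are assumptions for illustration; the text only states that virus files usually need to be isolated and that files of unknown safety are treated as suspicious.

```python
from typing import Dict

# Target matching result -> virus detection result, as listed in the text.
DETECTION_RESULT: Dict[str, str] = {
    "first": "safe file",
    "second": "virus file",
    "third": "file of unknown safety",
}

# Hypothetical follow-up handling for each detection result.
FOLLOW_UP: Dict[str, str] = {
    "safe file": "allow",                           # assumption: no action needed
    "virus file": "isolate",                        # the text: usually isolated
    "file of unknown safety": "flag as suspicious",
}


def detection_result(matching_result: str) -> str:
    """Translate a matching result into the type of the file to be detected."""
    return DETECTION_RESULT[matching_result]


def follow_up(matching_result: str) -> str:
    """Pick the handling associated with the detection result."""
    return FOLLOW_UP[detection_result(matching_result)]
```

Keeping the third result as its own category, rather than folding it into "safe", is what lets the device report files that no rule covers.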
For ease of understanding, the virus detection process is introduced below with reference to Fig. 9, which is a schematic flowchart of virus detection in an application scenario of the present invention. As shown in the figure, specifically:
In step 601, virus detection starts;
In step 602, a batch of safe samples and virus samples is selected for generating the sample detection rule set (the first sample detection rules and the second sample detection rules);
In step 603, a file to be detected is selected;
In step 604, which can be divided into four sub-steps: in step 6041, the safe samples, the virus samples and the file to be detected are obtained; in step 6042, the safe samples, the virus samples and the file to be detected are fed into the simulator and run; in step 6043, the log information of the safe samples, of the virus samples and of the file to be detected is extracted from the simulator; finally, in step 6044, the log information of the safe samples is converted into the feature information of the safe samples, the log information of the virus samples is converted into the feature information of the virus samples, and the log information of the file to be detected is converted into the target feature vector of the file to be detected;
In step 605, the feature information of the virus samples and of the safe samples is input to the decision tree model for training;
In step 606, the decision tree model library file obtained from training is filtered, mainly to screen out the sample detection rules that satisfy the rule generation condition. The decision tree model library file can be stored and copied for subsequent virus detection to call, and can be understood as a kind of configuration file;
In step 607, the target feature vector of the file to be detected is extracted;
In step 608, the target feature vector of the file to be detected is matched against the sample detection rules;
In step 609, it is judged whether the target feature vector matches a sample detection rule in the sample detection rule set: if it matches a virus sample detection rule, proceed to step 611; if it does not match any virus sample detection rule, proceed to step 610;
In step 610, it is judged whether the target feature vector matches a sample detection rule in the sample detection rule set: if it matches a safe sample detection rule, proceed to step 613; if it does not match any safe sample detection rule, proceed to step 612;
In step 611, the file to be detected is determined to be a virus file;
In step 612, the security status of the file to be detected cannot be determined;
In step 613, the file to be detected is determined to be a safe file.
The virus detection device of the present invention is described in detail below. Refer to Fig. 10, which is a schematic diagram of one embodiment of the virus detection device in the embodiments of the present invention. The virus detection device 70 includes:
an acquisition module 701, configured to obtain the target feature vector of a file to be detected;
a generation module 702, configured to match the target feature vector obtained by the acquisition module 701 using a sample detection rule set to generate a target matching result, wherein the sample detection rule set includes a first sample detection rule and a second sample detection rule, the first sample detection rule indicates the correspondence between the security type and path information, the second sample detection rule indicates the correspondence between the virus type and path information, and the path information indicates the probability of occurrence of behavior identifiers;
a determining module 703, configured to determine the virus detection result of the file to be detected according to the target matching result generated by the generation module 702.
In this embodiment, the acquisition module 701 obtains the target feature vector of the file to be detected; the generation module 702 matches the target feature vector obtained by the acquisition module 701 using the sample detection rule set to generate the target matching result, wherein the sample detection rule set includes the first sample detection rule and the second sample detection rule, the first sample detection rule indicates the correspondence between the security type and path information, the second sample detection rule indicates the correspondence between the virus type and path information, and the path information indicates the probability of occurrence of behavior identifiers; and the determining module 703 determines the virus detection result of the file to be detected according to the target matching result generated by the generation module 702.
An embodiment of the present invention provides a virus detection device. The device first obtains the target feature vector of the file to be detected and matches it using a sample detection rule set to generate a target matching result, wherein the sample detection rule set includes a first sample detection rule and a second sample detection rule, the first sample detection rule indicates the correspondence between the security type and path information, the second sample detection rule indicates the correspondence between the virus type and path information, and the path information indicates the probability of occurrence of behavior identifiers. Finally, the device determines the virus detection result of the file to be detected according to the target matching result. In this way, on the one hand, the manual extraction of signature codes is no longer needed: the matching result of the file to be detected is obtained directly by matching against the sample detection rule set, and this result indicates the safety of the file. On the other hand, the sample detection rule set includes at least rules for detecting the security type and the virus type, so the type of the file to be detected can be determined accurately, which improves the security of the solution.
Optionally, on the basis of the embodiment corresponding to Fig. 10 above, in another embodiment of the virus detection device provided by the embodiments of the present invention,
the acquisition module 701 is specifically configured to obtain the log information of the file to be detected, wherein the log information includes N behavior identifiers and N trigger times, and N is an integer greater than or equal to 1;
count the probability of occurrence of each behavior identifier among the N behavior identifiers;
and generate the target feature vector of the file to be detected according to the N trigger times and the probability of occurrence of each behavior identifier.
Secondly, in this embodiment of the present invention, the virus detection device can obtain the log information of the file to be detected, count the probability of occurrence of each of the N behavior identifiers, and finally generate the target feature vector of the file to be detected according to the N trigger times and the probability of occurrence of each behavior identifier. Because the feature vector is generated from the occurrence probabilities and trigger times of the behavior identifiers, it is tied to the behavior identifiers, which provides a reliable basis for subsequent rule matching and improves the feasibility of the solution.
Optionally, on the basis of the embodiment corresponding to Fig. 10 above, in another embodiment of the virus detection device provided by the embodiments of the present invention,
the acquisition module 701 is further configured to: before the generation module 702 matches the target feature vector using the sample detection rule set, obtain the feature vector of a safe sample and the feature vector of a virus sample;
obtain, through a decision tree model, the first sample detection rule corresponding to the feature vector of the safe sample, wherein the decision tree model is used to output the path information and a sample type, and the sample type includes the security type and the virus type;
and obtain, through the decision tree model, the second sample detection rule corresponding to the feature vector of the virus sample.
Secondly, in this embodiment of the present invention, the virus detection device may generate the first sample detection rule and the second sample detection rule as follows: it first obtains the feature vector of the safe sample and the feature vector of the virus sample, then inputs both to the decision tree model and determines the path information from the output decision results, thereby generating the sample detection rules. Using a decision tree model in this way has the following advantages. First, a decision tree is easy to understand and implement and directly reflects the characteristics of the data. Second, a decision tree can produce feasible and effective results on large amounts of data in a relatively short time. Third, the model is easy to evaluate through static testing, so its confidence can be assessed.
Optionally, on the basis of the embodiment corresponding to Fig. 10 above, in another embodiment of the virus detection device provided by the embodiments of the present invention,
the acquisition module 701 is specifically configured to input the feature vector of the safe sample to the decision tree model to obtain X first candidate sample detection rules, wherein X is an integer greater than or equal to 1;
select, from the X first candidate sample detection rules, Y first sample detection rules that satisfy a preset rule generation condition, wherein Y is an integer greater than or equal to 1 and less than or equal to X;
input the feature vector of the virus sample to the decision tree model to obtain Q second candidate sample detection rules, wherein Q is an integer greater than or equal to 1;
and select, from the Q second candidate sample detection rules, P second sample detection rules that satisfy the preset rule generation condition, wherein P is an integer greater than or equal to 1 and less than or equal to Q.
Again, in this embodiment of the present invention, not all path information is suitable for building sample detection rules, so a "threshold" must also be set for generating them. In this way, the reliability of the sample detection rules is improved, the type of the file to be detected can be determined accurately, and the security of the solution is improved.
Optionally, on the basis of the embodiment corresponding to Fig. 10 above, in another embodiment of the virus detection device provided by the embodiments of the present invention,
the acquisition module 701 is specifically configured to select, from the X first candidate sample detection rules, the Y first sample detection rules whose path length is greater than a preset length threshold;
and select, from the Q second candidate sample detection rules, the P second sample detection rules whose path length is greater than the preset length threshold.
Further, in this embodiment of the present invention, sample detection rules that meet the requirement can be selected from the candidate sample detection rules according to path length, thereby generating the first sample detection rules and the second sample detection rules. In this way, the generated sample detection rule set is more reliable: the path length is required to exceed the preset length threshold; otherwise the path information is deemed unqualified and no corresponding sample detection rule is generated, which improves the feasibility and practicability of the solution.
Optionally, on the basis of the embodiment corresponding to Fig. 10 above, in another embodiment of the virus detection device provided by the embodiments of the present invention,
the acquisition module 701 is specifically configured to select, from the X first candidate sample detection rules, the Y first sample detection rules whose positive node ratio is greater than a preset ratio threshold, wherein the positive node ratio is the proportion of positive nodes among all nodes, and a positive node is a node that includes a behavior identifier;
and select, from the Q second candidate sample detection rules, the P second sample detection rules whose positive node ratio is greater than the preset ratio threshold.
Still further, in this embodiment of the present invention, sample detection rules that meet the requirement can be selected from the candidate sample detection rules according to positive node ratio, thereby generating the first sample detection rules and the second sample detection rules. In this way, the generated sample detection rule set is more reliable: the positive node ratio is required to exceed the preset ratio threshold; otherwise the path information is deemed unqualified and no corresponding sample detection rule is generated, which improves the feasibility and practicability of the solution.
Optionally, on the basis of the embodiment corresponding to Fig. 10 above, in another embodiment of the virus detection device provided by the embodiments of the present invention,
the generation module 702 is configured to judge whether the target feature vector satisfies the first sample detection rule, and if it does, generate a first matching result;
if the target feature vector does not satisfy the first sample detection rule, judge whether the target feature vector satisfies the second sample detection rule, and if it does, generate a second matching result;
and if the target feature vector does not satisfy the second sample detection rule, generate a third matching result.
Secondly, in this embodiment of the present invention, the virus detection device can match the target feature vector against the rules in the sample detection rule set. If a rule does not match, matching continues with the next rule until a matching result is obtained or it is determined that no rule matches. In this way, the matching result of the file to be detected can be obtained accurately, which improves the reliability of the solution.
Optionally, on the basis of the embodiment corresponding to Fig. 10 above, in another embodiment of the virus detection device provided by the embodiments of the present invention,
the determining module 703 is configured to: if the target matching result is the first matching result, determine that the file to be detected is a safe file;
if the target matching result is the second matching result, determine that the file to be detected is a virus file;
and if the target matching result is the third matching result, determine that the file to be detected is a file of unknown safety.
Again, in this embodiment of the present invention, the virus detection device determines the type of the file to be detected according to the target matching result: the first matching result indicates that the file to be detected is a safe file, the second matching result indicates that it is a virus file, and the third matching result indicates that it is a file of unknown safety. In this way, the type of the file to be detected can be determined accurately: not only can the virus type be identified, but the security type and the unknown-safety type can also be distinguished, which improves the practicability and security of the solution.
Fig. 11 is a schematic structural diagram of a server provided by an embodiment of the present invention. The server 800 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 822 (for example, one or more processors), memory 832, and one or more storage media 830 (for example, one or more mass storage devices) storing application programs 842 or data 844. The memory 832 and the storage medium 830 may provide transient or persistent storage. The program stored in the storage medium 830 may include one or more modules (not marked in the figure), and each module may include a series of instruction operations for the server. Further, the central processing unit 822 may be configured to communicate with the storage medium 830 and to execute, on the server 800, the series of instruction operations in the storage medium 830.
The server 800 may also include one or more power supplies 826, one or more wired or wireless network interfaces 850, one or more input/output interfaces 858, and/or one or more operating systems 841, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
The steps performed by the server in the above embodiments may be based on the server structure shown in Fig. 11.
The CPU 822 is configured to perform the following steps:
obtaining the target feature vector of a file to be detected;
matching the target feature vector using a sample detection rule set to generate a target matching result, wherein the sample detection rule set includes a first sample detection rule and a second sample detection rule, the first sample detection rule indicates the correspondence between the security type and path information, the second sample detection rule indicates the correspondence between the virus type and path information, and the path information indicates the probability of occurrence of behavior identifiers;
determining the virus detection result of the file to be detected according to the target matching result.
Optionally, the CPU 822 is specifically configured to perform the following steps:
Obtain the log information of the file to be detected, wherein the log information includes N behavior identifiers and N trigger times, and N is an integer greater than or equal to 1;
Count the occurrence probability of each behavior identifier among the N behavior identifiers;
Generate the target feature vector of the file to be detected according to the N trigger times and the occurrence probability of each behavior identifier.
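The steps above can be sketched roughly as follows. The text does not say how the trigger times and occurrence probabilities are combined, so this sketch makes two assumptions: the occurrence probability of a behavior identifier is its count divided by N, and the vector lists distinct behavior identifiers in order of their earliest trigger time. The function name `build_feature_vector` and the sample behavior names are hypothetical.

```python
from collections import Counter


def build_feature_vector(log):
    """Build a feature vector from a behavior log.

    log: list of (behavior_id, trigger_time) pairs with N >= 1 entries.
    Returns a list of (behavior_id, occurrence_probability) pairs.
    Assumption: probability = count / N, ordering = earliest trigger time.
    """
    if not log:
        raise ValueError("log must contain at least one behavior record")
    n = len(log)
    counts = Counter(behavior_id for behavior_id, _ in log)

    # Record the earliest trigger time of each behavior identifier.
    first_seen = {}
    for behavior_id, trigger_time in log:
        if behavior_id not in first_seen or trigger_time < first_seen[behavior_id]:
            first_seen[behavior_id] = trigger_time

    # Order by earliest trigger time; break ties by identifier for determinism.
    ordered = sorted(counts, key=lambda b: (first_seen[b], b))
    return [(b, counts[b] / n) for b in ordered]


# Hypothetical log with N = 4 records.
log = [
    ("create_process", 3.0),
    ("write_registry", 1.0),
    ("create_process", 5.0),
    ("open_socket", 4.0),
]
vector = build_feature_vector(log)
```

Other encodings, for example a fixed-length vector indexed by a global behavior dictionary, would fit the description equally well.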
Optionally, the CPU 822 is further configured to perform the following steps:
Obtain the feature vector of a safe sample and the feature vector of a virus sample;
Obtain, through a decision tree model, the first sample detection rule corresponding to the feature vector of the safe sample, wherein the decision tree model is used to output the path information and a sample type, and the sample type includes the safe type and the virus type;
Obtain, through the decision tree model, the second sample detection rule corresponding to the feature vector of the virus sample.
Optionally, the CPU 822 is specifically configured to perform the following steps:
Input the feature vector of the safe sample into the decision tree model to obtain X first candidate sample detection rules, wherein X is an integer greater than or equal to 1;
Select, from the X first candidate sample detection rules, Y first sample detection rules that satisfy a preset rule generation condition, wherein Y is an integer greater than or equal to 1 and less than or equal to X;
Input the feature vector of the virus sample into the decision tree model to obtain Q second candidate sample detection rules, wherein Q is an integer greater than or equal to 1;
Select, from the Q second candidate sample detection rules, P second sample detection rules that satisfy the preset rule generation condition, wherein P is an integer greater than or equal to 1 and less than or equal to Q.
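One way to picture how candidate rules could come out of a decision tree is to treat every root-to-leaf path as one candidate, labelled with the sample type at its leaf. The sketch below assumes a hand-built binary tree stored as nested dictionaries, with each internal node testing whether the occurrence probability of one behavior exceeds a threshold. The tree layout, key names, and thresholds are all hypothetical and are not taken from the patent.

```python
def extract_rules(node, path=()):
    """Yield (sample_type, path) for every root-to-leaf path of the tree.

    Each path element is a (behavior_id, operator, threshold) condition.
    """
    if "label" in node:  # Leaf node: emit the accumulated path.
        yield node["label"], list(path)
        return
    above = (node["behavior"], ">", node["threshold"])
    below = (node["behavior"], "<=", node["threshold"])
    yield from extract_rules(node["above"], path + (above,))
    yield from extract_rules(node["below"], path + (below,))


# Hypothetical tree standing in for a trained decision tree model.
tree = {
    "behavior": "write_registry",
    "threshold": 0.2,
    "above": {
        "behavior": "open_socket",
        "threshold": 0.1,
        "above": {"label": "virus"},
        "below": {"label": "safe"},
    },
    "below": {"label": "safe"},
}

# Group candidate rules by sample type: X safe candidates, Q virus candidates.
candidates = {"safe": [], "virus": []}
for label, path in extract_rules(tree):
    candidates[label].append(path)
```

Here X is 2 and Q is 1. Which of these candidates are kept as the Y and P final rules depends on the preset rule generation condition.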
Optionally, the CPU 822 is specifically configured to perform the following steps:
Select, from the X first candidate sample detection rules, the Y first sample detection rules whose path length is greater than a preset length threshold;
Select, from the Q second candidate sample detection rules, the P second sample detection rules whose path length is greater than the preset length threshold.
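As a minimal sketch of this filter, assume each candidate rule is a list of path conditions and that its path length is simply the number of conditions. The function name `select_by_path_length` and the sample rules are hypothetical.

```python
def select_by_path_length(candidate_rules, length_threshold):
    """Keep only rules whose path length is strictly greater than the threshold."""
    return [rule for rule in candidate_rules if len(rule) > length_threshold]


# Hypothetical candidate rules with path lengths 1, 2, and 3.
rules = [
    [("a", ">", 0.2)],
    [("a", ">", 0.2), ("b", "<=", 0.1)],
    [("a", ">", 0.2), ("b", ">", 0.1), ("c", ">", 0.0)],
]
selected = select_by_path_length(rules, 1)
```

The same function would be applied once to the safe candidates and once to the virus candidates, with the same threshold.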
Optionally, the CPU 822 is specifically configured to perform the following steps:
Select, from the X first candidate sample detection rules, the Y first sample detection rules whose positive node ratio is greater than a preset ratio threshold, wherein the positive node ratio is the proportion of positive nodes in the total number of nodes, and a positive node is a node that contains a behavior identifier;
Select, from the Q second candidate sample detection rules, the P second sample detection rules whose positive node ratio is greater than the preset ratio threshold.
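The positive node ratio filter can be sketched as below. The text only says that a positive node is one that contains a behavior identifier; this sketch assumes, as one possible reading, that a path condition counts as positive when it requires the behavior to occur more often than the threshold (operator `">"`). The function names and sample rules are hypothetical.

```python
def positive_node_ratio(rule):
    """Return the share of positive nodes among all nodes on the rule's path.

    Assumption: a node is positive when its operator is ">".
    """
    positive = sum(1 for _, operator, _ in rule if operator == ">")
    return positive / len(rule)


def select_by_positive_ratio(candidate_rules, ratio_threshold):
    """Keep non-empty rules whose positive node ratio exceeds the threshold."""
    return [
        rule
        for rule in candidate_rules
        if rule and positive_node_ratio(rule) > ratio_threshold
    ]


# Hypothetical candidates with ratios 0.5, 1.0, and 0.0.
rules = [
    [("a", ">", 0.2), ("b", "<=", 0.1)],
    [("a", ">", 0.2), ("b", ">", 0.1), ("c", ">", 0.0)],
    [("a", "<=", 0.2)],
]
kept = select_by_positive_ratio(rules, 0.5)
```

Because the comparison is strict, the first rule with a ratio of exactly 0.5 is dropped when the threshold is 0.5.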
Optionally, the CPU 822 is specifically configured to perform the following steps:
Determine whether the target feature vector satisfies the first sample detection rule, and if it does, generate a first matching result;
If the target feature vector does not satisfy the first sample detection rule, determine whether it satisfies the second sample detection rule, and if it does, generate a second matching result;
If the target feature vector does not satisfy the second sample detection rule, generate a third matching result.
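The cascade above checks the safe rules first, then the virus rules, and falls through to a third result. A minimal sketch follows, assuming the feature vector is a mapping from behavior identifier to occurrence probability, that a behavior missing from the mapping has probability 0.0, and that a rule is satisfied only when all of its conditions hold. The names `satisfies` and `match` and the sample rules are hypothetical.

```python
def satisfies(vector, rule):
    """Return True if every condition on the rule's path holds for the vector."""
    for behavior, operator, threshold in rule:
        value = vector.get(behavior, 0.0)  # Assumption: absent behavior -> 0.0
        if operator == ">":
            ok = value > threshold
        elif operator == "<=":
            ok = value <= threshold
        else:
            raise ValueError(f"unsupported operator: {operator!r}")
        if not ok:
            return False
    return True


def match(vector, safe_rules, virus_rules):
    """Apply the safe rules first, then the virus rules, else report no match."""
    if any(satisfies(vector, rule) for rule in safe_rules):
        return "first"
    if any(satisfies(vector, rule) for rule in virus_rules):
        return "second"
    return "third"


# Hypothetical rule sets.
safe_rules = [[("write_registry", "<=", 0.2), ("open_socket", "<=", 0.1)]]
virus_rules = [[("write_registry", ">", 0.2), ("open_socket", ">", 0.1)]]

result_a = match({"write_registry": 0.0}, safe_rules, virus_rules)
result_b = match({"write_registry": 0.5, "open_socket": 0.3}, safe_rules, virus_rules)
result_c = match({"write_registry": 0.5, "open_socket": 0.05}, safe_rules, virus_rules)
```

Checking the safe rules before the virus rules means a vector that happens to satisfy both sets is reported with the first matching result, which mirrors the order given in the text.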
Optionally, the CPU 822 is specifically configured to perform the following steps:
If the target matching result is the first matching result, determine that the file to be detected is a safe file;
If the target matching result is the second matching result, determine that the file to be detected is a virus file;
If the target matching result is the third matching result, determine that the file to be detected is a file of unknown safety.
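The mapping from matching result to detection result is a fixed three-way lookup, which could be written as follows. The enum `MatchResult`, the table `VERDICTS`, and the verdict strings are hypothetical names chosen for illustration.

```python
from enum import Enum


class MatchResult(Enum):
    FIRST = 1   # A safe rule matched.
    SECOND = 2  # A virus rule matched.
    THIRD = 3   # Neither rule set matched.


# Fixed mapping from matching result to detection result.
VERDICTS = {
    MatchResult.FIRST: "safe file",
    MatchResult.SECOND: "virus file",
    MatchResult.THIRD: "unknown-safety file",
}


def detection_result(match_result):
    """Translate a matching result into the virus detection result."""
    return VERDICTS[match_result]
```

Keeping the third outcome distinct from the other two lets a caller route unknown files to further analysis rather than treating them as safe.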
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices, and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided by the present invention, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
The above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (15)
1. A virus detection method, characterized by comprising:
obtaining a target feature vector of a file to be detected;
matching the target feature vector using a sample detection rule set to generate a target matching result, wherein the sample detection rule set includes a first sample detection rule and a second sample detection rule, the first sample detection rule indicates the correspondence between a safe type and path information, the second sample detection rule indicates the correspondence between a virus type and path information, and the path information indicates the occurrence probability of behavior identifiers;
determining a virus detection result of the file to be detected according to the target matching result.
2. The method according to claim 1, characterized in that obtaining the target feature vector of the file to be detected comprises:
obtaining log information of the file to be detected, wherein the log information includes N behavior identifiers and N trigger times, and N is an integer greater than or equal to 1;
counting the occurrence probability of each behavior identifier among the N behavior identifiers;
generating the target feature vector of the file to be detected according to the N trigger times and the occurrence probability of each behavior identifier.
3. The method according to claim 1, characterized in that, before the target feature vector is matched using the sample detection rule set, the method further comprises:
obtaining the feature vector of a safe sample and the feature vector of a virus sample;
obtaining, through a decision tree model, the first sample detection rule corresponding to the feature vector of the safe sample, wherein the decision tree model is used to output the path information and a sample type, and the sample type includes the safe type and the virus type;
obtaining, through the decision tree model, the second sample detection rule corresponding to the feature vector of the virus sample.
4. The method according to claim 3, characterized in that obtaining, through the decision tree model, the first sample detection rule corresponding to the feature vector of the safe sample comprises:
inputting the feature vector of the safe sample into the decision tree model to obtain X first candidate sample detection rules, wherein X is an integer greater than or equal to 1;
selecting, from the X first candidate sample detection rules, Y first sample detection rules that satisfy a preset rule generation condition, wherein Y is an integer greater than or equal to 1 and less than or equal to X;
and obtaining, through the decision tree model, the second sample detection rule corresponding to the feature vector of the virus sample comprises:
inputting the feature vector of the virus sample into the decision tree model to obtain Q second candidate sample detection rules, wherein Q is an integer greater than or equal to 1;
selecting, from the Q second candidate sample detection rules, P second sample detection rules that satisfy the preset rule generation condition, wherein P is an integer greater than or equal to 1 and less than or equal to Q.
5. The method according to claim 4, characterized in that selecting, from the X first candidate sample detection rules, the Y first sample detection rules that satisfy the preset rule generation condition comprises:
selecting, from the X first candidate sample detection rules, the Y first sample detection rules whose path length is greater than a preset length threshold;
and selecting, from the Q second candidate sample detection rules, the P second sample detection rules that satisfy the preset rule generation condition comprises:
selecting, from the Q second candidate sample detection rules, the P second sample detection rules whose path length is greater than the preset length threshold.
6. The method according to claim 4 or 5, characterized in that selecting, from the X first candidate sample detection rules, the Y first sample detection rules that satisfy the preset rule generation condition comprises:
selecting, from the X first candidate sample detection rules, the Y first sample detection rules whose positive node ratio is greater than a preset ratio threshold, wherein the positive node ratio is the proportion of positive nodes in the total number of nodes, and a positive node is a node that contains a behavior identifier;
and selecting, from the Q second candidate sample detection rules, the P second sample detection rules that satisfy the preset rule generation condition comprises:
selecting, from the Q second candidate sample detection rules, the P second sample detection rules whose positive node ratio is greater than the preset ratio threshold.
7. The method according to claim 1, characterized in that matching the target feature vector using the sample detection rule set to generate the target matching result comprises:
if the target feature vector satisfies the first sample detection rule, generating a first matching result;
if the target feature vector does not satisfy the first sample detection rule, determining whether the target feature vector satisfies the second sample detection rule, and if the target feature vector satisfies the second sample detection rule, generating a second matching result;
if the target feature vector does not satisfy the second sample detection rule, generating a third matching result.
8. The method according to claim 7, characterized in that determining the virus detection result of the file to be detected according to the target matching result comprises:
if the target matching result is the first matching result, determining that the file to be detected is a safe file;
if the target matching result is the second matching result, determining that the file to be detected is a virus file;
if the target matching result is the third matching result, determining that the file to be detected is a file of unknown safety.
9. A virus detection device, characterized by comprising:
an acquisition module, configured to obtain a target feature vector of a file to be detected;
a generation module, configured to match the target feature vector obtained by the acquisition module using a sample detection rule set to generate a target matching result, wherein the sample detection rule set includes a first sample detection rule and a second sample detection rule, the first sample detection rule indicates the correspondence between a safe type and path information, the second sample detection rule indicates the correspondence between a virus type and path information, and the path information indicates the occurrence probability of behavior identifiers;
a determining module, configured to determine a virus detection result of the file to be detected according to the target matching result generated by the generation module.
10. The virus detection device according to claim 9, characterized in that
the acquisition module is specifically configured to obtain log information of the file to be detected, wherein the log information includes N behavior identifiers and N trigger times, and N is an integer greater than or equal to 1;
count the occurrence probability of each behavior identifier among the N behavior identifiers;
and generate the target feature vector of the file to be detected according to the N trigger times and the occurrence probability of each behavior identifier.
11. The virus detection device according to claim 9, characterized in that
the acquisition module is further configured to, before the generation module matches the target feature vector using the sample detection rule set, obtain the feature vector of a safe sample and the feature vector of a virus sample;
obtain, through a decision tree model, the first sample detection rule corresponding to the feature vector of the safe sample, wherein the decision tree model is used to output the path information and a sample type, and the sample type includes the safe type and the virus type;
and obtain, through the decision tree model, the second sample detection rule corresponding to the feature vector of the virus sample.
12. A virus detection device, characterized in that the virus detection device comprises a memory, a transceiver, a processor, and a bus system;
wherein the memory is configured to store a program;
the processor is configured to execute the program in the memory, including the following steps:
obtaining a target feature vector of a file to be detected;
matching the target feature vector using a sample detection rule set to generate a target matching result, wherein the sample detection rule set includes a first sample detection rule and a second sample detection rule, the first sample detection rule indicates the correspondence between a safe type and path information, the second sample detection rule indicates the correspondence between a virus type and path information, and the path information indicates the occurrence probability of behavior identifiers;
determining a virus detection result of the file to be detected according to the target matching result;
and the bus system is configured to connect the memory and the processor so that the memory and the processor can communicate.
13. The virus detection device according to claim 12, characterized in that the processor is specifically configured to perform the following steps:
obtaining log information of the file to be detected, wherein the log information includes N behavior identifiers and N trigger times, and N is an integer greater than or equal to 1;
counting the occurrence probability of each behavior identifier among the N behavior identifiers;
generating the target feature vector of the file to be detected according to the N trigger times and the occurrence probability of each behavior identifier.
14. The virus detection device according to claim 12, characterized in that the processor is further configured to perform the following steps:
obtaining the feature vector of a safe sample and the feature vector of a virus sample;
obtaining, through a decision tree model, the first sample detection rule corresponding to the feature vector of the safe sample, wherein the decision tree model is used to output the path information and a sample type, and the sample type includes the safe type and the virus type;
obtaining, through the decision tree model, the second sample detection rule corresponding to the feature vector of the virus sample.
15. A computer-readable storage medium comprising instructions which, when run on a computer, cause the computer to perform the method according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810402154.1A CN110210218B (en) | 2018-04-28 | 2018-04-28 | Virus detection method and related device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110210218A (en) | 2019-09-06 |
CN110210218B (en) | 2023-04-14 |
Family
ID=67778796
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810402154.1A Active CN110210218B (en) | 2018-04-28 | 2018-04-28 | Virus detection method and related device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110210218B (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8401982B1 (en) * | 2010-01-14 | 2013-03-19 | Symantec Corporation | Using sequencing and timing information of behavior events in machine learning to detect malware |
CN103839003A (en) * | 2012-11-22 | 2014-06-04 | 腾讯科技(深圳)有限公司 | Malicious file detection method and device |
CN103150509A (en) * | 2013-03-15 | 2013-06-12 | 长沙文盾信息技术有限公司 | Virus detection system based on virtual execution |
CN103577756A (en) * | 2013-11-05 | 2014-02-12 | 北京奇虎科技有限公司 | Virus detection method and device based on script type judgment |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110795638A (en) * | 2019-11-13 | 2020-02-14 | 北京百度网讯科技有限公司 | Method and apparatus for outputting information |
CN111310179A (en) * | 2020-01-22 | 2020-06-19 | 腾讯科技(深圳)有限公司 | Method and device for analyzing computer virus variants and computer equipment |
CN111753290A (en) * | 2020-05-26 | 2020-10-09 | 郑州启明星辰信息安全技术有限公司 | Software type detection method and related equipment |
CN113032742A (en) * | 2021-01-26 | 2021-06-25 | 北京安华金和科技有限公司 | Data desensitization method and device, storage medium and electronic device |
CN117152260A (en) * | 2023-11-01 | 2023-12-01 | 张家港长三角生物安全研究中心 | Method and system for detecting residues of disinfection apparatus |
CN117152260B (en) * | 2023-11-01 | 2024-02-06 | 张家港长三角生物安全研究中心 | Method and system for detecting residues of disinfection apparatus |
Also Published As
Publication number | Publication date |
---|---|
CN110210218B (en) | 2023-04-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110210218A (en) | A kind of method and relevant apparatus of viral diagnosis | |
Aljawarneh et al. | Anomaly-based intrusion detection system through feature selection analysis and building hybrid efficient model | |
CN110177108B (en) | Abnormal behavior detection method, device and verification system | |
CN107153789B (en) | Utilize the method for random forest grader real-time detection Android Malware | |
Pirscoveanu et al. | Analysis of malware behavior: Type classification using machine learning | |
CN107392025B (en) | Malicious android application program detection method based on deep learning | |
CN110245496A (en) | A kind of source code leak detection method and detector and its training method and system | |
CN109684840A (en) | Based on the sensitive Android malware detection method for calling path | |
CN104598824B (en) | A kind of malware detection methods and device thereof | |
CN108200054A (en) | A kind of malice domain name detection method and device based on dns resolution | |
CN105809035B (en) | The malware detection method and system of real-time behavior is applied based on Android | |
CN108229262B (en) | Pornographic video detection method and device | |
CN106485146B (en) | A kind of information processing method and server | |
CN112528284A (en) | Malicious program detection method and device, storage medium and electronic equipment | |
CN112149124B (en) | Android malicious program detection method and system based on heterogeneous information network | |
CN110263538A (en) | A kind of malicious code detecting method based on system action sequence | |
CN105446741B (en) | A kind of mobile applications discrimination method compared based on API | |
CN104866764B (en) | A kind of Android phone malware detection method based on object reference figure | |
CN111368289B (en) | Malicious software detection method and device | |
CN109325232A (en) | A kind of user behavior exception analysis method, system and storage medium based on LDA | |
CN110363003A (en) | A kind of Android virus static detection method based on deep learning | |
CN111586071A (en) | Encryption attack detection method and device based on recurrent neural network model | |
CN117081858B (en) | Intrusion behavior detection method, system, equipment and medium based on multi-decision tree | |
CN110210216B (en) | Virus detection method and related device | |
CN113535823A (en) | Abnormal access behavior detection method and device and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||