CN105893842A - Method and device used for detecting infective viruses - Google Patents
- Publication number
- CN105893842A (application number CN201510038791.1A)
- Authority
- CN
- China
- Prior art keywords
- entropy
- entrance
- type virus
- immediate
- characteristic vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention provides a method and a device for detecting infective viruses. The method comprises the following steps: extracting a feature vector of a file to be detected, wherein the feature vector comprises the distribution frequency of the entropy of the immediate values (immediates) at the entry point; and using an infective virus identification model, obtained by machine learning on the basis of such feature vectors, to detect from the feature vector whether the file to be detected is an infective virus file. This overcomes the high labor cost of manual analysis and of manually written heuristic rules. Detection based on the infective virus identification model greatly improves detection speed and can effectively detect unknown infective viruses.
Description
Technical field
The present invention relates to the computer field, and in particular to a method and a device for detecting infective viruses.
Background art
Infective viruses are the type of virus with the most variants. In the prior art, infective viruses are mostly detected by manual analysis and matching or by manually written heuristic rules. Because an infective virus constantly changes its own code form and execution logic while propagating its viral code, features or rules must be added by hand, continuously, to raise the detection rate. This requires a large investment of human resources to handle the ever-changing viruses manually. Manual analysis and matching, or manually written heuristic rules, therefore not only incur a high labor cost but also make detection speed hard to guarantee and make it hard to find unknown infective viruses promptly and accurately.
Summary of the invention
One of the technical problems solved by the present invention is to provide a method and a device for detecting infective viruses that detect infective viruses quickly and accurately while reducing labor cost.
According to an embodiment of one aspect of the present invention, a method for detecting infective viruses is provided, comprising:
extracting a feature vector of a file to be detected, the feature vector including the distribution frequency of the entropy of the immediates at the entry point; and
using an infective virus identification model, obtained by machine learning on the basis of such feature vectors, to detect from the feature vector whether the file to be detected is an infective virus file.
Optionally, extracting the feature vector of the file to be detected comprises:
traversing, depth-first from the entry point, functions down to a specified depth, until the number of instructions contained in the traversed functions reaches a specified quantity;
collecting the immediates of all instructions contained in the traversed functions;
calculating the entropy of the immediates; and
counting the distribution frequency of the entropy of the immediates to obtain the distribution frequency of the entropy of the immediates at the entry point.
Optionally, the entropy includes:
binary entropy, decimal entropy and hexadecimal entropy.
Optionally, the feature vector further includes:
structural attribute features that are susceptible to infection by infective viruses.
Optionally, the structural attribute features include at least one of the following:
the section containing the entry point, the number of executable sections, the names of the executable sections, the entropy of the section containing the entry point, and the position of the entry point within its section.
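As an illustration only, the structural attribute features listed above could be gathered roughly as follows once a file has been parsed into sections. The `Section` record, its field names, the byte-level entropy formula and the function `structural_features` are all hypothetical names and choices introduced for this sketch; the source does not specify a parser, a data layout or how the section entropy is computed.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Section:
    # Hypothetical, simplified view of one section of an executable file.
    name: str
    start: int         # address of the first byte of the section
    data: bytes
    executable: bool


def byte_entropy(data: bytes) -> float:
    """Shannon entropy (bits per byte) of the section contents."""
    n = len(data)
    if n == 0:
        return 0.0
    return sum((c / n) * math.log2(n / c) for c in Counter(data).values())


def structural_features(sections, entry_point):
    """Collect the five structural attributes named in the text."""
    # Section whose address range contains the entry point.
    host = next(s for s in sections
                if s.start <= entry_point < s.start + len(s.data))
    executable = [s for s in sections if s.executable]
    return {
        "entry_section": host.name,
        "executable_section_count": len(executable),
        "executable_section_names": [s.name for s in executable],
        "entry_section_entropy": byte_entropy(host.data),
        # Assumption: "position" is taken as the byte offset from section start.
        "entry_offset_in_section": entry_point - host.start,
    }
```

How these values are encoded numerically for the classifier (for example, how section names become numbers) is left open by the source and by this sketch.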
Optionally, the feature vector further includes:
the instruction frequency of the code at the entry point.
Optionally, extracting the feature vector of the file to be detected comprises:
traversing, depth-first from the entry point, functions down to a specified depth, until the number of instructions contained in the traversed functions reaches a specified quantity; and
counting the frequency of occurrence of all instructions contained in the traversed functions to obtain the instruction frequency of the code at the entry point.
According to an embodiment of another aspect of the present invention, a device for detecting infective viruses is provided, comprising:
a unit for extracting a feature vector of a file to be detected, the feature vector including the distribution frequency of the entropy of the immediates at the entry point; and
a unit for using an infective virus identification model, obtained by machine learning on the basis of such feature vectors, to detect from the feature vector whether the file to be detected is an infective virus file.
Optionally, the unit for extracting the feature vector of the file to be detected comprises:
a subunit for traversing, depth-first from the entry point, functions down to a specified depth, until the number of instructions contained in the traversed functions reaches a specified quantity;
a subunit for collecting the immediates of all instructions contained in the traversed functions;
a subunit for calculating the entropy of the immediates; and
a subunit for counting the distribution frequency of the entropy of the immediates to obtain the distribution frequency of the entropy of the immediates at the entry point.
Optionally, the entropy includes:
binary entropy, decimal entropy and hexadecimal entropy.
Optionally, the feature vector further includes:
structural attribute features that are susceptible to infection by infective viruses.
Optionally, the structural attribute features include at least one of the following:
the section containing the entry point, the number of executable sections, the names of the executable sections, the entropy of the section containing the entry point, and the position of the entry point within its section.
Optionally, the feature vector further includes:
the instruction frequency of the code at the entry point.
Optionally, the unit for extracting the feature vector of the file to be detected comprises:
a subunit for traversing, depth-first from the entry point, functions down to a specified depth, until the number of instructions contained in the traversed functions reaches a specified quantity; and
a subunit for counting the frequency of occurrence of all instructions contained in the traversed functions to obtain the instruction frequency of the code at the entry point.
The embodiments of the present application extract a feature vector of the file to be detected, the feature vector including the distribution frequency of the entropy of the immediates at the entry point, and use an infective virus identification model, obtained by machine learning on the basis of such feature vectors, to detect from the feature vector whether the file to be detected is an infective virus file. This overcomes the high labor cost of manual analysis and of manually written heuristic rules. The detection method based on the infective virus identification model greatly improves detection speed and can effectively detect unknown infective viruses.
Those of ordinary skill in the art will appreciate that although the following detailed description refers to illustrated embodiments and the accompanying drawings, the present invention is not limited to these embodiments. Rather, the scope of the invention is broad and is intended to be limited only by the appended claims.
Brief description of the drawings
Other features, objects and advantages of the present invention will become more apparent upon reading the detailed description of non-limiting embodiments made with reference to the following drawings:
Fig. 1 is a flow chart of a method for detecting infective viruses according to an embodiment of the invention.
Fig. 2 is a flow chart of an infective virus identification model training method according to another embodiment of the invention.
Fig. 3 is a flow chart of a method for detecting infective viruses according to another embodiment of the invention.
Fig. 4 is a flow chart of an infective virus identification model training method according to another embodiment of the invention.
Fig. 5 is a schematic diagram of traversing the entry-point instructions using the depth-first principle according to an embodiment of the invention.
Fig. 6 is an instruction frequency distribution chart of the code at the entry point according to an embodiment of the invention.
Fig. 7 is a schematic diagram comparing the distribution frequency of the entropy of the entry-point immediates of an uninfected file with that of a file infected by an infective virus, according to an embodiment of the invention.
Fig. 8 is a schematic structural diagram of a device for detecting infective viruses according to an embodiment of the invention.
Fig. 9 is a schematic structural diagram of a feature vector extraction unit according to an embodiment of the invention.
Fig. 10 is a schematic structural diagram of a feature vector extraction unit according to another embodiment of the invention.
In the drawings, the same or similar reference numerals denote the same or similar components.
Detailed description of the embodiments
An infective virus adds itself to other programs or dynamic library files (a kind of DLL) so that it runs together with the infected program, and then damages the infected computer and propagates itself. By its nature, an infective virus must attach to a host program in order to run. To evade antivirus software, it usually splits, deforms or encrypts itself before attaching part or all of itself to the host program. Once a single virus file is executed, it may add viral code to most of the program files in the system and then spread to other computers. Manual identification therefore struggles to recognize infective viruses quickly and accurately, and struggles even more to find unknown ones. The embodiments of the present application propose a detection method for infective viruses that detects them on the basis of a trained infective virus identification model.
The present invention is described in further detail below with reference to the accompanying drawings.
Fig. 1 is a flow chart of a method for detecting infective viruses according to an embodiment of the invention. The method of the present invention is mainly carried out by an operating system or a processing controller in a computer device; the operating system or processing controller is referred to as the device for detecting infective viruses. The computer device includes, but is not limited to, at least one of the following: user equipment and network equipment. User equipment includes, but is not limited to, computers, smartphones, PDAs and the like. Network equipment includes, but is not limited to, a single network server, a server group composed of multiple network servers, or a cloud based on cloud computing and composed of a large number of computers or network servers, where cloud computing is a form of distributed computing: a super virtual computer composed of a group of loosely coupled computers.
As shown in Fig. 1, the method for detecting infective viruses mainly comprises the following steps:
S100: extract a feature vector of the file to be detected; the feature vector includes the distribution frequency of the entropy of the immediates at the entry point;
S110: use an infective virus identification model, obtained by machine learning on the basis of such feature vectors, to detect from the feature vector whether the file to be detected is an infective virus file.
It should first be noted that the detection operation is carried out on the basis of an infective virus identification model; that is, an infective virus identification model must be trained before the detection operation is performed. The training operation need not be repeated every time detection is performed, however, so training the model is not a necessary step of detecting infective viruses. The training method of the infective virus identification model is introduced first. Fig. 2 is a flow chart of an infective virus identification model training method provided by an embodiment of the present application; the training method may comprise the following steps:
S200: obtain infective virus samples, i.e. files infected by infective viruses.
The embodiments of the present application place no particular limit on the method of obtaining the infective virus samples or on their quantity. It will be understood that the more infective virus samples are obtained, the more accurately the trained model identifies viruses.
It should also be noted that the training method provided by the embodiments may train on the obtained infective virus samples only, i.e. training uses black files only; or it may train on infective virus samples and non-infected samples in a 1:1 ratio, i.e. training uses black files and white files in a 1:1 ratio. Here a black file is a file infected by an infective virus, and a white file is a normal file that has not been infected.
S210: extract a feature vector of each infective virus sample; the feature vector includes the distribution frequency of the entropy of the immediates at the entry point.
Because an infective virus is likely to modify the execution flow that a program must pass through, the embodiments of the present application extract the distribution frequency of the entropy of the immediates at the entry point and perform machine learning on it, so as to identify whether an infective virus has modified that flow.
Generally the number provided in immediate addressing mode instruction is called immediate.Immediate can be 8
Position, 16 or 32, this numerical value is after operation code (i.e. instruction).If immediate is
16 or 32, then, it will be stored by the principle of " height is low ".Such as:
MOV AH, 80H ADD AX, 1234H MOV ECX, 123456H
MOV B1,12H MOV W1,3456H ADD D1,32123456H
Wherein: B1, W1 and D1 are byte, word and double-word location respectively.More than the in instruction
Two operands (source operand) are all immediates.
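The storage rule described above for 16-bit and 32-bit immediates corresponds to what is commonly called little-endian byte order. A small illustration using Python's standard `struct` module (Python is used here purely for illustration and is not part of the source):

```python
import struct

# 16-bit immediate 1234H from "ADD AX, 1234H":
# the low-order byte 34H is stored first (lower address), then 12H.
word = struct.pack("<H", 0x1234)

# 32-bit immediate 32123456H from "ADD D1, 32123456H".
dword = struct.pack("<I", 0x32123456)

print(word.hex())   # 3412
print(dword.hex())  # 56341232
```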
In information theory, entropy is a measure of uncertainty. The greater the amount of information, the smaller the uncertainty and the smaller the entropy; the smaller the amount of information, the greater the uncertainty and the greater the entropy. Based on this property, entropy can be calculated to judge the randomness and degree of disorder of an event. It can also be used to judge the degree of dispersion of an indicator: the greater the dispersion of an indicator, the greater its influence on the overall evaluation.
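For reference, a minimal sketch of Shannon entropy over a sequence of symbols, which is the usual formalization of the measure described above. The source does not state which entropy formula it uses, so the formula and the function name `shannon_entropy` are assumptions.

```python
import math
from collections import Counter


def shannon_entropy(symbols) -> float:
    """H = sum(p_i * log2(1 / p_i)) over the observed symbol frequencies."""
    symbols = list(symbols)
    n = len(symbols)
    if n == 0:
        return 0.0
    return sum((c / n) * math.log2(n / c) for c in Counter(symbols).values())


ordered = shannon_entropy("aaaa")   # one repeated symbol: no disorder
mixed = shannon_entropy("abcd")     # four equally frequent symbols
```

Under this formula a fully repetitive sequence has entropy 0 and a sequence of four equally frequent symbols has entropy 2 bits, matching the intuition that more disorder means higher entropy.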
Because the entropy of the immediates in a normal, uninfected file is generally small, a file may be considered possibly infected by an infective virus if the entropy of its immediates is high, for example if immediates with high entropy occur more often than a prescribed limit. The embodiments of the present application therefore extract the distribution frequency of the entropy of the immediates at the entry point.
The method of extracting the distribution frequency of the entropy of the immediates at the entry point comprises:
Traverse, depth-first from the entry point, functions down to a specified depth, until the number of instructions contained in the traversed functions reaches a specified quantity. In other words, when the distribution frequency of the entropy of the immediates at the entry point is extracted, the entry point is traversed depth-first to obtain a specified quantity of instructions at the entry point. Fig. 5 is a schematic diagram of the traversal: the instruction code at the entry point is decompiled one instruction at a time, and each circle marks the position of a jump or function call, denoted c1, c2, c3, ... respectively. Traversal starts from c1 in depth-first order; when a call is encountered, the depth value is incremented by 1 and the called function is entered. Once the depth value reaches the specified value (the specified depth), for example 4, a call that is encountered no longer increments the depth value; only the function name is recorded and the function is not entered. Traversal ends when the instructions in all traversed functions reach the specified quantity, for example 2000. The order in which the points in Fig. 5 are visited is shown by the dotted arrows in Fig. 5, namely c1-c2-c4-c8. Note that if, during traversal, the number of instructions in the traversed functions reaches the specified quantity before the specified depth is reached, traversal may stop without descending to the specified depth. For example, if on reaching c3 the functions c1, c2 and c3 together contain the specified 2000 instructions, traversal stops and c4 is not traversed.
The traversal yields the specified quantity of instructions at the entry point. Next, the immediates of all instructions contained in the traversed functions are collected, and the entropy of the immediates is calculated. Because it is not possible to determine reliably which number base the value of a given immediate was written in, the binary entropy, decimal entropy and hexadecimal entropy of the immediate may all be calculated; for a hand-written immediate in a non-infected file, the entropy in at least one of these bases is bound to be fairly small. Finally, the distribution frequency of the entropy of the immediates is counted.
Fig. 7 compares the distribution frequency of the entropy of the entry-point immediates of an uninfected file with that of a file infected by an infective virus; the abscissa is the entropy of the immediates and the ordinate is the number of times each entropy value occurs. It can be seen that high entropy values occur more often in the file infected by an infective virus.
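The bounded depth-first traversal described above can be sketched as follows, assuming the code has already been decompiled into a mapping from function name to a list of `(mnemonic, operand)` pairs. That representation, the function name `collect_instructions`, and the convention that the entry function counts as depth 1 are assumptions made for this sketch; the defaults 4 and 2000 are the example values given in the text.

```python
def collect_instructions(functions, entry, max_depth=4, max_instructions=2000):
    """Depth-first walk from `entry`, bounded by call depth and instruction count.

    `functions` maps a function name to a list of (mnemonic, operand) pairs;
    a ("call", name) pair whose name is a key of `functions` can be followed.
    """
    collected = []   # instructions in the order they are visited
    order = []       # functions in the order they are entered

    def visit(name, depth):
        order.append(name)
        for mnemonic, operand in functions[name]:
            if len(collected) >= max_instructions:
                return                      # instruction budget exhausted
            collected.append((mnemonic, operand))
            if mnemonic == "call" and operand in functions and depth < max_depth:
                visit(operand, depth + 1)   # enter the callee before continuing
            # At the depth limit the call stays recorded (name only) and the
            # callee body is not entered.

    visit(entry, 1)  # assumption: the entry function counts as depth 1
    return collected, order


# Hypothetical call graph echoing the c1-c2-c4-c8 path of Fig. 5.
example = {
    "c1": [("push", "ebp"), ("call", "c2"), ("ret", None)],
    "c2": [("mov", "eax"), ("call", "c4"), ("ret", None)],
    "c4": [("call", "c8"), ("ret", None)],
    "c8": [("nop", None)],
}
instructions, order = collect_instructions(example, "c1")
print(order)  # ['c1', 'c2', 'c4', 'c8']
```

Lowering `max_instructions` makes the walk stop early even though the depth limit has not been reached, which mirrors the early-stop case described in the text.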
Through the above operations, the distribution frequency of the entropy of the immediates at the entry point of an infective virus sample can be extracted and used as the feature vector.
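A minimal sketch of the last two steps (per-immediate entropy in three bases, then the distribution of those entropy values). The source does not say how the entropy of a single immediate is computed; the sketch assumes Shannon entropy over the digits of the immediate written in base 2, 10 and 16, assumes non-negative immediates, and uses a hypothetical bin width to group entropy values. All function names are illustrative.

```python
import math
from collections import Counter


def digit_entropy(digits: str) -> float:
    """Shannon entropy of the digit characters in `digits`."""
    n = len(digits)
    return sum((c / n) * math.log2(n / c) for c in Counter(digits).values())


def immediate_entropies(value: int):
    """Entropy of one non-negative immediate written in base 2, 10 and 16."""
    return {
        "bin": digit_entropy(format(value, "b")),
        "dec": digit_entropy(format(value, "d")),
        "hex": digit_entropy(format(value, "x")),
    }


def entropy_distribution(immediates, bin_width=0.5):
    """Per base, count how many immediates fall into each entropy bin."""
    hist = {"bin": Counter(), "dec": Counter(), "hex": Counter()}
    for value in immediates:
        for base, h in immediate_entropies(value).items():
            hist[base][math.floor(h / bin_width) * bin_width] += 1
    return hist
```

For example, the immediate 1111H has hexadecimal entropy 0 although its decimal form 4369 has four distinct digits, which illustrates why a hand-written constant tends to look regular in at least one base.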
S220: perform the calculation with a preset machine learning classification algorithm to obtain the infective virus identification model.
In this step, the samples obtained above and the feature vectors extracted from them are input into a machine learning classification algorithm to obtain the infective virus identification model.
The embodiments of the present application place no particular limit on the classification algorithm used; any existing classification algorithm may be used, such as a decision tree algorithm or an SVM (Support Vector Machine) algorithm.
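To make the training step concrete without depending on any particular library, the sketch below trains a one-level decision tree (a decision stump) on labeled feature vectors. It is a deliberately simplified stand-in for the decision tree or SVM algorithms named above, it assumes the training option that uses both black and white files, and the sample values and function names are invented for illustration.

```python
def train_stump(samples, labels):
    """Pick the (feature index, threshold) pair with the fewest training errors.

    The resulting model predicts 1 (infected) when sample[feature] > threshold.
    """
    best = None
    for f in range(len(samples[0])):
        for t in sorted({s[f] for s in samples}):
            errors = sum((s[f] > t) != bool(y) for s, y in zip(samples, labels))
            if best is None or errors < best[0]:
                best = (errors, f, t)
    _, feature, threshold = best
    return feature, threshold


def predict(model, sample):
    feature, threshold = model
    return int(sample[feature] > threshold)


# Invented feature vectors: [count of low-entropy immediates,
#                            count of high-entropy immediates]
samples = [[5, 40], [8, 35], [30, 2], [28, 4]]
labels = [1, 1, 0, 0]          # 1 = black file, 0 = white file
model = train_stump(samples, labels)
```

With these invented samples the stump splits on the second feature, so a new vector with many high-entropy immediates is classified as infected.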
The training method above yields the infective virus identification model used to detect infective viruses.
Steps S100 and S110 of the detection method above are described further below.
Step S100 extracts the feature vector of the file to be detected; the feature vector includes the distribution frequency of the entropy of the immediates at the entry point.
It will be understood that the feature vector extracted when detecting infective viruses is the same as the feature vector extracted when the model in use was trained. The feature vector extracted in step S100 therefore also includes the distribution frequency of the entropy of the immediates at the entry point.
The method of extracting the distribution frequency of the entropy of the immediates at the entry point is the same as described above in the introduction to training the infective virus identification model and is not repeated here.
Step S110 uses the infective virus identification model to detect, on the basis of the extracted feature vector, whether the file to be detected is an infective virus file, i.e. whether the file to be detected has been infected by an infective virus.
The embodiments of the present application extract a feature vector of the file to be detected, the feature vector including the distribution frequency of the entropy of the immediates at the entry point, and use an infective virus identification model, obtained by machine learning on the basis of such feature vectors, to detect from the feature vector whether the file to be detected is an infective virus file. This overcomes the high labor cost of manual analysis and of manually written heuristic rules. The detection method based on the infective virus identification model greatly improves detection speed and can effectively detect unknown infective viruses.
A method for detecting infective viruses provided by another embodiment of the present application is shown in Fig. 3 and may comprise the following steps:
S300: extract a feature vector of the file to be detected; the feature vector includes the distribution frequency of the entropy of the immediates at the entry point, and further includes structural attribute features that are susceptible to infection by infective viruses and/or the instruction frequency of the code at the entry point;
S310: use an infective virus identification model, obtained by machine learning on the basis of such feature vectors, to detect from the feature vector whether the file to be detected is an infective virus file.
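One way to picture the combined feature vector of step S300 is as a fixed-length list that always starts with the entropy distribution and optionally appends structural values and instruction frequencies. The bin edges, the mnemonic vocabulary, the dictionary keys and the function name below are all hypothetical; the source does not describe a concrete encoding.

```python
ENTROPY_BINS = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]   # hypothetical bins
MNEMONICS = ["mov", "push", "call", "jmp", "xor"]          # hypothetical vocabulary


def build_feature_vector(entropy_counts, structural=None, instruction_counts=None):
    """Concatenate the mandatory and optional feature groups into one list."""
    # Mandatory part: how many immediates fall into each entropy bin.
    vector = [entropy_counts.get(b, 0) for b in ENTROPY_BINS]
    # Optional part: numeric structural attributes of the file.
    if structural is not None:
        vector += [
            structural["executable_section_count"],
            structural["entry_section_entropy"],
            structural["entry_offset_in_section"],
        ]
    # Optional part: relative frequency of each mnemonic in the vocabulary.
    if instruction_counts is not None:
        total = sum(instruction_counts.values()) or 1
        vector += [instruction_counts.get(m, 0) / total for m in MNEMONICS]
    return vector
```

Whichever optional groups are used, the same layout has to be used for training and for detection, since the model is only meaningful for vectors built the same way.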
Likewise, this detection operation is carried out on the basis of an infective virus identification model; that is, an infective virus identification model must be trained before the detection operation is performed. The training operation need not be repeated every time detection is performed, however, so training the model is not a necessary step of detecting infective viruses. The training method of the infective virus identification model is introduced first. Fig. 4 is a flow chart of an infective virus identification model training method provided by another embodiment of the present application; the training method may comprise the following steps:
S400: obtain infective virus samples, i.e. files infected by infective viruses.
The embodiments of the present application place no particular limit on the method of obtaining the infective virus samples or on their quantity. It will be understood that the more infective virus samples are obtained, the more accurately the trained model identifies viruses. The training method provided by the embodiments may train on the obtained infective virus samples only, i.e. training uses black files only; or it may train on infective virus samples and non-infected samples in a 1:1 ratio, i.e. training uses black files and white files in a 1:1 ratio. Here a black file is a file infected by an infective virus, and a white file is a normal file that has not been infected.
S410: extract a feature vector of each infective virus sample; the feature vector includes the distribution frequency of the entropy of the immediates at the entry point, and further includes structural attribute features that are susceptible to infection by infective viruses and/or the instruction frequency of the code at the entry point.
Because an infective virus is likely to modify the execution flow that a program must pass through, the embodiments of the present application extract the distribution frequency of the entropy of the immediates at the entry point and perform machine learning on it, so as to identify whether an infective virus has modified that flow.
The immediate referred to in the embodiments of the present application is defined as follows:
The number supplied in an instruction that uses immediate addressing is generally called an immediate. An immediate may be 8, 16 or 32 bits wide, and it follows the operation code (i.e. the instruction). If the immediate is 16 or 32 bits wide, it is stored according to the "high-high, low-low" principle (the high-order byte at the higher address, the low-order byte at the lower address). For example:
MOV AH, 80H    ADD AX, 1234H    MOV ECX, 123456H
MOV B1, 12H    MOV W1, 3456H    ADD D1, 32123456H
where B1, W1 and D1 are byte, word and doubleword memory locations respectively. The second operand (the source operand) in each of the instructions above is an immediate.
In information theory, entropy is a measure of uncertainty. The greater the amount of information, the smaller the uncertainty and the smaller the entropy; the smaller the amount of information, the greater the uncertainty and the greater the entropy. Based on this property, entropy can be calculated to judge the randomness and degree of disorder of an event. It can also be used to judge the degree of dispersion of an indicator: the greater the dispersion of an indicator, the greater its influence on the overall evaluation.
Because the entropy of the immediates in a normal, uninfected file is generally small, a file may be considered possibly infected by an infective virus if the entropy of its immediates is high, for example if immediates with high entropy occur more often than a prescribed limit. The embodiments of the present application therefore extract the distribution frequency of the entropy of the immediates at the entry point.
The method of extracting the distribution frequency of the entropy of the immediates at the entry point comprises:
Traverse, depth-first from the entry point, functions down to a specified depth, until the number of instructions contained in the traversed functions reaches a specified quantity. In other words, when the distribution frequency of the entropy of the immediates at the entry point is extracted, the entry point is traversed depth-first to obtain a specified quantity of instructions at the entry point. Fig. 5 is a schematic diagram of the traversal: the instruction code at the entry point is decompiled one instruction at a time, and each circle marks the position of a jump or function call, denoted c1, c2, c3, ... respectively. Traversal starts from c1 in depth-first order; when a call is encountered, the depth value is incremented by 1 and the called function is entered. Once the depth value reaches the specified value (the specified depth), for example 4, a call that is encountered no longer increments the depth value; only the function name is recorded and the function is not entered. Traversal ends when the instructions in all traversed functions reach the specified quantity, for example 2000. The order in which the points in Fig. 5 are visited is shown by the dotted arrows in Fig. 5, namely c1-c2-c4-c8. Note that if, during traversal, the number of instructions in the traversed functions reaches the specified quantity before the specified depth is reached, traversal may stop without descending to the specified depth. For example, if on reaching c3 the functions c1, c2 and c3 together contain the specified 2000 instructions, traversal stops and c4 is not traversed.
The traversal yields the specified quantity of instructions at the entry point. Next, the immediates of all instructions contained in the traversed functions are collected, and the entropy of the immediates is calculated. Because it is not possible to determine reliably which number base the value of a given immediate was written in, the binary entropy, decimal entropy and hexadecimal entropy of the immediate may all be calculated; for a hand-written immediate in a non-infected file, the entropy in at least one of these bases is bound to be fairly small. Finally, the distribution frequency of the entropy of the immediates is counted.
Fig. 7 compares the distribution frequency of the entropy of the entry-point immediates of an uninfected file with that of a file infected by an infective virus; the abscissa is the entropy of the immediates and the ordinate is the number of times each entropy value occurs. It can be seen that high entropy values occur more often in the file infected by an infective virus.
Through the operations above, the distribution frequency of the entropy of the entry-point immediate values of an infective virus sample can be extracted and used as a feature vector.
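The entropy step can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the text does not say how the entropy of a single immediate value is defined, so the sketch assumes Shannon entropy over the digits of the value written in each base, treats values as unsigned 32-bit numbers, keeps the lowest of the three entropies per value, and groups values into buckets of width 0.5. The function names are hypothetical.

```python
import math
from collections import Counter


def digit_entropy(digits: str) -> float:
    """Shannon entropy, in bits per symbol, of the characters of a digit string."""
    if not digits:
        return 0.0
    total = len(digits)
    return -sum((n / total) * math.log2(n / total) for n in Counter(digits).values())


def immediate_entropies(value: int) -> dict:
    """Entropy of one immediate value written in binary, decimal and hexadecimal."""
    unsigned = value & 0xFFFFFFFF  # assumption: treat the value as unsigned 32-bit
    return {
        "bin": digit_entropy(format(unsigned, "b")),
        "dec": digit_entropy(format(unsigned, "d")),
        "hex": digit_entropy(format(unsigned, "x")),
    }


def entropy_histogram(immediates, bucket_width=0.5):
    """Count immediates per entropy bucket, using the lowest of the three entropies."""
    histogram = Counter()
    for value in immediates:
        lowest = min(immediate_entropies(value).values())
        bucket = int(lowest / bucket_width) * bucket_width
        histogram[bucket] += 1
    return dict(sorted(histogram.items()))
```

Taking the minimum reflects the remark that a hand-written constant is small in at least one base; a model could equally be given all three histograms.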
The feature vector extracted in the embodiments of the present application may also include the instruction frequency of the code at the entry point, which helps identify whether an infective virus has modified the must-pass flow.
To extract the instruction frequency of the entry-point code, the entry point is first located, the instructions at the entry point are extracted, and the frequency of occurrence of each instruction is then counted. The method provided by the embodiments of the present application for extracting the instruction frequency of the entry-point code includes:
First, using the depth-first principle, traverse functions from the entry point down to a specified depth, until the number of instructions contained in the traversed functions reaches a specified number;
Then, count the frequency of occurrence of all instructions contained in the traversed functions to obtain the instruction frequency of the entry-point code.
The traversal method is illustrated in Fig. 5. The instruction code at the entry point is decompiled instruction by instruction; each point drawn as a circle is the location of a jump or function call, denoted c1, c2, c3 and so on. Traversal starts from c1 under the depth-first principle: when a call to a function is encountered, the depth value is incremented by 1 and the function is entered. Once the depth value reaches the specified value (the specified depth), for example 4, the depth value is no longer incremented for any further call encountered; only the function name is recorded and the function is not entered. Traversal ends once the instructions in all traversed functions reach the specified number, for example 2000. The order in which the points are traversed from the entry point in Fig. 5 under the depth-first principle is shown by the dotted arrows in Fig. 5, namely c1-c2-c4-c8. Note that if, during traversal, the number of instructions contained in the traversed functions reaches the specified number before the specified depth value is reached, traversal can stop without descending to the specified depth. For example, if on reaching c3 the functions c1, c2 and c3 already contain the specified 2000 instructions, traversal stops and c4 is not traversed.
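The traversal can be sketched as below. This is an illustration under assumptions rather than the implementation behind Fig. 5: the call graph is a made-up stand-in for the figure, instructions are plain strings, the entry function counts as depth 1, and a function that has already been entered is not entered again (a guard against recursion that the text does not mention).

```python
# Hypothetical call graph: function name -> its instructions, in order.
SAMPLE_CALL_GRAPH = {
    "c1": ["push", "call c2", "call c3", "ret"],
    "c2": ["mov", "call c4", "ret"],
    "c3": ["add", "ret"],
    "c4": ["xor", "call c8", "ret"],
    "c8": ["nop", "ret"],
}


def collect_entry_instructions(functions, entry, max_depth=4, max_instructions=2000):
    """Depth-first walk from the entry function, bounded by depth and instruction count."""
    collected = []      # instructions in the order they were reached
    entered = []        # functions actually entered, in traversal order
    recorded_only = []  # callees seen at the depth limit: name kept, body skipped

    def visit(name, depth):
        entered.append(name)
        for instruction in functions.get(name, []):
            if len(collected) >= max_instructions:
                return  # instruction budget reached: stop early
            collected.append(instruction)
            mnemonic, _, operand = instruction.partition(" ")
            if mnemonic != "call":
                continue
            if depth >= max_depth:
                recorded_only.append(operand)
            elif operand not in entered:
                visit(operand, depth + 1)

    visit(entry, 1)
    return collected, entered, recorded_only
```

With the sample graph and the default limits, the first four functions entered are c1, c2, c4 and c8, matching the order given in the text; lowering `max_instructions` stops the walk before the depth limit is reached.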
The traversal above yields the specified number of instructions at the entry point. The information obtained for these instructions may include the instruction name, the number of occurrences of the instruction, the ID corresponding to the instruction, and so on. Performing the traversal under the depth-first principle also makes it easier to determine later whether an infective virus has modified the must-pass flow.
As shown in Fig. 6, the frequency of occurrence of instructions can be represented by a curve, where the abscissa is the ID corresponding to each instruction and the ordinate is the number of occurrences of the instruction. For example, the three instructions add, adc and mov may be assigned the IDs 1, 2 and 3 respectively.
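A small sketch of the instruction frequency count, assuming the mnemonics have already been extracted as strings. Only the three IDs named in the text are used; a real table would cover the whole instruction set, and its contents would be an implementation choice.

```python
from collections import Counter

# IDs taken from the example in the text; further entries would be assumptions.
INSTRUCTION_IDS = {"add": 1, "adc": 2, "mov": 3}


def instruction_frequency(mnemonics, ids=INSTRUCTION_IDS):
    """Return occurrence counts ordered by instruction ID (the curve of Fig. 6)."""
    counts = Counter(m.lower() for m in mnemonics if m.lower() in ids)
    ordered_names = sorted(ids, key=ids.get)
    return [counts.get(name, 0) for name in ordered_names]
```

For the sequence mov, add, MOV, adc, mov, nop this gives the counts 1, 1 and 3 for IDs 1, 2 and 3; nop is ignored because it has no ID in this table.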
Through the operations above, the instruction frequency of the entry-point code of an infective virus sample can be extracted and used as a feature vector.
In addition, some structural features of a file infected by an infective virus change relative to those of an uninfected file; the structural features that change are referred to as structural features susceptible to infection by an infective virus. The embodiments of the present application can therefore also obtain these structural features when performing machine learning.
The structural features susceptible to infection by an infective virus described in the embodiments of the present application include at least one of the following:
the section containing the entry point, the number of executable sections, the names of the executable sections, the entropy of the section containing the entry point, and the position of the entry point within its section.
Regarding the section containing the entry point: in an uninfected file the entry point is generally in the first section, whereas after infection by an infective virus it may be in the last section or in a gap between sections. The section containing the entry point can therefore serve as one of the conditions for identifying an infective virus.
Regarding the number of executable sections: an uninfected file generally has one executable section, whereas infection by an infective virus may increase the number of executable sections, so that there is more than one. The number of executable sections can therefore also serve as one of the conditions for identifying an infective virus.
Regarding the names of the executable sections: the executable sections of a file generally take one of a small, fixed set of names; the following four are commonly used names of executable sections: txt, dat, rsrc, loc. If an executable section has a name outside this set, the file can be regarded as suspicious and possibly infected by an infective virus. The names of the executable sections can therefore likewise serve as one of the conditions for identifying an infective virus.
Regarding the entropy of the section containing the entry point: in an uninfected file this entropy generally lies in a relatively low range, for example 2.0 to 3.0. After infection by an infective virus the entropy generally falls outside this range, for example becoming a larger value. The entropy of the section containing the entry point can therefore likewise serve as an identification condition for an infective virus.
Regarding the position of the entry point within its section: infection can be judged from the alignment relationship. The entry point of an uninfected file generally lies at a position close to an alignment boundary; if the position of the entry point within its section is not close to an alignment boundary, this is likely to have been caused by an infective virus. The position of the entry point within its section is therefore used as an identification condition for an infective virus.
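The five structural features can be gathered as in the sketch below. It works on an already-parsed, simplified section list rather than on a real executable file; the `Section` type, the byte-level Shannon entropy used for the section, and the 16-byte alignment are assumptions made for illustration.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Section:
    name: str
    start: int        # address at which the section begins
    data: bytes       # section contents
    executable: bool


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def structural_features(sections, entry_point, alignment=16):
    """Collect the five structural features; -1 marks an entry point in a gap."""
    entry_index = next(
        (i for i, s in enumerate(sections)
         if s.start <= entry_point < s.start + len(s.data)),
        -1,
    )
    entry_section = sections[entry_index] if entry_index >= 0 else None
    executable = [s for s in sections if s.executable]
    return {
        "entry_section_index": entry_index,
        "executable_section_count": len(executable),
        "executable_section_names": [s.name for s in executable],
        "entry_section_entropy": byte_entropy(entry_section.data) if entry_section else 0.0,
        "entry_offset_mod_alignment": (
            (entry_point - entry_section.start) % alignment if entry_section else -1
        ),
    }
```

An entry point that falls outside every section is reported with index -1, which corresponds to the "gap between sections" case described above.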
The analysis above shows that an infective virus may be identified through any one of these structural features, so any one or more of them may be obtained. It will be understood that identification using several structural features is more accurate: the more structural features are obtained, the higher the accuracy of virus detection by the model trained through machine learning.
In addition, an embodiment of the present application may use machine learning to determine a weight value for each structural feature, so that an infective virus is identified according to each structural feature and its corresponding weight value.
The above are only several specific examples of structural features listed by the inventor. Since it is impossible to enumerate all structural features here, other structural features that change as a result of infection by an infective virus also fall within the scope of protection of the present application.
Therefore, when extracting the feature vector of an infective virus sample, the embodiments of the present application may extract, in addition to the distribution frequency of the entropy of the entry-point immediate values, the instruction frequency of the entry-point code and/or the structural features. The operations above yield the feature vector to be extracted from an infective virus sample.
S420: perform computation using a preset machine learning classification algorithm to obtain the infective virus identification model.
In this step, the samples obtained above and the feature vectors extracted from them are input to a machine learning classification algorithm, thereby obtaining the infective virus identification model.
The embodiments of the present application place no particular limitation on the classification algorithm used; any existing classification algorithm may be used, such as a decision tree algorithm or an SVM (Support Vector Machine) algorithm.
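To make step S420 concrete, the sketch below trains a one-level decision tree (a decision stump) on labelled feature vectors. It is a deliberately tiny stand-in for the decision tree or SVM algorithms named above, written without external libraries; the function names and the idea of labelling infected samples 1 and clean samples 0 are assumptions.

```python
def train_stump(vectors, labels):
    """Pick the (feature, threshold, polarity) split with the fewest training errors."""
    best = None
    for feature in range(len(vectors[0])):
        for threshold in sorted({v[feature] for v in vectors}):
            for polarity in (1, -1):  # 1: predict infected when value > threshold
                errors = sum(
                    int((v[feature] > threshold) == (polarity == 1)) != label
                    for v, label in zip(vectors, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, feature, threshold, polarity)
    _, feature, threshold, polarity = best
    return {"feature": feature, "threshold": threshold, "polarity": polarity}


def predict(model, vector):
    """Return 1 (infected) or 0 (clean) for one feature vector."""
    above = vector[model["feature"]] > model["threshold"]
    return int(above == (model["polarity"] == 1))
```

In practice the feature vector would be the entropy histogram, optionally concatenated with the instruction frequencies and structural features, and a full decision tree or SVM from an established library would replace the stump.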
The training method above yields the infective virus identification model used for detecting infective viruses.
Steps S300 to S310 of the above infective virus detection method are described further below.
Step S300 extracts the feature vector of the file to be detected. The feature vector includes the distribution frequency of the entropy of the entry-point immediate values, and also includes structural features susceptible to infection by an infective virus and/or the instruction frequency of the entry-point code.
It will be understood that the feature vector extracted when detecting an infective virus is the same as the feature vector extracted when training the model that is used. The feature vector of the file to be detected extracted in step S300 is therefore the same as that extracted when training the infective virus identification model above. It includes the distribution frequency of the entropy of the entry-point immediate values; the method for obtaining this distribution frequency is as described for step S410 above and is not repeated here.
It also includes structural features, which include but are not limited to at least one of the following:
the section containing the entry point, the number of executable sections, the names of the executable sections, the entropy of the section containing the entry point, and the position of the entry point within its section.
The changes that each of these structural features undergoes after infection by an infective virus are not repeated here.
It may also include the instruction frequency of the entry-point code; the method for extracting it is the same as described above for training the infective virus identification model.
Step S310 uses the infective virus identification model to detect, based on the extracted feature vector, whether the file to be detected is an infective virus file, that is, whether the file to be detected has been infected by an infective virus.
The embodiments of the present application extract the feature vector of the file to be detected, the feature vector including the distribution frequency of the entropy of the entry-point immediate values and also including structural features susceptible to infection by an infective virus and/or the instruction frequency of the entry-point code, and use the infective virus identification model obtained by machine learning on such feature vectors to detect, according to the feature vector, whether the file to be detected is an infective virus file. This overcomes the high labor cost of manual analysis and of manually written heuristic rules; the detection method based on the infective virus identification model greatly increases detection speed and can effectively detect unknown infective viruses.
Based on the same idea as the method above, the embodiments of the present application also provide a device for detecting infective viruses. Fig. 8 is a schematic structural diagram of one embodiment of the device, which mainly includes:
a unit 80 for extracting the feature vector of the file to be detected, the feature vector including the distribution frequency of the entropy of the entry-point immediate values, hereinafter referred to as feature vector extraction unit 80;
a unit 81 for detecting, according to the feature vector and using the infective virus identification model obtained by machine learning on such feature vectors, whether the file to be detected is an infective virus file, hereinafter referred to as virus detection unit 81.
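The division of labour between the two units can be pictured with the following sketch. The class names mirror units 80 and 81, but the interfaces, the placeholder extractor and the placeholder model are all assumptions made for illustration; the text does not prescribe a software structure.

```python
from typing import Callable, List, Sequence

Extractor = Callable[[bytes], Sequence[float]]


class FeatureVectorExtractionUnit:
    """Plays the role of unit 80: turns a file into one flat feature vector."""

    def __init__(self, extractors: Sequence[Extractor]):
        self.extractors = list(extractors)

    def extract(self, file_bytes: bytes) -> List[float]:
        vector: List[float] = []
        for extractor in self.extractors:
            vector.extend(extractor(file_bytes))
        return vector


class VirusDetectionUnit:
    """Plays the role of unit 81: applies a trained model to the vector."""

    def __init__(self, model: Callable[[Sequence[float]], int]):
        self.model = model

    def is_infected(self, vector: Sequence[float]) -> bool:
        return self.model(vector) == 1


# Placeholder extractor and model, for illustration only.
def toy_extractor(file_bytes: bytes) -> List[float]:
    zeros = file_bytes.count(0)
    return [float(zeros), float(len(file_bytes) - zeros)]


def toy_model(vector: Sequence[float]) -> int:
    return 1 if vector[1] > vector[0] else 0
```

A real unit 80 would plug in the entropy histogram, instruction frequency and structural feature extractors, and unit 81 would wrap the trained identification model.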
The functions of these two units are described in further detail below.
Because an infective virus may modify the must-pass flow, the feature vector extraction unit 80 of the embodiments of the present application can extract the distribution frequency of the entropy of the entry-point immediate values as a feature vector for infective virus detection, thereby identifying whether an infective virus has modified the must-pass flow.
The immediate value referred to in the embodiments of the present application is defined as follows:
The number supplied in an instruction that uses the immediate addressing mode is generally called an immediate value. An immediate value may be 8, 16 or 32 bits wide and follows the operation code (i.e. the instruction). If the immediate value is 16 or 32 bits wide, it is stored according to the "high-high, low-low" principle, that is, the high-order byte at the higher address and the low-order byte at the lower address. For example:
MOV AH, 80H      MOV B1, 12H
ADD AX, 1234H    MOV W1, 3456H
MOV ECX, 123456H ADD D1, 32123456H
Here B1, W1 and D1 are byte, word and doubleword memory locations respectively. The second operand (the source operand) in each of the instructions above is an immediate value.
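The storage order can be checked with Python's standard `struct` module, which packs integers in little-endian order when the format string starts with `<`. The two values are the 16-bit and 32-bit immediates from the instructions above.

```python
import struct

# "<H" is a little-endian unsigned 16-bit value, "<I" a little-endian unsigned 32-bit value.
word = struct.pack("<H", 0x1234)
dword = struct.pack("<I", 0x32123456)

print(word.hex())   # 3412
print(dword.hex())  # 56341232
```

The low-order byte comes first in memory, which is the order in which the bytes of an immediate value appear after the operation code.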
In information theory, entropy is a measure of uncertainty. The greater the amount of information, the smaller the uncertainty and the smaller the entropy; the smaller the amount of information, the greater the uncertainty and the greater the entropy. Given this property, the randomness and degree of disorder of an event can be judged by calculating its entropy. Entropy can also be used to judge the degree of dispersion of an indicator: the greater the dispersion of an indicator, the greater its influence on an overall evaluation.
The entropy of the immediate values of a normal, uninfected file is generally small. If the entropy of the immediate values is high, for example if the number of occurrences of high-entropy immediate values exceeds a specified range, the file can be considered to have been infected by an infective virus.
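As a rough illustration of this rule, the sketch below counts how many entropy values lie above a cut-off and flags the file when that count exceeds a limit. Both numbers are hypothetical placeholders; the text gives no concrete values, and in the described method the decision is ultimately made by the trained model rather than by a fixed threshold.

```python
def looks_infected(entropies, high_entropy=2.5, max_high_count=50):
    """Flag a file whose count of high-entropy immediates exceeds the allowed range."""
    high_count = sum(1 for e in entropies if e > high_entropy)
    return high_count > max_high_count
```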
To extract the distribution frequency of the entropy of the entry-point immediate values, the feature vector extraction unit 80 may, as shown in Fig. 10, further include the following subunits:
a subunit 803 for traversing, under the depth-first principle, functions from the entry point down to a specified depth until the number of instructions contained in the traversed functions reaches a specified number, hereinafter referred to as entry point instruction extraction subunit 803. As can be seen, extracting the distribution frequency of the entropy of the entry-point immediate values also traverses the entry point under the depth-first principle to obtain the immediate value of each instruction; the method of traversing functions from the entry point to the specified depth under the depth-first principle is as described in the embodiments above and is not repeated here.
a subunit 804 for collecting the immediate values of all instructions contained in the traversed functions, hereinafter referred to as immediate value statistics subunit 804;
a subunit 805 for calculating the entropy of the immediate values, hereinafter referred to as immediate value entropy calculation subunit 805. Because it cannot be determined reliably which numeral base an immediate value was written in, the immediate value entropy calculation subunit 805 can calculate the binary entropy, decimal entropy and hexadecimal entropy of the immediate value at the same time; for an immediate value in hand-written, non-virus code, the entropy in at least one of these bases is necessarily small.
a subunit 806 for counting the distribution frequency of the entropy of the immediate values to obtain the distribution frequency of the entropy of the entry-point immediate values, hereinafter referred to as distribution frequency statistics subunit 806. Fig. 7 compares the distribution frequency of the entropy of the entry-point immediate values of a file not infected by a virus with that of a file infected by an infective virus; as Fig. 7 shows, high entropy values occur more often in a file infected by an infective virus.
In addition, to detect modification of the must-pass flow by an infective virus, another embodiment of the feature vector extraction unit 80 may be structured as shown in Fig. 9 and may include the following subunits for extracting the instruction frequency of the entry-point code:
a subunit 801 for traversing, under the depth-first principle, functions from the entry point down to a specified depth until the number of instructions contained in the traversed functions reaches a specified number, hereinafter referred to as entry point instruction extraction subunit 801;
a subunit 802 for counting the frequency of occurrence of all instructions contained in the traversed functions to obtain the instruction frequency of the entry-point code, hereinafter referred to as entry point instruction frequency statistics subunit 802.
To extract the instructions of the entry-point code, the entry point instruction extraction subunit 801 first locates the entry point and then extracts the instructions at the entry point. The method provided by the embodiments of the present application by which the entry point instruction extraction subunit 801 extracts the instructions of the entry-point code includes:
using the depth-first principle, traversing functions from the entry point down to a specified depth until the number of instructions contained in all traversed functions reaches a specified number.
The traversal method is illustrated in Fig. 5. The instruction code at the entry point is decompiled instruction by instruction; each point drawn as a circle is the location of a jump or function call, denoted c1, c2, c3 and so on. Traversal starts from c1 under the depth-first principle: when a call to a function is encountered, the depth value is incremented by 1 and the function is entered. Once the depth value reaches the specified value (the specified depth), for example 4, the depth value is no longer incremented for any further call encountered; only the function name is recorded and the function is not entered. Traversal ends once the instructions in all traversed functions reach the specified number, for example 2000. The order in which the points are traversed from the entry point in Fig. 5 under the depth-first principle is c1-c2-c4-c8. Note that if, during traversal, the number of instructions contained in the traversed functions reaches the specified number before the specified depth value is reached, traversal can stop without descending to the specified depth. For example, if on reaching c3 the functions c1, c2 and c3 already contain the specified 2000 instructions, traversal stops and c4 is not traversed.
The traversal above yields the specified number of instructions at the entry point. The information obtained for these instructions may include the instruction name, the number of occurrences of the instruction, the ID corresponding to the instruction, and so on. Performing the traversal under the depth-first principle also makes it easier to determine later whether an infective virus has modified the must-pass flow.
The entry point instruction frequency statistics subunit 802 counts the frequency of occurrence of all instructions contained in the functions traversed by the entry point instruction extraction subunit 801 to obtain the instruction frequency of the entry-point code. As shown in Fig. 6, the frequency of occurrence of the instructions can be represented by a curve, where the abscissa is the ID corresponding to each instruction and the ordinate is the number of occurrences of the instruction. For example, the three instructions add, adc and mov may be assigned the IDs 1, 2 and 3 respectively.
In addition, some structural features of a file infected by an infective virus change relative to those of an uninfected file; the structural features that change are referred to as structural features susceptible to infection by an infective virus. The feature vector extraction unit 80 of the embodiments of the present application can therefore also extract structural features of the file to be detected. The structural features it extracts are those susceptible to infection by an infective virus and include at least one of the following:
the section containing the entry point, the number of executable sections, the names of the executable sections, the entropy of the section containing the entry point, and the position of the entry point within its section.
Regarding the section containing the entry point: in an uninfected file the entry point is generally in the first section, whereas after infection by an infective virus it may be in the last section or in a gap between sections. The section containing the entry point can therefore serve as one of the conditions for identifying an infective virus.
Regarding the number of executable sections: an uninfected file generally has one executable section, whereas infection by an infective virus may increase the number of executable sections, so that there is more than one. The number of executable sections can therefore also serve as one of the conditions for identifying an infective virus.
Regarding the names of the executable sections: the executable sections of a file generally take one of a small, fixed set of names; the following four are commonly used names of executable sections: txt, dat, rsrc, loc. If an executable section has a name outside this set, the file can be regarded as suspicious and possibly infected by an infective virus. The names of the executable sections can therefore likewise serve as one of the conditions for identifying an infective virus.
Regarding the entropy of the section containing the entry point: in an uninfected file this entropy generally lies in a relatively low range, for example 2.0 to 3.0. After infection by an infective virus the entropy generally falls outside this range, for example becoming a larger value. The entropy of the section containing the entry point can therefore likewise serve as an identification condition for an infective virus.
Regarding the position of the entry point within its section: infection can be judged from the alignment relationship. The entry point of an uninfected file generally lies at a position close to an alignment boundary; if the position of the entry point within its section is not close to an alignment boundary, this is likely to have been caused by an infective virus. The position of the entry point within its section is therefore used as an identification condition for an infective virus.
The analysis above shows that an infective virus may be identified through any one of these structural features, so the feature vector extraction unit 80 may obtain any one or more of them. It will be understood that identification using several structural features is more accurate: the more structural features are obtained, the higher the accuracy of virus detection by the model trained through machine learning.
In addition, an embodiment of the present application may use machine learning to determine a weight value for each structural feature, so that an infective virus is identified according to each structural feature and its corresponding weight value.
The above are only several specific examples of structural features listed by the inventor. Since it is impossible to enumerate all structural features here, other structural features that change as a result of infection by an infective virus also fall within the scope of protection of the present application.
Therefore, the feature vector extraction unit 80 of the embodiments of the present application may extract only the distribution frequency of the entropy of the entry-point immediate values of the file to be detected, or may additionally extract the instruction frequency of the entry-point code and/or the structural features.
The virus detection unit 81 uses the infective virus identification model to detect, based on the feature vector extracted by the feature vector extraction unit 80, whether the file to be detected is an infective virus file, that is, whether the file to be detected has been infected by an infective virus.
The training method of the infective virus identification model is the same as described in the method embodiments above and is not repeated here.
The embodiments of the present application extract the feature vector of the file to be detected, the feature vector including the distribution frequency of the entropy of the entry-point immediate values, and use the infective virus identification model obtained by machine learning on such feature vectors to detect, according to the feature vector, whether the file to be detected is an infective virus file. This overcomes the high labor cost of manual analysis and of manually written heuristic rules; the detection method based on the infective virus identification model greatly increases detection speed and can effectively detect unknown infective viruses.
It should be noted that the present invention can be implemented in software and/or a combination of software and hardware; for example, it can be implemented using an application-specific integrated circuit (ASIC), a general-purpose computer, or any other similar hardware device. In one embodiment, the software program of the present invention can be executed by a processor to implement the steps or functions described above. Likewise, the software program of the present invention (including related data structures) can be stored in a computer-readable recording medium, for example a RAM memory, a magnetic or optical drive, a floppy disk, or a similar device. In addition, some steps or functions of the present invention can be implemented in hardware, for example as a circuit that cooperates with a processor to perform each step or function.
In addition, part of the present invention can be applied as a computer program product, for example computer program instructions which, when executed by a computer, can invoke or provide the method and/or technical solution according to the present invention through the operation of that computer. The program instructions that invoke the method of the present invention may be stored in a fixed or removable recording medium, and/or transmitted by a data stream in a broadcast or other signal-bearing medium, and/or stored in the working memory of a computer device that runs according to the program instructions. Here, an embodiment of the present invention includes a device that comprises a memory for storing computer program instructions and a processor for executing program instructions, wherein, when the computer program instructions are executed by the processor, the device is triggered to run the methods and/or technical solutions based on the foregoing embodiments of the present invention.
It is obvious to those skilled in the art that the present invention is not limited to the details of the exemplary embodiments above, and that the present invention can be implemented in other specific forms without departing from its spirit or essential characteristics. The embodiments should therefore be regarded in all respects as exemplary and non-restrictive; the scope of the present invention is defined by the appended claims rather than by the description above, and all changes that fall within the meaning and range of equivalency of the claims are intended to be embraced in the present invention. Any reference sign in a claim should not be regarded as limiting the claim concerned. Furthermore, the word "comprising" clearly does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or devices recited in a system claim may also be implemented by a single unit or device through software or hardware. Words such as "first" and "second" are used to denote names and do not denote any particular order.
Claims (14)
1. A method for detecting infective viruses, comprising:
extracting a feature vector of a file to be detected, the feature vector including the distribution frequency of the entropy of the entry-point immediate values;
detecting, according to the feature vector and using an infective virus identification model obtained by machine learning on such feature vectors, whether the file to be detected is an infective virus file.
2. The method according to claim 1, wherein extracting the feature vector of the file to be detected comprises:
traversing, under the depth-first principle, functions from the entry point down to a specified depth until the number of instructions contained in the traversed functions reaches a specified number;
collecting the immediate values of all instructions contained in the traversed functions;
calculating the entropy of the immediate values;
counting the distribution frequency of the entropy of the immediate values to obtain the distribution frequency of the entropy of the entry-point immediate values.
3. The method according to claim 2, wherein the entropy includes:
binary entropy, decimal entropy and hexadecimal entropy.
4. The method according to claim 1, wherein the feature vector further includes:
structural features susceptible to infection by an infective virus.
5. The method according to claim 3, wherein the structural features include at least one of the following:
the section containing the entry point, the number of executable sections, the names of the executable sections, the entropy of the section containing the entry point, and the position of the entry point within its section.
6. The method according to claim 1 or 4, wherein the feature vector further includes:
the instruction frequency of the entry-point code.
7. The method according to claim 6, wherein extracting the feature vector of the file to be detected comprises:
traversing, under the depth-first principle, functions from the entry point down to a specified depth until the number of instructions contained in the traversed functions reaches a specified number;
counting the frequency of occurrence of all instructions contained in the traversed functions to obtain the instruction frequency of the entry-point code.
8. A device for detecting infective viruses, comprising:
a unit for extracting the feature vector of a file to be detected, wherein the feature vector
comprises the distribution frequency of the entropy of the immediate values at the entry point; and
a unit for detecting, according to the feature vector, whether the file to be detected is an
infective virus file, using an infective virus identification model obtained by machine learning
based on the feature vector.
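The two units of this claim can be pictured as an extraction callable plugged into a detection object that holds a trained model. The claim does not name a learning algorithm, so the nearest-centroid classifier below is purely a stand-in, and `train_centroids`, `Detector` and the labels `"infected"` and `"clean"` are hypothetical names.

```python
import math


def train_centroids(samples):
    """Average the feature vectors of each label.

    `samples` is a list of (feature_vector, label) pairs; the result maps
    each label to its centroid.
    """
    sums, counts = {}, {}
    for vec, label in samples:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [x / counts[label] for x in acc] for label, acc in sums.items()}


class Detector:
    """Detection unit: extracts a feature vector, then applies the model."""

    def __init__(self, extract, centroids):
        self.extract = extract      # extraction unit (any callable)
        self.centroids = centroids  # stand-in for the learned model

    def is_infected(self, file_obj):
        vec = self.extract(file_obj)
        nearest = min(
            self.centroids,
            key=lambda label: math.dist(vec, self.centroids[label]),
        )
        return nearest == "infected"
```

Keeping extraction and detection as separate parts mirrors the claim's two units and lets the same extractor serve both training and detection.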
9. The device according to claim 8, wherein the unit for extracting the feature vector of the
file to be detected comprises:
a subunit for traversing functions to a specified depth, depth-first, starting from the entry
point, until the number of instructions contained in the traversed functions reaches a specified
quantity;
a subunit for collecting the immediate values of all instructions contained in the traversed
functions;
a subunit for calculating the entropy of the immediate values; and
a subunit for computing the distribution frequency of the entropy of the immediate values, to
obtain the distribution frequency of the entropy of the immediate values at the entry point.
10. The device according to claim 9, wherein the entropy comprises:
binary entropy, decimal entropy and hexadecimal entropy.
11. The device according to claim 8, wherein the feature vector further comprises:
structural features that are susceptible to infection by infective viruses.
12. The device according to claim 11, wherein the structural features comprise at least one of
the following:
the section containing the entry point, the number of executable sections, the names of the
executable sections, the entropy of the section containing the entry point, and the position of
the entry point within its section.
13. The device according to claim 8 or 11, wherein the feature vector further comprises:
the instruction frequency of the code at the entry point.
14. The device according to claim 13, wherein the unit for extracting the feature vector of the
file to be detected comprises:
a subunit for traversing functions to a specified depth, depth-first, starting from the entry
point, until the number of instructions contained in the traversed functions reaches a specified
quantity; and
a subunit for counting the frequency of occurrence of all instructions contained in the
traversed functions, to obtain the instruction frequency of the code at the entry point.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510038791.1A CN105893842A (en) | 2015-01-26 | 2015-01-26 | Method and device used for detecting infective viruses |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510038791.1A CN105893842A (en) | 2015-01-26 | 2015-01-26 | Method and device used for detecting infective viruses |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105893842A (en) | 2016-08-24 |
Family
ID=56999167
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510038791.1A Pending CN105893842A (en) | 2015-01-26 | 2015-01-26 | Method and device used for detecting infective viruses |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105893842A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109992969A (en) * | 2019-03-25 | 2019-07-09 | 腾讯科技(深圳)有限公司 | A kind of malicious file detection method, device and detection platform |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100162400A1 (en) * | 2008-12-11 | 2010-06-24 | Scansafe Limited | Malware detection |
CN103577755A (en) * | 2013-11-01 | 2014-02-12 | 浙江工业大学 | Malicious script static detection method based on SVM (support vector machine) |
CN103685307A (en) * | 2013-12-25 | 2014-03-26 | 北京奇虎科技有限公司 | Method, system, client and server for detecting phishing fraud webpage based on feature library |
CN104077524A (en) * | 2013-03-25 | 2014-10-01 | 腾讯科技(深圳)有限公司 | Training method used for virus identification and virus identification method and device |
Non-Patent Citations (1)
Title |
---|
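罗文华 (Luo Wenhua): "基于逆向技术的恶意程序检测方法研究" [Research on malicious program detection methods based on reverse-engineering technology], 《警察技术》 [Police Technology] * |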
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109992969A (en) * | 2019-03-25 | 2019-07-09 | 腾讯科技(深圳)有限公司 | A kind of malicious file detection method, device and detection platform |
CN109992969B (en) * | 2019-03-25 | 2023-03-21 | 腾讯科技(深圳)有限公司 | Malicious file detection method and device and detection platform |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20160824 |