CN105893842A - Method and device used for detecting infective viruses - Google Patents

Publication number
CN105893842A
Authority
CN
China
Prior art keywords
entropy
entrance
type virus
immediate
characteristic vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510038791.1A
Other languages
Chinese (zh)
Inventor
陈治宇
周吉文
周杰
李伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anyi Hengtong Beijing Technology Co Ltd
Original Assignee
Anyi Hengtong Beijing Technology Co Ltd

Abstract

The invention provides a method and a device for detecting infective viruses. The method comprises the following steps: extracting a feature vector of a file to be detected, wherein the feature vector comprises the frequency distribution of the entropy values of the immediate values at the entry point; and using an infective virus identification model, obtained by machine learning based on the feature vector, to detect from the feature vector whether the file to be detected is an infective virus file. This overcomes the high labor cost of manual analysis and of manually written heuristic rules; detection based on the infective virus identification model is considerably faster and can effectively detect unknown infective viruses.

Description

Method and device for detecting infective viruses
Technical field
The present invention relates to the field of computers, and in particular to a method and a device for detecting infective viruses.
Background
Infective viruses are the class of virus with the most variants. In the prior art, infective viruses are mostly detected by manual analysis and matching or by manually written heuristic rules. Because an infective virus continually changes its own code form and execution logic while propagating its viral code, features or rules must be added manually again and again to raise the detection rate. This requires a large investment of human resources to handle the ever-changing viruses by hand. Manual analysis and matching, and manually written heuristic rules, are therefore costly in labor; in addition, detection speed is hard to guarantee, and unknown infective viruses are hard to discover promptly and accurately.
Summary of the invention
One of the technical problems solved by the present invention is to provide a method and a device for detecting infective viruses that detect such viruses quickly and accurately while reducing labor cost.
According to an embodiment of one aspect of the present invention, a method for detecting infective viruses is provided, comprising:
Extracting a feature vector of a file to be detected, the feature vector comprising: the frequency distribution of the entropy values of the immediate values at the entry point;
Using an infective virus identification model, obtained by machine learning based on the feature vector, to detect from the feature vector whether the file to be detected is an infective virus file.
Optionally, extracting the feature vector of the file to be detected comprises:
Traversing functions from the entry point, depth-first and down to a specified depth, until the number of instructions contained in the traversed functions reaches a specified quantity;
Collecting the immediate values of all instructions contained in the traversed functions;
Calculating the entropy value of each immediate value;
Compiling the frequency distribution of those entropy values, thereby obtaining the frequency distribution of the entropy values of the immediate values at the entry point.
Optionally, the entropy value comprises:
A binary entropy value, a decimal entropy value and a hexadecimal entropy value.
Optionally, the feature vector further comprises:
Structural features that are susceptible to infection by infective viruses.
Optionally, the structural features comprise at least one of the following:
The section containing the entry point, the number of executable sections, the names of the executable sections, the entropy value of the section containing the entry point, and the position of the entry point within its section.
Optionally, the feature vector further comprises:
The instruction frequency of the code at the entry point.
Optionally, extracting the feature vector of the file to be detected comprises:
Traversing functions from the entry point, depth-first and down to a specified depth, until the number of instructions contained in the traversed functions reaches a specified quantity;
Counting the frequency of occurrence of all instructions contained in the traversed functions, thereby obtaining the instruction frequency of the code at the entry point.
According to an embodiment of another aspect of the present invention, a device for detecting infective viruses is provided, comprising:
A unit for extracting a feature vector of a file to be detected, the feature vector comprising: the frequency distribution of the entropy values of the immediate values at the entry point;
A unit for using an infective virus identification model, obtained by machine learning based on the feature vector, to detect from the feature vector whether the file to be detected is an infective virus file.
Optionally, the unit for extracting the feature vector of the file to be detected comprises:
A subunit for traversing functions from the entry point, depth-first and down to a specified depth, until the number of instructions contained in the traversed functions reaches a specified quantity;
A subunit for collecting the immediate values of all instructions contained in the traversed functions;
A subunit for calculating the entropy value of each immediate value;
A subunit for compiling the frequency distribution of those entropy values, thereby obtaining the frequency distribution of the entropy values of the immediate values at the entry point.
Optionally, the entropy value comprises:
A binary entropy value, a decimal entropy value and a hexadecimal entropy value.
Optionally, the feature vector further comprises:
Structural features that are susceptible to infection by infective viruses.
Optionally, the structural features comprise at least one of the following:
The section containing the entry point, the number of executable sections, the names of the executable sections, the entropy value of the section containing the entry point, and the position of the entry point within its section.
Optionally, the feature vector further comprises:
The instruction frequency of the code at the entry point.
Optionally, the unit for extracting the feature vector of the file to be detected comprises:
A subunit for traversing functions from the entry point, depth-first and down to a specified depth, until the number of instructions contained in the traversed functions reaches a specified quantity;
A subunit for counting the frequency of occurrence of all instructions contained in the traversed functions, thereby obtaining the instruction frequency of the code at the entry point.
The embodiments of the present application extract a feature vector of the file to be detected, the feature vector comprising the frequency distribution of the entropy values of the immediate values at the entry point, and use an infective virus identification model, obtained by machine learning based on the feature vector, to detect from the feature vector whether the file to be detected is an infective virus file. This overcomes the high labor cost of manual analysis and of manually written heuristic rules; the detection method based on the infective virus identification model greatly increases detection speed and can effectively detect unknown infective viruses.
Those of ordinary skill in the art will understand that, although the following detailed description refers to illustrated embodiments and drawings, the present invention is not limited to these embodiments. Rather, the scope of the present invention is broad and is intended to be limited only by the appended claims.
Brief description of the drawings
Other features, objects and advantages of the present invention will become more apparent from reading the following detailed description of non-limiting embodiments, made with reference to the accompanying drawings:
Fig. 1 is a flowchart of a method for detecting infective viruses according to an embodiment of the present invention.
Fig. 2 is a flowchart of a training method for the infective virus identification model according to another embodiment of the present invention.
Fig. 3 is a flowchart of a method for detecting infective viruses according to another embodiment of the present invention.
Fig. 4 is a flowchart of a training method for the infective virus identification model according to another embodiment of the present invention.
Fig. 5 is a schematic diagram of traversing the entry-point instructions depth-first according to an embodiment of the present invention.
Fig. 6 is a frequency distribution curve of the instructions in the entry-point code according to an embodiment of the present invention.
Fig. 7 is a schematic diagram comparing the frequency distribution of the entropy values of the entry-point immediate values of an uninfected file with that of a file infected by an infective virus, according to an embodiment of the present invention.
Fig. 8 is a structural schematic diagram of a device for detecting infective viruses according to an embodiment of the present invention.
Fig. 9 is a structural schematic diagram of a feature vector extraction unit according to an embodiment of the present invention.
Fig. 10 is a structural schematic diagram of a feature vector extraction unit according to another embodiment of the present invention.
In the drawings, the same or similar reference numerals denote the same or similar components.
Detailed description
An infective virus adds itself to other programs or dynamic library files (a kind of DLL) so that it runs in step with the infected program, and in turn damages the infected computer and spreads itself. Because of this characteristic, an infective virus must attach to a host program in order to run. To evade antivirus software, it usually splits, transforms or encrypts itself before attaching part or all of itself to the host program. Once a single virus file is executed, it may add viral code to most of the program files on the system and then spread to other computers. Manual identification therefore struggles to identify infective viruses quickly and accurately, and finds unknown infective viruses with even more difficulty. The embodiments of the present application propose a detection method for infective viruses that detects them using a trained infective virus identification model.
The present invention is described in further detail below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a method for detecting infective viruses according to an embodiment of the present invention. The method of the present invention is carried out mainly by the operating system or a processing controller of a computer device. The operating system or processing controller is referred to as the device for detecting infective viruses. The computer device includes, but is not limited to, at least one of the following: user equipment and network equipment. User equipment includes, but is not limited to, computers, smartphones and PDAs. Network equipment includes, but is not limited to, a single network server, a server group composed of multiple network servers, or a cloud composed of a large number of computers or network servers based on cloud computing, where cloud computing is a form of distributed computing in which a group of loosely coupled computers forms one super virtual computer.
As shown in Fig. 1, the method for detecting infective viruses mainly comprises the following steps:
S100, extract a feature vector of the file to be detected; the feature vector comprises: the frequency distribution of the entropy values of the immediate values at the entry point;
S110, use an infective virus identification model, obtained by machine learning based on the feature vector, to detect from the feature vector whether the file to be detected is an infective virus file.
First, it should be noted that the detection operation relies on the infective virus identification model; that is, an infective virus identification model must be trained before detection is performed. However, training does not need to be repeated for every detection, so training the model is not a required step of detecting infective viruses. The training method of the infective virus identification model is introduced first. Fig. 2 is a flowchart of the training method provided by one embodiment of the application; the training method may comprise the following steps:
S200, obtain infective virus samples, that is, files infected by an infective virus;
The embodiments of the application place no particular limit on how the infective virus samples are obtained or on their number. It will be understood that the more samples are obtained, the more accurately the trained model identifies viruses.
It should also be noted that the training method provided by the embodiments of the application may train on the obtained infective virus samples alone, that is, with black files only. It may also train on infective virus samples and uninfected samples in a 1:1 ratio, that is, with black files and white files at 1:1. Here a black file is a file infected by an infective virus, and a white file is a normal file that has not been infected.
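The two sample-set options described above (infected samples only, or infected and clean samples at 1:1) can be sketched in Python as follows. This is a minimal sketch, not the patent's procedure: the function name `build_training_set`, the down-sampling of the larger class to reach 1:1, and the file names in the usage lines are all assumptions made for illustration.

```python
import random

def build_training_set(black_files, white_files=None, seed=0):
    """Return (file, label) pairs; label 1 = infected ("black"), 0 = clean ("white").

    With no white files the set is black-only. Otherwise both classes are
    down-sampled to the size of the smaller one (an assumption), giving 1:1.
    """
    if not white_files:
        return [(f, 1) for f in black_files]
    rng = random.Random(seed)
    n = min(len(black_files), len(white_files))
    black = rng.sample(list(black_files), n)
    white = rng.sample(list(white_files), n)
    pairs = [(f, 1) for f in black] + [(f, 0) for f in white]
    rng.shuffle(pairs)  # mix the two classes before training
    return pairs

# Hypothetical file names, for illustration only.
black_only = build_training_set(["a.exe", "b.exe", "c.exe"])
balanced = build_training_set(["a.exe", "b.exe", "c.exe"], ["x.exe", "y.exe"])
print(len(black_only), len(balanced))  # 3 4
```

In the balanced case the set holds two infected and two clean samples, because the larger class is cut down to the size of the smaller one.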
S210, extract a feature vector of each infective virus sample; the feature vector comprises: the frequency distribution of the entropy values of the immediate values at the entry point;
Because an infective virus is likely to modify the mandatory execution flow, that is, the code path that execution must pass through, the embodiments of the application can extract the frequency distribution of the entropy values of the entry-point immediate values and apply machine learning to it, thereby identifying whether an infective virus has modified the mandatory execution flow.
Immediate values, as used in the embodiments of the application, are defined as follows:
The number given in an instruction that uses immediate addressing is called an immediate value. An immediate value may be 8, 16 or 32 bits wide, and it immediately follows the operation code in the instruction. If the immediate value is 16 or 32 bits wide, it is stored according to the "high-high, low-low" rule: the high-order byte at the higher address and the low-order byte at the lower address (little-endian order). For example:
MOV AH, 80H
ADD AX, 1234H
MOV ECX, 123456H
MOV B1, 12H
MOV W1, 3456H
ADD D1, 32123456H
Here B1, W1 and D1 are byte, word and double-word memory locations respectively. The second operand (the source operand) in each instruction above is an immediate value.
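The byte-ordering rule quoted above for 16-bit and 32-bit immediate values is little-endian order: the low-order byte sits at the lowest address. As a small illustration, the sketch below uses Python's standard `struct` module to show how the 32-bit immediate value `32123456H` from the last example instruction would be laid out in memory. The variable names are illustrative only.

```python
import struct

# 32-bit immediate value from the example instruction ADD D1, 32123456H
immediate = 0x32123456

# "<I" packs an unsigned 32-bit integer in little-endian order:
# the low-order byte lands at the lowest address.
encoded = struct.pack("<I", immediate)
print(encoded.hex(" "))  # 56 34 12 32

# Reading the bytes back with the same format recovers the value.
(decoded,) = struct.unpack("<I", encoded)
assert decoded == immediate
```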
In information theory, entropy is a measure of uncertainty. The more information there is, the smaller the uncertainty and the smaller the entropy; the less information there is, the greater the uncertainty and the greater the entropy. Given this property, an entropy value can be calculated to judge the randomness and degree of disorder of an event, or to judge the degree of dispersion of an indicator: the greater the dispersion of an indicator, the greater its influence on a comprehensive evaluation.
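For reference, the standard Shannon definition of entropy can be written as below. The patent text does not spell out a formula, so the base-2 logarithm (entropy in bits) and the symbols X, p_i and n are assumptions made here for illustration: X is a source with n possible symbols and p_i is the probability of the i-th symbol.

```latex
H(X) = -\sum_{i=1}^{n} p_i \log_2 p_i,
\qquad \sum_{i=1}^{n} p_i = 1,
\qquad 0 \le H(X) \le \log_2 n
```

H(X) is 0 when a single symbol has probability 1, and it reaches its maximum, log2 n, when all n symbols are equally likely.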
The entropy values of the immediate values in a normal, uninfected file are generally small. If the immediate values have high entropy, for example if the number of high-entropy immediate values exceeds a prescribed range, the file may be considered to have been infected by an infective virus. The embodiments of the application therefore extract the frequency distribution of the entropy values of the entry-point immediate values.
The method of extracting this frequency distribution comprises:
Traverse functions from the entry point, depth-first and down to a specified depth, until the number of instructions contained in the traversed functions reaches a specified quantity. In other words, the entry point is traversed depth-first to obtain a specified quantity of instructions at the entry point.
Fig. 5 is a schematic diagram of the traversal. The instruction code at the entry point is disassembled one instruction at a time. Each circle marks the position of a jump or function call, denoted c1, c2, c3 and so on. Traversal starts at c1 and proceeds depth-first: when a call is encountered, the depth value is incremented by 1 and the called function is entered. Once the depth value reaches the specified value (the specified depth), for example 4, a call that is encountered no longer increments the depth value; only the function name is recorded and the function is not entered. Traversal ends when the instructions in all traversed functions reach the specified quantity, for example 2000. The order in which the points in Fig. 5 are visited is shown by the dashed arrows in Fig. 5, namely c1-c2-c4-c8.
Note that if the number of instructions in the traversed functions reaches the specified quantity before the specified depth is reached, traversal can stop without going down to the specified depth. For example, if on reaching c3 the functions c1, c2 and c3 already contain the specified 2000 instructions, traversal stops and c4 is not traversed.
The traversal above yields the specified quantity of instructions at the entry point. Next, the immediate values of all instructions contained in the traversed functions are collected, and the entropy value of each immediate value is calculated. Because it is not possible to tell reliably which number base was used to write the value that an immediate represents, its binary, decimal and hexadecimal entropy values can all be calculated; for an immediate value written by hand in uninfected code, the entropy value in at least one of these bases is bound to be relatively small. Finally, the frequency distribution of the entropy values is compiled. Fig. 7 compares this distribution for an uninfected file with that for a file infected by an infective virus; the abscissa is the entropy value of the immediate values and the ordinate is the number of times each entropy value occurs. The figure shows that high entropy values occur more often in the infected file.
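The depth-limited traversal described above can be sketched as follows. This is a minimal Python sketch under stated assumptions: the call graph is a hand-made dictionary rather than real disassembly, the names `collect_instructions`, `functions`, `max_depth` and `max_instructions` are hypothetical, the shape of the example graph is invented (Fig. 5 is not reproduced here), and skipping already-entered functions is an added safeguard that the text does not mention.

```python
def collect_instructions(functions, entry, max_depth=4, max_instructions=2000):
    """Walk the call graph depth-first from `entry`.

    `functions` maps a function name to its list of (mnemonic, operand)
    instructions; a ("call", name) instruction refers to another function.
    Returns the collected instructions and the order functions were entered.
    """
    collected, entered = [], []

    def walk(name, depth):
        entered.append(name)
        for mnemonic, operand in functions.get(name, []):
            if len(collected) >= max_instructions:
                return  # instruction budget reached: stop early
            collected.append((mnemonic, operand))
            follow = (
                mnemonic == "call"
                and depth < max_depth        # at the depth limit, record only
                and operand not in entered   # assumption: skip revisits
                and len(collected) < max_instructions
            )
            if follow:
                walk(operand, depth + 1)

    walk(entry, 1)
    return collected, entered

# Hypothetical call graph; c16 lies below the depth limit of 4.
functions = {
    "c1": [("push", "ebp"), ("call", "c2"), ("call", "c3"), ("ret", None)],
    "c2": [("mov", "eax, 1234h"), ("call", "c4"), ("ret", None)],
    "c3": [("xor", "eax, eax"), ("ret", None)],
    "c4": [("call", "c8"), ("ret", None)],
    "c8": [("call", "c16"), ("ret", None)],
    "c16": [("nop", None), ("ret", None)],
}
instructions, order = collect_instructions(functions, "c1")
print(order)  # ['c1', 'c2', 'c4', 'c8', 'c3']
```

The call to `c16` is recorded as an instruction inside `c8`, but `c16` itself is never entered because `c8` already sits at depth 4. Passing a small `max_instructions` shows the early stop: with a budget of 5, only `c1`, `c2` and `c4` are entered.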
The operations above extract the frequency distribution of the entropy values of the entry-point immediate values of an infective virus sample, which serves as the feature vector.
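One possible reading of this feature is sketched below in Python. The patent does not state how the entropy of a single immediate value is computed or how the distribution is binned, so everything here is an assumption: entropy is taken over the digits of the value written in base 2, 10 and 16, the histogram uses bins 0.5 bits wide, and the names `digit_entropy`, `immediate_entropies` and `entropy_histogram` are hypothetical.

```python
import math
from collections import Counter

def digit_entropy(text):
    """Shannon entropy (bits per digit) of the characters in `text`."""
    counts = Counter(text)
    total = len(text)
    return sum((n / total) * math.log2(total / n) for n in counts.values())

def immediate_entropies(value):
    """Entropy of one non-negative immediate written in base 2, 10 and 16."""
    return {
        "bin": digit_entropy(format(value, "b")),
        "dec": digit_entropy(format(value, "d")),
        "hex": digit_entropy(format(value, "x")),
    }

def entropy_histogram(immediates, base="hex", bin_width=0.5, max_entropy=4.0):
    """Count how many immediates fall into each entropy bin for one base."""
    bins = [0] * int(max_entropy / bin_width + 1)
    for value in immediates:
        h = immediate_entropies(value)[base]
        bins[min(int(h / bin_width), len(bins) - 1)] += 1
    return bins

# 0x1000 looks hand-written; 0x7A3F91C5 uses eight distinct hex digits.
print(immediate_entropies(0x1000)["dec"])      # 2.0 ("4096": four distinct digits)
print(immediate_entropies(0x7A3F91C5)["hex"])  # 3.0
print(entropy_histogram([0x0, 0x7A3F91C5]))    # [1, 0, 0, 0, 0, 0, 1, 0, 0]
```

For `0x1000` the decimal form has comparatively high digit entropy while the binary and hexadecimal forms are dominated by zeros, which is consistent with the text's remark that a hand-written value has low entropy in at least one base.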
S220, run a preset machine-learning classification algorithm to obtain the infective virus identification model.
In this step, the feature vectors extracted from the samples obtained above are input into a machine-learning classification algorithm, which yields the infective virus identification model.
The embodiments of the application place no particular limit on the classification algorithm used; any existing classification algorithm may be used, such as a decision tree algorithm or an SVM (Support Vector Machine) algorithm.
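As a rough illustration of this training step, the sketch below fits a one-level decision tree (a decision stump) in plain Python. It is a deliberately simplified stand-in for the decision tree or SVM algorithms named above, not the patent's model: the functions `train_stump` and `predict` are hypothetical, and the toy feature vectors (share of low-entropy and share of high-entropy immediate values) and labels are invented for the example.

```python
def train_stump(samples, labels):
    """Fit a one-level decision tree: choose the (feature, threshold) pair
    that misclassifies the fewest training samples.
    Labels are 1 (infected) or 0 (clean)."""
    best = None
    for feature in range(len(samples[0])):
        for threshold in sorted({s[feature] for s in samples}):
            # Rule tried here: predict "infected" when the value exceeds the threshold.
            errors = sum(
                (s[feature] > threshold) != bool(y)
                for s, y in zip(samples, labels)
            )
            if best is None or errors < best[0]:
                best = (errors, feature, threshold)
    _, feature, threshold = best
    return feature, threshold

def predict(model, sample):
    """Apply the learned rule to one feature vector."""
    feature, threshold = model
    return int(sample[feature] > threshold)

# Invented toy data: [share of low-entropy immediates, share of high-entropy immediates]
samples = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.2, 0.8]]
labels = [0, 0, 1, 1]

model = train_stump(samples, labels)
print(model)                          # (1, 0.2)
print(predict(model, [0.25, 0.75]))   # 1
print(predict(model, [0.85, 0.15]))   # 0
```

On this toy data the stump learns to split on the second feature, so a file whose immediate values are mostly high-entropy is classified as infected. A real model would use the full feature vector and a stronger classifier.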
The training method above yields the infective virus identification model used for detecting infective viruses.
Steps S100 to S110 of the detection method above are further described below.
Step S100 extracts the feature vector of the file to be detected; the feature vector comprises: the frequency distribution of the entropy values of the immediate values at the entry point;
It will be understood that the feature vector extracted during detection is the same as the feature vector extracted when training the model that is used. The feature vector of the file to be detected extracted in step S100 therefore likewise comprises: the frequency distribution of the entropy values of the immediate values at the entry point.
The method of extracting the frequency distribution of the entropy values of the entry-point immediate values is the same as described above for training the infective virus identification model and is not repeated here.
Step S110 uses the infective virus identification model to detect, from the feature vector extracted above, whether the file to be detected is an infective virus file, that is, whether the file to be detected has been infected by an infective virus.
The embodiments of the present application extract a feature vector of the file to be detected, the feature vector comprising the frequency distribution of the entropy values of the immediate values at the entry point, and use an infective virus identification model, obtained by machine learning based on the feature vector, to detect from the feature vector whether the file to be detected is an infective virus file. This overcomes the high labor cost of manual analysis and of manually written heuristic rules; the detection method based on the infective virus identification model greatly increases detection speed and can effectively detect unknown infective viruses.
Another embodiment of the application provides a method for detecting infective viruses, shown in Fig. 3, which may comprise the following steps:
S300, extract a feature vector of the file to be detected; the feature vector comprises: the frequency distribution of the entropy values of the immediate values at the entry point, and further comprises: structural features that are susceptible to infection by infective viruses and/or the instruction frequency of the code at the entry point;
S310, use an infective virus identification model, obtained by machine learning based on the feature vector, to detect from the feature vector whether the file to be detected is an infective virus file.
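The "and/or" composition of the feature vector in this embodiment can be sketched as follows. This is a minimal Python sketch, not the patent's encoding: the class `StructuralFeatures`, its field names, the function `build_feature_vector` and the sample numbers are all assumptions, and the names of the executable sections are left out because turning them into numbers would need a further encoding choice that the text does not specify.

```python
from dataclasses import dataclass, astuple

@dataclass
class StructuralFeatures:
    entry_section_index: int       # which section contains the entry point
    executable_section_count: int  # number of executable sections
    entry_section_entropy: float   # entropy value of the entry-point section
    entry_offset_in_section: int   # position of the entry point inside its section

def build_feature_vector(entropy_distribution, structural=None, instruction_freq=None):
    """Concatenate the mandatory entropy distribution with the optional groups."""
    vector = [float(x) for x in entropy_distribution]
    if structural is not None:
        vector.extend(float(x) for x in astuple(structural))
    if instruction_freq is not None:
        vector.extend(float(x) for x in instruction_freq)
    return vector

# Invented numbers, for illustration only.
vec = build_feature_vector(
    [3, 1, 0],
    structural=StructuralFeatures(0, 1, 6.2, 4096),
    instruction_freq=[0.4, 0.2],
)
print(vec)  # [3.0, 1.0, 0.0, 0.0, 1.0, 6.2, 4096.0, 0.4, 0.2]
```

Whichever optional groups are included, the same layout has to be used for training and for detection so that the model sees vectors of a consistent shape.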
Likewise, this detection operation relies on the infective virus identification model; that is, an infective virus identification model must be trained before detection is performed. However, training does not need to be repeated for every detection, so training the model is not a required step of detecting infective viruses. The training method of the infective virus identification model is introduced first. Fig. 4 is a flowchart of the training method provided by another embodiment of the application; the training method may comprise the following steps:
S400, obtain infective virus samples, that is, files infected by an infective virus;
The embodiments of the application place no particular limit on how the infective virus samples are obtained or on their number. It will be understood that the more samples are obtained, the more accurately the trained model identifies viruses. The training method provided by the embodiments of the application may train on the obtained infective virus samples alone, that is, with black files only. It may also train on infective virus samples and uninfected samples in a 1:1 ratio, that is, with black files and white files at 1:1. Here a black file is a file infected by an infective virus, and a white file is a normal file that has not been infected by a virus.
S410, extract a feature vector of each infective virus sample; the feature vector comprises: the frequency distribution of the entropy values of the immediate values at the entry point, and further comprises: structural features that are susceptible to infection by infective viruses and/or the instruction frequency of the code at the entry point;
Because an infective virus is likely to modify the mandatory execution flow, that is, the code path that execution must pass through, the embodiments of the application can extract the frequency distribution of the entropy values of the entry-point immediate values and apply machine learning to it, thereby identifying whether an infective virus has modified the mandatory execution flow.
Immediate values, as used in the embodiments of the application, are defined as follows:
The number given in an instruction that uses immediate addressing is called an immediate value. An immediate value may be 8, 16 or 32 bits wide, and it immediately follows the operation code in the instruction. If the immediate value is 16 or 32 bits wide, it is stored according to the "high-high, low-low" rule: the high-order byte at the higher address and the low-order byte at the lower address (little-endian order). For example:
MOV AH, 80H
ADD AX, 1234H
MOV ECX, 123456H
MOV B1, 12H
MOV W1, 3456H
ADD D1, 32123456H
Here B1, W1 and D1 are byte, word and double-word memory locations respectively. The second operand (the source operand) in each instruction above is an immediate value.
In information theory, entropy is a measure of uncertainty. The more information there is, the smaller the uncertainty and the smaller the entropy; the less information there is, the greater the uncertainty and the greater the entropy. Given this property, an entropy value can be calculated to judge the randomness and degree of disorder of an event, or to judge the degree of dispersion of an indicator: the greater the dispersion of an indicator, the greater its influence on a comprehensive evaluation.
The entropy values of the immediate values in a normal, uninfected file are generally small. If the immediate values have high entropy, for example if the number of high-entropy immediate values exceeds a prescribed range, the file may be considered to have been infected by an infective virus. The embodiments of the application therefore extract the frequency distribution of the entropy values of the entry-point immediate values.
The method of extracting this frequency distribution comprises:
Traverse functions from the entry point, depth-first and down to a specified depth, until the number of instructions contained in the traversed functions reaches a specified quantity. In other words, the entry point is traversed depth-first to obtain a specified quantity of instructions at the entry point.
Fig. 5 is a schematic diagram of the traversal. The instruction code at the entry point is disassembled one instruction at a time. Each circle marks the position of a jump or function call, denoted c1, c2, c3 and so on. Traversal starts at c1 and proceeds depth-first: when a call is encountered, the depth value is incremented by 1 and the called function is entered. Once the depth value reaches the specified value (the specified depth), for example 4, a call that is encountered no longer increments the depth value; only the function name is recorded and the function is not entered. Traversal ends when the instructions in all traversed functions reach the specified quantity, for example 2000. The order in which the points in Fig. 5 are visited is shown by the dashed arrows in Fig. 5, namely c1-c2-c4-c8.
Note that if the number of instructions in the traversed functions reaches the specified quantity before the specified depth is reached, traversal can stop without going down to the specified depth. For example, if on reaching c3 the functions c1, c2 and c3 already contain the specified 2000 instructions, traversal stops and c4 is not traversed.
The traversal above yields the specified quantity of instructions at the entry point. Next, the immediate values of all instructions contained in the traversed functions are collected, and the entropy value of each immediate value is calculated. Because it is not possible to tell reliably which number base was used to write the value that an immediate represents, its binary, decimal and hexadecimal entropy values can all be calculated; for an immediate value written by hand in uninfected code, the entropy value in at least one of these bases is bound to be relatively small. Finally, the frequency distribution of the entropy values is compiled. Fig. 7 compares this distribution for an uninfected file with that for a file infected by an infective virus; the abscissa is the entropy value of the immediate values and the ordinate is the number of times each entropy value occurs. The figure shows that high entropy values occur more often in the infected file.
The above operations extract the distribution frequency of the entropy values of the immediates at the entry point of an infective virus sample, which is used as a feature vector.
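The entropy step described above can be sketched as follows. The text does not say how the entropy of a single immediate is computed, so this sketch assumes Shannon entropy over the digits of the value written in base 2, 10 and 16, and assumes the histogram is built from the smallest of the three values. The function names, the bin width and the number of bins are hypothetical.

```python
import math
from collections import Counter

def digit_entropy(digits: str) -> float:
    """Shannon entropy (bits per symbol) of the characters in `digits`."""
    total = len(digits)
    return sum((n / total) * math.log2(total / n) for n in Counter(digits).values())

def immediate_entropies(value: int, width_bits: int = 32) -> dict:
    """Entropy of one immediate written in base 2, 10 and 16 (assumed encoding)."""
    value &= (1 << width_bits) - 1  # treat the immediate as an unsigned value
    return {
        "bin": digit_entropy(format(value, "b")),
        "dec": digit_entropy(format(value, "d")),
        "hex": digit_entropy(format(value, "x")),
    }

def entropy_distribution(immediates, bin_width: float = 0.5, n_bins: int = 8) -> list:
    """Histogram of per-immediate entropy values.

    Assumption: each immediate contributes the minimum of its three entropies,
    because the text expects hand-written values to look simple in some base.
    """
    hist = [0] * n_bins
    for imm in immediates:
        entropy = min(immediate_entropies(imm).values())
        hist[min(int(entropy / bin_width), n_bins - 1)] += 1
    return hist
```

Under these assumptions a value such as 0x1000 lands in the lowest bin, while a value with many distinct digits in every base lands in a higher bin.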
The feature vector extracted in the embodiments of the present application may further include the instruction frequency of the code at the entry point, which helps identify whether an infective virus has modified the mandatory execution flow.
To extract the instruction frequency of the code at the entry point, the entry point is first located, the instructions at the entry point are extracted, and the frequency of occurrence of each instruction is then counted. The method provided by the embodiments of the present application for extracting the instruction frequency of the code at the entry point includes:
First, starting from the entry point, functions are traversed depth-first down to a specified depth until the number of instructions contained in the traversed functions reaches a specified quantity;
Then, the frequency of occurrence of all instructions contained in the traversed functions is counted to obtain the instruction frequency of the code at the entry point.
Fig. 5 is a schematic diagram of the traversal. The instruction code at the entry point is disassembled instruction by instruction. Each circle marks the location of a jump or function call, denoted c1, c2, c3, and so on. The depth-first traversal starts at c1. Whenever a function call is encountered, the depth value is incremented by 1 and the called function is entered. Once the depth value reaches the specified depth (for example, 4), further function calls no longer increment the depth value: only the function name is recorded and the function is not entered. The traversal ends when the instructions in all traversed functions reach the specified quantity, for example 2000. The order in which the points in Fig. 5 are visited is shown by the dotted arrows in Fig. 5, namely c1-c2-c4-c8. Note that if the number of instructions contained in the traversed functions reaches the specified quantity before the specified depth is reached, the traversal can stop without going down to the specified depth. For example, if the functions c1, c2 and c3 together already contain the specified 2000 instructions once c3 has been traversed, the traversal stops and c4 is not traversed.
This traversal yields the specified number of instructions at the entry point. The information obtained for the instructions may include the instruction name, the number of occurrences of the instruction, the ID corresponding to the instruction, and so on. Performing the traversal depth-first also makes it easier to determine later whether an infective virus has modified the mandatory execution flow.
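The traversal described above can be sketched as a depth-limited, budget-limited depth-first walk. This is a simplified illustration, not the patent's implementation: the call graph is supplied as a ready-made dictionary instead of being produced by a disassembler, each function's instructions are collected before its callees are entered, and the `Function` type and `traverse_entry_point` name are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Function:
    name: str
    instructions: list                            # decoded instructions of this function
    callees: list = field(default_factory=list)   # names of called functions, in call order

def traverse_entry_point(functions, entry, max_depth=4, max_instructions=2000):
    """Depth-first walk from `entry`, limited by call depth and instruction budget."""
    visited_order = []   # functions actually entered, in visiting order
    collected = []       # instructions gathered so far
    recorded_only = []   # callees seen at the depth limit: name recorded, not entered
    seen = set()

    def visit(name, depth):
        if len(collected) >= max_instructions or name in seen:
            return
        seen.add(name)
        visited_order.append(name)
        func = functions[name]
        room = max_instructions - len(collected)
        collected.extend(func.instructions[:room])
        for callee in func.callees:
            if len(collected) >= max_instructions:
                return                        # budget reached: stop early
            if depth >= max_depth:
                recorded_only.append(callee)  # depth limit reached: do not enter
            else:
                visit(callee, depth + 1)

    visit(entry, 1)
    return visited_order, collected, recorded_only
```

With a call graph in which c1 calls c2 and c3, c2 calls c4 and c5, and c4 calls c8, a small instruction budget makes the walk stop after c1, c2, c4, c8, which matches the order given for Fig. 5.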
As shown in Fig. 6, the frequency of occurrence of the instructions can be represented by a curve whose abscissa is the ID corresponding to each instruction and whose ordinate is the number of occurrences of the instruction. For example, the three instructions add, adc and mov can be assigned the IDs 1, 2 and 3, respectively.
The above operations extract the instruction frequency of the code at the entry point of an infective virus sample, which is used as a feature vector.
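A minimal sketch of the instruction frequency feature follows. Only the IDs 1, 2 and 3 for add, adc and mov come from the text. The table name, the function name, and the choice to collect unknown mnemonics in slot 0 are assumptions made for illustration.

```python
from collections import Counter

# Mnemonic-to-ID table; only these three assignments are given in the text.
MNEMONIC_IDS = {"add": 1, "adc": 2, "mov": 3}

def instruction_frequency(mnemonics, ids=MNEMONIC_IDS):
    """Return a vector whose index is the instruction ID and whose value is its count.

    Index 0 collects mnemonics that have no ID in the table (an assumption).
    """
    counts = Counter(m.lower() for m in mnemonics)
    vector = [0] * (max(ids.values()) + 1)
    for mnemonic, n in counts.items():
        vector[ids.get(mnemonic, 0)] += n
    return vector

print(instruction_frequency(["mov", "add", "mov", "ADC", "push"]))  # [1, 1, 1, 2]
```

Plotting the index against the value gives a curve of the kind the text describes for Fig. 6.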
In addition, some structural features of a file infected by an infective virus change relative to a file that is not infected. The structural features that change are referred to as structural features susceptible to infection by an infective virus. The embodiments of the present application can therefore obtain these structural features when performing machine learning.
The structural features susceptible to infection by an infective virus described in the embodiments of the present application include at least one of the following:
The section containing the entry point, the number of executable sections, the names of the executable sections, the entropy value of the section containing the entry point, and the position of the entry point within its section.
Regarding the section containing the entry point: in a file that is not infected, the entry point is generally located in the first section, whereas after infection by an infective virus it may be located in the last section or in a gap between sections. The section containing the entry point can therefore serve as one condition for identifying an infective virus.
Regarding the number of executable sections: a file that is not infected generally has a single executable section, whereas infection by an infective virus may increase the number of executable sections so that there is more than one. The number of executable sections can therefore also serve as one condition for identifying an infective virus.
Regarding the names of the executable sections: the executable sections of a file generally take one of a small fixed set of names, and the following four are commonly used: txt, dat, rsrc, loc. If an executable section has a name other than these, the file can be regarded as suspicious and may have been infected by an infective virus. The names of the executable sections can therefore likewise serve as one condition for identifying an infective virus.
Regarding the entropy value of the section containing the entry point: for a file that is not infected, this entropy value generally lies within a relatively small range, for example 2.0 to 3.0. After infection by an infective virus it generally falls outside this range, for example by becoming a larger value. The entropy value of the section containing the entry point can therefore likewise serve as one condition for identifying an infective virus.
Regarding the position of the entry point within its section: infection can be judged from the alignment relationship. The entry point of a file that is not infected is generally located near an aligned position. If the position of the entry point within its section is not near an aligned position, this is likely to have been caused by an infective virus. The position of the entry point within its section therefore serves as one condition for identifying an infective virus.
The above analysis shows that any one of these structural features can be used to identify an infective virus, so any one or more of them can be obtained. It will be appreciated that identifying an infective virus from several structural features is more accurate: the more structural features are obtained, the more accurately the model trained by machine learning detects viruses.
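The five structural features can be gathered roughly as sketched below. The `Section` record is a hypothetical stand-in for a parsed section table, the entropy is assumed to be byte-level Shannon entropy, and the 16-byte alignment is an assumed value, since the text does not define what counts as an aligned position.

```python
import math
from collections import Counter
from dataclasses import dataclass

@dataclass
class Section:                 # hypothetical stand-in for one section table entry
    name: str
    start: int                 # address at which the section begins
    data: bytes
    executable: bool

COMMON_NAMES = {"txt", "dat", "rsrc", "loc"}   # names as listed in the text

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte values, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return sum((n / total) * math.log2(total / n) for n in Counter(data).values())

def structural_features(sections, entry_point, alignment=16):
    """Collect the structural features; -1 marks an entry point in a gap."""
    index = next((i for i, s in enumerate(sections)
                  if s.start <= entry_point < s.start + len(s.data)), -1)
    executable = [s for s in sections if s.executable]
    if index >= 0:
        section = sections[index]
        entropy = byte_entropy(section.data)
        misalignment = (entry_point - section.start) % alignment
    else:
        entropy, misalignment = 0.0, -1
    return {
        "entry_section_index": index,
        "executable_section_count": len(executable),
        "unusual_executable_names": sum(s.name not in COMMON_NAMES for s in executable),
        "entry_section_entropy": entropy,
        "entry_offset_misalignment": misalignment,
    }
```

The resulting dictionary can be flattened into numbers and appended to the other feature vectors before training.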
In addition, an embodiment of the present application can use machine learning to determine a weight value for each structural feature, and then identify an infective virus according to each structural feature and its corresponding weight value.
The above are only a few specific examples of structural features listed by the inventors. Since it is impossible to list all structural features exhaustively here, other structural features that change as a result of infection by an infective virus also fall within the scope of protection of the present application.
Therefore, when extracting the feature vector of an infective virus sample, the embodiments of the present application may extract, in addition to the distribution frequency of the entropy values of the immediates at the entry point, the instruction frequency of the code at the entry point and/or the structural features. The above operations yield the feature vector to be extracted from the infective virus sample.
S420: performing calculation with a preset machine learning classification algorithm to obtain the infective virus identification model.
In this step, the feature vectors extracted from the samples obtained above are input into a machine learning classification algorithm to obtain the infective virus identification model.
The embodiments of the present application place no particular limitation on the classification algorithm used. Any existing classification algorithm may be used, such as a decision tree algorithm or an SVM (Support Vector Machine) algorithm.
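Because the text leaves the classifier open, the following is only a minimal sketch of the training step using the simplest possible decision tree, a depth-1 tree (a decision stump), written in plain Python. The feature layout, the function names and the toy training values are assumptions for illustration. A real system would use a full decision tree or SVM implementation and real samples.

```python
def train_stump(samples, labels):
    """Fit a depth-1 decision tree: pick the (feature, threshold) with the fewest errors."""
    best = None
    for feature in range(len(samples[0])):
        for threshold in sorted({s[feature] for s in samples}):
            for infected_if_above in (True, False):
                errors = sum(
                    ((s[feature] > threshold) == infected_if_above) != bool(y)
                    for s, y in zip(samples, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, feature, threshold, infected_if_above)
    _, feature, threshold, infected_if_above = best
    return {"feature": feature, "threshold": threshold,
            "infected_if_above": infected_if_above}

def predict(model, sample):
    """Return True if the stump classifies `sample` as infected."""
    return (sample[model["feature"]] > model["threshold"]) == model["infected_if_above"]

# Made-up toy vectors: [count of high-entropy immediates, executable section count].
X = [[1, 1], [2, 1], [0, 1], [9, 2], [7, 1], [8, 2]]
y = [0, 0, 0, 1, 1, 1]          # 1 = infected sample, 0 = clean sample
model = train_stump(X, y)
print(predict(model, [6, 1]), predict(model, [1, 1]))  # True False
```

The same `predict` call is what the detection step would apply to the feature vector of a file to be detected.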
The above training method yields the infective virus identification model used for detecting infective viruses.
Steps S300 to S310 of the above infective virus detection method are described in further detail below.
Step S300 extracts the feature vector of the file to be detected. The feature vector includes the distribution frequency of the entropy values of the immediates at the entry point, and further includes the structural features susceptible to infection by an infective virus and/or the instruction frequency of the code at the entry point.
It will be appreciated that the feature vector extracted when detecting an infective virus is the same as the feature vector extracted when training the model that is used. The feature vector of the file to be detected extracted in step S300 is therefore the same as the feature vector extracted for the infective virus identification model above. It includes the distribution frequency of the entropy values of the immediates at the entry point. The method of obtaining this distribution frequency is described in step S410 above and is not repeated here.
The feature vector also includes structural features, which include but are not limited to at least one of the following:
The section containing the entry point, the number of executable sections, the names of the executable sections, the entropy value of the section containing the entry point, and the position of the entry point within its section.
The changes that each of the above structural features undergoes after infection by an infective virus are not repeated here.
The feature vector may also include the instruction frequency of the code at the entry point, which is extracted in the same way as described above for training the infective virus identification model.
Step S310 uses the infective virus identification model to detect, based on the extracted feature vector, whether the file to be detected is an infective virus file, that is, whether the file to be detected has been infected by an infective virus.
The embodiments of the present application extract the feature vector of the file to be detected, the feature vector including the distribution frequency of the entropy values of the immediates at the entry point and further including the structural features susceptible to infection by an infective virus and/or the instruction frequency of the code at the entry point, and use an infective virus identification model obtained by machine learning on the basis of the feature vector to detect, according to the feature vector, whether the file to be detected is an infective virus file. This overcomes the high labor cost of manual analysis and identification and of manually formulating rules. The infective virus detection method based on the infective virus identification model greatly increases detection speed and can effectively detect unknown infective viruses.
Based on the same idea as the above method, the embodiments of the present application also provide a device for detecting infective viruses. Fig. 8 is a schematic structural diagram of an embodiment of the device, which mainly includes:
A unit 80 for extracting the feature vector of the file to be detected, the feature vector including the distribution frequency of the entropy values of the immediates at the entry point, hereinafter referred to as feature vector extraction unit 80;
A unit 81 for detecting, according to the feature vector, whether the file to be detected is an infective virus file by using an infective virus identification model obtained by machine learning on the basis of the feature vector, hereinafter referred to as virus detection unit 81.
The functions of these two units are described in further detail below.
Since an infective virus may modify the mandatory execution flow, the feature vector extraction unit 80 of the embodiments of the present application can extract the distribution frequency of the entropy values of the immediates at the entry point as the feature vector for infective virus detection, in order to identify whether an infective virus has modified the mandatory execution flow.
The immediate described in the embodiments of the present application is defined as follows:
The number given in an instruction that uses the immediate addressing mode is generally called an immediate. An immediate can be 8, 16 or 32 bits wide, and its value follows the opcode. If the immediate is 16 or 32 bits wide, it is stored according to the "high-high, low-low" principle, that is, the high-order byte at the higher address and the low-order byte at the lower address. For example:
MOV AH, 80H
ADD AX, 1234H
MOV ECX, 123456H
MOV B1, 12H
MOV W1, 3456H
ADD D1, 32123456H
Here B1, W1 and D1 are a byte, word and doubleword memory location, respectively. The second operand (the source operand) in each of the above instructions is an immediate.
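The byte order described above (low-order byte at the lower address) can be checked with a short sketch. It packs two of the immediates from the examples above in little-endian order. The variable names are illustrative only.

```python
import struct

# 16-bit immediate 1234H and 32-bit immediate 32123456H from the examples above,
# packed in little-endian order ("<"): low-order byte first, high-order byte last.
imm16 = struct.pack("<H", 0x1234)
imm32 = struct.pack("<I", 0x32123456)

print(imm16.hex(" "))  # 34 12
print(imm32.hex(" "))  # 56 34 12 32
```

This is the order in which the immediate bytes follow the opcode in the instruction encoding.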
In information theory, entropy is a measure of uncertainty. The greater the amount of information, the smaller the uncertainty and the smaller the entropy; the smaller the amount of information, the greater the uncertainty and the greater the entropy. Based on these properties, the entropy value can be calculated to judge the randomness and degree of disorder of an event. The entropy value can also be used to judge the degree of dispersion of an indicator: the greater the dispersion of an indicator, the greater its influence on a comprehensive evaluation.
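The text does not state a formula for entropy. Assuming the standard Shannon definition is intended, a source that emits n distinct symbols with probabilities p_1, ..., p_n has entropy:

```latex
H(X) = -\sum_{i=1}^{n} p_i \log_2 p_i , \qquad 0 \le H(X) \le \log_2 n
```

Under this definition H(X) is 0 when a single symbol always occurs and reaches \log_2 n when all n symbols are equally likely.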
The entropy values of the immediates in a normal file that is not infected are generally small. If the entropy values of the immediates are high, for example if the number of occurrences of high-entropy immediates exceeds a prescribed limit, the file can be considered to have been infected by an infective virus.
To extract the distribution frequency of the entropy values of the immediates at the entry point, the feature vector extraction unit 80 may, as shown in Fig. 10, further include the following subunits:
A subunit 803 for traversing functions depth-first from the entry point down to a specified depth until the number of instructions contained in the traversed functions reaches a specified quantity, hereinafter referred to as entry point instruction extraction subunit 803. Thus, extracting the distribution frequency of the entropy values of the immediates at the entry point also traverses the entry point depth-first to obtain the immediate of each instruction. The method of traversing the functions of the specified depth at the entry point depth-first is as described in the embodiments above and is not repeated here.
A subunit 804 for collecting the immediates of all instructions contained in the traversed functions, hereinafter referred to as immediate statistics subunit 804;
A subunit 805 for calculating the entropy values of the immediates, hereinafter referred to as immediate entropy calculation subunit 805. Because it cannot be determined reliably which number base the value of an immediate was written in, the immediate entropy calculation subunit 805 can calculate the binary entropy, decimal entropy and hexadecimal entropy of the immediate. For an immediate written by hand in non-infected code, the entropy value in at least one of these bases is bound to be relatively small.
A subunit 806 for counting the distribution frequency of the entropy values of the immediates to obtain the distribution frequency of the entropy values of the immediates at the entry point, hereinafter referred to as distribution frequency statistics subunit 806. Fig. 7 compares the distribution frequency of the entropy values of the immediates at the entry point of a file not infected by a virus with that of a file infected by an infective virus. Fig. 7 shows that high entropy values occur more often in the file infected by an infective virus.
In addition, to detect modification of the mandatory execution flow by an infective virus, another embodiment of the feature vector extraction unit 80 may be structured as shown in Fig. 9 and may include the following subunits for extracting the instruction frequency of the code at the entry point:
A subunit 801 for traversing functions depth-first from the entry point down to a specified depth until the number of instructions contained in the traversed functions reaches a specified quantity, hereinafter referred to as entry point instruction extraction subunit 801;
A subunit 802 for counting the frequency of occurrence of all instructions contained in the traversed functions to obtain the instruction frequency of the code at the entry point, hereinafter referred to as entry point instruction frequency statistics subunit 802.
To extract the instructions of the code at the entry point, the entry point instruction extraction subunit 801 first locates the entry point and then extracts the instructions at the entry point. The method by which the entry point instruction extraction subunit 801 provided by the embodiments of the present application extracts the instructions of the code at the entry point includes:
Starting from the entry point, traversing functions depth-first down to a specified depth until the number of instructions contained in all traversed functions reaches a specified quantity.
Fig. 5 is a schematic diagram of the traversal. The instruction code at the entry point is disassembled instruction by instruction. Each circle marks the location of a jump or function call, denoted c1, c2, c3, and so on. The depth-first traversal starts at c1. Whenever a function call is encountered, the depth value is incremented by 1 and the called function is entered. Once the depth value reaches the specified depth (for example, 4), further function calls no longer increment the depth value: only the function name is recorded and the function is not entered. The traversal ends when the instructions in all traversed functions reach the specified quantity, for example 2000. The order in which the points in Fig. 5 are visited should be c1-c2-c4-c8. Note that if the number of instructions contained in the traversed functions reaches the specified quantity before the specified depth is reached, the traversal can stop without going down to the specified depth. For example, if the functions c1, c2 and c3 together already contain the specified 2000 instructions once c3 has been traversed, the traversal stops and c4 is not traversed.
This traversal yields the specified number of instructions at the entry point. The information obtained for the instructions may include the instruction name, the number of occurrences of the instruction, the ID corresponding to the instruction, and so on. Performing the traversal depth-first also makes it easier to determine later whether an infective virus has modified the mandatory execution flow.
The entry point instruction frequency statistics subunit 802 can count the frequency of occurrence of all instructions contained in the functions traversed by the entry point instruction extraction subunit 801 to obtain the instruction frequency of the code at the entry point. As shown in Fig. 6, the frequency of occurrence of the instructions can be represented by a curve whose abscissa is the ID corresponding to each instruction and whose ordinate is the number of occurrences of the instruction. For example, the three instructions add, adc and mov can be assigned the IDs 1, 2 and 3, respectively.
In addition, some structural features of a file infected by an infective virus change relative to a file that is not infected. The structural features that change are referred to as structural features susceptible to infection by an infective virus. The feature vector extraction unit 80 of the embodiments of the present application can therefore also extract the structural features of the file to be detected. The structural features it extracts are the structural features susceptible to infection by an infective virus and include at least one of the following:
The section containing the entry point, the number of executable sections, the names of the executable sections, the entropy value of the section containing the entry point, and the position of the entry point within its section.
Regarding the section containing the entry point: in a file that is not infected, the entry point is generally located in the first section, whereas after infection by an infective virus it may be located in the last section or in a gap between sections. The section containing the entry point can therefore serve as one condition for identifying an infective virus.
Regarding the number of executable sections: a file that is not infected generally has a single executable section, whereas infection by an infective virus may increase the number of executable sections so that there is more than one. The number of executable sections can therefore also serve as one condition for identifying an infective virus.
Regarding the names of the executable sections: the executable sections of a file generally take one of a small fixed set of names, and the following four are commonly used: txt, dat, rsrc, loc. If an executable section has a name other than these, the file can be regarded as suspicious and may have been infected by an infective virus. The names of the executable sections can therefore likewise serve as one condition for identifying an infective virus.
Regarding the entropy value of the section containing the entry point: for a file that is not infected, this entropy value generally lies within a relatively small range, for example 2.0 to 3.0. After infection by an infective virus it generally falls outside this range, for example by becoming a larger value. The entropy value of the section containing the entry point can therefore likewise serve as one condition for identifying an infective virus.
Regarding the position of the entry point within its section: infection can be judged from the alignment relationship. The entry point of a file that is not infected is generally located near an aligned position. If the position of the entry point within its section is not near an aligned position, this is likely to have been caused by an infective virus. The position of the entry point within its section therefore serves as one condition for identifying an infective virus.
The above analysis shows that any one of these structural features can be used to identify an infective virus, so the feature vector extraction unit 80 can obtain any one or more of the above structural features. It will be appreciated that identifying an infective virus from several structural features is more accurate: the more structural features are obtained, the more accurately the model trained by machine learning detects viruses.
In addition, an embodiment of the present application can use machine learning to determine a weight value for each structural feature, and then identify an infective virus according to each structural feature and its corresponding weight value.
The above are only a few specific examples of structural features listed by the inventors. Since it is impossible to list all structural features exhaustively here, other structural features that change as a result of infection by an infective virus also fall within the scope of protection of the present application.
Therefore, the feature vector extraction unit 80 of the embodiments of the present application may extract only the distribution frequency of the entropy values of the immediates at the entry point of the file to be detected, or may additionally extract the instruction frequency of the code at the entry point and/or the structural features.
The virus detection unit 81 uses the infective virus identification model to detect, based on the feature vector extracted by the feature vector extraction unit 80, whether the file to be detected is an infective virus file, that is, whether the file to be detected has been infected by an infective virus.
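The cooperation of the two units can be pictured with a small sketch. The class names, the `Detector` wrapper, and the placeholder extractor and model are all hypothetical. They only illustrate that unit 80 turns a file into a feature vector and unit 81 applies a previously trained model to it.

```python
class FeatureExtractionUnit:
    """Stands in for feature vector extraction unit 80."""
    def __init__(self, extractors):
        self.extractors = extractors      # callables: file bytes -> list of numbers

    def extract(self, file_bytes):
        vector = []
        for extractor in self.extractors:
            vector.extend(extractor(file_bytes))
        return vector

class VirusDetectionUnit:
    """Stands in for virus detection unit 81."""
    def __init__(self, model):
        self.model = model                # callable: feature vector -> truthy if infected

    def detect(self, vector):
        return bool(self.model(vector))

class Detector:
    """Wires unit 80 and unit 81 together."""
    def __init__(self, unit80, unit81):
        self.unit80, self.unit81 = unit80, unit81

    def is_infected(self, file_bytes):
        return self.unit81.detect(self.unit80.extract(file_bytes))

# Placeholder extractor and model, for illustration only.
detector = Detector(
    FeatureExtractionUnit([lambda data: [len(set(data))]]),   # number of distinct byte values
    VirusDetectionUnit(lambda vector: vector[0] > 3),
)
print(detector.is_infected(b"\x00\x00\x00\x00"), detector.is_infected(bytes(range(16))))  # False True
```

In a real device the placeholder callables would be replaced by the entropy, instruction frequency and structural feature extractors and by the trained identification model.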
The training method of the infective virus identification model is the same as described in the method embodiments above and is not repeated here.
The embodiments of the present application extract the feature vector of the file to be detected, the feature vector including the distribution frequency of the entropy values of the immediates at the entry point, and use an infective virus identification model obtained by machine learning on the basis of the feature vector to detect, according to the feature vector, whether the file to be detected is an infective virus file. This overcomes the high labor cost of manual analysis and identification and of manually formulating rules. The infective virus detection method based on the infective virus identification model greatly increases detection speed and can effectively detect unknown infective viruses.
It should be noted that the present invention can be implemented in software and/or in a combination of software and hardware, for example using an application-specific integrated circuit (ASIC), a general-purpose computer or any other similar hardware device. In one embodiment, the software program of the present invention can be executed by a processor to implement the steps or functions described above. Likewise, the software program of the present invention (including related data structures) can be stored in a computer-readable recording medium, for example a RAM memory, a magnetic or optical drive, a floppy disk or a similar device. In addition, some steps or functions of the present invention can be implemented in hardware, for example as a circuit that cooperates with a processor to perform each step or function.
In addition, part of the present invention can be applied as a computer program product, for example computer program instructions which, when executed by a computer, can invoke or provide the method and/or technical solution according to the present invention through the operation of that computer. The program instructions that invoke the method of the present invention may be stored in a fixed or removable recording medium, and/or transmitted by a data stream in a broadcast or other signal-bearing medium, and/or stored in the working memory of a computer device that runs according to the program instructions. Here, one embodiment according to the present invention includes a device that includes a memory for storing computer program instructions and a processor for executing the program instructions, wherein, when the computer program instructions are executed by the processor, the device is triggered to run the methods and/or technical solutions based on the foregoing embodiments of the present invention.
It is obvious to a person skilled in the art that the present invention is not limited to the details of the above exemplary embodiments, and that the present invention can be implemented in other specific forms without departing from its spirit or essential characteristics. The embodiments should therefore be regarded in every respect as exemplary and non-limiting. The scope of the present invention is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and range of equivalency of the claims are therefore intended to be embraced in the present invention. No reference sign in a claim should be construed as limiting the claim concerned. Furthermore, the word "comprising" does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or devices recited in a device claim can also be implemented by a single unit or device through software or hardware. Words such as "first" and "second" are used to denote names and do not denote any particular order.

Claims (14)

1. A method for detecting infective viruses, including:
Extracting the feature vector of a file to be detected, the feature vector including the distribution frequency of the entropy values of the immediates at the entry point;
Using an infective virus identification model obtained by machine learning on the basis of the feature vector to detect, according to the feature vector, whether the file to be detected is an infective virus file.
2. The method according to claim 1, wherein extracting the feature vector of the file to be detected includes:
Starting from the entry point, traversing functions depth-first down to a specified depth until the number of instructions contained in the traversed functions reaches a specified quantity;
Collecting the immediates of all instructions contained in the traversed functions;
Calculating the entropy values of the immediates;
Counting the distribution frequency of the entropy values of the immediates to obtain the distribution frequency of the entropy values of the immediates at the entry point.
3. The method according to claim 2, wherein the entropy value includes:
A binary entropy, a decimal entropy and a hexadecimal entropy.
4. The method according to claim 1, wherein the feature vector further includes:
Structural features susceptible to infection by an infective virus.
5. The method according to claim 3, wherein the structural features include at least one of the following:
The section containing the entry point, the number of executable sections, the names of the executable sections, the entropy value of the section containing the entry point, and the position of the entry point within its section.
6. The method according to claim 1 or 4, wherein the feature vector further includes:
The instruction frequency of the code at the entry point.
7. The method according to claim 6, wherein extracting the feature vector of the file to be detected includes:
Starting from the entry point, traversing functions depth-first down to a specified depth until the number of instructions contained in the traversed functions reaches a specified quantity;
Counting the frequency of occurrence of all instructions contained in the traversed functions to obtain the instruction frequency of the code at the entry point.
8. A device for detecting infective viruses, comprising:
a unit for extracting a feature vector of a file to be detected, the feature vector comprising the distribution frequency of the entropy of the immediate values of the entry point; and
a unit for detecting, according to the feature vector, whether the file to be detected is an infective virus file, by using an infective virus identification model obtained through machine learning based on the feature vector.
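The detection unit can be pictured as a trained model applied to an extracted feature vector. The claim does not name a learning algorithm, so the sketch below substitutes a simple nearest-centroid classifier as a stand-in; the class name, the labels, and the training vectors are synthetic and exist only to make the example runnable.

```python
import math

class CentroidModel:
    """Stand-in identification model: predicts the label of the nearest class centroid."""

    def fit(self, vectors, labels):
        sums, counts = {}, {}
        for vec, label in zip(vectors, labels):
            acc = sums.setdefault(label, [0.0] * len(vec))
            for i, x in enumerate(vec):
                acc[i] += x
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {
            label: [x / counts[label] for x in acc] for label, acc in sums.items()
        }
        return self

    def predict(self, vec):
        return min(self.centroids, key=lambda lb: math.dist(vec, self.centroids[lb]))

def detect(feature_vector, model):
    """Return True when the model classifies the vector as an infective virus file."""
    return model.predict(feature_vector) == "infected"

# Synthetic training vectors standing in for entropy-distribution features.
train_vectors = [[0.6, 0.3, 0.1], [0.7, 0.2, 0.1], [0.1, 0.3, 0.6], [0.2, 0.2, 0.6]]
train_labels = ["clean", "clean", "infected", "infected"]
model = CentroidModel().fit(train_vectors, train_labels)

print(detect([0.1, 0.2, 0.7], model))
print(detect([0.8, 0.1, 0.1], model))
```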
9. The device according to claim 8, wherein the unit for extracting the feature vector of the file to be detected comprises:
a subunit for traversing functions to a specified depth, starting from the entry point and using a depth-first principle, until the number of instructions contained in the traversed functions reaches a specified quantity;
a subunit for collecting the immediate values of all instructions contained in the traversed functions;
a subunit for calculating the entropy of the immediate values; and
a subunit for computing the distribution frequency of the entropy of the immediate values, to obtain the distribution frequency of the entropy of the immediate values of the entry point.
10. The device according to claim 9, wherein the entropy comprises:
binary entropy, decimal entropy, and hexadecimal entropy.
11. The device according to claim 8, wherein the feature vector further comprises:
structural features that are susceptible to infection by infective viruses.
12. The device according to claim 11, wherein the structural features comprise at least one of the following:
the section containing the entry point, the number of executable sections, the names of the executable sections, the entropy of the section containing the entry point, and the position of the entry point within its section.
13. The device according to claim 8 or 11, wherein the feature vector further comprises:
the instruction frequency of the code at the entry point.
14. The device according to claim 13, wherein the unit for extracting the feature vector of the file to be detected comprises:
a subunit for traversing functions to a specified depth, starting from the entry point and using a depth-first principle, until the number of instructions contained in the traversed functions reaches a specified quantity; and
a subunit for counting the frequency of occurrence of all instructions contained in the traversed functions, to obtain the instruction frequency of the code at the entry point.
CN201510038791.1A 2015-01-26 2015-01-26 Method and device used for detecting infective viruses Pending CN105893842A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510038791.1A CN105893842A (en) 2015-01-26 2015-01-26 Method and device used for detecting infective viruses

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510038791.1A CN105893842A (en) 2015-01-26 2015-01-26 Method and device used for detecting infective viruses

Publications (1)

Publication Number Publication Date
CN105893842A true CN105893842A (en) 2016-08-24

Family

ID=56999167

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510038791.1A Pending CN105893842A (en) 2015-01-26 2015-01-26 Method and device used for detecting infective viruses

Country Status (1)

Country Link
CN (1) CN105893842A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100162400A1 (en) * 2008-12-11 2010-06-24 Scansafe Limited Malware detection
CN104077524A (en) * 2013-03-25 2014-10-01 腾讯科技(深圳)有限公司 Training method used for virus identification and virus identification method and device
CN103577755A (en) * 2013-11-01 2014-02-12 浙江工业大学 Malicious script static detection method based on SVM (support vector machine)
CN103685307A (en) * 2013-12-25 2014-03-26 北京奇虎科技有限公司 Method, system, client and server for detecting phishing fraud webpage based on feature library

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
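罗文华 (Luo Wenhua): "Research on malicious program detection methods based on reverse engineering techniques", 《警察技术》 (Police Technology) *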

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109992969A (en) * 2019-03-25 2019-07-09 腾讯科技(深圳)有限公司 Malicious file detection method and device and detection platform
CN109992969B (en) * 2019-03-25 2023-03-21 腾讯科技(深圳)有限公司 Malicious file detection method and device and detection platform

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160824