CN105893843A - Method and device used for detecting infective viruses - Google Patents
- Publication number
- CN105893843A (CN201510038898.6A)
- Authority
- CN
- China
- Prior art keywords
- entrance
- entropy
- type virus
- characteristic vector
- file
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/567—Computer malware detection or handling, e.g. anti-virus arrangements using dedicated hardware
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/562—Static detection
- G06F21/564—Static detection by virus signature recognition
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Theoretical Computer Science (AREA)
- Computer Hardware Design (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Virology (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Investigating Or Analysing Biological Materials (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
The invention provides a method and a device for detecting infective viruses. The method comprises the following steps: extracting a feature vector of a file to be detected, wherein the feature vector comprises the instruction frequency of the code at the entry point; and using an infective virus identification model, obtained by machine learning on such feature vectors, to detect from the feature vector whether the file to be detected is an infective virus file. This overcomes the high labor cost of manual analysis and of manually written heuristic rules. Detection based on the infective virus identification model greatly improves detection speed and can effectively detect unknown infective viruses.
Description
Technical field
The present invention relates to the computer field, and in particular to a method and a device for detecting infective viruses.
Background
Infective viruses are the type of virus with the most variants. In the prior art, infective viruses are mostly detected by manual analysis and matching or by manually written heuristic rules. Because an infective virus can continually change its own code form and execution logic while propagating its viral code, features or rules must be added by hand again and again to raise the detection rate. This requires a large investment of human resources to handle these ever-changing viruses manually. Manual analysis and matching and manually written heuristic rules are therefore costly in labor, their detection speed is hard to guarantee, and they struggle to find unknown infective viruses promptly and accurately.
Summary of the invention
One technical problem solved by the present invention is to provide a method and a device for detecting infective viruses that detect such viruses quickly and accurately while reducing labor cost.
According to an embodiment of one aspect of the present invention, a method for detecting infective viruses is provided, comprising:
extracting a feature vector of a file to be detected, the feature vector including the instruction frequency of the code at the entry point;
using an infective virus identification model, obtained by machine learning on such feature vectors, to detect from the feature vector whether the file to be detected is an infective virus file.
Optionally, extracting the feature vector of the file to be detected comprises:
traversing functions depth-first from the entry point, down to a specified depth, until the number of instructions contained in the traversed functions reaches a specified quantity;
counting the frequency of occurrence of all instructions contained in the traversed functions to obtain the instruction frequency of the code at the entry point.
Optionally, the feature vector further includes:
structural features that are susceptible to infection by infective viruses.
Optionally, the structural features include at least one of the following:
the section containing the entry point, the number of executable sections, the names of the executable sections, the entropy of the section containing the entry point, and the position of the entry point within its section.
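As an illustration only, the five structural features listed above could be collected roughly as follows. The patent does not say how the file is parsed or how section entropy is computed; this sketch assumes a hypothetical, already-parsed section table (the dictionary keys `name`, `start`, `size`, `executable` and `data` are invented for the example) and assumes Shannon entropy over the section bytes.

```python
import math
from collections import Counter


def byte_entropy(data):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return sum(-(n / total) * math.log2(n / total) for n in Counter(data).values())


def structural_features(sections, entry_point):
    """Collect the structural features for one file.

    `sections` is a hypothetical parsed section table: a list of dicts with
    the assumed keys name, start, size, executable and data.
    """
    # Index of the section whose address range contains the entry point.
    entry_index = next(
        (i for i, s in enumerate(sections)
         if s["start"] <= entry_point < s["start"] + s["size"]),
        None,
    )
    executable = [s for s in sections if s["executable"]]
    entry_section = sections[entry_index] if entry_index is not None else None
    return {
        "entry_section_index": entry_index,
        "executable_section_count": len(executable),
        "executable_section_names": [s["name"] for s in executable],
        "entry_section_entropy": byte_entropy(entry_section["data"]) if entry_section else 0.0,
        "entry_offset_in_section": entry_point - entry_section["start"] if entry_section else None,
    }


# Made-up section table: the entry point sits in a high-entropy last section.
sections = [
    {"name": ".text", "start": 0x1000, "size": 0x200, "executable": True, "data": bytes(0x200)},
    {"name": ".data", "start": 0x2000, "size": 0x100, "executable": False, "data": bytes(0x100)},
    {"name": ".extra", "start": 0x3000, "size": 0x100, "executable": True, "data": bytes(range(256))},
]
features = structural_features(sections, entry_point=0x3010)
```

In this made-up table the entry point lies in the third section, which is executable and has the maximum byte entropy, the kind of combination a trained model could learn to treat as suspicious.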
Optionally, the feature vector further includes:
the distribution frequency of the entropy of the immediate values at the entry point.
Optionally, extracting the feature vector of the file to be detected comprises:
traversing functions depth-first from the entry point, down to a specified depth, until the number of instructions contained in the traversed functions reaches a specified quantity;
collecting the immediate values of all instructions contained in the traversed functions;
calculating the entropy of each immediate value;
counting the distribution frequency of those entropy values to obtain the distribution frequency of the entropy of the immediate values at the entry point.
Optionally, the entropy includes:
the binary entropy, decimal entropy and hexadecimal entropy of the immediate value.
According to an embodiment of another aspect of the present invention, a device for detecting infective viruses is provided, comprising:
a unit for extracting a feature vector of a file to be detected, the feature vector including the instruction frequency of the code at the entry point;
a unit for using an infective virus identification model, obtained by machine learning on such feature vectors, to detect from the feature vector whether the file to be detected is an infective virus file.
Optionally, the unit for extracting the feature vector of the file to be detected comprises:
a subunit for traversing functions depth-first from the entry point, down to a specified depth, until the number of instructions contained in the traversed functions reaches a specified quantity;
a subunit for counting the frequency of occurrence of all instructions contained in the traversed functions to obtain the instruction frequency of the code at the entry point.
Optionally, the feature vector further includes:
structural features that are susceptible to infection by infective viruses.
Optionally, the structural features include at least one of the following:
the section containing the entry point, the number of executable sections, the names of the executable sections, the entropy of the section containing the entry point, and the position of the entry point within its section.
Optionally, the feature vector further includes:
the distribution frequency of the entropy of the immediate values at the entry point.
Optionally, the unit for extracting the feature vector of the file to be detected comprises:
a subunit for traversing functions depth-first from the entry point, down to a specified depth, until the number of instructions contained in the traversed functions reaches a specified quantity;
a subunit for collecting the immediate values of all instructions contained in the traversed functions;
a subunit for calculating the entropy of each immediate value;
a subunit for counting the distribution frequency of those entropy values to obtain the distribution frequency of the entropy of the immediate values at the entry point.
Optionally, the entropy includes:
binary entropy, decimal entropy and hexadecimal entropy.
The embodiments of the present application extract a feature vector of the file to be detected, the feature vector including the instruction frequency of the code at the entry point, and use an infective virus identification model, obtained by machine learning on such feature vectors, to detect from the feature vector whether the file to be detected is an infective virus file. This overcomes the high labor cost of manual analysis and of manually written heuristic rules. Detection based on the infective virus identification model greatly improves detection speed and can effectively detect unknown infective viruses.
Those of ordinary skill in the art will understand that, although the following detailed description refers to illustrated embodiments and drawings, the present invention is not limited to these embodiments. Rather, the scope of the present invention is broad and is intended to be limited only by the appended claims.
Brief description of the drawings
Other features, objects and advantages of the present invention will become more apparent from the following detailed description of non-limiting embodiments, made with reference to the drawings:
Fig. 1 is a flow chart of a method for detecting infective viruses according to an embodiment of the present invention.
Fig. 2 is a flow chart of an infective virus identification model training method according to another embodiment of the present invention.
Fig. 3 is a flow chart of a method for detecting infective viruses according to another embodiment of the present invention.
Fig. 4 is a flow chart of an infective virus identification model training method according to another embodiment of the present invention.
Fig. 5 is a schematic diagram of traversing entry-point instructions depth-first according to an embodiment of the present invention.
Fig. 6 is a distribution curve of the instruction frequency of the code at the entry point according to an embodiment of the present invention.
Fig. 7 is a schematic diagram comparing the distribution frequency of the entropy of the entry-point immediate values of an uninfected file with that of a file infected by an infective virus, according to an embodiment of the present invention.
Fig. 8 is a structural diagram of a device for detecting infective viruses according to an embodiment of the present invention.
Fig. 9 is a structural diagram of a feature vector extraction unit according to an embodiment of the present invention.
Fig. 10 is a structural diagram of a feature vector extraction unit according to another embodiment of the present invention.
In the drawings, the same or similar reference numerals denote the same or similar parts.
Detailed description
An infective virus adds itself to other programs or dynamic library files (a kind of DLL), so that it runs together with the infected program and can then damage the infected computer and propagate itself. Because of this characteristic, an infective virus must attach itself to a host program in order to run. To evade antivirus software, it usually splits, deforms or encrypts itself before attaching part or all of itself to the host program. Once a virus file is executed, most program files on the system may have viral code added to them, and the virus then spreads to other computers. Manual identification therefore struggles to identify infective viruses quickly and accurately, and has even more difficulty finding unknown ones. The embodiments of the present application propose a detection method for infective viruses that is based on a trained infective virus identification model.
The present invention is described in further detail below with reference to the drawings.
Fig. 1 is a flow chart of a method for detecting infective viruses according to an embodiment of the present invention. The method is mainly carried out by the operating system or a processing controller in a computer device; the operating system or processing controller is referred to as the device for detecting infective viruses. The computer device includes, but is not limited to, at least one of the following: user equipment and network equipment. User equipment includes, but is not limited to, computers, smartphones and PDAs. Network equipment includes, but is not limited to, a single network server, a server group composed of multiple network servers, or a cloud composed of a large number of computers or network servers based on cloud computing, where cloud computing is a form of distributed computing: a super virtual computer composed of a group of loosely coupled computers.
As shown in Fig. 1, the method for detecting infective viruses mainly comprises the following steps:
S100: extract a feature vector of the file to be detected, the feature vector including the instruction frequency of the code at the entry point;
S110: use an infective virus identification model, obtained by machine learning on such feature vectors, to detect from the feature vector whether the file to be detected is an infective virus file.
First, note that infective virus detection is performed on the basis of an infective virus identification model; that is, such a model must be trained before detection is performed. Because training does not have to be repeated every time detection is performed, however, training the model is not a necessary step of detecting infective viruses. The training method is introduced first. Fig. 2 is a flow chart of the infective virus identification model training method provided by one embodiment of the application; the training method may comprise the following steps:
S200: obtain infective virus samples, that is, files infected by infective viruses.
The embodiments of the present application place no particular limit on how the samples are obtained or on how many are obtained. It will be understood that the more infective virus samples are obtained, the higher the accuracy of the trained model in identifying viruses.
Note also that the training method provided by the embodiments may train on the obtained infective virus samples alone, that is, using only black files. It may also train on infective virus samples and uninfected samples in a 1:1 ratio, that is, using black files and white files at 1:1. Here a black file is a file infected by an infective virus, and a white file is a normal file that has not been infected.
S210: extract the feature vector of each infective virus sample, the feature vector including the instruction frequency of the code at the entry point.
Because an infective virus may modify the mandatory execution path (the code that is necessarily executed), the embodiments of the present application can extract the instruction frequency of the code at the entry point and apply machine learning to it, so as to recognize whether an infective virus has modified that path.
To extract the instruction frequency of the code at the entry point, the entry point is located first, the instructions at the entry point are extracted, and the frequency of occurrence of each instruction is counted. The extraction method provided by the embodiments comprises:
first, traversing functions depth-first from the entry point, down to a specified depth, until the number of instructions contained in the traversed functions reaches a specified quantity;
then, counting the frequency of occurrence of all instructions contained in the traversed functions to obtain the instruction frequency of the code at the entry point.
Fig. 5 shows the traversal schematically. The instruction code at the entry point is disassembled one instruction at a time. Each circle marks the location of a jump or function call, denoted c1, c2, c3, and so on. Traversal starts from c1 in depth-first order: when a call is encountered, the depth value is incremented by 1 and the called function is entered. Once the depth value reaches the specified depth, for example 4, further calls no longer increment the depth value; only the function name is recorded and the function is not entered. Traversal ends when the number of instructions in all traversed functions reaches the specified quantity, for example 2000. The order in which the points in Fig. 5 are visited from the entry point is shown by the dotted arrows, namely c1-c2-c4-c8. Note that if, during traversal, the number of instructions in the traversed functions reaches the specified quantity before the specified depth is reached, traversal can stop without going down to the specified depth. For example, if on reaching c3 the functions c1, c2 and c3 together contain the specified quantity of 2000 instructions, traversal stops and c4 is not traversed.
The traversal above yields the specified number of instructions at the entry point. The information obtained for each instruction may include the instruction name, its number of occurrences, its corresponding ID, and so on. Performing the traversal depth-first also makes it easier to determine later whether an infective virus has modified the mandatory execution path.
As shown in Fig. 6, the instruction frequencies can be plotted as a curve whose horizontal axis is the ID of each instruction and whose vertical axis is the number of occurrences of that instruction. For example, the three instructions add, adc and mov may be assigned the IDs 1, 2 and 3 respectively.
The instruction frequency of the code at the entry point of each infective virus sample is extracted in this way and used as the feature vector.
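The traversal and counting described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the program is modeled as a hypothetical dictionary mapping each function name to a list of `(mnemonic, call_target)` pairs that a disassembler would have produced, the entry function is taken to be at depth 1, and a `visited` set is added to guard against recursion, which the patent does not discuss.

```python
from collections import Counter


def entry_point_instruction_counts(functions, entry, max_depth=4, max_instructions=2000):
    """Depth-limited, depth-first walk from `entry`, counting mnemonics."""
    counts = Counter()
    name_only = []    # callees recorded by name but not entered
    visited = set()   # assumption: avoid re-entering a function (recursion guard)
    total = 0

    def visit(name, depth):
        nonlocal total
        visited.add(name)
        for mnemonic, call_target in functions.get(name, []):
            if total >= max_instructions:
                return  # instruction budget reached: stop the traversal
            counts[mnemonic] += 1
            total += 1
            if call_target is None:
                continue
            if depth < max_depth and call_target not in visited:
                visit(call_target, depth + 1)   # enter the callee, depth + 1
            else:
                name_only.append(call_target)   # depth limit: record name only

    visit(entry, 1)
    return counts, name_only


def to_feature_vector(counts, instruction_ids):
    """Place each count at the position given by its instruction ID (1-based)."""
    vector = [0] * len(instruction_ids)
    for mnemonic, instruction_id in instruction_ids.items():
        vector[instruction_id - 1] = counts[mnemonic]
    return vector


# Made-up call graph: c1 calls c2 and c3, c2 calls c4.
functions = {
    "c1": [("push", None), ("mov", None), ("call", "c2"),
           ("add", None), ("call", "c3"), ("ret", None)],
    "c2": [("mov", None), ("call", "c4"), ("ret", None)],
    "c3": [("xor", None), ("ret", None)],
    "c4": [("mov", None), ("adc", None), ("ret", None)],
}
counts, name_only = entry_point_instruction_counts(functions, "c1", max_depth=2)
vector = to_feature_vector(counts, {"add": 1, "adc": 2, "mov": 3})
```

With `max_depth=2`, the call to c4 is only recorded by name, so the `adc` inside c4 is never counted. The ID mapping follows the add, adc, mov example in the text; a real vocabulary would cover the whole instruction set.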
S220: run a preset machine learning classification algorithm to obtain the infective virus identification model.
In this step, the samples obtained above and the feature vectors extracted from them are fed into a machine learning classification algorithm, which yields the infective virus identification model.
The embodiments of the present application place no particular limit on the classification algorithm used. Any existing classification algorithm may be used, such as a decision tree algorithm or an SVM (Support Vector Machine) algorithm.
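To make the training step concrete, here is a deliberately tiny stand-in for the classifier: a decision stump, that is, a decision tree with a single split, written in plain Python. The patent leaves the algorithm open, so this is only one possible choice; the function names, the label convention (1 for a black file, 0 for a white file) and the toy feature vectors are all invented for the example.

```python
def train_stump(samples, labels):
    """Fit a one-level decision tree: choose the (feature, threshold, side)
    combination with the fewest errors on the training set."""
    best = None
    for feature in range(len(samples[0])):
        for threshold in sorted({s[feature] for s in samples}):
            for above in (0, 1):  # label predicted when value > threshold
                errors = sum(
                    (above if s[feature] > threshold else 1 - above) != y
                    for s, y in zip(samples, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, feature, threshold, above)
    _, feature, threshold, above = best
    return {"feature": feature, "threshold": threshold, "above": above}


def predict(model, vector):
    """Return 1 (infected) or 0 (clean) for one feature vector."""
    if vector[model["feature"]] > model["threshold"]:
        return model["above"]
    return 1 - model["above"]


# Toy instruction-frequency vectors [mov count, xor count]; values are made up.
samples = [[3, 40], [5, 35], [30, 2], [28, 4]]
labels = [1, 1, 0, 0]          # black and white files in a 1:1 ratio
model = train_stump(samples, labels)
verdict = predict(model, [4, 50])
```

A production system would more likely use a full decision tree or an SVM from an established library, but the flow is the same: labeled feature vectors go in, a model comes out, and that model is later applied to the feature vector of the file to be detected.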
The training method above yields the infective virus identification model used to detect infective viruses.
Steps S100 to S110 of the detection method above are described further below.
Step S100 extracts the feature vector of the file to be detected; the feature vector includes the instruction frequency of the code at the entry point.
It will be understood that the feature vector extracted when detecting infective viruses is the same as the feature vector extracted when the model it uses was trained. The feature vector of the file to be detected extracted in step S100 therefore also includes the instruction frequency of the code at the entry point.
The method for extracting the instruction frequency of the code at the entry point is described above in the introduction to training the infective virus identification model and is not repeated here.
Step S110 uses the infective virus identification model, on the basis of the extracted feature vector, to detect whether the file to be detected is an infective virus file, that is, whether it has been infected by an infective virus.
The embodiments of the present application extract a feature vector of the file to be detected, the feature vector including the instruction frequency of the code at the entry point, and use an infective virus identification model, obtained by machine learning on such feature vectors, to detect from the feature vector whether the file to be detected is an infective virus file. This overcomes the high labor cost of manual analysis and of manually written heuristic rules. Detection based on the infective virus identification model greatly improves detection speed and can effectively detect unknown infective viruses.
Fig. 3 shows the method for detecting infective viruses provided by another embodiment of the application, which may comprise the following steps:
S300: extract a feature vector of the file to be detected; the feature vector includes the instruction frequency of the code at the entry point, and further includes structural features that are susceptible to infection by infective viruses and/or the distribution frequency of the entropy of the immediate values at the entry point;
S310: use an infective virus identification model, obtained by machine learning on such feature vectors, to detect from the feature vector whether the file to be detected is an infective virus file.
Likewise, this infective virus detection is performed on the basis of an infective virus identification model; that is, such a model must be trained before detection is performed. Because training does not have to be repeated every time detection is performed, however, training the model is not a necessary step of detecting infective viruses. The training method is introduced first. Fig. 4 is a flow chart of the infective virus identification model training method provided by another embodiment of the application; the training method may comprise the following steps:
S400: obtain infective virus samples, that is, files infected by infective viruses.
The embodiments of the present application place no particular limit on how the samples are obtained or on how many are obtained. It will be understood that the more infective virus samples are obtained, the higher the accuracy of the trained model in identifying viruses. The training method provided by the embodiments may train on the obtained infective virus samples alone, that is, using only black files. It may also train on infective virus samples and uninfected samples in a 1:1 ratio, that is, using black files and white files at 1:1. Here a black file is a file infected by an infective virus, and a white file is a normal file that has not been infected.
S410: extract the feature vector of each infective virus sample; the feature vector includes the instruction frequency of the code at the entry point, and further includes structural features that are susceptible to infection by infective viruses and/or the distribution frequency of the entropy of the immediate values at the entry point.
Because an infective virus may modify the mandatory execution path (the code that is necessarily executed), the embodiments of the present application can extract the instruction frequency of the code at the entry point and apply machine learning to it, so as to recognize whether an infective virus has modified that path.
To extract the instruction frequency of the code at the entry point, the entry point is located first, the instructions at the entry point are extracted, and the frequency of occurrence of each instruction is counted. The extraction method provided by the embodiments comprises:
first, traversing functions depth-first from the entry point, down to a specified depth, until the number of instructions contained in the traversed functions reaches a specified quantity;
then, counting the frequency of occurrence of all instructions contained in the traversed functions to obtain the instruction frequency of the code at the entry point.
Fig. 5 shows the traversal schematically. The instruction code at the entry point is disassembled one instruction at a time. Each circle marks the location of a jump or function call, denoted c1, c2, c3, and so on. Traversal starts from c1 in depth-first order: when a call is encountered, the depth value is incremented by 1 and the called function is entered. Once the depth value reaches the specified depth, for example 4, further calls no longer increment the depth value; only the function name is recorded and the function is not entered. Traversal ends when the number of instructions in all traversed functions reaches the specified quantity, for example 2000. The order in which the points in Fig. 5 are visited from the entry point is shown by the dotted arrows, namely c1-c2-c4-c8. Note that if, during traversal, the number of instructions in the traversed functions reaches the specified quantity before the specified depth is reached, traversal can stop without going down to the specified depth. For example, if on reaching c3 the functions c1, c2 and c3 together contain the specified quantity of 2000 instructions, traversal stops and c4 is not traversed.
The traversal above yields the specified number of instructions at the entry point. The information obtained for each instruction may include the instruction name, its number of occurrences, its corresponding ID, and so on. Performing the traversal depth-first also makes it easier to determine later whether an infective virus has modified the mandatory execution path.
As shown in Fig. 6, the instruction frequencies can be plotted as a curve whose horizontal axis is the ID of each instruction and whose vertical axis is the number of occurrences of that instruction. For example, the three instructions add, adc and mov may be assigned the IDs 1, 2 and 3 respectively.
The instruction frequency of the code at the entry point of each infective virus sample is extracted in this way and used as the feature vector.
The feature vector extracted by the embodiments of the present application may also include the distribution frequency of the entropy of the immediate values at the entry point, on which machine learning is performed to recognize whether an infective virus has modified the mandatory execution path.
An immediate value, as used in the embodiments of the present application, is defined as follows:
The number given in an instruction that uses the immediate addressing mode is called an immediate value. An immediate value can be 8, 16 or 32 bits wide, and it follows the operation code (i.e. the instruction). A 16-bit or 32-bit immediate value is stored according to the "high-high, low-low" principle: the high-order byte at the higher address and the low-order byte at the lower address. For example:
    MOV AH, 80H
    ADD AX, 1234H
    MOV ECX, 123456H
    MOV B1, 12H
    MOV W1, 3456H
    ADD D1, 32123456H
Here B1, W1 and D1 are a byte, a word and a doubleword memory location respectively. The second operand (source operand) of each instruction above is an immediate value.
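The storage order of multi-byte immediate values can be checked with a few lines of Python. This sketch only uses the standard `struct` module with the little-endian format prefix `<`; the values are the ones from the example instructions above, and the variable names are arbitrary.

```python
import struct

# 16-bit immediate from "ADD AX, 1234H": low byte first, high byte second.
imm16 = struct.pack("<H", 0x1234)

# 32-bit immediate from "ADD D1, 32123456H".
imm32 = struct.pack("<I", 0x32123456)

# Reading an immediate back out of the instruction bytes reverses the step.
decoded = struct.unpack("<I", imm32)[0]

print(imm16.hex(" "))   # 34 12
print(imm32.hex(" "))   # 56 34 12 32
print(hex(decoded))     # 0x32123456
```

A feature extractor that collects immediate values from disassembled code has to apply the same byte order when it reads operands directly from the instruction bytes.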
In information theory, entropy is a measure of uncertainty. The more information there is, the smaller the uncertainty and the smaller the entropy; the less information there is, the greater the uncertainty and the greater the entropy. Given this property, entropy can be calculated to judge the randomness and degree of disorder of an event. Entropy can also be used to judge the dispersion of an indicator: the greater the dispersion of an indicator, the greater its influence on a comprehensive evaluation.
Because the entropy of the immediates in a normal, uninfected file is generally small, a file
can be considered likely to have been infected by an infection-type virus if the entropy of its
immediates is high, for example if the number of occurrences of high-entropy immediates exceeds
a prescribed range. Therefore, the embodiments of the present application may also extract the
distribution frequency of the entropy of the immediates at the entry point. The method for
extracting this distribution frequency includes: using the depth-first principle to traverse
functions up to a designated depth starting from the entry point, until the number of
instructions contained in the traversed functions reaches a specified quantity; collecting the
immediates of all instructions contained in the traversed functions; calculating the entropy of
each immediate; and counting the distribution frequency of the entropy of the immediates.
As can be seen, extracting the distribution frequency of the entropy of the immediates at the
entry point also uses the depth-first principle to traverse from the entry point and obtain the
immediates of each instruction, after which the entropy of each immediate is calculated. When
calculating the entropy of an immediate, it is not possible to determine accurately which
numeral system the value was originally written in. The binary entropy, decimal entropy, and
hexadecimal entropy of the immediate may therefore all be calculated. If the immediate was
written by a person and does not come from an infecting virus, its entropy in at least one of
these numeral systems will be relatively small. Fig. 7 is a schematic comparison of the
distribution frequency of the entropy of the entry-point immediates for a file not infected by
a virus and for a file infected by an infection-type virus. The horizontal axis of Fig. 7
represents the entropy of the immediates, and the vertical axis represents the number of times
each entropy value occurs. Fig. 7 shows that high entropy values occur more often in a file
infected by an infection-type virus.
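The three-base calculation and the resulting distribution can be sketched as follows. This is
only an illustration under several assumptions that the application does not spell out: the
entropy of an immediate is taken as the Shannon entropy of its digits in a given base, the
lowest of the three values is the one that is binned, immediates are non-negative integers, and
the bin width of 0.5 is arbitrary. All function names are hypothetical.

```python
import math
from collections import Counter

def digit_entropy(value, base):
    """Shannon entropy (bits per digit) of a non-negative integer's digits in `base`."""
    digits = []
    while True:
        value, digit = divmod(value, base)
        digits.append(digit)
        if value == 0:
            break
    total = len(digits)
    return sum(-(c / total) * math.log2(c / total) for c in Counter(digits).values())

def immediate_entropies(value):
    """Binary, decimal, and hexadecimal digit entropy of one immediate, keyed by base."""
    return {base: digit_entropy(value, base) for base in (2, 10, 16)}

def entropy_distribution(immediates, bin_width=0.5):
    """Count how many immediates fall into each entropy bin (keyed by the bin's lower edge)."""
    histogram = Counter()
    for immediate in immediates:
        lowest = min(immediate_entropies(immediate).values())
        histogram[math.floor(lowest / bin_width) * bin_width] += 1
    return dict(histogram)
```

For a human-written constant such as 1000000, the decimal entropy is the lowest of the three,
which matches the observation that at least one numeral system gives a small value.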
In addition, some structural features of a file infected by an infection-type virus change
relative to an uninfected file. The structural features that change are referred to as
structural features susceptible to infection by an infection-type virus. The embodiments of the
present application may therefore obtain these structural features when performing machine
learning.
The structural features susceptible to infection by an infection-type virus described in the
embodiments of the present application include at least one of the following:
the section containing the entry point, the number of executable sections, the names of the
executable sections, the entropy of the section containing the entry point, and the position of
the entry point within its section.
Regarding the section containing the entry point: in an uninfected file the entry point is
generally located in the first section, whereas after infection by an infection-type virus it
may be located in the last section or in a gap between sections. The section containing the
entry point can therefore serve as one condition for identifying an infection-type virus.
Regarding the number of executable sections: an uninfected file generally has one executable
section, whereas infection by an infection-type virus may increase the number of executable
sections so that it is no longer one. The number of executable sections can therefore also
serve as one condition for identifying an infection-type virus.
Regarding the names of the executable sections: the executable sections of a file generally
take one of a small, fixed set of names, the following four being commonly used: txt, dat,
rsrc, loc. If an executable section has a name other than these, the file can be regarded as
suspicious and may have been infected by an infection-type virus. The names of the executable
sections can therefore likewise serve as one condition for identifying an infection-type virus.
Regarding the entropy of the section containing the entry point: in an uninfected file this
entropy generally falls within a relatively small range, for example 2.0 to 3.0. After
infection by an infection-type virus, the entropy generally falls outside this range, for
example by becoming considerably larger. The entropy of the section containing the entry point
can therefore likewise serve as one identification condition for an infection-type virus.
Regarding the position of the entry point within its section: whether a file has been infected
can be judged from the alignment relationship. The entry point of an uninfected file generally
lies close to an alignment boundary. If the position of the entry point within its section is
not close to an alignment boundary, this is likely to have been caused by an infection-type
virus. The position of the entry point within its section therefore serves as one
identification condition for an infection-type virus.
The above analysis shows that an infection-type virus may be identified from any one of these
structural features. Any one or more of the above structural features may therefore be
obtained. It will be understood that identification based on several structural features is
more accurate: the more structural features are obtained, the higher the accuracy of virus
detection by the model obtained through machine learning.
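The five structural features discussed above can be sketched as a single extraction function.
This is an illustrative sketch only: the Section record is a hypothetical stand-in for whatever
an executable-file parser would provide, section entropy is assumed to be the Shannon entropy
of the section bytes, and the position within the section is reported as a raw byte offset
because the application does not specify how closeness to an alignment boundary is measured.

```python
import math
from collections import Counter
from dataclasses import dataclass

@dataclass
class Section:
    """Hypothetical, simplified view of one section of an executable file."""
    name: str
    start: int        # address of the first byte of the section
    data: bytes
    executable: bool

def byte_entropy(data):
    """Shannon entropy of a byte string, in bits per byte."""
    total = len(data)
    if total == 0:
        return 0.0
    return sum(-(c / total) * math.log2(c / total) for c in Counter(data).values())

def structural_features(sections, entry_point):
    """Collect the five structural features for a file with the given entry point."""
    # Index of the section containing the entry point; None means the entry
    # point lies in a gap between sections.
    index = next((i for i, s in enumerate(sections)
                  if s.start <= entry_point < s.start + len(s.data)), None)
    executable = [s for s in sections if s.executable]
    features = {
        "entry_section_index": index,
        "executable_section_count": len(executable),
        "executable_section_names": [s.name for s in executable],
        "entry_section_entropy": None,
        "entry_offset_in_section": None,
    }
    if index is not None:
        section = sections[index]
        features["entry_section_entropy"] = byte_entropy(section.data)
        features["entry_offset_in_section"] = entry_point - section.start
    return features
```

In practice the non-numeric entries (such as the section names) would have to be encoded as
numbers before being passed to a classifier; that encoding is left open here.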
In addition, an embodiment of the present application may use machine learning to determine a
weight for each structural feature, so that an infection-type virus is identified according to
each structural feature and its corresponding weight.
The above are only a few specific examples of structural features listed by the inventors.
Since it is not possible to enumerate all structural features here, other structural features
that change as a result of infection by an infection-type virus also fall within the scope of
protection of the present application.
The above operations yield the characteristic vectors that need to be extracted from the
infection-type virus samples.
S420: Perform computation using a preset machine learning classification algorithm to obtain
the infection-type virus identification model.
In this step, the characteristic vectors extracted from the samples obtained above are input
into a machine learning classification algorithm, thereby obtaining the infection-type virus
identification model.
The embodiments of the present application place no particular limitation on the classification
algorithm used. Any existing classification algorithm may be used, such as a decision tree
algorithm or an SVM (Support Vector Machine) algorithm.
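To make the training step concrete, the sketch below trains the simplest possible decision
tree, a one-level "decision stump", on a handful of labelled characteristic vectors. It is a
deliberately reduced stand-in for the decision tree or SVM algorithms named above, not the
classifier of the application. The function names and the toy feature values (number of
executable sections, entropy of the entry-point section) are invented for illustration.

```python
def train_stump(samples, labels):
    """Return the (feature index, threshold) with the fewest training errors.

    The resulting model predicts "infected" (1) when the chosen feature
    value is greater than the threshold, and "clean" (0) otherwise.
    """
    best = None
    for feature in range(len(samples[0])):
        for threshold in sorted({sample[feature] for sample in samples}):
            errors = sum((sample[feature] > threshold) != bool(label)
                         for sample, label in zip(samples, labels))
            if best is None or errors < best[0]:
                best = (errors, feature, threshold)
    _, feature, threshold = best
    return feature, threshold

def predict(model, sample):
    """Apply a trained stump to one characteristic vector."""
    feature, threshold = model
    return int(sample[feature] > threshold)

# Toy training set: [executable section count, entry-section entropy].
samples = [[1, 2.4], [1, 2.9], [1, 2.1], [2, 6.8], [1, 7.2], [2, 7.5]]
labels = [0, 0, 0, 1, 1, 1]   # 0 = clean, 1 = infected
model = train_stump(samples, labels)
```

On this toy data the stump settles on the entropy feature, since the section count alone
misclassifies one infected sample.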
The training method above yields the infection-type virus identification model used for
detecting infection-type viruses.
Steps S300 to S310 of the above infection-type virus detection method are described further
below.
Step S300 extracts the characteristic vector of the file to be detected. The characteristic
vector includes the instruction frequency of the code at the entry point, and further includes
the structural features susceptible to infection by an infection-type virus and/or the
distribution frequency of the entropy of the immediates at the entry point.
It will be understood that the characteristic vector extracted when detecting an infection-type
virus is the same as the characteristic vector extracted when training the model that is used.
The characteristic vector of the file to be detected extracted in step S300 is therefore the
same as the one extracted when training the infection-type virus identification model described
above. It includes the instruction frequency of the code at the entry point, which is extracted
in the same way as described above for training the infection-type virus identification model.
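One way to picture the composition of the characteristic vector is a simple concatenation of
numeric parts, applied identically at training time and at detection time. The sketch below is
an assumption about layout only; the function name and the flat-list representation are
hypothetical, and the optional parts are assumed to be already encoded as numbers.

```python
def build_characteristic_vector(instruction_frequency,
                                structural_features=None,
                                immediate_entropy_distribution=None):
    """Concatenate the mandatory instruction frequency with the optional parts.

    At least one of the two optional parts must be supplied, mirroring the
    "and/or" wording of step S300.
    """
    if structural_features is None and immediate_entropy_distribution is None:
        raise ValueError("supply structural features and/or the immediate entropy distribution")
    vector = list(instruction_frequency)
    vector.extend(structural_features or [])
    vector.extend(immediate_entropy_distribution or [])
    return vector
```

Whichever parts are chosen, the same choice and the same ordering would have to be used for
every training sample and for every file to be detected.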
The structural features include, but are not limited to, at least one of the following:
the section containing the entry point, the number of executable sections, the names of the
executable sections, the entropy of the section containing the entry point, and the position of
the entry point within its section.
The changes that each of the above structural features undergoes after infection by an
infection-type virus are not repeated here.
The characteristic vector may also include the distribution frequency of the entropy of the
immediates at the entry point. The method for obtaining it is described in step S410 above and
is not repeated here.
Step S310 uses the infection-type virus identification model to detect, based on the
characteristic vector extracted above, whether the file to be detected is an infection-type
virus file, that is, whether the file to be detected has been infected by an infection-type
virus.
The embodiments of the present application extract the characteristic vector of the file to be
detected, where the characteristic vector includes the instruction frequency of the code at the
entry point and further includes the structural features susceptible to infection by an
infection-type virus and/or the distribution frequency of the entropy of the immediates at the
entry point, and use an infection-type virus identification model obtained by machine learning
on such characteristic vectors to detect, according to the characteristic vector, whether the
file to be detected is an infection-type virus file. This overcomes the high labor cost of
manual analysis and manually written heuristic rules. The detection method based on the
infection-type virus identification model also greatly increases detection speed and can
effectively detect unknown infection-type viruses.
Based on the same idea as the above method, the embodiments of the present application also
provide a device for detecting infection-type viruses. Fig. 8 is a schematic structural diagram
of one embodiment of this device, which mainly includes:
a unit 80 for extracting the characteristic vector of the file to be detected, the
characteristic vector including the instruction frequency of the code at the entry point,
hereinafter referred to as the characteristic vector extraction unit 80; and
a unit 81 for detecting, according to the characteristic vector and using an infection-type
virus identification model obtained by machine learning on such characteristic vectors, whether
the file to be detected is an infection-type virus file, hereinafter referred to as the virus
detection unit 81.
The functions of these two units are described in further detail below.
Because an infection-type virus is likely to modify the must-pass execution flow, the
characteristic vector extraction unit 80 of the embodiments of the present application may
extract the instruction frequency of the code at the entry point as a characteristic vector for
infection-type virus detection, so as to identify whether an infection-type virus has modified
the must-pass execution flow. In one embodiment of the present application, the structure of
the characteristic vector extraction unit 80 is as shown in Fig. 9, and it may further include
the following subunits for extracting the instruction frequency of the code at the entry point:
a subunit 801 for using the depth-first principle to traverse functions up to a designated
depth starting from the entry point, until the number of instructions contained in the
traversed functions reaches a specified quantity, hereinafter referred to as the entry point
instruction extraction subunit 801; and
a subunit 802 for counting the frequency of occurrence of all instructions contained in the
traversed functions to obtain the instruction frequency of the code at the entry point,
hereinafter referred to as the entry point instruction frequency statistics subunit 802.
To extract the instruction frequency of the code at the entry point, the entry point
instruction extraction subunit 801 first needs to locate the entry point and extract the
instructions there. The method by which the entry point instruction extraction subunit 801
provided by the embodiments of the present application extracts the instructions of the code at
the entry point includes:
using the depth-first principle to traverse functions up to a designated depth starting from
the entry point, until the number of instructions contained in all traversed functions reaches
a specified quantity.
Fig. 5 is a schematic diagram of the specific traversal method. The instruction code at the
entry point is decompiled instruction by instruction. Each point represented by a circle is the
location of a jump or function call, denoted c1, c2, c3, and so on. The traversal starts from
c1 using the depth-first principle. When a call to a function is encountered, the depth value
is increased by 1 and the function is entered. Once the depth value reaches the designated
value (i.e., the designated depth), for example a designated depth value of 4, the depth value
is no longer increased when a call is encountered; only the function name is recorded and the
function is not entered. The traversal ends when the number of instructions in all traversed
functions reaches the specified quantity, for example 2000.
Under the depth-first principle, the points in Fig. 5 are traversed from the entry point in the
order c1-c2-c4-c8.
It should be noted that if, during the traversal, the number of instructions contained in the
traversed functions reaches the specified quantity before the designated depth value is
reached, the traversal may stop without going down to the designated depth. For example, if on
reaching c3 the functions c1, c2, and c3 traversed so far contain the specified quantity of
2000 instructions, the traversal stops and c4 is not traversed.
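The traversal described above can be sketched on a toy call graph. The sketch makes several
assumptions that the application leaves open: each function is given as a pre-decoded list of
(mnemonic, call target) pairs rather than raw machine code, the entry function counts as depth
1, and a function that has already been entered is not entered again. The name
collect_instructions and the data layout are hypothetical.

```python
def collect_instructions(functions, entry, max_depth=4, max_instructions=2000):
    """Depth-first walk from `entry`, bounded by call depth and instruction count.

    `functions` maps a function name to a list of (mnemonic, target) pairs,
    where `target` is the callee name for a call and None otherwise.
    """
    collected = []   # mnemonics in the order they were visited
    entered = []     # function names in the order they were entered
    recorded = []    # callees whose names were recorded but not entered

    def visit(name, depth):
        entered.append(name)
        for mnemonic, target in functions[name]:
            if len(collected) >= max_instructions:
                return                      # instruction budget exhausted
            collected.append(mnemonic)
            if mnemonic != "call" or target not in functions:
                continue
            if target in entered:
                continue                    # assumption: never re-enter a function
            if depth < max_depth:
                visit(target, depth + 1)    # depth value increases by 1 on entry
            else:
                recorded.append(target)     # depth limit: record the name only

    visit(entry, 1)
    return collected, entered, recorded
```

With a call graph in which c1 calls c2 and c3, c2 calls c4, and c4 calls c8, a depth limit of 4
enters the functions in the order c1, c2, c4, c8 before returning to c3, and a small
instruction budget cuts the walk short regardless of depth.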
The above traversal yields the specified quantity of instructions at the entry point. The
information obtained for each instruction may include the instruction name, the number of
occurrences of the instruction, the ID corresponding to the instruction, and so on. Performing
the traversal depth-first makes it easier to subsequently determine effectively whether an
infection-type virus has modified the must-pass execution flow.
The entry point instruction frequency statistics subunit 802 counts the frequency of occurrence
of all instructions contained in the functions traversed by the entry point instruction
extraction subunit 801 to obtain the instruction frequency of the code at the entry point. As
shown in Fig. 6, the frequency of occurrence of the instructions can be represented by a curve,
whose horizontal axis represents the ID corresponding to each instruction and whose vertical
axis represents the number of occurrences of the instruction. For example, the IDs of the three
instructions add, adc, and mov may be defined as 1, 2, and 3, respectively.
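The counting step can be sketched with the three example IDs given above. The table
INSTRUCTION_IDS contains only those three instructions; a real table would cover the full
instruction set, and the decision to ignore mnemonics without an ID is an assumption of this
sketch, as is the function name.

```python
from collections import Counter

# Instruction IDs from the example above; a full table is not given in the text.
INSTRUCTION_IDS = {"add": 1, "adc": 2, "mov": 3}

def instruction_frequency(mnemonics, ids=INSTRUCTION_IDS):
    """Return occurrence counts ordered by instruction ID (the x-axis of Fig. 6)."""
    counts = Counter(ids[m] for m in mnemonics if m in ids)
    return [counts.get(instruction_id, 0) for instruction_id in sorted(ids.values())]

print(instruction_frequency(["mov", "add", "mov", "xor", "mov"]))  # [1, 0, 3]
```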
In addition, some structural features of a file infected by an infection-type virus change
relative to an uninfected file. The structural features that change are referred to as
structural features susceptible to infection by an infection-type virus. The characteristic
vector extraction unit 80 of the embodiments of the present application may therefore also
extract structural features of the file to be detected. The structural features it extracts are
those susceptible to infection by an infection-type virus and include at least one of the
following:
the section containing the entry point, the number of executable sections, the names of the
executable sections, the entropy of the section containing the entry point, and the position of
the entry point within its section.
Regarding the section containing the entry point: in an uninfected file the entry point is
generally located in the first section, whereas after infection by an infection-type virus it
may be located in the last section or in a gap between sections. The section containing the
entry point can therefore serve as one condition for identifying an infection-type virus.
Regarding the number of executable sections: an uninfected file generally has one executable
section, whereas infection by an infection-type virus may increase the number of executable
sections so that it is no longer one. The number of executable sections can therefore also
serve as one condition for identifying an infection-type virus.
Regarding the names of the executable sections: the executable sections of a file generally
take one of a small, fixed set of names, the following four being commonly used: txt, dat,
rsrc, loc. If an executable section has a name other than these, the file can be regarded as
suspicious and may have been infected by an infection-type virus. The names of the executable
sections can therefore likewise serve as one condition for identifying an infection-type virus.
Regarding the entropy of the section containing the entry point: in an uninfected file this
entropy generally falls within a relatively small range, for example 2.0 to 3.0. After
infection by an infection-type virus, the entropy generally falls outside this range, for
example by becoming considerably larger. The entropy of the section containing the entry point
can therefore likewise serve as one identification condition for an infection-type virus.
Regarding the position of the entry point within its section: whether a file has been infected
can be judged from the alignment relationship. The entry point of an uninfected file generally
lies close to an alignment boundary. If the position of the entry point within its section is
not close to an alignment boundary, this is likely to have been caused by an infection-type
virus. The position of the entry point within its section therefore serves as one
identification condition for an infection-type virus.
The above analysis shows that an infection-type virus may be identified from any one of these
structural features. The characteristic vector extraction unit 80 may therefore obtain any one
or more of the above structural features. It will be understood that identification based on
several structural features is more accurate: the more structural features are obtained, the
higher the accuracy of virus detection by the model obtained through machine learning.
In addition, an embodiment of the present application may use machine learning to determine a
weight for each structural feature, so that an infection-type virus is identified according to
each structural feature and its corresponding weight.
The above are only a few specific examples of structural features listed by the inventors.
Since it is not possible to enumerate all structural features here, other structural features
that change as a result of infection by an infection-type virus also fall within the scope of
protection of the present application.
To detect modification of the must-pass execution flow by an infection-type virus, another
embodiment of the present application extracts the distribution frequency of the entropy of the
immediates at the entry point. The immediate described in the embodiments of the present
application is defined as follows:
In general, the number supplied in an instruction that uses the immediate addressing mode is
called an immediate. An immediate may be 8, 16, or 32 bits wide, and its value follows the
operation code (i.e., the instruction). If the immediate is 16 or 32 bits wide, it is stored
according to the "high-high, low-low" principle, that is, the high-order byte at the higher
address and the low-order byte at the lower address. For example:
MOV AH, 80H     ADD AX, 1234H     MOV ECX, 123456H
MOV B1, 12H     MOV W1, 3456H     ADD D1, 32123456H
where B1, W1, and D1 are byte, word, and doubleword memory locations, respectively. The second
operand (the source operand) in each of the above instructions is an immediate.
In information theory, entropy is a measure of uncertainty. The larger the amount of
information, the smaller the uncertainty and the smaller the entropy; the smaller the amount of
information, the larger the uncertainty and the larger the entropy. Based on this property, the
randomness and degree of disorder of an event can be judged by calculating its entropy. Entropy
can also be used to judge the degree of dispersion of an indicator: the greater the dispersion
of an indicator, the greater its influence on the overall evaluation.
Because the entropy of the immediates in a normal, uninfected file is generally small, a file
can be considered likely to have been infected by an infection-type virus if the entropy of its
immediates is high, for example if the number of occurrences of high-entropy immediates exceeds
a prescribed range.
To extract the distribution frequency of the entropy of the immediates at the entry point, the
characteristic vector extraction unit 80 may, as shown in Fig. 10, further include the
following subunits:
a subunit 803 for using the depth-first principle to traverse functions up to a designated
depth starting from the entry point, until the number of instructions contained in the
traversed functions reaches a specified quantity, hereinafter referred to as the entry point
instruction extraction subunit 803. As can be seen, extracting the distribution frequency of
the entropy of the immediates at the entry point also uses the depth-first principle to
traverse from the entry point and obtain the immediates of each instruction;
a subunit 804 for collecting the immediates of all instructions contained in the traversed
functions, hereinafter referred to as the immediate statistics subunit 804;
a subunit 805 for calculating the entropy of the immediates, hereinafter referred to as the
immediate entropy calculation subunit 805. Because it is not possible to determine accurately
which numeral system the value of an immediate was originally written in, the immediate entropy
calculation subunit 805 may calculate the binary entropy, decimal entropy, and hexadecimal
entropy of the immediate. If the immediate was written by a person and does not come from an
infecting virus, its entropy in at least one of these numeral systems will be relatively small;
and
a subunit 806 for counting the distribution frequency of the entropy of the immediates to
obtain the distribution frequency of the entropy of the immediates at the entry point,
hereinafter referred to as the distribution frequency statistics subunit 806. Fig. 7 is a
schematic comparison of the distribution frequency of the entropy of the entry-point immediates
for a file not infected by a virus and for a file infected by an infection-type virus. Fig. 7
shows that high entropy values occur more often in a file infected by an infection-type virus.
The virus detection unit 81 uses the infection-type virus identification model to detect, based
on the characteristic vector extracted by the characteristic vector extraction unit 80, whether
the file to be detected is an infection-type virus file, that is, whether the file to be
detected has been infected by an infection-type virus.
The training method of the infection-type virus identification model is the same as described
above in the method embodiment and is not repeated here.
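The division of work between units 80 and 81 can be pictured with two small classes. This is a
structural sketch only: the class names, the representation of an extractor as a function from
file bytes to a list of numbers, and the representation of the model as a callable are all
assumptions, not details taken from the application.

```python
class CharacteristicVectorExtractionUnit:
    """Unit 80: runs each configured extractor and concatenates the results."""

    def __init__(self, extractors):
        self.extractors = list(extractors)

    def extract(self, file_bytes):
        vector = []
        for extractor in self.extractors:
            vector.extend(extractor(file_bytes))
        return vector


class VirusDetectionUnit:
    """Unit 81: applies a trained identification model to a characteristic vector."""

    def __init__(self, model):
        self.model = model   # any callable mapping a vector to a truthy/falsy verdict

    def is_infected(self, vector):
        return bool(self.model(vector))


def detect(file_bytes, extraction_unit, detection_unit):
    """Extract the characteristic vector, then let the model decide."""
    return detection_unit.is_infected(extraction_unit.extract(file_bytes))
```

Keeping extraction and detection separate means the same extraction unit can be reused to
build the training set for the model that the detection unit later applies.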
The embodiments of the present application extract the characteristic vector of the file to be
detected, where the characteristic vector includes the instruction frequency of the code at the
entry point, and use an infection-type virus identification model obtained by machine learning
on such characteristic vectors to detect, according to the characteristic vector, whether the
file to be detected is an infection-type virus file. This overcomes the high labor cost of
manual analysis and manually written heuristic rules. The detection method based on the
infection-type virus identification model also greatly increases detection speed and can
effectively detect unknown infection-type viruses.
It should be noted that the present invention may be implemented in software and/or in a
combination of software and hardware, for example using an application-specific integrated
circuit (ASIC), a general-purpose computer, or any other similar hardware device. In one
embodiment, the software program of the present invention may be executed by a processor to
implement the steps or functions described above. Likewise, the software program of the present
invention (including associated data structures) may be stored in a computer-readable recording
medium, for example a RAM memory, a magnetic or optical drive, a floppy disk, or a similar
device. In addition, some steps or functions of the present invention may be implemented in
hardware, for example as circuitry that cooperates with a processor to perform each step or
function.
In addition, part of the present invention may be applied as a computer program product, for
example computer program instructions which, when executed by a computer, can invoke or provide
the method and/or technical solution according to the present invention through the operation
of that computer. The program instructions that invoke the method of the present invention may
be stored in a fixed or removable recording medium, and/or transmitted via a data stream in a
broadcast or other signal-bearing medium, and/or stored in the working memory of a computer
device that runs according to the program instructions. Here, one embodiment of the present
invention includes a device comprising a memory for storing computer program instructions and a
processor for executing the program instructions, wherein, when the computer program
instructions are executed by the processor, the device is triggered to run the methods and/or
technical solutions based on the foregoing embodiments of the present invention.
It will be apparent to those skilled in the art that the invention is not limited to the
details of the above exemplary embodiments, and that the invention can be implemented in other
specific forms without departing from its spirit or essential characteristics. The embodiments
should therefore be regarded in all respects as exemplary and non-restrictive. The scope of the
invention is defined by the appended claims rather than by the above description, and all
changes that fall within the meaning and range of equivalency of the claims are therefore
intended to be embraced by the invention. No reference sign in a claim should be construed as
limiting the claim concerned. Furthermore, the word "comprising" does not exclude other units
or steps, and the singular does not exclude the plural. Multiple units or devices recited in a
system claim may also be implemented by a single unit or device through software or hardware.
Words such as "first" and "second" are used to denote names and do not denote any particular
order.
Claims (14)
1. A method for detecting an infection-type virus, wherein the method includes:
extracting a characteristic vector of a file to be detected, the characteristic vector
including the instruction frequency of the code at the entry point; and
detecting, according to the characteristic vector and using an infection-type virus
identification model obtained by machine learning based on the characteristic vector, whether
the file to be detected is an infection-type virus file.
2. The method according to claim 1, wherein extracting the characteristic vector of the file to
be detected includes:
using the depth-first principle to traverse functions up to a designated depth starting from
the entry point, until the number of instructions contained in the traversed functions reaches
a specified quantity; and
counting the frequency of occurrence of all instructions contained in the traversed functions
to obtain the instruction frequency of the code at the entry point.
3. The method according to claim 1, wherein the characteristic vector further includes:
structural features susceptible to infection by an infection-type virus.
4. The method according to claim 3, wherein the structural features include at least one of the
following:
the section containing the entry point, the number of executable sections, the names of the
executable sections, the entropy of the section containing the entry point, and the position of
the entry point within its section.
5. The method according to claim 1 or 3, wherein the characteristic vector further includes:
the distribution frequency of the entropy of the immediates at the entry point.
6. The method according to claim 5, wherein extracting the characteristic vector of the file to
be detected includes:
using the depth-first principle to traverse functions up to a designated depth starting from
the entry point, until the number of instructions contained in the traversed functions reaches
a specified quantity;
collecting the immediates of all instructions contained in the traversed functions;
calculating the entropy of the immediates; and
counting the distribution frequency of the entropy of the immediates to obtain the distribution
frequency of the entropy of the immediates at the entry point.
7. The method according to claim 6, wherein the entropy includes:
binary entropy, decimal entropy, and hexadecimal entropy.
8. A device for detecting an infection-type virus, wherein the device includes:
a unit for extracting a characteristic vector of a file to be detected, the characteristic
vector including the instruction frequency of the code at the entry point; and
a unit for detecting, according to the characteristic vector and using an infection-type virus
identification model obtained by machine learning based on the characteristic vector, whether
the file to be detected is an infection-type virus file.
9. The device according to claim 8, wherein the unit for extracting the characteristic vector
of the file to be detected includes:
a subunit for using the depth-first principle to traverse functions up to a designated depth
starting from the entry point, until the number of instructions contained in the traversed
functions reaches a specified quantity; and
a subunit for counting the frequency of occurrence of all instructions contained in the
traversed functions to obtain the instruction frequency of the code at the entry point.
10. The device according to claim 8, wherein the characteristic vector further includes:
structural features susceptible to infection by an infection-type virus.
11. The device according to claim 10, wherein the structural features include at least one of
the following:
the section containing the entry point, the number of executable sections, the names of the
executable sections, the entropy of the section containing the entry point, and the position of
the entry point within its section.
12. The device according to claim 8 or 10, wherein the characteristic vector further includes:
the distribution frequency of the entropy of the immediates at the entry point.
13. The device according to claim 12, wherein the unit for extracting the characteristic vector
of the file to be detected includes:
a subunit for using the depth-first principle to traverse functions up to a designated depth
starting from the entry point, until the number of instructions contained in the traversed
functions reaches a specified quantity;
a subunit for collecting the immediates of all instructions contained in the traversed
functions;
a subunit for calculating the entropy of the immediates; and
a subunit for counting the distribution frequency of the entropy of the immediates to obtain
the distribution frequency of the entropy of the immediates at the entry point.
14. The device according to claim 13, wherein the entropy includes:
binary entropy, decimal entropy, and hexadecimal entropy.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510038898.6A CN105893843A (en) | 2015-01-26 | 2015-01-26 | Method and device used for detecting infective viruses |
BR102015028498A BR102015028498A2 (en) | 2015-01-26 | 2015-11-12 | method for detecting an infectious virus, apparatus for detecting an infectious virus, computer readable storage media, computer program product, and computer device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510038898.6A CN105893843A (en) | 2015-01-26 | 2015-01-26 | Method and device used for detecting infective viruses |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105893843A (en) | 2016-08-24 |
Family
ID=56544772
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510038898.6A Pending CN105893843A (en) | 2015-01-26 | 2015-01-26 | Method and device used for detecting infective viruses |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN105893843A (en) |
BR (1) | BR102015028498A2 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100162400A1 (en) * | 2008-12-11 | 2010-06-24 | Scansafe Limited | Malware detection |
CN103577755A (en) * | 2013-11-01 | 2014-02-12 | 浙江工业大学 | Malicious script static detection method based on SVM (support vector machine) |
CN103685307A (en) * | 2013-12-25 | 2014-03-26 | 北京奇虎科技有限公司 | Method, system, client and server for detecting phishing fraud webpage based on feature library |
CN104077524A (en) * | 2013-03-25 | 2014-10-01 | 腾讯科技(深圳)有限公司 | Training method used for virus identification and virus identification method and device |
2015
- 2015-01-26: CN application CN201510038898.6A, published as CN105893843A (en), status: Pending
- 2015-11-12: BR application BR102015028498A, published as BR102015028498A2 (en), status: Application Discontinuation
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100162400A1 (en) * | 2008-12-11 | 2010-06-24 | Scansafe Limited | Malware detection |
CN104077524A (en) * | 2013-03-25 | 2014-10-01 | 腾讯科技(深圳)有限公司 | Training method used for virus identification and virus identification method and device |
CN103577755A (en) * | 2013-11-01 | 2014-02-12 | 浙江工业大学 | Malicious script static detection method based on SVM (support vector machine) |
CN103685307A (en) * | 2013-12-25 | 2014-03-26 | 北京奇虎科技有限公司 | Method, system, client and server for detecting phishing fraud webpage based on feature library |
Non-Patent Citations (1)
Title |
---|
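Luo Wenhua (罗文华): "Research on Malicious Program Detection Methods Based on Reverse Engineering Techniques" (基于逆向技术的恶意程序检测方法研究), Police Technology (《警察技术》) * |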
Also Published As
Publication number | Publication date |
---|---|
BR102015028498A2 (en) | 2016-08-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
AU2019203208B2 (en) | Duplicate and similar bug report detection and retrieval using neural networks | |
WO2021096649A1 (en) | Detecting unknown malicious content in computer systems | |
US9389852B2 (en) | Technique for plagiarism detection in program source code files based on design pattern | |
KR101228899B1 (en) | Method and Apparatus for categorizing and analyzing Malicious Code Using Vector Calculation | |
US20160132718A1 (en) | Face recognition using gradient based feature analysis | |
CN103761476A (en) | Characteristic extraction method and device | |
CN103886229B (en) | Method and device for extracting PE file features | |
KR102013582B1 (en) | Apparatus and method for detecting error and determining corresponding position in source code of mixed mode application program source code thereof | |
CN107292168A (en) | Detect method and device, the server of program code | |
EP4113463A1 (en) | Methods, systems, articles of manufacture and apparatus to decode receipts based on neural graph architecture | |
KR101963756B1 (en) | Apparatus and method for learning software vulnerability prediction model, apparatus and method for analyzing software vulnerability | |
CN104680065A (en) | Virus detection method, virus detection device and virus detection equipment | |
Zhu et al. | Determining image base of firmware files for ARM devices | |
US20220318383A1 (en) | Methods and apparatus for malware classification through convolutional neural networks using raw bytes | |
CN105631336B (en) | Detect the system and method for the malicious file in mobile device | |
CN111651768A (en) | Method and device for identifying link library function name of computer binary program | |
CN104504334A (en) | System and method used for evaluating selectivity of classification rules | |
JP6821751B2 (en) | Methods, systems, and computer programs for correcting mistyping of virtual keyboards | |
CN104077527A (en) | Method and device for generating virus detection machine and method and device for virus detection | |
Kalysch et al. | Tackling androids native library malware with robust, efficient and accurate similarity measures | |
KR102053869B1 (en) | Method and apparatus for detecting malignant code of linux environment | |
Liu et al. | Exploring sensor usage behaviors of android applications based on data flow analysis | |
CN104657662B (en) | Method and device for detecting infection type virus | |
KR102299525B1 (en) | Product Evolution Mining Method And Apparatus Thereof | |
CN105893842A (en) | Method and device used for detecting infective viruses |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication | | Application publication date: 20160824 |