CN106127044A - Method and device for detecting the degree of maliciousness of a function - Google Patents
Method and device for detecting the degree of maliciousness of a function
- Publication number
- CN106127044A (application number CN201610446745.XA)
- Authority
- CN
- China
- Prior art keywords
- function
- file
- data
- malicious
- clean
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/562—Static detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/568—Computer malware detection or handling, e.g. anti-virus arrangements eliminating virus, restoring damaged files
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Virology (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Debugging And Monitoring (AREA)
Abstract
The present invention belongs to the field of information security and relates to a method for detecting the degree of maliciousness of a function. The method comprises: a file acquisition step of collecting sample files, the sample files comprising known clean files and known malicious files; a file decompilation step of decompiling each clean file and each malicious file to obtain function data described in a low-level language; a data screening step of screening the function data and selecting the function data that corresponds to user-written content in the sample files; a data cleaning step of cleaning each function that corresponds to the user-written content by removing its unstable bytes, yielding cleaned function data; and a statistics step of counting, from the cleaned function data, the number of times each function occurs in malicious files and in clean files, and deriving a maliciousness value for each function from those counts. This technical solution enables automated judgment of the maliciousness of a given function. The granularity of file detection is thereby reduced to the function level, which helps improve the detection of malicious files.
Description
Technical field
This patent belongs to the field of information security and relates in particular to a method and a device for detecting the degree of maliciousness of a function.
Background art
In the field of information security, malicious-file detection is a critical task. Malicious files come in many types, including but not limited to PE files on Windows, ELF files on Linux, executable files on Mac systems and APK files on Android; client-side script files such as JavaScript, VBScript and shell scripts; and server-side script files such as PHP, Python and ASP files. To ensure information security, it is necessary to determine whether a given file is malicious and to handle it accordingly.
In the prior art, the methods for judging malicious files are largely the same across file types. A file is generally judged in one of two ways. The first is manual judgment, in which a security analyst examines the file based on experience. The second is automated judgment, in which a computer program encodes that human experience so that a machine can identify malicious files automatically. Automated judgment essentially infers the attributes of an unknown file from its associations with known files. Such associations include similarity comparison of file content, difference comparison of file content, whether the files come from the same source, whether the files carry the same signature information, and so on. The most important of these is similarity comparison of file content because, in most cases, only the content of a file can be obtained, without any peripheral information associated with it.
As the variety of malicious files and the techniques behind them continue to grow, the means of detecting malicious files need to be continually enriched in order to strengthen information security.
Summary of the invention
This patent is proposed in view of the above need in the prior art. The technical problem it addresses is to provide a method and a device for detecting the degree of maliciousness of a function, so that the maliciousness of a given function can be determined.
To solve the above problem, the technical solution of the present invention is as follows:
A method for detecting the degree of maliciousness of a function, the method comprising: a file acquisition step of collecting sample files, the sample files comprising known clean files and known malicious files; a file decompilation step of decompiling each clean file and each malicious file to obtain function data described in a low-level language; a data screening step of screening the function data and selecting the function data that corresponds to user-written content in the sample files; a data cleaning step of cleaning each function that corresponds to the user-written content by removing its unstable bytes, yielding cleaned function data; and a statistics step of counting, from the cleaned function data, the number of times each function occurs in malicious files and in clean files, and deriving a maliciousness value for each function from those counts.
Preferably, the method further comprises a function content digest calculation step: for each function, taking a code segment of predetermined length from the cleaned function data and calculating the function's content digest value from that segment.
Preferably, counting, from the cleaned function data, the number of times each function occurs in malicious files and clean files comprises: counting the number of times each function occurs in malicious files and clean files according to the function content digest value.
Preferably, the function data described in the low-level language is opcodes or bytecode.
Preferably, removing the unstable bytes comprises: assigning a predetermined value to the unstable bytes, or deleting the unstable bytes entirely.
According to another aspect of the present invention, a device for detecting the degree of maliciousness of a function is also provided, the device comprising: a file acquisition module that collects sample files, the sample files comprising known clean files and known malicious files; a file decompilation module that decompiles each clean file and each malicious file to obtain function data described in a low-level language; a data screening module that screens the function data and selects the function data that corresponds to user-written content in the sample files; a data cleaning module that cleans each function that corresponds to the user-written content by removing its unstable bytes, yielding cleaned function data; and a statistics module that counts, from the cleaned function data, the number of times each function occurs in malicious files and in clean files, and derives a maliciousness value for each function from those counts.
Preferably, the device further comprises a function content digest calculation module that, for each function, takes a code segment of predetermined length from the cleaned function data and calculates the function's content digest value from that segment.
Preferably, counting, from the cleaned function data, the number of times each function occurs in malicious files and clean files comprises: counting the number of times each function occurs in malicious files and clean files according to the function content digest value.
Preferably, the function data described in the low-level language is opcodes or bytecode.
Preferably, removing the unstable bytes comprises: assigning a predetermined value to the unstable bytes, or deleting the unstable bytes entirely.
Through the above technical solution, this patent enables automated judgment of the maliciousness of a given function. The granularity of file detection is thereby reduced to the function level, which helps improve the detection of malicious files.
Brief description of the drawings
Fig. 1 is a flowchart of a method for detecting the degree of maliciousness of a function, provided in a specific embodiment of this patent;
Fig. 2 is a structural diagram of a device for detecting the degree of maliciousness of a function, provided in a specific embodiment of this patent.
Detailed description of the invention
Specific embodiments of this patent are described below with reference to the drawings. Note that these embodiments are only examples of preferred technical solutions and should not be construed as limiting the scope of protection of this patent.
Embodiment one
Embodiment one provides a method for detecting the degree of maliciousness of a function. The method determines how malicious, or how clean, an individual function in a computer file is.
In embodiment one, a malicious file is a file that can run on a computer system or other intelligent system and perform malicious operations. The computer system is not limited to a personal computer or a server; it also includes other systems that operate by means of a computer. Other intelligent systems include, but are not limited to, mobile phone operating systems, wearable-device operating systems and intelligent-robot operating systems.
Fig. 1 shows the flow of the detection method in this embodiment. The method comprises the following steps:
Step 101: collect known clean files and malicious files.
In step 101, a large number of clean files and malicious files can be collected. The collection may be performed once, but preferably the method is run continuously so that known clean files and malicious files of all kinds keep being collected. The clean and malicious samples are files whose status has already been confirmed, that is, files that can be accurately judged safe or malicious from existing information. A malicious file in this embodiment is any software that can cause harm, including but not limited to viruses, worms, Trojan horses, malicious spyware, unauthorized adware and ransomware. A clean file is the opposite of a malicious file: software that does not harm system security or information security. For example, clean samples may be files signed by a company with a trusted security reputation, such as files signed by Microsoft, or files confirmed safe through other channels. Malicious samples may be any confirmed malicious files, including but not limited to samples vetted by antivirus vendors. The number of samples collected in this step may be large, even enormous; for example, as many clean and malicious files as can be obtained. The more samples are collected, the more accurate the statistics-based analysis becomes.
Step 102: decompile each clean file and each malicious file to obtain function data described in a low-level language (for example, assembly language).
In step 102 the collected files are decompiled; this covers both the clean files and the malicious files. The decompilation result of each file is stored separately. Decompilation can be carried out with an existing decompilation tool, for example the disassembly tool IDA, or with any other decompilation method in the prior art; each clean file and each malicious file is decompiled to obtain its own function data described in a low-level language. Here the low-level language is represented as a sequence of hexadecimal characters. For example, if the sample is an EXE file, it is decompiled into a function data package described in assembly language; if the sample is an APK file, it is decompiled into a function data package described in smali. Both the assembly output and the smali output are represented as sequences of hexadecimal bytes. Function data described in assembly language is also called opcodes; function data described in other low-level languages is usually called bytecode.
All clean and malicious files are decompiled into this low-level representation because the great majority of computer files can be decompiled into it, and because it reflects a file's actual content more faithfully. This widens the range of files that can be analysed and improves accuracy.
Step 103: screen the function data to obtain the function data that corresponds to user-written content in the sample software.
The screening in step 103 operates on the functions obtained in the low-level language. It may include removing assembly-level library functions and functions generated automatically by the compiler.
Malicious instructions are usually introduced by a specific user, whereas library functions and compiler-generated functions generally carry no malicious instructions. In other words, the characteristic malicious instructions in a file normally come from code the user wrote. Retaining only the functions that correspond to user-written content is therefore enough to keep the features relevant to whether the file is malicious, and removing the other functions avoids introducing noise into the judgment.
Distinguishing user-written content from library functions and compiler-generated functions can be done with existing techniques: library functions and compiler-generated functions are generally well documented in the prior art, so they can be removed directly. For example, when decompiling with IDA, the API that IDA provides can be called to perform this screening; the screening can also be implemented in another decompilation library by following similar rules.
After the library functions and compiler-generated functions have been screened out, only the functions that correspond to user-written content remain. Any other functions in the decompiled data that are unrelated to user-written content may of course be removed as well.
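The screening step can be sketched as a simple filter. This is a minimal illustration under stated assumptions, not the patent's implementation: the `FunctionRecord` type, its `is_library` and `is_compiler_generated` flags, and the sample function names are all hypothetical and stand in for whatever labels the decompilation tool supplies.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FunctionRecord:
    """Hypothetical record for one decompiled function."""
    name: str
    code: bytes                          # raw opcode or bytecode bytes
    is_library: bool = False             # assumed flag: known library function
    is_compiler_generated: bool = False  # assumed flag: emitted by the compiler


def screen_functions(functions: Iterable[FunctionRecord]) -> List[FunctionRecord]:
    """Keep only functions presumed to come from user-written content."""
    return [
        f for f in functions
        if not f.is_library and not f.is_compiler_generated
    ]


# Illustrative input; names and bytes are made up for the example.
sample = [
    FunctionRecord("memcpy", bytes.fromhex("55 8B EC"), is_library=True),
    FunctionRecord("__security_check_cookie", bytes.fromhex("3B 0D"),
                   is_compiler_generated=True),
    FunctionRecord("sub_401828", bytes.fromhex("55 8B EC 83 EC 20")),
]
user_functions = screen_functions(sample)
```

In practice the two flags would be filled in from the decompiler's own function metadata; the filter itself stays the same.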
Step 104: clean each function that corresponds to user-written content by removing its unstable bytes, yielding cleaned function data.
As described above, the data of each function after decompilation is a sequence of hexadecimal bytes. Some bytes in this sequence may differ from one compilation to the next. For example, some bytes hold the offset address of a string; because the string's location may change with each compilation, the offset address changes and so do the corresponding bytes. Bytes that are prone to change in this way are called unstable bytes, or variable bytes. In assembly-language decompilation, unstable bytes occur in instruction types including, but not limited to, a string reference (68 xx xx xx xx), an API function call (FF 15 xx xx xx xx) and an internal function call (E8 xx xx xx xx), where xx xx xx xx denotes the variable bytes.
Thus, if a function references a string or another resource, its opcodes after decompilation contain a relative address that may change after recompilation, which in turn changes the content of the function. The purpose of cleaning unstable bytes is to remove the effect of these variable bytes. The cleaning may reset the bytes to a predetermined value, including but not limited to 0, or remove them entirely, among other methods.
For example, a function may yield the following opcodes after decompilation:
text:00401828 55
text:00401829 8B EC
text:0040182B 83 EC 20
text:0040182E 6A 64
text:00401830 68 80 E1 40 00
text:00401835 6A 67F
text:00401837 FF 75 08
text:0040183A FF 15 50 91 40 00
text:00401840 6A 64
text:00401842 68 E8 E1 40 00
text:00401847 6A 6D
text:00401849 FF 75 08
text:0040184C FF 15 50 91 40 00
text:00401852 FF 75 08
text:00401855 E8 53 F9 FF FF
text:0040185A 59
text:0040185B 8B 45 08
text:0040185E A3 A3 D1 40 00
text:00401863 FF 75 14
text:00401866 FF 75 08
text:00401869 E8 E9 0F 00 00
text:0040186E 59
text:0040186F 59
text:00401870 85 C0
Here,
55 8B EC 83 EC 20 6A 64 68 80 E1 40 00 6A 67 FF
75 08 FF 15 50 91 40 00 6A 64 68 E8 E1 40 00 6A
6D FF 75 08 FF 15 50 91 40 00 FF 75 08 E8 53 F9
FF FF 59 8B 45 08 A3 A3 D1 40 00 FF 75 14 FF 75
08 E8 E9 0F 00 00 59 59 85 C0
is the opcode sequence obtained. It is then cleaned; for example, cleaning only the first 64 bytes gives:
55 8B EC 83 EC 20 6A 64 68 00 00 00 00 6A 67 FF
75 08 FF 15 00 00 00 00 6A 64 68 00 00 00 00 6A
6D FF 75 08 FF 15 00 00 00 00 FF 75 08 E8 00 00
00 00 59 8B 45 08 A3 A3 D1 40 00 FF 75 14 FF 75
Here the unstable bytes identified by the rules above, namely 80 E1 40 00, 50 91 40 00, E8 E1 40 00, 50 91 40 00 and 53 F9 FF FF, are all reset to zero, which completes the cleaning.
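The cleaning of this example can be reproduced with a short sketch. It assumes the function is already split into individual instructions (as in the listing) and covers only the three patterns named in the text (68, FF 15 and E8 followed by a 4-byte operand); the helper names `clean_instruction` and `clean_function` are hypothetical, and a real cleaner would need a proper disassembler and more rules.

```python
from typing import Iterable, Optional

# Opcode prefixes whose 4-byte operand is treated as unstable. Only the three
# patterns named in the text are covered here.
UNSTABLE_PREFIXES = (
    bytes.fromhex("68"),     # string reference
    bytes.fromhex("FF 15"),  # API function call
    bytes.fromhex("E8"),     # internal function call
)


def clean_instruction(ins: bytes, delete: bool = False) -> bytes:
    """Zero (or, if delete=True, drop) the 4-byte operand of a matching instruction."""
    for prefix in UNSTABLE_PREFIXES:
        if ins.startswith(prefix) and len(ins) == len(prefix) + 4:
            return prefix if delete else prefix + bytes(4)
    return ins


def clean_function(instructions: Iterable[bytes], limit: Optional[int] = None,
                   delete: bool = False) -> bytes:
    """Clean each instruction, concatenate, and optionally keep the first `limit` bytes."""
    cleaned = b"".join(clean_instruction(i, delete) for i in instructions)
    return cleaned if limit is None else cleaned[:limit]


# The example function from the text, one instruction per entry.
EXAMPLE = [bytes.fromhex(h) for h in (
    "55", "8B EC", "83 EC 20", "6A 64", "68 80 E1 40 00", "6A 67",
    "FF 75 08", "FF 15 50 91 40 00", "6A 64", "68 E8 E1 40 00", "6A 6D",
    "FF 75 08", "FF 15 50 91 40 00", "FF 75 08", "E8 53 F9 FF FF", "59",
    "8B 45 08", "A3 A3 D1 40 00", "FF 75 14", "FF 75 08", "E8 E9 0F 00 00",
    "59", "59", "85 C0",
)]

cleaned_64 = clean_function(EXAMPLE, limit=64)
print(cleaned_64.hex(" ").upper())
```

Working per instruction rather than scanning the raw byte stream avoids mistaking an operand byte such as 68 or E8 for an opcode. The `delete` flag illustrates the alternative of removing the unstable bytes entirely instead of zeroing them.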
Step 105: take a code segment of predetermined length from each function's cleaned function data and calculate the function's content digest value from it.
After cleaning, a function is represented as a code sequence with definite content and order, and this sequence can be represented by a content digest calculated over a predetermined length of it. The content digest serves as the function's "fingerprint" and can be used to identify the function.
The predetermined length may be the first N bytes of the function's code (for example 64 or 128 bytes), all of its bytes, or a selected subset of its bytes. The digest may be computed with a hash algorithm, in which case the content digest is the resulting hash value, that is, the hash of the predetermined length of each function's assembly code in the data package.
For example, for the function shown above, the hash of the first 64 bytes is computed with the SHA256 algorithm:
SHA256
(558BEC83EC206A6468000000006A67FF7508FF15000000006A6468000000006A6DFF7508FF15
00000000FF7508E800000000598B4508A3A3D14000FF7514FF75)=
324b5e91805e6fe493919f8b3e971972942e14835470a02ae8f0fb5b97cd393b
The final SHA256 value is then used to represent this function:
324b5e91805e6fe493919f8b3e971972942e14835470a02ae8f0fb5b97cd393b
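A content digest of this kind can be computed with Python's standard `hashlib` module. The sketch below is illustrative only: the function name `content_digest` and its default length of 64 are assumptions, and the sketch does not attempt to confirm the digest value printed in the text.

```python
import hashlib


def content_digest(cleaned_code: bytes, length: int = 64) -> str:
    """SHA-256 hex digest of the first `length` bytes of cleaned function code."""
    return hashlib.sha256(cleaned_code[:length]).hexdigest()


# Cleaned first 64 bytes of the example function, copied from the text.
cleaned = bytes.fromhex(
    "558BEC83EC206A6468000000006A67FF7508FF15000000006A6468000000006A6DFF7508FF15"
    "00000000FF7508E800000000598B4508A3A3D14000FF7514FF75"
)
fingerprint = content_digest(cleaned)
print(fingerprint)
```

Because only the first `length` bytes are hashed, two functions that share a cleaned 64-byte prefix receive the same fingerprint; a longer prefix or the full function body trades speed for fewer collisions.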
Step 106: count the number of times each function occurs in malicious files and clean files, and derive each function's maliciousness value from those counts.
A large number of malicious and clean sample files are decompiled, their functions are extracted, and the number of times each function occurs in malicious files and in clean files is counted. If a function occurs in a malicious file, its malicious count is incremented; if it occurs in a clean file, its clean count is incremented. The maliciousness value of the function can then be calculated from these statistics.
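The counting can be sketched as follows. The text does not give a formula for the maliciousness value, so the ratio used here (malicious occurrences divided by all occurrences) is purely an assumption for illustration, as is the choice to count each function at most once per file; the names `count_occurrences` and `maliciousness` are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple


def count_occurrences(
    files: Iterable[Tuple[Set[str], bool]]
) -> Tuple[Counter, Counter]:
    """Count, per function digest, how many malicious and clean files contain it.

    Each input item is (set of function digests in the file, is_malicious).
    """
    malicious, clean = Counter(), Counter()
    for digests, is_malicious in files:
        (malicious if is_malicious else clean).update(digests)
    return malicious, clean


def maliciousness(malicious: Counter, clean: Counter) -> Dict[str, float]:
    """Assumed score: share of a function's occurrences found in malicious files."""
    scores = {}
    for digest in set(malicious) | set(clean):
        m, c = malicious[digest], clean[digest]
        scores[digest] = m / (m + c)
    return scores


# Toy corpus with shortened, made-up digests.
corpus = [
    ({"aaa", "bbb"}, True),
    ({"aaa"}, True),
    ({"bbb", "ccc"}, False),
]
mal, cln = count_occurrences(corpus)
scores = maliciousness(mal, cln)
```

With this assumed ratio, a function seen only in malicious files scores 1.0 and one seen only in clean files scores 0.0; other scoring rules, for example ones that weight by corpus size, would fit the same counts.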
Embodiment two
Embodiment two provides a device for detecting the degree of maliciousness of a function. The device determines how malicious, or how clean, an individual function in a computer file is.
Fig. 2 shows the device for detecting the degree of maliciousness of a function in this embodiment. The device comprises the following modules:
File acquisition module: collects known clean files and malicious files.
The file acquisition module can collect a large number of clean files and malicious files. The collection may be performed once, but preferably the module is run continuously so that known clean files and malicious files of all kinds keep being collected. The clean and malicious samples are files whose status has already been confirmed, that is, files that can be accurately judged safe or malicious from existing information. A malicious file in this embodiment is any software that can cause harm, including but not limited to viruses, worms, Trojan horses, malicious spyware, unauthorized adware and ransomware. A clean file is the opposite of a malicious file: software that does not harm system security or information security. For example, clean samples may be files signed by a company with a trusted security reputation, such as files signed by Microsoft, or files confirmed safe through other channels. Malicious samples may be any confirmed malicious files, including but not limited to samples vetted by antivirus vendors.
The number of samples collected by this module may be large, even enormous; for example, as many clean and malicious files as can be obtained. The more samples are collected, the more accurate the statistics-based analysis becomes.
File decompilation module: decompiles each clean file and each malicious file to obtain function data described in a low-level language (for example, assembly language).
The file decompilation module decompiles the collected files; this covers both the clean files and the malicious files. The decompilation result of each file is stored separately. Decompilation can be carried out with an existing decompilation tool, for example the disassembly tool IDA, or with any other decompilation method in the prior art; each clean file and each malicious file is decompiled to obtain its own function data described in a low-level language. Here the low-level language is represented as a sequence of hexadecimal characters. For example, if the sample is an EXE file, it is decompiled into a function data package described in assembly language; if the sample is an APK file, it is decompiled into a function data package described in smali. Both the assembly output and the smali output are represented as sequences of hexadecimal bytes. Function data described in assembly language is also called opcodes; function data described in other low-level languages is usually called bytecode.
All clean and malicious files are decompiled into this low-level representation because the great majority of computer files can be decompiled into it, and because it reflects a file's actual content more faithfully. This widens the range of files that can be analysed and improves accuracy.
Data screening module: screens the function data to obtain the function data that corresponds to user-written content in the sample software.
The screening operates on each function obtained by decompilation. It may include removing library functions at the low-level-language level and functions generated automatically by the compiler.
Malicious instructions are usually introduced by a specific user, whereas library functions and compiler-generated functions generally carry no malicious instructions. In other words, the characteristic malicious instructions in a file normally come from code the user wrote. Retaining only the functions that correspond to user-written content is therefore enough to keep the features relevant to whether the file is malicious, and removing the other functions avoids introducing noise into the judgment.
Distinguishing user-written content from library functions and compiler-generated functions can be done with existing techniques: library functions and compiler-generated functions are generally well documented in the prior art, so they can be removed directly. For example, when decompiling with IDA, the API that IDA provides can be called to perform this screening; the screening can also be implemented in another decompilation library by following similar rules.
After the library functions and compiler-generated functions have been screened out, only the functions that correspond to user-written content remain. Any other functions in the decompiled data that are unrelated to user-written content may of course be removed as well.
Data cleaning module: cleans each function that corresponds to user-written content by removing its unstable bytes, yielding cleaned function data.
As described above, the data of each function after decompilation is a sequence of hexadecimal bytes. Some bytes in this sequence may differ from one compilation to the next. For example, some bytes hold the offset address of a string; because the string's location may change with each compilation, the offset address changes and so do the corresponding bytes. Bytes that are prone to change in this way are called unstable bytes, or variable bytes. In assembly-language decompilation, unstable bytes occur in instruction types including, but not limited to, a string reference (68 xx xx xx xx), an API function call (FF 15 xx xx xx xx) and an internal function call (E8 xx xx xx xx), where xx xx xx xx denotes the variable bytes.
Thus, if a function references a string or another resource, its opcodes after decompilation contain a relative address that may change after recompilation, which in turn changes the content of the function. The purpose of cleaning unstable bytes is to remove the effect of these variable bytes. The cleaning may reset the bytes to a predetermined value, including but not limited to 0, or remove them entirely, among other methods.
For example, a function may yield the following opcodes after decompilation:
text:00401828 55
text:00401829 8B EC
text:0040182B 83 EC 20
text:0040182E 6A 64
text:00401830 68 80 E1 40 00
text:00401835 6A 67
text:00401837 FF 75 08
text:0040183A FF 15 50 91 40 00
text:00401840 6A 64
text:00401842 68 E8 E1 40 00
text:00401847 6A 6D
text:00401849 FF 75 08
text:0040184C FF 15 50 91 40 00
text:00401852 FF 75 08
text:00401855 E8 53 F9 FF FF
text:0040185A 59
text:0040185B 8B 45 08
text:0040185E A3 A3 D1 40 00
text:00401863 FF 75 14
text:00401866 FF 75 08
text:00401869 E8 E9 0F 00 00
text:0040186E 59
text:0040186F 59
text:00401870 85 C0
Here,
55 8B EC 83 EC 20 6A 64 68 80 E1 40 00 6A 67 FF
75 08 FF 15 50 91 40 00 6A 64 68 E8 E1 40 00 6A
6D FF 75 08 FF 15 50 91 40 00 FF 75 08 E8 53 F9
FF FF 59 8B 45 08 A3 A3 D1 40 00 FF 75 14 FF 75
08 E8 E9 0F 00 00 59 59 85 C0
is the opcode sequence obtained. It is then cleaned; for example, cleaning only the first 64 bytes gives:
55 8B EC 83 EC 20 6A 64 68 00 00 00 00 6A 67 FF
75 08 FF 15 00 00 00 00 6A 64 68 00 00 00 00 6A
6D FF 75 08 FF 15 00 00 00 00 FF 75 08 E8 00 00
00 00 59 8B 45 08 A3 A3 D1 40 00 FF 75 14 FF 75
Here the unstable bytes identified by the rules above, namely 80 E1 40 00, 50 91 40 00, E8 E1 40 00, 50 91 40 00 and 53 F9 FF FF, are all reset to zero, which completes the cleaning.
Function content digest calculation module: takes a code segment of predetermined length from each function's cleaned function data and calculates the function's content digest value from it.
After cleaning, a function is represented as a code sequence with definite content and order, and this sequence can be represented by a content digest calculated over a predetermined length of it. The content digest serves as the function's "fingerprint" and can be used to identify the function.
The predetermined length may be the first N bytes of the function's code (for example 64 or 128 bytes), all of its bytes, or a selected subset of its bytes. The digest may be computed with a hash algorithm, in which case the content digest is the resulting hash value, that is, the hash of the predetermined length of each function's assembly code in the data package.
For example, for the function shown above, the hash of the first 64 bytes is computed with the SHA256 algorithm:
SHA256
(558BEC83EC206A6468000000006A67FF7508FF15000000006A6468000000006A6DFF7508FF15
00000000FF7508E800000000598B4508A3A3D14000FF7514FF75)=
324b5e91805e6fe493919f8b3e971972942e14835470a02ae8f0fb5b97cd393b
The final SHA256 value is then used to represent this function:
324b5e91805e6fe493919f8b3e971972942e14835470a02ae8f0fb5b97cd393b
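A minimal sketch of the digest step, using Python's standard `hashlib`. The function name `function_digest` and the constant `CLEANED` are hypothetical, and the sketch assumes the hash is taken over the raw bytes rather than over their hexadecimal text. It does not assert the digest value quoted above; it only shows how such a value would be produced.

```python
import hashlib
from typing import Optional

# Cleaned first 64 bytes of the example function, copied from the text above.
CLEANED = bytes.fromhex(
    "558BEC83EC206A6468000000006A67FF7508FF15000000006A6468000000006A6DFF7508FF15"
    "00000000FF7508E800000000598B4508A3A3D14000FF7514FF75"
)


def function_digest(cleaned: bytes, length: Optional[int] = 64) -> str:
    """Return the SHA256 hex digest of the first `length` bytes of cleaned code.

    Pass length=None to hash all bytes, one of the options the text allows.
    """
    prefix = cleaned if length is None else cleaned[:length]
    return hashlib.sha256(prefix).hexdigest()


if __name__ == "__main__":
    print(function_digest(CLEANED))
```

Because only the first `length` bytes are hashed, two functions that share the same cleaned prefix receive the same fingerprint even if they differ later. A shorter prefix therefore trades precision for tolerance to changes near the end of the function.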
Statistics module: counts the number of times each function occurs in malicious files and in clean files, and from
these counts obtains the maliciousness value of each function.
A large number of malicious and clean sample files are decompiled, their functions are extracted, and the number of
times each function occurs in malicious files and in clean files is counted. If a function occurs in a malicious file,
its malicious count is incremented; conversely, if it occurs in a clean file, its clean count is incremented. The
maliciousness value of the function can then be calculated from these statistics.
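The counting described above can be sketched as follows. This passage does not state how the maliciousness value is derived from the two counts, so the ratio used in `malice_score` is purely an assumed placeholder. The names `count_occurrences` and `malice_score`, and the choice to count a function at most once per file, are also assumptions.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

# One sample is (digests of the functions found in the file, is_malicious).
Sample = Tuple[Iterable[str], bool]


def count_occurrences(samples: Iterable[Sample]) -> Tuple[Counter, Counter]:
    """Count, per function digest, how many malicious and clean files contain it."""
    malicious: Counter = Counter()
    clean: Counter = Counter()
    for digests, is_malicious in samples:
        target = malicious if is_malicious else clean
        # set() so a function repeated inside one file is counted once (assumption).
        target.update(set(digests))
    return malicious, clean


def malice_score(digest: str, malicious: Counter, clean: Counter) -> Optional[float]:
    """Assumed scoring rule: share of containing files that are malicious.

    Returns None for a function never seen in any sample.
    """
    m, c = malicious[digest], clean[digest]
    total = m + c
    return m / total if total else None


if __name__ == "__main__":
    samples = [(["a", "b"], True), (["a"], True), (["a", "c"], False)]
    mal, cln = count_occurrences(samples)
    for d in ("a", "b", "c", "d"):
        print(d, malice_score(d, mal, cln))
```

Keeping the two counters separate, rather than storing only a ratio, lets the score be recomputed under a different rule as more samples are added.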
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a
device, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware
embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the
present invention may take the form of a computer program product implemented on one or more computer-usable storage
media (including but not limited to disk storage and optical storage) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, equipment
(device), and computer program product according to embodiments of the present invention. It should be understood that
each flow and/or block in the flowcharts and/or block diagrams, and any combination of flows and/or blocks in the
flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program
instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded
processor, or another programmable data processing device to produce a machine, such that the instructions executed by
the processor of the computer or other programmable data processing device produce a means for implementing the
functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer
or other programmable data processing device to operate in a particular manner, such that the instructions stored in
the computer-readable memory produce an article of manufacture that includes an instruction means, the instruction
means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the
block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device,
so that a series of operational steps is performed on the computer or other programmable device to produce a
computer-implemented process, whereby the instructions executed on the computer or other programmable device provide
steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the
block diagrams.
Obviously, those skilled in the art can make various changes and modifications to the present invention without
departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of
the present invention and their technical equivalents, the present invention is also intended to include them.
Claims (10)
1. A method for detecting the degree of maliciousness of a function, characterized in that the method comprises:
a file acquisition step of collecting sample files, the sample files comprising known clean files and malicious files;
a file decompilation step of decompiling each clean file and each malicious file to obtain function data in a
low-level representation;
a data screening step of screening the function data to select the function data in the sample files that corresponds
to user-written content;
a data cleaning step of cleaning each function corresponding to the user-written content by removing the unstable
bytes therein, to obtain cleaned function data;
a statistics step of counting, based on the cleaned function data, the number of times each function occurs in
malicious files and in clean files, and obtaining the maliciousness value of each function from those counts.
2. The method for detecting the degree of maliciousness of a function according to claim 1, characterized in that the
method further comprises:
a function content digest calculation step of, for each function, taking the code of a predetermined length from the
cleaned function data and computing the function content digest value from that code.
3. The method for detecting the degree of maliciousness of a function according to claim 2, characterized in that
counting, based on the cleaned function data, the number of times each function occurs in malicious files and in
clean files comprises:
counting the number of times each function occurs in malicious files and in clean files according to the function
content digest value.
4. The method for detecting the degree of maliciousness of a function according to any one of claims 1-3,
characterized in that the function data in the low-level representation is opcode or bytecode.
5. The method for detecting the degree of maliciousness of a function according to any one of claims 1-4,
characterized in that removing the unstable bytes therein comprises: assigning a predetermined value to the unstable
bytes, or deleting the unstable bytes entirely.
6. A device for detecting the degree of maliciousness of a function, characterized in that the device comprises:
a file acquisition module that collects sample files, the sample files comprising known clean files and malicious
files;
a file decompilation module that decompiles each clean file and each malicious file to obtain function data in a
low-level representation;
a data screening module that screens the function data to select the function data in the sample files that
corresponds to user-written content;
a data cleaning module that cleans each function corresponding to the user-written content by removing the unstable
bytes therein, to obtain cleaned function data;
a statistics module that counts, based on the cleaned function data, the number of times each function occurs in
malicious files and in clean files, and obtains the maliciousness value of each function from those counts.
7. The device for detecting the degree of maliciousness of a function according to claim 6, characterized in that the
device further comprises:
a function content digest calculation module that, for each function, takes the code of a predetermined length from
the cleaned function data and computes the function content digest value from that code.
8. The device for detecting the degree of maliciousness of a function according to claim 7, characterized in that
counting, based on the cleaned function data, the number of times each function occurs in malicious files and in
clean files comprises:
counting the number of times each function occurs in malicious files and in clean files according to the function
content digest value.
9. The device for detecting the degree of maliciousness of a function according to any one of claims 6-8,
characterized in that the function data in the low-level representation is opcode or bytecode.
10. The device for detecting the degree of maliciousness of a function according to any one of claims 6-9,
characterized in that removing the unstable bytes therein comprises: assigning a predetermined value to the unstable
bytes, or deleting the unstable bytes entirely.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610446745.XA CN106127044A (en) | 2016-06-20 | 2016-06-20 | The detection method of a kind of function malice degree and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106127044A true CN106127044A (en) | 2016-11-16 |
Family
ID=57471001
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610446745.XA Pending CN106127044A (en) | 2016-06-20 | 2016-06-20 | The detection method of a kind of function malice degree and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106127044A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108664791A (en) * | 2017-03-29 | 2018-10-16 | 腾讯科技(深圳)有限公司 | A kind of webpage back door detection method in HyperText Preprocessor code and device |
CN111324890A (en) * | 2018-12-14 | 2020-06-23 | 华为技术有限公司 | Processing method, detection method and device of portable executive body file |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101604364A (en) * | 2009-07-10 | 2009-12-16 | 珠海金山软件股份有限公司 | Computer rogue program categorizing system and sorting technique based on file instruction sequence |
CN102982043A (en) * | 2011-09-07 | 2013-03-20 | 腾讯科技(深圳)有限公司 | Processing method and device for portable execute (PE) files |
US20140068768A1 (en) * | 2012-08-29 | 2014-03-06 | The Johns Hopkins University | Apparatus and Method for Identifying Related Code Variants in Binaries |
CN103761476A (en) * | 2013-12-30 | 2014-04-30 | 北京奇虎科技有限公司 | Characteristic extraction method and device |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108664791A (en) * | 2017-03-29 | 2018-10-16 | 腾讯科技(深圳)有限公司 | A kind of webpage back door detection method in HyperText Preprocessor code and device |
CN108664791B (en) * | 2017-03-29 | 2023-05-16 | 腾讯科技(深圳)有限公司 | Method and device for detecting back door of webpage in hypertext preprocessor code |
CN111324890A (en) * | 2018-12-14 | 2020-06-23 | 华为技术有限公司 | Processing method, detection method and device of portable executive body file |
CN111324890B (en) * | 2018-12-14 | 2022-12-02 | 华为技术有限公司 | Processing method, detection method and device of portable executive body file |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Li et al. | Libd: Scalable and precise third-party library detection in android markets | |
US8762948B1 (en) | System and method for establishing rules for filtering insignificant events for analysis of software program | |
US9348998B2 (en) | System and methods for detecting harmful files of different formats in virtual environments | |
Nari et al. | Automated malware classification based on network behavior | |
US9715588B2 (en) | Method of detecting a malware based on a white list | |
CN104091121B (en) | The detection, excision and the method recovered of the malicious code of bag Malware are beaten again Android | |
US9454658B2 (en) | Malware detection using feature analysis | |
CN101685483B (en) | Method and device for extracting virus feature code | |
EP2975873A1 (en) | A computer implemented method for classifying mobile applications and computer programs thereof | |
CN106326737B (en) | System and method for detecting the harmful file that can be executed on virtual stack machine | |
CN104834859A (en) | Method for dynamically detecting malicious behavior in Android App (Application) | |
Wang et al. | Orlis: Obfuscation-resilient library detection for Android | |
CN103473346A (en) | Android re-packed application detection method based on application programming interface | |
Zakeri et al. | A static heuristic approach to detecting malware targets | |
Balachandran et al. | Potent and stealthy control flow obfuscation by stack based self-modifying code | |
CN103475671B (en) | Malware detection methods | |
Han et al. | Malware classification methods using API sequence characteristics | |
CN104462990A (en) | Character string decrypting and encrypting method and device | |
CN103473104A (en) | Method for discriminating re-package of application based on keyword context frequency matrix | |
Anju et al. | Malware detection using assembly code and control flow graph optimization | |
KR101816045B1 (en) | Malware detecting system with malware rule set | |
CN106682506A (en) | Virus program detecting method and terminal | |
CN112287342A (en) | Internet of things firmware dynamic detection method and device, electronic equipment and storage medium | |
CN106127044A (en) | The detection method of a kind of function malice degree and device | |
CN105975854A (en) | Detection method and device for malicious file |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20161116 |