CN106127044A - Method and device for detecting the malice degree of a function - Google Patents

Method and device for detecting the malice degree of a function

Info

Publication number
CN106127044A
Authority
CN
China
Prior art keywords
function
file
data
malicious
clean
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610446745.XA
Other languages
Chinese (zh)
Inventor
程波
侯贺明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Greenet Information Service Co Ltd
Original Assignee
Wuhan Greenet Information Service Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Greenet Information Service Co Ltd
Priority to CN201610446745.XA
Publication of CN106127044A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562 Static detection
    • G06F21/568 Computer malware detection or handling, e.g. anti-virus arrangements eliminating virus, restoring damaged files

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Virology (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The present invention belongs to the field of information security and relates to a method for detecting the malice degree of a function. The method comprises: a file acquisition step of collecting sample files, the sample files comprising known clean files and known malicious files; a file decompilation step of decompiling each clean file and each malicious file to obtain function data described in a low-level language; a data screening step of screening the function data and selecting the function data corresponding to content written by the user in the sample files; a data cleaning step of cleaning each function corresponding to the user-written content by removing its unstable bytes, thereby obtaining cleaned function data; and a statistics step of counting, from the cleaned function data, the number of times each function occurs in malicious files and in clean files, and obtaining the malice degree value of each function from those counts. This technical solution achieves an intelligent determination of the malice degree of a given function. The detection granularity is thereby refined from the file level to the function level, which helps improve the determination of whether a file is malicious.

Description

Method and device for detecting the malice degree of a function
Technical field
This patent belongs to the field of information security and relates in particular to a method and device for detecting the malice degree of a function.
Background art
In the field of information security, malicious-file detection is a very important link. Malicious files come in many types, including but not limited to PE files on Windows systems, ELF files on Linux systems, executable files on Mac systems, and APK files on Android systems; client-side script files such as JavaScript, VBScript, and shell scripts; and server-side script files such as PHP, Python, and ASP files. To ensure information security, it is necessary to determine whether a given file is malicious and to take appropriate measures to handle it.
In the prior art, the methods for judging malicious files differ somewhat across file types. In general, however, there are two ways to judge a malicious file. The first is manual judgment, in which security personnel analyze the file on the basis of their experience. The second is automated judgment, in which a computer program encodes that human experience so that a machine can judge malicious files automatically. Automated judgment essentially infers the attributes of an unknown file from its associations with known files. These associations include similarity comparison of file content, difference comparison of file content, whether the files come from the same source, whether the files carry the same signature information, and so on. The most important of these is similarity comparison of file content, because in most cases only the content of a file can be obtained, without any peripheral information associated with it.
As the variety of malicious files and of the techniques behind them keeps growing, the prior art needs to keep enriching the means of detecting malicious files in order to improve information security capability.
Summary of the invention
This patent is proposed in view of the above needs of the prior art. The technical problem it solves is to provide a method and device for detecting the malice degree of a function, so as to determine the malice degree of a given function.
To solve the above problem, the technical solution of the present invention is as follows:
A method for detecting the malice degree of a function, the method comprising: a file acquisition step of collecting sample files, the sample files comprising known clean files and malicious files; a file decompilation step of decompiling each clean file and each malicious file to obtain function data described in a low-level language; a data screening step of screening the function data and selecting the function data corresponding to content written by the user in the sample files; a data cleaning step of cleaning each function corresponding to the user-written content by removing its unstable bytes, thereby obtaining cleaned function data; and a statistics step of counting, from the cleaned function data, the number of times each function occurs in malicious files and in clean files, and obtaining the malice degree value of each function from those counts.
Preferably, the method further comprises a function content digest calculation step: for each function, taking a code segment of predetermined length from the cleaned function data and calculating the content digest value of the function from that code segment.
Preferably, counting from the cleaned function data the number of times each function occurs in malicious files and in clean files comprises: counting those occurrences according to the function content digest value.
Preferably, the function data described in the low-level language is opcode or bytecode.
Preferably, removing the unstable bytes comprises: assigning a predetermined value to the unstable bytes, or deleting the unstable bytes entirely.
According to another aspect of the present invention, a device for detecting the malice degree of a function is also provided, the device comprising: a file acquisition module that collects sample files, the sample files comprising known clean files and malicious files; a file decompilation module that decompiles each clean file and each malicious file to obtain function data described in a low-level language; a data screening module that screens the function data and selects the function data corresponding to content written by the user in the sample files; a data cleaning module that cleans each function corresponding to the user-written content by removing its unstable bytes, thereby obtaining cleaned function data; and a statistics module that counts, from the cleaned function data, the number of times each function occurs in malicious files and in clean files, and obtains the malice degree value of each function from those counts.
Preferably, the device further comprises a function content digest calculation module that, for each function, takes a code segment of predetermined length from the cleaned function data and calculates the content digest value of the function from that code segment.
Preferably, counting from the cleaned function data the number of times each function occurs in malicious files and in clean files comprises: counting those occurrences according to the function content digest value.
Preferably, the function data described in the low-level language is opcode or bytecode.
Preferably, removing the unstable bytes comprises: assigning a predetermined value to the unstable bytes, or deleting the unstable bytes entirely.
Through the above technical solution, this patent achieves an intelligent determination of the malice degree of a given function. The detection granularity is thereby refined from the file level to the function level, which helps improve the determination of whether a file is malicious.
Brief description of the drawings
Fig. 1 is a flow chart of the method for detecting the malice degree of a function provided in the detailed description of this patent;
Fig. 2 is a structural diagram of the device for detecting the malice degree of a function provided in the detailed description of this patent.
Detailed description
Specific implementations of this patent are described below with reference to the accompanying drawings. It should be noted that these implementations are only examples of preferred technical solutions of this patent and should not be construed as limiting its scope of protection.
Embodiment one
Embodiment one provides a method for detecting the malice degree of a function. The method detects the malice degree and the clean degree of an individual function in a computer file.
In this embodiment, a malicious file is a file that runs in a computer system or another intelligent system and performs malicious operations. The computer system is not limited to personal computers or servers and also includes other systems that operate by means of a computer; other intelligent systems include, but are not limited to, mobile phone operating systems, wearable device operating systems, and intelligent robot operating systems.
Fig. 1 shows the flow of the detection method in this embodiment. The method comprises the following steps:
Step 101: collect known clean files and malicious files.
In step 101, a large number of clean files and malicious files can be collected. The collection can be done once, but it is preferable to run the method continuously so that all kinds of known clean files and malicious files keep being collected. The clean and malicious samples are files that have already been confirmed, that is, files that can be accurately judged as safe or malicious on the basis of existing information. A malicious file in this embodiment is any software that can cause harm, including but not limited to viruses, worms, Trojan horse programs, malicious spyware, unauthorized adware, and ransomware. A clean file is the opposite of a malicious file: software that does not harm system security or information security. For example, clean samples can be files signed by a company with a good security reputation, such as files signed by Microsoft, or files confirmed as safe through other channels. Malicious samples can be any confirmed malicious files, including but not limited to samples that have been examined by antivirus software companies. The number of software samples collected in this step can be large, even enormous; for example, as many clean and malicious files as can be obtained are collected. The more samples are collected, the more accurate the statistics-based analysis becomes.
Step 102: decompile each clean file and each malicious file to obtain function data described in a low-level language, for example assembly language.
In step 102, the collected files are decompiled; this covers both the clean files and the malicious files. The decompilation result of each file is stored separately. Decompilation can be done with existing decompilation tools, for example the prior-art tool IDA, or with any other prior-art decompilation method, so that each clean file and each malicious file is decompiled into its own function data described in a low-level language. The low-level language here is a sequence described with hexadecimal characters. For example, if the sample is an EXE file, it is decompiled into a function data package described in assembly language; if the sample is an APK file, it is decompiled into a function data package described in the smali language. Both the assembly language and the smali language are described as sequences of hexadecimal bytes. Function data described in assembly language can also be called opcode, and function data described in other low-level languages is commonly called bytecode.
Both clean files and malicious files are decompiled into this low-level representation because the great majority of computer files can be decompiled into it, and because the low-level representation reflects the content of a file more faithfully. This widens the applicability of the file analysis and improves its accuracy.
Step 103: screen the function data to obtain the function data corresponding to the content written by the user in the sample software.
The screening in step 103 is applied to the functions described in the low-level language. It can include removing assembly-level library functions and the functions that the decompilation identifies as generated automatically by the compiler. Malicious instructions are usually introduced by a specific user, whereas library functions and compiler-generated functions generally do not carry malicious instructions. In other words, the characteristic information of the malicious instructions in a file usually comes from code the user wrote. Retaining only the functions corresponding to user-written content is therefore enough to retain the features relevant to whether the file is malicious, and removing the other functions avoids adding noise to the judgment.
Distinguishing user-written content from library functions and compiler-generated functions can be done with the prior art, which usually keeps definite records of library functions and compiler-generated functions, so these functions can be removed directly. For example, when decompiling with IDA, the API that IDA provides can be called to perform this screening; the screening can of course also be implemented in another decompilation library according to similar rules.
Once the data of library functions and compiler-generated functions has been screened out, what remains is the functions corresponding to the user-written content. If the decompiled function data contains other functions unrelated to the user-written content, these can be removed as well.
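The screening in step 103 can be sketched as a simple filter. This is a minimal illustration, not the patent's implementation: the `FunctionRecord` type, its `is_library` and `is_compiler_generated` flags, and the sample function names are all hypothetical, and the sketch assumes the decompilation tool has already labelled each function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionRecord:
    """Hypothetical record for one decompiled function."""
    name: str
    opcode_hex: str
    is_library: bool = False             # assumed to be set by the decompiler
    is_compiler_generated: bool = False  # assumed to be set by the decompiler


def screen_user_functions(functions):
    """Keep only functions that are neither library nor compiler-generated."""
    return [
        f for f in functions
        if not (f.is_library or f.is_compiler_generated)
    ]


# Illustrative input: one user-written function and two functions to discard.
functions = [
    FunctionRecord("sub_401828", "558BEC83EC20"),
    FunctionRecord("_memcpy", "5589E5", is_library=True),
    FunctionRecord("__security_check_cookie", "3B0D", is_compiler_generated=True),
]

user_functions = screen_user_functions(functions)
print([f.name for f in user_functions])
```

Keeping the labels on the record, rather than filtering inside the decompiler call, makes it easy to add further exclusion rules for other functions unrelated to user-written content.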
Step 104: clean each function corresponding to user-written content by removing its unstable bytes, obtaining the cleaned function data.
As described above, the data content of each function after decompilation is a hexadecimal byte sequence. Some bytes in this sequence may differ after each compilation. For example, the value of some bytes is the offset address of a character string; because the position of the string may differ after each compilation, the offset address differs too, and the corresponding bytes change. Bytes that are prone to change in this way are unstable bytes, also called variable bytes. When decompiling to assembly language, unstable bytes include, but are not limited to, the following types: a string reference 68 xx xx xx xx, an API function call FF 15 xx xx xx xx, and an internal function call E8 xx xx xx xx. In each case xx xx xx xx are the variable bytes.
It follows that if a function references a character string or another resource, the opcode obtained after decompilation contains a relative address, and this relative address may change after recompilation, so the content of the function changes as well. The purpose of cleaning unstable bytes is to remove the effect of these variable bytes. The bytes can be cleaned by resetting them to a predetermined value, including but not limited to 0, or by removing them entirely.
For example, after decompilation a function may yield the following opcode content:
text:00401828 55
text:00401829 8B EC
text:0040182B 83 EC 20
text:0040182E 6A 64
text:00401830 68 80 E1 40 00
text:00401835 6A 67
text:00401837 FF 75 08
text:0040183A FF 15 50 91 40 00
text:00401840 6A 64
text:00401842 68 E8 E1 40 00
text:00401847 6A 6D
text:00401849 FF 75 08
text:0040184C FF 15 50 91 40 00
text:00401852 FF 75 08
text:00401855 E8 53 F9 FF FF
text:0040185A 59
text:0040185B 8B 45 08
text:0040185E A3 A3 D1 40 00
text:00401863 FF 75 14
text:00401866 FF 75 08
text:00401869 E8 E9 0F 00 00
text:0040186E 59
text:0040186F 59
text:00401870 85 C0
Here, the byte sequence
55 8B EC 83 EC 20 6A 64 68 80 E1 40 00 6A 67 FF
75 08 FF 15 50 91 40 00 6A 64 68 E8 E1 40 00 6A
6D FF 75 08 FF 15 50 91 40 00 FF 75 08 E8 53 F9
FF FF 59 8B 45 08 A3 A3 D1 40 00 FF 75 14 FF 75
08 E8 E9 0F 00 00 59 59 85 C0
is the resulting opcode. The opcode is then cleaned. For example, cleaning the first 64 bytes gives:
55 8B EC 83 EC 20 6A 64 68 00 00 00 00 6A 67 FF
75 08 FF 15 00 00 00 00 6A 64 68 00 00 00 00 6A
6D FF 75 08 FF 15 00 00 00 00 FF 75 08 E8 00 00
00 00 59 8B 45 08 A3 A3 D1 40 00 FF 75 14 FF 75
According to the rules above, the unstable bytes 80 E1 40 00, 50 91 40 00, E8 E1 40 00, 50 91 40 00, and 53 F9 FF FF are identified and all reset to zero, which completes the cleaning.
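The cleaning of the example above can be reproduced with a short sketch. It assumes a plain linear scan for the three opcode prefixes named in step 104 (68, FF 15, E8) and skips the four operand bytes after each match; a real implementation would more likely rely on a disassembler for instruction boundaries, since a byte-level scan can misfire when one of these values appears as data. The names `PATTERNS`, `clean_opcode`, and the `fill` parameter are illustrative only.

```python
# Opcode prefixes whose 4-byte operand is treated as unstable (from step 104).
PATTERNS = [
    (bytes([0x68]), 4),        # string reference: 68 xx xx xx xx
    (bytes([0xFF, 0x15]), 4),  # API function call: FF 15 xx xx xx xx
    (bytes([0xE8]), 4),        # internal function call: E8 xx xx xx xx
]


def clean_opcode(code: bytes, fill=0x00) -> bytes:
    """Replace unstable operand bytes with `fill`, or drop them if fill is None."""
    out = bytearray()
    i = 0
    while i < len(code):
        for prefix, operand_len in PATTERNS:
            if code.startswith(prefix, i):
                start = i + len(prefix)
                operand = code[start:start + operand_len]  # may be truncated
                out += prefix
                if fill is not None:
                    out += bytes([fill]) * len(operand)
                i = start + len(operand)
                break
        else:
            # No prefix matched at this position: copy the byte unchanged.
            out.append(code[i])
            i += 1
    return bytes(out)


RAW = bytes.fromhex(
    "55 8B EC 83 EC 20 6A 64 68 80 E1 40 00 6A 67 FF "
    "75 08 FF 15 50 91 40 00 6A 64 68 E8 E1 40 00 6A "
    "6D FF 75 08 FF 15 50 91 40 00 FF 75 08 E8 53 F9 "
    "FF FF 59 8B 45 08 A3 A3 D1 40 00 FF 75 14 FF 75 "
    "08 E8 E9 0F 00 00 59 59 85 C0"
)

cleaned = clean_opcode(RAW[:64])             # reset unstable bytes to zero
removed = clean_opcode(RAW[:64], fill=None)  # or delete them entirely
print(cleaned.hex().upper())
```

Passing `fill=None` corresponds to the alternative of removing the unstable bytes entirely; the five 4-byte operands in the first 64 bytes are dropped, leaving 44 bytes.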
Step 105: take a code segment of predetermined length from the cleaned function data of each function, and calculate the content digest value of the function.
After cleaning, a function is expressed as a code sequence with definite content and order. This code can be represented by a content digest calculated over a predetermined length of it. The content digest serves as the "fingerprint" of the function and can be used to identify it.
The predetermined length can be the first N bytes of the function's code (for example, 64 bytes or 128 bytes), all of its bytes, or selected bytes from the code. The algorithm for calculating the content digest can be a hash algorithm, in which case the content digest is the calculated hash value; that is, a hash is calculated over the predetermined length of assembly code of each function in the data package.
For example, in the example above, the SHA256 algorithm is used to calculate the hash of the first 64 bytes:
SHA256(558BEC83EC206A6468000000006A67FF7508FF15000000006A6468000000006A6DFF7508FF1500000000FF7508E800000000598B4508A3A3D14000FF7514FF75) = 324b5e91805e6fe493919f8b3e971972942e14835470a02ae8f0fb5b97cd393b
The resulting SHA256 value is then used to represent this function:
324b5e91805e6fe493919f8b3e971972942e14835470a02ae8f0fb5b97cd393b
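A digest of this kind can be computed with Python's standard `hashlib` module. The sketch below assumes the "first N bytes" variant with N = 64; the function name `function_digest` and its `length` parameter are illustrative, and the block prints the digest it computes rather than relying on any particular value.

```python
import hashlib

# Cleaned first 64 bytes of the example function, as listed in the text.
CLEANED = bytes.fromhex(
    "558BEC83EC206A6468000000006A67FF"
    "7508FF15000000006A6468000000006A"
    "6DFF7508FF1500000000FF7508E80000"
    "0000598B4508A3A3D14000FF7514FF75"
)


def function_digest(cleaned: bytes, length: int = 64) -> str:
    """Return the SHA-256 hex digest of the first `length` cleaned bytes."""
    return hashlib.sha256(cleaned[:length]).hexdigest()


digest = function_digest(CLEANED)
print(digest)
```

Because only the first `length` bytes are hashed, two functions that share the same cleaned prefix receive the same fingerprint even if their tails differ, which is the trade-off of choosing a short predetermined length.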
Step 106: count the number of times each function occurs in malicious files and in clean files, and from these counts obtain the malice degree value of each function.
A large number of malicious and clean sample files are decompiled, their functions are extracted, and the number of times each function occurs in malicious files and in clean files is counted. If a function occurs in a malicious file, its malicious count is incremented; conversely, if it occurs in a clean file, its clean count is incremented. The malice degree value of the function can then be calculated from these statistics.
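The counting in step 106 can be sketched as follows. The text does not give a formula for the malice degree value, so the ratio used here (malicious occurrences divided by total occurrences) is purely an assumption for illustration, as is the choice to count a function at most once per file. The digests `"a"`, `"b"`, `"c"` are placeholder strings standing in for real content digests.

```python
from collections import Counter


def count_occurrences(samples):
    """Count, per digest, how many malicious and clean files contain it.

    `samples` is an iterable of (is_malicious, digests) pairs, one per file.
    """
    malicious, clean = Counter(), Counter()
    for is_malicious, digests in samples:
        target = malicious if is_malicious else clean
        target.update(set(digests))  # assumption: count once per file
    return malicious, clean


def malice_degree(digest, malicious, clean):
    """Assumed formula: share of occurrences that are in malicious files."""
    m, c = malicious[digest], clean[digest]
    total = m + c
    return m / total if total else None  # None: function never observed


samples = [
    (True, {"a", "b"}),   # malicious file containing functions a and b
    (True, {"a"}),        # malicious file containing function a
    (False, {"b"}),       # clean file containing function b
    (False, {"c"}),       # clean file containing function c
]
malicious, clean = count_occurrences(samples)
for d in ("a", "b", "c"):
    print(d, malice_degree(d, malicious, clean))
```

With balanced sample sets a ratio like this is easy to interpret; if the numbers of malicious and clean files differ widely, normalizing each count by the size of its sample set would be a reasonable variation.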
Embodiment two
Embodiment two provides a device for detecting the malice degree of a function. The device detects the malice degree and the clean degree of an individual function in a computer file.
Fig. 2 shows the device for detecting the malice degree of a function in this embodiment. The device includes the following modules:
File acquisition module: collects known clean files and malicious files.
The file acquisition module can collect a large number of clean files and malicious files. The collection can be done once, but it is preferable to run continuously so that all kinds of known clean files and malicious files keep being collected. The clean and malicious samples are files that have already been confirmed, that is, files that can be accurately judged as safe or malicious on the basis of existing information. A malicious file in this embodiment is any software that can cause harm, including but not limited to viruses, worms, Trojan horse programs, malicious spyware, unauthorized adware, and ransomware. A clean file is the opposite of a malicious file: software that does not harm system security or information security. For example, clean samples can be files signed by a company with a good security reputation, such as files signed by Microsoft, or files confirmed as safe through other channels. Malicious samples can be any confirmed malicious files, including but not limited to samples that have been examined by antivirus software companies.
The number of software samples collected by this module can be large, even enormous; for example, as many clean and malicious files as can be obtained are collected. The more samples are collected, the more accurate the statistics-based analysis becomes.
File decompilation module: decompiles each clean file and each malicious file to obtain function data described in a low-level language, for example assembly language.
The file decompilation module decompiles the collected files; this covers both the clean files and the malicious files. The decompilation result of each file is stored separately. Decompilation can be done with existing decompilation tools, for example the prior-art tool IDA, or with any other prior-art decompilation method, so that each clean file and each malicious file is decompiled into its own function data described in a low-level language. The low-level language here is a sequence described with hexadecimal characters. For example, if the sample is an EXE file, it is decompiled into a function data package described in assembly language; if the sample is an APK file, it is decompiled into a function data package described in the smali language. Both the assembly language and the smali language are described as sequences of hexadecimal bytes. Function data described in assembly language can also be called opcode, and function data described in other low-level languages is commonly called bytecode.
Both clean files and malicious files are decompiled into this low-level representation because the great majority of computer files can be decompiled into it, and because the low-level representation reflects the content of a file more faithfully. This widens the applicability of the file analysis and improves its accuracy.
Data screening module: screens the function data to obtain the function data corresponding to the content written by the user in the sample software.
The screening is applied to each function obtained by decompilation. It can include removing low-level library functions and the functions that the decompilation identifies as generated automatically by the compiler.
Malicious instructions are usually introduced by a specific user, whereas library functions and compiler-generated functions generally do not carry malicious instructions. In other words, the characteristic information of the malicious instructions in a file usually comes from code the user wrote. Retaining only the functions corresponding to user-written content is therefore enough to retain the features relevant to whether the file is malicious, and removing the other functions avoids adding noise to the judgment.
Distinguishing user-written content from library functions and compiler-generated functions can be done with the prior art, which usually keeps definite records of library functions and compiler-generated functions, so these functions can be removed directly. For example, when decompiling with IDA, the API that IDA provides can be called to perform this screening; the screening can of course also be implemented in another decompilation library according to similar rules.
Once the data of library functions and compiler-generated functions has been screened out, what remains is the functions corresponding to the user-written content. If the decompiled function data contains other functions unrelated to the user-written content, these can be removed as well.
Data cleaning module: cleans each function corresponding to user-written content by removing its unstable bytes, obtaining the cleaned function data.
As described above, the data content of each function after decompilation is a hexadecimal byte sequence. Some bytes in this sequence may differ after each compilation. For example, the value of some bytes is the offset address of a character string; because the position of the string may differ after each compilation, the offset address differs too, and the corresponding bytes change. Bytes that are prone to change in this way are unstable bytes, also called variable bytes. When decompiling to assembly language, unstable bytes include, but are not limited to, the following types: a string reference 68 xx xx xx xx, an API function call FF 15 xx xx xx xx, and an internal function call E8 xx xx xx xx. In each case xx xx xx xx are the variable bytes.
It follows that if a function references a character string or another resource, the opcode obtained after decompilation contains a relative address, and this relative address may change after recompilation, so the content of the function changes as well. The purpose of cleaning unstable bytes is to remove the effect of these variable bytes. The bytes can be cleaned by resetting them to a predetermined value, including but not limited to 0, or by removing them entirely.
For example, after decompilation a function may yield the following opcode content:
text:00401828 55
text:00401829 8B EC
text:0040182B 83 EC 20
text:0040182E 6A 64
text:00401830 68 80 E1 40 00
text:00401835 6A 67
text:00401837 FF 75 08
text:0040183A FF 15 50 91 40 00
text:00401840 6A 64
text:00401842 68 E8 E1 40 00
text:00401847 6A 6D
text:00401849 FF 75 08
text:0040184C FF 15 50 91 40 00
text:00401852 FF 75 08
text:00401855 E8 53 F9 FF FF
text:0040185A 59
text:0040185B 8B 45 08
text:0040185E A3 A3 D1 40 00
text:00401863 FF 75 14
text:00401866 FF 75 08
text:00401869 E8 E9 0F 00 00
text:0040186E 59
text:0040186F 59
text:00401870 85 C0
Here, the byte sequence
55 8B EC 83 EC 20 6A 64 68 80 E1 40 00 6A 67 FF
75 08 FF 15 50 91 40 00 6A 64 68 E8 E1 40 00 6A
6D FF 75 08 FF 15 50 91 40 00 FF 75 08 E8 53 F9
FF FF 59 8B 45 08 A3 A3 D1 40 00 FF 75 14 FF 75
08 E8 E9 0F 00 00 59 59 85 C0
is the resulting opcode. The opcode is then cleaned. For example, cleaning the first 64 bytes gives:
55 8B EC 83 EC 20 6A 64 68 00 00 00 00 6A 67 FF
75 08 FF 15 00 00 00 00 6A 64 68 00 00 00 00 6A
6D FF 75 08 FF 15 00 00 00 00 FF 75 08 E8 00 00
00 00 59 8B 45 08 A3 A3 D1 40 00 FF 75 14 FF 75
According to the rules above, the unstable bytes 80 E1 40 00, 50 91 40 00, E8 E1 40 00, 50 91 40 00, and 53 F9 FF FF are identified and all reset to zero, which completes the cleaning.
Function content digest calculation module: takes the code of a predetermined length from the cleaned function data of each function and calculates the content digest value of that function from it.
After cleaning, a function is represented as a code sequence with specific content and order, and this sequence can be represented by a content digest calculated over a predetermined length of it. The content digest serves as the "fingerprint" of the function and can be used to identify it.
The predetermined length can be the first N bytes of the function's code (for example 64 bytes or 128 bytes), all of its bytes, or a selected subset of its bytes. The algorithm used to calculate the content digest can be a hash algorithm, in which case the content digest is the calculated hash value. In other words, a hash value is calculated over the predetermined length of the assembly code of each function in the data.
For example, in the example given above, the SHA256 algorithm is used to calculate the hash value of the first 64 bytes:
SHA256(558BEC83EC206A6468000000006A67FF7508FF15000000006A6468000000006A6DFF7508FF1500000000FF7508E800000000598B4508A3A3D14000FF7514FF75) =
324b5e91805e6fe493919f8b3e971972942e14835470a02ae8f0fb5b97cd393b
The resulting SHA256 value is then used to represent this function:
324b5e91805e6fe493919f8b3e971972942e14835470a02ae8f0fb5b97cd393b
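The digest calculation can be sketched with Python's standard `hashlib` module. The names `function_digest` and `CLEANED_HEX` are hypothetical, and the `length` parameter is an assumed way of expressing the "predetermined length" (first N bytes, or all bytes when `None`). The sketch only shows the mechanics and makes no claim about reproducing the digest value quoted above.

```python
import hashlib

# The 64 cleaned bytes from the example above, written as a hex string.
CLEANED_HEX = (
    "558BEC83EC206A6468000000006A67FF"
    "7508FF15000000006A6468000000006A"
    "6DFF7508FF1500000000FF7508E80000"
    "0000598B4508A3A3D14000FF7514FF75"
)


def function_digest(cleaned, length=64):
    """Hash the first `length` bytes of cleaned function code with SHA-256.

    Passing length=None hashes all bytes instead of a fixed-size prefix.
    """
    prefix = cleaned if length is None else cleaned[:length]
    return hashlib.sha256(prefix).hexdigest()


fingerprint = function_digest(bytes.fromhex(CLEANED_HEX))
print(fingerprint)  # 64 hexadecimal characters identifying the function
```

Because the digest is computed over cleaned bytes, two builds of the same function that differ only in the zeroed operands map to the same fingerprint.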
Statistical module: counts the number of times each function occurs in malicious files and in clean files, and from this obtains the malice degree value of each function.
A large number of malicious and clean sample files are decompiled, functions are extracted, and the number of times each function occurs in malicious files and in clean files is counted. If a function occurs in a malicious file, its malicious count is incremented; conversely, if it occurs in a clean file, its clean count is incremented. The malice degree value of the function can then be calculated from these statistics.
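The counting described above can be sketched as follows. Two points are assumptions made for illustration and are not stated in this passage: each function is counted at most once per file, and the malice degree is taken as the fraction of occurrences that fall in malicious files. The names `count_occurrences` and `malice_degree` are hypothetical.

```python
from collections import Counter


def count_occurrences(samples):
    """Count, per function digest, how many malicious and clean files contain it.

    `samples` is an iterable of (label, digests) pairs, where label is
    "malicious" or "clean" and digests are the function digests of one file.
    """
    malicious, clean = Counter(), Counter()
    for label, digests in samples:
        if label == "malicious":
            target = malicious
        elif label == "clean":
            target = clean
        else:
            raise ValueError(f"unknown label: {label!r}")
        # Assumption: a function is counted once per file, hence the set().
        target.update(set(digests))
    return malicious, clean


def malice_degree(digest, malicious, clean):
    """Assumed scoring: share of occurrences in malicious files (None if unseen)."""
    m, c = malicious[digest], clean[digest]
    total = m + c
    return m / total if total else None


samples = [
    ("malicious", ["f1", "f2", "f1"]),
    ("malicious", ["f1"]),
    ("clean", ["f2"]),
    ("clean", ["f2", "f3"]),
]
malicious_counts, clean_counts = count_occurrences(samples)
print(malice_degree("f1", malicious_counts, clean_counts))
```

Other scoring formulas could be substituted in `malice_degree` without changing how the counts are gathered.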
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a device, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, equipment (devices), and computer program products according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce a device for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, the instruction device implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.

Claims (10)

1. A method for detecting the malice degree of a function, characterized in that the method comprises:
A file acquisition step: collecting sample files, the sample files comprising known clean files and malicious files;
A file decompilation step: decompiling each clean file and each malicious file respectively to obtain function data in a low-level description;
A data screening step: screening the function data to select the function data corresponding to user-written content in the sample files;
A data cleaning step: cleaning each function corresponding to the user-written content by removing the unstable bytes in it, to obtain cleaned function data;
A statistics step: counting, according to the cleaned function data, the number of times each function occurs in malicious files and in clean files, and obtaining the malice degree value of each function from these counts.
2. The method for detecting the malice degree of a function according to claim 1, characterized in that the method further comprises:
A function content digest calculation step: for each function, taking the code of a predetermined length from the cleaned function data and calculating the function content digest value from this code.
3. The method for detecting the malice degree of a function according to claim 2, characterized in that
counting, according to the cleaned function data, the number of times each function occurs in malicious files and in clean files comprises: counting the number of times each function occurs in malicious files and in clean files according to the function content digest value.
4. The method for detecting the malice degree of a function according to any one of claims 1-3, characterized in that
the function data in the low-level description is opcodes or bytecode.
5. The method for detecting the malice degree of a function according to any one of claims 1-4, characterized in that
removing the unstable bytes comprises: assigning a predetermined value to the unstable bytes, or deleting the unstable bytes entirely.
6. A device for detecting the malice degree of a function, characterized in that the device comprises:
A file acquisition module, which collects sample files, the sample files comprising known clean files and malicious files;
A file decompilation module, which decompiles each clean file and each malicious file respectively to obtain function data in a low-level description;
A data screening module, which screens the function data to select the function data corresponding to user-written content in the sample files;
A data cleaning module, which cleans each function corresponding to the user-written content by removing the unstable bytes in it, to obtain cleaned function data;
A statistical module, which counts, according to the cleaned function data, the number of times each function occurs in malicious files and in clean files, and obtains the malice degree value of each function from these counts.
7. The device for detecting the malice degree of a function according to claim 6, characterized in that the device further comprises:
A function content digest calculation module, which, for each function, takes the code of a predetermined length from the cleaned function data and calculates the function content digest value from this code.
8. The device for detecting the malice degree of a function according to claim 7, characterized in that
counting, according to the cleaned function data, the number of times each function occurs in malicious files and in clean files comprises: counting the number of times each function occurs in malicious files and in clean files according to the function content digest value.
9. The device for detecting the malice degree of a function according to any one of claims 6-8, characterized in that
the function data in the low-level description is opcodes or bytecode.
10. The device for detecting the malice degree of a function according to any one of claims 6-9, characterized in that
removing the unstable bytes comprises: assigning a predetermined value to the unstable bytes, or deleting the unstable bytes entirely.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610446745.XA 2016-06-20 2016-06-20 The detection method of a kind of function malice degree and device (published as CN106127044A, status: Pending)

Publications (1)

Publication Number Publication Date
CN106127044A (en) 2016-11-16

Family

ID=57471001

Country Status (1)

Country Link
CN (1) CN106127044A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101604364A (en) * 2009-07-10 2009-12-16 珠海金山软件股份有限公司 Computer rogue program categorizing system and sorting technique based on file instruction sequence
CN102982043A (en) * 2011-09-07 2013-03-20 腾讯科技(深圳)有限公司 Processing method and device for portable execute (PE) files
US20140068768A1 (en) * 2012-08-29 2014-03-06 The Johns Hopkins University Apparatus and Method for Identifying Related Code Variants in Binaries
CN103761476A (en) * 2013-12-30 2014-04-30 北京奇虎科技有限公司 Characteristic extraction method and device

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108664791A (en) * 2017-03-29 2018-10-16 腾讯科技(深圳)有限公司 A kind of webpage back door detection method in HyperText Preprocessor code and device
CN108664791B (en) * 2017-03-29 2023-05-16 腾讯科技(深圳)有限公司 Method and device for detecting back door of webpage in hypertext preprocessor code
CN111324890A (en) * 2018-12-14 2020-06-23 华为技术有限公司 Processing method, detection method and device of portable executive body file
CN111324890B (en) * 2018-12-14 2022-12-02 华为技术有限公司 Processing method, detection method and device of portable executive body file

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20161116)