CN106803040A - Virus signature processing method and processing device - Google Patents
- Publication number: CN106803040A
- Application number: CN201710035588.8A
- Authority: CN (China)
- Prior art keywords: code, function, code block, sample, application program
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/562—Static detection
- G06F21/564—Static detection by virus signature recognition
Abstract
The invention discloses a virus signature processing method and processing device. The method includes: disassembling a malicious sample that carries a virus and splitting the resulting disassembled code into multiple code blocks of the malicious sample; traversing each code block to obtain the function calls executed in it, comparing the target path of each function call with the paths of application programming interface (API) functions, and determining the API functions called in the code block and the number of times each API function is called; building a corresponding code block feature based on the API functions called in the code block and the number of times they are called; and merging the code block features of the code blocks of the malicious sample to form the virus signature of the malicious sample. Implementing the invention can improve the broad-spectrum coverage and timeliness of virus signatures.
Description
Technical field
The present invention relates to security technology, and in particular to a virus signature processing method and processing device.
Background art
A computer virus, also simply called a virus, is code that its author implants in a terminal (any of various computing terminals such as smartphones, computers, and servers) for malicious purposes such as destroying the terminal's functions or data.
A virus usually runs on the terminal as a standalone application (for example, a packed one) that deceives the user in order to achieve its malicious purpose, or it is embedded in a repackaged legitimate application and achieves its malicious purpose while that application runs.
When a current signature-based antivirus engine scans for viruses, it matches the sample to be detected against virus signatures. This includes matching the hash value of the sample against the hash value in the signature, and matching the binary byte count of the sample (that is, the size of the sample in bytes) against the file byte count in the signature.
In practice, however, the following two factors make such signatures easy to invalidate and reduce the broad-spectrum coverage of signature-based virus detection:
First, a virus author can change the hash value and file byte count of a virus by making a small modification to its source code, so that a signature that originally detected the virus no longer works. Virus signatures must then be updated constantly, which makes virus detection lag behind.
Second, most compilers apply optimizations such as instruction reordering and register reallocation, so the binary content of object files compiled from identical source code may still differ, which leads to missed detections or false detections when viruses are detected based on signatures and byte counts.
As can be seen, the signatures provided by the related art are extremely sensitive to changes in a virus, lack broad-spectrum coverage in detecting viruses, and lag behind in detecting new viruses.
Summary of the invention
Embodiments of the present invention provide a virus signature processing method and processing device that can improve the broad-spectrum coverage and timeliness of virus signatures.
The technical solutions of the embodiments of the present invention are implemented as follows:
In a first aspect, an embodiment of the present invention provides a virus signature processing method, including:
disassembling a malicious sample that carries a virus, and splitting the resulting disassembled code into multiple code blocks of the malicious sample;
traversing each code block to obtain the function calls executed in the code block, comparing the target path of each function call with the paths of application programming interface (API) functions, and determining the API functions called in the code block and the number of times each API function is called;
building a corresponding code block feature based on the API functions called in the code block and the number of times the API functions are called; and
merging the code block features of the code blocks of the malicious sample to form the virus signature of the malicious sample.
In a second aspect, an embodiment of the present invention provides a virus signature processing device, including:
a disassembly and splitting unit, configured to disassemble a malicious sample that carries a virus and split the resulting disassembled code into multiple code blocks of the malicious sample;
a function call unit, configured to traverse each code block to obtain the function calls executed in the code block, compare the target path of each function call with the paths of API functions, and determine the API functions called in the code block and the number of times each API function is called;
a feature building unit, configured to build a corresponding code block feature based on the API functions called in the code block and the number of times the API functions are called; and
a feature merging unit, configured to merge the code block features of the code blocks of the malicious sample to form the virus signature of the malicious sample.
In a third aspect, an embodiment of the present invention provides a virus signature processing device including a memory and a processor, where the memory stores executable instructions that cause the processor to perform the virus signature processing method provided by the embodiments of the present invention.
In a fourth aspect, an embodiment of the present invention provides a storage medium storing executable instructions that cause a processor to perform the virus signature processing method provided by the embodiments of the present invention.
Embodiments of the present invention have the following beneficial effects:
The processing relies on the computing capability of a device (such as a terminal or a server) and can be completed efficiently. Meanwhile, the signature is built from features of the malicious sample's API function calls. Compared with the related art, which uses the hash value of the malicious sample itself, the features of the sample's API function calls accurately reflect the semantics of the malicious sample when it carries out its malicious purpose and are not affected by changes in the sample's hash value or byte count, so broad-spectrum virus detection can be achieved. In addition, because the API calls in a malicious sample are relatively stable, a signature built from features of API function calls can detect a virus after it has evolved, which avoids the lag of the signature-based virus detection provided by the related art.
Brief description of the drawings
Fig. 1 is an optional processing schematic diagram, provided by an embodiment of the present invention, of extracting a virus signature and detecting, based on the virus signature, whether a sample carries a virus;
Fig. 2 is an optional processing schematic diagram of the virus signature processing method provided by an embodiment of the present invention;
Fig. 3 is an optional schematic flowchart of the virus signature processing method provided by an embodiment of the present invention;
Fig. 4 is an optional schematic diagram of the virus signature processing device provided by an embodiment of the present invention deployed on a network-side server;
Fig. 5 is an optional schematic diagram of the software and hardware structure of the virus signature processing device 10 provided by an embodiment of the present invention;
Fig. 6 is another optional schematic flowchart of the signature processing method provided by an embodiment of the present invention;
Fig. 7-1 is an optional schematic flowchart, provided by an embodiment of the present invention, of extracting the API functions provided by an operating system and storing them in an API function library;
Fig. 7-2 is an optional schematic flowchart, provided by an embodiment of the present invention, of computing the signature of the virus carried by a malicious sample in a malicious sample library;
Fig. 7-3 is an optional schematic flowchart, provided by an embodiment of the present invention, of detecting whether a sample to be detected carries a virus;
Fig. 8 is an optional schematic diagram, provided by an embodiment of the present invention, of disassembling an executable file;
Fig. 9-1 is an optional schematic diagram, provided by an embodiment of the present invention, of splitting disassembled code into code blocks based on a code tree;
Fig. 9-2 is an optional schematic diagram, provided by an embodiment of the present invention, of splitting disassembled code into code blocks based on a code tree;
Fig. 9-3 is an optional schematic diagram, provided by an embodiment of the present invention, of splitting disassembled code into code blocks based on a code tree;
Fig. 10 is an optional processing schematic diagram, provided by an embodiment of the present invention, of extracting the API functions provided by an operating system and storing them in an API function library;
Fig. 11 is an optional processing schematic diagram, provided by an embodiment of the present invention, of computing signature similarity;
Fig. 12 is an optional structural schematic diagram of the signature processing device 20 provided by an embodiment of the present invention.
Detailed description
The present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the embodiments described here are only intended to explain the present invention and not to limit it. In addition, the embodiments provided below are some, rather than all, of the embodiments for implementing the present invention. Embodiments obtained by those skilled in the art, without creative effort, by recombining the technical solutions of the following embodiments, and other embodiments based on the invention, all fall within the protection scope of the present invention.
It should be noted that, in the embodiments of the present invention, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a method or device that includes a series of elements includes not only the elements explicitly recited but also other elements not explicitly listed, or elements inherent to implementing the method or device. Unless further restricted, an element introduced by the phrase "including a ..." does not exclude the presence of other related elements (such as steps in a method or units in a device) in the method or device that includes the element.
For example, the virus signature processing method provided by the embodiments of the present invention contains a series of steps but is not limited to the steps described. Similarly, the virus signature processing device provided by the embodiments of the present invention includes a series of units but is not limited to the units explicitly recited; it may also include units that need to be provided to obtain relevant information or to perform processing based on that information.
Before the present invention is described in further detail, the nouns and terms involved in the embodiments of the present invention are explained. The following explanations apply to the nouns and terms involved in the embodiments of the present invention.
1) Virus, also called computer virus or malicious code: binary code that its author implants in a terminal (any of various computing terminals such as smartphones, tablets, laptops, and desktop computers) for malicious purposes such as destroying the terminal's functions, destroying data, or stealing data.
2) Sample: a general term for the various types of application programs, such as applications for Microsoft Windows, Unix, iOS, and Android systems.
3) Malicious sample: a sample that contains a virus.
4) Normal sample: a sample that does not contain a virus.
5) Code block: a block formed by dividing the disassembled code of an application program at a certain granularity.
6) Application programming interface (API): that is, API functions implemented in various programming languages. They are programming interfaces through which the operating system or a program library provides various services (or functions) to application programs, and they help applications do things such as open windows, draw graphics, and use terminal functions (such as imaging and positioning).
7) Function: that is, a subroutine. Besides implementing a fixed computation, it has one entry and one exit. The entry is the set of parameters the function takes; parameter values are passed into the subroutine through this entry for processing. The exit is the function's return value; once computed, it is returned to the caller through the exit.
8) Code block feature, also simply called a feature here: a digitized feature generated by encoding (for example with a hash algorithm or BASE64 encoding) the characteristics of the code block's behavior of calling API functions.
9) Signature: the basic form of a signature is the set of features of the code blocks of a sample. A signature may also include overall features of the sample, such as its byte count (that is, the storage space the sample occupies).
When a virus is detected based on the virus signatures provided by the related art, the signature of the sample to be detected is matched against the virus signature. For example, the hash value of the sample itself (such as the application's file) is matched against the hash value in the signature, and the binary byte count of the sample (the size of the sample in bytes) is matched against the file byte count in the signature. The virus signatures provided by the related art generally take the following forms:
Format 1: hash string (HashString);file byte count (FileSize);malware name (MalwareName)
An example of this format is:
507d8f868c27feb88b18e6f8426adf1c;12391;Win.Exploit.CVE_2013_3163
Format 2: MalwareName=HexSignature
An example of a signature in format 2 is:
Trojan.URLspoof.gen (Clam)=2e687265663d756e6573636170652827*3a2f2f*
As can be seen, such virus signatures are very sensitive to changes in the sample. Even a slight change to a malicious sample changes its hash value and byte count, so the virus signature that could originally detect the virus carried in the malicious sample fails. This reduces the broad-spectrum coverage of signature-based virus detection and makes the detection of new viruses lag behind.
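The sensitivity described above can be illustrated with a short sketch. It assumes, purely for illustration, that the hash string in format 1 is an MD5 digest of the file content (the text does not name the hash algorithm), and it uses made-up byte strings in place of real samples. The names `Format1Signature`, `parse_format1`, and `matches` are hypothetical.

```python
import hashlib
from typing import NamedTuple


class Format1Signature(NamedTuple):
    hash_string: str   # HashString field
    file_size: int     # FileSize field, in bytes
    malware_name: str  # MalwareName field


def parse_format1(line: str) -> Format1Signature:
    """Parse a 'HashString;FileSize;MalwareName' signature line."""
    hash_string, file_size, malware_name = line.strip().split(";")
    return Format1Signature(hash_string, int(file_size), malware_name)


def matches(sample: bytes, sig: Format1Signature) -> bool:
    """Related-art style match: whole-file hash and byte count must both agree.

    Using MD5 here is an assumption of this sketch.
    """
    digest = hashlib.md5(sample).hexdigest()
    return digest == sig.hash_string and len(sample) == sig.file_size


# Made-up stand-ins for a malicious sample and a trivially modified variant.
original = b"payload: send_sms(); read_contacts();"
variant = original + b" "  # one extra byte, same behavior

sig = Format1Signature(hashlib.md5(original).hexdigest(), len(original), "Demo.Malware")
parsed = parse_format1("507d8f868c27feb88b18e6f8426adf1c;12391;Win.Exploit.CVE_2013_3163")

print(matches(original, sig))  # the unmodified sample is detected
print(matches(variant, sig))   # the one-byte variant is missed
```

A single appended byte changes both the digest and the byte count, so the whole-file signature no longer matches even though the behavior of the sample is unchanged.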
To address the problem that the related art performs no semantic analysis of the malicious sample when extracting a virus signature, the embodiments of the present invention provide a scheme that builds the signature from features of the malicious sample's API calls in order to detect viruses. The behavioral features of the API function calls in each code block better eliminate the interference that compiler optimization strategies and the virus author's source-code modifications introduce into the signature. This improves the broad-spectrum coverage of the signature, avoids lag in virus detection, and improves the efficiency and precision of virus detection.
Specifically, referring to Fig. 1, Fig. 1 is an optional processing schematic diagram, provided by an embodiment of the present invention, of extracting a virus signature and detecting, based on the virus signature, whether a sample carries a virus. It involves three parts: API function library generation, virus signature library generation, and sample detection, which are described separately below.
1) API function library generation: detect the API functions provided by the terminal's operating system (that is, API functions integrated in the libraries of the terminal's operating system, called library functions for short) and third-party API functions (for example, API functions in third-party libraries embedded in the terminal's operating system, called third-party library functions for short, or API functions provided by applications installed on the terminal), and store the extracted API functions in the API function library. For example, each API function is stored in the API function library as a 2-tuple of the form <encoding result of the API function's path on the terminal (for example a hash encoding or BASE64 encoding), API function identifier>.
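The 2-tuple storage described above can be sketched as a mapping from the encoded path to the identifier. This is a minimal illustration, not the patent's implementation: SHA-256 is chosen here as the hash encoding, identifiers are assumed to be sequence numbers starting at 1, and the paths and the names `encode_path`, `build_api_library`, and `lookup_api_id` are hypothetical placeholders.

```python
import hashlib
from typing import Dict, Iterable, Optional


def encode_path(path: str) -> str:
    """Encode an API function path. SHA-256 is an assumption of this sketch."""
    return hashlib.sha256(path.encode("utf-8")).hexdigest()


def build_api_library(paths: Iterable[str]) -> Dict[str, int]:
    """Store each API function as <encoded path, identifier>."""
    library: Dict[str, int] = {}
    for path in paths:
        key = encode_path(path)
        if key not in library:               # ignore duplicate paths
            library[key] = len(library) + 1  # identifier = sequence number
    return library


def lookup_api_id(library: Dict[str, int], path: str) -> Optional[int]:
    """Return the identifier of the API function at this path, or None."""
    return library.get(encode_path(path))


# Placeholder paths, not real operating system APIs.
api_library = build_api_library(["pkg.Net.open", "pkg.Sms.send", "pkg.Net.open"])
print(lookup_api_id(api_library, "pkg.Sms.send"))
print(lookup_api_id(api_library, "app.Internal.helper"))
```

Keying the library by the encoded path lets a later lookup decide with one dictionary access whether the target of a function call is a known API function.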
2) Signature library generation: compare the target path of each function call in each code block of a known malicious sample with the paths of the API functions in the API function library, and detect the features of the code block's API function calls (that is, the code block feature), which include the identifiers of the API functions called in the code block and the number of times each of those API functions is called in the code block.
A code block feature is stored as a sequence of the form <identifier of the called API function, number of calls to the API function>, .... The code block features of the malicious sample are merged to form the virus signature, which is stored in the virus signature library.
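As a rough sketch of the code block feature described above, the following counts the API functions called in one code block. For brevity it maps plain paths directly to identifiers instead of encoded paths, and it sorts the pairs by identifier so that the feature does not depend on the order of calls; both choices, the placeholder paths, and the name `code_block_feature` are assumptions of this sketch.

```python
from collections import Counter
from typing import Dict, List, Tuple


def code_block_feature(call_targets: List[str],
                       api_ids: Dict[str, int]) -> List[Tuple[int, int]]:
    """Return [(api_id, call_count), ...] for one code block.

    Call targets that are not in the API library (for example the sample's
    own internal functions) are ignored.
    """
    counts = Counter(api_ids[t] for t in call_targets if t in api_ids)
    return sorted(counts.items())


# Placeholder API library and the call targets found in one code block.
api_ids = {"pkg.Net.open": 1, "pkg.Sms.send": 2, "pkg.File.delete": 3}
block_calls = ["pkg.Net.open", "app.Internal.helper", "pkg.File.delete", "pkg.Net.open"]

feature = code_block_feature(block_calls, api_ids)
print(feature)  # [(1, 2), (3, 1)]
```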
3) Sample detection: extract a signature from the sample to be detected, compare the signature of the sample to be detected with the virus signatures, and determine from the similarity of the signatures whether the sample to be detected carries a virus.
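The text at this point does not specify how signature similarity is computed. The sketch below therefore uses Jaccard similarity over the sets of code block features and a threshold of 0.8, both of which are assumptions chosen only to make the comparison step concrete; the names `signature_similarity` and `carries_virus` are hypothetical.

```python
from typing import Iterable, Sequence, Tuple

# One code block feature: ((api_id, call_count), ...), hashable so it can be in a set.
BlockFeature = Tuple[Tuple[int, int], ...]
Signature = Sequence[BlockFeature]


def signature_similarity(a: Signature, b: Signature) -> float:
    """Jaccard similarity of two signatures (an assumed metric)."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


def carries_virus(sample_sig: Signature,
                  virus_sigs: Iterable[Signature],
                  threshold: float = 0.8) -> bool:
    """Flag the sample if it is similar enough to any known virus signature."""
    return any(signature_similarity(sample_sig, v) >= threshold for v in virus_sigs)


virus_sig = [((1, 2), (3, 1)), ((2, 1),), ((4, 3),)]
variant_sig = [((4, 3),), ((1, 2), (3, 1)), ((2, 1),)]    # same blocks, different order
unrelated_sig = [((1, 2), (3, 1)), ((2, 1),), ((5, 1),)]  # shares two of three blocks

print(carries_virus(variant_sig, [virus_sig]))
print(carries_virus(unrelated_sig, [virus_sig]))
```

Because the comparison works on API-call features rather than on raw bytes, a recompiled or lightly edited variant with the same calls per code block still scores as identical.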
Referring to Fig. 2 and Fig. 3, Fig. 2 is an optional processing schematic diagram of the virus signature processing method provided by an embodiment of the present invention, and Fig. 3 is an optional schematic flowchart of that method. To extract the virus signature from a malicious sample that contains a virus, the method proceeds as follows: disassemble the malicious sample that carries the virus and split the resulting disassembled code into multiple code blocks of the malicious sample (step 101); traverse each code block to obtain the function calls executed in it, compare the target path of each function call with the paths of API functions, and determine the API functions called in the code block and the number of times each is called (step 102); build a corresponding code block feature based on the API functions called in the code block and the number of times they are called (step 103); and build the virus signature of the virus carried by the malicious sample from the features of the code blocks of the malicious sample (step 104).
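Steps 101 to 104 can be sketched end to end as follows. The sketch assumes that disassembly has already been done by an external tool and that the input is Smali-like text; it splits code blocks at `.method` / `.end method` boundaries and drops blocks that call no API function, which are simplifications of this sketch rather than the splitting granularity described in the patent. The disassembly text, the API library, and all function names are hypothetical.

```python
import re
from collections import Counter
from typing import Dict, List, Tuple

# Matches lines such as: invoke-virtual {v0, v1}, Lpkg/Net;->open()V
INVOKE_RE = re.compile(r"^\s*invoke-[\w/-]+\s*\{[^}]*\},\s*(\S+)")


def split_into_code_blocks(disassembly: str) -> List[List[str]]:
    """Step 101 (second half): one code block per method body (assumed granularity)."""
    blocks: List[List[str]] = []
    current = None
    for line in disassembly.splitlines():
        stripped = line.strip()
        if stripped.startswith(".method"):
            current = []
        elif stripped.startswith(".end method"):
            if current is not None:
                blocks.append(current)
            current = None
        elif current is not None:
            current.append(stripped)
    return blocks


def block_feature(block: List[str], api_ids: Dict[str, int]) -> List[Tuple[int, int]]:
    """Steps 102 and 103: find call targets, keep API functions, count calls."""
    targets = [m.group(1) for m in map(INVOKE_RE.match, block) if m]
    counts = Counter(api_ids[t] for t in targets if t in api_ids)
    return sorted(counts.items())


def build_signature(disassembly: str, api_ids: Dict[str, int]) -> List[List[Tuple[int, int]]]:
    """Step 104: merge the code block features into the sample's signature."""
    features = [block_feature(b, api_ids) for b in split_into_code_blocks(disassembly)]
    return [f for f in features if f]  # drop blocks without API calls (assumption)


SMALI = """
.method public a()V
    invoke-virtual {v0}, Lpkg/Net;->open()V
    invoke-virtual {v0}, Lpkg/Net;->open()V
    invoke-static {}, Lpkg/Local;->helper()V
.end method
.method public b()V
    invoke-static {v1}, Lpkg/Sms;->send(Ljava/lang/String;)V
.end method
"""
API_IDS = {"Lpkg/Net;->open()V": 1, "Lpkg/Sms;->send(Ljava/lang/String;)V": 2}

signature = build_signature(SMALI, API_IDS)
print(signature)  # [[(1, 2)], [(2, 1)]]
```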
The above steps can be automated by machine processing and completed efficiently by relying on the computing capability of a device (such as a terminal or a server). Meanwhile, the virus signature is built from features of the API function calls in each code block of the malicious sample. Compared with the related art, which builds the virus signature from the hash value and byte count of the malicious sample's own binary data, the features of the sample's API function calls accurately reflect the semantics of the malicious sample when it carries out its malicious purpose and are not affected by changes in the hash value or byte count of the sample's binary data, so broad-spectrum virus detection can be achieved. In addition, even if the virus distributor modifies the virus carried by a malicious sample, the features of the API function calls remain relatively stable across samples that carry viruses of the same family. A signature built from features of API function calls can therefore detect the virus after it has evolved, which avoids the lag of the signature-based virus detection provided by the related art.
Embodiments of the present invention also provide a virus signature processing device for performing the above virus signature processing method. The hardware of the virus signature processing device can be deployed entirely on a user-side terminal or on a network-side server.
For example, the device is provided as an antivirus application on the terminal. The terminal periodically pulls malicious samples from a malicious sample library, extracts and stores their virus signatures, and, based on the virus signatures, performs a security scan of the applications to be installed and the applications already installed on the terminal (the samples to be detected). Results are handled according to the terminal's local security policy, for example: 1) blocking installation of a to-be-installed application detected to contain a virus; 2) isolating an installed application detected to contain a virus; 3) prompting the user and handling the application in the way the user selects.
As another example, referring to Fig. 4, Fig. 4 is an optional schematic diagram of the virus signature processing device provided by an embodiment of the present invention deployed on a network-side server. The server provides a cloud antivirus service: it periodically pulls malicious samples from the malicious sample library and extracts their virus signatures, stores the signatures extracted from the malicious samples in the virus signature library, scans, based on the virus signatures, the signatures of the samples to be detected submitted by the terminal's antivirus application, and sends the scan results to the terminal's antivirus application. Results are handled according to the terminal's local security policy, for example: 1) blocking installation of a to-be-installed application detected to contain a virus; 2) isolating an installed application detected to contain a virus; 3) prompting the user and handling the application in the way the user selects.
Referring to Fig. 5, an optional schematic diagram of the software and hardware structure of the virus signature processing device 10, the virus signature processing device 10 includes a hardware layer, an intermediate layer, an operating system layer, and a software layer. However, those skilled in the art should understand that the structure shown in Fig. 5 is only an example and does not limit the structure of the virus signature processing device 10. For example, the virus signature processing device 10 may be provided with more components than shown in Fig. 5 as implementation requires, or some components may be omitted as implementation requires.
The hardware layer of the virus signature processing device 10 includes a processor 11, an input/output interface 13, a storage medium 14, and a network interface 12; the components can communicate through a system bus.
The processor 11 can be implemented as a central processing unit (CPU), a microcontroller unit (MCU), an application-specific integrated circuit (ASIC), or a field-programmable gate array (FPGA).
The input/output interface 13 can be implemented with input/output devices such as a display screen, a touch screen, and a loudspeaker.
The storage medium 14 can be implemented with non-volatile storage media such as flash memory, hard disks, and optical discs, or with volatile storage media such as double data rate (DDR) dynamic memory. It stores executable instructions for performing the above virus signature processing method.
For example, the storage medium 14 can be located together with the other components of the virus signature processing device 10 (for example on a user-side terminal) or be distributed relative to the other components. The network interface 12 gives the processor 11 access to external data, such as a storage medium 14 located remotely. For example, the network interface 12 can support short-range communication based on near field communication (NFC), Bluetooth, or ZigBee technology; cellular communication based on standards such as code division multiple access (CDMA) and wideband code division multiple access (WCDMA) and their evolved standards; and communication in which the network side is accessed through a wireless access point (AP) using Wireless Fidelity (Wi-Fi).
The driver layer includes middleware 15 that lets the operating system 16 recognize the hardware layer and communicate with its components; for example, it can be the set of drivers for the components of the hardware layer.
The operating system 16 provides a user-facing graphical interface that includes, for example, plug-in icons, a desktop background, and application icons, and it supports the user in controlling the terminal through the graphical interface. The embodiments of the present invention do not limit the software environment of the terminal, such as the operating system type or version; it can be, for example, a Linux operating system, a UNIX operating system, or another operating system.
The application layer includes the antivirus application running on the user-side terminal or the cloud antivirus service 17, or a module (or functional plug-in) that can be coupled with security software on the terminal. It contains executable instructions for performing the above virus signature processing method.
The signature processing method shown in Fig. 2 is further described below with reference to Fig. 6. It should be noted that, based on the following description of Fig. 6, those skilled in the art can readily implement the method in a scenario in which the signature processing device is deployed on the user terminal side.
Referring to Fig. 6, Fig. 6 is another optional schematic flowchart of the signature processing method provided by an embodiment of the present invention, which includes the following steps:
Step 201: the server extracts the API functions provided in the terminal's operating system and/or extracts the third-party API functions on the terminal.
In one embodiment, the server pulls, from a database dedicated to collecting and storing API functions, the API functions provided in different types of operating systems, distinguished by version for each type of operating system, and also pulls third-party API functions. Of course, provided that the server and the terminal have established a security authentication mechanism, the server can pull these API functions directly from the terminal over a secure connection with the terminal. The different types of API functions are described separately below.
1) API functions provided in the terminal's operating system
The API functions provided in the terminal's operating system are the library functions that the operating system provides natively. Library functions are stored as libraries in the terminal's file system and support the basic capabilities that applications on the terminal use. Examples include the following types of API functions:
1.1) Network API functions, for creating or closing network connections and enumerating network resources.
1.2) Message processing API functions, for passing messages between windows.
1.3) File processing API functions, for file-related operations such as creating, copying, and deleting.
1.4) Printing API functions, for supporting applications on the terminal in implementing printing.
1.5) Drawing API functions, for implementing drawing.
2) Third-party API functions on the terminal
These are API functions of third-party libraries additionally injected into the terminal's operating system to provide extended functions, such as building various software development environments, and API functions corresponding to the proprietary functions provided by the various third-party applications on the terminal. Taking the WeChat client as an example, the API functions can be those provided by the WeChat software development kit (SDK) for functions such as WeChat Pay and sharing to Moments. The location from which third-party API functions are extracted varies according to where the SDK files of the different third-party applications are stored on the terminal.
Step 202: the server encodes the paths of the detected API functions and stores the encoding result of each API function's path in the API function library together with the identifier of the API function.
The path of an API function is a character string composed of "package name + class name + API function name" that uniquely locates and identifies one API function.
For the API functions provided by the terminal operating system, refer to Fig. 7-1 and Fig. 10. Fig. 7-1 is an optional schematic flowchart, provided by an embodiment of the present invention, of extracting the API functions provided by the operating system and storing them in the API function library; Fig. 10 is an optional processing schematic of the same procedure. Taking Android as the operating system of the terminal as an example, the API functions of the function classes defined by the Android system are all in two jar packages, core.jar and framework.jar.
First, the core.jar and framework.jar packages of each version of the Android operating system are collected; these jar packages are then parsed, and the paths of all the API functions inside them are extracted.
For example, in Fig. 10, the path of the API function onCallStateChanged(int state, String incomingNumber) is:
android.telephony.PhoneStateListener
void onCallStateChanged(int state, String incomingNumber)
Secondly, the paths of these functions are converted into the form described in the Smali language (Smali code is the code obtained by disassembling a DEX file, the executable file of Android's Dalvik virtual machine), which facilitates matching during subsequent feature extraction.
Still taking the foregoing API function onCallStateChanged(int state, String incomingNumber) as an example, the form converted into the Smali language description is:
Landroid/telephony/PhoneStateListener
onCallStateChanged(ILjava/lang/String;)V
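The conversion from a Java-style path to a Smali-style path can be sketched as follows. This is only an illustrative sketch: the helper names, the small class lookup table, and the joined "class;->method(params)return" layout (the usual Smali notation for a call target) are assumptions and are not taken from the embodiment.

```python
# Java primitive type names and their Dalvik/Smali type descriptors.
PRIMITIVES = {
    "void": "V", "boolean": "Z", "byte": "B", "short": "S", "char": "C",
    "int": "I", "long": "J", "float": "F", "double": "D",
}

# Assumption: unqualified class names are resolved through a small lookup table.
KNOWN_CLASSES = {"String": "java.lang.String", "Object": "java.lang.Object"}


def to_descriptor(java_type):
    """Convert one Java type name to its Smali type descriptor."""
    dims = java_type.count("[]")              # array dimensions, e.g. int[][]
    base = java_type.replace("[]", "").strip()
    if base in PRIMITIVES:
        desc = PRIMITIVES[base]
    else:
        qualified = KNOWN_CLASSES.get(base, base)
        desc = "L" + qualified.replace(".", "/") + ";"
    return "[" * dims + desc


def to_smali_path(class_name, return_type, method, param_types):
    """Build a Smali-style path for a method (hypothetical layout)."""
    params = "".join(to_descriptor(p) for p in param_types)
    return f"{to_descriptor(class_name)}->{method}({params}){to_descriptor(return_type)}"


print(to_smali_path("android.telephony.PhoneStateListener", "void",
                    "onCallStateChanged", ["int", "String"]))
```

A real extractor would resolve class names from the imports of the parsed jar rather than from a fixed table.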
Finally, a hash value is computed for each of these paths in Smali form, and the hash value of the path of each API function is stored in the API function library together with the sequence number (identifier) assigned to that API function.
Still taking the foregoing API function onCallStateChanged(int state, String incomingNumber) as an example, encoding the path in Smali form and assigning a sequence number yields:
<hash value: 4036329264617481551; sequence number: 12>.
The conversion, encoding and sequence-number assignment for the paths of the other API functions in Fig. 10 can be understood from the description above and are not illustrated one by one.
Of course, it should be noted that the encoding result of the path of an API function need not be a hash value computed with a hash algorithm; it can also be obtained with another type of encoding algorithm, such as Base64.
In the API function library, each API function is represented as a two-tuple of the encoding result of its path and its identifier. Suppose the library stores API function i (i is the sequence number of the API function, 1 ≤ i ≤ I, and I is the number of extracted API functions); an optional data structure is:
<hash value of the path of API function i, i>.
The paths of third-party API functions are encoded, and the encoding results are stored in the API function library together with the sequence numbers of the third-party API functions, in the same way as for the API functions provided by the operating system described above; this is not described separately here.
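A minimal sketch of building such an API function library is given here. The embodiment does not name the hash algorithm; truncating SHA-256 to 64 bits is an assumption made only for illustration, and the function names are hypothetical.

```python
import hashlib


def encode_path(path):
    """Encode an API path as a 64-bit integer.

    Assumption: the first 8 bytes of SHA-256 stand in for the unspecified
    hash algorithm of the embodiment.
    """
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def build_api_library(paths):
    """Return {hash of path: sequence number}, numbering distinct paths from 1."""
    library = {}
    for path in paths:
        key = encode_path(path)
        if key not in library:          # skip paths that were already stored
            library[key] = len(library) + 1
    return library


paths = [
    "Landroid/telephony/PhoneStateListener;->onCallStateChanged(ILjava/lang/String;)V",
    "Lcom/example/sdk/Pay;->pay()V",    # hypothetical third-party API path
]
api_library = build_api_library(paths)
```

Storing the library as a dictionary keyed by the hash makes the later lookup of a call target a single constant-time operation.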
As an alternative to step 202, the server stores the path of each detected API function (rather than the encoding result of the path) in the API function library together with the identifier of the API function.
In this case the API function library can represent each API function as a two-tuple of its path and its identifier. Suppose the library stores API function i (i is the sequence number of the API function, 1 ≤ i ≤ I, and I is the number of extracted API functions); an optional data structure is:
<path of API function i, i>.
Step 203: the server pulls malicious samples from a malicious sample library.
The malicious sample library can connect to existing sources of malicious samples, for example by interfacing with virus databases of different families, including:
1) System virus databases. System viruses are generally distinguished in the malicious sample library according to the system they target, with prefixes such as Win32, PE, Win95, W32 and W95.
2) Worm virus databases. The prefix of worm viruses is Worm. The common characteristic of these viruses is that they propagate through the network or through system vulnerabilities; a large proportion of worms send infected e-mail and clog the network.
3) Script virus databases. The prefix of script viruses is Script. The common characteristic of script viruses is that they are written in a scripting language and propagate through web pages.
4) Backdoor virus databases. The prefix of backdoor viruses is Backdoor. The common characteristic of these viruses is that they propagate through the network and open a backdoor into the system.
5) Destructive program virus databases. The prefix of destructive program viruses is Harm. The common characteristic of these viruses is that they present an attractive icon to lure the user into clicking; once the user clicks, the virus directly damages the user's terminal.
For example, according to the real-time requirements of virus scanning, the malicious sample library pulls malicious samples containing viruses from the virus databases of different families at a frequency of once per week, day or hour, either pulling from all the family databases in a unified manner or pulling from each family's database separately according to its update frequency.
Step 204: the server disassembles a malicious sample that contains a virus to obtain disassembled code.
To disassemble a malicious sample, the executable file is first extracted from the sample. The format of the executable file differs according to the operating system on which it runs: executable files are in exe format on the Windows operating system, in elf format on the Linux operating system, and in dex format, elf format and so on on the Android operating system. The executable file is then disassembled. Referring to Fig. 8, Fig. 8 is an optional schematic diagram, provided by an embodiment of the present invention, of disassembling an executable file. The result of the disassembly includes:
1) Uninitialized data (BSS, Block Started by Symbol) segment: a region of memory that stores the uninitialized global variables of the program.
2) Data segment: a region of memory that stores the initialized global variables of the program. It includes a variable data segment and an immutable data segment.
3) Code segment (code segment/text segment): a region of memory that usually stores the executable code (statements).
4) Heap: stores the memory segments that are dynamically allocated while the process runs; its size is not fixed and it can grow or shrink dynamically. When the process allocates memory by calling a function such as malloc, the newly allocated memory is added to the heap (the heap expands); when memory is released with a function such as free, the released memory is removed from the heap.
5) Stack: the stack is created when the process runs, and each process has a process stack. The stack stores the local variables that the program creates temporarily, that is, the variables defined inside functions, not including variables of static type.
Step 205: the server splits the disassembled code to obtain multiple code blocks of the malicious sample.
After disassembly is complete, the code segment of the executable file is traversed and divided into code blocks. Referring to Fig. 8, Fig. 8 also shows an optional processing schematic, in an embodiment of the present invention, of dividing the code segment of an executable file into code blocks. Taking the code segment in Fig. 8 as an example, the disassembled code (the code segment shown in Fig. 8) is split into code blocks at the granularity of a function or of a path of a predetermined level, using the following splitting modes:
Mode 1) Splitting the disassembled code at function granularity to obtain code blocks
The disassembled code segment of the malicious sample is traversed and split at the granularity of a function, yielding the multiple functions that make up the disassembled code (in this case a function is equivalent to a code block). Of course, the code segment can also be split at the granularity of two or more functions to form the code blocks that make up the code segment (in this case each code block includes two or more functions).
A function is the basic logical unit of the code segment, and each function contains one complete piece of processing logic. Splitting the code segment at function granularity therefore makes the disassembled code easy to split on the one hand, and on the other hand fully preserves the logic inside the disassembled code.
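For Smali disassembly, splitting at function granularity can be sketched as follows. The sketch assumes that the disassembled code is available as lines of Smali text, in which a method starts with the `.method` directive and ends with `.end method`; the function name is hypothetical.

```python
def split_by_function(smali_lines):
    """Split disassembled Smali text into code blocks, one block per method."""
    blocks, current = [], None
    for line in smali_lines:
        stripped = line.strip()
        if stripped.startswith(".method"):
            current = [stripped]                 # a new method begins
        elif stripped.startswith(".end method"):
            if current is not None:
                current.append(stripped)
                blocks.append(current)           # the method is complete
            current = None
        elif current is not None:
            current.append(stripped)             # a line inside the method body
    return blocks


sample = [
    ".class public Lcom/example/Demo;",
    ".method public a()V",
    "    return-void",
    ".end method",
    ".method public b()V",
    "    return-void",
    ".end method",
]
blocks = split_by_function(sample)
```

Lines outside any method, such as the `.class` header, are ignored, so every returned block holds exactly one method.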
Mode 2) Splitting at the granularity of paths of different levels of the code tree to obtain code blocks
Referring to Fig. 9-1, Fig. 9-1 is an optional schematic diagram, provided by an embodiment of the present invention, of splitting disassembled code into code blocks based on the code tree. According to the paths of a preset level in the code tree (the tree includes first-level, second-level and third-level paths), the code under each first-level path is divided into a separate code block; of course, the code under each second-level path beneath a first-level path can instead be divided into a separate code block.
Referring to Fig. 9-2, Fig. 9-2 is an optional schematic diagram, provided by an embodiment of the present invention, of splitting disassembled code into code blocks based on the code tree. Where the malicious sample is an application that runs on the Android operating system, the executable file in Dex format is extracted from the application and disassembled, yielding disassembled code described in the Smali language, which is then divided into code blocks. For example, the class-level path can be chosen, and the Dex is divided into code blocks corresponding to class-level paths, each code block corresponding to one class in the Dex.
In Fig. 9-2, each code block corresponds to one class in the disassembled code, specifically:
Code block 1: com.android.internal.app.ActionBarImpl,
Code block 2: com.android.internal.app.AlertActivity,
Code block 3: com.android.internal.app.AlertController,
……
Of course, the server can also split the disassembled code using paths of any other level. For example, referring to Fig. 9-3, Fig. 9-3 is an optional schematic diagram, provided by an embodiment of the present invention, of splitting disassembled code into code blocks based on the code tree. The disassembled code can be split according to the first four levels of path in the code tree shown in Fig. 9-3, each resulting code block corresponding to one four-level path in the code tree, specifically:
Code block 1: com.android.internal.app,
Code block 2: com.android.internal.appwidget,
Code block 3: com.android.internal.backup,
……
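Splitting by a path of a chosen level can be sketched as grouping fully qualified class names by their first few path components. The function name is hypothetical, and the class names ExampleA and ExampleB are placeholders that do not come from the figures.

```python
from collections import defaultdict


def split_by_path_level(class_names, level):
    """Group fully qualified class names by their first `level` path components."""
    blocks = defaultdict(list)
    for name in class_names:
        key = ".".join(name.split(".")[:level])
        blocks[key].append(name)
    return dict(blocks)


classes = [
    "com.android.internal.app.ActionBarImpl",
    "com.android.internal.app.AlertActivity",
    "com.android.internal.appwidget.ExampleA",   # placeholder class name
    "com.android.internal.backup.ExampleB",      # placeholder class name
]
by_package = split_by_path_level(classes, 4)     # one block per four-level path
by_class = split_by_path_level(classes, 5)       # one block per class
```

With level 4 the two classes under com.android.internal.app fall into the same code block; with level 5 each class becomes its own code block, which corresponds to the class-level split.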
Step 206: the server traverses each code block of the disassembled code to obtain the function calls executed in each code block, compares the target path of each function call with the paths of the application program interface functions in the API function library, and determines the application program interface functions called in each code block and the number of times each is called.
Depending on the data structure in which API functions are stored in the API function library, the server can compare the target path of a function call in code block j (1 ≤ j ≤ J, where J is the number of code blocks obtained by splitting the disassembled code) with the path of an application program interface function in the following ways:
Mode 1) The API function library stores API functions in the form <path of API function i, i>. The server matches the target path of a function call detected in code block j against the path of API function i in the library field by field. When every field of the path matches, the currently detected function call in code block j is determined to be a call to API function i, and the call count of API function i in code block j is incremented by 1.
Mode 2) The API function library stores API functions in the form <encoding result (such as a hash value) of the path of API function i, i>. The server encodes the target path of a function call detected in code block j (using the same encoding as for the paths in the library, for example the same hash algorithm) and compares the result with the encoding result of the path of API function i in the library. If the encoding results are identical, the paths are identical; the currently detected function call in code block j is determined to be a call to API function i, and the call count of API function i in code block j is incremented by 1.
Obviously, judging whether paths are identical by comparing their encoding results is more efficient than comparing each field of the paths one by one, and the gain is especially significant when the paths of the API functions are long.
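Both comparison modes can be sketched with one function that counts API calls in a code block. The sketch assumes Smali disassembly, in which calls appear as invoke instructions followed by the call target; the regular expression, the function name and the `encode` parameter are illustrative assumptions.

```python
import re
from collections import Counter

# Captures the call target of a Smali invoke instruction, for example:
#   invoke-virtual {v0, v1}, Lpkg/Cls;->name(I)V
INVOKE_TARGET = re.compile(r"^\s*invoke-[\w/]+\s*\{[^}]*\},\s*(\S+)")


def count_api_calls(block_lines, api_library, encode=None):
    """Count calls to known API functions in one code block.

    api_library maps a path (Mode 1) or an encoded path (Mode 2) to the
    sequence number of the API function. For Mode 2, pass as `encode` the
    same encoding function that was used to build the library.
    """
    counts = Counter()
    for line in block_lines:
        match = INVOKE_TARGET.match(line)
        if not match:
            continue                      # not a function call
        target = match.group(1)
        key = encode(target) if encode else target
        api_id = api_library.get(key)
        if api_id is not None:            # the target is a known API function
            counts[api_id] += 1
    return counts


API = "Landroid/telephony/PhoneStateListener;->onCallStateChanged(ILjava/lang/String;)V"
block = [
    "    invoke-virtual {v0, v1, v2}, " + API,
    "    const/4 v0, 0x0",
    "    invoke-virtual {v0, v1, v2}, " + API,
    "    invoke-static {}, Lcom/example/Util;->helper()V",   # not in the library
]
counts = count_api_calls(block, {API: 12})
```

In this sketch Mode 1 is reduced to an exact string lookup instead of a field-by-field comparison, which gives the same result when every field has to match.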
Step 207: a corresponding code block feature is built from the application program interface functions called in the code block and the number of times each is called.
In one embodiment, for each code block, the identifier of each API function called in the code block and the number of times that API function is called in the code block form one feature element, so that each API function called in the code block yields one feature element. The feature elements of all the API functions called in the code block form a set, and the set is encoded to form the code block feature.
Still taking code block j as an example, each API function k called in code block j (1 ≤ k ≤ K, where K is the number of distinct API functions called in code block j) forms feature element k, recorded in the form <sequence number of API function k, number of times API function k is called in code block j>. The set for code block j is then {<sequence number of API function k, number of times API function k is called in code block j>; 1 ≤ k ≤ K}. The set is encoded (for example with a hash algorithm), and the encoding result is used as the code block feature.
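Encoding the set of feature elements can be sketched as follows. The canonical string form, the use of SHA-256 and the 32-bit width of the result are assumptions chosen for illustration; the embodiment only states that the set is encoded, for example with a hash algorithm.

```python
import hashlib


def code_block_feature(call_counts):
    """Encode {API sequence number: call count} as a single integer feature."""
    # Sort the elements so that the feature does not depend on the order in
    # which the calls were encountered (a set has no order).
    elements = sorted(call_counts.items())
    canonical = ";".join(f"{api_id}:{count}" for api_id, count in elements)
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")      # assumed 32-bit feature


feature = code_block_feature({12: 3, 15: 1, 22: 1})
```

Because the elements are sorted before hashing, two code blocks that call the same API functions the same number of times receive the same feature regardless of call order.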
Step 208: the code block features of the code blocks of the malicious sample are merged to form the virus signature of the malicious sample, which is stored in the virus signature library.
Still taking code block j as an example (1 ≤ j ≤ J, where J is the number of code blocks obtained by splitting the disassembled code of the malicious sample), with code block feature j corresponding to code block j, the virus signature of the virus carried by the malicious sample can take the following form: {<code block feature 1>; <code block feature 2>; ……; <code block feature J>}.
The foregoing steps 204 to 207 are the processing flow for computing the virus signature of the virus carried by one malicious sample pulled from the malicious sample library. For the multiple malicious samples in the library, the signature computation of steps 204 to 207 is executed in a loop. Referring to Fig. 7-2, Fig. 7-2 is an optional schematic flowchart, provided by an embodiment of the present invention, of computing the virus signatures of the viruses carried by the malicious samples in the malicious sample library: a malicious sample is taken at random from the library and the virus signature of the virus it carries is computed according to steps 204 to 207, until all the malicious samples in the library have been traversed.
The server assigns an identifier (sequence number VID) to the virus corresponding to each computed virus signature, and stores all the virus signatures in the virus signature library as two-tuples of the form <virus sequence number, virus signature>.
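Merging code block features and storing the result under a virus sequence number can be sketched as follows. Representing a signature as a frozenset and numbering the viruses from 1 in input order are assumptions; the function name is hypothetical.

```python
def build_virus_library(samples):
    """Build {VID: virus signature} from the code block features of each sample.

    `samples` is an iterable with one list of code block features per
    malicious sample. Duplicate features within one sample collapse,
    because the signature is modeled as a set.
    """
    library = {}
    for vid, block_features in enumerate(samples, start=1):
        library[vid] = frozenset(block_features)
    return library


# Code block features of malicious sample A from the example of Fig. 11.
virus_library = build_virus_library([[1800939131, 1369398484, 2596230670]])
```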
In addition, it should be pointed out that the foregoing takes as an example storing in the function library the API functions extracted from the terminal (e.g., the API functions provided by the terminal operating system and/or the third-party API functions in the terminal). Because the extracted functions are stored in the API function library in advance, the calls to API functions in a code block can be located quickly from the API functions stored in the library when the code blocks of a malicious sample's disassembled code are later traversed, which ensures processing efficiency.
However, it can be understood that, when the server has sufficient computing capability, the step of maintaining the API function library in the embodiment of the present invention can be omitted. The server can extract API functions from the terminal only when it needs to detect the calls to API functions in a code block, and store the extracted API functions in a cache local to the server (for example, the encoding results of the paths of the API functions and their sequence numbers), so that no function library needs to be maintained separately. In this way the paths of the API functions are always up to date, which avoids the virus signatures lagging behind when the API functions in the terminal change.
Step 209: the server extracts the feature code of a sample to be detected, compares the feature code of the sample to be detected with the virus signature to obtain their similarity, and judges from the similarity whether the sample to be detected carries a virus.
Referring to Fig. 7-3, Fig. 7-3 is an optional schematic flowchart, provided by an embodiment of the present invention, of detecting whether a sample to be detected carries a virus; the following description refers to Fig. 7-3.
First, for any sample to be detected, the server extracts the corresponding feature code from the sample, denoted df.
Specifically, the server extracts the executable file from the sample to be detected and disassembles the extracted executable file to obtain disassembled code, which is split in the same ways as the disassembled code of a malicious sample: Mode 1) splitting the disassembled code at function granularity to obtain code blocks; Mode 2) splitting at the granularity of paths of different levels of the code tree to obtain code blocks.
The server traverses each code block of the disassembled code to obtain the function calls executed in each code block, compares the target path of each function call with the paths of the application program interface functions in the API function library, and determines the application program interface functions called in each code block and the number of times each is called.
For example, depending on the data structure in which API functions are stored in the API function library, the server can compare the target path of a function call in code block j (1 ≤ j ≤ J, where J is the number of code blocks obtained by splitting the disassembled code) with the path of an application program interface function in the following ways:
Mode 1) The API function library stores API functions in the form <path of API function i, i>. The server matches the target path of a function call detected in code block j against the path of API function i in the library field by field. When every field of the path matches, the currently detected function call in code block j is determined to be a call to API function i, and the call count of API function i in code block j is incremented by 1.
Mode 2) The API function library stores API functions in the form <encoding result (such as a hash value) of the path of API function i, i>. The server encodes the target path of a function call detected in code block j (using the same encoding as for the paths in the library, for example the same hash algorithm) and compares the result with the encoding result of the path of API function i in the library. If the encoding results are identical, the paths are identical; the currently detected function call in code block j is determined to be a call to API function i, and the call count of API function i in code block j is incremented by 1.
A corresponding code block feature is built from the application program interface functions called in the code block and the number of times each is called. For each code block, the identifier of each API function called in the code block and the number of times that API function is called in the code block form one feature element, so that each API function called in the code block yields one feature element. The feature elements of all the API functions called in the code block form a set, and the set is encoded to form the code block feature. The code block features of the code blocks of the sample to be detected are merged to form the feature code of the sample to be detected.
Secondly, a virus signature and its corresponding sequence number are extracted from the virus signature library; let the currently extracted virus signature be vf and its sequence number be VID.
Next, the feature code df of the sample to be detected (for example, a software installation package in the apk format of the Android operating system) is compared with the virus signature vf from the virus signature library, to obtain the number S of code block features that df and vf have in common.
A specific example of computing the similarity is given here. Referring to Fig. 11, Fig. 11 is an optional processing schematic, provided by an embodiment of the present invention, of computing feature code similarity.
In Fig. 11, suppose that splitting the disassembled code of the virus-carrying malicious sample A yields 3 code blocks, denoted A1, A2 and A3.
The API functions called in code block A1 and their call counts are recorded as two-tuples (API function sequence number, call count), so the API functions called by code block A1 and their call counts are expressed as the set {(12,3), (15,1), (22,1)}; hashing the set gives the code block feature of code block A1: 1800939131.
Similarly, the API functions called by code block A2 and their call counts are expressed as the set {(56,90)}; hashing gives the code block feature of code block A2: 1369398484.
Similarly, the API functions called by code block A3 and their call counts are expressed as the set {(32,54), (123,34), (132,36), (645,1)}; hashing gives the code block feature of code block A3: 2596230670.
Merging the code block features of code blocks A1, A2 and A3 gives the virus signature of malicious sample A: A = {1800939131, 1369398484, 2596230670}.
Suppose the sample B to be detected includes 4 code blocks, denoted B1, B2, B3 and B4.
The API functions called by code block B1 and their call counts are expressed as the set {(12,3), (15,1), (22,1)}; hashing gives the code block feature of code block B1: 1800939131.
The API functions called by code block B2 and their call counts are expressed as the set {(32,3), (122,3)}; hashing gives the code block feature of code block B2: 4111055178.
The API functions called by code block B3 and their call counts are expressed as the set {(56,91)}; hashing gives the code block feature of code block B3: 1348286179.
The API functions called by code block B4 and their call counts are expressed as the set {(56,35), (68,9)}; hashing gives the code block feature of code block B4: 281916613.
Therefore, the feature code of sample B is B = {1800939131, 4111055178, 1348286179, 281916613}.
Comparing the virus signature of sample A with the feature code of the sample B to be detected, the code block features they have in common are {1800939131}, so the similarity similarity(A, B) can be calculated as follows:
similarity(A, B) = count({1800939131}) / count(A) = 1/3 ≈ 0.33.
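The calculation in this example can be reproduced with a few lines of code. The function name and the handling of an empty signature are assumptions; the numbers are those of samples A and B above.

```python
def similarity(virus_signature, sample_feature_code):
    """Share of the virus signature's code block features found in the sample."""
    virus = set(virus_signature)
    if not virus:
        return 0.0                        # assumption: empty signature matches nothing
    shared = virus & set(sample_feature_code)
    return len(shared) / len(virus)


A = {1800939131, 1369398484, 2596230670}
B = {1800939131, 4111055178, 1348286179, 281916613}
print(round(similarity(A, B), 2))         # prints 0.33
```

Note that the measure is not symmetric: it is normalized by the number of code block features in the virus signature, not by the size of the sample's feature code.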
The similarity between the feature code df of the sample to be detected and the virus signature vf in the virus signature library can be expressed as S/M (where M is the number of code block features included in the virus signature vf). If the similarity reaches the similarity threshold (N/M, that is, S ≥ N), the sample to be detected carries the virus VID.
If the similarity does not reach the similarity threshold, the API function calls of the sample to be detected differ considerably from those of the virus; other virus signatures continue to be extracted from the virus signature library for comparison. If all the similarities are below the similarity threshold, the sample to be detected does not carry a virus and is a normal sample.
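The detection loop over the virus signature library can be sketched as follows. The function name, the dictionary layout {VID: signature} and the choice to treat a similarity equal to the threshold as a match are assumptions made for illustration.

```python
def detect(sample_feature_code, virus_library, threshold):
    """Return the VID of the first matching virus, or None for a normal sample."""
    sample = set(sample_feature_code)
    for vid, virus_signature in virus_library.items():
        virus = set(virus_signature)
        if not virus:
            continue                          # skip empty signatures
        shared = len(sample & virus)          # S: shared code block features
        if shared / len(virus) >= threshold:  # S/M compared with the threshold
            return vid
    return None


library = {7: {1800939131, 1369398484, 2596230670}}    # hypothetical VID 7
sample_b = {1800939131, 4111055178, 1348286179, 281916613}
print(detect(sample_b, library, 0.3))     # prints 7, since 1/3 >= 0.3
print(detect(sample_b, library, 0.5))     # prints None, since 1/3 < 0.5
```

The sketch stops at the first virus whose signature is similar enough; collecting every match would only require accumulating the VIDs in a list instead of returning early.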
The functional structure of the foregoing virus signature processing device is described next. Referring to Fig. 12, Fig. 12 is an optional structural schematic of the signature processing device 20 provided by an embodiment of the present invention, which includes an assembly splitting unit 21, a function call unit 22, a feature construction unit 23 and a feature merging unit 24, each described below.
The assembly splitting unit 21 is configured to disassemble a malicious sample that carries a virus and to split the resulting disassembled code into multiple code blocks of the malicious sample.
For example, to split the disassembled code into multiple code blocks of the malicious sample, the assembly splitting unit 21 splits the disassembled code at the granularity of paths of a predetermined level in the code tree of the disassembled code to obtain multiple code blocks, or splits the disassembled code at function granularity to obtain multiple code blocks.
The function call unit 22 is configured to traverse a code block to obtain the function calls executed in the code block, compare the target path of each function call with the paths of application program interface functions, and determine the application program interface functions called in the code block and the number of times each is called.
To compare the target path of a function call with the path of an application program interface function, the function call unit 22 is configured to obtain the application program interface functions provided by the operating system of the terminal and/or the third-party application program interface functions in the terminal, assign an identifier (such as a sequence number) to each application program interface function, compare the target path of the function call with the paths of the obtained application program interface functions by checking whether each corresponding field of the paths is identical, and record the identifiers of the application program interface functions called in the code block and the corresponding call counts.
Alternatively, the function call unit 22 is further configured to encode the paths of the obtained application program interface functions provided by the operating system of the terminal and/or the third-party application program interface functions in the terminal, assign identifiers to the application program interface functions, and compare the encoding result of the target path of the function call with the encoding results of the paths of the application program interface functions in the function library. If the encoding results are identical, the currently detected function call is a call to an application program interface function, and the unit records the identifiers of the application program interface functions called in the code block and the corresponding call counts.
The function call unit 22 is further configured to store application program interface functions in the function library in advance, for example by storing the encoding results of the paths of the obtained application program interface functions together with the identifiers assigned to the corresponding application program interface functions. When the function call unit 22 traverses a code block, it compares the encoding result of the target path of each function call with the encoding results of the paths of the application program interface functions in the function library; if the encoding results are identical, the currently detected function call is a call to an application program interface function, and the unit records the identifiers of the application program interface functions called in the code block and the corresponding call counts.
The feature construction unit 23 is configured to build the corresponding code block feature from the application program interface functions called in the code block and the number of times each is called.
To build a code block feature, the feature construction unit 23 is further configured to form feature elements from the identifiers of the application program interface functions called in the code block and the corresponding call counts, form a set from the feature elements of all the application program interface functions called in the code block, and encode the set to form the code block feature.
The feature merging unit 24 is configured to merge the code block features of the code blocks of the malicious sample to form the virus signature of the malicious sample.
The sample detection unit 25 is configured to calculate the feature code of a sample to be detected, compare the feature code of the sample to be detected with the virus signature to obtain their similarity, and judge from the similarity whether the sample to be detected carries a virus.
For the condition code that pattern detection unit 25 calculates sample to be detected, the condition code of sample relatively more to be detected is wrapped
The code block feature for including and the code block feature included by virus signature, obtain sample to be detected and malice sample total generation
Code block condition code, calculates the quantity ratio of total code block feature and the code block feature included by virus signature.
When calculating the signature of the sample to be detected, sample detection unit 25 compares the target path of each function call in each code block of the sample to be detected with the paths of the predetermined application program interface functions, builds a corresponding code block feature based on the application program interface functions found by the comparison to be called in the code block and the number of times each is called, and merges the code block features of the sample to be detected to form the signature of the sample to be detected.
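The signature calculation for a sample can be sketched as below. The sketch assumes that each code block is already available as a list of call target paths, that API paths are encoded with SHA-256 before comparison, and that identifiers are assigned by sorted order; all names and the example paths are hypothetical. For readability each code block feature is left as a sorted tuple of `(identifier, count)` pairs rather than being encoded further.

```python
import hashlib
from collections import Counter


def encode_path(path):
    """Encode a function path (assumption: SHA-256 of the UTF-8 text)."""
    return hashlib.sha256(path.encode("utf-8")).hexdigest()


def build_function_library(api_paths):
    """Map the encoded path of each known API function to an assigned identifier."""
    return {encode_path(p): i for i, p in enumerate(sorted(api_paths), start=1)}


def count_api_calls(call_targets, library):
    """Count calls whose encoded target path matches an entry in the library."""
    counts = Counter()
    for target in call_targets:
        api_id = library.get(encode_path(target))
        if api_id is not None:  # calls to non-API functions are ignored
            counts[api_id] += 1
    return counts


def compute_signature(code_blocks, library):
    """Merge the code block features of all blocks into one signature (a set)."""
    signature = set()
    for call_targets in code_blocks:
        counts = count_api_calls(call_targets, library)
        if counts:  # blocks without API calls contribute no feature
            signature.add(tuple(sorted(counts.items())))
    return signature


if __name__ == "__main__":
    library = build_function_library(["os/net/send", "os/sms/sendText"])
    blocks = [
        ["os/sms/sendText", "app/util/log", "os/sms/sendText"],
        ["app/util/log"],
        ["os/net/send"],
    ]
    print(compute_signature(blocks, library))
```

Comparing encoded paths rather than raw strings keeps the lookup a constant-time dictionary access regardless of path length.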
In summary, the embodiments of the present invention have the following beneficial effects:
1) The method can be completed efficiently using the computing capability of the device itself (such as a terminal or a server);
2) The signature is constructed from features of the API function calls of the malicious sample. Compared with the related art, which uses the hash value of the malicious sample, the features of the API function calls accurately reflect the semantic features that the malicious sample exhibits when achieving its malicious purpose, and they are not affected by changes in the hash value or byte count of the malicious sample. Broad-spectrum virus detection can therefore be achieved;
3) Because the API calls in a malicious sample are relatively stable, a signature constructed from features of the API function calls can detect a virus after it has evolved, which avoids the lag inherent in the signature-based virus detection provided by the related art;
4) The API function calls of each code block of the sample are extracted and encoded as features. This approach takes the semantics and behavior of the program into account, better resists the interference introduced by compiler optimization strategies and by virus authors modifying the source code, greatly improves the broad-spectrum coverage of the signature, and reduces the difficulty of detecting and removing viruses.
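The contrast drawn in item 2) between a whole-file hash and an API-call feature can be illustrated with a toy example. The byte strings and API names below are invented purely for illustration and do not come from any real sample; `file_hash` and `api_feature` are hypothetical helper names.

```python
import hashlib
from collections import Counter


def file_hash(raw_bytes):
    """Whole-file hash, as used by hash-based signatures."""
    return hashlib.sha256(raw_bytes).hexdigest()


def api_feature(api_calls):
    """Order-independent feature: each API name paired with its call count."""
    return tuple(sorted(Counter(api_calls).items()))


# Toy data: a variant that appends padding bytes and reorders its calls.
ORIGINAL_BYTES = b"\x90\x01\x02"
VARIANT_BYTES = b"\x90\x01\x02\x00\x00"
ORIGINAL_CALLS = ["sendSms", "readContacts", "sendSms"]
VARIANT_CALLS = ["readContacts", "sendSms", "sendSms"]

if __name__ == "__main__":
    # The hashes differ because the bytes differ.
    print(file_hash(ORIGINAL_BYTES) == file_hash(VARIANT_BYTES))
    # The API features match because the same APIs are called equally often.
    print(api_feature(ORIGINAL_CALLS) == api_feature(VARIANT_CALLS))
```

Any change to the bytes of a file changes its hash, whereas the API feature changes only if the set of called APIs or their call counts change.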
Those skilled in the art will appreciate that all or part of the steps of the above method embodiments may be implemented by hardware under the direction of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as a flash memory device, random access memory (RAM), read-only memory (ROM), a magnetic disk, or an optical disc.
Alternatively, if the integrated unit of the present invention is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the embodiments of the present invention, in essence or in the part that contributes to the related art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a flash memory device, RAM, ROM, a magnetic disk, or an optical disc.
The above are merely specific embodiments of the present invention, and the protection scope of the present invention is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (18)
1. A virus signature processing method, characterized by comprising:
performing disassembly on a malicious sample carrying a virus, and splitting the resulting disassembly code to obtain a plurality of code blocks of the malicious sample;
traversing each code block to obtain the function calls executed in the code block, comparing the target path of each function call with the paths of application program interface functions, and determining the application program interface functions called in the code block and the number of times each application program interface function is called;
building a corresponding code block feature based on the application program interface functions called in the code block and the number of times each application program interface function is called; and
merging the code block features of the code blocks of the malicious sample to form the virus signature of the malicious sample.
2. The method of claim 1, characterized in that splitting the resulting disassembly code to obtain a plurality of code blocks of the malicious sample comprises:
splitting the disassembly code into a plurality of code blocks according to the paths of the code tree of the disassembly code, using paths of a predetermined level in the code tree as the granularity; or splitting the disassembly code into a plurality of code blocks function by function, using the function as the granularity.
3. The method of claim 1, characterized in that comparing the target path of the function call with the paths of application program interface functions comprises:
obtaining the application program interface functions provided by the operating system of a terminal and/or the third-party application program interface functions in the terminal, and comparing the target path of the function call with the paths of the obtained application program interface functions.
4. The method of claim 3, characterized in that comparing the target path of the function call with the paths of the obtained application program interface functions comprises:
encoding the paths of the obtained application program interface functions provided by the operating system of the terminal and/or of the third-party application program interface functions in the terminal, and comparing the encoding result of the target path of the function call with the encoding results of the paths of the application program interface functions.
5. The method of claim 4, characterized in that comparing the encoding result of the target path of the function call with the encoding results of the paths of the application program interface functions comprises:
storing, in a function library, the encoding result of the path of each obtained application program interface function together with the identifier assigned to that application program interface function, and comparing the encoding result of the target path of the function call with the encoding results of the paths of the application program interface functions in the function library.
6. The method of claim 1, characterized in that building a corresponding code block feature based on the application program interface functions called in the code block and the number of times each application program interface function is called comprises:
forming a feature element from the identifier of each application program interface function called in the code block and the call count of that application program interface function, forming a set from the feature elements of all application program interface functions called in the code block, and encoding the set to form the code block feature.
7. The method of claim 1, characterized by further comprising:
calculating the signature of a sample to be detected, comparing the signature of the sample to be detected with the virus signature to obtain a signature similarity, and determining, based on the similarity, whether the sample to be detected carries the virus.
8. The method of claim 7, characterized in that comparing the signature of the sample to be detected with the virus signature to obtain a signature similarity comprises:
comparing the code block features included in the signature of the sample to be detected with the code block features included in the virus signature, obtaining the code block features common to the sample to be detected and the malicious sample, and calculating the ratio of the number of common code block features to the number of code block features included in the virus signature.
9. The method of claim 7, characterized in that calculating the signature of the sample to be detected comprises:
comparing the target path of each function call in each code block of the sample to be detected with the paths of the application program interface functions, building a corresponding code block feature based on the application program interface functions found by the comparison to be called in the code block and the call count of each application program interface function, and merging the code block features of the sample to be detected to form the signature of the sample to be detected.
10. A virus signature processing device, characterized by comprising:
a disassembly and splitting unit, configured to perform disassembly on a malicious sample carrying a virus and to split the resulting disassembly code to obtain a plurality of code blocks of the malicious sample;
a function call unit, configured to traverse each code block to obtain the function calls executed in the code block, compare the target path of each function call with the paths of application program interface functions, and determine the application program interface functions called in the code block and the number of times each application program interface function is called;
a feature construction unit, configured to build a corresponding code block feature based on the application program interface functions called in the code block and the number of times each application program interface function is called; and
a feature merging unit, configured to merge the code block features of the code blocks of the malicious sample to form the virus signature of the malicious sample.
11. The device of claim 10, characterized in that
the disassembly and splitting unit is further configured to split the disassembly code into a plurality of code blocks according to the paths of the code tree of the disassembly code, using paths of a predetermined level in the code tree as the granularity, or to split the disassembly code into a plurality of code blocks function by function, using the function as the granularity.
12. The device of claim 10, characterized in that
the function call unit is further configured to obtain the application program interface functions provided by the operating system of a terminal and/or the third-party application program interface functions in the terminal, and to compare the target path of the function call with the paths of the obtained application program interface functions.
13. The device of claim 12, characterized in that
the function call unit is further configured to encode the paths of the obtained application program interface functions provided by the operating system of the terminal and/or of the third-party application program interface functions in the terminal, and to compare the encoding result of the target path of the function call with the encoding results of the paths of the application program interface functions.
14. The device of claim 13, characterized in that
the function call unit is further configured to store, in a function library, the encoding result of the path of each obtained application program interface function together with the identifier assigned to that application program interface function, and to compare the encoding result of the target path of the function call with the encoding results of the paths of the application program interface functions in the function library.
15. The device of claim 10, characterized in that
the feature construction unit is further configured to form a feature element from the identifier of each application program interface function called in the code block and the call count of that application program interface function, to form a set from the feature elements of all application program interface functions called in the code block, and to encode the set to form the code block feature.
16. The device of claim 10, characterized by further comprising:
a sample detection unit, configured to calculate the signature of a sample to be detected, compare the signature of the sample to be detected with the virus signature to obtain a signature similarity, and determine, based on the similarity, whether the sample to be detected carries the virus.
17. The device of claim 16, characterized in that
the sample detection unit is further configured to compare the code block features included in the signature of the sample to be detected with the code block features included in the virus signature, obtain the code block features common to the sample to be detected and the malicious sample, and calculate the ratio of the number of common code block features to the number of code block features included in the virus signature.
18. The device of claim 17, characterized in that
the sample detection unit is further configured to compare the target path of each function call in each code block of the sample to be detected with the paths of the application program interface functions, build a corresponding code block feature based on the application program interface functions found by the comparison to be called in the code block and the call count of each application program interface function, and merge the code block features of the sample to be detected to form the signature of the sample to be detected.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710035588.8A CN106803040B (en) | 2017-01-18 | 2017-01-18 | Virus characteristic code processing method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106803040A (en) | 2017-06-06 |
CN106803040B (en) | 2021-08-10 |
Family
ID=58984570
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710035588.8A Active CN106803040B (en) | 2017-01-18 | 2017-01-18 | Virus characteristic code processing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106803040B (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107678968A (en) * | 2017-10-18 | 2018-02-09 | 北京奇虎科技有限公司 | Sample extraction method, apparatus, computing device and the storage medium of source code function |
CN108334778A (en) * | 2017-12-20 | 2018-07-27 | 北京金山安全管理系统技术有限公司 | Method for detecting virus, device, storage medium and processor |
CN109165514A (en) * | 2018-10-16 | 2019-01-08 | 北京芯盾时代科技有限公司 | A kind of risk checking method |
CN109492396A (en) * | 2018-11-12 | 2019-03-19 | 杭州安恒信息技术股份有限公司 | Malware Gene Detecting method and apparatus based on semantic segmentation |
CN110647747A (en) * | 2019-09-05 | 2020-01-03 | 四川大学 | False mobile application detection method based on multi-dimensional similarity |
CN112148305A (en) * | 2020-10-28 | 2020-12-29 | 腾讯科技(深圳)有限公司 | Application detection method and device, computer equipment and readable storage medium |
CN112579828A (en) * | 2019-09-30 | 2021-03-30 | 奇安信安全技术(珠海)有限公司 | Feature code processing method, device and system, storage medium and electronic device |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103136475A (en) * | 2011-11-29 | 2013-06-05 | 姚纪卫 | Method and device for detecting computer viruses |
CN103970523A (en) * | 2013-02-05 | 2014-08-06 | 中国移动通信集团广东有限公司 | Method and device for recognition of JAVA compiling destination file |
CN104391798A (en) * | 2014-12-09 | 2015-03-04 | 北京邮电大学 | Software feature information extracting method |
CN104751052A (en) * | 2013-12-30 | 2015-07-01 | 南京理工大学常熟研究院有限公司 | Dynamic behavior analysis method for mobile intelligent terminal software based on support vector machine algorithm |
CN105184160A (en) * | 2015-07-24 | 2015-12-23 | 哈尔滨工程大学 | API object calling relation graph based method for detecting malicious behavior of application program in Android mobile phone platform |
CN106709349A (en) * | 2016-12-15 | 2017-05-24 | 中国人民解放军国防科学技术大学 | Multi-dimension behavior characteristic-based malicious code classification method |
Also Published As
Publication number | Publication date |
---|---|
CN106803040B (en) | 2021-08-10 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||