CN106803040A - Virus signature processing method and processing device - Google Patents

Publication number
CN106803040A
Authority
CN
China
Prior art keywords
code
function
code block
sample
application program
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710035588.8A
Other languages
Chinese (zh)
Other versions
CN106803040B (en
Inventor
罗元海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201710035588.8A priority Critical patent/CN106803040B/en
Publication of CN106803040A publication Critical patent/CN106803040A/en
Application granted granted Critical
Publication of CN106803040B publication Critical patent/CN106803040B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562 Static detection
    • G06F21/564 Static detection by virus signature recognition

Abstract

The invention discloses a virus signature processing method and processing device. The method includes: disassembling a malicious sample that carries a virus, and splitting the resulting disassembled code into multiple code blocks of the malicious sample; traversing each code block to obtain the function calls executed in it, comparing the target path of each function call with the paths of application programming interface (API) functions, and determining which API functions are called in the code block and how many times each is called; building the corresponding code block feature from the API functions called in the code block and their call counts; and merging the code block features of all code blocks of the malicious sample to form the virus signature of the malicious sample. Implementing the invention improves the generality and timeliness of virus signatures.

Description

Virus signature processing method and processing device
Technical field
The present invention relates to security technology, and in particular to a virus signature processing method and processing device.
Background
A computer virus, also simply called a virus, is malicious code that its author implants in a terminal (a smartphone, computer, server or other computing terminal) to damage the terminal's functions or data.
A virus usually achieves its malicious purpose in one of two ways: it runs in the terminal as an independent application (for example, a packed one) that deceives the user, or it is embedded in a repackaged regular application and acts while that application runs.
When a signature-based antivirus engine scans for viruses today, it matches the sample under test against the virus signature: the hash value of the sample is matched against the hash value in the signature, and the binary byte count of the sample (that is, the size of the sample in bytes) is matched against the file byte count in the signature.
In practice, however, two factors make such signatures fail easily and limit the generality of signature-based virus detection:
First, a virus author can change the hash value and file byte count of a virus by making minor modifications to its source code, so that a signature that used to detect the virus stops working. Signatures must then be updated continually, and detection lags behind.
Second, most compilers apply optimizations such as instruction reordering and register reallocation, so even identical source code may compile to object files with different binary content. Signature-based detection that relies on the byte count can therefore miss viruses or produce false detections.
In short, the signatures provided by the related art are extremely sensitive to changes in a virus, lack generality in detecting viruses, and lag behind in detecting new viruses.
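The fragility described above can be illustrated with a minimal sketch. It assumes MD5 as the hash (the background does not name a specific algorithm), and the two byte strings are made-up stand-ins for two builds of the same program.

```python
import hashlib


def legacy_signature(data: bytes) -> tuple[str, int]:
    """Return (hash value, byte count), the two fields a hash-based signature matches.

    MD5 is an assumed choice; the source text only says "hash value".
    """
    return hashlib.md5(data).hexdigest(), len(data)


# Hypothetical sample contents: the second differs from the first by one byte.
original = b"connect(); steal_contacts(); send(1);"
modified = b"connect(); steal_contacts(); send(2);"

old_sig = legacy_signature(original)
new_sig = legacy_signature(modified)

# The byte counts still agree, but the hash no longer matches,
# so a signature recorded for `original` would miss `modified`.
print(old_sig[1] == new_sig[1], old_sig[0] == new_sig[0])
```

A single changed byte is enough to defeat the hash field even though the behavior of the program is essentially unchanged.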
Summary of the Invention
Embodiments of the present invention provide a virus signature processing method and processing device that can improve the generality and timeliness of virus signatures.
The technical solutions of the embodiments of the present invention are implemented as follows:
In a first aspect, an embodiment of the present invention provides a virus signature processing method, including:
disassembling a malicious sample that carries a virus, and splitting the resulting disassembled code into multiple code blocks of the malicious sample;
traversing each code block to obtain the function calls executed in the code block, comparing the target path of each function call with the paths of application programming interface (API) functions, and determining the API functions called in the code block and the number of times each API function is called;
building the corresponding code block feature from the API functions called in the code block and the number of times each API function is called;
merging the code block features of all code blocks of the malicious sample to form the virus signature of the malicious sample.
In a second aspect, an embodiment of the present invention provides a virus signature processing device, including:
a disassembly and splitting unit, configured to disassemble a malicious sample that carries a virus and split the resulting disassembled code into multiple code blocks of the malicious sample;
a function call unit, configured to traverse each code block to obtain the function calls executed in the code block, compare the target path of each function call with the paths of API functions, and determine the API functions called in the code block and the number of times each API function is called;
a feature building unit, configured to build the corresponding code block feature from the API functions called in the code block and the number of times each API function is called;
a feature merging unit, configured to merge the code block features of all code blocks of the malicious sample to form the virus signature of the malicious sample.
In a third aspect, an embodiment of the present invention provides a virus signature processing device including a memory and a processor, the memory storing executable instructions that cause the processor to perform the virus signature processing method provided by the embodiments of the present invention.
In a fourth aspect, an embodiment of the present invention provides a storage medium storing executable instructions that cause a processor to perform the virus signature processing method provided by the embodiments of the present invention.
The embodiments of the present invention have the following beneficial effects:
The processing can be completed efficiently by relying on the computing capability of a device (such as a terminal or a server). Meanwhile, the signature is built from the features of the API function calls of the malicious sample. Compared with the related art, which uses the hash value of the malicious sample itself, the API-call features accurately reflect the semantics of the malicious sample as it carries out its malicious purpose and are unaffected by changes to the hash value and byte count of the sample, so generality of virus detection can be achieved. In addition, because the API calls in a malicious sample are relatively stable, a signature built from API-call features can detect evolved viruses, avoiding the lag of the signature-based detection provided by the related art.
Brief Description of the Drawings
Fig. 1 is an optional processing schematic, provided by an embodiment of the present invention, of extracting a virus signature and detecting, based on the virus signature, whether a sample carries a virus;
Fig. 2 is an optional processing schematic of the virus signature processing method provided by an embodiment of the present invention;
Fig. 3 is an optional flowchart of the virus signature processing method provided by an embodiment of the present invention;
Fig. 4 is an optional schematic diagram of the virus signature processing device provided by an embodiment of the present invention deployed in a network-side server;
Fig. 5 is an optional schematic diagram of the software and hardware structure of virus signature processing device 10 provided by an embodiment of the present invention;
Fig. 6 is another optional flowchart of the signature processing method provided by an embodiment of the present invention;
Fig. 7-1 is an optional flowchart, provided by an embodiment of the present invention, of extracting the API functions provided by an operating system and storing them in the API function library;
Fig. 7-2 is an optional flowchart, provided by an embodiment of the present invention, of computing the virus signatures of the viruses carried by the malicious samples in the malicious sample library;
Fig. 7-3 is an optional flowchart, provided by an embodiment of the present invention, of detecting whether a sample under test carries a virus;
Fig. 8 is an optional schematic diagram, provided by an embodiment of the present invention, of disassembling an executable file;
Fig. 9-1 is an optional schematic diagram, provided by an embodiment of the present invention, of splitting disassembled code into code blocks based on a code tree;
Fig. 9-2 is an optional schematic diagram, provided by an embodiment of the present invention, of splitting disassembled code into code blocks based on a code tree;
Fig. 9-3 is an optional schematic diagram, provided by an embodiment of the present invention, of splitting disassembled code into code blocks based on a code tree;
Fig. 10 is an optional processing schematic, provided by an embodiment of the present invention, of extracting the API functions provided by an operating system and storing them in the API function library;
Fig. 11 is an optional processing schematic of computing signature similarity provided by an embodiment of the present invention;
Fig. 12 is an optional structural schematic of signature processing device 20 provided by an embodiment of the present invention.
Detailed Description
The present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the embodiments described here only explain the present invention and do not limit it. In addition, the embodiments below are some, not all, of the embodiments for implementing the invention. Embodiments that a person skilled in the art obtains without creative effort by recombining the technical solutions of the following embodiments, and other embodiments based on the invention, all fall within the protection scope of the present invention.
It should be noted that, in the embodiments of the present invention, the terms "include", "comprise" and any variants are intended to cover non-exclusive inclusion, so that a method or device that includes a series of elements includes not only the elements expressly listed but also other elements not expressly listed, or elements inherent to implementing the method or device. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of other related elements (such as steps in a method or units in a device) in the method or device that includes the element.
For example, the virus signature processing method provided by the embodiments of the present invention contains a series of steps but is not limited to the steps described. Similarly, the virus signature processing device provided by the embodiments of the present invention includes a series of units but is not limited to the units expressly described; it may also include units needed to obtain relevant information or to process data based on that information.
Before the present invention is described in further detail, the nouns and terms used in the embodiments of the present invention are explained; the following explanations apply to them.
1) Virus, also called computer virus or malicious code: binary code that its author implants in a terminal (a smartphone, tablet, laptop, desktop or other computing terminal) for a malicious purpose such as damaging the terminal's functions, destroying data or stealing data.
2) Sample: a general term for applications of all types, such as Microsoft Windows applications, Unix applications, iOS applications and Android applications.
3) Malicious sample: a sample that contains a virus.
4) Normal sample: a sample that does not contain a virus.
5) Code block: a block formed by dividing the disassembled code of an application at a certain granularity.
6) Application programming interface (API): API functions, implemented in various programming languages, are the programming interfaces through which an operating system or a library offers services (or functions) to applications. They help an application open windows, draw graphics and use terminal functions (such as imaging and positioning).
7) Function, i.e. subroutine: besides implementing a fixed computation, a function has an entry and an exit. The entry is the set of parameters of the function, through which parameter values are passed into the subroutine for processing. The exit is the return value of the function, which is passed back to the caller once it has been computed.
8) Code block feature, also simply called feature herein: a digital feature generated by encoding (for example, with a hash algorithm or the BASE64 algorithm) the characteristics of a code block's API-calling behavior.
9) Signature: in its basic form, the set of features of the code blocks of a sample. A signature may also include overall characteristics of the sample, such as its byte count (the storage space the sample occupies).
When viruses are detected with the virus signatures provided by the related art, the signature of the sample under test is matched against the virus signature. For example, the hash value of the sample itself (such as an application file) is matched against the hash value in the signature, and the binary byte count of the sample (its size in bytes) is matched against the file byte count in the signature. Virus signatures provided by the related art typically take one of the following formats:
Format 1: hash string (HashString);file byte count (FileSize);malware name (MalwareName)
An example of this format is:
507d8f868c27feb88b18e6f8426adf1c;12391;Win.Exploit.CVE_2013_3163
Format 2: MalwareName=HexSignature
An example of a signature in Format 2 is:
Trojan.URLspoof.gen (Clam)=2e687265663d756e6573636170652827*3a2f2f*
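As a rough illustration of the two formats above, the following sketch parses one line of either kind. The class and function names are hypothetical, and the separators (`;` and `=`) are taken as they appear in the two examples; a real signature database may use a different syntax.

```python
from dataclasses import dataclass


@dataclass
class HashSignature:
    """Format 1: HashString;FileSize;MalwareName."""
    hash_string: str
    file_size: int
    malware_name: str


@dataclass
class HexSignature:
    """Format 2: MalwareName=HexSignature."""
    malware_name: str
    hex_pattern: str


def parse_signature(line: str):
    """Parse one signature line written in Format 1 or Format 2."""
    parts = line.strip().split(";")
    if len(parts) == 3 and parts[1].isdigit():
        return HashSignature(parts[0], int(parts[1]), parts[2])
    name, separator, pattern = line.strip().partition("=")
    if not separator:
        raise ValueError(f"unrecognized signature line: {line!r}")
    return HexSignature(name, pattern)


def hash_match(sig: HashSignature, sample_hash: str, sample_size: int) -> bool:
    """A sample matches Format 1 only if both the hash and the byte count agree."""
    return sig.hash_string == sample_hash and sig.file_size == sample_size


first = parse_signature("507d8f868c27feb88b18e6f8426adf1c;12391;Win.Exploit.CVE_2013_3163")
second = parse_signature("Trojan.URLspoof.gen (Clam)=2e687265663d756e6573636170652827*3a2f2f*")
```

Because `hash_match` requires exact equality on both fields, any change to the sample's bytes makes the comparison fail, which is the sensitivity the following paragraph describes.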
As can be seen, such a virus signature is very sensitive to changes in the sample. Even a slight change to a malicious sample changes its hash value and byte count, so a signature that could previously detect the virus carried by the malicious sample fails. This undermines the generality of signature-based virus detection and makes the detection of new viruses lag.
To address the fact that the related art performs no semantic analysis of the malicious sample when extracting a virus signature, the embodiments of the present invention provide a scheme that builds signatures from the API-call features of the malicious sample and uses them to detect viruses. The behavioral feature of a code block's API function calls better eliminates the interference that compiler optimization strategies and the virus author's source code modifications introduce into the signature. This improves the generality of the signature, avoids lag in virus detection, and improves the efficiency and accuracy of virus detection.
Specifically, referring to Fig. 1, Fig. 1 is an optional processing schematic, provided by an embodiment of the present invention, of extracting a virus signature and detecting, based on the virus signature, whether a sample carries a virus. The processing involves three parts: API function library generation, virus signature library generation, and sample detection, each described below.
1) API function library generation. The API functions provided by the terminal's operating system are detected (that is, the API functions integrated in the libraries of the terminal operating system, called library functions for short), along with third-party API functions (such as API functions in third-party libraries embedded in the terminal's operating system, called third-party library functions for short, or API functions provided by applications installed on the terminal). The extracted API functions are stored in the API function library, for example as two-tuples of the form <encoding result of the API function's path in the terminal (such as a hash or BASE64 encoding), API function identifier>.
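The two-tuple storage just described might be sketched as follows. MD5 is an assumed choice of hash encoding, the numeric identifiers are assigned arbitrarily, and the two Android API paths are only sample inputs.

```python
import hashlib


def encode_path(path: str) -> str:
    """Encode an API function path; MD5 stands in for the unspecified hash encoding."""
    return hashlib.md5(path.encode("utf-8")).hexdigest()


def build_api_library(api_paths: list[str]) -> dict[str, int]:
    """Map each encoded path to an API function identifier (the two-tuple above)."""
    library: dict[str, int] = {}
    for path in api_paths:
        key = encode_path(path)
        if key not in library:
            # Identifiers are simply assigned in insertion order (an assumption).
            library[key] = len(library) + 1
    return library


api_library = build_api_library([
    "android.telephony.SmsManager.sendTextMessage",
    "java.net.URL.openConnection",
])


def lookup(call_target: str):
    """Return the API identifier for a call target, or None if it is not a known API."""
    return api_library.get(encode_path(call_target))
```

Keying the library by the encoded path means a call target found in disassembled code can be checked with one dictionary lookup after applying the same encoding.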
2) Signature library generation. The target path of each function call in each code block of a known malicious sample is compared with the paths of the API functions in the API function library to detect the API-call feature of the code block (the code block feature). This feature comprises the identifiers of the API functions called in the code block and the number of times each of them is called in the code block.
A code block feature is stored as a sequence of the form <identifier of the called API function, call count of the API function>, .... The code block features of a malicious sample are merged to form the virus signature, which is stored in the virus signature library.
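A minimal sketch of this representation follows. It assumes that the API identifiers called in each block have already been resolved, that pairs are sorted by identifier, and that blocks without API calls contribute nothing to the signature; none of these details is specified in the text.

```python
from collections import Counter


def block_feature(api_ids_called: list[int]) -> tuple[tuple[int, int], ...]:
    """Return the (API identifier, call count) pairs of one code block, sorted by identifier."""
    return tuple(sorted(Counter(api_ids_called).items()))


def merge_features(blocks: list[list[int]]) -> list[tuple[tuple[int, int], ...]]:
    """Merge per-block features into one signature, skipping blocks with no API calls."""
    features = (block_feature(block) for block in blocks)
    return [feature for feature in features if feature]


# Three hypothetical code blocks: the first calls API 1 twice and API 2 once,
# the second calls no API function, the third calls API 3 once.
signature = merge_features([[1, 2, 1], [], [3]])
```

Using tuples keeps each block feature hashable, which makes it easy to compare signatures as sets later on.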
3) Sample detection. A signature is extracted from the sample under test and compared with the virus signatures; whether the sample under test carries a virus is determined from the similarity between the signatures.
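The text here does not say how similarity is measured, so the following is only one possible sketch: it treats each signature as a set of code block features and uses the Jaccard index, with a hypothetical threshold of 0.8. Both the metric and the threshold are assumptions, not the method of the embodiments.

```python
def signature_similarity(sample_sig, virus_sig) -> float:
    """Jaccard index over the sets of code block features (an assumed metric)."""
    sample_set, virus_set = set(sample_sig), set(virus_sig)
    if not sample_set and not virus_set:
        return 0.0
    return len(sample_set & virus_set) / len(sample_set | virus_set)


def carries_virus(sample_sig, virus_sig, threshold: float = 0.8) -> bool:
    """Flag the sample when its similarity reaches the (hypothetical) threshold."""
    return signature_similarity(sample_sig, virus_sig) >= threshold


# Each block feature is a tuple of (API identifier, call count) pairs.
virus_signature = [((1, 2), (2, 1)), ((3, 1),)]
sample_signature = [((1, 2), (2, 1)), ((3, 1),), ((4, 1),)]

similarity = signature_similarity(sample_signature, virus_signature)
```

In this example two of the three distinct block features are shared, so the sample would be flagged only under a threshold of about 0.67 or lower.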
Referring to Fig. 2 and Fig. 3, Fig. 2 is an optional processing schematic of the virus signature processing method provided by an embodiment of the present invention, and Fig. 3 is an optional flowchart of the method. To extract the virus signature from a malicious sample that contains a virus, the malicious sample is disassembled and the resulting disassembled code is split into multiple code blocks of the malicious sample (step 101); each code block is traversed to obtain the function calls executed in it, the target path of each function call is compared with the paths of API functions, and the API functions called in the code block and the number of times each is called are determined (step 102); the corresponding code block feature is built from the API functions called in the code block and their call counts (step 103); and the virus signature of the virus carried by the malicious sample is built from the features of the code blocks of the malicious sample (step 104).
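Steps 101 to 104 can be sketched end to end on a toy listing. The listing format is invented for illustration: a `.block` line starts a new code block and `invoke <path>` denotes a function call. Real disassembly output and the code-tree-based splitting of the embodiments are more involved, and the `API_PATHS` table is a hypothetical stand-in for the API function library.

```python
from collections import Counter

# Hypothetical API function library: path -> API identifier.
API_PATHS = {
    "android.telephony.SmsManager.sendTextMessage": 1,
    "java.net.URL.openConnection": 2,
}

LISTING = """
.block
invoke android.telephony.SmsManager.sendTextMessage
move v0, v1
invoke java.net.URL.openConnection
invoke android.telephony.SmsManager.sendTextMessage
.block
invoke android.telephony.SmsManager.sendTextMessage
invoke com.example.Helper.run
"""


def split_blocks(listing: str) -> list[list[str]]:
    """Step 101 (simplified): split the listing at each '.block' marker."""
    blocks: list[list[str]] = []
    for raw in listing.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == ".block":
            blocks.append([])
        elif blocks:
            blocks[-1].append(line)
    return blocks


def block_feature(block: list[str]) -> list[tuple[int, int]]:
    """Steps 102-103: count calls whose target path is a known API function."""
    counts: Counter = Counter()
    for instruction in block:
        opcode, _, target = instruction.partition(" ")
        if opcode == "invoke" and target in API_PATHS:
            counts[API_PATHS[target]] += 1
    return sorted(counts.items())


def extract_signature(listing: str) -> list[list[tuple[int, int]]]:
    """Step 104: merge the features of all code blocks into one signature."""
    return [block_feature(block) for block in split_blocks(listing)]


signature = extract_signature(LISTING)
```

The call to `com.example.Helper.run` is ignored because its path is not in the library, which reflects the comparison of call targets against API function paths in step 102.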
The above steps can be automated by machine processing and completed efficiently by relying on the computing capability of a device (such as a terminal or a server). Meanwhile, the virus signature is built from the features of the API function calls in each code block of the malicious sample. Compared with the related art, which builds the virus signature from the hash value and byte count of the malicious sample's own binary data, the API-call features accurately reflect the semantics of the malicious sample as it carries out its malicious purpose and are unaffected by changes to the hash value and byte count of the sample's binary data, so generality of virus detection can be achieved. In addition, even if the distributor modifies the virus carried by a malicious sample, the API-call features of samples carrying viruses of the same family remain relatively stable. A signature built from API-call features can therefore detect evolved viruses, avoiding the lag of the signature-based detection provided by the related art.
The embodiments of the present invention also provide a virus signature processing device for performing the above virus signature processing method. The hardware of the virus signature processing device may be deployed entirely in a user-side terminal or in a network-side server.
For example, the device is provided in a terminal as an antivirus application. The terminal periodically pulls malicious samples from the malicious sample library, extracts and stores their virus signatures, and uses the virus signatures to run security scans on applications being installed or already installed locally (the samples under test). Results are handled according to the terminal's local security policy, for example: 1) blocking installation of a to-be-installed application found to contain a virus; 2) quarantining an installed application found to contain a virus; 3) prompting the user and proceeding as the user chooses.
As another example, referring to Fig. 4, Fig. 4 is an optional schematic diagram of the virus signature processing device provided by an embodiment of the present invention deployed in a network-side server, where the server provides a cloud antivirus service. The server periodically pulls malicious samples from the malicious sample library, extracts virus signatures from them, and stores the extracted signatures in the virus signature library. Based on the virus signatures, the server scans the signatures of the samples under test submitted by the terminal's antivirus application and returns the scan results to that application, which handles them according to the terminal's local security policy, for example: 1) blocking installation of a to-be-installed application found to contain a virus; 2) quarantining an installed application found to contain a virus; 3) prompting the user and proceeding as the user chooses.
Referring to the optional software and hardware structure of virus signature processing device 10 shown in Fig. 5, virus signature processing device 10 includes a hardware layer, an intermediate layer, an operating system layer and a software layer. A person skilled in the art should understand, however, that the structure shown in Fig. 5 is only an example and does not limit the structure of virus signature processing device 10. For example, virus signature processing device 10 may be given more components than Fig. 5 shows, or some components may be omitted, as the implementation requires.
The hardware layer of virus signature processing device 10 includes processor 11, input/output interface 13, storage medium 14 and network interface 12; the components can communicate over a system bus.
Processor 11 may be implemented as a central processing unit (CPU), a microcontroller unit (MCU), an application-specific integrated circuit (ASIC) or a field-programmable gate array (FPGA).
Input/output interface 13 may be implemented with input/output devices such as a display screen, a touch screen or a loudspeaker.
Storage medium 14 may be implemented with a non-volatile storage medium such as flash memory, a hard disk or an optical disc, or with a volatile storage medium such as double data rate (DDR) dynamic memory. It stores the executable instructions for performing the above virus signature processing method.
Exemplarily, storage medium 14 may be co-located with the other components of virus signature processing device 10 (for example, in the user-side terminal) or distributed relative to them. Network interface 12 gives processor 11 access to external data, such as a remotely located storage medium 14. Exemplarily, network interface 12 may support short-range communication based on near field communication (NFC), Bluetooth or ZigBee technology; cellular communication based on standards such as code division multiple access (CDMA) and wideband code division multiple access (WCDMA) and their evolved standards; or communication that reaches the network side through a wireless access point (AP) based on Wireless Fidelity (Wi-Fi).
The intermediate (driver) layer includes middleware 15, which lets operating system 16 recognize the hardware layer and communicate with its components; for example, it may be the set of drivers for the components of the hardware layer.
Operating system 16 provides a user-oriented graphical interface, exemplarily including plug-in icons, a desktop background and application icons, and supports user control of the terminal through the graphical interface. The embodiments of the present invention do not limit the software environment of the terminal, such as the operating system type or version; it may be, for example, a Linux operating system, a UNIX operating system or another operating system.
The software (application) layer includes the antivirus application run on the user-side terminal or the cloud antivirus service 17, or a module (or plug-in) that can be coupled with security software in the terminal. It is provided with executable instructions for performing the above virus signature processing method.
Below, the signature processing method shown in Fig. 2 is further described with reference to Fig. 6. It should be noted that, based on the following description of Fig. 6, a person skilled in the art can readily implement the method in the scenario where the signature processing device is deployed on the user terminal side.
Referring to Fig. 6, Fig. 6 is another optional flowchart of the signature processing method provided by an embodiment of the present invention, including the following steps:
Step 201: the server extracts the API functions provided by the terminal operating system and/or the third-party API functions in the terminal.
In one embodiment, the server pulls, from a database dedicated to collecting and storing API functions, the API functions provided by different types of operating system, distinguishing by version for each type of operating system, and also pulls third-party API functions. Of course, provided that the server and the terminal have established a security authentication mechanism, the server may pull these API functions directly from the terminal over a secure connection with the terminal. The different types of API function are described below.
1) API functions provided by the operating system of the terminal
The API functions provided by the terminal operating system are the library functions that the operating system provides natively. They are stored in the terminal's file system in the form of libraries and support applications in using the terminal's basic capabilities. Exemplarily, they include the following types of API function:
1.1) Network API functions, for creating or closing network connections and enumerating network resources.
1.2) Message processing API functions, for passing messages between windows.
1.3) File processing API functions, for file-related operations such as creation, copying and deletion.
1.4) Printing API functions, for supporting printing by applications in the terminal.
1.5) Drawing API functions, for implementing drawing.
2) Third-party API functions in the terminal
These are API functions of third-party libraries additionally injected into the terminal's operating system to implement extended functions, such as building various software development environments; for example, the API functions that third-party applications on the terminal provide for their own proprietary functions. Taking the WeChat client as an example, the API functions may be those that the WeChat software development kit (SDK) provides for functions such as WeChat Pay and sharing to Moments. The location from which third-party API functions are extracted varies with where the SDK files of the different third-party applications are stored in the terminal.
Step 202: the server encodes the paths of the extracted API functions, and stores the encoding result of each API function's path in the API function library together with the identifier of the API function.
The path of an API function can be a character string of the form "package name + class name + API function name" that uniquely locates and identifies one API function.
For API functions provided by the terminal operating system, refer to Fig. 7-1 and Fig. 10. Fig. 7-1 is an optional schematic flowchart, provided by an embodiment of the present invention, of extracting the API functions provided by the operating system and storing them in the API function library; Fig. 10 is an optional processing schematic of the same procedure. Taking the case where the operating system of the terminal is the Android operating system as an example, the API functions of the function classes defined by the Android system are all in two jar packages, core.jar and framework.jar.
First, the core.jar and framework.jar packages of each version of the Android operating system are collected and parsed, and the paths of all API functions in them are extracted.
For example, in Fig. 10, the path of the API function onCallStateChanged(int state, String incomingNumber) is:
android.telephony.PhoneStateListener
void onCallStateChanged(int state, String incomingNumber)
Second, the paths of these functions are converted into the description form of the Smali language (Smali code is the code obtained by disassembling a DEX file, the executable file of Android's Dalvik virtual machine), which eases matching during subsequent feature extraction.
Continuing with the aforementioned API function onCallStateChanged(int state, String incomingNumber), an example of its form after conversion to the Smali language description is:
Landroid/telephony/PhoneStateListener
onCallStateChanged(ILjava/lang/String;)V
Finally, a hash value is computed for each path in its Smali description, and the hash value of the API function's path is stored in the API function library together with the sequence number (identifier) assigned to the API function.
Continuing with the aforementioned API function onCallStateChanged(int state, String incomingNumber), encoding its Smali path and assigning a sequence number gives:
<hash value: 4036329264617481551; sequence number: 12>.
The conversion, encoding and sequence-number assignment for the paths of the other API functions in Fig. 10 can be understood from the above description and are not described one by one.
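The conversion and encoding described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names (type_descriptor, smali_path, path_hash) are hypothetical, parameter types must be supplied as fully qualified Java names, the class and method are joined with the usual Smali "->" separator, and the 64-bit hash is derived from SHA-256 because the text does not name the hash algorithm. The resulting hash value is therefore not expected to equal the value shown in the example above.

```python
import hashlib

# Dalvik/JVM type descriptors for the Java primitive types.
PRIMITIVES = {
    "void": "V", "boolean": "Z", "byte": "B", "short": "S", "char": "C",
    "int": "I", "long": "J", "float": "F", "double": "D",
}

def type_descriptor(java_type):
    """Convert a fully qualified Java type name to a Dalvik type descriptor."""
    dims = java_type.count("[]")                 # number of array dimensions
    base = java_type.replace("[]", "").strip()
    desc = PRIMITIVES.get(base) or "L" + base.replace(".", "/") + ";"
    return "[" * dims + desc

def smali_path(cls, name, params, ret):
    """Build a Smali-style method reference: Lpkg/Class;->name(params)ret."""
    args = "".join(type_descriptor(p) for p in params)
    return f"{type_descriptor(cls)}->{name}({args}){type_descriptor(ret)}"

def path_hash(path):
    """Hypothetical 64-bit encoding: first 8 bytes of SHA-256 as an integer."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

path = smali_path("android.telephony.PhoneStateListener", "onCallStateChanged",
                  ["int", "java.lang.String"], "void")
entry = (path_hash(path), 12)   # <hash value of the path, sequence number>
print(path)
```

Any stable hash would serve here, as long as the same function is later applied to the call targets found in the disassembled samples.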
Of course, it should be noted that, besides a hash value computed with a hash algorithm, the encoding result of an API function's path can also be obtained with other kinds of encoding algorithms, such as Base64.
In the API function library, each API function is represented as a two-tuple consisting of the encoding result of its path and its identifier. Let the API function library store API function i (i is the sequence number of the API function, 1≤i≤I, and I is the number of API functions extracted); an optional data structure is:
<hash value of the path of API function i, i>.
The paths of third-party API functions are encoded, and the encoding results are stored in the API function library together with the sequence numbers of the third-party API functions, in the same way as described above for the API functions provided by the operating system; this is not described separately here.
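As a minimal sketch of such an API function library, the following Python code maps each path (or its encoding) to a sequence number i in 1..I. The function names encode_path and build_api_library, the SHA-256-based encoding and the two sample paths are assumptions made for illustration; the text does not prescribe them.

```python
import hashlib

def encode_path(path):
    """Hypothetical encoding: first 8 bytes of SHA-256 as an integer."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def build_api_library(paths, encoded=True):
    """Map each API path (or its encoding) to a sequence number i in 1..I."""
    library = {}
    for path in paths:
        key = encode_path(path) if encoded else path
        if key not in library:              # a repeated path keeps its first number
            library[key] = len(library) + 1
    return library

# Illustrative Smali-style paths; a real library would hold every extracted API.
api_paths = [
    "Landroid/telephony/PhoneStateListener;->onCallStateChanged(ILjava/lang/String;)V",
    "Landroid/telephony/SmsManager;->getDefault()Landroid/telephony/SmsManager;",
]
by_hash = build_api_library(api_paths, encoded=True)    # <hash of path, i>
by_path = build_api_library(api_paths, encoded=False)   # <path, i>
```

Keying a dictionary by the encoded path makes the later lookup of a call target a single hash-table access.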
As an alternative to step 202, the server stores the path of each extracted API function (rather than the encoding result of the path) in the API function library together with the identifier of the API function.
In this case each API function can be represented in the API function library as a two-tuple of its path and its identifier. Let the API function library store API function i (i is the sequence number of the API function, 1≤i≤I, and I is the number of API functions extracted); an optional data structure is:
<path of API function i, i>.
Step 203: the server pulls malicious samples from the malicious sample library.
The malicious sample library can be connected to existing sources of malicious samples, for example by docking with virus databases of different families, including:
1) System virus databases. System viruses in the malicious sample library generally differ according to the system; their prefixes include Win32, PE, Win95, W32 and W95.
2) Worm virus databases. The prefix of worm viruses is Worm. Their common characteristic is that they propagate through networks or system vulnerabilities; a large proportion of worm viruses send out infected mail and congest the network.
3) Script virus databases. The prefix of script viruses is Script. Their common characteristic is that they are written in a script language and propagate through web pages.
4) Backdoor virus databases. The prefix of backdoor viruses is Backdoor. Their common characteristic is that they propagate through the network and open a backdoor in the system.
5) Destructive-program virus databases. The prefix of destructive-program viruses is Harm. Their common characteristic is that they present an attractive icon to lure the user into clicking; when the user clicks, the virus directly damages the user terminal.
For example, according to the real-time requirements of virus scanning, the malicious sample library pulls malicious samples containing viruses from the virus databases of the different families at a weekly, daily or hourly frequency, either pulling from all family databases in a unified manner or pulling from each family's database individually according to its update frequency.
Step 204: the server disassembles the malicious sample containing a virus to obtain disassembled code.
To disassemble a malicious sample, an executable file is first extracted from it. The format of the executable file differs with the operating system on which it runs: executable files are in exe format in the Windows operating system, in elf format in the Linux operating system, and in dex format, elf format and so on in the Android operating system. The executable file is then disassembled. Referring to Fig. 8, Fig. 8 is an optional schematic diagram, provided by an embodiment of the present invention, of disassembling an executable file. The result of disassembly includes:
1) Uninitialized data (BSS, Block Started by Symbol) segment: a memory region for storing the uninitialized global variables of the program;
2) Data segment: a memory region for storing the initialized global variables of the program, including a variable data segment and an immutable data segment.
3) Code segment (code segment/text segment): a memory region generally used to store the executed code (statements).
4) Heap: stores the memory segments dynamically allocated while the process runs; its size is not fixed and it can grow or shrink dynamically. When the process calls functions such as malloc to allocate memory, the newly allocated memory is dynamically added to the heap (the heap grows); when memory is released with functions such as free, the released memory is removed from the heap.
5) Stack: the stack is created when the process runs, and each process has its own process stack. The stack stores the local variables temporarily created by the program, that is, the variables defined in functions, excluding variables of static type.
Step 205: the server splits the disassembled code to obtain multiple code blocks of the malicious sample.
After disassembly is complete, the code segment of the executable file is traversed and divided into code blocks. Referring to Fig. 8, Fig. 8 also shows an optional way, in an embodiment of the present invention, of dividing the code segment of an executable file into code blocks. Taking the splitting of the code segment as an example, the disassembled code (the code segment shown in Fig. 8) is split into code blocks at the granularity of a function or of a path of a predetermined level, using one of the following splitting modes:
Mode 1) Splitting the disassembled code at function granularity to obtain code blocks
The disassembled code segment of the malicious sample is traversed and split at function granularity, giving the multiple functions that make up the disassembled code (here a function is equivalent to a code block). Of course, the code segment can also be split at a granularity of two or more functions to form the code blocks that make up the code segment (each code block then contains two or more functions).
A function is the basic logical unit of a code segment, and each function contains a complete piece of processing logic. Splitting the code segment at function granularity makes the disassembled code easy to split on the one hand, and on the other hand fully preserves the logic inside the disassembled code.
Mode 2) Splitting at the granularity of paths of different levels of the code tree to obtain code blocks
Referring to Fig. 9-1, Fig. 9-1 is an optional schematic diagram, provided by an embodiment of the present invention, of forming code blocks by splitting the disassembled code based on a code tree. According to the paths of preset levels in the code tree (including first-level, second-level and third-level paths), the code under each first-level path is divided into a separate code block; of course, the code under each second-level path beneath a first-level path can instead be divided into a separate code block.
Referring to Fig. 9-2, Fig. 9-2 is another optional schematic diagram, provided by an embodiment of the present invention, of forming code blocks by splitting the disassembled code based on a code tree. For a malicious sample that is an application program running on the Android operating system, the executable file in Dex format is extracted from the application program and disassembled to obtain disassembled code described in the Smali language, and the disassembled code is divided into code blocks. For example, class-level paths can be chosen, so that the Dex is divided into code blocks corresponding to class-level paths, each code block corresponding to one class in the Dex.
In Fig. 9-2, each code block corresponds to one class in the disassembled code, specifically:
Code block 1: com.android.internal.app.ActionBarImpl,
Code block 2: com.android.internal.app.AlertActivity,
Code block 3: com.android.internal.app.AlertController,
……
Of course, the server can also split the disassembled code by paths of any other level. For example, referring to Fig. 9-3, Fig. 9-3 is a further optional schematic diagram, provided by an embodiment of the present invention, of forming code blocks by splitting the disassembled code based on a code tree. The disassembled code can be split according to the first four levels of path in the code tree shown in Fig. 9-3, each resulting code block corresponding to one four-level path in the code tree, specifically:
Code block 1: com.android.internal.app,
Code block 2: com.android.internal.appwidget,
Code block 3: com.android.internal.backup,
……
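Splitting by path level can be sketched in Python as grouping disassembled methods under a prefix of their class path. The function name split_by_level, the representation of a method as a (class path, body) pair and the placeholder bodies are assumptions for illustration only.

```python
from collections import defaultdict

def split_by_level(methods, level=None):
    """Group disassembled methods into code blocks keyed by a path prefix.

    methods: iterable of (class_path, method_body) pairs with dotted class paths.
    level:   number of leading path components to keep; None keeps the whole
             class path, which gives one code block per class.
    """
    blocks = defaultdict(list)
    for class_path, body in methods:
        parts = class_path.split(".")
        key = ".".join(parts if level is None else parts[:level])
        blocks[key].append(body)
    return dict(blocks)

# Placeholder method bodies; real input would be Smali code per method.
methods = [
    ("com.android.internal.app.ActionBarImpl", "<method 1>"),
    ("com.android.internal.app.AlertActivity", "<method 2>"),
    ("com.android.internal.appwidget.IAppWidgetHost", "<method 3>"),
]
per_class = split_by_level(methods)             # class-level code blocks
per_package = split_by_level(methods, level=4)  # four-level path code blocks
```

A coarser level yields fewer, larger code blocks, so each code block feature then summarizes more API calls.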
Step 206: the server traverses each code block of the disassembled code to obtain the function calls made in each code block, compares the target path of each function call with the paths of the application program interface functions in the API function library, and determines the application program interface functions called in each code block and the number of times each is called.
Depending on the data structure used to store API functions in the API function library, the server can compare the target paths of the function calls in code block j (1≤j≤J, where J is the number of code blocks obtained by splitting the disassembled code) with the paths of the application program interface functions in the following ways:
Mode 1) API functions are stored in the API function library in the form <path of API function, i>. The server matches the target path of a function call detected in code block j against the path of API function i in the API function library field by field. When every field of the path matches, the function call currently detected in code block j is determined to be a call to API function i, and the call count of API function i in code block j is incremented by 1.
Mode 2) API functions are stored in the API function library in the form <encoding result (such as hash value) of the path of API function, i>. The server encodes the target path of a function call detected in code block j (using the same encoding as for the API function paths in the API function library, for example the same hash algorithm) and compares the result with the encoding result of the path of API function i in the API function library. If the encoding results are identical, the paths are identical; the function call currently detected in code block j is then determined to be a call to API function i, and the call count of API function i in code block j is incremented by 1.
Clearly, judging whether paths are identical by comparing their encoding results is more efficient than comparing each field of the paths one by one, and the gain is especially marked when the paths of the API functions are long.
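Mode 2 can be sketched in Python as a dictionary lookup on the encoded call target. The names encode_path and count_api_calls, the SHA-256-based encoding and the short placeholder paths are hypothetical; the sequence numbers 12 and 15 are arbitrary illustrative values.

```python
import hashlib
from collections import Counter

def encode_path(path):
    """Hypothetical encoding: first 8 bytes of SHA-256 as an integer."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def count_api_calls(call_targets, api_library):
    """Count, per API sequence number, the API calls made in one code block."""
    counts = Counter()
    for target in call_targets:
        seq = api_library.get(encode_path(target))
        if seq is not None:          # targets outside the library are not API calls
            counts[seq] += 1
    return counts

# Library in the form <encoding of path, i>, with placeholder paths.
api_library = {encode_path("Lx/A;->f()V"): 12, encode_path("Lx/B;->g()V"): 15}
block_calls = ["Lx/A;->f()V", "Lx/A;->f()V", "Lx/B;->g()V",
               "Lapp/Local;->h()V", "Lx/A;->f()V"]
counts = count_api_calls(block_calls, api_library)
```

Each call target is encoded once and looked up in constant time, which reflects the efficiency argument made above for comparing encoding results instead of path fields.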
Step 207: a corresponding code block feature is built from the application program interface functions called in the code block and the number of times each is called.
In one embodiment, for each code block, the identifier of each API function called in the code block and the number of times that function is called in the code block form one characteristic element, so that each API function called in the code block yields one characteristic element. The characteristic elements of all API functions called in the code block form a set, and the set is encoded to form the code block feature.
Continuing with code block j as an example, each called function k in code block j (1≤k≤K, where K is the number of different API functions called in code block j) forms characteristic element k, recorded in the form <sequence number of API function k, number of times API function k is called in code block j>. The set for code block j then has the form {<sequence number of API function k, number of times API function k is called in code block j>; 1≤k≤K}. The set is encoded (for example with a hash algorithm), and the encoding result is used as the code block feature.
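A minimal Python sketch of building a code block feature follows. It assumes the set is serialized in sorted order before hashing so that the result does not depend on traversal order, and it truncates SHA-256 to 32 bits; both choices, and the name code_block_feature, are illustrative assumptions rather than the method's prescribed encoding.

```python
import hashlib

def code_block_feature(call_counts):
    """Encode the set {(sequence number, call count)} of one code block.

    call_counts: mapping from API sequence number to call count.
    """
    elements = sorted(call_counts.items())            # canonical, order-independent
    canonical = ";".join(f"{seq},{n}" for seq, n in elements)
    digest = hashlib.sha256(canonical.encode("ascii")).digest()
    return int.from_bytes(digest[:4], "big")          # hypothetical 32-bit feature

feature = code_block_feature({12: 3, 15: 1, 22: 1})
```

Because the call counts are part of the encoded set, two code blocks that call the same API functions a different number of times receive different features.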
Step 208: the code block features of the code blocks of the malicious sample are merged to form the virus signature of the malicious sample, which is stored in the virus signature library.
Continuing with code block j as an example (1≤j≤J, where J is the number of code blocks obtained by splitting the disassembled code of the malicious sample), let its code block feature be code block feature j; the virus signature of the virus carried by the malicious sample can then take the form: {<code block feature 1>; <code block feature 2>; ……; <code block feature J>}.
The foregoing steps 204 to 207 are the processing flow for computing the virus signature of the virus carried by one malicious sample pulled from the malicious sample library. For the multiple malicious samples in the malicious sample library, the virus-signature computation of steps 204 to 207 is performed in a loop. Referring to Fig. 7-2, Fig. 7-2 is an optional schematic flowchart, provided by an embodiment of the present invention, of computing the virus signatures of the viruses carried by the malicious samples in the malicious sample library: a malicious sample is extracted at random from the malicious sample library and the virus signature of the virus it carries is computed according to steps 204 to 207, until all malicious samples in the malicious sample library have been traversed.
The server assigns an identifier (sequence number VID) to the virus corresponding to each computed virus signature, and stores all virus signatures in the virus signature library as two-tuples of the form <virus sequence number, virus signature>.
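The loop over the malicious sample library can be sketched in Python as below. The function build_signature_library, the compute_signature callback standing in for steps 204 to 207, the dictionary-based sample records and the choice of a frozenset to hold the merged code block features are all assumptions for illustration; VIDs are simply assigned in traversal order.

```python
def build_signature_library(malicious_samples, compute_signature):
    """Return {VID: virus signature} for every sample in the sample library.

    compute_signature: callable that returns the code block features of one
    sample (a stand-in for disassembly, splitting and feature building).
    """
    library = {}
    for sample in malicious_samples:
        vid = len(library) + 1                       # sequence number VID
        library[vid] = frozenset(compute_signature(sample))
    return library

# Stand-in samples whose code block features are already known.
samples = [
    {"name": "A", "block_features": [1800939131, 1369398484, 2596230670]},
    {"name": "C", "block_features": [42, 43]},
]
virus_library = build_signature_library(samples, lambda s: s["block_features"])
```

Storing each signature as a set makes the later search for shared code block features a set intersection.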
In addition, it should be noted that the foregoing takes as an example the case where the API functions extracted from the terminal (e.g., the API functions provided by the terminal operating system and/or the third-party API functions in the terminal) are stored in a function library. Because the extracted functions are pre-stored in the API function library, calls to API functions in a code block can be located quickly from the stored API functions when the code blocks of a malicious sample's disassembled code are later traversed, which ensures processing efficiency.
However, it will be appreciated that, when the server has sufficient computing capability, the step of maintaining the API function library can be omitted in embodiments of the present invention. Only when it needs to detect calls to API functions in a code block does the server extract the API functions from the terminal and store them (including the encoding results of their paths and their sequence numbers) in a local cache of the server. In other words, no separately maintained function library is needed, so the paths of the API functions are always up to date, which avoids virus signatures lagging behind when the API functions in the terminal change.
Step 209: the server extracts the signature of the sample to be detected, compares it with the virus signature to obtain the similarity between the signatures, and judges based on the similarity whether the sample to be detected carries a virus.
Referring to Fig. 7-3, Fig. 7-3 is an optional schematic flowchart, provided by an embodiment of the present invention, of detecting whether a sample to be detected carries a virus; the description below refers to Fig. 7-3.
First, for any sample to be detected, the server extracts the corresponding signature from the sample, denoted df.
Specifically, the server extracts the executable file from the sample to be detected and disassembles it to obtain disassembled code, which is split in the same ways as the disassembled code of a malicious sample: mode 1) splitting at function granularity to obtain code blocks; mode 2) splitting at the granularity of paths of different levels of the code tree to obtain code blocks.
The server traverses each code block of the disassembled code to obtain the function calls made in each code block, compares the target path of each function call with the paths of the application program interface functions in the API function library, and determines the application program interface functions called in each code block and the number of times each is called.
For example, depending on the data structure used to store API functions in the API function library, the server can compare the target paths of the function calls in code block j (1≤j≤J, where J is the number of code blocks obtained by splitting the disassembled code) with the paths of the application program interface functions in the following ways:
Mode 1) API functions are stored in the API function library in the form <path of API function, i>. The server matches the target path of a function call detected in code block j against the path of API function i in the API function library field by field. When every field of the path matches, the function call currently detected in code block j is determined to be a call to API function i, and the call count of API function i in code block j is incremented by 1.
Mode 2) API functions are stored in the API function library in the form <encoding result (such as hash value) of the path of API function, i>. The server encodes the target path of a function call detected in code block j (using the same encoding as for the API function paths in the API function library, for example the same hash algorithm) and compares the result with the encoding result of the path of API function i in the API function library. If the encoding results are identical, the paths are identical; the function call currently detected in code block j is then determined to be a call to API function i, and the call count of API function i in code block j is incremented by 1.
A corresponding code block feature is built from the application program interface functions called in the code block and the number of times each is called. For each code block, the identifier of each API function called in the code block and the number of times that function is called in the code block form one characteristic element, so that each API function called in the code block yields one characteristic element; the characteristic elements of all API functions called in the code block form a set, and the set is encoded to form the code block feature. The code block features of the code blocks of the sample to be detected are merged to form the signature of the sample to be detected.
Second, a virus signature and its sequence number are extracted from the virus signature library; let the currently extracted virus signature be vf and its sequence number be VID.
Third, the signature df of the sample to be detected (for example, a software installation package in the apk format of the Android operating system) is compared with the virus signature vf from the virus signature library to obtain the number S of code block features that df and vf have in common.
A concrete example of computing the similarity follows. Referring to Fig. 11, Fig. 11 is an optional processing schematic, provided by an embodiment of the present invention, of computing signature similarity.
In Fig. 11, assume that splitting the disassembled code of a virus-carrying malicious sample A yields 3 code blocks, denoted A1, A2 and A3.
The API functions called in code block A1 and the corresponding call counts are recorded as two-tuples (API function sequence number, call count). The API functions called by code block A1 and their call counts are then expressed as the set {(12,3), (15,1), (22,1)}, and computing the hash of this set gives the code block feature of code block A1: 1800939131.
Similarly, the API functions called by code block A2 and their call counts are expressed as the set {(56,90)}, and computing the hash gives the code block feature of code block A2: 1369398484.
Similarly, the API functions called by code block A3 and their call counts are expressed as the set {(32,54), (123,34), (132,36), (645,1)}, and computing the hash gives the code block feature of code block A3: 2596230670.
Merging the code block features of code blocks A1, A2 and A3 gives the virus signature of malicious sample A: A = {1800939131, 1369398484, 2596230670}.
Assume that the sample B to be detected includes 4 code blocks, denoted B1, B2, B3 and B4.
The API functions called by code block B1 and their call counts are expressed as the set {(12,3), (15,1), (22,1)}, and computing the hash gives the code block feature of code block B1: 1800939131.
The API functions called by code block B2 and their call counts are expressed as the set {(32,3), (122,3)}, and computing the hash gives the code block feature of code block B2: 4111055178.
The API functions called by code block B3 and their call counts are expressed as the set {(56,91)}, and computing the hash gives the code block feature of code block B3: 1348286179.
The API functions called by code block B4 and their call counts are expressed as the set {(56,35), (68,9)}, and computing the hash gives the code block feature of code block B4: 281916613.
Therefore, the signature of sample B is B = {1800939131, 4111055178, 1348286179, 281916613}.
Comparing the virus signature of sample A with the signature of the sample B to be detected, the code block features they have in common are {1800939131}, so the similarity similarity(A, B) can be computed as follows:
similarity(A, B) = count({1800939131}) / count(A) = 1/3 ≈ 0.33.
The similarity between the signature df of the sample to be detected and the virus signature vf in the virus signature library can be expressed as S/M, where M is the number of code block features included in the virus signature vf. If the similarity exceeds the similarity threshold N/M (that is, S > N), the sample to be detected carries the virus VID.
If the similarity does not exceed the similarity threshold, the API function calls of the sample to be detected differ considerably from those of the virus, and the comparison continues with other virus signatures extracted from the virus signature library. If the similarity is below the similarity threshold for every virus signature, the sample to be detected carries no virus and is a normal sample.
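The similarity computation and the threshold decision can be sketched in Python using the signatures of samples A and B from the worked example. The function names similarity and detect, the VID value 7 and the threshold values are hypothetical; the comparison uses a strict "exceeds the threshold" test as an assumption.

```python
def similarity(df, vf):
    """S / M: shared code block features over features in the virus signature."""
    vf = set(vf)
    if not vf:
        return 0.0
    return len(set(df) & vf) / len(vf)

def detect(df, virus_library, threshold):
    """Return the VID of the first virus whose similarity exceeds the threshold,
    or None if the sample matches no virus signature."""
    for vid, vf in virus_library.items():
        if similarity(df, vf) > threshold:
            return vid
    return None

A = {1800939131, 1369398484, 2596230670}              # virus signature vf
B = {1800939131, 4111055178, 1348286179, 281916613}   # signature df of sample B
print(round(similarity(B, A), 2))  # 0.33
```

With a hypothetical threshold of 0.5, sample B would be treated as normal with respect to virus A, whereas a threshold of 0.3 would report it as carrying that virus.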
The functional structure to aforementioned viral condition code processing unit is illustrated again, and referring to Figure 12, Figure 12 is of the invention real One optional structural representation of the condition code processing unit 20 of example offer is provided, including:Compilation cutting unit 21, function call Unit 22, construction feature unit 23 and feature combining unit 24, illustrate separately below.
Compilation cutting unit 21, for carrying out dis-assembling treatment to carrying virulent malice sample, the anti-remittance that will be obtained Compiling code carries out splitting the multiple code blocks for obtaining malice sample.
For example, for the dis-assembling code that will be obtained carries out splitting the multiple code blocks for obtaining malice sample, compilation point Path of the unit 21 according to the code tree of dis-assembling code is cut, the path of intended level is granulometric dis-assembling with code tree Code obtains multiple code blocks, or, obtain multiple code blocks according to function by granulometric dis-assembling code of function.
Function calling cell 22, the function call performed in code block for traveling through code block to obtain, by function call Destination path compares with the path of application program interface function, determines the application program interface function called in code block, and The number of times of calls application interface function.
With regard to function calling cell 72 by the path of the destination path of function call and application program interface function comparatively, Function calling cell 22 is used for the application program interface function provided in the operating system for obtain terminal, and/or the 3rd in terminal The application program interface function of side, is each application program interface function allocation identification (such as sequence number), by the target of function call Path is compared with the path of acquired application program interface function, will each field correspondence in path be compared and be No consistent record, the mark and corresponding call number of the application program interface function that record is called in code block.
With regard to function calling cell 72 by the path of the destination path of function call and application program interface function comparatively, Function calling cell 22, is additionally operable to the application program interface function provided in the operating system to acquired terminal, and/or eventually The path of third-party application program interface function is encoded in end, and to application program interface function allocation identification, by letter The coding result of the destination path that number is called, the coding result in the path of application program interface function is compared in and function storehouse Compared with, if coding result is consistent, currently detected function call is illustrated for application program interface function is called, record in code block In the mark and corresponding call number of application program interface function called.
With regard to function calling cell 72 by the path of the destination path of function call and application program interface function comparatively, Function calling cell 22 is additionally operable to prestore application program interface function in function library, for example, by acquired application program The coding result in the path of interface function and the mark for the distribution of corresponding application programs interface function are stored in function library.In letter During the number traversal code block of call unit 72, by the coding result of the destination path of function call, application program connects in and function storehouse The coding result in the path of mouth function is compared, if coding result is consistent, illustrates that currently detected function call is application Program interface functions are called, and are recorded the mark of the application program interface function called in code block and are called accordingly secondary Number.
Construction feature unit 23, for based on the application program interface function called in code block and calling and applying journey The number of times of sequence interface function builds corresponding code block feature.
For piece code block feature, construction feature unit 23 is additionally operable to the application interface letter to be called in code block Several marks and the call number of corresponding application programs interface function form characteristic element, each based on what is called in code block The corresponding characteristic element of application program interface function forms set, and carrying out coding to set forms code block feature.
The feature merging unit 24 is configured to merge the code block features of the code blocks of the malicious sample to form the virus signature of the malicious sample.
The sample detection unit 25 is configured to calculate the signature of a sample to be detected, compare the signature of the sample to be detected with the virus signature to obtain the similarity between the signatures, and judge, based on the similarity, whether the sample to be detected carries the virus.
To obtain the similarity, the sample detection unit 25 compares the code block features included in the signature of the sample to be detected with the code block features included in the virus signature, obtains the code block features shared by the sample to be detected and the malicious sample, and calculates the ratio of the number of shared code block features to the number of code block features included in the virus signature.
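The ratio can be illustrated with a short sketch. Representing a signature as a set of code block features and the decision threshold of 0.8 are assumptions introduced here; the text does not specify a threshold value.

```python
def signature_similarity(sample_features, virus_features):
    """Ratio of shared code block features to the features in the virus signature."""
    virus = set(virus_features)
    if not virus:
        return 0.0  # an empty virus signature cannot match anything
    shared = set(sample_features) & virus
    return len(shared) / len(virus)


def carries_virus(sample_features, virus_features, threshold=0.8):
    """Judge the sample by similarity; the threshold is a hypothetical value."""
    return signature_similarity(sample_features, virus_features) >= threshold


virus_signature = {"f1", "f2", "f3", "f4"}
sample_signature = {"f1", "f2", "f9"}
print(signature_similarity(sample_signature, virus_signature))  # 0.5
print(carries_virus(sample_signature, virus_signature))         # False
```

Note that the denominator is the size of the virus signature, not of the union, so extra code blocks in the sample (for example, benign code wrapped around the payload) do not lower the similarity.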
To calculate the signature of the sample to be detected, the sample detection unit 25 is configured to compare the destination path of each function call in each code block of the sample to be detected with the paths of the predetermined API functions; to build the corresponding code block feature based on the API functions that the comparison shows are called in the code block and the number of times they are called; and to merge the code block features of the sample to be detected to form the signature of the sample to be detected.
In summary, the embodiments of the present invention have the following advantages:
1) The processing can be completed efficiently using the computing power of a device (such as a terminal or a server);
2) The signature is built from the features of the API function calls of the malicious sample, whereas the related art builds it from the hash value of the malicious sample. Because the features of the API function calls accurately reflect the semantics of the malicious sample in achieving its malicious purpose, and are not affected by changes in the hash value or byte count of the malicious sample, broad-spectrum virus detection can be achieved;
3) Because the API calls in a malicious sample are relatively stable, a signature built from the features of the API function calls can detect a virus after it has evolved, which avoids the lag of the signature-based virus detection provided by the related art;
4) The API function calls of each code block of the sample are extracted and encoded as features. This approach takes the semantics and behavior of the program into account, better resists the interference introduced by compiler optimization strategies and by the virus author's modifications to the source code, greatly improves the broad-spectrum coverage of the signature, and reduces the difficulty of virus detection and removal.
Those skilled in the art will understand that all or some of the steps of the above method embodiments may be carried out by hardware under the instruction of a program. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as a flash memory device, random access memory (RAM), read-only memory (ROM), a magnetic disk, or an optical disc.
Alternatively, if the integrated unit of the present invention is implemented as a software function module and sold or used as a standalone product, it may also be stored in a computer-readable storage medium. On this understanding, the technical solution of the embodiments of the present invention, in essence or in the part that contributes over the related art, may be embodied as a software product. The computer software product is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a flash memory device, RAM, ROM, a magnetic disk, or an optical disc.
The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive of within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (18)

1. A virus signature processing method, characterized by comprising:
disassembling a malicious sample that carries a virus, and splitting the resulting disassembled code to obtain a plurality of code blocks of the malicious sample;
traversing the code blocks to obtain the function calls executed in each code block, comparing the destination path of each function call with the paths of application program interface functions, and determining the application program interface functions called in the code block and the number of times the application program interface functions are called;
building a corresponding code block feature based on the application program interface functions called in the code block and the number of times the application program interface functions are called;
merging the code block features of the code blocks of the malicious sample to form the virus signature of the malicious sample.
2. The method of claim 1, characterized in that splitting the resulting disassembled code to obtain the plurality of code blocks of the malicious sample comprises:
according to the paths of the code tree of the disassembled code, splitting the disassembled code at the granularity of paths of a predetermined level in the code tree to obtain the plurality of code blocks; or splitting the disassembled code by function, at function granularity, to obtain the plurality of code blocks.
3. The method of claim 1, characterized in that comparing the destination path of the function call with the paths of the application program interface functions comprises:
acquiring the application program interface functions provided by the operating system of a terminal and/or the third-party application program interface functions in the terminal, and comparing the destination path of the function call with the paths of the acquired application program interface functions.
4. The method of claim 3, characterized in that comparing the destination path of the function call with the paths of the acquired application program interface functions comprises:
encoding the paths of the acquired application program interface functions provided by the operating system of the terminal and/or of the third-party application program interface functions in the terminal, and comparing the encoding result of the destination path of the function call with the encoding results of the paths of the application program interface functions.
5. The method of claim 4, characterized in that comparing the encoding result of the destination path of the function call with the encoding results of the paths of the application program interface functions comprises:
storing, in a function library, the encoding results of the paths of the acquired application program interface functions and the identifiers allocated to the corresponding application program interface functions, and comparing the encoding result of the destination path of the function call with the encoding results of the paths of the application program interface functions in the function library.
6. The method of claim 1, characterized in that building the corresponding code block feature based on the application program interface functions called in the code block and the number of times the application program interface functions are called comprises:
forming a characteristic element from the identifier of each application program interface function called in the code block and the call count of that application program interface function, forming a set from the characteristic elements corresponding to the application program interface functions called in the code block, and encoding the set to form the code block feature.
7. The method of claim 1, characterized by further comprising:
calculating the signature of a sample to be detected, comparing the signature of the sample to be detected with the virus signature to obtain the similarity between the signatures, and judging, based on the similarity, whether the sample to be detected carries the virus.
8. The method of claim 7, characterized in that comparing the signature of the sample to be detected with the virus signature to obtain the similarity between the signatures comprises:
comparing the code block features included in the signature of the sample to be detected with the code block features included in the virus signature, obtaining the code block features shared by the sample to be detected and the malicious sample, and calculating the ratio of the number of shared code block features to the number of code block features included in the virus signature.
9. The method of claim 7, characterized in that calculating the signature of the sample to be detected comprises:
comparing the destination path of each function call in each code block of the sample to be detected with the paths of the application program interface functions; building the corresponding code block feature based on the application program interface functions that the comparison shows are called in the code block and the call counts of those application program interface functions; and merging the code block features of the sample to be detected to form the signature of the sample to be detected.
10. A virus signature processing device, characterized by comprising:
a disassembly and splitting unit, configured to disassemble a malicious sample that carries a virus, and to split the resulting disassembled code to obtain a plurality of code blocks of the malicious sample;
a function calling unit, configured to traverse the code blocks to obtain the function calls executed in each code block, to compare the destination path of each function call with the paths of application program interface functions, and to determine the application program interface functions called in the code block and the number of times the application program interface functions are called;
a feature construction unit, configured to build a corresponding code block feature based on the application program interface functions called in the code block and the number of times the application program interface functions are called;
a feature merging unit, configured to merge the code block features of the code blocks of the malicious sample to form the virus signature of the malicious sample.
11. The device of claim 10, characterized in that
the disassembly and splitting unit is further configured to split the disassembled code, according to the paths of the code tree of the disassembled code, at the granularity of paths of a predetermined level in the code tree to obtain the plurality of code blocks, or to split the disassembled code by function, at function granularity, to obtain the plurality of code blocks.
12. The device of claim 10, characterized in that
the function calling unit is further configured to acquire the application program interface functions provided by the operating system of a terminal and/or the third-party application program interface functions in the terminal, and to compare the destination path of the function call with the paths of the acquired application program interface functions.
13. The device of claim 12, characterized in that
the function calling unit is further configured to encode the paths of the acquired application program interface functions provided by the operating system of the terminal and/or of the third-party application program interface functions in the terminal, and to compare the encoding result of the destination path of the function call with the encoding results of the paths of the application program interface functions.
14. The device of claim 13, characterized in that
the function calling unit is further configured to store, in a function library, the encoding results of the paths of the acquired application program interface functions and the identifiers allocated to the corresponding application program interface functions, and to compare the encoding result of the destination path of the function call with the encoding results of the paths of the application program interface functions in the function library.
15. The device of claim 10, characterized in that
the feature construction unit is further configured to form a characteristic element from the identifier of each application program interface function called in the code block and the call count of that application program interface function, to form a set from the characteristic elements corresponding to the application program interface functions called in the code block, and to encode the set to form the code block feature.
16. The device of claim 10, characterized by further comprising:
a sample detection unit, configured to calculate the signature of a sample to be detected, to compare the signature of the sample to be detected with the virus signature to obtain the similarity between the signatures, and to judge, based on the similarity, whether the sample to be detected carries the virus.
17. The device of claim 16, characterized in that
the sample detection unit is further configured to compare the code block features included in the signature of the sample to be detected with the code block features included in the virus signature, to obtain the code block features shared by the sample to be detected and the malicious sample, and to calculate the ratio of the number of shared code block features to the number of code block features included in the virus signature.
18. The device of claim 17, characterized in that
the sample detection unit is further configured to compare the destination path of each function call in each code block of the sample to be detected with the paths of the application program interface functions; to build the corresponding code block feature based on the application program interface functions that the comparison shows are called in the code block and the call counts of those application program interface functions; and to merge the code block features of the sample to be detected to form the signature of the sample to be detected.
CN201710035588.8A 2017-01-18 2017-01-18 Virus characteristic code processing method and device Active CN106803040B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710035588.8A CN106803040B (en) 2017-01-18 2017-01-18 Virus characteristic code processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710035588.8A CN106803040B (en) 2017-01-18 2017-01-18 Virus characteristic code processing method and device

Publications (2)

Publication Number Publication Date
CN106803040A true CN106803040A (en) 2017-06-06
CN106803040B CN106803040B (en) 2021-08-10

Family

ID=58984570

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710035588.8A Active CN106803040B (en) 2017-01-18 2017-01-18 Virus characteristic code processing method and device

Country Status (1)

Country Link
CN (1) CN106803040B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107678968A (en) * 2017-10-18 2018-02-09 北京奇虎科技有限公司 Sample extraction method, apparatus, computing device and the storage medium of source code function
CN108334778A (en) * 2017-12-20 2018-07-27 北京金山安全管理系统技术有限公司 Method for detecting virus, device, storage medium and processor
CN109165514A (en) * 2018-10-16 2019-01-08 北京芯盾时代科技有限公司 A kind of risk checking method
CN109492396A (en) * 2018-11-12 2019-03-19 杭州安恒信息技术股份有限公司 Malware Gene Detecting method and apparatus based on semantic segmentation
CN110647747A (en) * 2019-09-05 2020-01-03 四川大学 False mobile application detection method based on multi-dimensional similarity
CN112148305A (en) * 2020-10-28 2020-12-29 腾讯科技(深圳)有限公司 Application detection method and device, computer equipment and readable storage medium
CN112579828A (en) * 2019-09-30 2021-03-30 奇安信安全技术(珠海)有限公司 Feature code processing method, device and system, storage medium and electronic device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103136475A (en) * 2011-11-29 2013-06-05 姚纪卫 Method and device for detecting computer viruses
CN103970523A (en) * 2013-02-05 2014-08-06 中国移动通信集团广东有限公司 Method and device for recognition of JAVA compiling destination file
CN104391798A (en) * 2014-12-09 2015-03-04 北京邮电大学 Software feature information extracting method
CN104751052A (en) * 2013-12-30 2015-07-01 南京理工大学常熟研究院有限公司 Dynamic behavior analysis method for mobile intelligent terminal software based on support vector machine algorithm
CN105184160A (en) * 2015-07-24 2015-12-23 哈尔滨工程大学 API object calling relation graph based method for detecting malicious behavior of application program in Android mobile phone platform
CN106709349A (en) * 2016-12-15 2017-05-24 中国人民解放军国防科学技术大学 Multi-dimension behavior characteristic-based malicious code classification method

Also Published As

Publication number Publication date
CN106803040B (en) 2021-08-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant