CN106909841A - A kind of method and device for judging viral code - Google Patents
- Publication number
- CN106909841A CN201510971165.8A
- Authority
- CN
- China
- Prior art keywords
- decompiling
- information structure
- function
- threshold value
- virtual machine
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/562—Static detection
- G06F21/563—Static detection by source code analysis
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Virology (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Stored Programmes (AREA)
Abstract
This application discloses a method and device for identifying virus code. The method includes: decompiling the virtual machine executable file of an application to obtain the decompiled function information structure; parsing the decompiled function information structure and extracting the function instruction sequence it contains; determining the edit distance between the extracted function instruction sequence and the function instruction sequence of a preset virus code; and, if the determined edit distance is less than a preset threshold, determining that the virtual machine executable file of the application contains virus code. With this scheme, it is possible to judge accurately whether an application on an intelligent terminal is a program that evades detection by modifying the character strings referenced by virus code, which helps ensure the security of the intelligent terminal.
Description
Technical field
The present application relates to the field of intelligent terminal security, and in particular to a method and device for identifying virus code.
Background
With the development of technology, intelligent terminals offer more and more functions. For example, mobile phones have evolved from traditional GSM and TDMA digital handsets into smartphones that can process multimedia resources and provide a variety of information services such as web browsing, teleconferencing, and e-commerce. However, increasingly varied malicious code attacks on mobile phones, and increasingly serious personal data security problems, have followed, and more and more smartphone users suffer from mobile phone viruses.
At present, antivirus techniques for intelligent terminals mainly scan the character strings of the virtual machine executable file: the extracted string features are matched against the features in a virus database. However, some viruses (such as Trojans) can easily evade detection by modifying the character strings that the virus code references, so the security of the intelligent terminal cannot be guaranteed.
Summary of the invention
The embodiments of the present application provide a method and device for identifying virus code that overcome the above problem or at least partially solve it.
The embodiments of the present application adopt the following technical solutions:
A method for identifying virus code, including:
decompiling the virtual machine executable file of an application to obtain the decompiled function information structure;
parsing the decompiled function information structure and extracting the function instruction sequence in the decompiled function information structure;
determining the edit distance between the extracted function instruction sequence and the function instruction sequence of a preset virus code;
judging whether the determined edit distance is less than a preset threshold, and if it is, determining that the virtual machine executable file of the application contains virus code.
Preferably, before judging whether the determined edit distance is less than the preset threshold, the method further includes:
determining the total number of characters in the extracted function instruction sequence;
taking the product of the total number of characters in the function instruction sequence and a preset value as the preset threshold, where the preset value is between 0 and 1.
Preferably, decompiling the virtual machine executable file of the application to obtain the decompiled function information structure specifically includes:
parsing the virtual machine executable file according to the virtual machine executable file format to obtain the function information structure of each class;
determining, according to the fields in the function information structure, the position and size of the functions in the virtual machine executable file, and obtaining the decompiled function information structure.
A method for identifying virus code, including:
decompiling the virtual machine executable file of an application to obtain the decompiled function information structure;
parsing the decompiled function information structure and extracting the mnemonic sequence in the decompiled function information structure;
determining the edit distance between the extracted mnemonic sequence and the mnemonic sequence of a preset virus code;
judging whether the determined edit distance is less than a preset threshold, and if it is, determining that the virtual machine executable file of the application contains virus code.
Preferably, before judging whether the determined edit distance is less than the preset threshold, the method further includes:
determining the total number of characters in the extracted mnemonic sequence;
taking the product of the total number of characters in the mnemonic sequence and a preset value as the preset threshold, where the preset value is between 0 and 1.
A device for identifying virus code, including:
a decompiling unit, for decompiling the virtual machine executable file of an application to obtain the decompiled function information structure;
an extraction unit, for parsing the decompiled function information structure and extracting the function instruction sequence in the decompiled function information structure;
an edit distance determining unit, for determining the edit distance between the extracted function instruction sequence and the function instruction sequence of a preset virus code;
a virus code determining unit, for judging whether the determined edit distance is less than a preset threshold and, when it is, determining that the virtual machine executable file of the application contains virus code.
Preferably, the device further includes:
a character count determining unit, for determining the total number of characters in the extracted function instruction sequence before it is judged whether the determined edit distance is less than the preset threshold;
a preset threshold determining unit, for taking the product of the total number of characters in the function instruction sequence and a preset value as the preset threshold, where the preset value is between 0 and 1.
Preferably, the decompiling unit includes:
an information structure obtaining unit, for parsing the virtual machine executable file according to the virtual machine executable file format to obtain the function information structure of each class;
a function information structure obtaining unit, for determining, according to the fields in the function information structure, the position and size of the functions in the virtual machine executable file, and obtaining the decompiled function information structure.
A device for identifying virus code, including:
a decompiling unit, for decompiling the virtual machine executable file of an application to obtain the decompiled function information structure;
an extraction unit, for parsing the decompiled function information structure and extracting the mnemonic sequence in the decompiled function information structure;
an edit distance determining unit, for determining the edit distance between the extracted mnemonic sequence and the mnemonic sequence of a preset virus code;
a virus code determining unit, for judging whether the determined edit distance is less than a preset threshold and, when it is, determining that the virtual machine executable file of the application contains virus code.
Preferably, the device further includes:
a character count determining unit, for determining the total number of characters in the extracted mnemonic sequence before it is judged whether the determined edit distance is less than the preset threshold;
a preset threshold determining unit, for taking the product of the total number of characters in the mnemonic sequence and a preset value as the preset threshold, where the preset value is between 0 and 1.
At least one of the technical solutions adopted by the embodiments of the present application can achieve the following beneficial effects:
By analyzing and decompiling the virtual machine executable file of an application installed on an intelligent terminal, the function instruction sequence (or mnemonic sequence) in the decompiled function information structure corresponding to the application can be obtained. An edit distance algorithm is then used to determine the edit distance between the extracted function instruction sequence (or mnemonic sequence) and the function instruction sequence (or mnemonic sequence) of a preset virus code. Finally, when the determined edit distance is less than the preset threshold, the virtual machine executable file of the application is determined to contain virus code. In this way it is possible to judge accurately whether an application on the intelligent terminal is a program that evades detection by modifying the character strings referenced by virus code, thereby ensuring the security of the intelligent terminal.
Brief description of the drawings
The accompanying drawings described here provide a further understanding of the application and form a part of it. The exemplary embodiments of the application and their descriptions are used to explain the application and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of the method for identifying virus code provided by one embodiment of the application;
Fig. 2 is an example of the function information structure obtained by decompiling a dex file in an embodiment of the application;
Fig. 3 is a flowchart of the method for identifying virus code provided by another embodiment of the application;
Fig. 4 is a module diagram of the device for identifying virus code provided by one embodiment of the application.
Detailed description
To make the purpose, technical solution, and advantages of the application clearer, the technical solution is described clearly and completely below with reference to specific embodiments and the corresponding drawings. Obviously, the described embodiments are only some of the embodiments of the application, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments in this application, without creative effort, fall within the scope of protection of this application.
The technical solution is described here using the Android operating system used by mobile terminals as an example, but it is not limited to the Android operating system.
Taking Android as an example, the operating system generally includes at least an application layer (app layer) and a system framework layer (framework layer); other layers that a functional division might identify are not discussed in this application. The app layer is generally the interface that interacts with the user, for example: maintaining applications, and recognizing the different kinds of content a user clicks and displaying the corresponding context menu. The framework layer is generally an intermediate layer: it forwards user requests received by the app layer (such as launching an application, clicking a link, or clicking to save a picture) to the lower layers, and distributes content processed by the lower layers to the upper layer, by messages or through intermediate proxy classes, for display to the user.
Dalvik is the Java virtual machine used on the Android platform. Dalvik is optimized so that multiple virtual machine instances can run simultaneously in limited memory, and each Dalvik application executes as an independent Linux process. Independent processes prevent all programs from being closed when one virtual machine crashes. The Dalvik virtual machine runs Java applications that have been converted into the dex (Dalvik Executable) format, a compact format designed specifically for Dalvik and suited to systems with limited memory and processor speed.
As can be seen, in the Android system a dex file is a virtual machine executable file that can be loaded and run directly in the Dalvik virtual machine (Dalvik VM). With ADT (Android Development Tools), Java source code can be converted into a dex file through a complex compilation process. The dex file is the result of optimization for embedded systems: the instruction codes of the Dalvik virtual machine are not standard Java virtual machine instruction codes but an instruction set of its own. Many class names and constant strings are shared within a dex file, which makes the file smaller and more efficient to run. It is worth noting that in the Android system the virtual machine executable file is a dex file, whereas in other operating systems the virtual machine executable file may be another type of file; this application does not limit this.
Below in conjunction with accompanying drawing, the technical scheme that each embodiment of the application is provided is described in detail.
Fig. 1 shows the flow of the method for identifying virus code provided in one embodiment of the application, including:
S101: Decompile the virtual machine executable file of an application to obtain the decompiled function information structure.
The application may be one installed on a mobile terminal. The virtual machine executable file is, for example, a dex file. As noted above, the Android operating system includes an application layer (app layer) and a system framework layer (framework layer); this application focuses on the study and improvement of the app layer. Those skilled in the art will understand, however, that when Android starts, the Dalvik VM monitors all programs (APK files) and frameworks and creates a dependency tree for them. The Dalvik VM uses this dependency tree to optimize the code of each program and stores the result in the Dalvik cache (dalvik-cache), so that all programs use the optimized code at run time. When a program (or a framework library) changes, the Dalvik VM re-optimizes the code and stores it in the cache again. cache/dalvik-cache holds the dex files generated for the programs on system, while data/dalvik-cache holds the dex files generated for data/app. In other words, this application focuses on analyzing and processing the dex files generated for data/app.
The dex file can be obtained by parsing the APK (Android Package, the Android installation package). An APK file is in fact a zip-format archive whose file extension has been changed to apk; after it is decompressed with UnZip, the dex file can be obtained.
There are many ways to decompile (or disassemble) a dex file. Two are given here as examples; those skilled in the art can extend them to other approaches, all of which fall within the scope of protection of this application:
First way: parse the dex file according to the dex file format to obtain the function information structure of each class; then, according to the fields in the function information structure, determine the position and size of each function in the dex file and obtain the decompiled function information structure. Specifically, parsing the function information structure yields the bytecode array field, which indicates the position of the function in the dex file, and the list length field, which indicates the size of the function; together these determine the position and size of the function in the dex file.
Second way: use a dex decompilation tool to decompile the dex file into virtual machine bytecode.
As noted above, the Dalvik virtual machine runs Dalvik bytecode, which exists in the form of a dex (Dalvik Executable) file; the Dalvik virtual machine executes code by interpreting the dex file. Several existing tools can disassemble a dex file into Dalvik assembly code. Such decompilation tools include, but are not limited to: baksmali, Dedexer 1.26, dexdump, dexinspector 03-12-12r, IDA Pro, androguard, dex2jar, and 010 Editor.
As can be seen, decompiling the dex file yields all the decompiled function information structures. A function information structure contains the executable code of the function, which in this embodiment consists of a virtual machine instruction sequence and a virtual machine mnemonic sequence; in the example below, the Dalvik VM instruction sequence and the Dalvik VM mnemonic sequence make up the function information structure.
For example, Fig. 2 shows a function information structure obtained by decompiling a dex file in this embodiment. As can be seen, the dex file is decompiled into a Dalvik VM instruction sequence and a Dalvik VM mnemonic sequence.
S102: Parse the decompiled function information structure and extract the function instruction sequence in the decompiled function information structure.
As in the example of Fig. 2 above, in the machine code field of the function information structure obtained by decompiling, the first two hexadecimal digits of each line form the instruction (the circled part on the left of the example), and the part corresponding to each instruction is its mnemonic (on the right of the example; partly circled, not all selected). Mnemonics exist mainly to make communication and code writing easier for people. In the example above, decompiling the dex file yields the following instruction sequence for the function:
“12 54 38 71 0c 6e 0c 6e 0a 38 54 54 6e 0c 6e 54 6e 0c 6e 0c 38 72 0a 39 12 38 54 6e 54 71 0e 01 28 54 13 6e”.
The mnemonic sequence is:
“const/4 iget-object if-eqz invoke-static move-result-object invoke-virtual move-result-object invoke-virtual move-result if-eqz iget-object iget-object invoke-virtual move-result-object invoke-virtual iget-object invoke-virtual move-result-object invoke-virtual move-result-object if-eqz invoke-interface move-result if-nez const/4 if-eqz iget-object invoke-virtual iget-object invoke-static return-void move goto iget-object const/16 invoke-virtual”.
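The correspondence between the two sequences above can be illustrated with a small lookup table. This is a sketch only: the table covers just the Dalvik opcodes that occur in this example, and the names OPCODE_MNEMONICS and mnemonic_sequence are hypothetical.

```python
# Dalvik opcodes that appear in the example above, keyed by opcode value.
OPCODE_MNEMONICS = {
    0x01: "move",
    0x0A: "move-result",
    0x0C: "move-result-object",
    0x0E: "return-void",
    0x12: "const/4",
    0x13: "const/16",
    0x28: "goto",
    0x38: "if-eqz",
    0x39: "if-nez",
    0x54: "iget-object",
    0x6E: "invoke-virtual",
    0x71: "invoke-static",
    0x72: "invoke-interface",
}

EXAMPLE_INSTRUCTIONS = (
    "12 54 38 71 0c 6e 0c 6e 0a 38 54 54 6e 0c 6e 54 6e 0c 6e 0c "
    "38 72 0a 39 12 38 54 6e 54 71 0e 01 28 54 13 6e"
)


def mnemonic_sequence(instruction_sequence: str) -> list[str]:
    """Map a space-separated hex opcode sequence to its mnemonics."""
    return [OPCODE_MNEMONICS[int(tok, 16)] for tok in instruction_sequence.split()]


mnemonics = mnemonic_sequence(EXAMPLE_INSTRUCTIONS)
print(" ".join(mnemonics))
```

An opcode missing from the table raises KeyError, which is deliberate here: a full table would be needed for arbitrary dex files.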
S103: Determine the edit distance between the extracted function instruction sequence and the function instruction sequence of a preset virus code.
Edit distance, also known as Levenshtein distance, is the minimum number of single-character edit operations (insertions, deletions, or substitutions) needed to change one string into another. For example, to compute the edit distance between cafe and coffee, cafe can be turned into coffee as follows: cafe → caffe → coffe → coffee, so the edit distance is 3. In general, the smaller the edit distance between two function instruction sequences, the more similar they are, and the more likely it is that the code of the application under examination is a variant of some virus code in the virus database.
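For reference, the edit distance described above can be computed with the standard dynamic programming recurrence. The following is a minimal sketch; the function name edit_distance is an assumption, and the application does not prescribe a particular implementation.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, keeping two rows of the DP table."""
    prev = list(range(len(b) + 1))          # distance from "" to each prefix of b
    for i, x in enumerate(a, start=1):
        curr = [i]                          # distance from a[:i] to ""
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,                # delete x
                curr[j - 1] + 1,            # insert y
                prev[j - 1] + cost,         # substitute x with y, or keep it
            ))
        prev = curr
    return prev[-1]


print(edit_distance("cafe", "coffee"))
```

Keeping only two rows makes memory proportional to the length of the second string, which is convenient when many sequences are compared against a virus database.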
S104: Judge whether the determined edit distance is less than the preset threshold; if it is, determine that the virtual machine executable file of the application contains virus code.
Virus code refers to program code that spreads through storage media or networks and, without authorization, damages the integrity of the operating system or steals undisclosed secret information in the system. Taking mobile phones as an example, mobile malicious code is malicious code aimed at handheld devices such as mobile phones and PDAs. It can be divided simply into replicating and non-replicating malicious code. Replicating malicious code mainly includes viruses (Virus) and worms (Worm); non-replicating malicious code mainly includes Trojan horse backdoor programs (Trojan Horse), rogue software (Rogue Software), malicious mobile code (Malicious Mobile Code), and rootkit programs.
For example, the instruction sequence of the function obtained in step S102 is: “12 54 38 71 0c 6e 0c 6e 0a 38 54 54 6e 0c 6e 54 6e 0c 6e 0c 38 72 0a 39 12 38 54 6e 54 71 0e 01 28 54 13 6e”.
The instruction sequence of a certain virus code in the preset virus database is: “12 38 54 71 0c 6e 0c 6e 0a 38 54 54 6e 0c 6e 54 6e 0c 6e 0c 38 72 0a 39 12 38 54 6e 54 71 0e 01 28 54 13 6e”.
Computing the edit distance between these two instruction sequences gives an edit distance of 4. Assuming the preset threshold is 5, the comparison shows that the edit distance between the two instruction sequences is less than the preset threshold, so the code of the program can be determined to be a variant of that virus code in the virus database, that is, virus code. In general, the preset threshold can be set in advance from experience.
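The comparison in this example can be reproduced with the sketch below. It assumes the distance is computed character by character over the hexadecimal digits with the separating spaces removed, which is an interpretation of the text; the names contains_virus_code and _normalize are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance (two-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = curr
    return prev[-1]


def _normalize(sequence: str) -> str:
    """Drop separators so that only the hex digits are compared (an assumption)."""
    return "".join(sequence.split()).lower()


def contains_virus_code(extracted: str, virus_sequences: list[str], threshold: int) -> bool:
    """True if the extracted sequence is within the threshold of any known sequence."""
    target = _normalize(extracted)
    return any(edit_distance(target, _normalize(v)) < threshold for v in virus_sequences)


# The two sequences from the example differ only in their second and third opcodes.
_TAIL = ("71 0c 6e 0c 6e 0a 38 54 54 6e 0c 6e 54 6e 0c 6e 0c "
         "38 72 0a 39 12 38 54 6e 54 71 0e 01 28 54 13 6e")
EXTRACTED = "12 54 38 " + _TAIL
VIRUS = "12 38 54 " + _TAIL

distance = edit_distance(_normalize(EXTRACTED), _normalize(VIRUS))
print(distance, contains_virus_code(EXTRACTED, [VIRUS], threshold=5))
```

Because the comparison is strict (less than), a threshold equal to the distance does not count as a match.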
The preset threshold can be determined in several ways, for example by setting its value manually from experience, or by deriving it from a calculation rule. In this embodiment, to improve the accuracy of identifying virus code, before judging whether the virtual machine executable file of the application contains virus code (step S104), the method may further include:
determining the total number of characters in the extracted function instruction sequence;
taking the product of the total number of characters in the function instruction sequence and a preset value α (0 < α < 1) as the preset threshold.
For example, the total number of characters in the instruction sequence obtained above, “12 38 54 71 0c 6e 0c 6e 0a 38 54 54 6e 0c 6e 54 6e 0c 6e 0c 38 72 0a 39 12 38 54 6e 54 71 0e 01 28 54 13 6e”, can be determined to be 72. The preset value can then be set to 0.05 (between 0 and 1), so that the preset threshold is finally determined to be 72 × 0.05 = 3.6 ≈ 4. The preset value can also be an empirical value. With these steps, code whose function instruction sequence reaches a similarity of more than 95% can be identified as a variant of virus code.
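The threshold rule above amounts to a one-line calculation. The sketch below assumes that the character total counts hexadecimal digits but not the separating spaces (36 opcodes of 2 digits give the 72 quoted above) and that the product is rounded to the nearest integer; both are inferences from the worked example, and preset_threshold is a hypothetical name.

```python
def preset_threshold(instruction_sequence: str, alpha: float) -> int:
    """Threshold = (number of characters in the sequence) x alpha, rounded."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    total_chars = len("".join(instruction_sequence.split()))  # spaces not counted
    return round(total_chars * alpha)


SEQUENCE = (
    "12 38 54 71 0c 6e 0c 6e 0a 38 54 54 6e 0c 6e 54 6e 0c 6e 0c "
    "38 72 0a 39 12 38 54 6e 54 71 0e 01 28 54 13 6e"
)
threshold = preset_threshold(SEQUENCE, alpha=0.05)
print(threshold)
```

Scaling the threshold with the sequence length keeps the tolerated fraction of edits constant, so long functions are not held to the same absolute limit as short ones.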
It should be noted that this application does not limit which malicious code protection scheme is used to detect malicious code. For example, the sample feature scanning (signature value scanning) described above, virtual machine based scanning, or heuristic scanning can be used, and similar samples can also be clustered. For clustering similar samples, a large number of code samples can be classified according to edit distance (similarity): when the edit distance between the function instruction sequences of two code samples is less than the preset threshold, the two code samples are placed in the same category, which achieves automatic clustering of a large number of code samples. It is also worth noting that this application places no restriction on the matching algorithm; for example, the fuzzy matching algorithm described above or a similarity matching algorithm can be used.
Fig. 3 shows the flow of the method for identifying virus code provided by another embodiment of the application, including:
S201: Decompile the virtual machine executable file of an application to obtain the decompiled function information structure. The application may be one installed on a mobile terminal.
S202: Parse the decompiled function information structure and extract the mnemonic sequence in the decompiled function information structure.
S203: Determine the edit distance between the extracted mnemonic sequence and the mnemonic sequence of a preset virus code.
S204: Judge whether the determined edit distance is less than the preset threshold; if it is, determine that the virtual machine executable file of the application contains virus code.
This embodiment is similar to the previous one and serves as an alternative. The difference is that this embodiment extracts the mnemonic sequence from the function information structure, uses the mnemonic sequence to determine the edit distance between the application code under examination and the virus code, and finally determines from the edit distance (similarity) whether the virtual machine executable file of the application contains virus code. For how to determine the preset threshold, refer to the embodiment above; it is not repeated here.
As can be seen, in the method provided by the above embodiments, analyzing and decompiling the virtual machine executable file of an application installed on an intelligent terminal yields the function instruction sequence (or mnemonic sequence) in the decompiled function information structure corresponding to the application. An edit distance algorithm is then used to determine the edit distance between the extracted function instruction sequence (or mnemonic sequence) and the function instruction sequence (or mnemonic sequence) of a preset virus code. Finally, when the determined edit distance is less than the preset threshold, the virtual machine executable file of the application is determined to contain virus code. In this way it is possible to judge accurately whether an application on the intelligent terminal is a program that evades detection by modifying the character strings (in the dex file) referenced by virus code, thereby ensuring the security of the intelligent terminal.
Fig. 4 is a module diagram of the device for identifying virus code provided in one embodiment of the application. The function of each unit in the device is similar to that of the corresponding step in the method above, so for details of the device, refer to the method embodiments above. The device includes:
a decompiling unit 401, for decompiling the virtual machine executable file of an application to obtain the decompiled function information structure;
an extraction unit 402, for parsing the decompiled function information structure and extracting the function instruction sequence in the decompiled function information structure;
an edit distance determining unit 403, for determining the edit distance between the extracted function instruction sequence and the function instruction sequence of a preset virus code;
a virus code determining unit 404, for judging whether the determined edit distance is less than the preset threshold and, when it is, determining that the virtual machine executable file of the application contains virus code.
With this device, it is possible to judge accurately whether an application on an intelligent terminal is a program that evades detection by modifying the character strings (in the dex file) referenced by virus code, thereby ensuring the security of the intelligent terminal.
The preset threshold can be determined in several ways, for example by setting its value manually from experience, or by deriving it from a calculation rule. In this embodiment, to improve the accuracy of identifying virus code, the device further includes:
a character count determining unit, for determining the total number of characters in the extracted function instruction sequence before it is determined that the virtual machine executable file of the application contains virus code;
a preset threshold determining unit, for taking the product of the total number of characters in the function instruction sequence and a preset value as the preset threshold, where the preset value is between 0 and 1.
In this embodiment of the application, the decompiling unit 102 includes:
Information structure obtaining unit, configured to parse the virtual machine executable file according to the virtual machine executable file format and obtain the function information structure of each class;
Function information structure obtaining unit, configured to determine, according to fields in the function information structure, the position and size of each function of the virtual machine executable file, and obtain the decompiled function information structure.
In this embodiment of the application, the function information structure obtaining unit is configured to:
parse the function information structure to obtain the bytecode array field, which indicates the position of a function of the virtual machine executable file, and the list length field, which indicates the size of that function;
determine the position and size of the function of the virtual machine executable file according to the bytecode array field and the list length field.
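As an illustration of how two such fields locate a function, the sketch below reads a `code_item` as laid out in the published dex format. Mapping the "bytecode array field" to `insns` and the "list length field" to `insns_size` is an assumption made here for illustration, and `locate_function` is a hypothetical helper, not part of the application.

```python
import struct

# code_item header in the dex format (little-endian):
# registers_size, ins_size, outs_size, tries_size (ushort each),
# debug_info_off, insns_size (uint each); the insns array follows directly.
CODE_ITEM_HEADER = struct.Struct("<HHHHII")


def locate_function(data: bytes, code_off: int):
    """Return (position, size_in_bytes) of the bytecode of one method."""
    *_, insns_size = CODE_ITEM_HEADER.unpack_from(data, code_off)
    position = code_off + CODE_ITEM_HEADER.size  # first byte of insns
    size = insns_size * 2                        # insns_size counts 16-bit code units
    return position, size
```

Given the offset of a method's `code_item`, the returned pair delimits the slice of the file that holds the instruction sequence to be extracted.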
In another embodiment of the application, the above device for judging virus code includes:
Decompiling unit 401, configured to decompile the virtual machine executable file of an application to obtain a decompiled function information structure; the application may be an application installed on a mobile terminal.
Extraction unit 402, configured to parse the decompiled function information structure and extract the mnemonic sequence from the decompiled function information structure;
Edit distance determining unit 403, configured to determine the edit distance between the extracted mnemonic sequence and the function instruction sequence of preset virus code;
Virus code determining unit 404, configured to judge whether the determined edit distance is less than a preset threshold and, when the edit distance is less than the preset threshold, to determine that the virtual machine executable file of the application contains virus code.
With the above device, it can likewise be accurately determined whether an application on an intelligent terminal is a program that evades antivirus detection by modifying the character strings referenced by virus code (in the dex file), thereby ensuring the security of the intelligent terminal.
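The mnemonic-based variant can be sketched as follows. The opcode table is a small hand-picked subset of Dalvik opcodes chosen for illustration, and the helper names `mnemonics` and `edit_distance` are hypothetical; a real extractor would need the full opcode table and handling of payload pseudo-instructions.

```python
# Illustrative subset: opcode -> (mnemonic, length in 16-bit code units).
OPCODES = {
    0x00: ("nop", 1),
    0x0C: ("move-result-object", 1),
    0x0E: ("return-void", 1),
    0x12: ("const/4", 1),
    0x1A: ("const-string", 2),
    0x6E: ("invoke-virtual", 3),
    0x71: ("invoke-static", 3),
}


def mnemonics(code_units):
    """Map a list of 16-bit code units to mnemonics, skipping operand units."""
    out, i = [], 0
    while i < len(code_units):
        name, length = OPCODES[code_units[i] & 0xFF]  # opcode is the low byte
        out.append(name)
        i += length
    return out


def edit_distance(a, b):
    """Levenshtein distance computed over whole mnemonics, not characters."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (x != y)))
        prev = curr
    return prev[-1]


# Two functions that differ only in the string index referenced by const-string.
original = [0x001A, 0x0005, 0x000E]
modified = [0x001A, 0x0123, 0x000E]
```

Here `mnemonics(original)` and `mnemonics(modified)` are identical, so their edit distance is 0 even though the raw code units differ. This is why comparing mnemonics is robust against samples that only change the strings the virus code references.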
Those skilled in the art should understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) that contain computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to embodiments of the application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture that includes instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
Memory may include volatile memory in computer-readable media, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, commodity, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of additional identical elements in the process, method, commodity, or device that includes the element.
The foregoing describes merely embodiments of the present application and is not intended to limit the application. For those skilled in the art, the application may have various modifications and variations. Any modification, equivalent replacement, or improvement made within the spirit and principles of the application shall fall within the scope of the claims of the application.
Claims (10)
1. A method for judging virus code, characterized by comprising:
decompiling a virtual machine executable file of an application to obtain a decompiled function information structure;
parsing the decompiled function information structure and extracting a function instruction sequence from the decompiled function information structure;
determining an edit distance between the extracted function instruction sequence and a function instruction sequence of preset virus code;
judging whether the determined edit distance is less than a preset threshold and, if the edit distance is less than the preset threshold, determining that the virtual machine executable file of the application contains virus code.
2. The method of claim 1, characterized in that, before judging whether the determined edit distance is less than the preset threshold, the method further comprises:
determining the total number of characters of the extracted function instruction sequence;
determining the product of the total number of characters of the function instruction sequence and a preset value as the preset threshold, wherein the preset value is between 0 and 1.
3. The method of claim 1, characterized in that decompiling the virtual machine executable file of the application to obtain the decompiled function information structure specifically comprises:
parsing the virtual machine executable file according to the virtual machine executable file format of the application to obtain the function information structure of each class;
determining, according to fields in the function information structure, the position and size of each function of the virtual machine executable file, and obtaining the decompiled function information structure.
4. A method for judging virus code, characterized by comprising:
decompiling a virtual machine executable file of an application to obtain a decompiled function information structure;
parsing the decompiled function information structure and extracting a mnemonic sequence from the decompiled function information structure;
determining an edit distance between the extracted mnemonic sequence and a mnemonic sequence of preset virus code;
judging whether the determined edit distance is less than a preset threshold and, if the edit distance is less than the preset threshold, determining that the virtual machine executable file of the application contains virus code.
5. The method of claim 4, characterized in that, before judging whether the determined edit distance is less than the preset threshold, the method further comprises:
determining the total number of characters of the extracted mnemonic sequence;
determining the product of the total number of characters of the mnemonic sequence and a preset value as the preset threshold, wherein the preset value is between 0 and 1.
6. A device for judging virus code, characterized in that the device comprises:
a decompiling unit, configured to decompile a virtual machine executable file of an application to obtain a decompiled function information structure;
an extraction unit, configured to parse the decompiled function information structure and extract a function instruction sequence from the decompiled function information structure;
an edit distance determining unit, configured to determine an edit distance between the extracted function instruction sequence and a function instruction sequence of preset virus code;
a virus code determining unit, configured to judge whether the determined edit distance is less than a preset threshold and, when the edit distance is less than the preset threshold, to determine that the virtual machine executable file of the application contains virus code.
7. The device of claim 6, characterized in that the device further comprises:
a character count determining unit, configured to determine the total number of characters of the extracted function instruction sequence before it is judged whether the determined edit distance is less than the preset threshold;
a preset threshold determining unit, configured to determine the product of the total number of characters of the function instruction sequence and a preset value as the preset threshold, wherein the preset value is between 0 and 1.
8. The device of claim 6, characterized in that the decompiling unit comprises:
an information structure obtaining unit, configured to parse the virtual machine executable file according to the virtual machine executable file format of the application and obtain the function information structure of each class;
a function information structure obtaining unit, configured to determine, according to fields in the function information structure, the position and size of each function of the virtual machine executable file, and obtain the decompiled function information structure.
9. A device for judging virus code, characterized in that the device comprises:
a decompiling unit, configured to decompile a virtual machine executable file of an application to obtain a decompiled function information structure;
an extraction unit, configured to parse the decompiled function information structure and extract a mnemonic sequence from the decompiled function information structure;
an edit distance determining unit, configured to determine an edit distance between the extracted mnemonic sequence and a function instruction sequence of preset virus code;
a virus code determining unit, configured to judge whether the determined edit distance is less than a preset threshold and, when the edit distance is less than the preset threshold, to determine that the virtual machine executable file of the application contains virus code.
10. The device of claim 9, characterized in that the device further comprises:
a character count determining unit, configured to determine the total number of characters of the extracted mnemonic sequence before it is judged whether the determined edit distance is less than the preset threshold;
a preset threshold determining unit, configured to determine the product of the total number of characters of the mnemonic sequence and a preset value as the preset threshold, wherein the preset value is between 0 and 1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510971165.8A CN106909841A (en) | 2015-12-22 | 2015-12-22 | A kind of method and device for judging viral code |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510971165.8A CN106909841A (en) | 2015-12-22 | 2015-12-22 | A kind of method and device for judging viral code |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106909841A true CN106909841A (en) | 2017-06-30 |
Family
ID=59200979
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510971165.8A Pending CN106909841A (en) | 2015-12-22 | 2015-12-22 | A kind of method and device for judging viral code |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106909841A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107547547A (en) * | 2017-09-05 | 2018-01-05 | 成都知道创宇信息技术有限公司 | A kind of TCP CC recognition methods based on editing distance |
CN108446559A (en) * | 2018-02-13 | 2018-08-24 | 北京兰云科技有限公司 | A kind of recognition methods of APT tissue and device |
CN108491718A (en) * | 2018-02-13 | 2018-09-04 | 北京兰云科技有限公司 | A kind of method and device for realizing information classification |
CN108804920A (en) * | 2018-05-24 | 2018-11-13 | 河南省躬行信息科技有限公司 | A method of based on striding course behavior monitoring malicious code homology analysis |
CN110225007A (en) * | 2019-05-27 | 2019-09-10 | 国家计算机网络与信息安全管理中心 | The clustering method of webshell data on flows and controller and medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103761475A (en) * | 2013-12-30 | 2014-04-30 | 北京奇虎科技有限公司 | Method and device for detecting malicious code in intelligent terminal |
CN103902910A (en) * | 2013-12-30 | 2014-07-02 | 北京奇虎科技有限公司 | Method and device for detecting malicious codes in intelligent terminal |
Non-Patent Citations (1)
Title |
---|
赵作鹏: "《面向煤矿应急管理的数据处理关键技术研究》", 30 November 2013 * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107547547A (en) * | 2017-09-05 | 2018-01-05 | 成都知道创宇信息技术有限公司 | A kind of TCP CC recognition methods based on editing distance |
CN107547547B (en) * | 2017-09-05 | 2020-06-02 | 成都知道创宇信息技术有限公司 | TCP CC identification method based on edit distance |
CN108446559A (en) * | 2018-02-13 | 2018-08-24 | 北京兰云科技有限公司 | A kind of recognition methods of APT tissue and device |
CN108491718A (en) * | 2018-02-13 | 2018-09-04 | 北京兰云科技有限公司 | A kind of method and device for realizing information classification |
CN108491718B (en) * | 2018-02-13 | 2022-03-04 | 北京兰云科技有限公司 | Method and device for realizing information classification |
CN108446559B (en) * | 2018-02-13 | 2022-03-29 | 北京兰云科技有限公司 | APT organization identification method and device |
CN108804920A (en) * | 2018-05-24 | 2018-11-13 | 河南省躬行信息科技有限公司 | A method of based on striding course behavior monitoring malicious code homology analysis |
CN108804920B (en) * | 2018-05-24 | 2021-09-28 | 河南省躬行信息科技有限公司 | Method for monitoring malicious code homology analysis based on cross-process behavior |
CN110225007A (en) * | 2019-05-27 | 2019-09-10 | 国家计算机网络与信息安全管理中心 | The clustering method of webshell data on flows and controller and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106909841A (en) | A kind of method and device for judging viral code | |
JP5992622B2 (en) | Malicious application diagnostic apparatus and method | |
CN103761475B (en) | Method and device for detecting malicious code in intelligent terminal | |
CN109564608A (en) | Updating virtual memory addresses of target application functions for updated versions of application binary code | |
Kapratwar et al. | Static and dynamic analysis of android malware | |
CN106250769B (en) | A kind of the source code data detection method and device of multistage filtering | |
Cho et al. | Security assessment of code obfuscation based on dynamic monitoring in android things | |
CN112148305B (en) | Application detection method, device, computer equipment and readable storage medium | |
CN102446255B (en) | Method and device for detecting page tamper | |
CN105653949B (en) | A kind of malware detection methods and device | |
US11030393B2 (en) | Estimation of document structure | |
CN108090360B (en) | Behavior feature-based android malicious application classification method and system | |
US20240054802A1 (en) | System and method for spatial encoding and feature generators for enhancing information extraction | |
Bhattacharya et al. | DMDAM: data mining based detection of android malware | |
CN103473104A (en) | Method for discriminating re-package of application based on keyword context frequency matrix | |
CN112817877B (en) | Abnormal script detection method and device, computer equipment and storage medium | |
CN106803040A (en) | Virus signature processing method and processing device | |
CN106874760A (en) | A kind of Android malicious code sorting techniques based on hierarchy type SimHash | |
Linoy et al. | Exploring Ethereum’s blockchain anonymity using smart contract code attribution | |
CN105631336B (en) | Detect the system and method for the malicious file in mobile device | |
CN106909844A (en) | The sorting technique and device of a kind of application program sample | |
CN107122663A (en) | A kind of detection method for injection attack and device | |
Feichtner et al. | Obfuscation-resilient code recognition in Android apps | |
CN106909839A (en) | A kind of method and device for extracting sample code feature | |
Guo et al. | WLTDroid: repackaging detection approach for android applications |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170630 |