CN110175045A - Android application repackaging data processing method and apparatus - Google Patents


Publication number
CN110175045A
Authority
CN
China
Prior art keywords
code
application program
program
calling
interface
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910417798.2A
Other languages
Chinese (zh)
Inventor
徐国爱
郭燕慧
沈月东
王浩宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CN201910417798.2A
Publication of CN110175045A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00: Arrangements for software engineering
    • G06F8/70: Software maintenance or management
    • G06F8/71: Version control; Configuration management

Abstract

The application provides a data processing method and apparatus for repackaged Android applications. The method includes: obtaining an application pair to be detected, which consists of an original application and a repackaged application generated by repackaging the original application; identifying the first calling code in the original application that calls third-party libraries and the second calling code in the repackaged application that calls third-party libraries; extracting from the original application the first program code other than the first calling code, and extracting from the repackaged application the second program code other than the second calling code; and comparing the API call code contained in the first program code with the API call code contained in the second program code to obtain modification information of the first API call code of the second program code relative to the first program code. This identifies the code blocks that were repackaged in the repackaged application and the modification details of each code block.

Description

Android application repackaging data processing method and apparatus
Technical Field
This application relates to the technical field of software development, and in particular to a data processing method and apparatus for repackaged Android applications.
Background
With the rapid spread of Android mobile smart terminals, the types and number of Android applications (apps) keep growing. Some malicious developers crack legitimate Android applications, modify the code, repackage the applications, and publish them on app markets. This not only infringes the interests of the original developers but also seriously threatens users' security and privacy.
Extensive research has been done at home and abroad in the field of software security analysis, and several feasible repackaging detection methods have been proposed. These mainly include repackaging detection based on code clones and repackaging detection based on Android application features, both of which detect whether an Android application has been repackaged.
In the prior art, code obfuscation is generally used to make it harder for competitors or malicious developers to repackage an Android application. Prior-art repackaging detection methods can only detect whether an Android application has been repackaged; they cannot identify which code blocks in the application were repackaged or how each code block was modified, and therefore cannot provide targeted guidance for code obfuscation.
Summary of the Invention
The application provides a data processing method and apparatus for repackaged Android applications, to solve the technical problem that prior-art repackaging detection methods cannot identify the code blocks that were repackaged in an Android application or the modification details of each code block.
In a first aspect, an embodiment of the invention provides a data processing method for repackaged Android applications, comprising:
obtaining an application pair to be detected, the application pair to be detected comprising an original application and a repackaged application generated by repackaging the original application;
identifying first calling code in the original application for calling third-party libraries and second calling code in the repackaged application for calling third-party libraries;
extracting from the original application first program code other than the first calling code, and extracting from the repackaged application second program code other than the second calling code, wherein the first program code and the second program code each contain API call code for calling APIs;
comparing the API call code contained in the first program code with the API call code contained in the second program code, to obtain modification information of the first API call code of the second program code relative to the first program code.
In a second aspect, an embodiment of the invention provides a data analysis apparatus for repackaged Android applications, comprising:
a program obtaining module, configured to obtain an application pair to be detected, the application pair to be detected comprising an original application and a repackaged application generated by repackaging the original application;
an identification module, configured to identify first calling code in the original application for calling third-party libraries and second calling code in the repackaged application for calling third-party libraries;
an extraction module, configured to extract from the original application first program code other than the first calling code, and to extract from the repackaged application second program code other than the second calling code, wherein the first program code and the second program code each contain API call code for calling APIs;
a first comparison module, configured to compare the API call code contained in the first program code with the API call code contained in the second program code, to obtain modification information of the first API call code of the second program code relative to the first program code.
In a third aspect, an embodiment of the invention provides a data processing device for repackaged Android applications, comprising a memory and a processor;
the memory being configured to store instructions executable by the processor;
wherein the processor is configured to execute the executable instructions to implement the method of any implementation of the first aspect.
In a fourth aspect, an embodiment of the invention provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the method of any implementation of the first aspect.
In the data processing method and apparatus for repackaged Android applications provided by the embodiments of the invention, an application pair to be detected is obtained, comprising an original application and a repackaged application generated by repackaging the original application; first calling code in the original application for calling third-party libraries and second calling code in the repackaged application for calling third-party libraries are identified; first program code other than the first calling code is extracted from the original application, and second program code other than the second calling code is extracted from the repackaged application; and the API call code contained in the first program code is compared with the API call code contained in the second program code to obtain modification information of the first API call code of the second program code relative to the first program code, that is, the code blocks of the second program code that were repackaged relative to the first program code and the modification information of those code blocks, so that targeted guidance can be provided for code obfuscation.
Brief Description of the Drawings
The accompanying drawings are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure, and together with the specification serve to explain the principles of the disclosure.
FIG. 1 is a schematic flowchart of a data processing method for repackaged Android applications provided by one embodiment of the invention;
FIG. 2 is a schematic flowchart of a data processing method for repackaged Android applications provided by another embodiment of the invention;
FIG. 3 is a schematic flowchart of a data processing method for repackaged Android applications provided by a further embodiment of the invention;
FIG. 4 is a schematic flowchart of a data processing method for repackaged Android applications provided by yet another embodiment of the invention;
FIG. 5 is a schematic flowchart of a data processing method for repackaged Android applications provided by the next embodiment of the invention;
FIG. 6 is a schematic flowchart of a data processing method for repackaged Android applications provided by a still further embodiment of the invention;
FIG. 7 is a schematic structural diagram of a data processing apparatus for repackaged Android applications provided by one embodiment of the invention;
FIG. 8 is a schematic structural diagram of a data processing apparatus for repackaged Android applications provided by another embodiment of the invention;
FIG. 9 is a schematic hardware structure diagram of a data processing device for repackaged Android applications provided by one embodiment of the invention.
The above drawings show specific embodiments of the disclosure, which are described in more detail below. The drawings and the written description are not intended to limit the scope of the disclosed concept in any way, but to explain the concept of the disclosure to those skilled in the art by reference to specific embodiments.
Detailed Description
Exemplary embodiments are described in detail here, with examples illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the disclosure. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the disclosure as detailed in the appended claims.
In addition, descriptions referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" mean that a particular feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Among the many kinds of Android malware, repackaged applications account for a very high proportion. A repackaged application is one in which malicious code has been injected into the class files of an original application, which is then repackaged to produce a new apk file. The malicious code in it starts itself by monitoring the system time. This not only infringes the interests of developers but also seriously threatens users' security and privacy.
Extensive research has been done at home and abroad in the field of software security analysis, and several feasible repackaging detection methods have been proposed, mainly repackaging detection based on code clones and repackaging detection based on Android application features. Both kinds of method only detect whether an Android application has been repackaged; they cannot identify the code blocks that were repackaged or the modification details of each code block. The data processing method for repackaged Android applications provided by this application compares the API call code of the original application with the API call code of the repackaged application, identifies the code blocks of the repackaged application that were repackaged relative to the original application together with the modification information of those code blocks, and thereby provides guidance for subsequent code obfuscation.
One characteristic of Android applications is that many of them use third-party libraries. A third-party library is an important reusable software resource: Android developers can use such libraries to build their own applications, and one application can use multiple third-party libraries. The modifications that a repackaged application makes to the code of the original application include modifications to the core code and modifications to the third-party library code. The technical solution of this application, and how it identifies the modification information of the core code and of the third-party libraries, is described in detail below with specific embodiments. These specific embodiments can be combined with one another, and identical or similar concepts or processes may not be repeated in some embodiments. The embodiments of this application are described below with reference to the drawings.
FIG. 1 is a schematic flowchart of a data processing method for repackaged Android applications provided by one embodiment of the invention. As shown in FIG. 1, the method includes:
S101, obtain an application pair to be detected, the application pair to be detected comprising an original application and a repackaged application generated by repackaging the original application.
The original application is the legitimate application developed by the developer, and one original application may correspond to multiple repackaged applications. It should be understood that both the original application and the repackaged application are packages that have already undergone code obfuscation.
Android applications are generally written in Java and then compiled into Dalvik bytecode. During compilation, the Java source code is first compiled into class files. These class files and the third-party libraries are then compiled into Dalvik bytecode. When developers write source code, there is a clear boundary between the Java source code and the third-party libraries; once the source code has been compiled into Dalvik bytecode, however, the boundary between the third-party libraries and the main program is hard to determine. Therefore, after the application pair to be detected is obtained, it is decompiled to obtain the smali code of the original application and the smali code of the repackaged application.
S102, identify first calling code in the original application for calling third-party libraries and second calling code in the repackaged application for calling third-party libraries.
The domain name serves as the entry point of a third-party library, and code obfuscation generally does not obfuscate domain names. To improve the accuracy of third-party library identification, the third-party libraries called in the original application are optionally identified based on domain-name extension information.
Optionally, identifying the first calling code in the original application for calling third-party libraries includes the following steps. Step 1: build a third-party library identifier based on domain-name extension information. Step 2: decompile the original application to obtain its Smali code and perform static analysis on the Smali code; specifically, static features are extracted from each directory of the Smali code by regular-expression matching. Static features are features that remain unchanged after code obfuscation; optionally, they include the domain-name information of third-party libraries. Step 3: feed the extracted domain-name information into the third-party library identifier, which recognizes the domain-name information and thereby determines whether the Smali code package corresponding to that domain-name information is a third-party library. Step 4: finally, determine the first calling code in the Smali code of the original application that calls these third-party libraries. It should be understood that the first calling code may call multiple third-party libraries.
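The matching part of steps 2 to 4 can be sketched as follows. This is a minimal illustration, not the trained identifier described in the text: it substitutes a plain prefix lookup for the identifier, assumes that a library's domain name shows up as the package path of the decompiled `.smali` files, and uses a hypothetical prefix list (`KNOWN_LIBRARY_PREFIXES`) and hypothetical function names.

```python
import re

# Hypothetical set of known third-party package prefixes (illustrative only;
# the text builds this knowledge by training an identifier instead).
KNOWN_LIBRARY_PREFIXES = {"com/google/ads", "com/facebook", "okhttp3"}

# Matches a decompiled class path such as "smali/com/example/app/Main.smali"
# and captures the package directory ("com/example/app").
SMALI_PATH = re.compile(r"^smali(?:_classes\d+)?/(?P<pkg>.+)/[^/]+\.smali$")


def package_of(smali_path):
    """Return the package directory of a smali file path, or None if it is not one."""
    match = SMALI_PATH.match(smali_path)
    return match.group("pkg") if match else None


def is_third_party(package):
    """True if the package equals, or is nested under, a known library prefix."""
    return any(package == p or package.startswith(p + "/")
               for p in KNOWN_LIBRARY_PREFIXES)


def split_calling_code(smali_paths):
    """Split smali files into (third-party library code, remaining core code)."""
    library, core = [], []
    for path in smali_paths:
        package = package_of(path)
        if package is None:
            continue  # not a smali class file
        (library if is_third_party(package) else core).append(path)
    return library, core
```

Everything not matched as library code is what the later steps treat as the program code of the application.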
Optionally, the third-party library identifier based on domain-name extension information is built by training on a data set of a large number of applications. Specifically, for each application whose complete set of third-party libraries is known, static features are extracted from the application's Smali code to obtain a feature vector, including domain-name extension information, for each directory level of the Smali code. The directories at each level are clustered by feature vector, and the clustering result falls into two classes: third-party library or non-third-party library. Once all packages in the Smali code of the Android application have been traversed and no remaining set can be marked as a third-party library, all third-party libraries of that Android application have been obtained. A large number of Android applications are processed in this manner to train the third-party library identifier. As a reusable software resource, the same third-party library has similar features across different applications; assuming the number of applications is large enough and each third-party library is used by many applications, the code belonging to third-party libraries will be clustered into larger sets.
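The intuition that library code ends up in larger clusters can be illustrated with a deliberately simplified sketch. It replaces real clustering with exact grouping on a (directory, feature set) key, and the input layout, the `min_apps` threshold, and the function name are all assumptions made for illustration only.

```python
from collections import defaultdict


def find_library_candidates(apps, min_apps=2):
    """Return package directories whose features recur across applications.

    apps: dict mapping app name -> dict mapping package directory ->
          iterable of static feature strings (hypothetical layout).
    A directory is a library candidate if the same (directory, feature set)
    pair is seen in at least `min_apps` different applications.
    """
    seen_in = defaultdict(set)
    for app_name, packages in apps.items():
        for package_dir, features in packages.items():
            # Exact match on the feature set stands in for a cluster assignment.
            key = (package_dir, frozenset(features))
            seen_in[key].add(app_name)
    return {package_dir
            for (package_dir, _), app_names in seen_in.items()
            if len(app_names) >= min_apps}
```

A real identifier would tolerate small differences between feature vectors; exact matching is used here only to keep the grouping idea visible.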
Optionally, the second calling code in the repackaged application for calling third-party libraries is identified in the same way as the first calling code, and the details are not repeated here.
S103, extract from the original application first program code other than the first calling code, and extract from the repackaged application second program code other than the second calling code, wherein the first program code and the second program code each contain API call code for calling APIs.
The first program code is the core code of the original application excluding third-party libraries, and the second program code is the core code of the repackaged application excluding third-party libraries.
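As a rough illustration of how API call code might be pulled out of decompiled program code, the sketch below scans smali text for `invoke-*` instructions and keeps the calls whose target class looks like a platform API. The prefix filter (`API_PREFIXES`) and the function name are assumptions; the text does not specify how API calls are recognized.

```python
import re

# Matches smali invoke instructions, for example:
#   invoke-static {}, Ljava/lang/System;->currentTimeMillis()J
# and captures the target class descriptor and the method name.
INVOKE = re.compile(r"invoke-[\w/]+\s+\{[^}]*\},\s*(L[^;]+;)->([^\s(]+)\(")

# Assumed definition of "API": classes under these platform namespaces.
API_PREFIXES = ("Landroid/", "Ljava/", "Ljavax/")


def extract_api_calls(smali_text):
    """Return the set of 'Lclass;->method' strings for platform API calls."""
    calls = set()
    for cls, method in INVOKE.findall(smali_text):
        if cls.startswith(API_PREFIXES):
            calls.add(f"{cls}->{method}")
    return calls
```

Running this over the first program code and the second program code yields two sets that can then be compared.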
S104, compare the API call code contained in the first program code with the API call code contained in the second program code, to obtain modification information of the first API call code of the second program code relative to the first program code.
Optionally, comparing the API call code contained in the first program code with the API call code contained in the second program code includes determining whether the API call permissions in the second program code are the same as the API call permissions in the first program code. If they differ, the code packages in the second program code that contain the differing API call permissions are marked; if they are the same, the second program code has not been repackaged.
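The comparison can be sketched as a per-package set difference. This is a minimal sketch under the assumption that each program code has already been reduced to a mapping from code package to the set of API calls (or call permissions) it uses; the function name and the `added`/`removed` layout are illustrative.

```python
def mark_modified_packages(first_code, second_code):
    """Mark code packages of the second program code whose API calls differ.

    first_code, second_code: dict mapping package name -> set of API calls.
    Returns a dict of marked packages with the calls added and removed
    relative to the first program code; an empty dict means no difference.
    """
    modified = {}
    for package, second_calls in second_code.items():
        first_calls = set(first_code.get(package, ()))
        added = set(second_calls) - first_calls
        removed = first_calls - set(second_calls)
        if added or removed:
            modified[package] = {"added": sorted(added),
                                 "removed": sorted(removed)}
    return modified
```

An empty result corresponds to the case in which the second program code has not been repackaged.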
Optionally, the first API call code is, for each application pair to be detected, the API call code of the second program code that was modified relative to the first program code. The first API call code of each application pair may comprise multiple kinds of API call code, and each kind may be modified in multiple ways. The modification information of each kind of API call code includes the content of every modification, so the modification information of the first API call code of each application pair includes the number of modifications of each kind of API call code and the specific content of each modification.
Optionally, after the modification information of the first API call code of the second program code relative to the first program code is obtained, malicious code detection may be performed on the first program code to identify whether it contains other implanted malicious code.
Optionally, there are multiple application pairs to be detected, and the first API call code of each application pair comprises multiple kinds of API call code. The modification information of the first API call code of all application pairs is aggregated to obtain the modification count of each kind of API call code, and these counts are summed to obtain the cumulative modification count. For each kind of API call code, its modification count is divided by the cumulative modification count to obtain the modification probability of that kind of API call code, that is, its repackaging probability.
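Written as a formula, with notation introduced here for illustration only (the text itself defines no symbols): let K be the number of kinds of API call code, n_i the modification count of kind i over all application pairs, and p_i its modification (repackaging) probability.

```latex
p_i = \frac{n_i}{\sum_{j=1}^{K} n_j}, \qquad \sum_{i=1}^{K} p_i = 1
```

For example, if three kinds of API call code were modified 6, 3, and 1 times, the cumulative modification count is 10 and the probabilities are 0.6, 0.3, and 0.1.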
In the data processing method for repackaged Android applications provided by this embodiment of the invention, an application pair to be detected is obtained, comprising an original application and a repackaged application generated by repackaging the original application; first calling code in the original application for calling third-party libraries and second calling code in the repackaged application for calling third-party libraries are identified; first program code other than the first calling code is extracted from the original application, and second program code other than the second calling code is extracted from the repackaged application; and the API call code contained in the first program code is compared with the API call code contained in the second program code to obtain modification information of the first API call code of the second program code relative to the first program code, that is, the code blocks of the second program code that were repackaged relative to the first program code and the modification information of those code blocks. The repackaging behavior of the repackaged application is thereby obtained.
Third-party libraries are usually called in one of three main ways. In the first way, the developer does not modify the third-party library when using it and imports the third-party class or jar files directly into the application project. The third-party library is then stored in the directories of the Smali code, different applications that use the same third-party library contain exactly the same code features, and the code block that calls the third-party library has not been repackaged. In the second way, different applications call a third-party library with the same domain name, but the internal API calls of the third-party library are modified in different applications. In the third way, different applications call third-party libraries of the same functional category, but the domain names of the third-party libraries differ between applications. The second and third ways both repackage the calling code of the third-party library. The embodiments shown in FIG. 2 and FIG. 4 below describe in detail how repackaging data is obtained and processed for the calling code that calls third-party libraries.
The embodiment shown in FIG. 2 below covers the second way: it describes in detail the data processing for the repackaging mode in which different applications call a third-party library with the same domain name but the API call permissions of the third-party library are modified in different applications.
FIG. 2 is a schematic flowchart of a data processing method for repackaged Android applications provided by another embodiment of the invention. On the basis of the embodiment shown in FIG. 1, this embodiment describes in detail the process of obtaining the API call permissions of third-party libraries with the same domain name. As shown in FIG. 2, after identifying the first calling code in the original application for calling third-party libraries and the second calling code in the repackaged application for calling third-party libraries, the method further includes:
S201, find the third-party libraries that have the same domain name in the second calling code and in the first calling code.
Optionally, the second calling code calls multiple third-party libraries, and the first calling code also calls multiple third-party libraries. Among the third-party libraries called by the second calling code and the first calling code, some have the same domain names and some have different domain names. The third-party libraries with the same domain name in the second calling code and the first calling code are found by comparing domain names.
S202, compare the API call permissions of the third-party library in the first calling code and in the second calling code, to obtain modification information of the second API call code of the second calling code relative to the first calling code.
Specifically, it is determined whether the API call permissions of the first calling code and the second calling code that call the third-party library with the same domain name are the same. If they differ, the modification information of the API call permissions of that third-party library in the second calling code is recorded. If they are the same, the second calling code has not repackaged that third-party library; that is, the third-party library code in the repackaged application that has the same domain name as in the original application has not been repackaged.
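Steps S201 and S202 can be sketched together as an intersection on domain names followed by a comparison of permissions. The input layout (domain name mapped to a set of API call permissions), the permission strings, and the function name are assumptions made for illustration.

```python
def compare_shared_libraries(first_libs, second_libs):
    """Compare API call permissions of libraries present in both calling codes.

    first_libs, second_libs: dict mapping library domain name -> set of
    API call permissions used when calling that library (hypothetical layout).
    Returns (shared domain names, recorded changes for libraries that differ).
    """
    # S201: libraries whose domain name appears in both calling codes.
    shared = sorted(set(first_libs) & set(second_libs))

    # S202: record modification information only where permissions differ.
    changes = {}
    for name in shared:
        before, after = set(first_libs[name]), set(second_libs[name])
        if before != after:
            changes[name] = {"added": sorted(after - before),
                             "removed": sorted(before - after)}
    return shared, changes
```

Libraries that appear in `shared` but not in `changes` correspond to third-party library code that was not repackaged.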
For each application pair to be detected, the embodiments shown in FIG. 1 and FIG. 2 can obtain the code blocks that were repackaged in the repackaged application and the modification details of each code block. In practice there are multiple application pairs to be detected, and the first API call code and the second API call code of each application pair each comprise multiple kinds of API call code. Aggregating the analysis results of the repackaged applications of multiple application pairs yields the probability with which each kind of API call code is repackaged across a large number of Android applications, which provides a priority criterion for subsequent code obfuscation. The process of aggregating the processing results of multiple application pairs is described in detail below with the embodiment shown in FIG. 3.
FIG. 3 is a schematic flowchart of a data processing method for repackaged Android applications provided by a further embodiment of the invention. There are multiple application pairs to be detected. As shown in FIG. 3, the method further includes:
S301: Perform statistical processing on the modification information of the first API call code and the modification information of the second API call code of all application pairs to be detected, to obtain the modification count and modification content of each kind of API call code.
The first API call code is the API call code that is modified in the second program code relative to the first program code within each application pair to be detected. The first API call code of each application pair may comprise multiple kinds of API call code. Optionally, each kind of API call code may be modified in multiple ways, and the modification information of each kind of API call code may include the content of each modification. The modification information of the first API call code of each application pair therefore includes the modification count of each kind of API call code and the specific content of each modification.
Accordingly, when there are multiple application pairs to be detected, statistical processing of the modification information of the first API call code of all the pairs yields, for each kind of API call code, the corresponding modification count and modification content. It should be understood that "each kind of API call code" here refers to the kinds of API call code contained in the first API call code across all application pairs to be detected.
Likewise, the second API call code of each application pair to be detected comprises multiple kinds of API call code. Statistical processing of the modification information of the second API call code yields, for each kind of API call code, the corresponding modification count and modification content. It should be understood that "each kind of API call code" here refers to the kinds of API call code contained in the second API call code across all application pairs to be detected.
The statistical result for the modification information of the first API call code and the statistical result for the modification information of the second API call code are then combined by taking their union. Specifically, the modification counts of identical API call code are merged, yielding the modification count of each kind of API call code.
S302: Sum all the modification counts to obtain the cumulative modification count.
S303: For each kind of API call code, divide the modification count of that kind of API call code by the cumulative modification count to determine its modification probability, that is, its repackaging probability, and thereby identify the key kinds of API call code.
Optionally, for API call code with a higher modification probability, the modification content of that API call code is also compiled.
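Steps S301 to S303 can be illustrated with a minimal Python sketch. It assumes each recorded modification is represented simply by the name of the API that was modified, and that "merging" the two statistical results means adding the counts for identical API call code; the function name and the sample API names are hypothetical and are not taken from the patent.

```python
from collections import Counter


def modification_probabilities(first_mods, second_mods):
    """Return {api_name: modification probability}.

    first_mods / second_mods: iterables of API names, one entry per recorded
    modification of the first / second API call code (assumed representation).
    """
    # S301: count modifications per kind of API call code and merge both results.
    counts = Counter(first_mods) + Counter(second_mods)
    # S302: cumulative modification count.
    total = sum(counts.values())
    if total == 0:
        return {}
    # S303: quotient of each count and the cumulative count.
    return {api: n / total for api, n in counts.items()}


if __name__ == "__main__":
    probs = modification_probabilities(
        ["getDeviceId", "sendTextMessage", "getDeviceId"],  # illustrative names only
        ["getDeviceId"],
    )
    for api, p in sorted(probs.items(), key=lambda item: -item[1]):
        print(api, p)
```

Sorting by descending probability, as in the usage lines, is one way to surface the kinds of API call code that would deserve focused protection.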
The Android application repackaging data processing method provided by this embodiment of the present invention statistically analyzes the repackaging processing results of a large number of application pairs to be detected and computes the code segments most commonly modified during repackaging. The modification content includes modifications to API call code and modifications to the called third-party libraries. Key code is thereby identified so that it can receive focused protection in the code obfuscation stage, improving the efficiency and quality of code obfuscation.
Fig. 4 is a schematic flowchart of an Android application repackaging data processing method provided by yet another embodiment of the present invention. On the basis of the embodiment described in Fig. 1, this embodiment performs repackaging data processing on third-party libraries whose domain names differ. As shown in Fig. 4, after identifying the first call code for calling third-party libraries in the original application and the second call code for calling third-party libraries in the repackaged application, the method further includes:
S401: Search the second call code for third call code, wherein the domain name of the third-party library called by the third call code is changed relative to the first call code.
The domain name of the third-party library called by the third call code differs from the domain names of the third-party libraries called by the first call code; that is, the third-party library called in the third call code is a repackaged third-party library. It should be understood that the third call code may call multiple third-party libraries.
S402: Classify, by function, the third-party library called by the third call code using a preset classifier, and determine the functional category corresponding to that third-party library.
Dividing third-party libraries into functional categories reveals the categories of third-party library that are prone to repackaging. Third-party libraries of those categories are treated as the focus of code obfuscation, which provides a basis for code obfuscation. In practical applications, third-party libraries can be divided into many categories. Optionally, the functional categories of third-party libraries include advertising libraries, social networking, third-party analytics libraries, map/location services, game engines, and development tools.
Optionally, the preset classifier is obtained by training with a machine learning algorithm. A large number of Android applications are used as input. For each application, Apktool is first used to decompile it from DEX bytecode into Smali intermediate code. Static features are then extracted from each directory in the Smali code, including both high-level directories such as "com.google.ads" and low-level directories such as "com.google.ads.util", in order to find the largest possible third-party library (the third-party library root directory) in the code hierarchy. Next, third-party libraries are detected by feature clustering: packages that belong to a third-party library are clustered into relatively large sets. Finally, various features are extracted from the detected third-party libraries and a classifier is trained by supervised machine learning; this classifier can assign the identified third-party libraries to functional categories.
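The idea of examining both high-level and low-level directories can be sketched as follows. This is only an illustration of enumerating candidate directories, not of the feature extraction, clustering, or training steps; it assumes class names are available in dotted form, and the function name and the sample class name are hypothetical.

```python
def candidate_packages(class_names):
    """Return every package prefix of the given dotted class names, sorted.

    Each prefix is a candidate directory from which static features could be
    extracted, from the top-level directory down to the most specific one.
    """
    prefixes = set()
    for name in class_names:
        parts = name.split(".")[:-1]  # drop the trailing class name
        for depth in range(1, len(parts) + 1):
            prefixes.add(".".join(parts[:depth]))
    return sorted(prefixes)


if __name__ == "__main__":
    # Hypothetical class name used only for illustration.
    for package in candidate_packages(["com.google.ads.util.AdUtil"]):
        print(package)
```

For the sample input, the candidates include both "com.google.ads" and "com.google.ads.util", which mirrors the high-level and low-level directories named in the text.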
In practical applications, the second call code for calling third-party libraries in the repackaged application is compared with the first call code for calling third-party libraries in the original application, in order to identify the repackaged third-party libraries in the second call code and the modification information within those libraries. Specifically, the second call code is first divided, according to the domain name information of the third-party libraries, into third call code and fourth call code. The domain name of a third-party library called by the third call code is changed relative to the domain names of the third-party libraries called by the first call code, whereas a third-party library called by the fourth call code has the same domain name as a third-party library called by the first call code. For the third call code, the third-party library itself has been repackaged; the preset classifier performs function classification on these third-party libraries whose domain names have changed, yielding the functional categories of third-party library that are prone to repackaging. For the fourth call code, the called third-party libraries are among those called by the first call code; for each third-party library with the same domain name, the API call permissions in the fourth call code are compared with the API call permissions in the first call code to determine which third-party libraries in the fourth call code have been repackaged, together with the specific modification information.
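The division into third and fourth call code, and the permission comparison for libraries with the same domain name, might be sketched as below. The sketch assumes each third-party library is identified by a domain name string and that API call permissions are available as plain collections of strings; all function names and sample values are hypothetical.

```python
def split_second_call_libs(first_libs, second_libs):
    """Split the libraries called by the second call code into
    (third, fourth): domain name changed vs. domain name unchanged."""
    first = set(first_libs)
    second = set(second_libs)
    third = sorted(lib for lib in second if lib not in first)
    fourth = sorted(lib for lib in second if lib in first)
    return third, fourth


def permission_changes(first_perms, second_perms):
    """Compare API call permissions of one same-named library in both apps."""
    first, second = set(first_perms), set(second_perms)
    return {"added": sorted(second - first), "removed": sorted(first - second)}


if __name__ == "__main__":
    # Hypothetical library names and permissions, for illustration only.
    third, fourth = split_second_call_libs(
        ["com.google.ads", "com.example.maps"],
        ["com.google.ads", "com.other.ads"],
    )
    print(third, fourth)
    print(permission_changes(["INTERNET"], ["INTERNET", "SEND_SMS"]))
```

A non-empty "added" or "removed" list for a library in the fourth group would be one possible signal that the library was modified during repackaging.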
Optionally, statistical processing is performed on the third-party libraries called by the third call code of all application pairs to be detected, to obtain the modification count of each third-party library in the third call code. For each third-party library, the modification count of that library is divided by the cumulative modification count of all third-party libraries to obtain the modification probability of that library, thereby identifying the third-party libraries whose domain names are frequently modified.
In addition to obtaining the repackaged code blocks of the second program code relative to the first program code and the modification information of those code blocks, the Android application repackaging data processing method provided by this embodiment of the present invention compares the second call code for calling third-party libraries in the repackaged application with the first call code for calling third-party libraries in the original application, and identifies the repackaged third-party libraries in the second call code and the modification information within those libraries. The repackaged code blocks and modification information are thus obtained more comprehensively and completely.
With the Android application repackaging data processing methods described in Fig. 1 to Fig. 4, the code blocks most prone to repackaging in a repackaged application and the modification information of each code block can be obtained. In practical applications, the interfaces of a repackaged application may also be modified. The following describes how to obtain the changed interfaces of a repackaged application and the interface modification information of those changed interfaces.
Fig. 5 is a schematic flowchart of an Android application repackaging data processing method provided by a further embodiment of the present invention. As shown in Fig. 5, the method further includes:
S501: Obtain, respectively, a first interface of the original application after it is run and a second interface of the repackaged application after it is run.
The first interface includes the main interface and each sub-interface of the original application after it is run; the second interface includes the main interface and each sub-interface of the repackaged application after it is run.
Optionally, a first Extensible Markup Language (XML) file of the first interface and a second XML file of the second interface are obtained. Specifically, the first XML files corresponding to the main interface and each sub-interface of the running original application are obtained, and the second XML files corresponding to the main interface and each sub-interface of the running repackaged application are obtained.
S502: Compare the first interface with the second interface to obtain interface modification information of the second interface relative to the first interface.
Optionally, the first XML file of the first interface is compared with the second XML file of the second interface to obtain the interface modification information of the second interface relative to the first interface. Specifically, for each sub-interface of the running repackaged application, the interface elements in the second XML file corresponding to that sub-interface are compared with the interface elements in the first XML file of the same sub-interface in the first interface, yielding the interface modification information of the second interface relative to the first interface. Optionally, the interface modification information includes the changed interface elements and the positions of the changed elements on the interface.
Optionally, the first XML file of the main interface of the original application is first compared, interface element by interface element, with the second XML file of the main interface of the repackaged application. If the interface elements of the two are unchanged, automated traversal is used to switch to each sub-interface of the running repackaged application, and the interface comparison described above is then performed to obtain the interface modification information of each sub-interface. If the first XML file of the main interface of the original application differs from the second XML file of the main interface of the repackaged application, the second interface has been modified; the interface modification information corresponding to the main interface is saved directly, and the second interface is marked as a repackaged interface. Optionally, the automated traversal is a breadth-first traversal.
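The element-by-element XML comparison can be sketched with Python's standard `xml.etree.ElementTree` module. The sketch assumes each interface element carries an `id` attribute that identifies it and a `bounds` attribute that records its position; these attribute names, the sample XML, and the function names are assumptions made for illustration and are not specified by the patent.

```python
import xml.etree.ElementTree as ET


def interface_elements(xml_text):
    """Map element id -> attribute dict for every identified node in an interface XML file."""
    root = ET.fromstring(xml_text)
    elements = {}
    for node in root.iter():
        key = node.get("id")  # assumed identifying attribute
        if key is not None:
            elements[key] = dict(node.attrib)
    return elements


def interface_diff(first_xml, second_xml):
    """Return [(element id, position)] for elements added, removed, or altered."""
    first = interface_elements(first_xml)
    second = interface_elements(second_xml)
    changes = []
    for key in sorted(first.keys() | second.keys()):
        if first.get(key) != second.get(key):
            # Prefer the position in the second interface; fall back to the first.
            attrs = second.get(key) or first.get(key)
            changes.append((key, attrs.get("bounds")))
    return changes


if __name__ == "__main__":
    first = '<screen><button id="pay" text="Pay" bounds="[0,0][100,50]"/></screen>'
    second = '<screen><button id="pay" text="Pay now" bounds="[0,0][100,50]"/></screen>'
    print(interface_diff(first, second))
```

An empty result for the main interface would correspond to the case where traversal continues to the sub-interfaces; a non-empty result would correspond to saving the modification information and marking the interface as repackaged.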
The Android application repackaging data processing method provided by this embodiment of the present invention compares the first interface of the running original application with the second interface of the running repackaged application, and obtains the changed interfaces of the repackaged application and the interface modification information of those changed interfaces.
In practical applications, there are multiple application pairs to be detected. Statistical analysis of the interface change data of the multiple application pairs reveals the interfaces most prone to modification and the interface elements on those interfaces most prone to modification. This is described in detail below with reference to the embodiment shown in Fig. 6.
Fig. 6 is a schematic flowchart of an Android application repackaging data processing method provided by a still further embodiment of the present invention. As shown in Fig. 6, the method further includes:
S601: Perform statistical processing on the interface modification information of the multiple second XML files to obtain the modification count of each running interface of the repackaged applications.
S602: Sum the modification counts of all running interfaces to obtain the cumulative interface modification count.
S603: For each running interface, divide the modification count of that running interface by the cumulative interface modification count to determine the modification probability of that running interface.
In practical applications, there are multiple application pairs to be detected. Each application pair corresponds to one repackaged application, the second interface of each repackaged application includes multiple running interfaces, and each running interface has a corresponding second XML file.
The interface modification information of all second XML files of the multiple application pairs is classified to obtain the multiple running interfaces and the interface element modification information of each running interface. Statistical processing is performed for each running interface to obtain its modification count. The modification counts of all running interfaces are then summed to obtain the cumulative interface modification count. For each running interface, the modification count of that running interface is divided by the cumulative interface modification count to obtain the modification probability of that running interface, that is, its repackaging probability.
Optionally, for running interfaces with a higher modification probability, the interface element modification information of those interfaces is recorded.
Optionally, statistical processing is performed on the interface element modification information of each running interface to obtain the change count of each interface element, for example, the change count of a payment button. The change counts of all interface elements are summed to obtain the cumulative change count of interface elements. For each interface element, the change count of that interface element is divided by the cumulative change count to obtain the change probability of that interface element.
The Android application repackaging data processing method provided by this embodiment of the present invention performs statistical processing on the interface modification information of a large number of application pairs to be detected and obtains the repackaging probability of each running interface, thereby identifying the interfaces that are most heavily repackaged so that they can receive focused protection during code obfuscation.
In one embodiment, repackaging data processing is performed on a large number of application pairs to be detected. For each application pair, the modification information of the API call code, the third-party libraries called by the fourth call code, and the interface modification information are summarized and counted to obtain the code blocks that are most heavily modified. The modification content includes modifications to API call code, modifications to the called third-party libraries, and modifications to interfaces; these code blocks are given focused protection during code obfuscation. Optionally, the modification content of each heavily modified code block is compiled into a detection report and saved for the user to review.
Based on the Android application repackaging data processing method provided by the above embodiments, the embodiments of the present invention further provide apparatus embodiments that implement the above method embodiments.
Fig. 7 is a schematic structural diagram of an Android application repackaging data processing apparatus provided by an embodiment of the present invention. As shown in Fig. 7, the Android application repackaging data processing apparatus includes a program acquisition module 710, an identification module 720, an extraction module 730, and a first comparison module 740.
The program acquisition module 710 is configured to obtain an application pair to be detected, the application pair including an original application and a repackaged application generated by repackaging the original application;
The identification module 720 is configured to identify first call code for calling third-party libraries in the original application and second call code for calling third-party libraries in the repackaged application;
The extraction module 730 is configured to extract, from the original application, first program code other than the first call code, and to extract, from the repackaged application, second program code other than the second call code, wherein the first program code and the second program code each include API call code for calling APIs;
The first comparison module 740 is configured to compare the API call code included in the first program code with the API call code included in the second program code, to obtain modification information of first API call code of the second program code relative to the first program code.
The Android application repackaging data processing apparatus provided by this embodiment of the present invention obtains an application pair to be detected that includes an original application and a repackaged application generated by repackaging the original application; identifies first call code for calling third-party libraries in the original application and second call code for calling third-party libraries in the repackaged application; extracts first program code other than the first call code from the original application and second program code other than the second call code from the repackaged application; and compares the API call code included in the first program code with the API call code included in the second program code to obtain modification information of the first API call code of the second program code relative to the first program code, that is, the repackaged code blocks of the second program code relative to the first program code and the modification information of those code blocks, so that they can receive focused protection during code obfuscation.
Fig. 8 is a schematic structural diagram of an Android application repackaging data processing apparatus provided by another embodiment of the present invention. As shown in Fig. 8, the Android application repackaging data processing apparatus further includes an interface acquisition module 750, an interface comparison module 760, a call code extraction module 770, a second comparison module 780, and a statistics module 790.
The interface acquisition module 750 is configured to obtain, respectively, a first interface of the original application after it is run and a second interface of the repackaged application after it is run.
The interface acquisition module 750 is further specifically configured to obtain a first Extensible Markup Language (XML) file of the first interface and a second XML file of the second interface.
The interface comparison module 760 is configured to compare the first interface with the second interface to obtain interface modification information of the second interface relative to the first interface.
The interface comparison module 760 is further specifically configured to compare the first XML file with the second XML file to obtain the interface modification information of the second interface relative to the first interface.
The call code extraction module 770 is configured to search the second call code for third call code, wherein the domain name of the third-party library called by the third call code is changed relative to the first call code; and to classify, by function, the third-party library called by the third call code using a preset classifier, determining the functional category corresponding to the third-party library called by the third call code.
The second comparison module 780 is configured to search for third-party libraries that have the same domain name in the second call code and in the first call code; and to compare the API call permissions of such a third-party library in the first call code and in the second call code, obtaining modification information of second API call code of the second call code relative to the first call code.
The statistics module 790 is configured to perform statistical processing on the modification information of the first API call code and the modification information of the second API call code of all application pairs to be detected, to obtain the modification count of each kind of API call code; to sum all the modification counts to obtain the cumulative modification count; and, for each kind of API call code, to divide the modification count of that kind of API call code by the cumulative modification count to determine its modification probability.
The statistics module 790 is further specifically configured to perform statistical processing on the interface modification information of the multiple second XML files to obtain the modification count of each repackaged interface and the change count of each interface element; to sum the modification counts of all repackaged interfaces to obtain the cumulative interface modification count; to divide the modification count of each repackaged interface by the cumulative interface modification count and, according to the size of each quotient, determine the modification probability of the corresponding repackaged interface; to sum the change counts of all interface elements to obtain the cumulative change count; and to divide the change count of each interface element by the cumulative change count and, according to the size of each quotient, determine the change probability of the corresponding interface element.
In addition to obtaining the repackaged code blocks of the second program code relative to the first program code and the modification information of those code blocks, the Android application repackaging data processing apparatus provided by this embodiment of the present invention can identify whether the third-party libraries called by the repackaged application have been repackaged and the modification information within the repackaged third-party libraries, and can also obtain the changed interfaces of the repackaged application and the interface modification information of those interfaces. The repackaged code blocks and modification information in the repackaged application are thus obtained comprehensively and completely. The apparatus further performs statistical processing on the code repackaging modification information and the interface modification information, computing the code segments and interface elements most commonly modified during repackaging so that they can receive focused protection in the code obfuscation stage, which improves the quality and efficiency of code obfuscation.
The Android application repackaging data processing apparatus 700 of the embodiments shown in Fig. 7 and Fig. 8 can be used to execute the technical solutions of the above method embodiments. Its implementation principles and technical effects are similar and are not repeated here.
It should be understood that the division of the Android application repackaging data processing apparatus shown in Fig. 7 and Fig. 8 into modules is merely a division of logical functions. In actual implementation, the modules may be fully or partially integrated into one physical entity, or may be physically separate. The modules may all be implemented as software invoked by a processing element, or may all be implemented in hardware; alternatively, some modules may be implemented as software invoked by a processing element while others are implemented in hardware. Furthermore, the modules may be fully or partially integrated together, or may be implemented independently. The processing element described here may be an integrated circuit with signal processing capability. During implementation, each step of the above method, or each of the above modules, may be completed by an integrated logic circuit of hardware in the processor element or by instructions in the form of software.
Fig. 9 is a schematic diagram of the hardware structure of an Android application repackaging data processing device provided by an embodiment of the present invention. As shown in Fig. 9, the Android application repackaging data processing device 800 provided by this embodiment includes at least one memory 810, a processor 820, and a computer program, wherein the computer program is stored in the memory 810 and is configured to be executed by the processor 820 to implement the Android application repackaging data processing method described above.
Those skilled in the art will understand that Fig. 9 is merely an example of an Android application repackaging data processing device and does not limit such a device. The device may include more or fewer components than shown, may combine certain components, or may use different components; for example, the device may further include input/output devices, network access devices, a bus, and the like.
In addition, an embodiment of the present invention provides a readable storage medium on which a computer program is stored, the computer program being executed by a processor to implement the method described in any of the above embodiments.
The above readable storage medium may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc. The readable storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some or all of the technical features; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. An Android application repackaging data processing method, characterized by comprising:
obtaining an application pair to be detected, the application pair including an original application and a repackaged application generated by repackaging the original application;
identifying first call code for calling third-party libraries in the original application and second call code for calling third-party libraries in the repackaged application;
extracting, from the original application, first program code other than the first call code, and extracting, from the repackaged application, second program code other than the second call code, wherein the first program code and the second program code each include API call code for calling APIs;
comparing the API call code included in the first program code with the API call code included in the second program code, to obtain modification information of first API call code of the second program code relative to the first program code.
2. The method according to claim 1, characterized in that, after identifying the first call code for calling third-party libraries in the original application and the second call code for calling third-party libraries in the repackaged application, the method further comprises:
searching for third-party libraries called by the second call code that have the same domain name as third-party libraries called by the first call code;
comparing the API call permissions of the third-party library in the first call code and in the second call code, to obtain modification information of second API call code of the second call code relative to the first call code.
3. The method according to claim 2, characterized in that there are multiple application pairs to be detected,
and the method further comprises:
performing statistical processing on the modification information of the first API call code and the modification information of the second API call code of all application pairs to be detected, to obtain the modification count of each kind of API call code;
summing all the modification counts to obtain a cumulative modification count;
for each kind of API call code, dividing the modification count of that kind of API call code by the cumulative modification count to determine the modification probability of that kind of API call code.
4. The method according to claim 1, characterized in that, after identifying the first call code for calling third-party libraries in the original application and the second call code for calling third-party libraries in the repackaged application, the method further comprises:
searching the second call code for third call code, wherein the domain name of the third-party library called by the third call code is changed relative to the first call code;
classifying, by function, the third-party library called by the third call code using a preset classifier, and determining the functional category corresponding to the third-party library called by the third call code.
5. The method according to claim 1, characterized in that the method further comprises:
obtaining, respectively, a first interface of the original application after it is run and a second interface of the repackaged application after it is run;
comparing the first interface with the second interface to obtain interface modification information of the second interface relative to the first interface.
6. The method according to claim 5, characterized in that, after obtaining, respectively, the first interface of the original application after it is run and the second interface of the repackaged application after it is run, the method further comprises:
Obtaining a first Extensible Markup Language (XML) file of the first interface and a second XML file of the second interface;
The comparing of the first interface with the second interface to obtain interface modification information of the second interface relative to the first interface comprises:
Comparing the first XML file with the second XML file to obtain the interface modification information of the second interface relative to the first interface.
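One way to compare the two XML files is sketched below using Python's standard `xml.etree.ElementTree`. It assumes each interface has been dumped as an XML hierarchy and that elements can be matched by their position in the tree; the claim does not fix the comparison strategy, and the names `flatten` and `diff_interfaces` are hypothetical.

```python
import xml.etree.ElementTree as ET


def flatten(xml_text):
    """Map each element's positional path to a copy of its attributes."""
    root = ET.fromstring(xml_text)
    nodes = {}

    def walk(element, path):
        nodes[path] = dict(element.attrib)
        for index, child in enumerate(element):
            walk(child, f"{path}/{child.tag}[{index}]")

    walk(root, root.tag)
    return nodes


def diff_interfaces(first_xml, second_xml):
    """Interface changes of the second XML dump relative to the first."""
    first, second = flatten(first_xml), flatten(second_xml)
    shared = first.keys() & second.keys()
    return {
        "added": sorted(second.keys() - first.keys()),
        "removed": sorted(first.keys() - second.keys()),
        "changed": sorted(p for p in shared if first[p] != second[p]),
    }
```

Positional matching is simple but sensitive to inserted siblings, since an element inserted early in a layout shifts the paths of everything after it. A more tolerant matcher could key on stable attributes instead.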
7. An Android application repackaging data processing apparatus, characterized by comprising:
A program acquisition module, configured to obtain an application pair to be detected, the application pair to be detected comprising an original application and a repackaged application generated by repackaging the original application;
An identification module, configured to identify first calling code for calling third-party libraries in the original application and second calling code for calling third-party libraries in the repackaged application;
An extraction module, configured to extract, from the original application, first program code other than the first calling code, and to extract, from the repackaged application, second program code other than the second calling code, wherein the first program code and the second program code each include API call code for calling APIs;
A first comparison module, configured to compare the API call code included in the first program code with the API call code included in the second program code, to obtain modification information of a first API call code of the second program code relative to the first program code.
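The extraction and first comparison modules can be sketched together. The sketch assumes each application has already been decompiled into (caller class, API name) records and that the identification module has produced a list of third-party library package prefixes; these data shapes and the names `extract_program_calls` and `diff_api_calls` are hypothetical.

```python
def extract_program_calls(call_records, library_prefixes):
    """API calls made outside third-party library packages.

    call_records: iterable of (caller_class, api_name) tuples (assumed shape).
    library_prefixes: package prefixes identified as third-party libraries.
    """
    return {
        api_name
        for caller_class, api_name in call_records
        if not any(caller_class.startswith(p) for p in library_prefixes)
    }


def diff_api_calls(original_records, repackaged_records, library_prefixes):
    """API call changes of the repackaged app relative to the original."""
    first = extract_program_calls(original_records, library_prefixes)
    second = extract_program_calls(repackaged_records, library_prefixes)
    return {
        "added": sorted(second - first),
        "removed": sorted(first - second),
    }
```

Filtering out library code first keeps changes inside bundled libraries from being attributed to the application's own code.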
8. The apparatus according to claim 7, characterized by further comprising:
An interface acquisition module, configured to obtain, respectively, a first interface of the original application after it is run and a second interface of the repackaged application after it is run;
An interface comparison module, configured to compare the first interface with the second interface to obtain interface modification information of the second interface relative to the first interface.
9. An Android application repackaging data processing device, characterized by comprising a memory and a processor;
The memory is configured to store instructions executable by the processor;
Wherein the processor is configured to execute the executable instructions to implement the method according to any one of claims 1 to 6.
10. A computer-readable storage medium, characterized in that computer-executable instructions are stored in the computer-readable storage medium, and the computer-executable instructions, when executed by a processor, implement the method according to any one of claims 1 to 6.
CN201910417798.2A 2019-05-20 2019-05-20 Android application repackaging data processing method and processing device Pending CN110175045A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910417798.2A CN110175045A (en) 2019-05-20 2019-05-20 Android application repackaging data processing method and processing device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910417798.2A CN110175045A (en) 2019-05-20 2019-05-20 Android application repackaging data processing method and processing device

Publications (1)

Publication Number Publication Date
CN110175045A true CN110175045A (en) 2019-08-27

Family

ID=67691684

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910417798.2A Pending CN110175045A (en) 2019-05-20 2019-05-20 Android application repackaging data processing method and processing device

Country Status (1)

Country Link
CN (1) CN110175045A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112148305A (en) * 2020-10-28 2020-12-29 腾讯科技(深圳)有限公司 Application detection method and device, computer equipment and readable storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103473346A (en) * 2013-09-24 2013-12-25 北京大学 Android re-packed application detection method based on application programming interface
CN104091121A (en) * 2014-06-12 2014-10-08 上海交通大学 Method for detecting, removing and recovering malicious codes of Android repackaging malicious software
CN105389508A (en) * 2015-11-10 2016-03-09 工业和信息化部电信研究院 Detection method and apparatus for re-packaged Android application
US20170169223A1 (en) * 2015-12-11 2017-06-15 Institute For Information Industry Detection system and method thereof
CN106951780A (en) * 2017-02-08 2017-07-14 中国科学院信息工程研究所 Static detection method and device for repackaged malicious applications
CN107169323A (en) * 2017-05-11 2017-09-15 南京大学 Android application repackaging detection method based on layout cluster graph
US20180144132A1 (en) * 2016-11-18 2018-05-24 Sichuan University Kind of android malicious code detection method on the base of community structure analysis

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
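Zhou Hao (周昊): "Research on Android repackaged application detection technology and Android malware analysis technology", China Master's Theses Full-text Database (Electronic Journal), Information Science and Technology *
Shen Yuedong (沈月东) et al.: "Android third-party library identification and classification based on domain name extension information", Sciencepaper Online: HTTP://WWW.PAPER.EDU.CN/RELEASEPAPER/CONTENT/201901-206 *
Wang Haoyu (王浩宇) et al.: "Android application repackaging detection based on code clone detection technology", Scientia Sinica Informationis (中国科学:信息科学) *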

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190827