CN105205356B - Repackaging detection method for APP applications - Google Patents
- Publication number
- CN105205356B CN201510595733.9A CN201510595733A
- Authority
- CN
- China
- Prior art keywords
- installation package
- app
- web page
- consistency
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/10—Protecting distributed programs or content, e.g. vending or licensing of copyrighted material ; Digital rights management [DRM]
- G06F21/12—Protecting executable software
- G06F21/121—Restricting unauthorised execution of programs
- G06F21/128—Restricting unauthorised execution of programs involving web programs, i.e. using technology especially used in internet, generally interacting with a web browser, e.g. hypertext markup language [HTML], applets, java
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/955—Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
- G06F16/9566—URL specific, e.g. using aliases, detecting broken or misspelled links
Abstract
The invention discloses a repackaging detection method for APP applications, used to detect whether an installation package generated by packaging an APP has undergone repackaging. The method comprises the following steps: judging whether the content within each file contained in the installation package is consistent and/or whether the different files it contains are consistent with one another; if so, judging that the installation package has not been repackaged; if not, judging that the installation package has been repackaged. The proposed method does not need to obtain the original APP separately, nor to compare a third-party APP with the original APP, and therefore offers better practicality and flexibility.
Description
Technical field
The present invention relates to a repackaging detection method for APP applications.
Background
With the development of the mobile Internet, more and more people carry out traditional Internet activities on their mobile phones, such as watching videos, making payments, shopping, and logging in to social networking sites. Smartphones play an increasingly important role in daily life and have greatly enriched and simplified it. At the same time, malicious mobile attack software emerges in an endless stream. Repackaged APPs are an important source of mobile malware: they undermine the healthy development of the Android ecosystem, harm developers' interests, promote the spread of malware, and endanger the security and privacy of ordinary users. Compared with the original APP, a repackaged APP generally stays largely unchanged but adds or removes some functions, for example by replacing some pictures in the APP, changing the address in a URL so that a phishing web page is loaded, or adding extra modules to steal user information.
A Hybrid APP is an APP that lies between a Web APP and a Native APP. It is developed with both HTML and Java, combines the advantages of Web APPs and Native APPs, and has good cross-platform properties. At the same time, a Hybrid APP differs considerably from both: it is divided into a Native layer and a Web layer, where the Native layer is mainly implemented in Java and the Web layer is mainly implemented in HTML5, whereas a Native APP is mainly implemented in Java code. Repackaging detection methods for Native APPs mainly compare a suspicious APK with the original APK and judge repackaging from the similarity of the compared content: if the similarity is high but the APK signatures differ, the APK is repackaged. Compared with a Native APP, a Hybrid APP is easier to reverse. Because a Native APP is mainly implemented in Java code, repackaging it requires decompiling it to Java code and then modifying that code. A Hybrid APP, in contrast, is mainly implemented in HTML5, so the main Web implementation code is visible as soon as the APK file is decompressed. HTML and JavaScript code is easy for an attacker to understand, which makes modification and repackaging easy. Hybrid APPs are therefore more easily repackaged than Native APPs.
Patent CN201210204247 discloses a method for detecting repackaged applications in Android markets. The method computes, from the dex file of each Android application, the lengths of the strings it uses as a feature code that distinguishes different applications, obtains the similarity between Android applications by computing the edit distance between their feature codes, and compares the similarity with a given threshold to decide whether an application is repackaged. That method can serve as an effective auxiliary means of detecting Android malware, but it mainly targets Native APPs and cannot adequately solve the repackaging problem of Hybrid APPs; moreover, when the original APP cannot be found, it cannot determine whether an APP is repackaged.
Patent CN201410261034 discloses a method for detecting, excising, and recovering from malicious code in repackaged Android malware. A feature library of fuzzy hash values is built from the malicious entry classes of known malicious programs and is matched against the entry classes of the disassembled program under test; the complete malicious code fragments added during repackaging and the resource files of the malicious code are then excised in turn; finally, the code fragments of the original program that were modified during repackaging are located and their original function is restored. That method addresses the main propagation characteristic of the increasingly serious malicious programs on the Android platform, namely implanting malicious code by repackaging, and detects and excises the malicious code implanted into normal programs. However, it mainly targets the detection of malicious code rather than the detection of repackaging, and detecting repackaging through malicious code has significant limitations: if repackaging merely replaces some pictures or other resources in the original APP, the method cannot detect it. The method also mainly targets Native APPs; because the program structures of Hybrid APPs and Native APPs differ, it is not applicable to Hybrid APPs.
Patent CN201310438647 discloses a method for detecting repackaged Android applications based on application programming interfaces. The application's program files are first processed to obtain smali code files; for each file, the usage of Android application programming interfaces is extracted from the smali code and frequency information is collected; files are then compared with one another and clustered, and files with high similarity and a large number of repetitions are regarded as third-party libraries; after the interference of third-party libraries is removed, program files with high similarity are clustered at application granularity; finally, author signature information is used to judge whether a repackaging relationship exists between applications. With that technical solution, repackaged applications can be detected automatically at the scale of a large application market, with high efficiency and accuracy. However, the method mainly targets Native APPs and does not consider the differences between Hybrid APPs and Native APPs, so it is not applicable to Hybrid APPs; moreover, when the original APP cannot be found, it cannot determine whether an APP is repackaged.
Patent US2014082729A discloses a method that evaluates the risk of a repackaged APP through risk analysis. The risk analysis judges whether an APP has been repackaged by determining whether it contains malicious code, using blacklist matching. That method has significant limitations for repackaging detection: if repackaging merely replaces some resource files in the original APP, it is difficult to detect. The method also mainly targets Native APPs and is not applicable to Hybrid APPs.
Summary of the invention
The object of the present invention is to propose a repackaging detection method for APP applications, so as to solve the technical problem that existing methods for detecting repackaged applications are poorly suited to Hybrid APPs.
To this end, the present invention proposes a repackaging detection method for APP applications, used to detect whether an installation package generated by packaging an APP has undergone repackaging, comprising the following steps: judging whether the content within each file contained in the installation package is consistent and/or whether the different files it contains are consistent with one another; if consistent, judging that the installation package has not been repackaged; if not consistent, judging that the installation package has been repackaged.
Preferably, judging whether the content within each file contained in the installation package is consistent and/or whether the different files it contains are consistent with one another may be carried out in one or more of the following modes:
Mode one, judging whether the local files accessed by the installation package are consistent with the files contained in the installation package;
Mode two, judging whether the network file link addresses accessed by the installation package are consistent with the main domain name corresponding to the APP;
Mode three, judging whether the contents of the web page files in the installation package are consistent;
Mode four, judging whether the application categories of the different types of files in the installation package are consistent.
Preferably, when mode one is used, the repackaging detection method comprises the following steps:
S31, obtaining all decompressed files contained in the installation package and their file names;
S32, obtaining the names of all local files accessed by the installation package;
S33, processing each local file name as follows: determining the file name similarity between the local file name and each decompressed file name; if the similarity between the local file name and every decompressed file name is less than a first threshold, the installation package has been repackaged; if the similarity between the local file name and some decompressed file name is not less than the first threshold, the installation package has not been repackaged.
Preferably, when mode two is used, the repackaging detection method comprises the following steps:
S41, obtaining all decompressed files contained in the installation package;
S42, obtaining all network file link addresses accessed by the installation package;
S43, processing each link address as follows: determining the domain name similarity between the link address and each subdomain name in a whitelist; if the similarity between the link address and every subdomain name is less than a second threshold, the installation package has been repackaged; if the similarity between the link address and some subdomain name is not less than the second threshold, the installation package has not been repackaged; wherein the whitelist contains the subdomain names corresponding to the APP.
Preferably, when mode three is used, the repackaging detection method comprises the following steps:
S51, obtaining the content feature value of each web page file in the installation package, the content feature value of the i-th web page file being denoted H(i);
S52, obtaining the content difference D(s) between all web page files in the installation package:
D(s) = Σ_{1 ≤ i < j ≤ n_h} d(H(i), H(j))
where n_h is the number of web page files in the installation package, H(j) is the content feature value of the j-th web page file, d(H(i), H(j)) is the content difference between the i-th and the j-th web page files, 1 ≤ i < j ≤ n_h, and i and j are positive integers;
S53, obtaining the content consistency f_html between all web page files in the installation package:
f_html = 1 - 2D(s) / [n_h(n_h + 1)]
and judging whether f_html is not less than a third threshold; if so, the installation package has not been repackaged; if not, the installation package has been repackaged.
Preferably, in step S51, the keywords of the i-th web page file are used as the content feature value H(i) of the i-th web page file;
in step S52, the number of keywords that differ between the i-th and the j-th web page files is used as the content difference d(H(i), H(j)).
Preferably, when mode four is used, the repackaging detection method comprises the following steps:
S71, obtaining the application categories corresponding to the web page files, the xml files, and the JavaScript file in the installation package, respectively;
S72, obtaining the consistency P_HJ of application categories between the web page files and the JavaScript file;
S73, obtaining the consistency Q_XJ of application categories between the xml files and the JavaScript file;
S74, obtaining the consistency F of application categories among the web page files, the xml files, and the JavaScript file:
F = w_1 * P_HJ + w_2 * Q_XJ;
where w_1 and w_2 are positive numbers; judging whether F is not less than a fourth threshold; if so, the installation package has not been repackaged; if not, the installation package has been repackaged.
Preferably, in step S71,
the application category corresponding to the JavaScript file is defined in terms of the sets J(j), where n_j is the number of APIs in the JavaScript file and J(j) is the application category corresponding to the j-th API in the JavaScript file;
the application category corresponding to the web page files is defined in terms of the sets H(h), where n_h is the number of web page files and H(h) is the application category corresponding to the keyword in the h-th web page file;
the application category corresponding to the xml files is X, where X is the application category corresponding to the keyword in all xml files.
In step S72, the consistency P_HJ of application categories between the web page files and the JavaScript file is obtained from the values p(h, j), where p(h, j) denotes the consistency of application categories between the h-th web page file and the j-th API:
p(h, j) = |J(j) ∩ H(h)| / |J(j) ∪ H(h)|,
and p(h, j) is 0 if the h-th web page file does not call the j-th API; here |J(j) ∩ H(h)| is the number of elements in the intersection of J(j) and H(h), and |J(j) ∪ H(h)| is the number of elements in the union of J(j) and H(h);
In step S73, the consistency Q_XJ of application categories between the xml files and the JavaScript file is:
Q_XJ = min{q(1), q(2), ..., q(n_a)};
where q(j) denotes the consistency of application categories between all xml files and the j-th API, and n_a denotes the number of APIs in the JavaScript file:
q(j) = |J(j) ∩ X| / |J(j) ∪ X|,
where |J(j) ∩ X| is the number of elements in the intersection of X and J(j).
Preferably, 0 < w_2 < w_1 < 1 and w_1 + w_2 = 1.
Preferably, the APP is Hybrid APP.
The repackaging detection method for APP applications proposed by the present invention is implemented using the "consistency principle" of APPs. The consistency principle means that, because a normal APP implements one main function, the content or functions within each of its files and across its different files are related to one another, that is, they exhibit consistency. The method can judge whether an APK has been repackaged without comparing it with the original APK, and can effectively improve the detection rate for repackaged applications.
Brief description of the drawings
Fig. 1 is the first flow chart of repackaging detection for APP applications according to a specific embodiment of the invention;
Fig. 2 is the second flow chart of repackaging detection for APP applications according to a specific embodiment of the invention.
Detailed description of the embodiments
The present invention is described in further detail below with reference to specific embodiments and the accompanying drawings. It should be emphasized that the following description is merely exemplary and is not intended to limit the scope of the invention or its application.
Non-limiting and non-exclusive embodiments are described with reference to the following drawings, in which identical reference numerals denote identical parts unless otherwise specified.
The present invention proposes a repackaging detection method for APP applications, used to detect whether an installation package generated by packaging an APP has undergone repackaging. To ensure that the repackaged installation package remains similar to the original APK in interface and function, the modifications an attacker makes to the installation package usually fall into the following cases:
1) modifying the parameter of loadurl() in the Java code to change the web page the APP loads, so that the APP loads a phishing website and steals the user's sensitive information;
2) modifying a link URL in an HTML web page file, so that the resource loaded from the URL changes;
3) modifying the HTML web page files, for example by adding HTML web page files to the APP;
4) modifying the JavaScript code to change the function it implements.
To counter these potential attacks, the repackaging detection method proposed by the present invention judges whether the content within each file contained in the installation package is consistent and/or whether the different files it contains are consistent with one another; if so, the installation package is judged not to have been repackaged; if not, it is judged to have been repackaged. Consistency here means that the files contained in an installation package that has not been repackaged should share a consistent theme and implement a consistent function. See Fig. 1, the first flow chart of repackaging detection for APP applications according to a specific embodiment of the invention. The proposed method does not need to obtain the original APP separately, nor to compare a third-party APP with the original APP, and therefore offers better practicality and flexibility.
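The overall decision described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the check functions are hypothetical placeholders, and the policy that a single failed check marks the package as repackaged is an assumption made for the example (the text allows the checks to be combined with "and/or").

```python
from typing import Callable, Dict, List

# A check receives the directory of an unpacked installation package and
# returns True when the corresponding consistency holds. The checks passed
# in are hypothetical placeholders, not functions named in the text.
ConsistencyCheck = Callable[[str], bool]


def failed_checks(package_dir: str, checks: Dict[str, ConsistencyCheck]) -> List[str]:
    """Return the names of all consistency checks that do not hold."""
    return [name for name, check in checks.items() if not check(package_dir)]


def is_repackaged(package_dir: str, checks: Dict[str, ConsistencyCheck]) -> bool:
    """Assumed policy: the package is repackaged if any enabled check fails."""
    return len(failed_checks(package_dir, checks)) > 0
```

Returning the names of the failed checks alongside the verdict makes it easier to see which kind of consistency was broken.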
For convenience of description, an APK installation package is taken as an example; other installation packages, such as IPA, are processed similarly. In one embodiment of the invention, the repackaging detection method may use one or more of the following detection modes:
Detection mode one:
Judge whether the local files accessed by the APK are consistent with the files contained in the APK; if so, the APK has not been repackaged; if not, the APK has been repackaged. In one embodiment, this comprises the following steps:
S1, decompress the APK and obtain all decompressed files and their file names;
S2, obtain the names of all local files accessed by the APK by extracting the loadurl() parameters from the decompressed files;
S3, process each local file name in turn as follows: determine the file name similarity between the local file name and each decompressed file name, computed by string matching. If the similarity between the local file name and every decompressed file name is less than a preset first threshold (for example 0.5), the loadurl() parameter consistency in the APK is considered broken, and the APK has been repackaged. If the similarity between the local file name and some decompressed file name is not less than the preset first threshold, the loadurl() parameter consistency in the APK is considered intact, and the APK has not been repackaged.
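Detection mode one can be sketched as below. The text only says that the similarity is computed "by string matching", so the use of `difflib.SequenceMatcher` here is an assumed stand-in. Comparing base names only, and requiring every accessed file name to have a match, are likewise assumptions made for the example.

```python
import posixpath
from difflib import SequenceMatcher
from typing import Iterable

FIRST_THRESHOLD = 0.5  # example value mentioned in the text


def filename_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; SequenceMatcher is an assumed similarity measure."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def parameters_consistent(accessed: Iterable[str],
                          unpacked: Iterable[str],
                          threshold: float = FIRST_THRESHOLD) -> bool:
    """Return False as soon as an accessed local file matches no unpacked file."""
    # Assumption: only the last path component is compared.
    unpacked_names = [posixpath.basename(u) for u in unpacked]
    for target in accessed:
        name = posixpath.basename(target)
        best = max((filename_similarity(name, u) for u in unpacked_names),
                   default=0.0)
        if best < threshold:
            return False  # parameter consistency is considered broken
    return True
```

For instance, `parameters_consistent(["file:///android_asset/www/index.html"], ["assets/www/index.html"])` finds an exact base-name match, so the package passes this check.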
Detection mode two:
Judge whether the network file link addresses accessed by the APK are consistent with the main domain name corresponding to the APP; if so, the APK has not been repackaged; if not, the APK has been repackaged. In one embodiment, this comprises the following steps:
S1, decompress the APK file and obtain all decompressed files;
S2, obtain from the decompressed files all network file link addresses accessed by the APK;
S3, process each link address in turn as follows: determine the domain name similarity between the link address and each subdomain name in a whitelist, computed with the KMP string matching algorithm. If the similarity between the link address and every subdomain name is less than a preset second threshold (for example 0.5), the link consistency in the APK is considered broken, and the APK has been repackaged. If the similarity between the link address and some subdomain name is not less than the preset second threshold, the link consistency in the APK is considered intact, and the APK has not been repackaged. The whitelist lists the subdomain names related to the APP. For example, for the Baidu News APP, the whitelist contains the subdomain name www.news.baidu.com, which belongs to its main domain www.baidu.com.
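Detection mode two can be sketched as below. The text names the KMP algorithm but does not say how a match is turned into a similarity score, so the definition used here (length of the longest prefix of the whitelisted domain found in the link's host name, divided by the domain's length) is an assumption made for illustration, as is the rule that every link must match.

```python
from typing import Iterable, List
from urllib.parse import urlparse

SECOND_THRESHOLD = 0.5  # example value mentioned in the text


def _failure_table(pattern: str) -> List[int]:
    """Standard KMP failure table (longest proper prefix that is also a suffix)."""
    table = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def kmp_similarity(text: str, pattern: str) -> float:
    """Assumed score: longest prefix of `pattern` occurring in `text`, normalised."""
    if not pattern:
        return 0.0
    table = _failure_table(pattern)
    k = best = 0
    for ch in text:
        while k > 0 and ch != pattern[k]:
            k = table[k - 1]
        if ch == pattern[k]:
            k += 1
        best = max(best, k)
        if k == len(pattern):
            return 1.0  # the whole whitelisted domain occurs in the host name
    return best / len(pattern)


def links_consistent(urls: Iterable[str], whitelist: Iterable[str],
                     threshold: float = SECOND_THRESHOLD) -> bool:
    """Return False as soon as a link matches no whitelisted subdomain name."""
    domains = [d.lower() for d in whitelist]
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        best = max((kmp_similarity(host, d) for d in domains), default=0.0)
        if best < threshold:
            return False  # link consistency is considered broken
    return True
```

A substring-based score of this kind would also accept a host such as `news.baidu.com.example.org`, so a stricter check would probably compare the registered domain instead; that refinement is outside what the text describes.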
Detection mode three:
Judge whether the contents of the web page files in the APK are consistent; if so, the APK has not been repackaged; if not, the APK has been repackaged. In one embodiment, this comprises the following steps:
S1, obtain the content feature value of each web page file in the APK, denoting the content feature value of the i-th web page file as H(i). In one embodiment, the keywords of the i-th web page file are used as the content feature value H(i). For example, for the i-th HTML web page file, to avoid extracting useless information, the tag content of the file can first be removed and the text content extracted; the five most frequent words are then selected from the text content as the content feature value of the file, denoted H(i) = {C(i)_1, C(i)_2, C(i)_3, C(i)_4, C(i)_5}, where C(i)_p denotes the word ranked p-th by number of occurrences in the i-th HTML web page file, 1 ≤ p ≤ 5;
Assuming the APK contains n_h HTML web page files, where n_h is a positive integer, the set of content feature values of the APK can be written as S = {H(1), H(2), ..., H(n_h)};
S2, obtain the content difference D(s) between all web page files in the APK:
D(s) = Σ_{1 ≤ i < j ≤ n_h} d(H(i), H(j))
where H(j) is the content feature value of the j-th web page file, d(H(i), H(j)) is the content difference between the i-th and the j-th web page files, for example the number of content feature values that differ between the two files, 1 ≤ i < j ≤ n_h, and i and j are positive integers;
S3, obtain the content consistency f_html between all web page files in the APK:
f_html = 1 - 2D(s) / [n_h(n_h + 1)]
Judge whether f_html is not less than a preset third threshold. If so, the web page consistency in the APK is considered intact, and the APK has not been repackaged; if not, the web page consistency in the APK is considered broken, and the APK has been repackaged.
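Detection mode three can be sketched as below. Several details are assumptions made for the example: tags are stripped with a simple regular expression, words are taken to be runs of Latin letters, the pairwise difference d is the size of the symmetric difference of the two keyword sets, and D(s) is the sum of d over all pairs i < j.

```python
import re
from collections import Counter
from itertools import combinations
from typing import List, Set

TAG_RE = re.compile(r"<[^>]+>")          # crude tag stripper (assumption)
WORD_RE = re.compile(r"[A-Za-z]{2,}")    # assumes Latin-script text


def content_feature(html: str, top_n: int = 5) -> Set[str]:
    """H(i): the top_n most frequent words in the page text."""
    text = TAG_RE.sub(" ", html)
    words = WORD_RE.findall(text.lower())
    return {word for word, _ in Counter(words).most_common(top_n)}


def content_difference(a: Set[str], b: Set[str]) -> int:
    """d(H(i), H(j)): keywords present in one page but not the other (assumed)."""
    return len(a ^ b)


def html_consistency(pages: List[str]) -> float:
    """f_html = 1 - 2 * D(s) / [n_h * (n_h + 1)], with D(s) summed over i < j."""
    features = [content_feature(page) for page in pages]
    n_h = len(features)
    if n_h == 0:
        return 1.0  # nothing to compare; treated as consistent (assumption)
    d_s = sum(content_difference(a, b) for a, b in combinations(features, 2))
    return 1 - 2 * d_s / (n_h * (n_h + 1))
```

The result would then be compared with the preset third threshold; note that with this choice of d the score can fall below zero when pages share no keywords.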
Detection mode four:
Judge whether the APK categories of the different types of files in the APK are consistent; if so, the APK has not been repackaged; if not, the APK has been repackaged. An APK usually contains different types of files, such as web page files (for example html and htm files), xml files, and JavaScript files. In one embodiment, this comprises the following steps:
S1, obtain the APK categories corresponding to the web page files, the xml files, and the JavaScript file in the APK, respectively. In an embodiment of the invention, the APK categories include: map, navigation, communication, social networking, shopping, weather, news, video, education, study, finance, and games;
(1) JavaScript file:
Process each API in the JavaScript file in turn: obtain the APK category corresponding to the API by looking it up in the API call table of each APK category. For example, if the j-th API appears in the API call table of the map category, the APK category corresponding to that API is "map". Because one API may correspond to several APK categories, the set J(j) is used to denote the APK categories corresponding to the j-th API; for example, J(j) = {A, B, C, D} means that the APK categories corresponding to the j-th API are A, B, C, and D, sorted by likelihood;
The API call table of each APK category can be obtained with the following steps:
Sa, analyze a large number of APKs, extract the APIs called by all APKs within each APK category, and obtain the public APIs of each category;
Sb, from the public APIs of each APK category, delete the APIs that belong to other APK categories; the remaining APIs are regarded as the feature APIs of that category, and all feature APIs of a category form its API call table. For example, let A, B, and C be three APK categories, where A has 5 APKs {A.1, A.2, A.3, A.4, A.5}, B has 5 APKs {B.1, B.2, B.3, B.4, B.5}, and C has 4 APKs {C.1, C.2, C.3, C.4, C.5}. If the API a.1 is called in all of {A.1, A.2, A.3, A.4, A.5}, then a.1 is a public API of category A; if a.1 is a public API of A but not a public API of B or C, then a.1 is a feature API of A. The other feature APIs of category A are found in the same way, forming the API call table of category A;
(2) Web page files:
Process each HTML web page file in turn: extract one keyword from the HTML web page file using HtmlParser; look the keyword up in the html keyword list to obtain the APK category corresponding to the HTML web page file; the set H(h) denotes the APK category corresponding to the h-th HTML web page file;
The html keyword list can be obtained with the following steps:
Sa, analyze a large number of HTML web page files, extract the keywords of the HTML web page files of all APKs within each APK category, and obtain the public keywords of each category;
Sb, from the public keywords of each APK category, delete in turn the keywords that belong to other categories; the remaining keywords are regarded as the feature keywords of that category, and all feature keywords of a category form its html keyword list;
(3) xml files:
Process all xml files: extract the text content from all xml files using regular expressions, and extract one keyword from the text content using TextRank; look the keyword up in the xml keyword list to obtain the APK category corresponding to all xml files, denoted X. The xml keyword list is obtained in a way similar to the html keyword list, which is not repeated here;
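The construction of the per-category API call table in steps Sa and Sb can be sketched as follows. This is an illustration under stated assumptions: each sample APK is represented simply as the set of API names it calls, the API names in the usage example are hypothetical, and "APIs that belong to other categories" is read as "public APIs of other categories", as in the a.1 example above. The same set operations would apply to the html and xml keyword lists.

```python
from typing import Dict, List, Set


def public_apis(apks: List[Set[str]]) -> Set[str]:
    """Step Sa: APIs called by every sample APK of one category."""
    return set.intersection(*apks) if apks else set()


def build_api_call_tables(samples: Dict[str, List[Set[str]]]) -> Dict[str, Set[str]]:
    """Step Sb: keep only public APIs that are not public in any other category."""
    public = {category: public_apis(apks) for category, apks in samples.items()}
    tables: Dict[str, Set[str]] = {}
    for category, apis in public.items():
        others: Set[str] = set()
        for other_category, other_apis in public.items():
            if other_category != category:
                others |= other_apis
        tables[category] = apis - others  # feature APIs of this category
    return tables


def categories_of(api: str, tables: Dict[str, Set[str]]) -> Set[str]:
    """J(j): the categories whose API call table contains the given API."""
    return {category for category, apis in tables.items() if api in apis}
```

For example, with the hypothetical samples `{"map": [{"getLocation", "log"}, {"getLocation", "log", "zoom"}], "shopping": [{"pay", "log"}, {"pay", "log"}]}`, the shared `log` API is discarded and each category keeps only its distinguishing API.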
S2, obtain the consistency P_HJ of APK categories between the HTML web page files and the JavaScript file from the values p(h, j), where n_h denotes the number of HTML web page files, n_a denotes the number of APIs in the JavaScript file, p(h, j) denotes the consistency of APK categories between the h-th HTML web page file and the j-th API, and j and h are positive integers. The formula for p(h, j) is:
p(h, j) = |J(j) ∩ H(h)| / |J(j) ∪ H(h)|,
and p(h, j) is 0 if the h-th HTML web page file does not call the j-th API. Here |J(j) ∩ H(h)| is the number of elements in the intersection of J(j) and H(h), and |J(j) ∪ H(h)| is the number of elements in the union of J(j) and H(h). The larger p(h, j) is, the greater the consistency of APK categories between the h-th HTML web page file and the j-th API;
S3, obtain the consistency Q_XJ of APK categories between the xml files and the JavaScript file:
Q_XJ = min{q(1), q(2), ..., q(n_a)};
where q(j) denotes the consistency of APK categories between all xml files and the j-th API, computed as:
q(j) = |J(j) ∩ X| / |J(j) ∪ X|,
where |J(j) ∩ X| is the number of elements in the intersection of X and J(j). The larger q(j) is, the greater the consistency between all xml files and the j-th API;
S4. Obtain the consistency $F$ of APK categories among the HTML web page files, the XML files, and the JavaScript file;
In a Hybrid APP, XML files only handle layout, whereas HTML web page files carry the actual content and functionality, so HTML web page files contribute more to the Hybrid APP as a whole than XML files do. To reflect the different influence of these two file types on global consistency, this embodiment of the invention gives the XML files a weight $w$ ($0 < w < 1/2$) and the HTML web page files a weight $(1-w)$. $F$ is computed as:
$$F = w \cdot Q_{XJ} + (1-w) \cdot P_{HJ}$$
The larger $F$ is, the greater the global consistency of the APK. $F$ is compared with a preset fourth threshold. If $F$ is not less than the threshold, the global consistency of the APK is considered intact and the APK has not been repackaged; otherwise, the global consistency of the APK is considered destroyed and the APK has been repackaged.
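Steps S2 to S4 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the representation of categories as Python sets, the `calls` set of (h, j) index pairs, and the sample data are all assumptions made for illustration. Taking the minimum over nonzero $p(h,j)$ follows the form of the $P_{HJ}$ formula given in claim 3, and the fallback value 0.0 for empty inputs is likewise an assumption.

```python
def jaccard(a, b):
    """|a ∩ b| / |a ∪ b|; taken as 0.0 when both sets are empty (assumption)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def p_hj(H, J, calls):
    """Smallest nonzero p(h, j) over HTML files h and APIs j.

    H[h] and J[j] are sets of category names; `calls` holds the (h, j)
    index pairs for which file h calls API j. Pairs outside `calls` score 0.
    Returns 0.0 when no pair scores above zero (assumption).
    """
    scores = [
        jaccard(J[j], H[h])
        for h in range(len(H))
        for j in range(len(J))
        if (h, j) in calls
    ]
    nonzero = [s for s in scores if s != 0]
    return min(nonzero) if nonzero else 0.0


def q_xj(X, J):
    """Smallest q(j) over all APIs; 0.0 when there are no APIs (assumption)."""
    return min((jaccard(J[j], X) for j in range(len(J))), default=0.0)


def global_consistency(p, q, w):
    """F = w * Q_XJ + (1 - w) * P_HJ with 0 < w < 1/2."""
    if not 0 < w < 0.5:
        raise ValueError("w must satisfy 0 < w < 1/2")
    return w * q + (1 - w) * p


def is_repackaged(f, fourth_threshold):
    """The APK counts as repackaged when F falls below the threshold."""
    return f < fourth_threshold


# Illustrative data only: two HTML files, two APIs, made-up categories.
H = [{"Weather", "Travel"}, {"Weather"}]
J = [{"Weather"}, {"Travel", "Navigation"}]
X = {"Weather", "Travel"}
calls = {(0, 0), (0, 1), (1, 0)}

F = global_consistency(p_hj(H, J, calls), q_xj(X, J), w=0.25)
print(round(F, 4), is_repackaged(F, fourth_threshold=0.5))
```

Because both $P_{HJ}$ and $Q_{XJ}$ are minima, a single badly matching file or API is enough to pull $F$ down, which is what lets one injected component lower the score of an otherwise consistent package.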
Fig. 2 is the second flowchart of APP repackaging detection according to the specific embodiment of the invention. The APP repackaging detection method proposed by the invention takes into account the differences between Hybrid APPs and Native APPs, and performs detection using parameter consistency, link consistency, HTML file consistency, and global consistency within a Hybrid APP. It can automatically detect repackaged Hybrid APPs at the scale of a large application market, with high efficiency and accuracy.
Those skilled in the art will recognize that numerous variations of the above description are possible, so the embodiments serve only to describe one or more particular implementations.
Although what are regarded as example embodiments of the invention have been described and illustrated, it will be apparent to those skilled in the art that various changes and substitutions can be made without departing from the spirit of the invention. In addition, many modifications can be made to adapt a particular situation to the teachings of the invention without departing from the central concept described herein. The invention is therefore not limited to the specific embodiments disclosed here, and may include all embodiments falling within the scope of the invention and their equivalents.
Claims (5)
1. A method for detecting repackaging of an APP application, used to detect whether an installation package generated by packaging an APP has been repackaged, characterized in that it comprises the following steps: judging whether each file contained in the installation package is internally consistent and/or whether the different files contained in it are consistent with one another; if they are consistent, judging that the installation package has not been repackaged; if they are not consistent, judging that the installation package has been repackaged;
the judgment of whether each file contained in the installation package is internally consistent and/or whether the different files contained in it are consistent with one another may be carried out in one or more of the following modes:
Mode one: judging whether the local files accessed by the installation package are consistent with the files contained in the installation package;
Mode two: judging whether the link addresses of the network files accessed by the installation package are consistent with the subdomains corresponding to the APP;
Mode three: judging whether the contents of the web page files in the installation package are consistent;
Mode four: judging whether the application categories of the different types of files in the installation package are consistent;
wherein, when mode one is used, the APP repackaging detection method comprises the following steps:
S31. Obtain all decompressed files contained in the installation package and their file names;
S32. Obtain the names of all local files accessed by the installation package;
S33. Process each local file name as follows: determine the file name similarity between the local file name and each decompressed file name; if the similarity between the local file name and every decompressed file name is less than a first threshold, the installation package has been repackaged; if the similarity between the local file name and some decompressed file name is not less than the first threshold, the installation package has not been repackaged;
wherein, when mode two is used, the APP repackaging detection method comprises the following steps:
S41. Obtain all decompressed files contained in the installation package;
S42. Obtain the link addresses of all network files accessed by the installation package;
S43. Process each link address as follows: determine the domain name similarity between the link address and each subdomain in a whitelist; if the similarity between the link address and every subdomain is less than a second threshold, the installation package has been repackaged; if the similarity between the link address and some subdomain is not less than the second threshold, the installation package has not been repackaged; wherein the whitelist contains the subdomains corresponding to the APP;
wherein, when mode three is used, the APP repackaging detection method comprises the following steps:
S51. Obtain the content feature value of each web page file in the installation package, denoting the content feature value of the i-th web page file as $H(i)$;
S52. Obtain the content difference degree $D(s)$ among all web page files in the installation package:
$$D(s) = \sum_{i=1}^{n_h} \sum_{j=i+1}^{n_h} d(H(i), H(j))$$
where $n_h$ is the number of web page files in the installation package, $H(j)$ is the content feature value of the j-th web page file, $d(H(i), H(j))$ is the content difference degree between the i-th and the j-th web page files, $1 \le i < j \le n_h$, and $i$, $j$ are positive integers;
S53. Obtain the content consistency $f_{html}$ among all web page files in the installation package:
$$f_{html} = 1 - \frac{2D(s)}{n_h(n_h+1)}$$
Determine whether $f_{html}$ is not less than a third threshold; if so, the installation package has not been repackaged; if not, the installation package has been repackaged;
wherein, when mode four is used, the APP repackaging detection method comprises the following steps:
S71. Obtain the application categories corresponding to the web page files, the XML files, and the JavaScript file in the installation package, respectively;
S72. Obtain the consistency $P_{HJ}$ of application categories between the web page files and the JavaScript file;
S73. Obtain the consistency $Q_{XJ}$ of application categories between the XML files and the JavaScript file;
S74. Obtain the consistency $F$ of application categories among the web page files, the XML files, and the JavaScript file:
$$F = w_1 \cdot P_{HJ} + w_2 \cdot Q_{XJ};$$
where $w_1$ and $w_2$ are positive numbers; determine whether $F$ is not less than a fourth threshold; if so, the installation package has not been repackaged; if not, the installation package has been repackaged.
2. The method for detecting repackaging of an APP application according to claim 1, characterized in that, in step S51, the keywords of the i-th web page file are used as the content feature value $H(i)$ of the i-th web page file;
in step S52, the number of keywords that differ between the i-th web page file and the j-th web page file is used as the content difference degree $d(H(i), H(j))$.
3. The method for detecting repackaging of an APP application according to claim 1, characterized in that, in step S71,
the application category corresponding to the JavaScript file is expressed in terms of $J(j)$, where $n_j$ is the number of APIs in the JavaScript file and $J(j)$ is the application category corresponding to the j-th API in the JavaScript file;
the application category corresponding to the web page files is expressed in terms of $H(h)$, where $n_h$ is the number of web page files and $H(h)$ is the application category corresponding to the keyword in the h-th web page file;
the application category corresponding to the XML files is $X$, where $X$ is the application category corresponding to the keyword in all the XML files;
in step S72, the consistency $P_{HJ}$ of application categories between the web page files and the JavaScript file is:
$$P_{HJ} = \min_{1 \le h \le n_h,\ 1 \le j \le n_j} \{\, p(h,j) \mid p(h,j) \neq 0 \,\};$$
where $p(h,j)$ is the consistency of application categories between the h-th web page file and the j-th API, computed as:
$$p(h,j) = |J(j) \cap H(h)| \,/\, |J(j) \cup H(h)|,$$
if the h-th web page file does not call the j-th API, $p(h,j)$ is 0; $|J(j) \cap H(h)|$ is the number of elements in the intersection of $J(j)$ and $H(h)$, and $|J(j) \cup H(h)|$ is the number of elements in their union;
in step S73, the consistency $Q_{XJ}$ of application categories between the XML files and the JavaScript file is:
$$Q_{XJ} = \min\{q(1), q(2), \ldots, q(n_a)\};$$
where $q(j)$ is the consistency of application categories between all the XML files and the j-th API, $n_a$ is the number of APIs in the JavaScript file, and $q(j)$ is computed as:
$$q(j) = |J(j) \cap X| \,/\, |J(j) \cup X|,$$
where $|J(j) \cap X|$ is the number of elements in the intersection of $X$ and $J(j)$.
4. The method for detecting repackaging of an APP application according to claim 1, characterized in that $0 < w_2 < w_1 < 1$ and $w_1 + w_2 = 1$.
5. The method for detecting repackaging of an APP application according to any one of claims 1 to 4, characterized in that the APP is a Hybrid APP.
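The checks in modes one to three of claim 1 can be sketched in Python as follows. This is an illustrative reading, not the claimed implementation. The claims do not name a similarity measure, so `difflib.SequenceMatcher` is used as a stand-in; the function names, the use of the URL host for the domain comparison, the rule that one unmatched name or link is enough to flag the package, and the symmetric-difference reading of "number of differing keywords" from claim 2 are all assumptions.

```python
from difflib import SequenceMatcher
from itertools import combinations
from urllib.parse import urlsplit


def similarity(a, b):
    """String similarity in [0, 1]; difflib's ratio is a stand-in measure."""
    return SequenceMatcher(None, a, b).ratio()


def has_unmatched(items, references, threshold):
    """True if some item scores below `threshold` against every reference."""
    return any(
        all(similarity(item, ref) < threshold for ref in references)
        for item in items
    )


def mode_one_repackaged(local_names, unpacked_names, first_threshold):
    """Mode one: accessed local file names vs. names of decompressed files."""
    return has_unmatched(local_names, unpacked_names, first_threshold)


def mode_two_repackaged(link_addresses, whitelist, second_threshold):
    """Mode two: host of each accessed link vs. whitelisted subdomains."""
    hosts = [urlsplit(url).hostname or "" for url in link_addresses]
    return has_unmatched(hosts, whitelist, second_threshold)


def f_html(keyword_sets):
    """Mode three: f_html = 1 - 2 D(s) / [n_h (n_h + 1)].

    d(H(i), H(j)) is taken as the size of the symmetric difference of the
    two keyword sets (one reading of "number of differing keywords").
    """
    n_h = len(keyword_sets)
    if n_h == 0:
        raise ValueError("at least one web page file is required")
    d_total = sum(len(a ^ b) for a, b in combinations(keyword_sets, 2))
    return 1 - 2 * d_total / (n_h * (n_h + 1))
```

For example, `mode_one_repackaged(["index.html", "evil.js"], ["index.html", "app.js"], 0.8)` flags the package because `evil.js` resembles no decompressed file name closely enough, while `f_html` would then be compared against the third threshold in the same way that $F$ is compared against the fourth.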
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510595733.9A CN105205356B (en) | 2015-09-17 | 2015-09-17 | Packet inspection method is beaten again in a kind of APP applications |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510595733.9A CN105205356B (en) | 2015-09-17 | 2015-09-17 | Packet inspection method is beaten again in a kind of APP applications |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105205356A CN105205356A (en) | 2015-12-30 |
CN105205356B true CN105205356B (en) | 2017-12-29 |
Family
ID=54953032
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510595733.9A Active CN105205356B (en) | 2015-09-17 | 2015-09-17 | Packet inspection method is beaten again in a kind of APP applications |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105205356B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105897923B (en) * | 2016-05-31 | 2019-04-30 | 中国科学院信息工程研究所 | A kind of APP installation kit network flow identification method |
CN106971098B (en) | 2016-10-11 | 2020-06-02 | 阿里巴巴集团控股有限公司 | Method and device for preventing repacking |
CN106951780B (en) * | 2017-02-08 | 2019-09-10 | 中国科学院信息工程研究所 | Beat again the static detection method and device of packet malicious application |
CN108958826B (en) * | 2017-05-22 | 2022-06-07 | 北京京东尚科信息技术有限公司 | Method and device for dynamically configuring application installation package |
CN108280647A (en) * | 2018-02-12 | 2018-07-13 | 北京金山安全软件有限公司 | Private key protection method and device for digital wallet, electronic equipment and storage medium |
CN109800575B (en) * | 2018-12-06 | 2023-06-20 | 成都网安科技发展有限公司 | Security detection method for Android application program |
CN109858249B (en) * | 2019-02-18 | 2020-08-07 | 暨南大学 | Rapid intelligent comparison and safety detection method for mobile malicious software big data |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101719821A (en) * | 2008-10-09 | 2010-06-02 | 爱思开电讯投资(中国)有限公司 | System for managing application program of intelligent card and method thereof |
CN104392181A (en) * | 2014-11-18 | 2015-03-04 | 北京奇虎科技有限公司 | SO file protection method and device and android installation package reinforcement method and system |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8875298B2 (en) * | 2012-02-16 | 2014-10-28 | Nec Laboratories America, Inc. | Method for scalable analysis of android applications for security vulnerability |
Non-Patent Citations (1)
Title |
---|
基于改进贝叶斯分类的Android恶意软件检测;张思琪;《综合电子信息技术》;20140630;第40卷(第6期);第73-76页 * |
Also Published As
Publication number | Publication date |
---|---|
CN105205356A (en) | 2015-12-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information |
Address after: 518055 Guangdong city of Shenzhen province Nanshan District Xili of Tsinghua Applicant after: Graduate School at Shenzhen, Tsinghua University Address before: 518000 Guangdong city in Shenzhen Province, Nanshan District City Xili Shenzhen Tsinghua Campus of Tsinghua University Applicant before: Graduate School at Shenzhen, Tsinghua University |
|
COR | Change of bibliographic data | ||
GR01 | Patent grant | ||