CN105205356B - Method for detecting repackaging of APP applications - Google Patents

Method for detecting repackaging of APP applications

Info

Publication number
CN105205356B
Authority
CN
China
Prior art keywords
installation package
app
web page
consistency
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510595733.9A
Other languages
Chinese (zh)
Other versions
CN105205356A (en)
Inventor
肖喜
张少峰
李清
胡光武
江勇
夏树涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Graduate School Tsinghua University
Original Assignee
Shenzhen Graduate School Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Graduate School Tsinghua University
Priority to CN201510595733.9A
Publication of CN105205356A
Application granted
Publication of CN105205356B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10 Protecting distributed programs or content, e.g. vending or licensing of copyrighted material; Digital rights management [DRM]
    • G06F21/12 Protecting executable software
    • G06F21/121 Restricting unauthorised execution of programs
    • G06F21/128 Restricting unauthorised execution of programs involving web programs, i.e. using technology especially used in internet, generally interacting with a web browser, e.g. hypertext markup language [HTML], applets, java
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9566 URL specific, e.g. using aliases, detecting broken or misspelled links

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Technology Law (AREA)
  • Computer Hardware Design (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for detecting repackaging of APP applications, which detects whether an installation package generated by packaging an APP has been repackaged. The method comprises the following step: judging whether each file contained in the installation package is internally consistent and/or whether the different files it contains are consistent with one another; if so, the installation package is judged not to have been repackaged; if not, the installation package is judged to have been repackaged. The proposed method does not need to obtain the original APP separately, nor to compare a third-party APP with the original APP, and therefore offers better practicality and flexibility.

Description

Method for detecting repackaging of APP applications
Technical field
The present invention relates to a method for detecting repackaging of APP applications.
Background art
With the development of the mobile Internet, more and more people carry out traditional Internet activities on their mobile phones, such as watching videos, making payments, shopping and logging in to social networking sites. Smartphones occupy an increasingly important position in daily life and have greatly enriched and facilitated people's lives. At the same time, malicious mobile attack software keeps emerging. Repackaged APPs are an important source of mobile malware: they undermine the healthy development of the Android ecosystem, harm developers' interests, promote the spread of malware and endanger the security and privacy of ordinary users. Compared with the original APP, a repackaged APP generally looks unchanged but adds or removes some functions, for example by replacing some pictures in the APP, altering the addresses in URLs so that pages of a phishing website are loaded, or adding extra modules to steal user information.
A Hybrid APP is an APP that lies between a Web APP and a Native APP. It is developed with both HTML and Java, combines the advantages of Web APPs and Native APPs, and has good cross-platform capability. Hybrid APPs nevertheless differ considerably from both Native APPs and Web APPs: a Hybrid APP is divided into a Native layer and a Web layer, the Native layer being implemented mainly in Java and the Web layer mainly in HTML5, whereas a Native APP is implemented mainly in Java code. Repackaging detection methods for Native APPs mainly compare a suspicious APK with the original APK and judge whether it is repackaged from the similarity of the compared content: if the similarity is high and the APK signatures differ, the APK is a repackaged one. A Hybrid APP is easier to reverse than a Native APP. Because a Native APP is implemented mainly in Java code, repackaging it requires decompiling it to Java code and then modifying that code. A Hybrid APP, by contrast, is implemented mainly in HTML5: simply decompressing the APK file exposes the main Web implementation code, and attackers can readily understand the HTML and JavaScript code and therefore easily modify and repackage it. Hybrid APPs are thus easier to repackage than Native APPs.
The scheme described in patent CN201210204247 discloses a method for detecting repackaged applications in Android markets. The method computes the lengths of the character strings used in the dex file of each Android application and uses them as a feature code that distinguishes different applications; it obtains the similarity between Android applications by calculating the edit distance between their feature codes, and compares that similarity with a given threshold to determine whether an application is repackaged. That method can serve as an effective auxiliary means of detecting Android malware, but it is aimed mainly at Native APPs and cannot handle Hybrid APP repackaging well; moreover, when the original APP cannot be found, it cannot detect whether an APP is repackaged.
The scheme described in patent CN201410261034 discloses a method for detecting, excising and recovering malicious code in repackaged Android malware. It builds a feature library of fuzzy hash values from the malicious entry classes of known malicious programs and matches it against the entry classes of the disassembled program under test; it then successively cuts out the complete malicious code fragments added by repackaging and the resource files of the malicious code, and finally locates the code fragments in which the original program was modified during repackaging and restores their original function. That method targets the main propagation characteristic of malicious programs on the Android platform, namely implanting malicious code by repackaging, and detects and removes the malicious code implanted into normal programs. It is aimed at detecting malicious code rather than at detecting repackaging as such, and detecting repackaging by way of malicious code has considerable limitations: if the repackaging merely replaces some pictures or other resources of the original APP, it will go undetected. The method is also aimed mainly at Native APPs; because the program structures of Hybrid APPs and Native APPs differ, it is not applicable to Hybrid APPs.
The scheme described in patent CN201310438647 discloses a method for detecting repackaged Android applications based on application programming interfaces. The program files are first processed to obtain smali code files; for each file, the usage of Android application programming interfaces is extracted from the smali code and frequency information is collected; the files are then compared with one another and clustered, and files with high similarity and many repetitions are regarded as third-party libraries; after the interference of third-party libraries is removed, program files with high similarity are clustered with the application as the unit; finally, author signature information is used to judge whether a repackaging relationship exists between applications. That scheme is stated to detect repackaged applications automatically at the scale of an application market, with high efficiency and accuracy. However, it is aimed mainly at Native APPs and does not consider the differences between Hybrid APPs and Native APPs, so it is not applicable to Hybrid APPs; and when the original APP cannot be found, it cannot detect whether an APP is repackaged.
The scheme described in patent US2014082729A discloses calculating the risk of a repackaged APP through risk analysis. The risk analysis judges whether an APP has been repackaged by determining whether it contains malicious code, using blacklist matching. This approach has considerable limitations for repackaging detection: if only some resource files of the original APP are replaced during repackaging, it will have difficulty detecting the change. It is also aimed mainly at Native APPs and is not applicable to Hybrid APPs.
Summary of the invention
The object of the present invention is to propose a method for detecting repackaging of APP applications, so as to solve the technical problem that existing repackaging detection methods are poorly suited to Hybrid APP applications.
To this end, the present invention proposes a method for detecting repackaging of APP applications, used to detect whether an installation package generated by packaging an APP has been repackaged, comprising the following step: judging whether each file contained in the installation package is internally consistent and/or whether the different files it contains are consistent with one another; if consistency is present, judging that the installation package has not been repackaged; if consistency is absent, judging that the installation package has been repackaged.
Preferably, judging whether each file contained in the installation package is internally consistent and/or whether the different files it contains are consistent with one another may be carried out in one or more of the following modes:
Mode one: judging whether the local files accessed by the installation package are consistent with the files contained in the installation package;
Mode two: judging whether the network file link addresses accessed by the installation package are consistent with the main domain name corresponding to the APP;
Mode three: judging whether the contents of the web page files in the installation package are consistent;
Mode four: judging whether the application categories of the different types of files in the installation package are consistent.
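The way several modes might be combined can be sketched as follows. This is only an illustration: the names `ConsistencyCheck` and `is_repackaged` are hypothetical, and the rule that a single failed check is enough to flag the package is an assumption, since the text states only that one or more modes may be used.

```python
from typing import Callable, Dict, Iterable

# Hypothetical type: a check receives the directory of an unpacked
# installation package and returns True when the package looks consistent.
ConsistencyCheck = Callable[[str], bool]


def is_repackaged(package_dir: str,
                  checks: Dict[str, ConsistencyCheck],
                  enabled: Iterable[str]) -> bool:
    """Return True if any enabled consistency check fails.

    Assumption: when several modes are enabled, one failed check is
    enough to judge the package repackaged.
    """
    for name in enabled:
        if not checks[name](package_dir):
            return True
    return False
```

Each of the four modes would be supplied as one entry of `checks`, so that a deployment can enable only the modes it has data for.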
Preferably, when mode one is used, the method comprises the following steps:
S31: obtaining all decompressed files contained in the installation package and their file names;
S32: obtaining the names of all local files accessed by the installation package;
S33: processing each local file name as follows: determining the file-name similarity between the local file name and each decompressed file name; if the similarity between the local file name and every decompressed file name is less than a first threshold, the installation package has been repackaged; if the similarity between the local file name and some decompressed file name is not less than the first threshold, the installation package has not been repackaged.
Preferably, when mode two is used, the method comprises the following steps:
S41: obtaining all decompressed files contained in the installation package;
S42: obtaining all network file link addresses accessed by the installation package;
S43: processing each link address as follows: determining the domain-name similarity between the link address and each subdomain name in a whitelist; if the similarity between the link address and every subdomain name is less than a second threshold, the installation package has been repackaged; if the similarity between the link address and some subdomain name is not less than the second threshold, the installation package has not been repackaged; wherein the whitelist contains the subdomain names corresponding to the APP.
Preferably, when mode three is used, the method comprises the following steps:
S51: obtaining the content feature value of each web page file in the installation package, the content feature value of the i-th web page file being denoted H(i);
S52: obtaining the content difference D(s) between all web page files in the installation package from the pairwise differences d(H(i), H(j)),
where n_h is the number of web page files in the installation package, H(j) is the content feature value of the j-th web page file, d(H(i), H(j)) is the content difference between the i-th and j-th web page files, 1 ≤ i < j ≤ n_h, and i and j are positive integers;
S53: obtaining the content consistency f_html between all web page files in the installation package:
f_html = 1 - 2D(s) / [n_h(n_h + 1)]
and judging whether f_html is not less than a third threshold; if so, the installation package has not been repackaged; if not, the installation package has been repackaged.
Preferably, in step S51, the keywords of the i-th web page file are used as the content feature value H(i) of the i-th web page file;
in step S52, the number of keywords that differ between the i-th and j-th web page files is used as the content difference d(H(i), H(j)).
Preferably, when mode four is used, the method comprises the following steps:
S71: obtaining the application categories corresponding to the web page files, the xml files and the JavaScript files in the installation package respectively;
S72: obtaining the consistency P_HJ of application categories between the web page files and the JavaScript files;
S73: obtaining the consistency Q_XJ of application categories between the xml files and the JavaScript files;
S74: obtaining the consistency F of application categories among the web page files, the xml files and the JavaScript files:
F = w_1 * P_HJ + w_2 * Q_XJ
where w_1 and w_2 are positive numbers; and judging whether F is not less than a fourth threshold; if so, the installation package has not been repackaged; if not, the installation package has been repackaged.
Preferably, in step S71,
the application category corresponding to the JavaScript files is obtained from the sets J(j), where n_j is the number of APIs in the JavaScript files and J(j) is the application category corresponding to the j-th API in the JavaScript files;
the application category corresponding to the web page files is obtained from the sets H(h), where n_h is the number of web page files and H(h) is the application category corresponding to the keyword of the h-th web page file;
the application category corresponding to the xml files is X, the application category corresponding to the keyword of all the xml files.
In step S72, the consistency P_HJ of application categories between the web page files and the JavaScript files is obtained from the values p(h, j),
where p(h, j) denotes the consistency of application categories between the h-th web page file and the j-th API, given by:
p(h, j) = |J(j) ∩ H(h)| / |J(j) ∪ H(h)|,
with p(h, j) equal to 0 if the h-th web page file does not call the j-th API; |J(j) ∩ H(h)| is the number of elements in the intersection of J(j) and H(h), and |J(j) ∪ H(h)| is the number of elements in the union of J(j) and H(h);
In step S73, the consistency Q_XJ of application categories between the xml files and the JavaScript files is:
Q_XJ = min{q(1), q(2), ..., q(n_a)};
where q(j) denotes the consistency of application categories between all the xml files and the j-th API, and n_a denotes the number of APIs in the JavaScript files; q(j) is given by:
q(j) = |J(j) ∩ X| / |J(j) ∪ X|,
where |J(j) ∩ X| is the number of elements in the intersection of X and J(j).
Preferably, 0 < w_2 < w_1 < 1 and w_1 + w_2 = 1.
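The category-consistency formulas of mode four can be illustrated with a short sketch. It covers p(h, j), q(j), Q_XJ and F as they are written in the text. Because the expression that aggregates the p(h, j) values into P_HJ is not reproduced here, the sketch takes P_HJ as an input. The function names, the default weights 0.7 and 0.3, and the value 0 returned for two empty sets are illustrative assumptions.

```python
from typing import AbstractSet, Sequence


def jaccard(a: AbstractSet[str], b: AbstractSet[str]) -> float:
    """|a ∩ b| / |a ∪ b|; taken as 0 when both sets are empty (assumed)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def p(h_categories: AbstractSet[str],
      j_categories: AbstractSet[str],
      calls: bool) -> float:
    """p(h, j): 0 when page h does not call API j, else the set ratio."""
    return jaccard(j_categories, h_categories) if calls else 0.0


def q_xj(api_categories: Sequence[AbstractSet[str]],
         x: AbstractSet[str]) -> float:
    """Q_XJ = min{q(1), ..., q(n_a)}; needs at least one API."""
    return min(jaccard(j, x) for j in api_categories)


def overall(p_hj: float, q: float, w1: float = 0.7, w2: float = 0.3) -> float:
    """F = w1 * P_HJ + w2 * Q_XJ with 0 < w2 < w1 < 1 and w1 + w2 = 1."""
    if not (0 < w2 < w1 < 1) or abs(w1 + w2 - 1) > 1e-9:
        raise ValueError("weights must satisfy 0 < w2 < w1 < 1, w1 + w2 = 1")
    return w1 * p_hj + w2 * q
```

The result of `overall` would then be compared with the fourth threshold; a value below it indicates a repackaged installation package.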
Preferably, the APP is a Hybrid APP.
The method for detecting repackaging of APP applications proposed by the present invention is based on the "consistency principle" of APPs. The consistency principle means that, in order to realise one and the same main function, a normal APP exhibits consistency: the content or function within each of its files, and across its different files, is related. The method can judge whether an APK has been repackaged without comparing it with the original APK, and can effectively improve the detection rate for repackaged applications.
Brief description of the drawings
Fig. 1 is a first flow chart of APP repackaging detection according to a specific embodiment of the invention;
Fig. 2 is a second flow chart of APP repackaging detection according to a specific embodiment of the invention.
Detailed description of the embodiments
The present invention is described in further detail below with reference to specific embodiments and the accompanying drawings. It should be emphasised that the following description is merely exemplary and is not intended to limit the scope of the invention or its application.
Non-limiting and non-exclusive embodiments are described with reference to the following drawings, in which identical reference numerals denote identical parts unless otherwise specified.
The present invention proposes a method for detecting repackaging of APP applications, used to detect whether an installation package generated by packaging an APP has been repackaged. To ensure that the repackaged installation package remains reasonably similar to the original APK in interface and function, attackers typically modify the installation package in the following ways:
1) modifying the parameter of loadurl() in the Java code to change the web page loaded by the APP, so that the APP loads a phishing website and the user's sensitive information is stolen;
2) modifying link URLs in html web page files so that the resources loaded from those URLs change;
3) modifying the html web page files, for example adding html web page files to the APP;
4) modifying the JavaScript code to change the functions it implements.
To counter these potential attacks, the method proposed by the present invention judges whether each file contained in the installation package is internally consistent and/or whether the different files it contains are consistent with one another; if so, the installation package is judged not to have been repackaged; if not, it is judged to have been repackaged. Consistency here means that the files contained in an installation package that has not been repackaged should share a consistent theme and implement consistent functions. Fig. 1 shows the first flow chart of APP repackaging detection according to the specific embodiment. The proposed method does not need to obtain the original APP separately, nor to compare a third-party APP with the original APP, and therefore offers better practicality and flexibility.
For ease of description, an APK installation package is taken as the example; other installation packages such as IPA are processed similarly. In one embodiment of the invention, the method may be carried out in one or more of the following modes:
Detection mode one:
Judging whether the local files accessed by the APK are consistent with the files contained in the APK; if so, the APK has not been repackaged; if not, the APK has been repackaged. In one embodiment, this comprises the following steps:
S1: decompressing the APK to obtain all decompressed files and their file names;
S2: obtaining the names of all local files accessed by the APK by reading the loadurl() parameters in the decompressed files;
S3: processing each local file name in turn as follows: determining the file-name similarity between the local file name and each decompressed file name, computed by string matching; if the similarity between the local file name and every decompressed file name is less than a preset first threshold (for example 0.5), the consistency of the loadurl() parameters in the APK is considered to have been broken and the APK has been repackaged; if the similarity between the local file name and some decompressed file name is not less than the preset first threshold, the consistency of the loadurl() parameters in the APK is considered intact and the APK has not been repackaged.
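Steps S2 and S3 of detection mode one can be sketched as follows. This is a minimal illustration under stated assumptions: the regular expression assumes local pages are loaded through the Android `loadUrl("file:///android_asset/...")` convention, `difflib.SequenceMatcher` stands in for the unspecified string-matching similarity, and the function names are hypothetical. The threshold of 0.5 is the example value given in the text.

```python
import re
from difflib import SequenceMatcher
from typing import Iterable, List

# Assumed pattern: local pages loaded via loadUrl("file:///android_asset/...").
LOAD_URL = re.compile(r'loadUrl\(\s*"file:///android_asset/([^"]+)"')


def local_targets(source_text: str) -> List[str]:
    """Collect the asset paths passed to loadUrl() in one source file."""
    return LOAD_URL.findall(source_text)


def name_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; SequenceMatcher is a stand-in measure."""
    return SequenceMatcher(None, a, b).ratio()


def parameter_consistent(targets: Iterable[str],
                         packaged_names: Iterable[str],
                         threshold: float = 0.5) -> bool:
    """True when every loadUrl() target resembles some packaged file name."""
    names = list(packaged_names)
    return all(
        any(name_similarity(t, n) >= threshold for n in names)
        for t in targets
    )
```

In practice `packaged_names` could come from `zipfile.ZipFile(apk_path).namelist()`, since an APK is a ZIP archive; a return value of `False` corresponds to the "repackaged" verdict of step S3.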
Detection mode two:
Judging whether the network file link addresses accessed by the APK are consistent with the main domain name corresponding to the APP; if so, the APK has not been repackaged; if not, the APK has been repackaged. In one embodiment, this comprises the following steps:
S1: decompressing the APK file to obtain all decompressed files;
S2: obtaining from the decompressed files all network file link addresses accessed by the APK;
S3: processing each link address in turn as follows: determining the domain-name similarity between the link address and each subdomain name in a whitelist, computed using the KMP string matching algorithm; if the similarity between the link address and every subdomain name is less than a preset second threshold (for example 0.5), the link consistency in the APK is considered to have been broken and the APK has been repackaged; if the similarity between the link address and some subdomain name is not less than the preset second threshold, the link consistency in the APK is considered intact and the APK has not been repackaged. The whitelist lists the subdomain names related to the APP. For example, for the Baidu News APP, the whitelist contains the subdomain name www.news.baidu.com, which belongs to its main domain www.baidu.com.
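The whitelist comparison of step S3 can be sketched in the same spirit. The text names the KMP string matching algorithm but does not define how a similarity score is derived from it, so the sketch below substitutes `difflib.SequenceMatcher` and says so plainly; the function names are hypothetical and the threshold of 0.5 is the example value from the text.

```python
from difflib import SequenceMatcher
from typing import Iterable
from urllib.parse import urlparse


def host_of(url: str) -> str:
    """Return the lower-cased host name of a URL, or '' if it has none."""
    return (urlparse(url).hostname or "").lower()


def domain_similarity(host: str, allowed: str) -> float:
    """Stand-in similarity in [0, 1]; not the KMP-based measure itself."""
    return SequenceMatcher(None, host, allowed.lower()).ratio()


def links_consistent(urls: Iterable[str],
                     whitelist: Iterable[str],
                     threshold: float = 0.5) -> bool:
    """True when every network link resembles some whitelisted subdomain."""
    allowed = list(whitelist)
    return all(
        any(domain_similarity(host_of(u), a) >= threshold for a in allowed)
        for u in urls
    )
```

A purely string-based score accepts look-alike hosts that merely contain a whitelisted name, so a stricter deployment might additionally require an exact suffix match on the main domain.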
Detection mode three:
Judging whether the contents of the web page files in the APK are consistent; if so, the APK has not been repackaged; if not, the APK has been repackaged. In one embodiment, this comprises the following steps:
S1: obtaining the content feature value of each web page file in the APK, the content feature value of the i-th web page file being denoted H(i). In one embodiment, the keywords of the i-th web page file are used as the content feature value H(i). For the i-th html web page file, for example, to avoid extracting useless information, the tag content of the file is removed first and the text content is then extracted; the five most frequent words in the text content are selected as the content feature value of the file, denoted H(i) = {C(i)_1, C(i)_2, C(i)_3, C(i)_4, C(i)_5}, where C(i)_p denotes the word ranked p-th by number of occurrences in the i-th html web page file, 1 ≤ p ≤ 5;
Assuming the APK contains n_h html web page files, n_h being a positive integer, the set of content feature values of the APK can be written as: S = {H(1), H(2), ..., H(n_h)};
S2: obtaining the content difference D(s) between all web page files in the APK from the pairwise differences d(H(i), H(j)),
where H(j) is the content feature value of the j-th web page file and d(H(i), H(j)) is the content difference between the i-th and j-th web page files, for example the number of content feature values that differ between the i-th and j-th web page files, 1 ≤ i < j ≤ n_h, and i and j are positive integers;
S3: obtaining the content consistency f_html between all web page files in the APK:
f_html = 1 - 2D(s) / [n_h(n_h + 1)]
and judging whether f_html is not less than a preset third threshold; if so, the web page consistency in the APK is considered intact and the APK has not been repackaged; if not, the web page consistency in the APK is considered to have been broken and the APK has been repackaged.
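Detection mode three can be sketched as follows. Several points are assumptions rather than statements of the method: D(s) is taken to be the sum of the pairwise differences d(H(i), H(j)) over all i < j, because its formula is not reproduced in the text; d is computed as the number of keywords of one file that are missing from the other; tags are stripped with a simple regular expression; and an empty file list is treated as consistent. All function names are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations
from typing import FrozenSet, List

TAG = re.compile(r"<[^>]+>")   # crude tag stripper, enough for a sketch
WORD = re.compile(r"\w+")      # naive tokeniser; no word segmentation


def content_features(html: str, top: int = 5) -> FrozenSet[str]:
    """H(i): the `top` most frequent words once markup is removed."""
    words = WORD.findall(TAG.sub(" ", html).lower())
    return frozenset(w for w, _ in Counter(words).most_common(top))


def difference(a: FrozenSet[str], b: FrozenSet[str]) -> int:
    """d(H(i), H(j)): keywords of one file missing from the other (assumed)."""
    return max(len(a - b), len(b - a))


def html_consistency(features: List[FrozenSet[str]]) -> float:
    """f_html = 1 - 2 D(s) / [n_h (n_h + 1)], with D(s) assumed to be a sum."""
    n_h = len(features)
    if n_h == 0:
        return 1.0  # assumed: nothing to compare counts as consistent
    d_s = sum(difference(a, b) for a, b in combinations(features, 2))
    return 1 - 2 * d_s / (n_h * (n_h + 1))
```

The value returned by `html_consistency` would be compared with the third threshold. The naive tokeniser treats a run of Chinese characters as one word, so a real implementation would need a proper word segmenter for Chinese pages.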
Detection mode four:
Judging whether the application categories of the different types of files in the APK are consistent; if so, the APK has not been repackaged; if not, the APK has been repackaged. An APK usually contains different types of files, such as web page files (for example html web page files and htm files), xml files and JavaScript files. In one embodiment, this comprises the following steps:
S1: obtaining the application categories corresponding to the web page files, the xml files and the JavaScript files in the APK respectively. In an embodiment of the invention, the application categories include: map, navigation, communication, making friends, shopping, weather, news, video, education, learning, financial management and games;
(1) JavaScript files:
Each API in the JavaScript files is processed in turn: the application category corresponding to the API is obtained by looking it up in the API call table of each application category. For example, if the j-th API appears in the API call table of the map category, the application category corresponding to that API is "map". Because one API may correspond to several application categories, the set J(j) denotes the application categories corresponding to the j-th API; for example, J(j) = {A, B, C, D} means that the application categories corresponding to the j-th API are A, B, C and D, ordered by likelihood;
The API call table of each application category can be obtained by the following steps:
Sa: analysing a large number of APKs and extracting, for each category, the APIs called by all APKs of that category, to obtain the common APIs of the category;
Sb: deleting from the common APIs of each category those that belong to other categories; the remaining APIs are regarded as the feature APIs of the category, and all the feature APIs of a category form its API call table. For example, let A, B and C be three categories of APK, where A contains five APKs {A.1, A.2, A.3, A.4, A.5}, B contains five APKs {B.1, B.2, B.3, B.4, B.5} and C contains five APKs {C.1, C.2, C.3, C.4, C.5}. If the API a.1 is called by all of {A.1, A.2, A.3, A.4, A.5}, then a.1 is a common API of category A; if a.1 is a common API of A but not of B or C, then a.1 is a feature API of A. The other feature APIs of category A are found in the same way and together form the API call table of category A;
(2) Web page files:
Each html web page file is processed in turn: one keyword is extracted from the html web page file using HtmlParser; the keyword is looked up in the html keyword list to obtain the application category corresponding to the html web page file, the set H(h) denoting the application category corresponding to the h-th html web page file;
The html keyword list can be obtained by the following steps:
Sa: analysing a large number of html web page files and extracting the keywords of the html web page files of all APKs in each category, to obtain the common keywords of each category;
Sb: deleting in turn from the common keywords of each category those that belong to other categories; the remaining keywords are regarded as the feature keywords of the category, and all the feature keywords of a category form its html keyword list;
(3) xml files:
All xml files are processed together: the text content is extracted from all xml files using regular expressions, and one keyword is extracted from the text content using TextRank; the keyword is looked up in the xml keyword list to obtain the application category corresponding to all the xml files, denoted X. The xml keyword list is obtained in a way similar to the html keyword list, which is not repeated here;
S2: obtaining the consistency P_HJ of application categories between the html web page files and the JavaScript files from the values p(h, j),
where n_h denotes the number of html web page files, n_a denotes the number of APIs in the JavaScript files, and p(h, j) denotes the consistency of application categories between the h-th html web page file and the j-th API, j and h being positive integers; p(h, j) is given by:
p(h, j) = |J(j) ∩ H(h)| / |J(j) ∪ H(h)|,
with p(h, j) equal to 0 if the h-th html web page file does not call the j-th API; |J(j) ∩ H(h)| is the number of elements in the intersection of J(j) and H(h), and |J(j) ∪ H(h)| is the number of elements in the union of J(j) and H(h); the larger the value of p(h, j), the greater the consistency of application categories between the h-th html web page file and the j-th API;
S3: obtaining the consistency Q_XJ of application categories between the xml files and the JavaScript files:
Q_XJ = min{q(1), q(2), ..., q(n_a)};
where q(j) denotes the consistency of application categories between all the xml files and the j-th API, calculated as:
q(j) = |J(j) ∩ X| / |J(j) ∪ X|,
where |J(j) ∩ X| is the number of elements in the intersection of X and J(j); the larger the value of q(j), the greater the consistency between all the xml files and the j-th API;
S4: obtaining the consistency F of application categories among the html web page files, the xml files and the JavaScript files;
In a Hybrid APP, the xml files only handle layout, whereas the html web page files carry the substantive content and functions, so the html web page files contribute more to the whole Hybrid APP than the xml files. In an embodiment of the invention, to account for the different influence of these two kinds of file on the overall consistency, the weight of the xml files is set to w (0 < w < 1/2) and the weight of the html web page files to (1 - w), and F is calculated as:
F = w * Q_XJ + (1 - w) * P_HJ
The larger the value of F, the greater the overall consistency of the APK. It is judged whether F is not less than a preset fourth threshold; if so, the overall consistency of the APK is considered intact and the APK has not been repackaged; if not, the overall consistency of the APK is considered to have been broken and the APK has been repackaged.
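The construction of the per-category API call table (steps Sa and Sb of detection mode four) can be sketched as follows. The input layout, one set of called API names per sample APK grouped by category, and the function name `feature_apis` are illustrative assumptions; the API names in the usage example are invented placeholders.

```python
from typing import Dict, List, Set


def feature_apis(calls: Dict[str, List[Set[str]]]) -> Dict[str, Set[str]]:
    """Build the API call table of every category.

    `calls[category]` holds one set of API names per sample APK.
    Step Sa keeps the APIs called by every APK of a category (its common
    APIs); step Sb drops those that are also common APIs of another
    category, leaving the feature APIs.
    """
    common = {
        cat: set.intersection(*samples) if samples else set()
        for cat, samples in calls.items()
    }
    table = {}
    for cat, apis in common.items():
        others = set().union(*(v for k, v in common.items() if k != cat))
        table[cat] = apis - others
    return table


# Usage with made-up API names: "log" is common to both categories,
# so it is not a feature API of either.
example = feature_apis({
    "map": [{"getLocation", "log"}, {"getLocation", "log", "zoom"}],
    "shopping": [{"pay", "log"}, {"pay", "log", "cart"}],
})
```

The html and xml keyword lists described above follow the same two steps, with keywords in place of API names.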
Referring to Fig. 2, the second flow chart of APP repackaging detection according to the specific embodiment of the invention, the APP repackaging detection method proposed by the invention takes the differences between Hybrid APPs and Native APPs into account and performs detection using parameter consistency, link consistency, html file consistency and global consistency within a Hybrid APP. It can automatically detect repackaged Hybrid APPs among the applications of a large-scale application market, with high efficiency and accuracy.
Those skilled in the art will recognize that numerous variations of the above description are possible, so the embodiments serve only to describe one or more particular implementations.
Although what are regarded as example embodiments of the invention have been described and illustrated, it will be apparent to those skilled in the art that various changes and substitutions can be made without departing from the spirit of the invention. In addition, many modifications can be made to adapt a particular situation to the teachings of the invention without departing from the central concept of the invention described herein. The invention is therefore not limited to the specific embodiments disclosed here, but may also include all embodiments falling within the scope of the invention and their equivalents.

Claims (5)

1. A repackaging detection method for an APP application, for detecting whether the installation package generated by packaging an APP has been repackaged, characterized by comprising the following steps: judging whether each file contained in the installation package is internally consistent and/or whether the different files it contains are consistent with one another; if they are consistent, determining that the installation package has not been repackaged; if they are not consistent, determining that the installation package has been repackaged;
Judging whether each file contained in the installation package is internally consistent and/or whether the different files it contains are consistent with one another may be carried out in one or more of the following modes:
Mode one: judging whether the local files accessed by the installation package are consistent with the files contained in the installation package;
Mode two: judging whether the link addresses of the network files accessed by the installation package are consistent with the subdomain names corresponding to the APP;
Mode three: judging whether the contents of the web page files in the installation package are consistent;
Mode four: judging whether the application categories of the different types of files in the installation package are consistent;
Wherein, when mode one is used, the APP repackaging detection method comprises the following steps:
S31. Obtain all decompressed files contained in the installation package and the names of those decompressed files;
S32. Obtain the names of all local files accessed by the installation package;
S33. Process each local file name as follows: determine the file name similarity between the local file name and each decompressed file name; if the file name similarities between the local file name and all decompressed file names are less than a first threshold, the installation package has been repackaged; if the file name similarity between the local file name and some decompressed file name is not less than the first threshold, the installation package has not been repackaged;
Wherein, when mode two is used, the APP repackaging detection method comprises the following steps:
S41. Obtain all decompressed files contained in the installation package;
S42. Obtain the link addresses of all network files accessed by the installation package;
S43. Process each link address as follows: determine the domain name similarity between the link address and each subdomain name in a white list; if the domain name similarities between the link address and all subdomain names are less than a second threshold, the installation package has been repackaged; if the domain name similarity between the link address and some subdomain name is not less than the second threshold, the installation package has not been repackaged; wherein the white list contains the subdomain names corresponding to the APP;
Wherein, when mode three is used, the APP repackaging detection method comprises the following steps:
S51. Obtain the content feature value of each web page file in the installation package, denoting the content feature value of the i-th web page file as H(i);
S52. Obtain the content difference D(s) among all web page files in the installation package:
$$D(s) = \sum_{i=1}^{n_h} \sum_{j=i+1}^{n_h} d(H(i), H(j))$$
where n_h is the number of web page files in the installation package, H(j) is the content feature value of the j-th web page file, d(H(i), H(j)) is the content difference between the i-th and the j-th web page file, 1 ≤ i < j ≤ n_h, and i and j are positive integers;
S53. Obtain the content consistency f_html among all web page files in the installation package:
f_html = 1 - 2D(s)/[n_h(n_h+1)]
Judge whether f_html is not less than a third threshold; if so, the installation package has not been repackaged; if not, the installation package has been repackaged;
Wherein, when mode four is used, the APP repackaging detection method comprises the following steps:
S71. Obtain the application categories corresponding to the web page files, the xml files and the JavaScript files in the installation package, respectively;
S72. Obtain the consistency P_HJ of application category between the web page files and the JavaScript files;
S73. Obtain the consistency Q_XJ of application category between the xml files and the JavaScript files;
S74. Obtain the consistency F of application category among the web page files, the xml files and the JavaScript files:
F = w_1*P_HJ + w_2*Q_XJ
where w_1 and w_2 are positive numbers; judge whether F is not less than a fourth threshold; if so, the installation package has not been repackaged; if not, the installation package has been repackaged.
2. The repackaging detection method for an APP application according to claim 1, characterized in that, in step S51, the keywords of the i-th web page file are used as the content feature value H(i) of the i-th web page file;
In step S52, the number of keywords that differ between the i-th web page file and the j-th web page file is used as the content difference d(H(i), H(j)).
3. The repackaging detection method for an APP application according to claim 1, characterized in that, in step S71,
The application category corresponding to the JavaScript files is expressed in terms of J(j), j = 1, ..., n_j, where n_j is the number of APIs in the JavaScript files and J(j) is the application category corresponding to the j-th API in the JavaScript files;
The application category corresponding to the web page files is expressed in terms of H(h), h = 1, ..., n_h, where n_h is the number of web page files and H(h) is the application category corresponding to the keywords in the h-th web page file;
The application category corresponding to the xml files is X, where X is the application category corresponding to the keywords in all xml files;
In step S72, the consistency P_HJ of application category between the web page files and the JavaScript files is:
$$P_{HJ} = \min_{1 \le h \le n_h,\ 1 \le j \le n_j} \{\, p(h, j) \mid p(h, j) \ne 0 \,\};$$
where p(h, j) denotes the consistency of application category between the h-th web page file and the j-th API, computed as:
p(h, j) = |J(j) ∩ H(h)| / |J(j) ∪ H(h)|,
If the h-th web page file does not call the j-th API, p(h, j) is 0; |J(j) ∩ H(h)| is the number of elements in the intersection of J(j) and H(h), and |J(j) ∪ H(h)| is the number of elements in their union;
In step S73, the consistency Q_XJ of application category between the xml files and the JavaScript files is:
Q_XJ = min{q(1), q(2), ..., q(n_a)};
where q(j) denotes the consistency of application category between all xml files and the j-th API, n_a denotes the number of APIs in the JavaScript files, and q(j) is computed as:
q(j) = |J(j) ∩ X| / |J(j) ∪ X|,
where |J(j) ∩ X| is the number of elements in the intersection of X and J(j).
4. The repackaging detection method for an APP application according to claim 1, characterized in that 0 < w_2 < w_1 < 1 and w_1 + w_2 = 1.
5. The repackaging detection method for an APP application according to any one of claims 1 to 4, characterized in that the APP is a Hybrid APP.
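Modes one to three of claim 1 each reduce to a threshold test, which the minimal Python sketch below illustrates under stated assumptions. The claims do not fix a similarity measure, so `difflib.SequenceMatcher` is used here purely as a stand-in; treating the package as repackaged when any accessed name or link has no match is one reading of steps S33 and S43; and the keyword difference of claim 2 is taken to be the size of the symmetric difference of two keyword sets. All function names, thresholds and sample values are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import List, Set
from urllib.parse import urlparse


def similarity(a: str, b: str) -> float:
    """Stand-in string similarity in [0, 1] (assumption, not from the claims)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def has_match(name: str, candidates: List[str], threshold: float) -> bool:
    """True if some candidate reaches the similarity threshold."""
    return any(similarity(name, c) >= threshold for c in candidates)


def mode_one_repackaged(accessed: List[str], unpacked: List[str],
                        first_threshold: float) -> bool:
    """Mode one: some accessed local file name matches no unpacked file name."""
    return any(not has_match(n, unpacked, first_threshold) for n in accessed)


def mode_two_repackaged(links: List[str], whitelist: List[str],
                        second_threshold: float) -> bool:
    """Mode two: some link's host name matches no white-listed subdomain."""
    hosts = [urlparse(u).hostname or "" for u in links]
    return any(not has_match(h, whitelist, second_threshold) for h in hosts)


def html_consistency(keyword_sets: List[Set[str]]) -> float:
    """Mode three: f_html = 1 - 2 D(s) / [n_h (n_h + 1)], as written in S53."""
    n_h = len(keyword_sets)
    if n_h == 0:
        return 1.0  # no pages to compare (assumption)
    # d(H(i), H(j)) taken as the number of keywords present in only one page.
    D = sum(len(a ^ b) for a, b in combinations(keyword_sets, 2))
    return 1 - 2 * D / (n_h * (n_h + 1))


unpacked = ["index.html", "app.js"]
print(mode_one_repackaged(["evil_payload.dex"], unpacked, 0.8))
print(mode_two_repackaged(["http://evil.invalid/x"], ["api.example.com"], 0.8))
print(html_consistency([{"news", "sport"}, {"news", "sport"}, {"news", "weather"}]))
```

With raw keyword counts as the difference measure, f_html can fall below zero for very dissimilar pages, so the third threshold would need to be chosen with the scale of d in mind.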
CN201510595733.9A 2015-09-17 2015-09-17 Packet inspection method is beaten again in a kind of APP applications Active CN105205356B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510595733.9A CN105205356B (en) 2015-09-17 2015-09-17 Packet inspection method is beaten again in a kind of APP applications

Publications (2)

Publication Number Publication Date
CN105205356A CN105205356A (en) 2015-12-30
CN105205356B true CN105205356B (en) 2017-12-29

Family

ID=54953032

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510595733.9A Active CN105205356B (en) 2015-09-17 2015-09-17 Packet inspection method is beaten again in a kind of APP applications

Country Status (1)

Country Link
CN (1) CN105205356B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105897923B (en) * 2016-05-31 2019-04-30 中国科学院信息工程研究所 A kind of APP installation kit network flow identification method
CN106971098B (en) 2016-10-11 2020-06-02 阿里巴巴集团控股有限公司 Method and device for preventing repacking
CN106951780B (en) * 2017-02-08 2019-09-10 中国科学院信息工程研究所 Beat again the static detection method and device of packet malicious application
CN108958826B (en) * 2017-05-22 2022-06-07 北京京东尚科信息技术有限公司 Method and device for dynamically configuring application installation package
CN108280647A (en) * 2018-02-12 2018-07-13 北京金山安全软件有限公司 Private key protection method and device for digital wallet, electronic equipment and storage medium
CN109800575B (en) * 2018-12-06 2023-06-20 成都网安科技发展有限公司 Security detection method for Android application program
CN109858249B (en) * 2019-02-18 2020-08-07 暨南大学 Rapid intelligent comparison and safety detection method for mobile malicious software big data

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101719821A (en) * 2008-10-09 2010-06-02 爱思开电讯投资(中国)有限公司 System for managing application program of intelligent card and method thereof
CN104392181A (en) * 2014-11-18 2015-03-04 北京奇虎科技有限公司 SO file protection method and device and android installation package reinforcement method and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8875298B2 (en) * 2012-02-16 2014-10-28 Nec Laboratories America, Inc. Method for scalable analysis of android applications for security vulnerability

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Android Malware Detection Based on Improved Bayesian Classification (基于改进贝叶斯分类的Android恶意软件检测); 张思琪; 《综合电子信息技术》; 2014-06-30; Vol. 40, No. 6; pp. 73-76 *

Also Published As

Publication number Publication date
CN105205356A (en) 2015-12-30

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 518055 Tsinghua Campus, Xili, Nanshan District, Shenzhen, Guangdong Province

Applicant after: Graduate School at Shenzhen, Tsinghua University

Address before: 518000 Tsinghua Campus of Tsinghua University, Xili, Nanshan District, Shenzhen, Guangdong Province

Applicant before: Graduate School at Shenzhen, Tsinghua University

COR Change of bibliographic data
GR01 Patent grant