CN107622201A - A hardening-resistant method for rapid detection of cloned Android applications - Google Patents

A hardening-resistant method for rapid detection of cloned Android applications

Info

Publication number
CN107622201A
Authority
CN
China
Prior art keywords
node
application program
tree
clone
determination method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710842026.4A
Other languages
Chinese (zh)
Other versions
CN107622201B (en)
Inventor
林亚平
吕方
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University
Original Assignee
Hunan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan University
Priority to CN201710842026.4A
Publication of CN107622201A
Application granted
Publication of CN107622201B
Legal status: Active
Anticipated expiration

Abstract

The invention discloses a hardening-resistant method for rapidly detecting cloned applications on the Android platform, comprising a preprocessing stage and a precise detection stage. In the preprocessing stage, keyword vectors are extracted from each application's functional description using natural language processing, and an improved search method based on a balanced binary tree quickly locates pairs of suspected clone applications with similar functionality. In the formal detection stage, the invention proposes an application birthmark that is based on interface layout features and is completely independent of source code, so it effectively resists the effects of hardening techniques. Finally, a similarity calculation based on tree edit distance accurately computes the similarity between each suspected clone pair. The invention effectively resists interference from hardening techniques while detecting cloned applications quickly, and is therefore highly practical.

Description

A hardening-resistant method for rapid detection of cloned Android applications
Technical field
The present invention relates to the fields of mobile security and clone software detection, and in particular to a hardening-resistant method for rapidly detecting cloned applications on the Android platform.
Background
Android has gradually come to dominate the mobile market, but it has also attracted a large number of malicious attacks. Most malicious programs spread rapidly by cloning existing applications, which not only threatens user security but also harms the revenue of legitimate developers. At present, methods for detecting cloned applications on the Android platform fall into two categories. The first is detection based on code reuse: static analysis of the program code is used to construct a specific program birthmark, such as a program dependence graph or a program flow graph, from which similarity is computed. The second is detection based on interface similarity: clones are detected by comparing the similarity of user interfaces, for example resource files and layout features.
These clone detection methods have defects that prevent them from being widely used. Detection based on code reuse depends on analyzing code features, yet more and more clone attackers now use hardening techniques to hide the DEX code files in cloned programs, which causes code-reuse-based detection to fail. Methods based on interface features incur a large time overhead when extracting interface features and computing similarity, which is a significant limitation.
Summary of the invention
The present invention aims to provide a hardening-resistant method for rapidly detecting cloned Android applications that effectively avoids the influence of code hardening techniques and makes clone detection more practical.
To solve the above technical problem, the technical solution adopted by the present invention is a hardening-resistant method for rapidly detecting cloned applications on the Android platform, comprising the following steps:
1) construct a balanced binary tree index based on keyword vectors;
2) input the functional description of the target application and extract a keyword vector of dynamic dimension using the Stanford Parser;
3) use the keyword vector to quickly search the balanced binary tree index for applications with similar functional descriptions, and add them to a candidate set;
4) decompress and convert each application in the candidate set to obtain all XML layout files under the /res/layout directory;
5) filter the layout files to screen out external layout files introduced by third-party libraries;
6) convert the filtered layout files into layout trees of corresponding structure and load them into memory;
7) merge the loaded layout trees one after another: starting from the root node, merge identical elements of different layout trees level by level to generate the final program birthmark;
8) compute the similarity of the resulting program birthmarks using a tree edit distance method; a pair of programs whose similarity exceeds the threshold is judged to be a clone pair.
In step 1), the method for constructing the balanced binary tree index comprises the following steps:
1) input the keyword vector set V corresponding to n applications;
2) for each vector V_i in the keyword vector set V, construct a leaf node u_i, where u_i.V = V_i;
3) insert node u_i into the node set CurrentNodeSet;
4) while unprocessed nodes remain in the node set CurrentNodeSet, repeat steps 5) to 7);
5) arbitrarily select two leaf nodes u′ and u″ from CurrentNodeSet and construct a node u as their parent, where u.V = u′.V ∪ u″.V;
6) insert the parent node u into the temporary set TempNodeSet;
7) insert all nodes of TempNodeSet into CurrentNodeSet and clear TempNodeSet;
8) when the size of CurrentNodeSet is 1, end the loop and return the only node in CurrentNodeSet as the root node.
In step 2), a greedy depth-first search algorithm is used to quickly search the index tree for applications with similar functional descriptions.
The specific implementation of step 3) comprises:
1) input a node of the balanced binary tree index;
2) if the current node u is a non-leaf node and its relevance score RScore(u.V, Q) is greater than the threshold γ, compute the relevance scores of its left and right child nodes and then recursively search the children in descending order of score; otherwise, terminate the current search. If the current node u is a leaf node and its relevance score RScore(u.V, Q) is greater than the threshold γ, insert the new element <RScore(u.V, Q), u> into the result set RList;
3) return the result set RList, which is the result of the preliminary screening.
The threshold γ = 0.75.
In step 7), merging identical elements of different layout trees level by level is implemented as follows:
1) for two layout trees lt1 and lt2, initialize the parameter depth to the smaller of the heights of lt1 and lt2 plus 1; initialize the root node of the matching tree as root; set the left subtree of root to lt1; set the right subtree of root to lt2;
2) starting from the root node root, add all child nodes of the matching tree at level i to the set N_i;
3) search the set N_i for an isomorphic node pair (v_a, v_b) according to the greedy rule;
4) copy all child nodes of node v_b under the isomorphic node v_a and delete node v_b;
5) return the root node root of the matching tree.
In step 8), the similarity of the resulting program birthmarks is computed using a tree edit distance method.
The similarity of the resulting program birthmarks is computed as follows: each application yields one tree-structured program birthmark b; the tree edit distance method is used to compute the distance Ted_ij between two program birthmarks b_i and b_j; normalizing this distance gives the similarity r_ij; if r_ij exceeds the similarity threshold θ, a clone relationship is confirmed between the original applications.
The similarity threshold θ = 0.9.
Compared with the prior art, the beneficial effects of the present invention are as follows. The invention constructs the program birthmark from the application's interface layout features, so it effectively avoids the influence of code hardening techniques. In addition, the preprocessing stage exploits the functional similarity that exists between cloned applications: by constructing a tree index, it quickly screens suspected clone applications, which effectively increases detection speed and makes clone detection more practical.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the present invention;
Fig. 2 is a flow chart of the statistics-based external file filtering method.
Detailed description
As shown in Fig. 1, the present invention comprises the following steps:
1. Initialization: construct a balanced binary tree index based on keyword vectors;
2. Input the functional description of the target application and extract a keyword vector of dynamic dimension using the Stanford Parser;
3. Use a greedy depth-first search algorithm to quickly search the index tree for applications with similar functional descriptions, and add them to a candidate set;
4. Decompress and convert each application in the resulting set of suspected clone applications to obtain all XML layout files under the /res/layout directory;
5. Filter the layout files, using a statistics-based method to screen out external layout files introduced by third-party libraries;
6. Convert the filtered layout files into layout trees of corresponding structure and load them into memory;
7. Merge the loaded layout trees one after another: starting from the root node and following a simple greedy strategy, merge identical elements of different layout trees level by level to generate the final program birthmark;
8. Compute the similarity of the resulting program birthmarks using a tree edit distance method; a pair of programs whose similarity exceeds the threshold is judged to be a clone pair.
The present invention proposes a hardening-resistant method for rapidly detecting cloned Android applications. In the preprocessing stage, the method exploits the functional similarity of cloned programs to search quickly for suspect programs, which reduces the computation required by the subsequent detection algorithm and thereby improves overall detection speed. In the formal detection stage, similarity is computed on a program birthmark built entirely from interface layout features, which effectively resists the influence of code hardening techniques. The preprocessing stage comprises keyword extraction, index tree construction, and fast keyword vector search; the formal detection stage comprises layout file extraction, external file filtering, program birthmark construction, and similarity calculation.
Keyword extraction: for the functional description of an input application, a natural language processing tool preprocesses the original description and extracts only the group of noun keywords most relevant to the application's functionality.
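The idea of a keyword vector whose dimension varies with the description can be illustrated with a small sketch. This is not the Stanford Parser pipeline named in the method: it swaps in plain tokenization, a tiny hand-written stopword list, and frequency ranking, and it does not restrict the result to nouns. The names `STOPWORDS`, `extract_keywords`, and `max_keywords` are hypothetical.

```python
import re
from collections import Counter
from typing import List

# Hypothetical, deliberately small stopword list (an assumption for this sketch).
STOPWORDS = frozenset({
    "a", "an", "and", "the", "to", "of", "in", "for", "with",
    "your", "you", "is", "it", "this", "that", "on", "by", "or",
})


def extract_keywords(description: str, max_keywords: int = 10) -> List[str]:
    """Return up to max_keywords keywords, most frequent first.

    Stand-in for parser-based noun extraction: no part-of-speech tagging
    is performed, so non-noun words can appear in the result.
    """
    tokens = re.findall(r"[a-z]+", description.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    # Sort by descending frequency, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:max_keywords]]
```

The returned list has as many entries as the description supports, up to the cap, which is one simple way to read "dynamic dimension".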
Index tree construction: for each application in the original application set, a keyword vector of dynamic dimension is extracted and used as a leaf node of the index tree. The nodes are then merged pairwise, level by level from the bottom up, in a recursive manner, producing a keyword-based balanced binary tree that stores all keyword vectors. The specific algorithm is as follows:
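The patent's own listing is not reproduced in this text. A minimal sketch of the bottom-up construction, under stated assumptions, might look like the following: keyword vectors are modelled as keyword sets so that u.V = u′.V ∪ u″.V is a set union, nodes are paired in input order rather than chosen arbitrarily, and an unpaired node is carried up to the next level. `IndexNode`, `app_id`, and `build_index` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List, Optional


@dataclass
class IndexNode:
    V: FrozenSet[str]                    # keyword vector u.V, modelled as a keyword set
    left: Optional["IndexNode"] = None
    right: Optional["IndexNode"] = None
    app_id: Optional[str] = None         # hypothetical field, set only on leaf nodes


def build_index(vectors: Dict[str, Iterable[str]]) -> IndexNode:
    """Build the index bottom-up by pairing nodes level by level."""
    if not vectors:
        raise ValueError("at least one keyword vector is required")
    # CurrentNodeSet: one leaf per application.
    current: List[IndexNode] = [
        IndexNode(V=frozenset(kws), app_id=app) for app, kws in vectors.items()
    ]
    while len(current) > 1:
        temp: List[IndexNode] = []       # TempNodeSet for the next level
        for i in range(0, len(current) - 1, 2):
            left, right = current[i], current[i + 1]
            temp.append(IndexNode(V=left.V | right.V, left=left, right=right))
        if len(current) % 2 == 1:
            temp.append(current[-1])     # assumption: an odd node moves up unchanged
        current = temp
    return current[0]                    # the single remaining node is the root
```

Because every parent stores the union of its children's keywords, the root summarizes the whole application set, which is what lets a search prune entire subtrees.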
Keyword vector fast search: the search starts from the root node of the index tree. At each step, the relevance between the target vector and the feature vectors stored in the current node's left and right children is computed, and the search recurses into the children whose relevance exceeds the threshold γ. The search terminates when a leaf node is reached or when the relevance of a child falls below γ. The specific algorithm is as follows, where the function RScore(V1, V2) returns the relevance between vectors V1 and V2:
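The patent's own listing is not reproduced in this text, and the definition of RScore is not given. The sketch below therefore assumes one possible score, the fraction of query keywords contained in a node's keyword set, chosen because a parent's score is then never lower than its children's, so pruning at γ does not discard matching leaves. `IndexNode`, `rscore`, and `search` are hypothetical names; γ = 0.75 is the value stated in the method.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple


@dataclass
class IndexNode:
    V: FrozenSet[str]                    # keyword set stored in the node
    left: Optional["IndexNode"] = None
    right: Optional["IndexNode"] = None
    app_id: Optional[str] = None         # hypothetical field, set only on leaf nodes


GAMMA = 0.75  # relevance threshold from the method


def rscore(v: FrozenSet[str], q: FrozenSet[str]) -> float:
    """Assumed relevance: share of query keywords present in v."""
    return len(v & q) / len(q) if q else 0.0


def search(u: Optional[IndexNode], q: FrozenSet[str], gamma: float = GAMMA,
           rlist: Optional[List[Tuple[float, IndexNode]]] = None
           ) -> List[Tuple[float, IndexNode]]:
    """Greedy depth-first search; returns (score, leaf) pairs in RList."""
    if rlist is None:
        rlist = []
    if u is None:
        return rlist
    score = rscore(u.V, q)
    if score <= gamma:
        return rlist                     # prune this subtree
    if u.left is None and u.right is None:
        rlist.append((score, u))         # leaf above the threshold: a candidate
        return rlist
    children = [c for c in (u.left, u.right) if c is not None]
    # Greedy order: visit the higher-scoring child first.
    children.sort(key=lambda c: rscore(c.V, q), reverse=True)
    for child in children:
        search(child, q, gamma, rlist)
    return rlist
```

With a different relevance function, for example cosine similarity over weighted vectors, the pruning step may no longer be safe and would need to be re-examined.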
Layout file extraction: an Android application is taken as input. Decompressing the installation package with a file decompression command directly yields the layout files in binary format; a format conversion tool can then convert the binary layout files into readable XML documents.
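Since an APK is a ZIP archive, the decompression step can be sketched with Python's standard `zipfile` module. The sketch below only lists the layout entries; the entries are still binary XML at this point, and converting them to readable XML needs a separate decoder (apktool is one commonly used option), which is not shown. The function name `list_layout_entries` is hypothetical.

```python
import zipfile
from typing import BinaryIO, List, Union


def list_layout_entries(apk: Union[str, BinaryIO]) -> List[str]:
    """Return the archive paths of all XML layout files under res/layout/.

    `apk` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(apk) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.startswith("res/layout/") and name.endswith(".xml")
        )
```

Qualified layout directories such as `res/layout-land/` are deliberately excluded here, because the method refers only to the /res/layout directory; including them would be a design choice outside what the text states.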
External file filtering: for every extracted layout file, the MD5 hash of the file is computed and compared against the MD5 values stored in a database. If the MD5 value already exists and the corresponding file has appeared more often than a specific threshold, the layout file is judged to be an externally introduced file. The flow of the statistics-based external file filtering method is as follows:
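The flow chart itself is not reproduced in this text. A minimal sketch of the check described above, assuming the database is an in-memory mapping from MD5 digest to the number of applications in which that file has been seen, could be written as follows. The threshold value and the names `FREQ_THRESHOLD`, `md5_hex`, and `filter_external_layouts` are hypothetical; the text does not state the threshold or whether the database is updated during filtering.

```python
import hashlib
from typing import Dict, Mapping

FREQ_THRESHOLD = 100  # hypothetical value; the method only says "a specific threshold"


def md5_hex(data: bytes) -> str:
    """Hex MD5 digest of a layout file's bytes."""
    return hashlib.md5(data).hexdigest()


def filter_external_layouts(layouts: Mapping[str, bytes],
                            md5_counts: Mapping[str, int],
                            threshold: int = FREQ_THRESHOLD) -> Dict[str, bytes]:
    """Keep only layout files that are not judged to be externally introduced.

    A file is dropped when its digest is known and its recorded frequency
    exceeds the threshold, which suggests it ships with a shared library.
    """
    kept: Dict[str, bytes] = {}
    for name, data in layouts.items():
        if md5_counts.get(md5_hex(data), 0) > threshold:
            continue                     # frequent across apps: treat as third-party
        kept[name] = data
    return kept
```

MD5 is used here only as a content fingerprint for exact-duplicate lookup, matching the text; it is not serving any security purpose.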
Program birthmark construction: the corresponding layout trees are constructed from the filtered layout documents and loaded into memory. A greedy breadth-first merging (GWFM) rule is then applied to the layout trees one after another, finally producing a single tree-shaped program birthmark. There is a one-to-one correspondence between the merged program birthmark and the application's interface layout features, and the order in which layout trees are merged does not affect the final structure of the birthmark. The greedy merging algorithm is as follows:
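The patent's own listing is not reproduced in this text, and the exact test for "isomorphic" nodes is not defined. The sketch below is one possible reading: layout XML is parsed into trees that keep only element tags, two nodes count as isomorphic when they share a parent in the matching tree and have the same tag, and the first match found is taken (the greedy rule). `LayoutNode`, `parse_layout`, `height`, and `merge_layout_trees` are hypothetical names, and the input trees are modified in place.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class LayoutNode:
    tag: str
    children: List["LayoutNode"] = field(default_factory=list)


def parse_layout(xml_text: str) -> LayoutNode:
    """Convert readable layout XML into a tag-only layout tree."""
    def convert(elem: ET.Element) -> LayoutNode:
        return LayoutNode(elem.tag, [convert(child) for child in elem])
    return convert(ET.fromstring(xml_text))


def height(node: LayoutNode) -> int:
    return 1 + max((height(c) for c in node.children), default=0)


def merge_layout_trees(lt1: LayoutNode, lt2: LayoutNode) -> LayoutNode:
    """Merge two layout trees level by level under a synthetic root."""
    depth = min(height(lt1), height(lt2)) + 1
    root = LayoutNode("root", [lt1, lt2])    # matching tree
    level = [root]
    for _ in range(depth - 1):               # levels below the synthetic root
        next_level: List[LayoutNode] = []
        for parent in level:
            kept: List[LayoutNode] = []
            for vb in parent.children:
                # Greedy rule (assumed): first kept sibling with the same tag.
                va = next((k for k in kept if k.tag == vb.tag), None)
                if va is None:
                    kept.append(vb)
                else:
                    va.children.extend(vb.children)  # move vb's children under va
            parent.children = kept           # vb is dropped once merged
            next_level.extend(kept)
        level = next_level
    return root
```

One consequence of this simplified isomorphism test is that same-tag siblings inside a single layout file are also collapsed; a stricter test that compares attributes or subtree shape would behave differently.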
Similarity calculation: each application yields one tree-structured program birthmark b. The tree edit distance method is used to compute the distance Ted_ij between two program birthmarks b_i and b_j, and normalizing this distance gives the similarity r_ij. If r_ij exceeds the similarity threshold θ, a clone relationship is confirmed between the original applications.
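The calculation can be sketched as follows, under stated assumptions. Trees are written as nested tuples `(label, (child, ...))`, the edit distance uses unit costs for insert, delete, and relabel, and it is computed with a plain memoized forest recursion rather than an optimized algorithm such as Zhang-Shasha. The text does not give the normalization formula, so r = 1 − Ted / (|b_i| + |b_j|) is an assumption that keeps r in [0, 1]; θ = 0.9 is the value stated in the method. All function names are hypothetical.

```python
from functools import lru_cache
from typing import Tuple

Tree = Tuple[str, tuple]   # (label, tuple of child trees)
THETA = 0.9                # similarity threshold from the method


def size(forest: tuple) -> int:
    """Number of nodes in an ordered forest."""
    return sum(1 + size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_distance(f: tuple, g: tuple) -> int:
    """Unit-cost edit distance between two ordered forests."""
    if not f:
        return size(g)                       # insert everything in g
    if not g:
        return size(f)                       # delete everything in f
    (lv, cv), (lw, cw) = f[-1], g[-1]        # rightmost roots and their children
    delete = forest_distance(f[:-1] + cv, g) + 1
    insert = forest_distance(f, g[:-1] + cw) + 1
    match = (forest_distance(cv, cw)
             + forest_distance(f[:-1], g[:-1])
             + (lv != lw))                   # relabel cost is 0 or 1
    return min(delete, insert, match)


def tree_edit_distance(t1: Tree, t2: Tree) -> int:
    return forest_distance((t1,), (t2,))


def similarity(t1: Tree, t2: Tree) -> float:
    """Assumed normalization: 1 - Ted / (|t1| + |t2|)."""
    return 1.0 - tree_edit_distance(t1, t2) / (size((t1,)) + size((t2,)))


def is_clone(t1: Tree, t2: Tree, theta: float = THETA) -> bool:
    return similarity(t1, t2) > theta
```

For two three-node birthmarks that differ in one leaf label, the distance is 1 and the assumed similarity is 1 − 1/6, which falls below θ = 0.9; on small trees a single difference weighs heavily, so real birthmarks merged from many layout files behave less abruptly.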

Claims (9)

1. A hardening-resistant method for rapid detection of cloned Android applications, characterized by comprising the following steps:
1) construct a balanced binary tree index based on keyword vectors;
2) input the functional description of the target application and extract a keyword vector of dynamic dimension using the Stanford Parser;
3) use the keyword vector to quickly search the balanced binary tree index for applications with similar functional descriptions, and add them to a candidate set;
4) decompress and convert each application in the candidate set to obtain all XML layout files under the /res/layout directory;
5) filter the layout files to screen out external layout files introduced by third-party libraries;
6) convert the filtered layout files into layout trees of corresponding structure and load them into memory;
7) merge the loaded layout trees one after another: starting from the root node, merge identical elements of different layout trees level by level to generate the final program birthmark;
8) compute the similarity of the resulting program birthmarks using a tree edit distance method; a pair of programs whose similarity exceeds the threshold is judged to be a clone pair.
2. The hardening-resistant method for rapid detection of cloned Android applications according to claim 1, characterized in that, in step 1), the method for constructing the balanced binary tree index comprises the following steps:
1) input the keyword vector set V corresponding to n applications;
2) for each vector V_i in the keyword vector set V, construct a leaf node u_i, where u_i.V = V_i;
3) insert node u_i into the node set CurrentNodeSet;
4) while unprocessed nodes remain in the node set CurrentNodeSet, repeat steps 5) to 7);
5) arbitrarily select two leaf nodes u′ and u″ from CurrentNodeSet and construct a node u as their parent, where u.V = u′.V ∪ u″.V;
6) insert the parent node u into the temporary set TempNodeSet;
7) insert all nodes of TempNodeSet into CurrentNodeSet and clear TempNodeSet;
8) when the size of CurrentNodeSet is 1, end the loop and return the only node in CurrentNodeSet as the root node.
3. The hardening-resistant method for rapid detection of cloned Android applications according to claim 1, characterized in that, in step 2), a greedy depth-first search algorithm is used to quickly search the index tree for applications with similar functional descriptions.
4. The hardening-resistant method for rapid detection of cloned Android applications according to claim 1, characterized in that the specific implementation of step 3) comprises:
1) input a node of the balanced binary tree index;
2) if the current node u is a non-leaf node and its relevance score RScore(u.V, Q) is greater than the threshold γ, compute the relevance scores of its left and right child nodes and then recursively search the children in descending order of score; otherwise, terminate the current search; if the current node u is a leaf node and its relevance score RScore(u.V, Q) is greater than the threshold γ, insert the new element <RScore(u.V, Q), u> into the result set RList;
3) return the result set RList, which is the result of the preliminary screening.
5. The hardening-resistant method for rapid detection of cloned Android applications according to claim 5, characterized in that the threshold γ = 0.75.
6. The hardening-resistant method for rapid detection of cloned Android applications according to claim 1, characterized in that, in step 7), merging identical elements of different layout trees level by level is implemented as follows:
1) for two layout trees lt1 and lt2, initialize the parameter depth to the smaller of the heights of lt1 and lt2 plus 1; initialize the root node of the matching tree as root; set the left subtree of root to lt1; set the right subtree of root to lt2;
2) starting from the root node root, add all child nodes of the matching tree at level i to the set N_i;
3) search the set N_i for an isomorphic node pair (v_a, v_b) according to the greedy rule;
4) copy all child nodes of node v_b under the isomorphic node v_a and delete node v_b;
5) return the root node root of the matching tree.
7. The hardening-resistant method for rapid detection of cloned Android applications according to claim 1, characterized in that, in step 8), the similarity of the resulting program birthmarks is computed using a tree edit distance method.
8. The hardening-resistant method for rapid detection of cloned Android applications according to claim 7, characterized in that the similarity of the resulting program birthmarks is computed as follows: each application yields one tree-structured program birthmark b; the tree edit distance method is used to compute the distance Ted_ij between two program birthmarks b_i and b_j; normalizing this distance gives the similarity r_ij; if r_ij exceeds the similarity threshold θ, a clone relationship is confirmed between the original applications.
9. The hardening-resistant method for rapid detection of cloned Android applications according to claim 8, characterized in that the similarity threshold θ = 0.9.
CN201710842026.4A 2017-09-18 2017-09-18 A hardening-resistant method for rapid detection of cloned Android applications Active CN107622201B (en)

Publications (2)

Publication Number Publication Date
CN107622201A true CN107622201A (en) 2018-01-23
CN107622201B CN107622201B (en) 2018-07-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant