CN107622201A - Hardening-resistant rapid detection method for cloned applications on the Android platform - Google Patents
- Publication number
- CN107622201A (application CN201710842026.4A)
- Authority
- CN
- China
- Prior art keywords
- node
- application program
- tree
- clone
- determination method
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The invention discloses a hardening-resistant method for rapidly detecting cloned applications on the Android platform, comprising a preprocessing stage and a precise detection stage. In the preprocessing stage, keyword vectors are extracted from the functional descriptions of applications using natural language processing, and an improved search method based on a balanced binary tree quickly locates suspicious pairs of cloned applications with similar functions. In the formal detection stage, the invention proposes an application birthmark that is based on interface layout features and is completely independent of source code, so it effectively resists the influence of hardening techniques; finally, a similarity calculation method based on tree edit distance accurately computes the similarity between each suspicious clone pair. The invention effectively resists interference from hardening techniques while detecting cloned applications quickly, and is highly practical.
Description
Technical field
The present invention relates to the fields of mobile security and clone software detection, and in particular to a hardening-resistant rapid detection method for cloned applications on the Android platform.
Background
Android has gradually come to dominate the mobile market, but it has also attracted a large number of malicious attacks. Most malicious programs spread quickly by cloning existing applications, which both threatens user security and reduces the income of legitimate developers. Current methods for detecting cloned applications on the Android platform fall into two classes. The first is detection based on code reuse: the program code is statically analyzed to construct a specific program birthmark, such as a program dependency graph or a program flow graph, from which similarity is computed. The second is detection based on interface similarity: clones are detected by comparing the similarity of user interfaces, for example resource files and layout features.
These clone detection methods have shortcomings that prevent them from being widely used. Detection based on code reuse depends on analyzing code features, yet a growing number of clone attackers use hardening techniques to hide the DEX code files in cloned programs, which defeats such methods. Methods based on interface features incur a large time overhead when extracting interface features and computing similarity, which severely limits them.
Summary of the invention
The present invention aims to provide a hardening-resistant rapid detection method for cloned applications on the Android platform that effectively avoids the influence of code hardening techniques and makes clone detection more practical.
To solve the above technical problem, the technical solution adopted by the present invention is a hardening-resistant rapid detection method for cloned applications on the Android platform, comprising the following steps:
1) construct a balanced binary tree index based on keyword vectors;
2) input the functional description of the target application and extract a keyword vector of dynamic dimension using Stanford Parser;
3) use the keyword vector to quickly search the balanced binary tree index for applications with similar functional descriptions, and add them to a candidate set;
4) decompress and convert each application in the candidate set to obtain all XML layout files under the /res/layout directory;
5) filter the layout files, screening out external layout files introduced by third-party libraries;
6) convert the filtered layout files into layout trees of corresponding structure and load them into memory;
7) merge the loaded layout trees in turn: starting from the root node, merge identical elements of different layout trees level by level to generate the final program birthmark;
8) compute the similarity of the resulting program birthmarks using a tree edit distance method; a pair of programs whose similarity exceeds the threshold is judged to be a clone pair.
In step 1), the balanced binary tree index is constructed as follows:
1) input the keyword vector set V corresponding to n applications;
2) for each vector V_i in the keyword vector set V, construct a leaf node u_i, where u_i.V = V_i;
3) insert node u_i into the node set CurrentNodeSet;
4) while unprocessed nodes remain in CurrentNodeSet, repeat steps 5) to 7);
5) choose any two leaf nodes u' and u'' from CurrentNodeSet and construct a node u as their parent, where u.V = u'.V ∪ u''.V;
6) insert the parent node u into the temporary set TempNodeSet;
7) insert all nodes of TempNodeSet into CurrentNodeSet and clear TempNodeSet;
8) when the size of CurrentNodeSet is 1, end the loop and return the only node in CurrentNodeSet as the root node.
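The construction steps above can be illustrated with a minimal Python sketch. It assumes that each keyword vector is represented as a set of strings, and it pairs nodes level by level (carrying an unpaired node up unchanged), which is one possible reading of "choose any two nodes". The names `IndexNode`, `app_id` and `build_index` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List, Optional


@dataclass
class IndexNode:
    keywords: FrozenSet[str]             # u.V in the text
    left: Optional["IndexNode"] = None
    right: Optional["IndexNode"] = None
    app_id: Optional[str] = None         # hypothetical field, set on leaves only

    @property
    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def build_index(vectors: Dict[str, Iterable[str]]) -> IndexNode:
    """Build the index bottom-up by merging nodes pairwise, level by level."""
    if not vectors:
        raise ValueError("at least one keyword vector is required")
    # Steps 1-3: one leaf per application, collected in CurrentNodeSet.
    current: List[IndexNode] = [
        IndexNode(frozenset(kw), app_id=app) for app, kw in vectors.items()
    ]
    # Steps 4-8: merge until a single root remains.
    while len(current) > 1:
        temp: List[IndexNode] = []       # TempNodeSet in the text
        for i in range(0, len(current) - 1, 2):
            a, b = current[i], current[i + 1]
            # The parent stores the union of its children's keywords.
            temp.append(IndexNode(a.keywords | b.keywords, left=a, right=b))
        if len(current) % 2 == 1:
            temp.append(current[-1])     # unpaired node moves up unchanged
        current = temp
    return current[0]
```

Because every parent holds the union of its children's keywords, a query that shares nothing with a parent cannot match anything below it, which is what makes pruning during search possible.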
In step 2), a greedy depth-first search algorithm is used to quickly search the index tree for applications with similar functional descriptions.
The specific implementation of step 3) includes:
1) input a balanced binary tree index node;
2) if the current node u is a non-leaf node and its relevance score RScore(u.V, Q) is greater than the threshold γ, compute the relevance scores of its left and right child nodes and then recursively search the children in order of their scores; otherwise, terminate the current search. If the current node u is a leaf node and its relevance score RScore(u.V, Q) is greater than the threshold γ, insert the new element <RScore(u.V, Q), u> into the result set RList;
3) return the result set RList, which is the result of the preliminary screening.
Threshold γ = 0.75.
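A minimal Python sketch of this pruned search follows. The patent does not define RScore or Q in this text, so the sketch assumes that Q is the target application's keyword set and that RScore is the share of query keywords contained in a node's keyword set; both are assumptions, and the names `Node`, `rscore` and `search` are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple


@dataclass
class Node:
    keywords: FrozenSet[str]             # u.V in the text
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    app_id: Optional[str] = None         # hypothetical field, set on leaves only


def rscore(v: FrozenSet[str], q: FrozenSet[str]) -> float:
    """Assumed relevance score: share of query keywords found in v."""
    return len(v & q) / len(q) if q else 0.0


def search(node: Optional[Node], q: FrozenSet[str], gamma: float = 0.75,
           results: Optional[List[Tuple[float, Optional[str]]]] = None):
    """Greedy depth-first search that prunes subtrees scoring at or below gamma."""
    if results is None:
        results = []
    if node is None:
        return results
    score = rscore(node.keywords, q)
    if score <= gamma:
        return results                   # terminate this branch
    if node.left is None and node.right is None:
        results.append((score, node.app_id))   # leaf: record <RScore, u>
        return results
    # Visit the higher-scoring child first (the greedy ordering).
    children = [c for c in (node.left, node.right) if c is not None]
    children.sort(key=lambda c: rscore(c.keywords, q), reverse=True)
    for child in children:
        search(child, q, gamma, results)
    return results
```

With this assumed score, a parent never scores lower than its children, so pruning a subtree at the parent cannot discard a leaf that would have passed the threshold.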
In step 7), merging identical elements of different layout trees level by level includes:
1) for two layout trees lt1 and lt2, initialize the parameter depth to the minimum of the heights of lt1 and lt2 plus 1; initialize the root node of the matching tree as root; set the left subtree of root to lt1 and the right subtree of root to lt2;
2) starting from the root node root, add all nodes of the matching tree at level i to the set N_i;
3) search N_i for an isomorphic node pair (v_a, v_b) according to the greedy rule;
4) copy all child nodes of v_b under the isomorphic node v_a and delete v_b;
5) return the root node root of the matching tree.
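The merging idea can be sketched in Python under simplifying assumptions. The patent does not spell out the isomorphism test or the greedy rule here, so this sketch treats two nodes as isomorphic when they share a parent in the matching tree and have the same tag; it also walks the whole tree rather than stopping at the depth parameter. `LayoutNode` and `merge_layout_trees` are hypothetical names.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LayoutNode:
    tag: str                                        # e.g. "LinearLayout"
    children: List["LayoutNode"] = field(default_factory=list)


def merge_layout_trees(lt1: LayoutNode, lt2: LayoutNode) -> LayoutNode:
    """Merge two layout trees breadth-first; note that the inputs are mutated."""
    root = LayoutNode("root", [lt1, lt2])           # matching tree root
    queue = deque([root])
    while queue:
        parent = queue.popleft()
        kept: Dict[str, LayoutNode] = {}
        for child in parent.children:
            twin = kept.get(child.tag)
            if twin is None:
                kept[child.tag] = child             # first node with this tag (v_a)
            else:
                # child plays the role of v_b: move its children under v_a, drop it.
                twin.children.extend(child.children)
        parent.children = list(kept.values())
        queue.extend(parent.children)               # continue with the next level
    return root
```

One consequence of this simplified test is that repeated siblings inside a single layout (for example two TextView elements under one parent) also collapse into one node.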
In step 8), the similarity of the resulting program birthmarks is computed using a tree edit distance method.
This computation proceeds as follows: each application yields one tree-structured program birthmark b; the tree edit distance Ted_ij between two program birthmarks b_i and b_j is computed and normalized to obtain the similarity r_ij; if r_ij exceeds the similarity threshold θ, a clone relationship between the original applications is confirmed.
Similarity threshold θ = 0.9.
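The following Python sketch shows one way to compute such a similarity. It uses the textbook recursive edit distance for ordered, labelled trees with unit costs, which is not necessarily the algorithm used by the inventors, and it assumes the normalization r = 1 - Ted / (|b_i| + |b_j|), which the patent does not specify. Trees are written as nested tuples `(label, (child, ...))`; all function names are hypothetical.

```python
from functools import lru_cache


def size(tree) -> int:
    """Number of nodes in a tree written as (label, (child, ...))."""
    _label, children = tree
    return 1 + sum(size(c) for c in children)


@lru_cache(maxsize=None)
def forest_distance(f1: tuple, f2: tuple) -> int:
    """Unit-cost edit distance between two ordered forests (tuples of trees)."""
    if not f1 and not f2:
        return 0
    if not f1:
        return sum(size(t) for t in f2)      # insert everything in f2
    if not f2:
        return sum(size(t) for t in f1)      # delete everything in f1
    (l1, c1), (l2, c2) = f1[-1], f2[-1]
    # Delete the rightmost root of f1; its children join the forest.
    delete = forest_distance(f1[:-1] + c1, f2) + 1
    # Insert the rightmost root of f2.
    insert = forest_distance(f1, f2[:-1] + c2) + 1
    # Map the two rightmost roots onto each other, relabelling if needed.
    match = (forest_distance(c1, c2)
             + forest_distance(f1[:-1], f2[:-1])
             + (0 if l1 == l2 else 1))
    return min(delete, insert, match)


def similarity(b1, b2) -> float:
    """Assumed normalization: 1 - Ted / (|b1| + |b2|), which lies in [0, 1]."""
    ted = forest_distance((b1,), (b2,))
    return 1.0 - ted / (size(b1) + size(b2))


def is_clone_pair(b1, b2, theta: float = 0.9) -> bool:
    return similarity(b1, b2) > theta
```

The memoized recursion is adequate for small trees; for large birthmarks a dedicated algorithm such as Zhang-Shasha would be the usual choice.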
Compared with the prior art, the present invention has the following beneficial effects: it constructs program birthmarks from the interface layout features of applications and can therefore effectively avoid the influence of code hardening techniques; and in the preprocessing stage it exploits the functional similarity that exists between cloned applications, using a tree index to screen suspicious cloned applications quickly, which effectively increases detection speed and makes clone detection more practical.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the invention;
Fig. 2 is a flow chart of the statistics-based external file filtering method.
Embodiment
As shown in Fig. 1, the present invention comprises the following steps:
1. Initialization: construct a balanced binary tree index based on keyword vectors;
2. Input the functional description of the target application and extract a keyword vector of dynamic dimension using Stanford Parser;
3. Using a greedy depth-first search algorithm, quickly search the index tree for applications with similar functional descriptions and add them to a candidate set;
4. Decompress and convert each application in the resulting set of suspicious cloned applications to obtain all XML layout files under the /res/layout directory;
5. Filter the layout files, using a statistics-based method to screen out external layout files introduced by third-party libraries;
6. Convert the filtered layout files into layout trees of corresponding structure and load them into memory;
7. Merge the loaded layout trees in turn: starting from the root node and following a simple greedy strategy, merge identical elements of different layout trees level by level to generate the final program birthmark;
8. Compute the similarity of the resulting program birthmarks using a tree edit distance method; a pair of programs whose similarity exceeds the threshold is judged to be a clone pair.
The present invention proposes a hardening-resistant rapid detection method for cloned Android applications. In the preprocessing stage, the method uses the functional similarity of cloned programs to search quickly for suspicious programs, which reduces the computation required by the subsequent detection algorithm and thereby improves overall detection speed. In the formal detection stage, similarity is computed on program birthmarks built entirely from interface layout features, which effectively resists the influence of code hardening techniques. The preprocessing stage comprises keyword extraction, index tree construction, and fast keyword vector search; the formal detection stage comprises layout file extraction, external file filtering, program birthmark construction, and similarity computation.
Keyword extraction: the functional description of the input application is preprocessed with a natural language processing tool, and only the group of noun keywords most relevant to the application's function is extracted.
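As a rough illustration of the noun-filtering step, the Python sketch below assumes that the description has already been tokenized and part-of-speech tagged by a parser that emits Penn Treebank tags (as Stanford Parser does for English). It does not call the parser itself, and it does not rank keywords by relevance. The function name `keyword_vector` is hypothetical.

```python
from typing import Iterable, List, Tuple

# Penn Treebank tags for common and proper nouns, singular and plural.
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS"}


def keyword_vector(tagged_tokens: Iterable[Tuple[str, str]]) -> List[str]:
    """Keep each distinct noun once, lower-cased, in order of first appearance."""
    seen = set()
    keywords: List[str] = []
    for word, tag in tagged_tokens:
        key = word.lower()
        if tag in NOUN_TAGS and key not in seen:
            seen.add(key)
            keywords.append(key)
    return keywords
```

The length of the returned list depends on the description, which matches the "dynamic dimension" of the keyword vector mentioned in the text.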
Index tree construction: for each application in the original application set, a keyword vector of dynamic dimension is extracted and used as a leaf node of the index tree. The nodes are then merged pairwise, level by level, bottom-up in a recursive manner, to build a keyword-based balanced binary tree that stores all keyword vectors. The construction steps are those listed above for step 1).
Fast keyword vector search: the search starts from the root node of the index tree. The correlation between the target vector and the feature vectors stored in the left and right child nodes of the current node is computed, and the search recurses into the child nodes whose correlation exceeds a specific threshold γ. The search terminates when a leaf node is reached or when the correlation of a child node falls below γ. The procedure is the one given above for step 3), where the function RScore(V1, V2) returns the correlation between vectors V1 and V2.
Layout file extraction: given an Android application as input, the installation file is decompressed with a file decompression command, which directly yields the layout files in binary format; a format conversion tool can then convert the binary layout files into readable XML documents.
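Since an APK is a ZIP archive, the decompression part of this step can be sketched with Python's standard `zipfile` module. The sketch only collects the raw entries under `res/layout/`; decoding Android's binary XML into readable XML needs a separate conversion tool and is not shown. The function name `extract_layout_files` is hypothetical.

```python
import zipfile
from typing import Dict


def extract_layout_files(apk_path) -> Dict[str, bytes]:
    """Return the raw (still binary-encoded) layout files stored in an APK."""
    layouts: Dict[str, bytes] = {}
    with zipfile.ZipFile(apk_path) as apk:
        for name in apk.namelist():
            # Entries inside the archive have no leading slash.
            if name.startswith("res/layout/") and name.endswith(".xml"):
                layouts[name] = apk.read(name)
    return layouts
```

`apk_path` may be a file path or a file-like object, since `zipfile.ZipFile` accepts both.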
External file filtering: for every extracted layout file, the MD5 hash of the file is computed and compared against the MD5 values stored in a database. If the MD5 value already exists and the corresponding file occurs more often than a specific threshold, the layout file is judged to be an externally introduced file. The flow of the statistics-based external file filtering method is shown in Fig. 2.
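The filtering rule can be sketched in Python as follows. The database is stood in for by a plain mapping from MD5 hex digest to occurrence count, and the threshold is left as a parameter because the patent gives no value here; `md5_counts`, `min_count` and `filter_external_layouts` are hypothetical names.

```python
import hashlib
from typing import Dict, Mapping


def filter_external_layouts(layouts: Mapping[str, bytes],
                            md5_counts: Mapping[str, int],
                            min_count: int) -> Dict[str, bytes]:
    """Drop layout files whose MD5 has been seen more than min_count times."""
    kept: Dict[str, bytes] = {}
    for name, data in layouts.items():
        digest = hashlib.md5(data).hexdigest()
        if md5_counts.get(digest, 0) > min_count:
            continue          # common across many apps: treated as third-party
        kept[name] = data
    return kept
```

A file that is absent from the mapping has a count of zero and is therefore always kept.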
Program birthmark construction: the layout trees corresponding to the filtered layout files are constructed and loaded into memory, and are then merged in turn using the greedy breadth-first merge (GWFM) rule, finally producing a single tree-shaped program birthmark. The program birthmark generated by merging corresponds uniquely to the interface layout features of the application, and the order in which layout trees are merged does not affect the final structure of the birthmark. The greedy merging procedure is the one described above for step 7).
Similarity computation: each application yields one tree-structured program birthmark b. The tree edit distance (Tree Edit Distance) Ted_ij between two program birthmarks b_i and b_j is computed and normalized to obtain the similarity r_ij; if r_ij exceeds the similarity threshold θ, a clone relationship between the original applications is confirmed.
Claims (9)
1. A hardening-resistant rapid detection method for cloned applications on the Android platform, characterized by comprising the following steps:
1) construct a balanced binary tree index based on keyword vectors;
2) input the functional description of the target application and extract a keyword vector of dynamic dimension using Stanford Parser;
3) use the keyword vector to quickly search the balanced binary tree index for applications with similar functional descriptions, and add them to a candidate set;
4) decompress and convert each application in the candidate set to obtain all XML layout files under the /res/layout directory;
5) filter the layout files, screening out external layout files introduced by third-party libraries;
6) convert the filtered layout files into layout trees of corresponding structure and load them into memory;
7) merge the loaded layout trees in turn: starting from the root node, merge identical elements of different layout trees level by level to generate the final program birthmark;
8) compute the similarity of the resulting program birthmarks using a tree edit distance method; a pair of programs whose similarity exceeds the threshold is judged to be a clone pair.
2. The hardening-resistant rapid detection method for cloned applications on the Android platform according to claim 1, characterized in that, in step 1), the balanced binary tree index is constructed as follows:
1) input the keyword vector set V corresponding to n applications;
2) for each vector V_i in the keyword vector set V, construct a leaf node u_i, where u_i.V = V_i;
3) insert node u_i into the node set CurrentNodeSet;
4) while unprocessed nodes remain in CurrentNodeSet, repeat steps 5) to 7);
5) choose any two leaf nodes u' and u'' from CurrentNodeSet and construct a node u as their parent, where u.V = u'.V ∪ u''.V;
6) insert the parent node u into the temporary set TempNodeSet;
7) insert all nodes of TempNodeSet into CurrentNodeSet and clear TempNodeSet;
8) when the size of CurrentNodeSet is 1, end the loop and return the only node in CurrentNodeSet as the root node.
3. The hardening-resistant rapid detection method for cloned applications on the Android platform according to claim 1, characterized in that, in step 2), a greedy depth-first search algorithm is used to quickly search the index tree for applications with similar functional descriptions.
4. The hardening-resistant rapid detection method for cloned applications on the Android platform according to claim 1, characterized in that the specific implementation of step 3) includes:
1) input a balanced binary tree index node;
2) if the current node u is a non-leaf node and its relevance score RScore(u.V, Q) is greater than the threshold γ, compute the relevance scores of its left and right child nodes and then recursively search the children in order of their scores; otherwise, terminate the current search. If the current node u is a leaf node and its relevance score RScore(u.V, Q) is greater than the threshold γ, insert the new element <RScore(u.V, Q), u> into the result set RList;
3) return the result set RList, which is the result of the preliminary screening.
5. The hardening-resistant rapid detection method for cloned applications on the Android platform according to claim 5, characterized in that the threshold γ = 0.75.
6. The hardening-resistant rapid detection method for cloned applications on the Android platform according to claim 1, characterized in that, in step 7), merging identical elements of different layout trees level by level includes:
1) for two layout trees lt1 and lt2, initialize the parameter depth to the minimum of the heights of lt1 and lt2 plus 1; initialize the root node of the matching tree as root; set the left subtree of root to lt1 and the right subtree of root to lt2;
2) starting from the root node root, add all nodes of the matching tree at level i to the set N_i;
3) search N_i for an isomorphic node pair (v_a, v_b) according to the greedy rule;
4) copy all child nodes of v_b under the isomorphic node v_a and delete v_b;
5) return the root node root of the matching tree.
7. The hardening-resistant rapid detection method for cloned applications on the Android platform according to claim 1, characterized in that, in step 8), the similarity of the resulting program birthmarks is computed using a tree edit distance method.
8. The hardening-resistant rapid detection method for cloned applications on the Android platform according to claim 7, characterized in that computing the similarity of the resulting program birthmarks includes: each application yields one tree-structured program birthmark b; the tree edit distance Ted_ij between two program birthmarks b_i and b_j is computed and normalized to obtain the similarity r_ij; if r_ij exceeds the similarity threshold θ, a clone relationship between the original applications is confirmed.
9. The hardening-resistant rapid detection method for cloned applications on the Android platform according to claim 8, characterized in that the similarity threshold θ = 0.9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710842026.4A CN107622201B (en) | 2017-09-18 | 2017-09-18 | Hardening-resistant rapid detection method for cloned applications on the Android platform |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710842026.4A CN107622201B (en) | 2017-09-18 | 2017-09-18 | Hardening-resistant rapid detection method for cloned applications on the Android platform |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107622201A (en) | 2018-01-23 |
CN107622201B (en) | 2018-07-24 |
Family
ID=61090608
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710842026.4A Active CN107622201B (en) | 2017-09-18 | 2017-09-18 | Hardening-resistant rapid detection method for cloned applications on the Android platform |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107622201B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109542509A (en) * | 2018-11-13 | 2019-03-29 | 北京梆梆安全科技有限公司 | A kind of risk checking method and device of resource file |
CN113312029A (en) * | 2021-06-11 | 2021-08-27 | 四川大学 | Interface recommendation method and device, electronic equipment and medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120233163A1 (en) * | 2011-03-08 | 2012-09-13 | Google Inc. | Detecting application similarity |
CN103577323A (en) * | 2013-09-27 | 2014-02-12 | 西安交通大学 | Dynamic key command sequence birthmark-based software plagiarism detecting method |
CN103870721A (en) * | 2014-03-04 | 2014-06-18 | 西安交通大学 | Multi-thread software plagiarism detection method based on thread slice birthmarks |
CN103984883A (en) * | 2014-05-21 | 2014-08-13 | 湘潭大学 | Class dependency graph based Android application similarity detection method |
CN105373601A (en) * | 2015-11-09 | 2016-03-02 | 国家计算机网络与信息安全管理中心 | Keyword word frequency characteristic-based multimode matching method |
CN105550540A (en) * | 2014-10-31 | 2016-05-04 | 中国移动通信集团江苏有限公司 | Detection method and device for homogenization application |
- 2017-09-18: application CN201710842026.4A filed in CN; patent CN107622201B; status: Active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120233163A1 (en) * | 2011-03-08 | 2012-09-13 | Google Inc. | Detecting application similarity |
CN103577323A (en) * | 2013-09-27 | 2014-02-12 | 西安交通大学 | Dynamic key command sequence birthmark-based software plagiarism detecting method |
CN103870721A (en) * | 2014-03-04 | 2014-06-18 | 西安交通大学 | Multi-thread software plagiarism detection method based on thread slice birthmarks |
CN103984883A (en) * | 2014-05-21 | 2014-08-13 | 湘潭大学 | Class dependency graph based Android application similarity detection method |
CN105550540A (en) * | 2014-10-31 | 2016-05-04 | 中国移动通信集团江苏有限公司 | Detection method and device for homogenization application |
CN105373601A (en) * | 2015-11-09 | 2016-03-02 | 国家计算机网络与信息安全管理中心 | Keyword word frequency characteristic-based multimode matching method |
Non-Patent Citations (2)
Title |
---|
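He Wencai et al.: "Android malware detection scheme based on a minimum distance classifier", Application Research of Computers * |
Jiao Sibei et al.: "An obfuscation-resistant method for large-scale Android application similarity detection", Journal of Computer Research and Development * |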
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109542509A (en) * | 2018-11-13 | 2019-03-29 | 北京梆梆安全科技有限公司 | A kind of risk checking method and device of resource file |
CN113312029A (en) * | 2021-06-11 | 2021-08-27 | 四川大学 | Interface recommendation method and device, electronic equipment and medium |
CN113312029B (en) * | 2021-06-11 | 2023-09-08 | 四川大学 | Interface recommendation method and device, electronic equipment and medium |
Also Published As
Publication number | Publication date |
---|---|
CN107622201B (en) | 2018-07-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111428044B (en) | Method, device, equipment and storage medium for acquiring supervision and identification results in multiple modes | |
CN106709345B (en) | Method, system and equipment for deducing malicious code rules based on deep learning method | |
CN111783100A (en) | Source code vulnerability detection method for code graph representation learning based on graph convolution network | |
Fan et al. | Answering graph pattern queries using views | |
CN111639337B (en) | Unknown malicious code detection method and system for massive Windows software | |
CN111460472B (en) | Encryption algorithm identification method based on deep learning graph network | |
CN108491228B (en) | Binary vulnerability code clone detection method and system | |
CN109492355B (en) | Software anti-analysis method and system based on deep learning | |
US11100218B2 (en) | Systems and methods for improving accuracy in recognizing and neutralizing injection attacks in computer services | |
CN113221032A (en) | Link risk detection method, device and storage medium | |
CN112286575A (en) | Intelligent contract similarity detection method and system based on graph matching model | |
CN107622201B (en) | Hardening-resistant rapid detection method for cloned applications on the Android platform | |
CN112580331A (en) | Method and system for establishing knowledge graph of policy text | |
CN106682514B (en) | System calling sequence feature pattern set generation method based on subgraph mining | |
CN106649262B (en) | Method for protecting sensitive information of enterprise hardware facilities in social media | |
CN105468972B (en) | A kind of mobile terminal document detection method | |
CN105243327B (en) | A kind of secure file processing method | |
CN110990834B (en) | Static detection method, system and medium for android malicious software | |
CN111881446B (en) | Industrial Internet malicious code identification method and device | |
CN112906391A (en) | Meta-event extraction method and device, electronic equipment and storage medium | |
US10002254B2 (en) | Systems and methods for SQL type evaluation to detect evaluation flaws | |
CN113971283A (en) | Malicious application program detection method and device based on features | |
US20170068820A1 (en) | Systems and methods for sql value evaluation to detect evaluation flaws | |
CN111310186A (en) | Method, device and system for detecting confusion command line | |
CN112765606A (en) | Malicious code homology analysis method, device and equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||