CN109190370B - Android interface similarity calculation method based on control region distribution characteristics - Google Patents
- Publication number
- CN109190370B (application CN201810711378.0A)
- Authority
- CN
- China
- Prior art keywords
- tree
- control
- similarity
- information
- interface
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/562—Static detection
- G06F21/563—Static detection by source code analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/566—Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2221/00—Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/03—Indexing scheme relating to G06F21/50, monitoring users, programs or devices to maintain the integrity of platforms
- G06F2221/033—Test or assess software
Abstract
A method for calculating the similarity of Android interfaces based on control region distribution characteristics, comprising a model that describes Android runtime application interface information, the control region distribution tree, and a calculation of interface similarity. First, the view hierarchy information of an application interface layout is obtained; this information can be obtained with an existing third-party tool. Next, the rectangular region information of each control in the view hierarchy is traversed, and the rectangular regions are built into an R-tree. The constructed R-tree is taken as the control region distribution tree, and the similarity of the corresponding interfaces is judged by comparing the similarity of their control region distribution trees. The method improves resistance to code obfuscation and to the two types of anti-detection techniques, and improves the accuracy of similarity calculation for hybrid Android application interfaces.
Description
Technical Field
The invention belongs to the fields of Android applications, software birthmarks, repackaging detection, and application obfuscation algorithms, and particularly relates to an Android interface similarity calculation method based on control region distribution characteristics.
Background
With the increasing popularity of mobile devices, the number of mobile applications has grown explosively and has attracted the attention of many malicious actors. Android applications are easy to repackage and republish, and attackers add or modify parts of the code during repackaging to serve their own illegitimate ends. Most prior art detects Android application repackaging by identifying an application's software birthmark, which is extracted by analyzing the application's code or interface information. However, because obfuscation and encryption are now widespread, extracting a software birthmark from application code is severely hindered, so more and more work focuses on extracting birthmarks from the application interface; an important part of that work is how to calculate the similarity between interfaces. As hybrid Android applications become the mainstream mode of Android development, their interface information differs greatly from that of traditional native Android applications. In practice, no existing work extracts birthmarks from hybrid application interfaces; in theory, the dynamic, untyped control attributes in the Web part make processing more difficult.
Disclosure of Invention
To address the deficiencies of the prior art, the invention provides a method for calculating the similarity of Android interfaces based on control region distribution characteristics. The method rests on the following observation: each control in an Android application interface always occupies a rectangular area (with position and size information), these rectangular areas form the skeleton of the interface, and a partial modification of the skeleton does not greatly affect the overall similarity of the interface. We therefore propose the control region distribution tree, a model of Android runtime application interface information.
In order to achieve the above purpose, the invention adopts the following technical solution:
A method for calculating the similarity of Android interfaces based on control region distribution characteristics, comprising the following steps:
step one, dynamically executing the Android application and collecting user interface information;
step two, constructing a control region distribution tree from the user interface information;
step three, judging the similarity between application interfaces by comparing the similarity of their control region distribution trees.
To further refine the technical solution, the specific measures adopted also include the following.
In step one, for the Android application runtime interfaces to be compared, the structural information of the Android application user interface is obtained.
In step two, the control region distribution tree is a tree-shaped data structure; each application interface corresponds to one control region distribution tree, the nodes of the tree correspond one-to-one to the controls in the application interface, and each node contains the region information of its control, i.e. the position and size of the rectangular area the control occupies when the interface is rendered.
Step two specifically comprises:
step 2.1, obtaining the view hierarchy information of the application interface layout;
step 2.2, traversing, one by one and in any order, the rectangular region information of the bottommost controls in the view hierarchy, and inserting these rectangular regions as nodes to build an R-tree;
step 2.3, taking the constructed R-tree as the control region distribution tree.
Step three specifically comprises:
step 3.1, each node of a control region distribution tree contains the position and size information of the rectangular region occupied by its control, and the similarity of two rectangular regions is defined by the Jaccard index;
let two rectangles $r_1$ and $r_2$ have areas $s_1$ and $s_2$, and let their overlapping area be $s_o$; then the similarity of the two rectangles is:
$$sim(r_1, r_2) = \frac{s_o}{s_1 + s_2 - s_o}$$
step 3.2, let the two control region distribution trees be $t_1$ and $t_2$; for each rectangle $r_i \in t_1$, obtain:
$$sim_m(r_i) = \max_{r_j \in t_2} sim(r_i, r_j)$$
where $r_j$ ranges over the rectangles in $t_2$, i.e. $r_j \in t_2$;
finally, the average value of $sim_m$ over all $r_i \in t_1$ is taken as the similarity of the trees, i.e. the similarity of the corresponding interfaces.
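As a worked illustration of step three, the following numbers are hypothetical and rest on two readings of the text above: that the rectangle similarity is the overlap area divided by the union area, $s_o / (s_1 + s_2 - s_o)$, and that $sim_m$ keeps the best match found in $t_2$. Take $r_1$ spanning $x \in [0, 100]$, $y \in [0, 100]$ and $r_2$ spanning $x \in [50, 150]$, $y \in [0, 100]$:

$$
\begin{aligned}
s_1 &= s_2 = 100 \cdot 100 = 10000, \qquad s_o = 50 \cdot 100 = 5000 \\
sim(r_1, r_2) &= \frac{5000}{10000 + 10000 - 5000} = \frac{1}{3}
\end{aligned}
$$

If $t_1 = \{r_1, r_3\}$ and $t_2 = \{r_2, r_3\}$, where $r_3$ is a third rectangle that overlaps neither $r_1$ nor $r_2$, then $sim_m(r_1) = 1/3$ and $sim_m(r_3) = 1$, so under these assumptions the tree similarity is

$$\frac{1}{2}\left(\frac{1}{3} + 1\right) = \frac{2}{3}.$$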
The beneficial effects of the invention are as follows. The invention provides two functions: 1. collecting Android application user interface information and constructing a control region distribution tree; 2. reporting the similarity between Android application interfaces. Compared with the prior art, its distinguishing features are that it improves resistance to code obfuscation and to the two types of anti-detection techniques, and improves the accuracy of similarity calculation for hybrid Android application interfaces.
Drawings
FIG. 1 is a system block diagram of an android interface similarity calculation method based on control region distribution characteristics.
FIG. 2 is a flow diagram of the generation of a control region distribution tree.
FIG. 3 is an example of an interface provided.
FIG. 4 is a view hierarchy of an interface.
FIG. 5 is a rectangular area corresponding to a control in the interface and the partition of MBR in the generated R-tree.
FIG. 6 is the final generated control region distribution tree.
Detailed Description
The present invention will now be described in further detail with reference to the accompanying drawings.
First, main process
As shown in FIG. 1, for a pair of Android application runtime interfaces to be compared, the structural information of each user interface is obtained with a third-party tool. The user interface structure information contained in each layout is then converted into a control region distribution tree, which serves as the feature of that application interface. Finally, the similarity of the control region distribution trees corresponding to the two Android application interfaces is calculated to determine whether the interfaces are similar.
Second, generation of control region distribution tree
The generation flow chart of the control region distribution tree is shown in fig. 2, and the detailed steps are as follows:
The layouts we obtain are in XML format, and their data structure can be regarded as a tree: the whole tree represents the layout hierarchy, and its nodes represent the corresponding controls in the layout. A view hierarchy tree of the running interface layout can be obtained using a third-party tool.
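A minimal sketch of reading such an XML layout might look as follows. The patent does not name the third-party tool, so the input format here is an assumption: a UI Automator style dump in which every `node` element carries a `bounds="[left,top][right,bottom]"` attribute. The function name `leaf_rectangles` and the `SAMPLE` layout are hypothetical. Only leaf nodes are kept, in line with the description's later statement that only bottom-level controls are used.

```python
import re
import xml.etree.ElementTree as ET

# Assumed bounds syntax: "[left,top][right,bottom]" (UI Automator style dump).
BOUNDS_RE = re.compile(r"\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]")


def leaf_rectangles(xml_text):
    """Return (left, top, right, bottom) for every leaf <node> in the dump."""
    root = ET.fromstring(xml_text)
    rects = []
    for node in root.iter("node"):
        if node.find("node") is not None:
            continue  # container element: it has child nodes, so skip it
        match = BOUNDS_RE.fullmatch(node.get("bounds", ""))
        if match:
            rects.append(tuple(int(v) for v in match.groups()))
    return rects


# Hypothetical layout: one container holding two controls, plus one image.
SAMPLE = """<hierarchy rotation="0">
  <node class="android.widget.FrameLayout" bounds="[0,0][1080,1920]">
    <node class="android.widget.LinearLayout" bounds="[0,0][1080,400]">
      <node class="android.widget.TextView" bounds="[0,0][540,400]" />
      <node class="android.widget.Button" bounds="[540,0][1080,400]" />
    </node>
    <node class="android.widget.ImageView" bounds="[0,400][1080,1920]" />
  </node>
</hierarchy>"""

if __name__ == "__main__":
    for rect in leaf_rectangles(SAMPLE):
        print(rect)
```

The two layout containers are dropped and the three leaf controls remain, which is the set of rectangles that would then be inserted into the R-tree.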
Each control in the interface occupies a rectangular area, and different controls draw different content (such as color, text, or pictures) in rectangles of different sizes. The controls are the bottom-level elements of the interface view hierarchy, and their rectangles are neither nested within nor overlapping one another. We build an R-tree to store this rectangle information. An R-tree is a balanced multi-way tree for indexing multi-dimensional information; in the present invention the rectangle information is two-dimensional (width and height). The R-tree groups adjacent objects into a node using a minimum bounding rectangle (MBR) while keeping the tree balanced. The rectangle information of each control is inserted into the R-tree one by one, and the finally constructed R-tree serves as the control region distribution tree of the interface. Different insertion orders may produce different R-trees, but they do not affect the subsequent similarity calculation on the control region distribution trees.
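The insertion procedure described above can be sketched with a deliberately simplified R-tree. This is an illustration under stated assumptions, not the patent's implementation: the node capacity `MAX_ENTRIES`, the least-enlargement choice of subtree, and the sort-and-halve split are all simplifications chosen here (a classic R-tree would use a linear or quadratic split), and every name in the block is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom)
MAX_ENTRIES = 4  # assumed node capacity


def union(a: Rect, b: Rect) -> Rect:
    """Minimum bounding rectangle (MBR) of two rectangles."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def area(r: Rect) -> int:
    return max(0, r[2] - r[0]) * max(0, r[3] - r[1])


@dataclass
class Node:
    leaf: bool = True
    rects: List[Rect] = field(default_factory=list)        # leaf entries
    children: List["Node"] = field(default_factory=list)   # inner entries

    def mbr(self) -> Rect:
        boxes = self.rects if self.leaf else [c.mbr() for c in self.children]
        out = boxes[0]
        for box in boxes[1:]:
            out = union(out, box)
        return out


def _key(box: Rect) -> Tuple[int, int]:
    return (box[0] + box[2], box[1] + box[3])  # twice the centre point


def _split(node: Node) -> Node:
    """Simplified split: sort entries by centre and move the upper half out."""
    if node.leaf:
        items = sorted(node.rects, key=_key)
        half = len(items) // 2
        node.rects = items[:half]
        return Node(leaf=True, rects=items[half:])
    kids = sorted(node.children, key=lambda c: _key(c.mbr()))
    half = len(kids) // 2
    node.children = kids[:half]
    return Node(leaf=False, children=kids[half:])


def _insert(node: Node, rect: Rect) -> Optional[Node]:
    if node.leaf:
        node.rects.append(rect)
    else:
        # Descend into the child whose MBR needs the least enlargement.
        best = min(node.children,
                   key=lambda c: (area(union(c.mbr(), rect)) - area(c.mbr()),
                                  area(c.mbr())))
        sibling = _insert(best, rect)
        if sibling is not None:
            node.children.append(sibling)
    count = len(node.rects) if node.leaf else len(node.children)
    return _split(node) if count > MAX_ENTRIES else None


def build(rects: List[Rect]) -> Node:
    """Insert rectangles one by one; a root split adds one level of height."""
    root = Node()
    for rect in rects:
        sibling = _insert(root, rect)
        if sibling is not None:
            root = Node(leaf=False, children=[root, sibling])
    return root


def leaves(node: Node) -> List[Rect]:
    if node.leaf:
        return list(node.rects)
    return [r for child in node.children for r in leaves(child)]
```

Building the tree from the same rectangles in two different orders can give different internal groupings, yet `sorted(leaves(tree))` is the same in both cases, which is consistent with the statement above that insertion order does not affect the later similarity calculation.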
FIG. 3 shows an example interface, and FIG. 4 is the view hierarchy of that interface, in which each node corresponds to an element that is either visible or invisible in the interface (in general, an invisible element, such as LinearLayout, organizes the layout of its child elements). We consider only the leaf-node elements and add their rectangle information to the R-tree. FIG. 5 shows the rectangular areas corresponding to the controls in the interface and the MBR partition in the generated R-tree, and FIG. 6 shows the finally generated control region distribution tree.
Third, similarity calculation of the control region distribution tree
Each node of a control region distribution tree contains the position and size information of the rectangular area occupied by its control, and the similarity of two rectangular areas is defined by the Jaccard index. Let two rectangles $r_1$ and $r_2$ have areas $s_1$ and $s_2$, and let their overlapping area be $s_o$; then the similarity of the two rectangles is:
$$sim(r_1, r_2) = \frac{s_o}{s_1 + s_2 - s_o}$$
Next, let the two control region distribution trees be $t_1$ and $t_2$; for each rectangle $r_i \in t_1$, obtain:
$$sim_m(r_i) = \max_{r_j \in t_2} sim(r_i, r_j)$$
where $r_j$ ranges over the rectangles in $t_2$, i.e. $r_j \in t_2$.
Finally, the average value of $sim_m$ over all $r_i \in t_1$ is taken as the similarity of the trees, i.e. the similarity of the corresponding interfaces.
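The calculation in this section can be sketched as follows. The sketch assumes that the rectangle similarity is overlap area over union area and that $sim_m$ is the best match in $t_2$; it also represents each tree simply as the list of its control rectangles and scans $t_2$ linearly instead of querying an R-tree index. All function names are hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom)


def area(r: Rect) -> int:
    return max(0, r[2] - r[0]) * max(0, r[3] - r[1])


def overlap_area(a: Rect, b: Rect) -> int:
    """Area s_o shared by two axis-aligned rectangles (0 if disjoint)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return width * height if width > 0 and height > 0 else 0


def rect_similarity(a: Rect, b: Rect) -> float:
    """sim(r1, r2) = s_o / (s1 + s2 - s_o), assumed overlap-over-union."""
    s_o = overlap_area(a, b)
    denom = area(a) + area(b) - s_o
    return s_o / denom if denom > 0 else 0.0


def tree_similarity(t1: List[Rect], t2: List[Rect]) -> float:
    """Average over t1 of each rectangle's best match in t2."""
    if not t1 or not t2:
        return 0.0
    best = [max(rect_similarity(ri, rj) for rj in t2) for ri in t1]
    return sum(best) / len(best)


if __name__ == "__main__":
    # Hypothetical interfaces: the second control is half as tall in t2.
    t1 = [(0, 0, 100, 100), (0, 100, 100, 200)]
    t2 = [(0, 0, 100, 100), (0, 100, 100, 150)]
    print(tree_similarity(t1, t2))
```

Because the average runs over the rectangles of $t_1$ only, the measure as written is not necessarily symmetric: `tree_similarity(t1, t2)` and `tree_similarity(t2, t1)` can differ when the two interfaces contain different numbers of controls.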
It should be noted that terms such as "upper", "lower", "left", "right", "front", and "back" used in the present invention are for clarity of description only and are not intended to limit the implementable scope of the invention; changes or adjustments to their relative relationships, without substantive change to the technical content, are also regarded as falling within the implementable scope of the invention.
The above is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above-mentioned embodiments, and all technical solutions belonging to the idea of the present invention belong to the protection scope of the present invention. It should be noted that modifications and embellishments within the scope of the invention may be made by those skilled in the art without departing from the principle of the invention.
Claims (1)
1. A method for calculating the similarity of an android interface based on control region distribution characteristics is characterized by comprising the following steps:
step one, dynamically executing the Android application and collecting user interface information; for the Android application runtime interfaces to be compared, obtaining the structural information of the Android application user interface;
step two, constructing a control area distribution tree through user interface information; the control area distribution tree is a tree-shaped data structure, an application interface corresponds to the control area distribution tree, nodes of the tree correspond to controls in the application interface one by one, each node comprises area information corresponding to the control, and the area information of the control refers to the position and size information of a rectangular area occupied by the control on the interface presentation; the second step specifically comprises:
step 2.1, firstly, obtaining view level information of the application interface layout;
step 2.2, traversing the rectangular region information of the bottommost control in the view level one by one according to any sequence, taking the rectangular regions as nodes, and inserting and constructing the nodes into an R tree;
step 2.3, taking the constructed R tree as a control area distribution tree;
step three, judging the similarity between the application interfaces by comparing the similarity of the control region distribution trees;
step three specifically comprises:
step 3.1, each node of a control region distribution tree contains the position and size information of the rectangular region occupied by its control, and the similarity of two rectangular regions is defined by the Jaccard index;
let two rectangles $r_1$ and $r_2$ have areas $s_1$ and $s_2$, and let their overlapping area be $s_o$; then the similarity of the two rectangles is:
$$sim(r_1, r_2) = \frac{s_o}{s_1 + s_2 - s_o}$$
step 3.2, let the two control region distribution trees be $t_1$ and $t_2$; for each rectangle $r_i \in t_1$, obtain:
$$sim_m(r_i) = \max_{r_j \in t_2} sim(r_i, r_j)$$
where $r_j$ ranges over the rectangles in $t_2$, i.e. $r_j \in t_2$;
finally, the average value of $sim_m$ over all $r_i \in t_1$ is taken as the similarity of the trees, i.e. the similarity of the corresponding interfaces.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810711378.0A CN109190370B (en) | 2018-07-02 | 2018-07-02 | Android interface similarity calculation method based on control region distribution characteristics |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810711378.0A CN109190370B (en) | 2018-07-02 | 2018-07-02 | Android interface similarity calculation method based on control region distribution characteristics |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109190370A (en) | 2019-01-11 |
CN109190370B (en) | 2022-02-08 |
Family
ID=64948793
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810711378.0A Active CN109190370B (en) | 2018-07-02 | 2018-07-02 | Android interface similarity calculation method based on control region distribution characteristics |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109190370B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110502876B (en) * | 2019-08-07 | 2021-04-27 | 南京大学 | Android interface static confusion method based on resource files |
CN110766697B (en) * | 2019-10-16 | 2023-08-04 | 南京大学 | Method and device for identifying graphical interface control image of interface sketch |
CN111273905B (en) * | 2020-01-17 | 2023-04-18 | 南京大学 | Application retrieval method and device based on interface sketch |
CN116795346B (en) * | 2023-06-26 | 2024-03-15 | 成都中科合迅科技有限公司 | Component interface drawing method and system based on visual contrast |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107169323A (en) * | 2017-05-11 | 2017-09-15 | 南京大学 | Android application repackaging detection method based on layout cluster graph |
Non-Patent Citations (1)
Title |
---|
"DroidEagle: Seamless Detection of Visually Similar Android Apps"; Mingshen Sun et al.; Proceedings of the 8th ACM Conference on Security and Privacy in Wireless and Mobile Networks; 2015-06-30; full text * |
Also Published As
Publication number | Publication date |
---|---|
CN109190370A (en) | 2019-01-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109190370B (en) | Android interface similarity calculation method based on control region distribution characteristics | |
Raja et al. | Table structure recognition using top-down and bottom-up cues | |
US20230091174A1 (en) | System and method for the creation and use of visually-diverse high-quality dynamic layouts | |
US9996566B2 (en) | Visual design system for generating a visual data structure associated with a semantic composition based on a hierarchy of components | |
US9817804B2 (en) | System for comparison and merging of versions in edited websites and interactive applications | |
US11188509B2 (en) | System and method for generating a visual data structure associated with business information based on a hierarchy of components | |
JP5952428B2 (en) | Borderless table detection engine | |
US20160034441A1 (en) | Systems, apparatuses and methods for generating a user interface | |
CN104067293B (en) | Polar plot classification engine | |
CN105930159A (en) | Image-based interface code generation method and system | |
Morozov et al. | Distributed contour trees | |
JP2010541097A (en) | Arrangement of graphics objects on the page by control based on relative position | |
US11874813B2 (en) | Visual design system for generating a visual data structure associated with a semantic composition based on a hierarchy of components | |
US9031894B2 (en) | Parsing and rendering structured images | |
AU2016299873C1 (en) | System and method for the creation and use of visually- diverse high-quality dynamic visual data structures | |
US11610054B1 (en) | Semantically-guided template generation from image content | |
CN106201184A (en) | Edit methods, device and the terminal of a kind of SNS message | |
TW201523421A (en) | Determining images of article for extraction | |
JP5890340B2 (en) | Image classification device and image classification program | |
US11663398B2 (en) | Mapping annotations to ranges of text across documents | |
Yuan et al. | A novel figure panel classification and extraction method for document image understanding | |
CN111723177B (en) | Modeling method and device of information extraction model and electronic equipment | |
Diem et al. | Semi-automated document image clustering and retrieval | |
Wu et al. | Very fast generation of content-preserved photo collage under canvas size constraint | |
KR20170081348A (en) | Method and System for Intelligent Mining of Digital Image Big-Data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||