CN109190370B - Android interface similarity calculation method based on control region distribution characteristics - Google Patents

Info

Publication number
CN109190370B
Authority
CN
China
Prior art keywords
tree
control
similarity
information
interface
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810711378.0A
Other languages
Chinese (zh)
Other versions
CN109190370A (en)
Inventor
岳胜涛
马骏
陶先平
吕建
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University
Priority to CN201810711378.0A
Publication of CN109190370A
Application granted
Publication of CN109190370B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55Detecting local intrusion or implementing counter-measures
    • G06F21/56Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562Static detection
    • G06F21/563Static detection by source code analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55Detecting local intrusion or implementing counter-measures
    • G06F21/56Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/566Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/03Indexing scheme relating to G06F21/50, monitoring users, programs or devices to maintain the integrity of platforms
    • G06F2221/033Test or assess software

Abstract

A method for calculating the similarity of Android interfaces based on control region distribution characteristics. The method comprises a model that describes application interface information at Android runtime, the control region distribution tree, and an interface similarity calculation. First, the view hierarchy information of an application interface layout is obtained; this information can be obtained with an existing third-party tool. Next, the rectangular region information of each control in the view hierarchy is traversed, and the rectangular regions are built into an R-tree. The constructed R-tree serves as the control region distribution tree, and the similarity of the corresponding interfaces is judged by comparing the similarity of their control region distribution trees. The method improves resistance to code obfuscation and to two types of anti-detection techniques, and improves the accuracy of similarity calculation for hybrid Android application interfaces.

Description

Android interface similarity calculation method based on control region distribution characteristics
Technical Field
The invention belongs to the fields of Android applications, software birthmarks, repackaging detection, and application obfuscation algorithms, and particularly relates to an Android interface similarity calculation method based on control region distribution characteristics.
Background
With the growing popularity of mobile devices, the number of mobile applications has grown explosively, which has drawn the attention of many attackers. Android applications are easy to repackage and republish, and attackers add or modify parts of the code during repackaging to achieve their illegal purposes. Most prior work detects repackaged Android applications by identifying the software birthmark of an application, which is extracted by analyzing the application's code or interface information. However, because obfuscation and encryption are now widely used in applications, extracting software birthmarks from application code is severely hindered. More and more work therefore focuses on extracting birthmarks from the application interface, and an important part of that work is how to calculate the similarity between interfaces. As hybrid Android applications gradually become the mainstream mode of Android development, their interface information differs greatly from that of traditional native Android applications. In practice, no existing work extracts birthmarks from hybrid application interfaces; in theory, the dynamic, untyped control attributes in the Web part make processing more difficult.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention provides a method for calculating the similarity of Android interfaces based on control region distribution characteristics. The method rests on the following observation: each control in an Android application interface occupies a rectangular region (with position and size information), these rectangular regions form the skeleton of the interface, and modifying part of the skeleton does not greatly affect the overall similarity of the interface. We therefore propose the control region distribution tree, a model of application interface information at Android runtime.
In order to achieve the purpose, the invention adopts the following technical scheme:
A method for calculating the similarity of Android interfaces based on control region distribution characteristics, comprising the following steps:
Step 1: dynamically execute the Android application and collect user interface information;
Step 2: construct a control region distribution tree from the user interface information;
Step 3: judge the similarity between application interfaces by comparing the similarity of their control region distribution trees.
To further optimize the technical scheme, the following specific measures are also adopted:
In step 1, structural information of the Android application user interface is obtained for the Android application runtime interfaces to be compared.
In step 2, the control region distribution tree is a tree data structure. One application interface corresponds to one control region distribution tree, the nodes of the tree correspond one-to-one to the controls in the application interface, and each node contains the region information of its control, i.e. the position and size of the rectangular region that the control occupies in the rendered interface.
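The region information held by each node can be pictured with a small data structure. The following is only a sketch: the names `Rect` and `ControlNode`, the pixel units, and the sample button are assumptions made for illustration and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle occupied by one control (assumed to be in screen pixels)."""
    left: int
    top: int
    width: int
    height: int

    @property
    def right(self) -> int:
        return self.left + self.width

    @property
    def bottom(self) -> int:
        return self.top + self.height

    @property
    def area(self) -> int:
        return self.width * self.height


@dataclass(frozen=True)
class ControlNode:
    """One node of a control region distribution tree (hypothetical name)."""
    control_class: str  # kept only for debugging; the method itself uses the region
    region: Rect


# Hypothetical control: a button 1000 px wide and 120 px high.
button = ControlNode("android.widget.Button", Rect(left=40, top=600, width=1000, height=120))
print(button.region.right, button.region.bottom, button.region.area)  # 1040 720 120000
```

Storing position and size rather than control type is what lets the same structure describe both native controls and untyped Web elements.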
Step 2 specifically comprises:
Step 2.1: obtain the view hierarchy information of the application interface layout;
Step 2.2: traverse the rectangular region information of the bottommost controls in the view hierarchy one by one, in any order, and insert these rectangular regions as nodes to build an R-tree;
Step 2.3: take the constructed R-tree as the control region distribution tree.
Step 3 specifically comprises:
Step 3.1: each node of a control region distribution tree contains the position and size of the rectangular region occupied by a control, and the similarity of two rectangular regions is defined using the Jaccard index.
Let two rectangles $r_1$, $r_2$ have areas $s_1$, $s_2$ and overlapping area $s_o$. The similarity of the two rectangles is then:
$$sim(r_1, r_2) = \frac{s_o}{s_1 + s_2 - s_o}$$
Step 3.2: let the two control region distribution trees be $t_1$ and $t_2$. For each rectangle $r_i \in t_1$, compute:
$$sim_m(r_i) = \max_{r_j \in t_2} sim(r_i, r_j)$$
where $r_j$ ranges over the rectangles in $t_2$, i.e. $r_j \in t_2$.
Finally, the average of $sim_m$ over all $r_i \in t_1$ is taken as the similarity of the trees, i.e. the similarity of the corresponding interfaces.
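As a worked illustration, assume the rectangle similarity is the ratio of the overlapping area to the area of the union (the usual Jaccard form). The two rectangles and their coordinates are made up for this example.

```latex
% r_1 = [0,100] x [0,100],  r_2 = [50,150] x [0,100]  (hypothetical values)
\begin{align*}
s_1 &= 100 \times 100 = 10000, \qquad s_2 = 100 \times 100 = 10000, \\
s_o &= (100 - 50) \times 100 = 5000, \\
sim(r_1, r_2) &= \frac{s_o}{s_1 + s_2 - s_o} = \frac{5000}{15000} = \frac{1}{3}.
\end{align*}
```

Under the same assumption, identical rectangles give a similarity of 1 and disjoint rectangles give 0.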
Beneficial effects of the invention: the functions provided by the invention include (1) collecting Android application user interface information and constructing a control region distribution tree, and (2) reporting the similarity between Android application interfaces. Compared with the prior art, the method improves resistance to code obfuscation and to two types of anti-detection techniques, and improves the accuracy of similarity calculation for hybrid Android application interfaces.
Drawings
FIG. 1 is a system block diagram of an android interface similarity calculation method based on control region distribution characteristics.
FIG. 2 is a flow diagram of the generation of a control region distribution tree.
FIG. 3 is an example of an interface provided.
FIG. 4 is a view hierarchy of an interface.
FIG. 5 is a rectangular area corresponding to a control in the interface and the partition of MBR in the generated R-tree.
FIG. 6 is the final generated control region distribution tree.
Detailed Description
The present invention will now be described in further detail with reference to the accompanying drawings.
First, main process
As shown in FIG. 1, for a pair of Android application runtime interfaces to be compared, the structural information of each user interface is first obtained with a third-party tool. The user interface structure information contained in each layout is then converted into a control region distribution tree, which serves as the feature of that application interface. Finally, the similarity of the control region distribution trees corresponding to the two Android application interfaces is calculated to determine whether the interfaces are similar.
Second, generation of control region distribution tree
The flow chart for generating the control region distribution tree is shown in FIG. 2. The detailed steps are as follows.
The layouts we obtain are in XML format, and their data structure can be regarded as a tree: the whole tree represents the layout hierarchy, and each node represents the corresponding control in the layout. The view hierarchy tree of a running interface layout can be obtained with a third-party tool.
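The patent does not name the third-party tool. As a sketch, assume the hierarchy is a UI Automator style XML dump in which every `<node>` element carries a `bounds="[x1,y1][x2,y2]"` attribute. The function name `leaf_rectangles` and the sample layout are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Assumed bounds format: "[x1,y1][x2,y2]" (top-left and bottom-right corners).
BOUNDS_RE = re.compile(r"\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]")


def leaf_rectangles(xml_text):
    """Return (x1, y1, x2, y2) for every leaf <node> element in the dump."""
    root = ET.fromstring(xml_text)
    rects = []
    for node in root.iter("node"):
        if len(node) > 0:  # has child elements, so it is a container, not a leaf
            continue
        match = BOUNDS_RE.fullmatch(node.get("bounds", ""))
        if match:
            rects.append(tuple(int(v) for v in match.groups()))
    return rects


# Hypothetical layout: one container holding two leaf controls.
SAMPLE = """<hierarchy rotation="0">
  <node class="android.widget.LinearLayout" bounds="[0,0][1080,1920]">
    <node class="android.widget.TextView" bounds="[0,0][1080,200]" />
    <node class="android.widget.Button" bounds="[40,600][1040,720]" />
  </node>
</hierarchy>"""

print(leaf_rectangles(SAMPLE))  # [(0, 0, 1080, 200), (40, 600, 1040, 720)]
```

The container is skipped and only the two bottommost controls contribute rectangles.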
Each control in the interface occupies a rectangular region, and different controls draw different content (such as colors, text, or pictures) in rectangles of different sizes. The controls are the bottommost elements in the interface view hierarchy, and their rectangles are neither nested nor overlapping. We build an R-tree to store this rectangle information. An R-tree is a balanced multi-way tree for indexing multi-dimensional information; in the present invention the rectangle information is two-dimensional (width and height). The R-tree uses minimum bounding rectangles (MBRs) to group adjacent objects into nodes while keeping the tree balanced. The rectangle information of each control is inserted into the R-tree in turn, and the finally constructed R-tree serves as the control region distribution tree of the interface. Different insertion orders may produce different R-trees, but this does not affect the subsequent similarity calculation on the control region distribution trees.
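The insertion process can be sketched with a deliberately simplified R-tree. Everything here is an assumption made for illustration: the node capacity `MAX_ENTRIES`, the names, and in particular the split rule (sort along the longer axis and cut in half), which stands in for whatever split strategy an actual implementation would use. Rectangles are `(x1, y1, x2, y2)` tuples.

```python
MAX_ENTRIES = 4  # assumed node capacity


def mbr(rects):
    """Minimum bounding rectangle of a list of (x1, y1, x2, y2) tuples."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


class Node:
    def __init__(self, leaf, entries=None):
        self.leaf = leaf              # True: entries are rectangles; False: child Nodes
        self.entries = entries or []

    def box(self):
        rects = self.entries if self.leaf else [c.box() for c in self.entries]
        return mbr(rects)


def _split(node):
    """Split an overflowing node into two halves along its longer axis."""
    box = node.box()
    axis = 0 if (box[2] - box[0]) >= (box[3] - box[1]) else 1
    key = (lambda e: e[axis]) if node.leaf else (lambda e: e.box()[axis])
    ordered = sorted(node.entries, key=key)
    half = len(ordered) // 2
    return Node(node.leaf, ordered[:half]), Node(node.leaf, ordered[half:])


def _insert(node, rect):
    """Insert rect below node; return a pair of nodes if node had to be split."""
    if node.leaf:
        node.entries.append(rect)
    else:
        # Descend into the child whose MBR needs the least enlargement.
        child = min(node.entries,
                    key=lambda c: area(mbr([c.box(), rect])) - area(c.box()))
        split = _insert(child, rect)
        if split:
            node.entries.remove(child)
            node.entries.extend(split)
    return _split(node) if len(node.entries) > MAX_ENTRIES else None


def build_rtree(rects):
    root = Node(leaf=True)
    for rect in rects:
        split = _insert(root, rect)
        if split:  # the root overflowed: grow the tree by one level
            root = Node(leaf=False, entries=list(split))
    return root


def leaves(node):
    """All stored rectangles, i.e. the control regions."""
    if node.leaf:
        return list(node.entries)
    return [r for child in node.entries for r in leaves(child)]
```

Because splits only ever propagate upward and the tree grows at the root, all leaves stay at the same depth. Building the tree from the same rectangles in a different order may change the internal grouping, but `leaves` returns the same set of rectangles, which is consistent with the statement that insertion order does not affect the later similarity result.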
FIG. 3 shows an example interface, and FIG. 4 shows its view hierarchy, in which each node corresponds to a visible or invisible element of the interface (in general, an invisible element, such as a LinearLayout, organizes the layout of its child elements). We consider only the leaf node elements and add their rectangle information to the R-tree. FIG. 5 shows the rectangular regions corresponding to the controls in the interface and the MBR partition in the generated R-tree, and FIG. 6 shows the finally generated control region distribution tree.
Third, similarity calculation of control region distribution tree
Each node of a control region distribution tree contains the position and size of the rectangular region occupied by a control, and the similarity of two rectangular regions is defined using the Jaccard index. Let two rectangles $r_1$, $r_2$ have areas $s_1$, $s_2$ and overlapping area $s_o$. The similarity of the two rectangles is then:
$$sim(r_1, r_2) = \frac{s_o}{s_1 + s_2 - s_o}$$
Next, let the two control region distribution trees be $t_1$ and $t_2$. For each rectangle $r_i \in t_1$, compute:
$$sim_m(r_i) = \max_{r_j \in t_2} sim(r_i, r_j)$$
where $r_j$ ranges over the rectangles in $t_2$, i.e. $r_j \in t_2$.
Finally, the average of $sim_m$ over all $r_i \in t_1$ is taken as the similarity of the trees, i.e. the similarity of the corresponding interfaces.
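The calculation can be sketched as follows, under two assumptions: the rectangle similarity is overlap area divided by union area, and $sim_m$ is the best match of a rectangle of $t_1$ among the rectangles of $t_2$. For brevity each tree is represented by the flat list of its control rectangles `(x1, y1, x2, y2)` and the best match is found by brute force rather than by an R-tree search. The function names and the sample interfaces are hypothetical.

```python
def rect_similarity(a, b):
    """Overlap area divided by union area for two (x1, y1, x2, y2) rectangles."""
    overlap_w = min(a[2], b[2]) - max(a[0], b[0])
    overlap_h = min(a[3], b[3]) - max(a[1], b[1])
    s_o = max(0, overlap_w) * max(0, overlap_h)
    s_1 = (a[2] - a[0]) * (a[3] - a[1])
    s_2 = (b[2] - b[0]) * (b[3] - b[1])
    union = s_1 + s_2 - s_o
    return s_o / union if union > 0 else 0.0


def tree_similarity(t1, t2):
    """Average, over the rectangles of t1, of the best match found in t2."""
    if not t1 or not t2:
        return 0.0
    best = [max(rect_similarity(r_i, r_j) for r_j in t2) for r_i in t1]
    return sum(best) / len(best)


# Hypothetical interfaces: the second one has an extra banner added at the bottom.
original = [(0, 0, 100, 50), (0, 50, 100, 100)]
modified = [(0, 0, 100, 50), (0, 50, 100, 100), (0, 100, 100, 120)]

print(tree_similarity(original, modified))  # 1.0
print(tree_similarity(modified, original))  # about 0.667
```

As the two printed values show, the measure as sketched is not symmetric, because the average runs only over the rectangles of the first tree; which interface plays the role of $t_1$ is therefore a choice the caller has to make.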
It should be noted that terms such as "upper", "lower", "left", "right", "front", and "back" used in the present invention are for clarity of description only and are not intended to limit the implementable scope of the invention; changes or adjustments to their relative relationships, without substantive change to the technical content, are also regarded as within the implementable scope of the invention.
The above is only a preferred embodiment of the present invention. The protection scope of the invention is not limited to the above embodiment, and all technical solutions that follow the idea of the invention fall within its protection scope. It should be noted that, for those skilled in the art, modifications and refinements made without departing from the principle of the invention are also regarded as falling within the protection scope of the invention.

Claims (1)

1. A method for calculating the similarity of an android interface based on control region distribution characteristics is characterized by comprising the following steps:
step 1, dynamically executing the Android application and collecting user interface information; for the Android application runtime interfaces to be compared, obtaining structural information of the Android application user interface;
step 2, constructing a control region distribution tree from the user interface information; the control region distribution tree is a tree data structure, one application interface corresponds to one control region distribution tree, the nodes of the tree correspond one-to-one to the controls in the application interface, each node contains the region information of its control, and the region information of a control refers to the position and size of the rectangular region occupied by the control in the rendered interface; step 2 specifically comprises:
step 2.1, obtaining the view hierarchy information of the application interface layout;
step 2.2, traversing the rectangular region information of the bottommost controls in the view hierarchy one by one, in any order, and inserting these rectangular regions as nodes to build an R-tree;
step 2.3, taking the constructed R-tree as the control region distribution tree;
step 3, judging the similarity between application interfaces by comparing the similarity of their control region distribution trees;
step 3 specifically comprises:
step 3.1, each node of a control region distribution tree contains the position and size of the rectangular region occupied by a control, and the similarity of two rectangular regions is defined using the Jaccard index;
let two rectangles $r_1$, $r_2$ have areas $s_1$, $s_2$ and overlapping area $s_o$; the similarity of the two rectangles is then:
$$sim(r_1, r_2) = \frac{s_o}{s_1 + s_2 - s_o}$$
step 3.2, let the two control region distribution trees be $t_1$ and $t_2$; for each rectangle $r_i \in t_1$, compute:
$$sim_m(r_i) = \max_{r_j \in t_2} sim(r_i, r_j)$$
where $r_j$ ranges over the rectangles in $t_2$, i.e. $r_j \in t_2$;
finally, the average of $sim_m$ over all $r_i \in t_1$ is taken as the similarity of the trees, i.e. the similarity of the corresponding interfaces.
CN201810711378.0A 2018-07-02 2018-07-02 Android interface similarity calculation method based on control region distribution characteristics Active CN109190370B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810711378.0A CN109190370B (en) 2018-07-02 2018-07-02 Android interface similarity calculation method based on control region distribution characteristics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810711378.0A CN109190370B (en) 2018-07-02 2018-07-02 Android interface similarity calculation method based on control region distribution characteristics

Publications (2)

Publication Number Publication Date
CN109190370A CN109190370A (en) 2019-01-11
CN109190370B true CN109190370B (en) 2022-02-08

Family

ID=64948793

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810711378.0A Active CN109190370B (en) 2018-07-02 2018-07-02 Android interface similarity calculation method based on control region distribution characteristics

Country Status (1)

Country Link
CN (1) CN109190370B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant