CN111753322B - Automatic verification method and system for mobile App permission list - Google Patents

Publication number
CN111753322B
Authority
CN
China
Prior art keywords
mobile app
vector
actual
permission list
tested
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010635435.9A
Other languages
Chinese (zh)
Other versions
CN111753322A (en)
Inventor
王海洋
李雪梅
刘大伟
王丽萍
张旋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yantai Zhongke Data Technology Co ltd
Yantai Branch Institute Of Computing Technology Chinese Academy Of Science
Original Assignee
Yantai Zhongke Data Technology Co ltd
Yantai Branch Institute Of Computing Technology Chinese Academy Of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yantai Zhongke Data Technology Co ltd, Yantai Branch Institute Of Computing Technology Chinese Academy Of Science filed Critical Yantai Zhongke Data Technology Co ltd
Priority to CN202010635435.9A priority Critical patent/CN111753322B/en
Publication of CN111753322A publication Critical patent/CN111753322A/en
Application granted granted Critical
Publication of CN111753322B publication Critical patent/CN111753322B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/604 Tools and structures for managing or administering access control systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21 Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2141 Access rights, e.g. capability lists, access control lists, access tables, access matrices

Abstract

The invention discloses an automatic verification method for a mobile App permission list, comprising: S1, obtaining the actual permission list of a mobile App to be tested and converting it into vector form to obtain an actual permission list vector; S2, obtaining the privacy protocol of the mobile App to be tested, converting it into vector form through a trained deep learning classification model, and comparing the result with a set threshold to obtain a declared permission list vector; and S3, comparing whether the actual permission list vector of the mobile App to be tested is consistent with the declared permission list vector; if so, the mobile App is judged compliant, otherwise it is judged non-compliant. The method verifies a mobile App's permission list automatically and can determine whether the App illegally acquires users' personal information without anyone manually reading and checking the App's privacy protocol. The invention also discloses an automatic verification system for the mobile App permission list.

Description

Automatic verification method and system for mobile App permission list
Technical Field
The invention relates to the field of information security, in particular to a method and a system for automatically checking a mobile App permission list.
Background
The spread of smartphones has greatly facilitated people's lives, and a wide variety of Apps have emerged to enrich smartphone use and meet people's needs. When a legitimate mobile App collects personal information, it generally must state clearly, through a privacy policy and user agreement, the purpose, manner, and scope of collecting and using that information. Nevertheless, some businesses, in pursuit of commercial goals, still collect personal information illegally or request excessive permissions, which infringes users' rights and threatens their personal and property safety.
Currently, one item in the audit of a mobile App is to check whether the permissions the App declares are consistent with the permissions it actually requires, that is, whether the user permissions requested by the App match what its privacy protocol specifies. This work is done manually: an auditor reads the App's current permission list and then checks whether the privacy protocol contains a corresponding description. If it does not, the App is considered to be requesting permissions in violation of the rules. No automated auditing workflow exists at present.
By the end of February 2020, 3.52 million Apps were being monitored in the domestic Chinese market. Checking so many Apps manually involves a very large workload, consumes a great deal of time, and is inefficient, so manual checking is difficult to apply and deploy at scale.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: an automatic verification method for a mobile App permission list is provided.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows:
An automatic verification method for a mobile App permission list comprises the following steps:
S1, acquiring the actual permission list of the mobile App to be tested and expressing it as an actual permission list vector;
S2, acquiring the privacy protocol of the mobile App to be tested and expressing the privacy protocol text as a declared permission list vector through a trained deep learning classification model;
S3, comparing whether the actual permission list vector of the mobile App to be tested is consistent with the declared permission list vector; if so, the mobile App is judged compliant, otherwise it is judged non-compliant.
Compared with the prior art, the invention has the following technical effects:
The invention verifies the mobile App permission list automatically by means of a neural network algorithm. Once deployed, the algorithm can automatically detect and report abnormal cases, and can determine whether an App illegally acquires users' personal information without anyone manually reading and checking the App's privacy protocol.
On the basis of the technical scheme, the invention can be further improved as follows.
Further, in step S1 the actual permission list of the mobile App to be tested is obtained by decompiling the App installation package file with an Android decompilation tool to obtain the AndroidManifest.xml file.
The advantage of this further scheme is that the permission list is obtained completely, without omissions.
Preferably, the actual permission list vector is in one-hot form.
The advantage of this further scheme is that it facilitates subsequent processing.
Further, the trained deep learning classification model of step S2 is obtained by the following training method:
S2-1, acquiring training material: acquiring the actual permission list of a mobile App in the training set and expressing it as a one-hot vector, so that the actual permission list vector serves as the category label; and acquiring the privacy protocol text of the mobile App and expressing it in vector form with a pre-trained language model, to serve as the corpus to be classified;
S2-2, feeding the category label and the corpus to be classified obtained in step S2-1 into a neural network model as inputs;
S2-3, connecting the output of the penultimate layer of the neural network model to a fully connected layer, wherein the fully connected layer outputs the declared permission list vector, whose dimension equals that of the actual permission list vector;
S2-4, constructing a loss function representing the loss between the declared permission list vector output by the neural network model and the actual permission list vector of the mobile App;
S2-5, comparing that loss with a set threshold: if the loss is greater than the threshold, returning to step S2-1 and continuing the training iterations until the loss falls below the threshold or the specified number of iterations is reached; otherwise executing step S2-6;
S2-6, outputting the trained deep learning classification model.
The advantages of this further scheme are that semantic knowledge can be learned, the problem of word ambiguity can be handled, and accuracy is high. The probability that each permission is requested can be obtained from the privacy protocol text.
Preferably, the neural network model is trained by active learning. Specifically, correct answers are not supplied in advance for the mobile Apps in the training set; when the neural network model predicts that an App is non-compliant, a human reviewer checks whether that conclusion is correct, and if it is not, the predicted conclusion is corrected and the parameters of the neural network model are optimized.
The advantage of this further scheme is that no training data set mapping privacy protocols to the permissions they refer to has to be prepared in advance, and no large-scale manual labeling is required. Only a small number of cases need manual review during training, so a good classification effect is obtained with fewer training samples, classification efficiency is high, and labor cost is saved.
The invention also discloses an automatic verification system for the mobile App permission list, comprising a to-be-tested mobile App actual permission list acquisition module, a to-be-tested mobile App privacy protocol acquisition module, a deep learning classification model, and a permission consistency verification module;
The to-be-tested mobile App actual permission list acquisition module is used for acquiring the actual permission list of the mobile App to be tested and converting it into vector form to obtain the actual permission list vector;
The to-be-tested mobile App privacy protocol acquisition module is used for acquiring the privacy protocol of the mobile App to be tested;
The deep learning classification model is used for converting the privacy protocol of the mobile App to be tested into vector form to obtain the declared permission list vector;
The permission consistency verification module is used for comparing the actual permission list vector with the declared permission list vector; if they are consistent, it outputs that the mobile App to be tested is "compliant", and otherwise it outputs that the mobile App to be tested is "non-compliant".
On this basis, the invention can be further improved as follows:
Furthermore, the deep learning classification model comprises a material module, a pre-trained language model module, a neural network model module, a loss function module, a threshold setting and judging module, and a model output module;
The material module is used for acquiring the actual permission list of the mobile App to be tested, expressing it as a one-hot vector, and feeding the resulting actual permission list vector to the neural network model module as the category label; it is also used for acquiring the privacy protocol of the mobile App to be tested and feeding it to the pre-trained language model module;
The pre-trained language model module is used for expressing the privacy protocol of the mobile App to be tested in vector form and passing it to the neural network model module as the corpus to be classified;
The neural network model module learns autonomously from the category labels and the corpus to be classified and outputs a declared permission list vector, each dimension of which corresponds to the same dimension of the actual permission list vector;
The loss function module is used for computing the loss between the declared permission list vector and the actual permission list vector;
The threshold setting and judging module is used for setting a loss threshold and a training count threshold and for comparing the loss computed by the loss function module with the loss threshold; if the loss is greater than the loss threshold, iterative training continues until the loss falls below the loss threshold or the number of training iterations reaches the training count threshold;
The model output module is used for outputting the declared permission list vector.
Drawings
FIG. 1 is a schematic flow chart of an automatic verification method for a mobile App permission list according to the present invention;
FIG. 2 is a flow chart of the deep learning classification model training process of the present invention;
FIG. 3 is a relational diagram of key steps of an automatic verification method for a mobile App permission list in an embodiment;
FIG. 4 shows the 26 key permissions involved by the mobile App in the embodiment;
FIG. 5 is an example of the mobile App declaring permissions in a configuration file in an embodiment;
FIG. 6 is an example of the mobile App requesting permission in the code running phase in the embodiment;
FIG. 7 is a functional block diagram of a mobile App permission list auto-verification system of the present invention;
FIG. 8 is a functional block diagram of a deep learning classification model module in the mobile App privilege list automatic verification system of the present invention.
Detailed Description
The principles and features of this invention are described below in conjunction with the following drawings, which are set forth by way of illustration only and are not intended to limit the scope of the invention.
Android permissions fall into two groups: native Android operating system permissions, such as ACCESS_FINE_LOCATION and READ_CONTACTS, and custom permissions defined to suit additional launchers, such as use-permission, com. In the permission lists collected from 10,000 Apps, 2,068 custom permissions and 135 native Android system permissions have been counted so far.
The permission list in the invention comprises both custom permissions and native Android system permissions; for ease of understanding, native Android system permissions are taken as the example.
As shown in FIG. 4, the commonly used native Android operating system currently has 26 key permissions closely related to the collection and use of personal information, and the following embodiments use these 26 key permissions as the example.
First, a deep learning model based on text classification needs to be trained; refer to FIG. 2.
S2-1, acquiring the permission list of a specific mobile App in the training set and representing it as a one-hot vector to serve as the category label.
The permission list of a mobile App can be obtained from the configuration file by decompiling the installation package, and can also be obtained from the App source code.
The basis for acquisition is as follows: if an App needs to apply for system permissions for its service functions, the App developer must apply for them by declaring them explicitly (statically) in the AndroidManifest.xml file.
In Android development, the permissions used must be declared in AndroidManifest.xml.
The acquisition method is as follows: decompile the App's installation package file (APK) with an Android decompilation tool (such as apktool or aapt) to obtain the AndroidManifest.xml file.
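The static extraction step can be illustrated with a minimal Python sketch. It assumes the manifest has already been decoded to plain-text XML by a decompilation tool (the binary manifest inside an APK cannot be parsed this way), and the sample manifest, the package name com.example.demo, and the function name declared_permissions are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace bound to the "android:" attribute prefix in manifest files.
ANDROID_NS = "http://schemas.android.com/apk/res/android"


def declared_permissions(manifest_xml):
    """Return the android:name of every <uses-permission> element, in order."""
    root = ET.fromstring(manifest_xml)
    names = []
    for element in root.iter("uses-permission"):
        name = element.get("{" + ANDROID_NS + "}name")
        if name:
            names.append(name)
    return names


# Hypothetical decoded manifest fragment, used only for illustration.
SAMPLE_MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.demo">
    <uses-permission android:name="android.permission.CAMERA" />
    <uses-permission android:name="android.permission.READ_CONTACTS" />
    <application android:label="Demo" />
</manifest>"""

permissions = declared_permissions(SAMPLE_MANIFEST)
print(permissions)
```

The returned names can then be matched against the 26 key permissions to build the actual permission list vector.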
Furthermore, in Android 6.0 and above, permissions related to users' personal information must be requested dynamically in code. The permissions related to user information therefore also need to be extracted from the Java code.
The acquisition method is as follows: after obtaining the Java source code with an Android decompiler (unship, dex2jar, procyon-decompiler, etc.), match all Java files against the constant keywords of permission declarations to obtain all dynamically declared permissions.
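A keyword match over decompiled Java source might look like the following sketch. The regular expression, the function name requested_permissions, and the sample Java class are assumptions made for illustration; the patent does not specify the exact keywords it matches.

```python
import re

# Matches constants such as Manifest.permission.CAMERA and string literals
# such as "android.permission.CAMERA" (an assumed keyword set).
PERMISSION_PATTERN = re.compile(
    r"(?:Manifest\.permission\.|android\.permission\.)([A-Z][A-Z0-9_]*)"
)


def requested_permissions(java_source):
    """Collect fully qualified permission names mentioned in Java source."""
    return {
        "android.permission." + match.group(1)
        for match in PERMISSION_PATTERN.finditer(java_source)
    }


# Hypothetical decompiled source, used only for illustration.
SAMPLE_JAVA = """
import android.Manifest;

public class MainActivity extends Activity {
    void askForCamera() {
        requestPermissions(new String[] {Manifest.permission.CAMERA}, 1);
    }
    void askForLocation() {
        requestPermissions(
            new String[] {"android.permission.ACCESS_FINE_LOCATION"}, 2);
    }
}
"""

found = requested_permissions(SAMPLE_JAVA)
print(sorted(found))
```

In practice the function would be applied to every decompiled Java file and the results merged with the statically declared permissions.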
The permission list of the mobile App is represented as a one-hot vector serving as the category label, e.g., [1,0,1,0,……,0,1]. The first position corresponds to permission to read the calendar: 1 means the App has the permission and 0 means it does not. The second position corresponds to permission to edit the calendar, again with 1 for present and 0 for absent, and so on, giving a vector of dimension 26.
The privacy protocol of the mobile App is then obtained, and its text is represented in vector form by a pre-trained language model (for example, fastText, ELMo, GPT, or BERT) to serve as the corpus to be classified. The output dimension is maxlen, which can be set to 128, 256, 512, 768, and so on as required; 512 is preferred in this embodiment.
S2-2, feeding the category label and the corpus to be classified obtained in the previous step into a neural network model as inputs;
Taking BERT as an example, BERT comes in two versions, a 12-layer Transformer and a 24-layer Transformer. In theory the output of any Transformer layer can be used as a sentence vector. Experimental data show that the penultimate layer gives the best result: the last layer's values are too close to the training target, while the earlier layers have not yet fully learned the semantics. The penultimate layer of the pre-trained language model's neural network is therefore the one connected to the fully connected layer.
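The idea of taking the penultimate layer's output and feeding it to a fully connected layer can be sketched in plain Python. This is only a toy illustration: the per-layer outputs are random stand-ins rather than real BERT activations, the hidden size is reduced to 8, and the sigmoid activation is an assumption chosen so that each output reads as a probability.

```python
import math
import random


def fully_connected(features, weights, biases):
    """Dense layer followed by a sigmoid: one probability per output unit."""
    outputs = []
    for row, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, features)) + bias
        outputs.append(1.0 / (1.0 + math.exp(-z)))
    return outputs


rng = random.Random(0)
HIDDEN_SIZE = 8         # toy size; the embodiment prefers 512
NUM_PERMISSIONS = 26    # one output per key permission
NUM_LAYERS = 12         # depth of the smaller BERT version

# Stand-in for the output of each encoder layer (hypothetical values).
layer_outputs = [
    [rng.uniform(-1.0, 1.0) for _ in range(HIDDEN_SIZE)]
    for _ in range(NUM_LAYERS)
]
penultimate = layer_outputs[-2]  # second-to-last layer, as the text selects

# Randomly initialised layer parameters; training would adjust these.
weights = [
    [rng.uniform(-0.1, 0.1) for _ in range(HIDDEN_SIZE)]
    for _ in range(NUM_PERMISSIONS)
]
biases = [0.0] * NUM_PERMISSIONS

probabilities = fully_connected(penultimate, weights, biases)
print(len(probabilities))
```

With an actual pre-trained model, `layer_outputs` would be replaced by the encoder's per-layer hidden states for the privacy protocol text.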
S2-3, connecting the output of the penultimate layer of the neural network model to a fully connected layer whose output dimension equals the dimension of the permission list vector;
S2-4, constructing a loss function to represent the loss between the declared permission list vector output by the neural network model and the actual permission list vector. The loss function can be chosen to suit the practical problem, for example a mean square error loss function or a cross-entropy loss function. This scheme adopts a variance loss function, defining the loss of the neural network as the Euclidean distance between the permission vector output by the neural network and the actual permission vector;
S2-5, comparing the difference between the declared permission list vector output by the neural network model and the actual permission list vector with a set threshold: if the difference is greater than the threshold, returning to step S2-1 and continuing the training iterations until the difference falls below the threshold or the specified number of iterations is reached; otherwise executing step S2-6;
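The loss of step S2-4 and the stopping rule of step S2-5 can be sketched as follows. This is a toy under stated assumptions: a single sigmoid layer stands in for the full network, plain gradient descent on the squared error is used as the optimizer, and the names train, loss_threshold, max_iterations, and lr as well as the toy data are hypothetical.

```python
import math


def euclidean_loss(predicted, actual):
    """Euclidean distance between predicted and actual permission vectors."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)))


def train(features, target, loss_threshold=0.1, max_iterations=5000, lr=0.5):
    """Fit one sigmoid layer; stop when the loss is below the threshold
    or when the iteration budget is used up."""
    weights = [[0.0] * len(features) for _ in target]
    biases = [0.0] * len(target)
    loss = float("inf")
    iterations = 0
    while iterations < max_iterations:
        outputs = []
        for row, bias in zip(weights, biases):
            z = sum(w * x for w, x in zip(row, features)) + bias
            outputs.append(1.0 / (1.0 + math.exp(-z)))
        loss = euclidean_loss(outputs, target)
        if loss < loss_threshold:
            break
        # Gradient step on the squared error, whose minimiser also
        # minimises the Euclidean distance.
        for j, (out, want) in enumerate(zip(outputs, target)):
            grad = 2.0 * (out - want) * out * (1.0 - out)
            biases[j] -= lr * grad
            for i, x in enumerate(features):
                weights[j][i] -= lr * grad * x
        iterations += 1
    return weights, biases, loss, iterations


toy_features = [0.5, -0.2, 0.8]      # hypothetical text embedding
toy_target = [1.0, 0.0, 1.0, 0.0]    # hypothetical actual permission vector
_, _, final_loss, used = train(toy_features, toy_target)
print(final_loss < 0.1, used)
```

Lowering `max_iterations` shows the second exit condition: training stops at the budget even though the loss is still above the threshold.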
Either active learning or passive learning can be chosen for training the neural network. For passive learning to achieve a good classification effect, however, training samples must be labeled on a large scale, and in practice the samples have to be labeled manually by domain experts, which consumes a great deal of labor and time.
If passive learning were chosen for the neural network in this embodiment, a large amount of labor would be consumed, because non-compliant mobile Apps are relatively rare and the permission information has to be labeled from obscure privacy protocols. To obtain a good classification effect with fewer training samples, this scheme therefore adopts active learning: the training set is not manually labeled in advance; instead, the neural network itself selects the data to be learned, labels it, and gives a judgment. If an App is judged non-compliant, a human reviewer checks whether it really is non-compliant, and the result is fed back to the neural network to further optimize its adjustable parameters. In this way the neural network can be trained with only a small number of manually reviewed samples.
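One round of this human-in-the-loop procedure might be organized as in the sketch below. It is an interpretation, not the patented procedure: the function name active_learning_round, the callback signatures, and the toy Apps are hypothetical, and the sketch assumes that when a reviewer rejects a "non-compliant" verdict, the actual permission vector is the correct label for that App.

```python
def active_learning_round(apps, predict, actual_vectors, review):
    """Send predicted non-compliant Apps to a reviewer and collect the
    corrected labels that should be used for the next parameter update."""
    corrections = []
    for app in apps:
        declared = predict(app)
        if declared == actual_vectors[app]:
            continue  # predicted compliant: no manual review needed
        verdict_is_right = review(app, declared)
        if not verdict_is_right:
            # The reviewer found the App compliant, so the policy does
            # declare the actual permissions; use them as the label.
            corrections.append((app, actual_vectors[app]))
    return corrections


# Hypothetical toy data: three Apps with three-dimensional vectors.
actual = {"app_a": [1, 0, 1], "app_b": [1, 1, 0], "app_c": [0, 0, 1]}
model_output = {"app_a": [1, 0, 1], "app_b": [1, 0, 0], "app_c": [1, 0, 1]}
truly_non_compliant = {"app_c"}  # what the hypothetical reviewer finds

corrections = active_learning_round(
    apps=["app_a", "app_b", "app_c"],
    predict=lambda app: model_output[app],
    actual_vectors=actual,
    review=lambda app, declared: app in truly_non_compliant,
)
print(corrections)
```

Only app_b produces a correction here: app_a is predicted compliant and skipped, and the reviewer confirms that app_c really is non-compliant.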
And S2-6, outputting the trained deep learning classification model.
This yields a classification model that classifies mobile App privacy protocol text. As shown in FIG. 4, the model in this embodiment has 26 classes: the privacy protocol text is the input and a 26-dimensional vector is the output, each dimension representing the probability that a specific category of the user's personal information is collected, as shown in FIG. 3. A threshold is then set, and the probability in each dimension is compared with the threshold to obtain the declared permission list vector corresponding to the privacy protocol text.
After the trained deep learning classification model is obtained, authority verification can be automatically carried out on massive mobile apps. As shown in fig. 1, the specific steps are as follows:
S1, acquiring the actual permission list of the mobile App to be tested and expressing it in vector form to obtain the actual permission list vector;
S2, acquiring the privacy protocol of the mobile App to be tested and expressing the privacy protocol text in vector form through the trained deep learning classification model;
S3, comparing whether the actual permission list vector of the mobile App to be tested is consistent with the declared permission list vector; if so, the mobile App is judged compliant, otherwise it is judged non-compliant.
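The three steps compose into a small pipeline, sketched below. The function verify_app and its two injected callables are hypothetical: in a real deployment the first would wrap the decompilation step and the second would wrap the trained classifier together with its threshold.

```python
def verify_app(apk_path, policy_text, extract_actual, declare_from_policy):
    """Run steps S1 to S3 and return 'compliant' or 'non-compliant'."""
    actual = extract_actual(apk_path)             # S1: actual vector
    declared = declare_from_policy(policy_text)   # S2: declared vector
    return "compliant" if actual == declared else "non-compliant"  # S3


# Hypothetical stand-ins for the decompiler and the trained classifier.
def fake_extract(apk_path):
    return [1, 1, 0, 1]


def fake_declare(policy_text):
    return [1, 1, 0, 1] if "location" in policy_text else [1, 1, 0, 0]


result_ok = verify_app(
    "demo.apk", "we collect location data", fake_extract, fake_declare)
result_bad = verify_app(
    "demo.apk", "we collect nothing", fake_extract, fake_declare)
print(result_ok, result_bad)
```

Passing the two stages in as callables keeps the comparison logic independent of any particular decompiler or model.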
Taking the mobile App "Today's Headlines" as an example, decompiling its installation file yields the following actually required permissions:
android.permission.READ_CALENDAR 1
android.permission.WRITE_CALENDAR 1
android.permission.READ_CALL_LOG 0
android.permission.WRITE_CALL_LOG 0
android.permission.PROCESS_OUTGOING_CALLS 0
android.permission.CAMERA 1
android.permission.READ_CONTACTS 1
android.permission.WRITE_CONTACTS 0
android.permission.MANAGE_ACCOUNTS 0
android.permission.ACCESS_FINE_LOCATION 1
android.permission.ACCESS_COARSE_LOCATION 1
android.permission.RECORD_AUDIO 0
android.permission.READ_PHONE_STATE 1
android.permission.TELEPHONY_SERVICE 0
android.permission.CALL_PRIVILEGED 0
android.permission.MODIFY_PHONE_STATE 0
android.permission.ADD_VOICEMAIL 0
android.permission.USE_SIP 0
android.permission.SENSOR_INFO 1
android.permission.SEND_SMS 0
android.permission.RECEIVE_SMS 0
android.permission.READ_SMS 0
android.permission.RECEIVE_WAP_PUSH 0
android.permission.RECEIVE_MMS 0
android.permission.READ_EXTERNAL_STORAGE 1
android.permission.WRITE_EXTERNAL_STORAGE 1
Representing the permission list as a one-hot vector gives the 26-dimensional actual permission list vector of "Today's Headlines":
[1,1,0,0,0,1,1,0,0,1,1,0,1,0,0,0,0,0,1,0,0,0,0,0,1,1]
wherein:
The "1" in the first position indicates that the App has the android.permission.READ_CALENDAR permission.
The "1" in the second position indicates that the App has the android.permission.WRITE_CALENDAR permission.
The "0" in the third position indicates that the App does not have the android.permission.READ_CALL_LOG permission.
And so on.
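The encoding can be reproduced with a short Python sketch that uses the 26 permissions in the order listed above. The names KEY_PERMISSIONS and encode are hypothetical; the granted set is taken from the rows marked 1 in the list.

```python
PREFIX = "android.permission."

# The 26 key permissions, in the order of the list above.
KEY_PERMISSIONS = [PREFIX + name for name in (
    "READ_CALENDAR", "WRITE_CALENDAR", "READ_CALL_LOG", "WRITE_CALL_LOG",
    "PROCESS_OUTGOING_CALLS", "CAMERA", "READ_CONTACTS", "WRITE_CONTACTS",
    "MANAGE_ACCOUNTS", "ACCESS_FINE_LOCATION", "ACCESS_COARSE_LOCATION",
    "RECORD_AUDIO", "READ_PHONE_STATE", "TELEPHONY_SERVICE",
    "CALL_PRIVILEGED", "MODIFY_PHONE_STATE", "ADD_VOICEMAIL", "USE_SIP",
    "SENSOR_INFO", "SEND_SMS", "RECEIVE_SMS", "READ_SMS",
    "RECEIVE_WAP_PUSH", "RECEIVE_MMS", "READ_EXTERNAL_STORAGE",
    "WRITE_EXTERNAL_STORAGE",
)]


def encode(granted):
    """Map a set of permission names to a 26-dimensional 0/1 vector."""
    return [1 if permission in granted else 0 for permission in KEY_PERMISSIONS]


# Permissions marked 1 in the example list.
granted = {PREFIX + name for name in (
    "READ_CALENDAR", "WRITE_CALENDAR", "CAMERA", "READ_CONTACTS",
    "ACCESS_FINE_LOCATION", "ACCESS_COARSE_LOCATION", "READ_PHONE_STATE",
    "SENSOR_INFO", "READ_EXTERNAL_STORAGE", "WRITE_EXTERNAL_STORAGE",
)}

actual_vector = encode(granted)
print(actual_vector)
```

Because the position of each permission is fixed by KEY_PERMISSIONS, vectors from different Apps are directly comparable element by element.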
The privacy protocol of "Today's Headlines" is then obtained, as follows (excerpt):
Today's Headlines Privacy Policy
Update date: December 23, 2019
Effective date: December 30, 2019
Today's Headlines (hereinafter "we") is deeply aware of the importance of personal information to you and will protect your personal information and privacy in accordance with laws and regulations. We have formulated this "Privacy Policy" and specifically remind you: please read and understand this Privacy Policy carefully before using Today's Headlines and related services, so that you can make appropriate choices.
This Privacy Policy will help you understand:
We will collect and use your information in accordance with the Privacy Policy, but will not collect personal information through forced bundling merely because you have consented to this Privacy Policy.
When you use or enable a function or service, we will collect and use the information necessary to provide that function or service. Unless the information is necessary for basic business functions or required by laws and regulations, you may refuse to provide it without affecting other functions or services. We will explain item by item in the Privacy Policy which information is necessary.
…………
We want you to understand in detail how we collect, use, store, transmit, share, transfer (where applicable), and protect personal information, and how you can query, access, delete, and correct your personal information and withdraw authorization. Important terms concerning your personal information rights are shown in bold; please pay special attention to them.
1. How we collect and use personal information
2. How we use cookies and similar technologies
3. How we share, transfer, and publicly disclose personal information
4. How we store personal information
5. How we secure personal information
6. Managing your personal information
7. Terms of use for minors
8. Revision and notification of the Privacy Policy
9. Contact us
1. How we collect and use personal information
We will collect the information that you actively provide when using the services, and the information generated while you use functions or receive services, which is collected automatically, as follows:
1.1 Registration, login, and authentication
1.1.1 Registration and login
a. When you register for or log in to Today's Headlines and related services, you can create an account with a mobile phone number and complete the related network identification information (avatar, nickname, and password); this information is collected to help you complete registration. You may also choose to fill in your gender, birthday, region, and personal introduction to complete your profile.
…………
1.6.2 Device information and log information
a. To ensure the security, operating quality, and efficiency of the software and services, we collect the hardware model, operating system version number, international mobile equipment identity, unique device identifier, network device hardware address, IP address, WLAN access point, Bluetooth, base station, software version number, network access mode, type, and status, network quality data, and operation, usage, and service logs.
…………
g. Payment function: the payment function is provided to you by third-party payment institutions that cooperate with us. A third-party payment institution may need to collect your name, bank card type and card number, expiration date, and mobile phone number. The bank card number, expiration date, and mobile phone number are sensitive personal information necessary for the payment function; refusing to provide them will prevent you from using this function but will not affect the normal use of other functions.
…………
6.2.2 Changing or revoking sensitive permission settings
a. You can turn off the GPS location, camera, microphone, and photo album permissions in the device's operating system to change the scope of your consent or withdraw your authorization. After authorization is withdrawn, we will no longer collect information related to these permissions.
…………
9. Contact us
…………
b. If you have any questions, comments, or suggestions about this Privacy Policy, you can contact us through the "User Feedback" page in the "Today's Headlines" client or through the "Contact Us - User Feedback" entry on the official website home page.
As can be seen, real privacy protocol text is obscure and lengthy, and reading it manually is a very large workload. The pre-trained language model expresses the privacy protocol as a 512-dimensional vector to serve as the corpus to be classified, as follows:
array([[0.8558377,0.86575764,0.13302976,-0.3925884,-0.04717652,-0.9750702,0.972964,-0.8903629,-0.9929771,…………-0.37429714]],dtype=float32)
and then, inputting the corpus to be classified into a trained deep learning classification model to obtain an intermediate vector with the same dimension as the actual authority list vector, wherein the intermediate vector has the following form:
[0.9,0.8,0.2,0.3,0.4,0.9,0.8,0.2,0.4,0.8,0.9,0.3,0.8,0.1,0.3,0.2,0.4,0.4,0.8,0.1,0.2,0.3,0.1,0.2,0.8,0.9] which is the same 26 dimensions as the actual rights list vector, each dimension representing the probability of having a corresponding right, as in this example,
0.9 located at the first bit indicates that the probability of having the right of "android.
0.8 located at the second bit indicates that the probability of having the right of "android.
And so on for the remaining dimensions. A threshold is then set, for example 80%, and each element of the intermediate vector is compared with it: an element is marked as 1 if it is greater than the threshold and as 0 if it is less than or equal to the threshold. The intermediate vector is then expressed as:
[1,1,0,0,0,1,1,0,0,1,1,0,1,0,0,0,0,0,1,0,0,0,0,0,1,1]
The vector obtained by comparing the intermediate vector with the threshold is the declared permission list vector of the "today's headlines" App.
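As an illustration of the thresholding step, the following minimal Python sketch converts the intermediate vector of this example into a 0/1 vector. The function name `binarize` is hypothetical. The example result marks elements equal to 0.8 as 1 at an 80% threshold, so the sketch assumes a greater-than-or-equal comparison; a strictly-greater comparison would mark those elements as 0.

```python
from typing import List


def binarize(probabilities: List[float], threshold: float = 0.8) -> List[int]:
    """Mark each per-permission probability as 1 (declared) or 0 (not declared)."""
    # Assumption: ">=" is used so that 0.8 maps to 1, as in the worked example.
    return [1 if p >= threshold else 0 for p in probabilities]


# Intermediate vector taken from the example in the description (26 dimensions).
intermediate = [0.9, 0.8, 0.2, 0.3, 0.4, 0.9, 0.8, 0.2, 0.4, 0.8, 0.9, 0.3, 0.8,
                0.1, 0.3, 0.2, 0.4, 0.4, 0.8, 0.1, 0.2, 0.3, 0.1, 0.2, 0.8, 0.9]

declared = binarize(intermediate)
print(declared)
```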
The actual permission list vector is then compared with the declared permission list vector. If they differ in any dimension, the App is determined to be non-compliant, the conclusion is output, and manual intervention is used to check it.
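The consistency comparison can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the function name `check_compliance` and the short example vectors are hypothetical, and returning the positions that differ is an added convenience intended to assist the manual check.

```python
from typing import List, Tuple


def check_compliance(actual: List[int], declared: List[int]) -> Tuple[bool, List[int]]:
    """Return (is_compliant, indices of the dimensions that differ)."""
    if len(actual) != len(declared):
        raise ValueError("both vectors must have the same dimension")
    mismatches = [i for i, (a, d) in enumerate(zip(actual, declared)) if a != d]
    # The App is compliant only when every dimension matches.
    return (len(mismatches) == 0, mismatches)


# Hypothetical 5-dimensional vectors, for illustration only.
actual_vector = [1, 1, 0, 1, 0]
declared_vector = [1, 0, 0, 1, 0]

compliant, differing = check_compliance(actual_vector, declared_vector)
print("compliant" if compliant else "non-compliant", differing)
```

Here the permission at index 1 is used by the App but not declared in the privacy protocol, so the result is non-compliant.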
The invention also discloses an automatic verification system for the mobile App permission list, which comprises a to-be-tested mobile App actual permission list acquisition module 1, a to-be-tested mobile App privacy protocol acquisition module 2, a deep learning classification model 3 and a permission consistency verification module 4, as shown in FIG. 7;
the actual permission list acquisition module of the mobile App to be tested is used for acquiring an actual permission list of the mobile App to be tested, converting the actual permission list into a vector form and obtaining an actual permission list vector;
the to-be-tested mobile App privacy protocol acquisition module is used for acquiring a to-be-tested mobile App privacy protocol;
the deep learning classification model is used for converting the privacy protocol of the mobile App to be tested into vector form to obtain a declared permission list vector;
and the permission consistency checking module is used for comparing the actual permission list vector with the declared permission list vector; if they are consistent, it outputs that the mobile App to be tested is "compliant", and otherwise it outputs that the mobile App to be tested is "non-compliant".
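Claim 1 specifies that the actual permission list is read from the decompiled AndroidManifest.xml file and/or matched in the Java source code by permission constant keywords. The following Python sketch illustrates that acquisition step and the conversion to a 0/1 vector. It assumes the manifest has already been decoded to plain XML text by a decompiling tool; the four-entry `PERMISSION_VOCAB`, the regular expression, and all function names are hypothetical choices for illustration (the example in the description uses 26 dimensions).

```python
import re
import xml.etree.ElementTree as ET
from typing import List, Set

# Attribute key for android:name as seen by ElementTree (namespace in braces).
ANDROID_NS = "http://schemas.android.com/apk/res/android"
NAME_ATTR = "{" + ANDROID_NS + "}name"

# Hypothetical, shortened permission vocabulary that fixes the vector order.
PERMISSION_VOCAB = [
    "android.permission.CAMERA",
    "android.permission.RECORD_AUDIO",
    "android.permission.ACCESS_FINE_LOCATION",
    "android.permission.READ_CONTACTS",
]

# Assumption: permissions appear in source code as "android.permission.X" strings.
PERMISSION_PATTERN = re.compile(r"android\.permission\.[A-Z0-9_]+")


def permissions_from_manifest(manifest_xml: str) -> Set[str]:
    """Collect android:name of every <uses-permission> element."""
    root = ET.fromstring(manifest_xml)
    return {el.attrib[NAME_ATTR] for el in root.iter("uses-permission")
            if NAME_ATTR in el.attrib}


def permissions_from_source(java_source: str) -> Set[str]:
    """Collect permission constants that occur in decompiled Java source text."""
    return set(PERMISSION_PATTERN.findall(java_source))


def to_vector(permissions: Set[str], vocab: List[str] = PERMISSION_VOCAB) -> List[int]:
    """Encode the permission set as a 0/1 vector in vocabulary order."""
    return [1 if name in permissions else 0 for name in vocab]


manifest = (
    '<manifest xmlns:android="http://schemas.android.com/apk/res/android">'
    '<uses-permission android:name="android.permission.CAMERA"/>'
    '<uses-permission android:name="android.permission.ACCESS_FINE_LOCATION"/>'
    '</manifest>'
)
source = 'requestPermissions(new String[]{"android.permission.RECORD_AUDIO"}, 1);'

actual_permissions = permissions_from_manifest(manifest) | permissions_from_source(source)
actual_vector = to_vector(actual_permissions)
print(actual_vector)
```

Taking the union of the two sources reflects the "and/or" wording of the claim; an implementation could equally use only one of them.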
As shown in FIG. 8, the deep learning classification model includes a material module 2-1, a pre-training language model module 2-2, a neural network model module 2-3, a loss function module 2-4, a threshold setting and judging module 2-5 and a model output module 2-6,
the material module is used for acquiring the actual permission list of the mobile App to be tested, expressing the actual permission list as a one-hot vector to obtain an actual permission list vector, and inputting the actual permission list vector into the neural network model module as a category label; it is also used for acquiring the privacy protocol of the mobile App to be tested and inputting the privacy protocol into the pre-training language model module;
the pre-training language model module is used for expressing the privacy protocol of the mobile App to be tested in vector form and passing the vector to the neural network model module as the corpus to be classified;
the neural network model module learns autonomously from the category labels and the corpus to be classified and outputs a declared permission list vector, each dimension of which corresponds to the same dimension of the actual permission list vector;
the loss function module is used for obtaining the loss between the declared permission list vector and the actual permission list vector;
the threshold setting and judging module is used for setting a loss threshold and a training iteration threshold and for comparing the loss obtained by the loss function module with the set loss threshold; if the loss is larger than the set loss threshold, iterative training continues until the loss is smaller than the loss threshold or the number of training iterations reaches the training iteration threshold;
the model output module is used for outputting the declared permission list vector.
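The stopping rule of the threshold setting and judging module (stop when the loss falls below the loss threshold or when the iteration threshold is reached) can be sketched with a toy training loop. Everything below is an assumption made for illustration: a single fully connected layer with sigmoid outputs stands in for the neural network model, binary cross-entropy stands in for the loss function, and the three-dimensional inputs stand in for the 512-dimensional corpus vectors.

```python
import math
import random
from typing import List, Tuple


def forward(weights: List[List[float]], bias: List[float], x: List[float]) -> List[float]:
    """Fully connected layer with sigmoid: one probability per permission."""
    logits = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def bce(pred: List[float], label: List[int]) -> float:
    """Mean binary cross-entropy between predicted and actual vectors."""
    eps = 1e-9
    return -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                for p, y in zip(pred, label)) / len(label)


def train(samples: List[List[float]], labels: List[List[int]],
          loss_threshold: float = 0.05, max_iterations: int = 5000,
          lr: float = 0.5, seed: int = 0) -> Tuple[List[List[float]], List[float], float, int]:
    rng = random.Random(seed)
    n_in, n_out = len(samples[0]), len(labels[0])
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    loss, iteration = float("inf"), 0
    for iteration in range(1, max_iterations + 1):
        total = 0.0
        for x, y in zip(samples, labels):
            pred = forward(weights, bias, x)
            total += bce(pred, y)  # loss measured before this sample's update
            for j in range(n_out):
                grad = pred[j] - y[j]  # BCE gradient w.r.t. logit j (1/n_out folded into lr)
                for i in range(n_in):
                    weights[j][i] -= lr * grad * x[i]
                bias[j] -= lr * grad
        loss = total / len(samples)
        if loss < loss_threshold:  # loss threshold reached: stop training
            break
    return weights, bias, loss, iteration  # otherwise stops at max_iterations


# Hypothetical toy data: 3 Apps, 3-dimensional text vectors, 2 permissions.
samples = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
labels = [[1, 0], [0, 1], [1, 1]]

weights, bias, final_loss, iterations = train(samples, labels)
print(final_loss, iterations)
```

The returned iteration count shows which of the two stopping conditions ended training: a value below `max_iterations` means the loss threshold was reached first.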
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (3)

1. An automatic verification method for a mobile App permission list is characterized by comprising the following steps:
S1, acquiring an actual permission list of the mobile App to be tested, and converting the actual permission list into one-hot vector form to obtain an actual permission list vector; the actual permission list of the mobile App to be tested is obtained by decompiling the installation package file of the App with an Android decompiling tool to obtain the AndroidManifest.xml file and/or the Java source code of the program, and then obtaining the permission list recorded in the AndroidManifest.xml file and/or matching the Java source code against the constant keywords of permission declarations to obtain all permissions dynamically declared in the code;
S2, acquiring a privacy protocol of the mobile App to be tested, and inputting the privacy protocol into a trained deep learning classification model to obtain a declared permission list vector, wherein the deep learning classification model is obtained through the following training steps:
S2-1, obtaining a plurality of mobile Apps for training; for any one of the mobile Apps, obtaining its actual permission list and converting the actual permission list into a one-hot vector to obtain an actual permission list vector, which is used as a category label; and obtaining the privacy protocol text of the mobile App and converting the privacy protocol text into vector form with a pre-trained language model to serve as a corpus to be classified;
S2-2, inputting the category label and the corpus to be classified together into a neural network model as input items;
S2-3, connecting the output of the penultimate layer of the neural network model to a fully connected layer, wherein the fully connected layer outputs a declared permission list vector whose dimension is equal to that of the actual permission list vector;
S2-4, constructing a loss function representing the loss between the declared permission list vector and the actual permission list vector;
S2-5, comparing the loss between the declared permission list vector and the actual permission list vector with a set threshold; if the loss is less than or equal to the set threshold, finishing the training and executing step S2-6; if the loss is greater than the set threshold, returning to step S2-1 to continue the training iteration until the loss is less than or equal to the set threshold or the specified number of iterations is completed, and then executing step S2-6;
S2-6, using the trained neural network model as the deep learning classification model;
S3, comparing whether the actual permission list vector and the declared permission list vector of the mobile App to be tested are consistent; if so, judging that the mobile App to be tested is compliant; otherwise, judging that the mobile App to be tested is non-compliant.
2. The automatic verification method for a mobile App permission list according to claim 1, wherein the pre-trained language model is one of fastText, ELMo, GPT, and BERT.
3. The automatic verification method for a mobile App permission list according to claim 1, wherein the training process of the trained deep learning classification model is an active learning method; specifically, the mobile Apps in the training set are not labeled in advance; when the neural network model pre-judges an App as non-compliant, manual intervention checks whether the non-compliance conclusion is correct; if the conclusion is incorrect, the pre-judged conclusion is manually corrected, and the neural network model guides the training of the classification model to optimize its parameters based on the manually corrected result.
CN202010635435.9A 2020-07-03 2020-07-03 Automatic verification method and system for mobile App permission list Active CN111753322B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010635435.9A CN111753322B (en) 2020-07-03 2020-07-03 Automatic verification method and system for mobile App permission list

Publications (2)

Publication Number Publication Date
CN111753322A CN111753322A (en) 2020-10-09
CN111753322B true CN111753322B (en) 2021-10-01

Family

ID=72680414

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010635435.9A Active CN111753322B (en) 2020-07-03 2020-07-03 Automatic verification method and system for mobile App permission list

Country Status (1)

Country Link
CN (1) CN111753322B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112199506B (en) * 2020-11-10 2021-08-24 支付宝(杭州)信息技术有限公司 Information detection method, device and equipment for application program
CN112257114A (en) * 2020-12-02 2021-01-22 支付宝(杭州)信息技术有限公司 Application privacy compliance detection method, device, equipment and medium
CN113051613A (en) * 2021-03-15 2021-06-29 Oppo广东移动通信有限公司 Privacy policy detection method and device, electronic equipment and readable storage medium
CN113139186A (en) * 2021-04-14 2021-07-20 北京开元华创信息技术有限公司 Personal information security audit evaluation system
CN113282748B (en) * 2021-04-29 2023-05-12 湘潭大学 Automatic detection method for privacy text based on transformer
CN113343219B (en) * 2021-05-31 2023-03-07 烟台中科网络技术研究所 Automatic and efficient high-risk mobile application program detection method
CN113688033A (en) * 2021-07-20 2021-11-23 荣耀终端有限公司 Privacy compliance detection method and computer readable storage medium
CN113849852A (en) * 2021-08-27 2021-12-28 杭州逗酷软件科技有限公司 Privacy authority detection method and device, electronic equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109284370B (en) * 2018-08-20 2022-05-06 中山大学 Mobile application description and permission fidelity determination method and device based on deep learning
CN109639884A (en) * 2018-11-21 2019-04-16 惠州Tcl移动通信有限公司 A kind of method, storage medium and terminal device based on Android monitoring sensitive permission
CN110162963B (en) * 2019-04-26 2021-07-06 佛山市微风科技有限公司 Method for identifying over-right application program

Also Published As

Publication number Publication date
CN111753322A (en) 2020-10-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant