CN110826006B - Abnormal collection behavior identification method and device based on privacy data protection - Google Patents

Info

Publication number
CN110826006B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911158814.7A
Other languages
Chinese (zh)
Other versions
CN110826006A (en)
Inventor
徐文浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Application filed by Alipay Hangzhou Information Technology Co Ltd
Priority to CN201911158814.7A
Publication of CN110826006A
Priority to TW109115226A (TWI743773B)
Priority to PCT/CN2020/111725 (WO2021098327A1)
Application granted
Publication of CN110826006B
Current legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/958: Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60: Protecting data
    • G06F21/62: Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218: Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245: Protecting personal data, e.g. for financial or medical purposes

Abstract

Embodiments of this specification disclose a method and a device for identifying abnormal collection behavior based on privacy data protection, a method and a device for training a scene classification model, and an electronic device. The identification method comprises the following steps: acquiring page content data and user behavior data of a target lightweight application, together with the private data list collected by the target lightweight application; taking the page content data and the user behavior data of the target lightweight application as input of a scene classification model, so as to predict a usage scene category of the target lightweight application through the scene classification model; and determining whether the target lightweight application exhibits abnormal collection behavior based on the collectable private data list corresponding to the usage scene category of the target lightweight application and the private data list that the target lightweight application applies to collect.

Description

Abnormal collection behavior identification method and device based on privacy data protection
Technical Field
The present disclosure relates to the field of computer software technologies, and in particular, to a method and an apparatus for identifying abnormal collection behavior based on privacy data protection, and an electronic device.
Background
With the rapid development of mobile internet technology, application programs are used more and more widely. Lightweight applications such as applets are also increasingly popular, because they can be embedded in third-party application programs, need not be downloaded or installed, and are available at any time. However, existing applets often collect the user's private data when they are opened, and some applets collect that data excessively.
At present, operators typically have to determine manually whether an applet excessively collects user private data, and only after a user has reported the applet or after a system has flagged it as having abnormal collection behavior. A method for identifying abnormal collection behavior in lightweight applications such as applets is therefore needed to address these problems of the prior art.
Disclosure of Invention
Embodiments of this specification aim to provide a method and a device for identifying abnormal collection behavior based on privacy data protection, a method and a device for training a scene classification model, and an electronic device, so as to prevent lightweight applications such as applets from excessively collecting users' private data.
In order to solve the above technical problem, the embodiments of the present specification are implemented as follows:
In a first aspect, a method for identifying abnormal collection behavior based on privacy data protection is provided, which includes:
acquiring page content data and user behavior data of a target lightweight application, and a private data list collected by the target lightweight application;
taking the page content data and the user behavior data of the target lightweight application as input of a scene classification model, so as to predict a usage scene category of the target lightweight application through the scene classification model;
and determining whether the target lightweight application exhibits abnormal collection behavior based on the collectable private data list corresponding to the usage scene category of the target lightweight application and the private data list that the target lightweight application applies to collect.
In a second aspect, a method for training a scene classification model is provided, including:
acquiring page content data, user behavior data and usage scene tags of a plurality of lightweight applications;
extracting usage scene features of the plurality of lightweight applications from page content data and user behavior data of the plurality of lightweight applications;
training to obtain a scene classification model based on the usage scene features and the corresponding usage scene labels of the light-weight applications, wherein the scene classification model is used for predicting the usage scene categories of the light-weight applications.
In a third aspect, an apparatus for identifying abnormal collection behavior based on privacy data protection is provided, including:
an acquisition unit, configured to acquire page content data and user behavior data of a target lightweight application, and a private data list collected by the target lightweight application;
a prediction unit, configured to take the page content data and the user behavior data of the target lightweight application as input of a scene classification model, so as to predict a usage scene category of the target lightweight application through the scene classification model;
a determining unit, configured to determine whether the target lightweight application exhibits abnormal collection behavior based on the collectable private data list corresponding to the usage scene category of the target lightweight application and the private data list collected by the target lightweight application.
In a fourth aspect, an apparatus for training a scene classification model is provided, including:
a data acquisition unit, configured to acquire page content data, user behavior data and usage scene labels of a plurality of lightweight applications;
a feature extraction unit, configured to extract usage scene features of the plurality of lightweight applications from the page content data and the user behavior data of the plurality of lightweight applications;
and a model training unit, configured to train a scene classification model based on the usage scene features of the plurality of lightweight applications and the corresponding usage scene labels, the scene classification model being used for predicting the usage scene categories of lightweight applications.
In a fifth aspect, an electronic device is provided, which includes:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
acquiring page content data, user behavior data and a privacy data list acquired by the target lightweight application;
taking page content data and user behavior data of the target lightweight application as input of a scene classification model so as to predict a usage scene category of the target lightweight application through the scene classification model;
and determining whether abnormal acquisition behaviors exist in the target lightweight application or not based on the acquirable private data list corresponding to the usage scene category of the target lightweight application and the private data list applied for acquisition by the target lightweight application.
In a sixth aspect, a computer-readable storage medium is presented, storing one or more programs that, when executed by an electronic device including a plurality of application programs, cause the electronic device to:
acquiring page content data, user behavior data and a privacy data list acquired by the target lightweight application;
taking page content data and user behavior data of the target lightweight application as input of a scene classification model so as to predict a usage scene category of the target lightweight application through the scene classification model;
and determining whether abnormal acquisition behaviors exist in the target lightweight application or not based on the acquirable private data list corresponding to the usage scene category of the target lightweight application and the private data list applied for acquisition by the target lightweight application.
In a seventh aspect, an electronic device is provided, including:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
acquiring page content data, user behavior data and usage scene tags of a plurality of lightweight applications;
extracting usage scene features of the plurality of lightweight applications from page content data and user behavior data of the plurality of lightweight applications;
training to obtain a scene classification model based on the usage scene features and the corresponding usage scene labels of the light-weight applications, wherein the scene classification model is used for predicting the usage scene categories of the light-weight applications.
In an eighth aspect, a computer-readable storage medium is presented, the computer-readable storage medium storing one or more programs that, when executed by an electronic device that includes a plurality of application programs, cause the electronic device to:
acquiring page content data, user behavior data and usage scene tags of a plurality of lightweight applications;
extracting usage scene features of the plurality of lightweight applications from page content data and user behavior data of the plurality of lightweight applications;
training to obtain a scene classification model based on the usage scene features and the corresponding usage scene labels of the light-weight applications, wherein the scene classification model is used for predicting the usage scene categories of the light-weight applications.
As can be seen from the technical solutions provided in the embodiments of the present specification, the embodiments of the present specification have at least one of the following technical effects:
One or more embodiments provided in this specification can acquire page content data and user behavior data of a target lightweight application, together with the private data list collected by the target lightweight application, then use the page content data and the user behavior data as inputs of a scene classification model to predict the usage scene category of the target lightweight application, and determine whether the target lightweight application exhibits abnormal collection behavior based on the collectable private data list corresponding to that usage scene category and the private data list collected by the target lightweight application. Identification of abnormal collection behavior in lightweight applications such as applets thereby changes from passive verification to active identification, and the usage scene category is identified by the scene classification model. On the one hand, this improves identification efficiency; on the other hand, it protects user privacy and gives users a more secure service experience.
One or more embodiments provided in this specification can acquire page content data, user behavior data, and usage scene labels of a plurality of lightweight applications, extract usage scene features of the plurality of lightweight applications from the page content data and the user behavior data, and then train a scene classification model based on the usage scene features and the corresponding usage scene labels. Identifying the usage scenes of lightweight applications such as applets with the trained scene classification model improves the efficiency of usage scene identification on the one hand and saves unnecessary human resources on the other.
Drawings
To illustrate the embodiments of this specification or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some of the embodiments described in this specification, and those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic implementation flow diagram of an abnormal collection behavior identification method based on privacy data protection according to an embodiment of the present specification.
Fig. 2 is a schematic implementation flow diagram of a training method for a scene classification model according to an embodiment of the present disclosure.
Fig. 3 is a schematic flowchart of the training method for a scene classification model provided in an embodiment of the present disclosure, as applied in a practical scenario.
Fig. 4 is a schematic structural diagram of an abnormal collection behavior recognition apparatus based on privacy data protection according to an embodiment of the present specification.
Fig. 5 is a schematic structural diagram of a training apparatus for a scene classification model according to an embodiment of the present disclosure.
Fig. 6 is a schematic structural diagram of an electronic device provided in an embodiment of the present specification.
Fig. 7 is a schematic structural diagram of another electronic device provided in an embodiment of the present specification.
Detailed Description
In order to make the objects, technical solutions and advantages of the present specification clearer, the technical solutions in the present specification will be clearly and completely described below with reference to the specific embodiments of the present specification and the accompanying drawings. It is to be understood that the embodiments described are only a few embodiments of this document, and not all embodiments. All other embodiments obtained by a person skilled in the art without making creative efforts based on the embodiments in this document belong to the protection scope of this document.
The technical solutions provided by the embodiments of the present description are described in detail below with reference to the accompanying drawings.
In order to avoid the situation that a lightweight application such as a small program excessively collects private data of a user, one or more embodiments of the present specification provide an abnormal collection behavior identification method based on privacy data protection, which can acquire page content data, user behavior data, and a private data list collected by a target lightweight application, and then use the page content data and the user behavior data of the target lightweight application as inputs of a scene classification model to predict a usage scene category of the target lightweight application through the scene classification model, and can determine whether an abnormal collection behavior exists in the target lightweight application based on the collectable private data list corresponding to the usage scene category of the target lightweight application and the private data list collected by the target lightweight application.
Identification of abnormal collection behavior in lightweight applications such as applets thereby changes from passive verification to active identification, and the usage scene category is identified by the scene classification model. On the one hand, this improves identification efficiency; on the other hand, it protects user privacy and gives users a more secure service experience.
It should be understood that the abnormal collection behavior identification method based on privacy data protection provided in the embodiments of this specification may be executed by, but is not limited to, at least one of a server, a computer, or another device that can be configured to execute the method, or by a client that is itself capable of executing the method.
For convenience of description, the embodiments of the method are described below taking a server capable of executing the method as the execution subject. It is understood that execution by a server is merely an example and should not be construed as a limitation of the method.
Fig. 1 is a schematic implementation flow diagram of an abnormal collection behavior identification method based on privacy data protection according to an embodiment of the present specification. The method of fig. 1 may include:
S110, acquiring page content data and user behavior data of a target lightweight application, and a private data list collected by the target lightweight application;
The target lightweight application may specifically be a quick app, an applet, an H5 application, or the like, that is, a lightweight application that a user can use without installing it.
The page content data of the target lightweight application includes the text information in the pages of the target lightweight application, together with the entity types appearing in those pages and the corresponding entity counts. The entity types may be the various objects shown in a page, such as cats, dogs, houses, and cars. The user behavior data of the target lightweight application includes behavior data such as the user's clicking, sliding, paying, forwarding, and inputting in the pages of the target lightweight application, and feature data such as the user's city, education level, age, and occupation. The private data list collected by the target lightweight application may specifically be the list of the user's private data that the target lightweight application actually collects when it is used, and may include, for example, the user's ID number, mobile phone number, gender, avatar, and nickname.
S120, taking the page content data and the user behavior data of the target lightweight application as the input of a scene classification model, and predicting the use scene category of the target lightweight application through the scene classification model;
It should be understood that lightweight applications such as applets often collect the user's private data when the user opens them. For example, when a shopping applet is opened inside a chat application, the user is prompted to grant it permission to collect private data such as the user's avatar, nickname, and contact information in the chat application. In general, users who open an applet do not check whether it collects their private data excessively, so many applets may deliberately over-collect private data and then maliciously exploit or sell it for additional profit.
To prevent users' private data from being excessively collected and exploited, in one or more embodiments of this specification a scene classification model may be trained in advance on the page content data, user behavior data, and usage scene labels of a plurality of lightweight applications. The scene classification model predicts the usage scene category of the target lightweight application, and whether the target lightweight application exhibits abnormal collection behavior is then determined based on the collectable private data list corresponding to that usage scene category and the private data list that the target lightweight application applies to collect.
S130, determining whether the target lightweight application has abnormal collection behaviors or not based on the collectable privacy data list corresponding to the usage scene category of the target lightweight application and the privacy data list collected by the target lightweight application.
The usage scene categories of lightweight applications may include a shopping scene, a train ticket purchasing scene, a shared bicycle scene, a learning tool scene, and the like. Lightweight applications in different usage scene categories generally need to collect different user private data. A shopping lightweight application generally needs to collect private data such as the user's shopping account and contact information; a train ticket purchasing lightweight application needs to collect private data such as the user's ID card number, ticketing account, and contact information; a shared bicycle lightweight application needs to collect private data such as the user's login account and contact information; a learning tool lightweight application may only need to collect private data such as the user's login account.
That is, whether a lightweight application has excessively collected user private data can be determined by comparing the private data list that the lightweight application actually applies to collect with the private data list that lightweight applications of its usage scene category are allowed to collect.
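The comparison described above can be sketched as a lookup followed by a set difference. This is a minimal illustration only: the table `COLLECTABLE_BY_SCENE`, the field names, and the function `excess_fields` are hypothetical and do not come from the specification, and treating an unknown scene category as allowing nothing is an assumption of this sketch.

```python
# Hypothetical mapping from usage scene category to the private data fields
# that a lightweight application in that category may collect.
COLLECTABLE_BY_SCENE = {
    "shopping": {"shopping_account", "contact_info", "shipping_address"},
    "train_ticket": {"id_number", "ticketing_account", "contact_info"},
    "shared_bicycle": {"login_account", "contact_info"},
    "learning_tool": {"login_account"},
}


def excess_fields(scene_category, requested_fields):
    """Return the requested fields that the scene category does not allow.

    An unknown category is treated as allowing nothing (an assumption).
    """
    allowed = COLLECTABLE_BY_SCENE.get(scene_category, set())
    return set(requested_fields) - allowed


def has_abnormal_collection(scene_category, requested_fields):
    """True when at least one requested field falls outside the allowed list."""
    return bool(excess_fields(scene_category, requested_fields))
```

For instance, a shopping application that requests `id_number` in addition to `shopping_account` would be flagged, while a learning tool that requests only `login_account` would not.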
Optionally, determining whether the target lightweight application exhibits abnormal collection behavior based on the private data list collected by the target lightweight application and the target private data collection list (that is, the collectable private data list corresponding to the predicted usage scene category) includes:
if the private data list collected by the target lightweight application is consistent with the target private data collection list, determining that the target lightweight application does not exhibit abnormal collection behavior;
and if the private data list collected by the target lightweight application is inconsistent with the target private data collection list, determining that the target lightweight application exhibits abnormal collection behavior.
Optionally, to prevent the target lightweight application from excessively collecting the user's private data, after it is determined that the target lightweight application exhibits abnormal collection behavior, the method further includes:
intercepting a private data sending request of the target lightweight application.
Take a shopping lightweight application as the target lightweight application. When a user opens and uses it, it only needs to collect private data such as the user's shopping account, contact information, and shipping address; a user who is shopping does not normally need to present identity information such as an ID card number. If the shopping application additionally collects the user's ID card number, then after the target lightweight application is determined to exhibit abnormal collection behavior based on the private data list that it applies to collect and the target private data collection list, either the sending request for the additionally collected private data or the sending requests for all private data of the target lightweight application can be intercepted.
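The two interception options in the shopping example can be sketched as follows. The function `filter_send_request`, its parameters, and the dictionary representation of a sending request are hypothetical illustrations, not an interface defined by the specification.

```python
def filter_send_request(payload, allowed_fields, block_all=False):
    """Decide which part of a private data sending request may go through.

    payload: dict mapping a private data field name to its value.
    allowed_fields: field names collectable for the predicted usage scene.
    block_all: if True, drop the whole request once any excess field is found.
    Returns (fields allowed to be sent, set of excess field names).
    """
    excess = set(payload) - set(allowed_fields)
    if not excess:
        # No abnormal collection detected: the request passes unchanged.
        return dict(payload), excess
    if block_all:
        # Intercept the sending of all private data.
        return {}, excess
    # Intercept only the additionally collected fields.
    kept = {name: value for name, value in payload.items() if name in allowed_fields}
    return kept, excess
```

Which option to use is a policy choice: dropping only the excess fields keeps the application usable, while dropping the whole request is the stricter response.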
One or more embodiments provided in this specification can acquire page content data, user behavior data, and a privacy data list acquired by a target lightweight application, and then use the page content data and the user behavior data of the target lightweight application as inputs of a scene classification model to predict a usage scene type of the target lightweight application through the scene classification model, and can determine whether an abnormal acquisition behavior exists in the target lightweight application based on an acquirable privacy data list corresponding to the usage scene type of the target lightweight application and the privacy data list acquired by the target lightweight application. The identification of the abnormal acquisition behaviors of light-weight applications such as small programs and the like is changed from passive verification to active identification, and the scene classification model is used for identifying the category of the used scene, so that the identification efficiency is improved on one hand; on the other hand, the privacy of the user is protected, and more secure service experience is brought to the user.
Fig. 2 is a schematic implementation flow diagram of a training method for a scene classification model according to an embodiment of the present specification, including:
S210, acquiring page content data, user behavior data and usage scene labels of a plurality of lightweight applications;
The page content data of the plurality of lightweight applications includes the text information in the pages of the plurality of lightweight applications, together with the entity types appearing in those pages and the corresponding entity counts. The entity types may be the various objects shown in a page, such as cats, dogs, houses, and cars. The user behavior data of the plurality of lightweight applications includes behavior data such as clicking, sliding, paying, forwarding, and inputting by a plurality of users in the pages of the plurality of lightweight applications, and feature data such as the users' cities, education levels, ages, and occupations.
The usage scene labels of the plurality of lightweight applications are assigned before the scene classification model is trained, either manually or by machine labeling. Each lightweight application is given the label corresponding to its usage scene, such as a shopping label, a ticket purchasing label, or a learning tool label.
S220, extracting the use scene characteristics of the light-weight applications from the page content data and the user behavior data of the light-weight applications;
It should be understood that the page content data of a lightweight application usually includes both text data and image data. To extract feature data from both, one or more embodiments of the present disclosure may convert the image data into text data and then concatenate all the text data to obtain a text field. Specifically, extracting usage scene features of the plurality of lightweight applications from the page content data and user behavior data of the plurality of lightweight applications includes:
acquiring, from the page content data of the plurality of lightweight applications, the text information in the pages of each lightweight application and the entity types and entity counts in those pages;
concatenating, for each lightweight application, the text information in its pages with the entity types and entity counts in its pages to obtain a text field corresponding to that lightweight application, wherein one text field is obtained by concatenating the text information, the entity types, and the corresponding entity counts of the corresponding lightweight application;
and extracting the usage scene features of the plurality of lightweight applications from the text fields and the user behavior data corresponding to the plurality of lightweight applications.
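The concatenation step can be sketched as below. The function name `build_text_field`, the space separator, and the "type count" token layout are assumptions made for illustration; the specification does not fix a concrete format.

```python
def build_text_field(page_texts, entity_counts):
    """Concatenate page text and entity type/count pairs into one text field.

    page_texts: list of text strings found in the application's pages.
    entity_counts: dict mapping an entity type (e.g. "cat") to its count.
    """
    parts = list(page_texts)
    for entity_type, count in entity_counts.items():
        # Each entity type is appended together with its count.
        parts.append(f"{entity_type} {count}")
    return " ".join(parts)


field = build_text_field(["Buy pet food", "Free shipping"], {"cat": 2, "dog": 1})
print(field)  # Buy pet food Free shipping cat 2 dog 1
```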
Optionally, extracting usage scenario features of the plurality of lightweight applications from a plurality of text fields and user behavior data corresponding to the plurality of lightweight applications includes:
respectively carrying out data preprocessing on a plurality of text fields corresponding to a plurality of lightweight applications;
respectively converting a plurality of text fields corresponding to a plurality of light-weight applications after data preprocessing operation into a plurality of corresponding word vectors;
extracting usage scene features of the plurality of lightweight applications from the plurality of word vectors and the user behavior data corresponding to the plurality of lightweight applications;
wherein the data preprocessing operation comprises a stop word eliminating operation.
Since there will usually exist some words and matches without practical meaning in the multiple text fields obtained by merging, such as "the" even "and" so "connecting words, these words have no excessive value and meaning to the scene classification process, and such words will also increase the calculation amount of classification, therefore, in one or more embodiments of the present specification, before converting multiple text fields corresponding to multiple applications into corresponding multiple word vectors, data preprocessing operations such as removing stop words and the like may be performed on the multiple text fields.
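A minimal sketch of stop word removal is shown below. The stop word list and whitespace tokenization are assumptions chosen for an English example; Chinese page text would first need a word segmentation step, which is not shown here.

```python
# Illustrative stop word list (an assumption, not taken from the specification).
STOP_WORDS = {"the", "even", "and", "so", "a", "of"}


def remove_stop_words(text_field):
    """Lowercase the text field, split on whitespace, and drop stop words."""
    tokens = text_field.lower().split()
    return [token for token in tokens if token not in STOP_WORDS]


print(remove_stop_words("The cat and the dog"))  # ['cat', 'dog']
```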
Specifically, the plurality of text fields corresponding to the plurality of lightweight applications after the data preprocessing operation may be converted into the corresponding word vectors by using a word vector dictionary obtained by training on a corpus, or an open-source word vector dictionary. The word vector dictionary comprises mappings between a plurality of words and word vectors, and one word vector corresponds to a group of feature vectors.
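The preprocessing and dictionary lookup described above can be sketched as follows. This is a minimal illustration only: the stop-word list, the two-dimensional `WORD_VECTORS` dictionary, whitespace tokenization, and the averaging of word vectors into one vector per text field are all assumptions made for the example and are not specified in this description.

```python
from typing import Dict, List

# Hypothetical stop-word list and word vector dictionary (illustration only).
STOP_WORDS = {"the", "even", "and", "so"}
WORD_VECTORS: Dict[str, List[float]] = {
    "buy": [1.0, 0.0],
    "cart": [0.0, 1.0],
    "pay": [1.0, 1.0],
}


def remove_stop_words(text_field: str) -> List[str]:
    """Split a text field on whitespace and drop stop words."""
    return [w for w in text_field.lower().split() if w not in STOP_WORDS]


def text_field_to_vector(text_field: str, dim: int = 2) -> List[float]:
    """Look up each remaining word and average the vectors that are found.

    Averaging is an assumption of this sketch; words missing from the
    dictionary are skipped, and an all-zero vector is returned if none match.
    """
    words = [w for w in remove_stop_words(text_field) if w in WORD_VECTORS]
    if not words:
        return [0.0] * dim
    return [sum(WORD_VECTORS[w][i] for w in words) / len(words) for i in range(dim)]
```

Text in a language without whitespace word boundaries would need a word segmenter in place of `split()`.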
The behavior feature data corresponding to the user behavior data can be obtained through statistical analysis. Specifically, the usage scene features of the plurality of lightweight applications can be obtained by merging the plurality of word vectors corresponding to the plurality of text fields with the behavior feature data corresponding to the user behavior data.
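As a rough sketch of this step, the behavior feature data can be computed as simple statistics over raw behavior events and then concatenated with a word vector. The event record layout (`type`, `hour`), the three statistics chosen, and merging by concatenation are assumptions made for this example.

```python
from typing import Dict, List, Sequence


def behavior_features(events: List[Dict], num_users: int) -> List[float]:
    """Compute illustrative statistics from behavior events.

    Each event is a hypothetical record such as {"type": "click", "hour": 14}.
    Returns [average operations per user, share of pay events, mean hour].
    """
    if not events or num_users <= 0:
        return [0.0, 0.0, 0.0]
    avg_ops_per_user = len(events) / num_users
    pay_ratio = sum(1 for e in events if e["type"] == "pay") / len(events)
    mean_hour = sum(e["hour"] for e in events) / len(events)
    return [avg_ops_per_user, pay_ratio, mean_hour]


def merge_features(word_vector: Sequence[float], behavior: Sequence[float]) -> List[float]:
    """Merge text and behavior features by concatenation (an assumption)."""
    return list(word_vector) + list(behavior)
```

The merged list would then serve as one row of usage scene features for one lightweight application.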
Optionally, in order to avoid missing features in the page of the lightweight application, one or more embodiments of the present specification may repeat the name of each entity type by a corresponding number of times based on the names and corresponding numbers of the entity types in the pages of the plurality of lightweight applications, and then concatenate the name of each entity type with the text information in the page of the lightweight application to obtain the text field of each lightweight application. Specifically, the splicing of the multiple text messages in the pages of the multiple lightweight applications and the entity types and the number in the pages of the multiple lightweight applications to obtain multiple text fields corresponding to the multiple lightweight applications includes:
respectively acquiring text fields corresponding to the entity types in the pages of the plurality of lightweight applications based on the names and corresponding numbers of the entity types in those pages, wherein the text field corresponding to one entity type in a page of one lightweight application comprises the name of that entity type repeated the corresponding number of times;
and splicing the plurality of text messages in the pages of the plurality of lightweight applications with the text fields corresponding to the entity types in those pages to obtain the plurality of text fields corresponding to the plurality of lightweight applications.
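The repetition and splicing steps above can be sketched as follows. The function names, the use of a space as separator, and the sample entity names are assumptions chosen for illustration.

```python
from typing import Dict, List


def entity_text_field(entity_counts: Dict[str, int]) -> str:
    """Repeat each entity type name as many times as it appears in the page."""
    return " ".join(
        " ".join([name] * count)
        for name, count in entity_counts.items()
        if count > 0
    )


def build_text_field(text_messages: List[str], entity_counts: Dict[str, int]) -> str:
    """Splice the page's text messages with the repeated entity type names."""
    parts = list(text_messages)
    entity_part = entity_text_field(entity_counts)
    if entity_part:
        parts.append(entity_part)
    return " ".join(parts)


# Example: a page showing two shoes and one bag.
field = build_text_field(["Summer sale", "Add to cart"], {"shoe": 2, "bag": 1})
```

Repeating the name lets the entity count influence the text-derived features without a separate numeric column.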
S230, training to obtain a scene classification model based on the usage scene features of the plurality of lightweight applications and the corresponding usage scene labels, wherein the scene classification model is used for predicting the usage scene categories of lightweight applications.
Optionally, training a scene classification model based on the usage scene features and the corresponding usage scene labels of the multiple lightweight applications includes:
and training to obtain a scene classification model through a multi-classification model based on the use scene features of a plurality of light-weight applications and the corresponding use scene labels.
The multi-classification model may specifically be an XGBoost model. XGBoost is an open-source implementation of the gradient boosting tree model and can be used for classification and regression tasks.
The following describes in detail the training method of the scene classification model and the abnormal collection behavior identification method based on privacy data protection provided in an embodiment of this specification, taking an applet as an example of a lightweight application and referring to the flow diagram of the scene classification model and its application shown in fig. 3. The method includes:
S301, acquiring page content data of a plurality of applets, wherein the page content data comprises text information and image data displayed in the applet pages, and the image data comprises the entity types displayed in the applet pages and the corresponding quantities;
S302, acquiring user behavior data of the plurality of applets, wherein the user behavior data comprises data on user behaviors such as clicking, sliding, jumping, inputting, and paying on the applet pages;
S303, respectively splicing the plurality of text messages in the pages of the plurality of applets with the entity types and quantities in those pages to obtain a plurality of text fields corresponding to the plurality of applets, removing stop words from the plurality of text fields to eliminate redundant information, and converting the plurality of text fields into a plurality of corresponding word vectors based on a preset word vector dictionary;
wherein the word vector dictionary comprises a plurality of correspondences between text fields and word vectors, and one word vector corresponds to a group of feature vectors.
S304, constructing corresponding behavior feature data based on the user behavior data of the plurality of applets;
specifically, feature data such as the users' average operation frequency and operation time periods, as well as feature data such as the users' cities, ages, educational backgrounds, and occupations, can be obtained through statistical analysis of the user behavior data of the plurality of applets.
S305, manually labeling the usage scene data of the plurality of applets to obtain usage scene labels of the plurality of applets, wherein the usage scene labels represent information related to the usage scene categories of the applets;
S306, training to obtain a scene classification model through an XGBoost multi-classification model, based on the plurality of word vectors and the behavior feature data corresponding to the plurality of applets;
S307, taking the page content data and the user behavior data of a target applet as the input of the scene classification model, and predicting the usage scene category of the target applet through the scene classification model;
S308, determining the collectable private data list corresponding to the usage scene category of the target applet;
S309, determining the private data list that the target applet applies to collect;
S310, comparing the collectable private data list corresponding to the usage scene category of the target applet with the private data list that the target applet applies to collect, and judging whether the target applet has abnormal collection behavior;
S311, if the collectable private data list corresponding to the usage scene category of the target applet is not consistent with the private data list that the target applet applies to collect, determining that the target applet has abnormal collection behavior, and intercepting the private data sending request of the target applet.
Take the case where the target applet is a shopping applet as an example. The collectable private data list corresponding to the usage scene category of the target applet includes sensitive information such as the user's mobile phone number. If the private data list that the target applet applies to collect also includes sensitive information such as an identification number, it can be determined that the target applet has abnormal collection behavior. In this case, when the target applet sends the user's private data, the private data sending request of the target applet can be intercepted, so that excessive collection of the user's private data by the target applet is avoided.
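The comparison and interception decision can be sketched as a set comparison. This sketch reads "not consistent" as the applet applying to collect items outside the collectable list for its scene, which matches the shopping example above but is only one possible interpretation; a strict equality check would be another. The scene name, the data item names, and the `COLLECTABLE_BY_SCENE` mapping are hypothetical.

```python
from typing import Dict, Iterable, Set

# Hypothetical mapping from usage scene category to collectable private data.
COLLECTABLE_BY_SCENE: Dict[str, Set[str]] = {
    "shopping": {"phone_number", "delivery_address"},
}


def excess_items(scene: str, requested: Iterable[str]) -> Set[str]:
    """Return requested private data items not collectable for the scene."""
    allowed = COLLECTABLE_BY_SCENE.get(scene, set())
    return set(requested) - allowed


def should_intercept(scene: str, requested: Iterable[str]) -> bool:
    """Treat any excess item as abnormal collection behavior (an assumption)."""
    return bool(excess_items(scene, requested))
```

Under these assumptions, a shopping applet that applies to collect `["phone_number", "id_number"]` would be flagged because `id_number` falls outside the collectable list, and its private data sending request would be intercepted.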
One or more embodiments provided in this specification can acquire page content data, user behavior data, and usage scene labels of a plurality of lightweight applications, extract usage scene features of the plurality of lightweight applications from the page content data and the user behavior data, and then train a scene classification model based on the usage scene features and the corresponding usage scene labels. Using the trained scene classification model to identify the usage scenes of lightweight applications such as applets improves the efficiency of identifying those usage scenes on the one hand and saves unnecessary human resources on the other.
Fig. 4 is a schematic structural diagram of an abnormal collection behavior recognition apparatus 400 based on privacy data protection according to an embodiment of the present specification. Referring to fig. 4, in a software implementation, the apparatus 400 for recognizing abnormal collection behavior based on protection of private data may include:
an acquiring unit 401, configured to acquire page content data and user behavior data of a target lightweight application, and a private data list that the target lightweight application applies to collect;
a prediction unit 402, configured to take the page content data and the user behavior data of the target lightweight application as the input of a scene classification model, so as to predict the usage scene category of the target lightweight application through the scene classification model;
a determining unit 403, configured to determine whether the target lightweight application has abnormal collection behavior, based on the collectable private data list corresponding to the usage scene category of the target lightweight application and the private data list that the target lightweight application applies to collect.
Optionally, in an embodiment, the determining unit 403 is configured to:
if the privacy data list acquired by the target lightweight application is consistent with the target privacy data acquisition list, determining that the target lightweight application has no abnormal acquisition behavior;
and if the private data list acquired by the target lightweight application is inconsistent with the target private data acquisition list, determining that the target lightweight application has abnormal acquisition behaviors.
Optionally, in an embodiment, after the determining unit 403 determines that there is an abnormal acquisition behavior in the target lightweight application, the apparatus further includes:
an intercepting unit 404, intercepting the private data sending request of the target lightweight application.
The abnormal collection behavior recognition apparatus 400 based on privacy data protection can implement the method of the method embodiment shown in fig. 1. For details, refer to the abnormal collection behavior identification method based on privacy data protection in the embodiment shown in fig. 1, which is not described again here.
Fig. 5 is a schematic structural diagram of a training apparatus 500 for a scene classification model according to an embodiment of the present disclosure. Referring to fig. 5, in a software implementation, the training apparatus 500 for a scene classification model may include:
a data obtaining unit 501, configured to obtain page content data, user behavior data, and usage scenario tags of a plurality of lightweight applications;
a feature extraction unit 502, which extracts usage scene features of the plurality of lightweight applications from page content data and user behavior data of the plurality of lightweight applications;
the model training unit 503 is configured to train to obtain a scene classification model based on the usage scene features and the corresponding usage scene labels of the multiple lightweight applications, where the scene classification model is used to predict usage scene categories of the lightweight applications.
Optionally, in an embodiment, the feature extraction unit 502 is configured to:
respectively acquiring a plurality of text messages in the pages of the plurality of lightweight applications and entity types and quantity in the pages of the plurality of lightweight applications from the page content data of the plurality of lightweight applications;
respectively splicing a plurality of text messages in the pages of the plurality of lightweight applications and entity types and numbers in the pages of the plurality of lightweight applications to obtain a plurality of text fields corresponding to the plurality of lightweight applications, wherein one text field is obtained by splicing a plurality of text messages in the corresponding lightweight applications, names of entity types and corresponding entity numbers;
and extracting the usage scene characteristics of the plurality of lightweight applications from a plurality of text fields and user behavior data corresponding to the plurality of lightweight applications.
Optionally, in an embodiment, the feature extraction unit 502 is configured to:
respectively carrying out data preprocessing on a plurality of text fields corresponding to the plurality of lightweight applications;
respectively converting a plurality of text fields corresponding to the plurality of lightweight applications after the data preprocessing operation into a plurality of corresponding word vectors;
extracting usage scene features of the plurality of lightweight applications from the plurality of word vectors and user behavior data corresponding to the plurality of lightweight applications;
wherein the data preprocessing operation comprises a stop word elimination operation.
Optionally, in an embodiment, the feature extraction unit 502 is configured to:
respectively acquiring text fields corresponding to the entity types in the pages of the plurality of lightweight applications based on the names and corresponding numbers of the entity types in those pages, wherein the text field corresponding to one entity type in a page of one lightweight application comprises the name of that entity type repeated the corresponding number of times;
and obtaining a plurality of text fields corresponding to the plurality of lightweight applications based on splicing the plurality of text information in the pages of the plurality of lightweight applications and the text fields corresponding to the entity types in the pages of the plurality of lightweight applications.
Optionally, in an embodiment, the model training unit 503 is configured to:
and training to obtain a scene classification model based on the using scene characteristics of the light-weight applications and the corresponding using scene labels through a multi-classification model.
The training apparatus 500 for the scene classification model can implement the methods of the method embodiments shown in fig. 2 and fig. 3. For details, refer to the training method of the scene classification model in the embodiments shown in fig. 2 and fig. 3, which is not described again here.
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present specification. Referring to fig. 6, at the hardware level, the electronic device includes a processor, and optionally further includes an internal bus, a network interface, and a memory. The memory may include an internal memory, such as a Random-Access Memory (RAM), and may further include a non-volatile memory, such as at least one disk memory. Of course, the electronic device may also include hardware required for other services.
The processor, the network interface, and the memory may be connected to each other via an internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 6, but that does not indicate only one bus or one type of bus.
The memory is used for storing programs. In particular, a program may include program code comprising computer operating instructions. The memory may include both internal memory and non-volatile storage, and provides instructions and data to the processor.
The processor reads the corresponding computer program from the non-volatile storage into the internal memory and then runs it, forming, at the logical level, the abnormal collection behavior recognition apparatus based on privacy data protection. The processor executes the program stored in the memory, and is specifically configured to perform the following operations:
acquiring page content data and user behavior data of a target lightweight application, and a private data list that the target lightweight application applies to collect;
taking page content data and user behavior data of the target lightweight application as input of a scene classification model so as to predict a usage scene category of the target lightweight application through the scene classification model;
and determining whether abnormal acquisition behaviors exist in the target lightweight application or not based on the acquirable private data list corresponding to the usage scene category of the target lightweight application and the private data list applied for acquisition by the target lightweight application.
The method executed by the abnormal collection behavior recognition apparatus based on privacy data protection disclosed in the embodiments shown in fig. 1 to fig. 3 of this specification can be applied to, or implemented by, a processor. The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor may implement or perform the various methods, steps and logic blocks disclosed in the embodiments of this specification. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the embodiments of this specification may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
The electronic device may further execute the method in fig. 1, and implement the functions of the abnormal collection behavior recognition apparatus based on privacy data protection in the embodiment shown in fig. 1, which are not described herein again in this specification.
Embodiments of the present specification also propose a computer-readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a portable electronic device comprising a plurality of application programs, are capable of causing the portable electronic device to perform the method of the embodiment shown in fig. 1, and in particular to perform the following:
acquiring page content data and user behavior data of a target lightweight application, and a private data list that the target lightweight application applies to collect;
taking page content data and user behavior data of the target lightweight application as input of a scene classification model so as to predict a usage scene category of the target lightweight application through the scene classification model;
and determining whether abnormal acquisition behaviors exist in the target lightweight application or not based on the acquirable private data list corresponding to the usage scene category of the target lightweight application and the private data list applied for acquisition by the target lightweight application.
Of course, besides the software implementation, the electronic device in this specification does not exclude other implementations, such as logic devices or a combination of software and hardware, and the like, that is, the execution subject of the following processing flow is not limited to each logic unit, and may also be hardware or logic devices.
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present specification. Referring to fig. 7, at the hardware level, the electronic device includes a processor, and optionally further includes an internal bus, a network interface, and a memory. The memory may include an internal memory, such as a Random-Access Memory (RAM), and may further include a non-volatile memory, such as at least one disk memory. Of course, the electronic device may also include hardware required for other services.
The processor, the network interface, and the memory may be connected to each other via an internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 7, but this does not indicate only one bus or one type of bus.
The memory is used for storing programs. In particular, a program may include program code comprising computer operating instructions. The memory may include both internal memory and non-volatile storage, and provides instructions and data to the processor.
The processor reads a corresponding computer program from the nonvolatile memory into the memory and then runs the computer program to form the training device of the scene classification model on a logic level. The processor is used for executing the program stored in the memory and is specifically used for executing the following operations:
acquiring page content data, user behavior data and usage scene tags of a plurality of lightweight applications;
extracting usage scene features of the plurality of lightweight applications from page content data and user behavior data of the plurality of lightweight applications;
and training to obtain a scene classification model based on the usage scene features of the plurality of lightweight applications and the corresponding usage scene labels.
The method performed by the training apparatus for the scene classification model disclosed in the embodiments shown in fig. 2 and fig. 3 of this specification can be applied to, or implemented by, a processor. The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor may implement or perform the various methods, steps and logic blocks disclosed in the embodiments of this specification. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the embodiments of this specification may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
The electronic device may further execute the methods in fig. 2 and fig. 3, and implement the functions of the training apparatus for a scene classification model in the embodiments shown in fig. 2 and fig. 3, which are not described herein again in this specification.
Embodiments of this specification also propose a computer-readable storage medium storing one or more programs, the one or more programs comprising instructions which, when executed by a portable electronic device comprising a plurality of application programs, are capable of causing the portable electronic device to perform the method of the embodiment shown in fig. 2, and in particular to perform the following operations:
acquiring page content data, user behavior data and usage scene tags of a plurality of lightweight applications;
extracting usage scene features of the plurality of lightweight applications from page content data and user behavior data of the plurality of lightweight applications;
and training to obtain a scene classification model based on the usage scene features of the plurality of lightweight applications and the corresponding usage scene labels.
Of course, besides the software implementation, the electronic device in this specification does not exclude other implementations, such as logic devices or a combination of software and hardware, and the like, that is, the execution subject of the following processing flow is not limited to each logic unit, and may also be hardware or logic devices.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
In short, the above description is only a preferred embodiment of the present disclosure, and is not intended to limit the scope of the present disclosure. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present specification shall be included in the protection scope of the present specification.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.

Claims (14)

1. An abnormal collection behavior identification method based on privacy data protection comprises the following steps:
acquiring page content data and user behavior data of a target lightweight application, and a private data list that the target lightweight application applies to collect;
taking page content data and user behavior data of the target lightweight application as input of a scene classification model so as to predict a usage scene category of the target lightweight application through the scene classification model;
and determining whether abnormal acquisition behaviors exist in the target lightweight application or not based on the acquirable private data list corresponding to the usage scene category of the target lightweight application and the private data list applied for acquisition by the target lightweight application.
2. The method of claim 1, wherein determining whether the target lightweight application has abnormal acquisition behavior based on the list of private data acquired by the target lightweight application and the list of private data that can be acquired corresponding to the usage scenario category of the target lightweight application comprises:
if the private data list acquired by the target lightweight application is consistent with the acquirable private data list corresponding to the use scene category of the target lightweight application, determining that the target lightweight application has no abnormal acquisition behavior;
and if the private data list acquired by the target lightweight application is not consistent with the acquirable private data list corresponding to the use scene category of the target lightweight application, determining that the target lightweight application has abnormal acquisition behaviors.
3. The method of claim 2, after determining that the target lightweight application has anomalous acquisition behavior, the method further comprising:
intercepting a private data sending request of the target lightweight application.
4. A training method of a scene classification model comprises the following steps:
acquiring page content data, user behavior data and usage scene tags of a plurality of lightweight applications, wherein the page content data of the lightweight applications comprise text information, entity types and corresponding entity numbers in pages of the lightweight applications;
extracting usage scene features of the plurality of lightweight applications from page content data and user behavior data of the plurality of lightweight applications;
and training a scene classification model based on the usage scene features of the plurality of lightweight applications and the corresponding usage scene labels, wherein the scene classification model is used for predicting the usage scene categories of lightweight applications.
5. The method of claim 4, wherein extracting usage scene features of the plurality of lightweight applications from the page content data and user behavior data of the plurality of lightweight applications comprises:
acquiring, from the page content data of each of the plurality of lightweight applications, a plurality of pieces of text information in the pages of that lightweight application and the entity types and entity counts in those pages;
splicing, for each of the plurality of lightweight applications, the plurality of pieces of text information and the entity types and entity counts in its pages, to obtain a plurality of text fields corresponding to the plurality of lightweight applications, wherein each text field is obtained by splicing the plurality of pieces of text information, the names of the entity types and the corresponding entity counts of the corresponding lightweight application;
and extracting the usage scene features of the plurality of lightweight applications from the plurality of text fields and the user behavior data corresponding to the plurality of lightweight applications.
6. The method of claim 5, wherein extracting usage scene features of the plurality of lightweight applications from the plurality of text fields and user behavior data corresponding to the plurality of lightweight applications comprises:
performing a data preprocessing operation on each of the plurality of text fields corresponding to the plurality of lightweight applications;
respectively converting a plurality of text fields corresponding to the plurality of lightweight applications after the data preprocessing operation into a plurality of corresponding word vectors;
extracting usage scene features of the plurality of lightweight applications from the plurality of word vectors and user behavior data corresponding to the plurality of lightweight applications;
wherein the data preprocessing operation comprises a stop word elimination operation.
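The preprocessing and vectorization steps of claim 6 can be sketched as follows. The claim does not name a word-vector technique, so this illustration substitutes a plain bag-of-words count vector; the stop word list, the whitespace tokenizer and all function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List

# Hypothetical stop word list; a real system would use a much larger one.
STOP_WORDS = frozenset({"the", "a", "an", "of", "to", "and"})


def preprocess(text_field: str) -> List[str]:
    """Lowercase, split on whitespace, and drop stop words."""
    return [tok for tok in text_field.lower().split() if tok not in STOP_WORDS]


def build_vocabulary(token_lists: Iterable[List[str]]) -> Dict[str, int]:
    """Assign each distinct token an index in order of first appearance."""
    vocab: Dict[str, int] = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def to_count_vector(tokens: List[str], vocab: Dict[str, int]) -> List[int]:
    """Bag-of-words counts; tokens outside the vocabulary are ignored."""
    counts = Counter(tokens)
    vector = [0] * len(vocab)
    for tok, idx in vocab.items():
        vector[idx] = counts[tok]  # Counter yields 0 for absent tokens
    return vector
```

A learned embedding could replace the count vector without changing the surrounding steps.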
7. The method of claim 5, wherein splicing the plurality of pieces of text information in the pages of the plurality of lightweight applications and the entity types and entity counts in those pages to obtain the plurality of text fields corresponding to the plurality of lightweight applications comprises:
acquiring, based on the names and corresponding counts of the entity types in the pages of the plurality of lightweight applications, a text field corresponding to each entity type in those pages, wherein the text field corresponding to one entity type in one page of one lightweight application comprises the name of that entity type repeated the corresponding number of times;
and obtaining the plurality of text fields corresponding to the plurality of lightweight applications by splicing the plurality of pieces of text information in the pages of the plurality of lightweight applications with the text fields corresponding to the entity types in those pages.
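The splicing of claim 7 can be illustrated with a short sketch in which each entity type name is repeated as many times as the entity occurs and the result is appended to the page text. The function names, the space separator and the example entity names are assumptions made for illustration only.

```python
from typing import Dict, List


def entity_text_field(entity_counts: Dict[str, int]) -> str:
    """Repeat each entity type name by its count, e.g. {'button': 2} -> 'button button'."""
    parts: List[str] = []
    for name, count in entity_counts.items():
        parts.extend([name] * count)
    return " ".join(parts)


def splice_text_field(texts: List[str], entity_counts: Dict[str, int]) -> str:
    """Join the page text pieces with the repeated entity type names."""
    pieces = list(texts)
    entity_part = entity_text_field(entity_counts)
    if entity_part:  # skip the entity part when a page has no entities
        pieces.append(entity_part)
    return " ".join(pieces)
```

Repeating the name lets a count-based text representation reflect how many entities of each type a page contains.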
8. The method of claim 4, wherein training a scene classification model based on the usage scene features of the plurality of lightweight applications and the corresponding usage scene labels comprises:
training a multi-class classification model on the usage scene features of the plurality of lightweight applications and the corresponding usage scene labels to obtain the scene classification model.
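Claim 8 does not name a specific multi-class model. As a stand-in, the sketch below trains a nearest-centroid classifier on numeric feature vectors and usage scene labels; the classifier choice, the function names and the toy data in the usage note are assumptions, not the patent's method.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def train_centroids(
    features: Sequence[Sequence[float]], labels: Sequence[str]
) -> Dict[str, List[float]]:
    """Compute the mean feature vector (centroid) of each usage scene label."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for vec, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        for i, value in enumerate(vec):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total] for label, total in sums.items()
    }


def predict_scene(centroids: Dict[str, List[float]], vec: Sequence[float]) -> str:
    """Return the label whose centroid is closest in squared Euclidean distance."""

    def sq_dist(centroid: List[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(centroid, vec))

    return min(centroids, key=lambda label: sq_dist(centroids[label]))
```

For example, training on the vectors `[2, 0]` and `[3, 0]` labelled "shopping" and `[0, 2]` and `[0, 4]` labelled "travel" yields one centroid per label, and a new vector is assigned to the nearer one.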
9. An abnormal collection behavior identification device based on privacy data protection, comprising:
an acquisition unit configured to acquire page content data and user behavior data of a target lightweight application and a private data list acquired by the target lightweight application;
a prediction unit configured to take the page content data and the user behavior data of the target lightweight application as input of a scene classification model, so as to predict the usage scene category of the target lightweight application through the scene classification model;
and a determining unit configured to determine whether the target lightweight application has abnormal acquisition behavior, based on the acquirable private data list corresponding to the usage scene category of the target lightweight application and the private data list acquired by the target lightweight application.
10. An apparatus for training a scene classification model, comprising:
a data acquisition unit configured to acquire page content data, user behavior data and usage scene labels of a plurality of lightweight applications, wherein the page content data of a lightweight application comprises text information in the pages of the lightweight application, the entity types in those pages, and the number of entities of each type;
a feature extraction unit configured to extract usage scene features of the plurality of lightweight applications from the page content data and the user behavior data of the plurality of lightweight applications;
and a model training unit configured to train a scene classification model based on the usage scene features of the plurality of lightweight applications and the corresponding usage scene labels, wherein the scene classification model is used for predicting the usage scene categories of lightweight applications.
11. An electronic device, comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
acquire page content data and user behavior data of a target lightweight application and a private data list acquired by the target lightweight application;
take the page content data and user behavior data of the target lightweight application as input of a scene classification model, so as to predict a usage scene category of the target lightweight application through the scene classification model;
and determine whether the target lightweight application has abnormal acquisition behavior, based on the acquirable private data list corresponding to the usage scene category of the target lightweight application and the private data list that the target lightweight application has applied to acquire.
12. A computer-readable storage medium storing one or more programs that, when executed by an electronic device including a plurality of application programs, cause the electronic device to:
acquire page content data and user behavior data of a target lightweight application and a private data list acquired by the target lightweight application;
take the page content data and user behavior data of the target lightweight application as input of a scene classification model, so as to predict a usage scene category of the target lightweight application through the scene classification model;
and determine whether the target lightweight application has abnormal acquisition behavior, based on the acquirable private data list corresponding to the usage scene category of the target lightweight application and the private data list that the target lightweight application has applied to acquire.
13. An electronic device, comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
acquire page content data, user behavior data and usage scene labels of a plurality of lightweight applications, wherein the page content data of a lightweight application comprises text information in the pages of the lightweight application, the entity types in those pages, and the number of entities of each type;
extract usage scene features of the plurality of lightweight applications from the page content data and user behavior data of the plurality of lightweight applications;
and train a scene classification model based on the usage scene features of the plurality of lightweight applications and the corresponding usage scene labels, wherein the scene classification model is used for predicting the usage scene categories of lightweight applications.
14. A computer-readable storage medium storing one or more programs that, when executed by an electronic device including a plurality of application programs, cause the electronic device to:
acquire page content data, user behavior data and usage scene labels of a plurality of lightweight applications, wherein the page content data of a lightweight application comprises text information in the pages of the lightweight application, the entity types in those pages, and the number of entities of each type;
extract usage scene features of the plurality of lightweight applications from the page content data and user behavior data of the plurality of lightweight applications;
and train a scene classification model based on the usage scene features of the plurality of lightweight applications and the corresponding usage scene labels, wherein the scene classification model is used for predicting the usage scene categories of lightweight applications.
CN201911158814.7A 2019-11-22 2019-11-22 Abnormal collection behavior identification method and device based on privacy data protection Active CN110826006B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201911158814.7A CN110826006B (en) 2019-11-22 2019-11-22 Abnormal collection behavior identification method and device based on privacy data protection
TW109115226A TWI743773B (en) 2019-11-22 2020-05-07 Method and device for identifying abnormal collection behavior based on privacy data protection
PCT/CN2020/111725 WO2021098327A1 (en) 2019-11-22 2020-08-27 Private data protection-based method and device for abnormal collection behavior recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911158814.7A CN110826006B (en) 2019-11-22 2019-11-22 Abnormal collection behavior identification method and device based on privacy data protection

Publications (2)

Publication Number Publication Date
CN110826006A (en) 2020-02-21
CN110826006B (en) 2021-03-19

Family

ID=69558415

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911158814.7A Active CN110826006B (en) 2019-11-22 2019-11-22 Abnormal collection behavior identification method and device based on privacy data protection

Country Status (3)

Country Link
CN (1) CN110826006B (en)
TW (1) TWI743773B (en)
WO (1) WO2021098327A1 (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110826006B (en) * 2019-11-22 2021-03-19 支付宝(杭州)信息技术有限公司 Abnormal collection behavior identification method and device based on privacy data protection
CN111400705B (en) * 2020-03-04 2023-03-14 支付宝(杭州)信息技术有限公司 Application program detection method, device and equipment
CN112491815A (en) * 2020-11-11 2021-03-12 恒安嘉新(北京)科技股份公司 Information monitoring method, device, equipment and medium
CN115842656A (en) * 2021-01-07 2023-03-24 支付宝(杭州)信息技术有限公司 Management and control method and device based on private data calling
CN112835902A (en) * 2021-02-01 2021-05-25 上海上讯信息技术股份有限公司 Data asset identification and use method and equipment
CN115186260A (en) * 2021-03-26 2022-10-14 支付宝(杭州)信息技术有限公司 Applet risk detection method and device
CN113434847B (en) * 2021-06-25 2023-10-27 深圳赛安特技术服务有限公司 Privacy module processing method and device of application program, electronic equipment and medium
CN113297609A (en) * 2021-07-27 2021-08-24 支付宝(杭州)信息技术有限公司 Method and device for monitoring privacy acquisition behaviors for small programs
CN113792341B (en) * 2021-09-15 2023-10-13 百度在线网络技术(北京)有限公司 Automatic detection method, device, equipment and medium for privacy compliance of application program
CN114793269A (en) * 2022-03-25 2022-07-26 岚图汽车科技有限公司 Control method of camera and related equipment

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101539841B1 (en) * 2013-05-30 2015-07-28 제주대학교 산학협력단 Method and system for protecting data service policy based in smart grid power network
CN104966031A (en) * 2015-07-01 2015-10-07 复旦大学 Method for identifying permission-irrelevant private data in Android application program
CN107958154A (en) * 2016-10-17 2018-04-24 中国科学院深圳先进技术研究院 A kind of malware detection device and method
CN109495727A (en) * 2019-01-04 2019-03-19 京东方科技集团股份有限公司 Intelligent control method and device, system, readable storage medium storing program for executing
CN109933503A (en) * 2019-02-13 2019-06-25 平安科技(深圳)有限公司 User's operation risk factor determines method, apparatus and storage medium, server
CN109960753A (en) * 2019-02-13 2019-07-02 平安科技(深圳)有限公司 Detection method, device, storage medium and the server of equipment for surfing the net user
CN110213236A (en) * 2019-05-05 2019-09-06 深圳市腾讯计算机系统有限公司 Determine method, electronic equipment and the computer storage medium of service security risk
CN110428091A (en) * 2019-07-10 2019-11-08 平安科技(深圳)有限公司 Risk Identification Method and relevant device based on data analysis
CN110457694A (en) * 2019-07-29 2019-11-15 腾讯科技(深圳)有限公司 Message prompt method and device, scene type identification based reminding method and device
CN110475014A (en) * 2018-05-11 2019-11-19 北京三星通信技术研究有限公司 The recognition methods of user's scene and terminal device

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20070111603A (en) * 2006-05-18 2007-11-22 이상규 Security system for client and server
US20130297256A1 (en) * 2012-05-04 2013-11-07 Jun Yang Method and System for Predictive and Conditional Fault Detection
CN105550584A (en) * 2015-12-31 2016-05-04 北京工业大学 RBAC based malicious program interception and processing method in Android platform
US11347871B2 (en) * 2018-01-16 2022-05-31 International Business Machines Corporation Dynamic cybersecurity protection mechanism for data storage devices
CN109344042B (en) * 2018-08-22 2022-02-18 北京中测安华科技有限公司 Abnormal operation behavior identification method, device, equipment and medium
CN109829300A (en) * 2019-01-02 2019-05-31 广州大学 APP dynamic depth malicious act detection device, method and system
CN109766488B (en) * 2019-01-16 2022-09-16 南京工业职业技术学院 Data acquisition method based on Scapy
CN110087099B (en) * 2019-03-11 2020-08-07 北京大学 Monitoring method and system for protecting privacy
CN110826006B (en) * 2019-11-22 2021-03-19 支付宝(杭州)信息技术有限公司 Abnormal collection behavior identification method and device based on privacy data protection

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101539841B1 (en) * 2013-05-30 2015-07-28 제주대학교 산학협력단 Method and system for protecting data service policy based in smart grid power network
CN104966031A (en) * 2015-07-01 2015-10-07 复旦大学 Method for identifying permission-irrelevant private data in Android application program
CN107958154A (en) * 2016-10-17 2018-04-24 中国科学院深圳先进技术研究院 A kind of malware detection device and method
CN110475014A (en) * 2018-05-11 2019-11-19 北京三星通信技术研究有限公司 The recognition methods of user's scene and terminal device
CN109495727A (en) * 2019-01-04 2019-03-19 京东方科技集团股份有限公司 Intelligent control method and device, system, readable storage medium storing program for executing
CN109933503A (en) * 2019-02-13 2019-06-25 平安科技(深圳)有限公司 User's operation risk factor determines method, apparatus and storage medium, server
CN109960753A (en) * 2019-02-13 2019-07-02 平安科技(深圳)有限公司 Detection method, device, storage medium and the server of equipment for surfing the net user
CN110213236A (en) * 2019-05-05 2019-09-06 深圳市腾讯计算机系统有限公司 Determine method, electronic equipment and the computer storage medium of service security risk
CN110428091A (en) * 2019-07-10 2019-11-08 平安科技(深圳)有限公司 Risk Identification Method and relevant device based on data analysis
CN110457694A (en) * 2019-07-29 2019-11-15 腾讯科技(深圳)有限公司 Message prompt method and device, scene type identification based reminding method and device

Also Published As

Publication number Publication date
CN110826006A (en) 2020-02-21
WO2021098327A1 (en) 2021-05-27
TWI743773B (en) 2021-10-21
TW202121215A (en) 2021-06-01

Similar Documents

Publication Publication Date Title
CN110826006B (en) Abnormal collection behavior identification method and device based on privacy data protection
CN110874440B (en) Information pushing method and device, model training method and device, and electronic equipment
CN108550046B (en) Resource and marketing recommendation method and device and electronic equipment
CN108399482B (en) Contract evaluation method and device and electronic equipment
CN109344406B (en) Part-of-speech tagging method and device and electronic equipment
CN110956275A (en) Risk prediction and risk prediction model training method and device and electronic equipment
CN109271611B (en) Data verification method and device and electronic equipment
CN113420229B (en) Social media information pushing method and system based on big data
CN111143654A (en) Crawler identification method and device for assisting in identifying crawler, and electronic equipment
CN114758327A (en) Method, device and equipment for identifying risks in code image
CN109195154B (en) Internet of things card fleeing user identification method and device
CN111353784A (en) Transfer processing method, system, device and equipment
CN112184143B (en) Model training method, device and equipment in compliance audit rule
CN111275071B (en) Prediction model training method, prediction device and electronic equipment
CN111598122B (en) Data verification method and device, electronic equipment and storage medium
CN110334936B (en) Method, device and equipment for constructing credit qualification scoring model
CN110443291B (en) Model training method, device and equipment
CN109120509B (en) Information collection method and device
US20210019553A1 (en) Information processing apparatus, control method, and program
CN113127767B (en) Mobile phone number extraction method and device, electronic equipment and storage medium
CN113111153A (en) Data analysis method, device, equipment and storage medium
CN112101308B (en) Method and device for combining text boxes based on language model and electronic equipment
CN110262938B (en) Content monitoring method and device
CN113988483B (en) Risk operation behavior control method, risk operation behavior model training method and electronic equipment
CN115689284A (en) Online shopping risk identification method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant