CN111079182B - Data processing method, device, equipment and storage medium - Google Patents

Publication number
CN111079182B
Authority
CN
China
Prior art keywords
data
user
processing
access
isolation area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911310768.8A
Other languages
Chinese (zh)
Other versions
CN111079182A (en)
Inventor
陈亮辉
王全斌
付琰
杨晓璇
彭炼钢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201911310768.8A
Publication of CN111079182A
Application granted
Publication of CN111079182B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6272 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database by registering files or documents with a third party
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures

Abstract

The application discloses a data processing method, apparatus, device, and storage medium. In the method, a data processing platform receives an access request that is sent by a data user and carries an identity. The access request is used to access data stored in a security isolation area, and the data in the security isolation area cannot be copied. The platform determines, according to the identity, whether the data user has the right to access the corresponding data. If so, the platform analyzes and processes the corresponding data in the security isolation area according to the access request to obtain a processing result, and returns the processing result to the data user. Because the data provider uploads data to a security isolation area that cannot be copied, and the data user can use the data only on the platform, data leakage is effectively prevented and data security is improved.

Description

Data processing method, device, equipment and storage medium
Technical Field
The present application relates to the field of computers, and in particular, to a method, an apparatus, a device, and a storage medium for processing data in the field of data recommendation.
Background
With the development of internet technology, in order to meet the requirements of different internet users, more and more Applications (APPs) emerge, and content and e-commerce APPs need to be personalized for each different user, so that a recommendation system needs to be accessed.
A recommendation system provider can enrich user profiles with search data or with data from third-party data providers, thereby improving the recommendation effect for its clients. In a user cold-start recommendation scenario, for example, third-party data brings a large gain in recommendation quality. When a client uses the recommendation capability and the third-party data offered by the service provider, the client must upload its own data to the service provider, where the data of the client and of the service provider are fused and the recommendation results for users are computed. This raises a security problem. On one hand, the data provider wants the client to use the provider's data only, with no risk of leakage. On the other hand, the client wants to optimize the recommendation algorithm with the provider's data, and wants its own uploaded data not to be leaked. In the prior art, data security is commonly ensured through a federated learning scheme or through agreement-based authorization, in which the data provider uses an agreement to prohibit the data user from dumping or extracting the data and from using it for any purpose other than recommendation.
However, the federated learning scheme is not flexible enough: computation can only follow a modeling process agreed in advance, so the scheme cannot meet the recommendation optimization needs of different customers in different fields. Its latency is also insufficient: under the federated learning framework, each data provider must homomorphically encrypt its own data before transmitting it, so customers with strict latency requirements cannot be served.
Agreement-based authorization, in turn, cannot meet the data security requirements of data providers, and data security incidents become likely as the number of data providers grows.
Disclosure of Invention
The embodiments of the application provide a data processing method, apparatus, device, and storage medium, to solve the prior-art problem that the data security requirements of data providers cannot be met and that data security incidents are likely to occur, particularly as the number of data providers grows.
In a first aspect, the present application provides a data processing method, including:
receiving an access request sent by a data user, wherein the access request is used for accessing data stored in a security isolation area, and the access request comprises an identity of the data user; data in the securely isolated region cannot be copied;
determining whether the data user has the authority to access the corresponding data according to the identity of the data user;
if the data user is determined to have the right of accessing the corresponding data, analyzing and processing the corresponding data in the safe isolation area according to the access request to obtain a processing result;
and returning the processing result to the data user.
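The four steps of the first aspect can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names, the `grants` table, and the `"mean"` and `"count"` operations are hypothetical and chosen only to show that the platform computes on the isolated data and returns a derived result, never the raw records.

```python
from dataclasses import dataclass, field


@dataclass
class AccessRequest:
    consumer_id: str   # identity of the data user
    dataset: str       # which data in the isolation area is targeted
    operation: str     # hypothetical operation name, e.g. "mean"


@dataclass
class Platform:
    # Data held in the security isolation area; never returned directly.
    isolated: dict = field(default_factory=dict)
    # Hypothetical grant table: consumer_id -> set of usable dataset names.
    grants: dict = field(default_factory=dict)

    def has_permission(self, req: AccessRequest) -> bool:
        return req.dataset in self.grants.get(req.consumer_id, set())

    def handle(self, req: AccessRequest) -> dict:
        # Step 2: decide from the identity whether access is allowed.
        if not self.has_permission(req):
            return {"ok": False, "error": "permission denied"}
        # Step 3: process the data inside the platform.
        values = self.isolated[req.dataset]
        if req.operation == "mean":
            result = sum(values) / len(values)
        elif req.operation == "count":
            result = len(values)
        else:
            return {"ok": False, "error": "unsupported operation"}
        # Step 4: only the derived result leaves the platform.
        return {"ok": True, "result": result}
```

In this sketch there is deliberately no operation that returns `values` themselves, which is one simple way to model data that "cannot be copied".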
In one possible embodiment, the method further comprises:
and recording the access record of the data user to the data in the safety isolation area.
In one possible implementation, the access request comprises a predictive model training request;
analyzing and processing the corresponding data in the safety isolation area according to the access request to obtain a processing result, wherein the processing result comprises:
according to the prediction demand provided by the data user and the code developed in the preset on-line development environment, performing model training by adopting the data provided by the data user and/or the data in the safety isolation area to obtain a prediction model; wherein the processing result comprises the predictive model;
deploying the predictive model in an online predictive module.
In one possible embodiment, the access request comprises a user recommendation request;
analyzing and processing the corresponding data in the security isolation area according to the access request to obtain a processing result, wherein the processing result comprises:
performing prediction processing according to corresponding data in the safety isolation area by adopting a prediction model in an online prediction module to obtain a prediction result; the processing result comprises the predicted result.
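The split between a training request and a later recommendation request can be sketched as below. The co-occurrence counting "model" is a stand-in chosen for brevity, not the model described in the application, and all names (`train`, `OnlinePredictor`, the tag-based profiles) are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train(consumer_clicks, isolated_profiles):
    """Count how often users carrying a given profile tag clicked each item.

    consumer_clicks: (user_id, item) pairs supplied by the data user.
    isolated_profiles: user_id -> list of tags, held in the isolation area.
    """
    model = defaultdict(Counter)
    for user_id, item in consumer_clicks:
        for tag in isolated_profiles.get(user_id, []):
            model[tag][item] += 1
    return model


class OnlinePredictor:
    """Holds deployed models, keyed by the data user that requested them."""

    def __init__(self):
        self._models = {}

    def deploy(self, consumer_id, model):
        self._models[consumer_id] = model

    def recommend(self, consumer_id, user_id, isolated_profiles, k=2):
        model = self._models[consumer_id]
        scores = Counter()
        # Score items through the tags stored in the isolation area.
        for tag in isolated_profiles.get(user_id, []):
            for item, count in model.get(tag, {}).items():
                scores[item] += count
        # Only the ranked item list is returned, not the profile tags.
        return [item for item, _ in scores.most_common(k)]
```

A user with no click history on the data user's side can still receive recommendations as long as the isolation area holds a profile for that user, which mirrors the cold-start scenario mentioned in the Background.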
In one possible embodiment, the method further comprises:
receiving data uploaded by a data provider; the data comprises common data and/or encrypted sensitive data;
storing the data in the securely isolated area.
In one possible embodiment, the method further comprises:
receiving a blocking policy configured by the data provider, the blocking policy including a condition of a data consumer not allowed to access;
correspondingly, the determining whether the data user has the right to access the corresponding data according to the identity of the data user includes:
acquiring an access record of the data user according to the identity;
and determining whether the data user has the right to access the corresponding data according to the access record and the blocking policy.
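As an illustration of combining an access record with a blocking policy, the sketch below uses a request quota per time window as the blocking condition. The application does not specify a concrete rule, so the quota, the window, and the decision to keep a data user blocked once the quota is hit are all assumptions.

```python
import time
from collections import defaultdict, deque


class RiskControl:
    """Hypothetical blocking policy: at most max_requests per window."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._history = defaultdict(deque)   # access record per data user
        self._blocked = set()                # data users whose access is stopped

    def allow(self, consumer_id, now=None):
        now = time.time() if now is None else now
        if consumer_id in self._blocked:
            return False
        history = self._history[consumer_id]
        # Drop access records that fell out of the time window.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        if len(history) >= self.max_requests:
            # Condition of the blocking policy met: stop further access.
            self._blocked.add(consumer_id)
            return False
        history.append(now)
        return True
```

The `now` parameter exists only to make the behavior reproducible in tests; a deployment would rely on the clock.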
In a possible embodiment, the determining whether the data user has access right to the data to be accessed according to the identity of the data user includes:
forwarding the identity of the data user to a corresponding data provider so that the data provider determines whether the data user has access right to the data needing to be accessed;
and receiving a determination result returned by the data provider, wherein the determination result is used for indicating that the data user has access right or does not have access right to the data needing to be accessed.
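The provider-side decision can be sketched as a simple delegation. The class names and the approved-list check are hypothetical; the point is only that the platform forwards the identity and acts on the returned determination result.

```python
class DataProvider:
    """Hypothetical provider that approves a fixed set of data users."""

    def __init__(self, approved):
        self._approved = set(approved)

    def decide(self, consumer_id):
        # Determination result: True means access right, False means none.
        return consumer_id in self._approved


class ProviderBackedPlatform:
    def __init__(self, providers):
        self._providers = providers   # dataset name -> owning DataProvider

    def check(self, consumer_id, dataset):
        provider = self._providers.get(dataset)
        if provider is None:
            return False              # unknown data: deny by default (assumption)
        # Forward only the identity; the provider makes the decision.
        return provider.decide(consumer_id)
```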
In a second aspect, the present application provides a data processing method, including:
sending an access request to a data processing platform, wherein the access request is used for accessing data stored in a security isolation area in the data processing platform, and the access request comprises an identity of a data user;
and if the data user has the right to access the corresponding data, receiving a processing result returned by the data processing platform.
In one possible embodiment, the access request comprises a predictive model training request, and the method further comprises:
and providing prediction requirements and data to the data processing platform, so that the data processing platform performs model training by adopting the data provided to the data processing platform and/or the data in the safety isolation area according to the prediction requirements and codes developed in a preset on-line development environment to obtain a prediction model.
In a possible implementation manner, the access request comprises a user recommendation request, and the processing result comprises a prediction result, wherein the prediction result is obtained by prediction by using a prediction model according to data in the safe isolation area.
In a third aspect, the present application provides a data processing method, including:
receiving an identity of a data user sent by a data processing platform;
determining whether the data user can access the data to be accessed according to the identity;
and returning a determination result to the data processing platform, wherein the determination result is used for indicating that the data user has access right or does not have access right to the data needing to be accessed.
In one possible embodiment, the method further comprises:
and uploading data to a security isolation area of the data processing platform, wherein the data comprises common data and/or encrypted sensitive data.
In one possible embodiment, the method further comprises:
configuring a blocking policy to the data processing platform, the blocking policy including a condition of a data consumer that is not allowed access.
In a fourth aspect, the present application provides an apparatus for processing data, comprising:
the receiving module is used for receiving an access request sent by a data user, wherein the access request is used for accessing data stored in a security isolation area, and the access request comprises an identity of the data user;
the processing module is used for determining whether the data user has the authority of accessing the corresponding data according to the identity of the data user;
if the data user is determined to have the right to access the corresponding data, the processing module is further configured to analyze and process the corresponding data in the security isolation area according to the access request to obtain a processing result;
and the sending module is used for returning the processing result to the data user.
In one possible embodiment,
the processing module is further configured to record an access record of the data consumer to the data in the secure enclave.
In one possible embodiment, the access request comprises a predictive model training request;
the processing module is specifically configured to:
according to the prediction demand provided by the data user and the code developed in the preset on-line development environment, model training is carried out by adopting the data provided by the data user and/or the data in the safety isolation area to obtain a prediction model; wherein the processing result comprises the predictive model;
deploying the predictive model in an online predictive module.
In one possible embodiment, the access request comprises a user recommendation request;
the processing module is specifically configured to:
performing prediction processing according to corresponding data in the safety isolation area by adopting a prediction model in an online prediction module to obtain a prediction result; the processing result comprises the prediction result.
In a possible embodiment, the apparatus further comprises: a storage module;
the receiving module is also used for receiving data uploaded by a data provider; the data comprises common data and/or encrypted sensitive data;
the storage module is to store the data in the secure enclave.
In a possible implementation, the receiving module is further configured to:
receiving a blocking policy configured by the data provider, the blocking policy including a condition of a data consumer not allowed to access;
correspondingly, the processing module is further configured to:
obtaining an access record of the data user according to the identity;
and determining whether the data user has the right to access the corresponding data according to the access record and the blocking policy.
In a possible embodiment, the sending module is further configured to forward the identity of the data consumer to the corresponding data provider, so that the data provider determines whether the data consumer has access right to the data that needs to be accessed;
the receiving module is further configured to receive a determination result returned by the data provider, where the determination result is used to indicate that the data consumer has access right or does not have access right to the data that needs to be accessed.
In a fifth aspect, the present application provides an apparatus for processing data, comprising:
the sending module is used for sending an access request to a data processing platform, wherein the access request is used for accessing data stored in a security isolation area in the data processing platform, and the access request comprises an identity of a data user;
and the receiving module is used for receiving the processing result returned by the data processing platform if the data user has the right of accessing the corresponding data.
In one possible implementation, the access request includes a prediction model training request, and the sending module is further configured to:
and providing prediction requirements and data to the data processing platform, so that the data processing platform performs model training by adopting the data provided to the data processing platform and/or the data in the safety isolation area according to the prediction requirements and codes developed in a preset on-line development environment to obtain a prediction model.
In a possible implementation manner, the access request comprises a user recommendation request, and the processing result comprises a prediction result, wherein the prediction result is obtained by prediction by using a prediction model according to data in the safe isolation area.
In a sixth aspect, the present application provides an apparatus for processing data, comprising:
the receiving module is used for receiving the identity of the data user sent by the data processing platform;
the processing module is used for determining whether the data user can access the data to be accessed according to the identity;
and the sending module is used for returning a determination result to the data processing platform, and the determination result is used for indicating that the data user has access authority or does not have access authority on the data needing to be accessed.
In a possible implementation, the sending module is further configured to:
and uploading data to a security isolation area of the data processing platform, wherein the data comprises common data and/or encrypted sensitive data.
In a possible implementation, the sending module is further configured to:
configuring a blocking policy to the data processing platform, the blocking policy including a condition of a data consumer that is not allowed access.
In a seventh aspect, the present application provides an electronic device, comprising:
at least one processor, a memory, and an interface to communicate with other electronic devices;
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method of processing data according to any one of the first aspect.
In an eighth aspect, the present application provides an electronic device, comprising:
at least one processor, a memory, and an interface to communicate with other electronic devices;
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method of processing data according to any of the second aspect.
In a ninth aspect, the present application provides an electronic device comprising:
at least one processor, a memory, and an interface to communicate with other electronic devices;
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method of processing data according to any one of the third aspects.
In a tenth aspect, the present application provides a system for processing data, comprising: the electronic device of the seventh aspect, the electronic device of the eighth aspect, and the electronic device of the ninth aspect.
In an eleventh aspect, the present application provides a non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of processing data of any one of the first to third aspects.
In a twelfth aspect, the present application provides a data processing method, including:
storing data uploaded by a data provider in a secure isolation area;
and determining that the data user can use the data in the secure isolation area according to the identity of the data user.
One embodiment of the above application has the following advantages or benefits. The data processing platform receives an access request that is sent by a data user and carries an identity. The access request is used to access data stored in a security isolation area, and the data in the security isolation area cannot be copied. The platform determines, according to the identity, whether the data user has the right to access the corresponding data. If so, the platform analyzes and processes the corresponding data in the security isolation area according to the access request to obtain a processing result, and returns the processing result to the data user. Because the data provider uploads data to a security isolation area that cannot be copied, and the data user can use the data only on the platform, data leakage is effectively prevented and data security is improved.
Other effects of the above-described alternative will be described below with reference to specific embodiments.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
Fig. 1 is an application scenario of the data processing method provided in the present application;
Fig. 2 is another application scenario of the data processing method provided in the present application;
Fig. 3 is a flowchart of a first embodiment of a data processing method provided in the present application;
Fig. 4 is a flowchart of a second embodiment of a data processing method provided in the present application;
Fig. 5 is a flowchart of a third embodiment of a data processing method provided in the present application;
Fig. 6 is a schematic flow chart of a recommendation system provided in the present application;
Fig. 7 is a schematic flow chart of a data joint enhancement recommendation system provided in the present application;
Fig. 8 is a schematic structural diagram of a first embodiment of a data processing apparatus provided in the present application;
Fig. 9 is a schematic structural diagram of a second embodiment of a data processing apparatus provided in the present application;
Fig. 10 is a schematic structural diagram of a third embodiment of a data processing apparatus provided in the present application;
Fig. 11 is a schematic structural diagram of a fourth embodiment of a data processing apparatus provided in the present application;
Fig. 12 is a block diagram of an electronic device for implementing a data processing method according to an embodiment of the present application.
Detailed Description
The following description of the exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments of the application to assist in understanding, which are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
With the development of internet technology, and in order to meet the requirements of various internet users, a recommendation system provider can enrich user profiles with search data provided by a search engine or with data from third-party data providers, thereby improving the recommendation effect for its clients. In a user cold-start recommendation scenario, for example, third-party data brings a large gain in recommendation quality. When a client uses the recommendation capability and the third-party data offered by the service provider, the client must upload its own data to the service provider, where the data of the client and of the service provider are fused and the recommendation results for users are computed.
This raises a security problem. On one hand, the data provider wants the client to use the provider's data only, with no risk of leakage. On the other hand, the client wants to optimize the recommendation algorithm with the provider's data, and wants its own uploaded data not to be leaked.
Existing approaches to recommendation optimization over data from multiple parties generally fall into two kinds. In the federated learning scheme, the key recommendation step is reduced to model training: each data provider's own data never leaves its local environment, and only feature data computed and encrypted by the model training algorithm is exchanged among the parties, which optimizes the model. In the agreement-based authorization scheme, the data provider uses an agreement to prohibit the data user from dumping or extracting the data and from using it for any purpose other than recommendation.
However, the federated learning scheme is not flexible enough: computation can only follow a modeling process agreed in advance, so the scheme cannot meet the recommendation optimization needs of different customers in different fields. Its latency is also insufficient: under the federated learning framework, each data provider must homomorphically encrypt its own data before transmitting it, so customers with strict latency requirements cannot be served.
Agreement-based authorization restricts the data user only legally; it cannot technically prevent the data user from leaking the data, so the data security requirements of the data provider cannot be met. In particular, as the number of data providers grows, the overall process becomes uncontrollable for the platform and data security incidents are likely. In other words, current solutions cannot fully guarantee data security.
Based on the above security problems and the need to serve varied optimization requirements, the present application provides a data processing scheme. The overall idea is to design a system based on a data security domain: a platform dedicated to data management and processing, hereinafter referred to as the data processing platform, isolates the data user from the data provider. The data provider uploads data to a security isolation area on the platform for storage, and data in the security isolation area cannot be copied. When a data user needs to use the data, the data user applies for permission and uses the data directly on the data processing platform, obtaining only the processing result. This effectively prevents data leakage and improves data security.
The following describes a data processing method provided in the present application with reference to specific embodiments.
Fig. 1 is an application scenario of the data processing method provided in the present application. As shown in fig. 1, the scheme may be applied to a data processing system that includes a data processing platform, at least one data user, and at least one data provider. The data processing platform includes a data processing module and a security isolation area corresponding to each data provider, which stores the data uploaded by that data provider; data in the security isolation area cannot be copied. The data processing platform provides different services to the data provider and the data user through a security gateway.
Fig. 2 is another application scenario of the data processing method provided in the present application. As shown in fig. 2, a dedicated data management platform may be deployed independently of the data processing platform, between the data user and the data processing platform. The "data calculation" component of the data platform in fig. 2 is the part that processes data and is equivalent to the data processing module in fig. 1. The data user may apply to the data provider for permission through the data management platform; this scheme imposes no limitation in this respect.
In these application scenarios, the security gateway provides at least the following functions: authorization service, data desensitization, data risk control, watermark service, and behavior audit. More functions may be configured according to the actual application, which is not limited in this scheme.
Authorization service: the data user applies for permission, and the data provider reviews the application and manages permissions at the granularity of rows, columns, labels, security levels, and the like, including granting and revoking permissions. Every data operation within the security isolation area requires the associated permission.
Data desensitization: the data provider encrypts the data that requires desensitization. For desensitized data, the data user can see only the encrypted attribute values.
Data risk control: access requests from data users are blocked in real time. The data provider can configure a blocking policy as needed and add complex access-quota rules; once the configured conditions are met, the data user's access permission is suspended, which gives the data provider risk control over its data.
Watermark service: data accessed by a data user is watermarked for that data user, so that the channel and source of a data leak can be traced. In other words, access records are stored, so that every use of the data is documented.
Behavior audit: a user behavior auditing function records the content of the data requests that users make to a data provider and of the corresponding responses, and displays user behavior logs on a log auditing platform.
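Several of the gateway functions listed above can be combined in one small sketch. Everything here is illustrative: the class and function names are hypothetical, permissions are reduced to allowed columns plus a maximum row security level, the watermark is a keyed tag attached to the response (a real watermark would typically be embedded in the data itself), and a keyed hash stands in for the encryption used in desensitization.

```python
import hashlib
import hmac
import json


def desensitize(record, sensitive_fields, key):
    """Provider side: replace sensitive values before upload.

    A keyed SHA-256 digest is used as a stand-in for encryption (assumption).
    """
    return {
        name: hmac.new(key, str(value).encode("utf-8"), hashlib.sha256).hexdigest()
        if name in sensitive_fields else value
        for name, value in record.items()
    }


class SecurityGateway:
    """Toy gateway combining authorization, watermarking and auditing."""

    def __init__(self, secret):
        self._secret = secret     # hypothetical platform-held key
        self._grants = {}         # consumer_id -> (allowed columns, max level)
        self.audit_log = []       # request/response records for later review

    def authorize(self, consumer_id, columns, max_level):
        self._grants[consumer_id] = (frozenset(columns), max_level)

    def revoke(self, consumer_id):
        self._grants.pop(consumer_id, None)

    def _view(self, consumer_id, rows):
        # Row filter by security level, column filter by granted columns.
        columns, max_level = self._grants[consumer_id]
        return [
            {c: v for c, v in row.items() if c in columns}
            for row in rows
            if row["level"] <= max_level
        ]

    def _watermark(self, consumer_id, payload):
        body = json.dumps(payload, sort_keys=True).encode("utf-8")
        msg = consumer_id.encode("utf-8") + b"|" + body
        return hmac.new(self._secret, msg, hashlib.sha256).hexdigest()

    def query(self, consumer_id, rows, column):
        """Return the mean of one column over the rows the grant covers."""
        if consumer_id not in self._grants:
            response = {"ok": False, "error": "permission denied"}
        else:
            view = self._view(consumer_id, rows)
            values = [r[column] for r in view if column in r]
            if not values:
                response = {"ok": False, "error": "no accessible data"}
            else:
                payload = {"column": column, "mean": sum(values) / len(values)}
                mark = self._watermark(consumer_id, payload)
                response = {"ok": True, "payload": payload, "watermark": mark}
        # Behavior audit: record every request together with its response.
        self.audit_log.append(
            {"consumer": consumer_id, "request": column, "response": response}
        )
        return response

    def trace(self, leaked, known_consumers):
        """Find which data user a leaked, watermarked response was issued to."""
        for consumer_id in known_consumers:
            expected = self._watermark(consumer_id, leaked["payload"])
            if hmac.compare_digest(expected, leaked["watermark"]):
                return consumer_id
        return None
```

Because the keyed digest in `desensitize` is deterministic, equal sensitive values still map to equal tokens, so desensitized fields remain usable as join keys while their plaintext stays hidden from the data user.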
Fig. 3 is a flowchart of a first embodiment of a data processing method provided in the present application, and as shown in fig. 3, the data processing method provided in the present embodiment specifically includes the following steps:
S101: Send an access request to the data processing platform, where the access request is used to access data stored in the security isolation area of the data processing platform and includes the identity of the data user.
In this step, when a data user needs to use some data, the data user may interact through a security gateway provided by the data processing platform to initiate a data access process, and specifically may send an access request to the data processing platform, where the access request at least needs to carry an identity of the data user and may also carry related data, and this scheme is not limited.
For the data processing platform, an access request sent by a data user is received, where the access request is used to access data stored in the secure enclave, and the access request includes an identity of the data user.
The access request may be a query request, or may be a request for data analysis processing, for example: may be a user recommendation request, or may be a model training request, etc.
S102: Determine whether the data user has the right to access the corresponding data according to the identity of the data user.
In this step, after receiving an access request for accessing data in the data security isolation area, the data processing platform needs to first determine, according to the identity identifier therein, whether the data user has an authority to access the data in the data security isolation area, and if not, may reject the access request, and if so, may perform processing according to a specific access manner carried in the access request.
For example, the data processing platform may determine the authority of the data user at least in two ways:
In a first way, if the data provider has configured a blocking policy that specifies the conditions under which a data user is not allowed access, the data processing platform may determine, based on the conditions in the blocking policy, whether the data user is allowed to access the data in the security isolation area; if not, the data user is determined to have no access right, and if so, the data user is determined to have the access right.
For example, the data processing platform may also determine whether the data consumer has the right to access the corresponding data in conjunction with the blocking policy and the access record of the data consumer.
That is, obtaining the access record of the data user according to the identity; and determining whether the data user has the right to access the corresponding data according to the access record and the blocking policy.
In a second way, the data processing platform may forward the access request to a data provider, the data provider verifies the identity of the data consumer, determines whether to allow the data consumer to access the data, and after the data provider determines, returns a determination result to the data processing platform.
That is, the data processing platform forwards the identity of the data user to the corresponding data provider, so that the data provider determines whether the data user has access right to the data to be accessed; and then the data processing platform receives a determination result returned by the data provider, wherein the determination result is used for indicating that the data user has access right or does not have access right to the data needing to be accessed.
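The first way described above, in which the blocking policy is combined with the data user's access record, can be sketched as follows. This is a minimal sketch under assumptions: the policy fields (`blocked_user_ids`, `blocked_ips`, `max_requests_per_window`) and the one-hour quota window are hypothetical, since the application only says that the policy contains conditions and may include access quota rules.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class BlockingPolicy:
    """Conditions under which a data user is NOT allowed access (hypothetical fields)."""
    blocked_user_ids: Set[str] = field(default_factory=set)
    blocked_ips: Set[str] = field(default_factory=set)
    max_requests_per_window: int = 100  # assumed quota rule


@dataclass
class AccessRecord:
    user_id: str
    timestamp: float  # seconds
    operation: str


def has_access(user_id: str, source_ip: str, now: float,
               policy: BlockingPolicy, records: List[AccessRecord],
               window_seconds: float = 3600.0) -> bool:
    """Decide access from the identity, the blocking policy and the access record."""
    # Static conditions: the identity or address is explicitly blocked.
    if user_id in policy.blocked_user_ids or source_ip in policy.blocked_ips:
        return False
    # Quota condition: count this user's requests inside the recent window.
    recent = [r for r in records
              if r.user_id == user_id and now - r.timestamp <= window_seconds]
    return len(recent) < policy.max_requests_per_window
```

Keeping the policy as plain data means the data provider can change it without touching platform code, which matches the "configure as needed" wording of the description.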
S103: If it is determined that the data user has the right to access the corresponding data, analyze and process the corresponding data in the security isolation area according to the access request to obtain a processing result.
S104: Return the processing result to the data user.
In these steps, if it is determined that the data user has the right to access the data in the security isolation area, the data processing platform analyzes and processes the data according to the access request to obtain a corresponding processing result, such as a data analysis result, a user recommendation result, or a model training result. The obtained processing result is then returned to the data user.
Optionally, in a specific embodiment, the data processing platform may record every access to the data by each data user to obtain a corresponding access record. The access record may include the identity of the data user, the access time, the content of the access operation, and so on, so that data problems can be traced later. That is, the access record of the data user for the data in the security isolation area is recorded.
According to the data processing method provided by the embodiment, the data processing platform determines whether a data user has permission to access the data in the security isolation area according to the identity in the received access request, the data provider can authorize the data user, the data provider uploads the data to the security isolation area which cannot be copied, the data user can only use the data on the platform, the data is effectively prevented from being leaked, and the data security is improved.
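Steps S101 to S104, together with the optional access record, can be pictured with the following minimal sketch. It assumes a request is a dictionary with `user_id`, `dataset`, and `operation` keys and that only two toy aggregate operations exist; the class name `DataProcessingPlatform` and all field names are hypothetical. The point it illustrates is that only a processing result leaves the platform, never the stored data.

```python
import time
from typing import Any, Callable, Dict, List


class DataProcessingPlatform:
    """Toy platform: data stays in the secure area, only results are returned."""

    def __init__(self, authorize: Callable[[str], bool]) -> None:
        self._secure_area: Dict[str, List[float]] = {}  # never returned directly
        self._authorize = authorize                      # stands in for S102
        self.access_log: List[Dict[str, Any]] = []

    def store(self, dataset: str, values: List[float]) -> None:
        self._secure_area[dataset] = list(values)

    def handle(self, request: Dict[str, Any]) -> Dict[str, Any]:
        user_id = request["user_id"]            # S101: request carries the identity
        allowed = self._authorize(user_id)      # S102: permission check
        self.access_log.append({"user_id": user_id, "time": time.time(),
                                "operation": request["operation"],
                                "allowed": allowed})
        if not allowed:
            return {"status": "rejected"}
        values = self._secure_area.get(request["dataset"])
        if not values:
            return {"status": "not_found"}
        # S103: analysis runs inside the platform on the stored data.
        if request["operation"] == "mean":
            result: float = sum(values) / len(values)
        elif request["operation"] == "count":
            result = len(values)
        else:
            return {"status": "unsupported"}
        return {"status": "ok", "result": result}  # S104: only the result is returned
```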
On the basis of the foregoing embodiment, in a possible implementation manner, the access request may be a prediction model training request, that is, the data user needs to perform model training according to the data in the security isolation area. A specific implementation process is as follows.
Fig. 4 is a flowchart of a second embodiment of the data processing method provided in the present application. As shown in fig. 4, when the access request is a prediction model training request, step S103 of the foregoing embodiment specifically includes the following steps:
S1031: According to the prediction requirement provided by the data user and the code developed in a preset online development environment, perform model training using the data provided by the data user and/or the data in the security isolation area to obtain a prediction model.
Wherein the processing result comprises the predictive model.
In this step, the data processing platform may perform offline batch computation on the data in the security isolation area in advance, including offline algorithm computation and data processing computation. The algorithm computation includes collaborative filtering, association rules, and content-based item similarity computation, and its results are written into the index system. The data processing module may also process and clean the data uploaded by the data user, join different types of data, generate samples for model training, and compute statistics on the samples to obtain the distribution of the different fields, which guides model training. Finally, the necessary data processing results, such as the forward and inverted indexes of items, are written into the index system for use by the online system, where the online system refers to the running system of the data processing platform, that is, the system on which data users and data providers operate.
When the model training is carried out, a data user can provide prediction requirements and develop in an online development environment provided by the data processing platform, namely, codes are developed, and then training calculation of models such as logistic regression, FM and deep learning can be carried out according to data in a safety isolation area and/or data provided by the data user, so that a prediction model required by the data user is obtained.
That is, the data user provides the prediction requirement and data to the data processing platform, and the data processing platform performs model training, according to the prediction requirement and the code developed in the preset online development environment, using the data provided to it and/or the data in the security isolation area, to obtain a prediction model.
S1032: the prediction model is deployed in an online prediction module.
In this step, the obtained prediction model may be deployed in an online prediction module, which is also part of the data processing platform. After the data in the subsequent safety isolation area is updated and the data user needs to predict, the data can be analyzed and processed according to the prediction model to obtain a prediction result.
Optionally, model training and evaluation are also performed in an online training environment provided by the data processing platform. The user evaluates the model as required; if the requirements are met, the model is audited and then deployed to the online model prediction module. If the requirements are not met, the user returns to the research environment and modifies the modeling code, repeating this as often as needed, so that a model that better fits the data user's requirements is trained.
For example, in a specific implementation manner, when the access request sent by the data user is a user recommendation request, prediction may be performed according to the prediction model in the online prediction module and the corresponding data in the security isolation area to obtain a prediction result, and the result is returned to the data user, in which case, the processing result is the prediction result here.
In the data processing method provided by this embodiment, an online model training function is provided in the data processing platform, a data user develops a corresponding code in an online development environment by sending an access request, performs model training, deploys a prediction model obtained by training in an online prediction module, and then may also directly perform prediction for the data user according to data continuously updated in the data processing platform to obtain a corresponding prediction result. In the whole process, the data provider and the data user are isolated, and meanwhile, the process that the data user directly contacts with the real data is blocked, so that the safety of the data is further ensured. And different data users in different fields can flexibly train the required prediction model according to respective requirements, and the requirements of various types of customers can be met.
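The training and deployment steps S1031 and S1032 can be sketched as below. Logistic regression is one of the model types the description names; the plain batch gradient descent used here, and the names `LogisticModel`, `train_logistic_regression`, and `OnlinePredictionModule`, are assumptions for illustration and not the platform's actual training code.

```python
import math
from typing import Dict, List, Sequence


def _sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


class LogisticModel:
    """Trained model: holds weights only, not the training data."""

    def __init__(self, weights: List[float], bias: float) -> None:
        self.weights = weights
        self.bias = bias

    def predict(self, x: Sequence[float]) -> float:
        return _sigmoid(self.bias + sum(w * xi for w, xi in zip(self.weights, x)))


def train_logistic_regression(features: List[List[float]], labels: List[int],
                              epochs: int = 500, lr: float = 0.5) -> LogisticModel:
    """S1031 (sketch): fit a model by batch gradient descent on the log loss."""
    n = len(features[0])
    m = len(features)
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for x, y in zip(features, labels):
            err = _sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return LogisticModel(weights, bias)


class OnlinePredictionModule:
    """S1032 (sketch): registry that serves deployed models by name."""

    def __init__(self) -> None:
        self._models: Dict[str, LogisticModel] = {}

    def deploy(self, name: str, model: LogisticModel) -> None:
        self._models[name] = model

    def predict(self, name: str, x: Sequence[float]) -> float:
        return self._models[name].predict(x)
```

The data user interacts only with `OnlinePredictionModule.predict`, which returns a score; the training samples stay inside the platform.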
On the basis of the two embodiments described above, all data processing procedures are processing of data in the secure isolation area, and a storage procedure of data is described below by a specific implementation.
Fig. 5 is a flowchart of a third embodiment of a data processing method provided in the present application, and as shown in fig. 5, the data processing method provided in the present embodiment includes the following steps:
S201: Receive data uploaded by a data provider; the data includes ordinary data and/or encrypted sensitive data.
In this step, the data provider uploads data through an interface between the data provider and the data processing platform. The data may include ordinary data, that is, non-sensitive data that does not need to be encrypted, and/or encrypted sensitive data. For sensitive data, the data provider may perform desensitization in advance and encrypt the data according to a certain rule, so that in subsequent use every data user sees only encrypted attribute values rather than the real data.
For the data processing platform, the data uploaded by the data provider is received. The data may be information such as user behavior logs, item information, and user profiles, but is not limited thereto.
S202: the data is stored in a securely isolated area.
In this step, after receiving the data uploaded by the data provider, the data processing platform stores the data in the security isolation area corresponding to that data provider. The data processing platform may include one or more security isolation areas. If it includes a single security isolation area, the data of all data providers is stored in that area; alternatively, it may include a plurality of security isolation areas, with each data provider corresponding to one security isolation area. This is not limited in this scheme.
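Steps S201 and S202 can be sketched as follows for the variant in which each data provider has its own security isolation area. The application only says that sensitive data is encrypted "according to a certain rule"; the keyed hash (HMAC-SHA256) used here is a one-way pseudonymization chosen as a stand-in, and `desensitize` and `SecureIsolationStore` are hypothetical names.

```python
import hashlib
import hmac
from typing import Dict, Iterable, List


def desensitize(record: Dict[str, str], sensitive_fields: Iterable[str],
                provider_key: bytes) -> Dict[str, str]:
    """Provider side: replace sensitive values with keyed digests before upload."""
    out = dict(record)
    for name in sensitive_fields:
        if name in out:
            out[name] = hmac.new(provider_key, out[name].encode("utf-8"),
                                 hashlib.sha256).hexdigest()
    return out


class SecureIsolationStore:
    """Platform side: one isolation area per data provider (assumed layout)."""

    def __init__(self) -> None:
        self._zones: Dict[str, List[Dict[str, str]]] = {}

    def store(self, provider_id: str, records: List[Dict[str, str]]) -> None:
        # Copy records so later changes by the caller do not affect stored data.
        self._zones.setdefault(provider_id, []).extend(dict(r) for r in records)

    def zone_size(self, provider_id: str) -> int:
        return len(self._zones.get(provider_id, []))
```

Because the same input and key always yield the same digest, desensitized identifiers can still be joined across datasets from the same provider even though the real values are never visible on the platform.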
S203: a blocking policy configured by a data provider is received, the blocking policy including a condition of a data consumer not allowed access.
This step S203 is an optional step. The data provider may further provide a blocking policy that specifies conditions for data consumers that are not allowed to access the data. For example, the identifier of the data user that is not allowed to use, the IP address, etc. may be used, which is not limited.
Correspondingly, in step S102, determining whether the data user has the right to access the corresponding data according to the identity of the data user includes:
obtaining an access record of the data user according to the identity;
and determining whether the data user has the right to access the corresponding data according to the access record and the blocking policy.
According to the data processing method provided by this embodiment, the data provider uploads data to the security isolation area of the data processing platform, and a plurality of measures are adopted to ensure data security. Before the data provider's data enters the isolation area, the data can be desensitized and encrypted as required. When a data user applies to use the data, the data provider can grant authorization at the field level and audit the user. During use, the data provider and the recommendation function service provider (the data processing platform side) can audit the user's code to ensure data security. After use, data watermarks make it easy for the platform side to track where the data goes. Data security is thereby effectively improved.
With reference to any of the above embodiments, the following describes a data processing method provided by the present application by taking a recommendation system specifically applied to a user recommendation process as an example.
Fig. 6 is a schematic flow diagram of a recommendation system provided in the present application, and as shown in fig. 6, a specific implementation process of the recommendation system is as follows:
1. The data provider uploads in-application data, including information such as user behavior logs, item information, and user profiles, through an interface. The uploaded data may be stored in a persistent storage unit on the platform, such as the Hadoop Distributed File System (HDFS) or Baidu Object Storage (BOS).
2. The offline computation task is implemented by a processing module provided by the data processing platform. The persistent storage unit is divided into a plurality of security isolation areas for data storage. On this basis, the offline computation task uses the data in the persistent storage unit to perform offline batch computation, including offline algorithm computation and data processing computation. The algorithm computation includes collaborative filtering, association rules, content-based item similarity computation, and the like, and its results are written into the index system. The data processing module processes and cleans the data uploaded by a client, joins different types of data, generates samples for model training, and computes statistics on the samples to obtain the distribution of the different fields, which guides model training. Finally, the necessary data processing results, such as the forward and inverted indexes of items, are written into the index system for use by the online system.
3. The model training module is also implemented in the processing module of the data processing platform. It reads the samples used for estimation, performs training computation for models such as logistic regression, FM, and deep learning, and after a model is generated deploys it to a model service for use by the online module. In particular, in the joint recommendation system for data security, the data provider expands the request samples of the model service through the model service itself, so that the data user is blocked from directly contacting the real data.
4. The online server receives a recommendation request from a client, authenticates the data requester, and performs recall in different ways (for example, vectorizing the user with a model and then requesting the closest item vectors from a vector index, or using non-desensitized user attributes to look up the items matching those attributes in the inverted index). After items are recalled, the forward index is queried to fill in the item content. The items are then filtered according to business requirements. The items from the different recall methods are merged, and the model service is used to estimate business metrics of the items, such as click-through rate and conversion rate. A score is computed for each sample by combining factors such as the estimated value, timeliness, and recall algorithm weight; the list is then truncated, and finally the recommended item queue is scattered to ensure a diverse presentation. The result is then returned to the recommendation server of the data user.
5. The index system includes, but is not limited to, the following. Inverted index of items: the content information of an item includes information such as its tags and topics; if the mapping from tag or topic to item id is stored in the index, the online server can quickly recall the corresponding items by tag or topic. Forward index of items: the content information of an item includes, but is not limited to, tags, topics, content, price, author, and publisher; because the recommendation system needs to display this information for the ranked items, an index from item id to all of the item's content information needs to be stored so that the online system can retrieve it quickly. Vector index: one of the recommendation algorithms commonly used in the recommendation industry vectorizes items and users and performs recall by using the user vector to find the nearest item vectors.
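The three index types in item 5 can be sketched in a few lines. The forward index is simply a mapping from item id to content, the inverted index maps each tag to the ids that carry it, and the vector lookup here is a brute-force cosine-similarity scan standing in for a real vector index. The function names and the `tags` field layout are assumptions for illustration.

```python
import math
from collections import defaultdict
from typing import Any, Dict, List, Sequence, Set


def build_inverted_index(forward: Dict[str, Dict[str, Any]]) -> Dict[str, Set[str]]:
    """Invert a forward index (item id -> content) into tag -> item ids."""
    inverted: Dict[str, Set[str]] = defaultdict(set)
    for item_id, info in forward.items():
        for tag in info["tags"]:
            inverted[tag].add(item_id)
    return dict(inverted)


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_items(user_vec: Sequence[float],
                  item_vecs: Dict[str, Sequence[float]], k: int) -> List[str]:
    """Brute-force vector recall: the k item ids most similar to the user vector."""
    ranked = sorted(item_vecs, key=lambda i: cosine(user_vec, item_vecs[i]),
                    reverse=True)
    return ranked[:k]
```

For example, with a forward index `{"a1": {"tags": ["sports"]}, "a2": {"tags": ["sports", "news"]}}`, looking up `"news"` in the inverted index yields `{"a2"}`, and the forward index then supplies the content of `a2` for display.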
In the above process, the provided online server, the data service, the model deployment, the offline computation task, the index system, the model training, and the persistent storage unit are all implemented in the data processing platform in the architecture shown in fig. 1 or fig. 2, and may provide interfaces for the data user and the data provider to perform data interaction, for example, may provide corresponding clients to enable the data user and the data provider to access.
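The later stages of the online server's work, namely merging recall results, scoring, truncating, and scattering, can be sketched as follows. The linear score and its weights (0.7, 0.2, 0.1) are invented for illustration, since the description only lists the factors that are combined; the round-robin scatter by category is likewise one simple way to diversify a queue, not necessarily the one used by the platform.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List


@dataclass
class Candidate:
    item_id: str
    category: str
    estimate: float        # e.g. predicted click-through rate from the model service
    freshness: float       # timeliness factor in [0, 1]
    channel_weight: float  # weight of the recall algorithm that produced the item


def merge_recalls(recall_lists: Dict[str, List[str]]) -> Dict[str, str]:
    """Merge recall channels, keeping the first channel that produced each item."""
    merged: Dict[str, str] = {}
    for channel, item_ids in recall_lists.items():
        for item_id in item_ids:
            merged.setdefault(item_id, channel)
    return merged


def score(c: Candidate) -> float:
    # Hypothetical weights; the description does not give a formula.
    return 0.7 * c.estimate + 0.2 * c.freshness + 0.1 * c.channel_weight


def rank_and_truncate(candidates: List[Candidate], limit: int) -> List[Candidate]:
    return sorted(candidates, key=score, reverse=True)[:limit]


def scatter(ranked: List[Candidate]) -> List[Candidate]:
    """Round-robin over categories so one category does not fill adjacent slots."""
    queues: Dict[str, Deque[Candidate]] = {}
    for cand in ranked:
        queues.setdefault(cand.category, deque()).append(cand)
    result: List[Candidate] = []
    while queues:
        for category in list(queues):
            result.append(queues[category].popleft())
            if not queues[category]:
                del queues[category]
    return result
```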
Fig. 7 is a schematic flow diagram of the data joint enhancement recommendation system provided in the present application, and as shown in fig. 7, a specific implementation process of the data joint enhancement recommendation system is as follows:
1. Data of the data user and data of a third-party data provider are transmitted to the security isolation area of the data processing platform through an interface. The data of the data user is the data the customer uses in its recommendation field, such as commodity information and users' click and purchase behavior in the e-commerce recommendation field. The customer's data is needed in both the research environment and the online training environment. The data of the data provider is data that enhances the customer's recommendation service; the data user initiates an application, and the data provider authorizes the data and performs desensitization and encryption as needed. After entering the security isolation area, the data provider's data is used for actual online training on the one hand, and on the other hand enters the provider isolation area to establish a data service for the online recommendation service. Whether it is the data user's own data or the data provider's data, it can be accessed only by machines on a designated IP whitelist within the security isolation area.
2. The development isolation domain of the data user, that is, the development environment in which the customer can operate, is mainly divided into three parts: a research environment, an online training environment, and an online service.
3. Research environment: the data provider provides sample data to the data user in the research environment for code-level modeling, and the data user decrypts its own data, converts it into features, and fuses it with the data provider's sample data. Meanwhile, the data user can raise data analysis requirements, have feature analysis performed on the data of specific groups, and determine the characteristics of the data provided by the data provider. After the data user has determined the data range and the method of use, it develops the modeling code.
4. Online training environment: the modeling code developed by the data user is synchronized to the online training environment, in which the data provider's full data is accessible. However, the data user can only run the code developed in the development environment and cannot directly inspect the full data; data analysis requirements are audited and carried out by personnel of the recommendation service provider, and the results are returned to the data user. Model training and evaluation are also performed in the online training environment. The user evaluates the model as required; if the requirements are met, the model is audited and then deployed to the online model prediction module. If the requirements are not met, the user returns to the research environment to modify the modeling code, repeating this as often as needed.
5. Online service: within the security isolation area, the data user can develop an online service according to its business conditions, respond to external recommendation requests, use the data provider's data to make recommendation predictions for users, and implement a series of business-related logic, including but not limited to filling in recommended item information, ensuring diversity, and filtering and deduplicating recommended items.
6. After the data user has completed research, online training, and online service development and deployment, a service for external calls can be built. Requests to the deployed external service are authenticated and forwarded through the cloud environment of the recommendation service provider to the online service in the isolation domain; the online service requests the model prediction service as needed; the model prediction service accesses the data feature service in the data provider's isolation domain to complete the recommendation prediction request; and the result is returned.
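The rule in item 1 that data can be accessed only by machines on a designated IP whitelist can be checked with Python's standard `ipaddress` module, as in the sketch below. The function names and the idea of expressing the whitelist as a list of addresses and CIDR ranges are assumptions; the description does not say how the whitelist is represented.

```python
import ipaddress
from typing import Iterable, List, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


def parse_whitelist(entries: Iterable[str]) -> List[Network]:
    """Parse single addresses and CIDR ranges into network objects."""
    # strict=False tolerates entries whose host bits are set, e.g. "10.0.0.5/24".
    return [ipaddress.ip_network(entry, strict=False) for entry in entries]


def is_whitelisted(source_ip: str, whitelist: List[Network]) -> bool:
    """True if the source address falls inside any whitelisted network."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in whitelist)
```

A single address such as `"192.168.1.5"` is parsed as a one-address network, so individual machines and whole ranges can be mixed in the same whitelist.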
In the recommendation system provided in the above example, the whole data-enhanced recommendation system is deployed on the cloud of the recommendation service provider, that is, the data processing platform may be deployed on the service provider's servers. The customer may purchase the required resources itself, and the storage and computing resources required by each step are bound to the resources purchased by the customer, which reduces resource consumption. The security isolation area designed in this scheme uses a plurality of measures to guarantee data security. Before the data provider's data enters the isolation area, the data can be desensitized and encrypted as needed. When a user applies to use the data, the data provider can grant authorization at the field level and audit the user. During use, the data provider and the recommendation function service provider (the platform side) can audit the data user's code to ensure data security. After use, data watermarks make it easy for the platform side to track where the data goes. After obtaining the data samples, the data user can customize its strategy according to its own business requirements and carry out code-level development in the development environment provided by the platform side; meanwhile, the platform side can upgrade the development environment and process according to the requirements of different services.
According to the data processing method, data uploaded by a data provider are stored in a safe isolation area; and determining that the data user can use the data in the secure isolation area according to the identity of the data user. Compared with the prior art, the method has the following advantages:
Flexible combination: together with code auditing, the data user can develop, in the environment provided by the platform side and according to its own business requirements, any recommendation logic that does not violate data security.
Data security: based on the security isolation area scheme, data is used only under authorization, the data provider encrypts the data according to its security level, and data operations and data destinations can be tracked.
Latency meets requirements: unlike federated learning, the scheme does not require homomorphic encryption with its longer latency, so the online latency requirements of general internet services can be met while data security is ensured.
Fig. 8 is a schematic structural diagram of a first embodiment of a data processing apparatus provided in the present application. As shown in fig. 8, the apparatus may be integrated in or implemented by an electronic device, and the electronic device may be a server, a cloud server, a computer, or other service devices, which is not limited thereto. The data processing device 10 comprises:
a receiving module 11, configured to receive an access request sent by a data consumer, where the access request is used to access data stored in a secure isolation area, and the access request includes an identity of the data consumer;
the processing module 12 is configured to determine whether the data user has an authority to access the corresponding data according to the identity of the data user;
if the data user is determined to have the right to access the corresponding data, the processing module is further configured to analyze and process the corresponding data in the security isolation area according to the access request to obtain a processing result;
and the sending module 13 is configured to return the processing result to the data user.
In an alternative implementation, the processing module 12 is further configured to record an access record of the data user to the data in the secure enclave.
Optionally, the access request comprises a prediction model training request;
the processing module 12 is specifically configured to:
according to the prediction demand provided by the data user and the code developed in the preset on-line development environment, performing model training by adopting the data provided by the data user and/or the data in the safety isolation area to obtain a prediction model; wherein the processing result comprises the predictive model;
deploying the predictive model in an online predictive module.
Optionally, the access request includes a user recommendation request;
the processing module 12 is specifically configured to:
performing prediction processing according to corresponding data in the safety isolation area by adopting a prediction model in an online prediction module to obtain a prediction result; the processing result comprises the prediction result.
The data processing apparatus provided by the foregoing several embodiments is used to implement the technical solution of the data processing platform in any of the foregoing method embodiments, and the implementation principle and the technical effect are similar, and are not described herein again.
Fig. 9 is a schematic structural diagram of a second embodiment of a data processing apparatus provided in the present application. As shown in fig. 9, on the basis of the above embodiment, the data processing apparatus 10 further includes a storage module 14;
the receiving module 11 is further configured to receive data uploaded by a data provider; the data comprises common data and/or encrypted sensitive data;
the storage module 14 is configured to store the data in the secure enclave.
Optionally, the receiving module 11 is further configured to:
receiving a blocking policy configured by the data provider, the blocking policy including a condition of a data consumer not allowed to access;
correspondingly, the processing module 12 is further configured to:
obtaining an access record of the data user according to the identity;
and determining whether the data user has the right to access the corresponding data according to the access record and the blocking policy.
Optionally, the sending module 13 is further configured to forward the identifier of the data consumer to a corresponding data provider, so that the data provider determines whether the data consumer has an access right to the data to be accessed;
the receiving module 11 is further configured to receive a determination result returned by the data provider, where the determination result is used to indicate that the data consumer has an access right or does not have an access right to the data that needs to be accessed.
The data processing apparatus provided by the foregoing several embodiments is used to implement the technical solution of the data processing platform in any of the foregoing method embodiments, and the implementation principle and the technical effect are similar, and are not described herein again.
Fig. 10 is a schematic structural diagram of a third embodiment of a data processing apparatus provided in the present application. As shown in fig. 10, the data processing apparatus 20 includes:
a sending module 21, configured to send an access request to a data processing platform, where the access request is used to access data stored in a secure isolation area in the data processing platform, and the access request includes an identity of a data consumer;
and the receiving module 22 is configured to receive a processing result returned by the data processing platform if the data user has the right to access the corresponding data.
Optionally, if the access request includes a prediction model training request, the sending module 21 is further configured to:
and providing prediction requirements and data to the data processing platform, so that the data processing platform performs model training by adopting the data provided to the data processing platform and/or the data in the safety isolation area according to the prediction requirements and codes developed in a preset on-line development environment to obtain a prediction model.
Optionally, the access request includes a user recommendation request, and the processing result includes a prediction result, where the prediction result is obtained by using a prediction model to predict according to data in the security isolation area.
The data processing apparatus provided by the foregoing several embodiments is used to implement the technical solution of the data user in any of the foregoing method embodiments, and the implementation principle and technical effect are similar, and are not described herein again.
Fig. 11 is a schematic structural diagram of a fourth embodiment of a data processing apparatus provided in the present application. As shown in fig. 11, the data processing apparatus 30 includes:
the receiving module 31 is configured to receive an identity of a data user sent by the data processing platform;
the processing module 32 is configured to determine whether the data user can access the data to be accessed according to the identity;
a sending module 33, configured to return a determination result to the data processing platform, where the determination result is used to indicate that the data user has an access right or does not have an access right to the data that needs to be accessed.
Optionally, the sending module 33 is further configured to:
and uploading data to a security isolation area of the data processing platform, wherein the data comprises common data and/or encrypted sensitive data.
Optionally, the sending module 33 is further configured to:
configuring a blocking policy to the data processing platform, the blocking policy including a condition of a data consumer that is not allowed access.
The data processing apparatus provided by the foregoing several embodiments is used to implement the technical solution of the data provider in any of the foregoing method embodiments, and the implementation principle and technical effect thereof are similar, and are not described herein again.
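The interaction between the platform and the provider-side apparatus 30, in which the platform forwards the identity and the provider returns a determination result, can be sketched as follows. The grant table keyed by user and dataset, and the names `DataProviderApparatus` and `check_with_provider`, are assumptions for illustration; in a real deployment the call would cross a network boundary rather than being a direct method call.

```python
from typing import Dict, Set


class DataProviderApparatus:
    """Provider side: receives an identity, decides, and returns the result."""

    def __init__(self, grants: Dict[str, Set[str]]) -> None:
        # Hypothetical grant table: data user id -> datasets the user may access.
        self._grants = grants

    def determine(self, user_id: str, dataset: str) -> bool:
        # Receiving module 31 gets the identity, processing module 32 decides,
        # and sending module 33 would return this value to the platform.
        return dataset in self._grants.get(user_id, set())


def check_with_provider(user_id: str, dataset: str,
                        provider: DataProviderApparatus) -> bool:
    """Platform side: forward the identity and use the returned determination."""
    return provider.determine(user_id, dataset)
```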
It should be noted that the division of the modules of the apparatus provided in the above embodiments is only a logical division, and the actual implementation may be wholly or partially integrated into one physical entity, or may be physically separated. And these modules can all be implemented in the form of software invoked by a processing element; or may be implemented entirely in hardware; and part of the modules can be realized in the form of calling software by the processing element, and part of the modules can be realized in the form of hardware. For example, the processing module may be a processing element that is separately configured, or may be integrated into a chip of the apparatus, or may be stored in a memory of the apparatus in the form of program code, and a processing element of the apparatus calls and executes a function of the processing module. Other modules are implemented similarly. In addition, all or part of the modules can be integrated together or can be independently realized. The processing element described herein may be an integrated circuit having signal processing capabilities. In implementation, each step of the above method or each module above may be implemented by an integrated logic circuit of hardware in a processor element or an instruction in the form of software.
Furthermore, the application provides an electronic device corresponding to the data processing platform, an electronic device corresponding to the data user, and an electronic device corresponding to the data provider.
Fig. 12 is a block diagram of an electronic device for implementing a data processing method according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein. The electronic device may be the electronic device of the data processing platform, the electronic device of a data user, or the electronic device of a data provider, which is not limited in this embodiment.
As shown in fig. 12, the electronic device includes: one or more processors 1001, memory 1002, and interfaces for connecting the various components, including high-speed and low-speed interfaces, as well as interfaces for communicating with other electronic devices. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing portions of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). Fig. 12 illustrates an example with one processor 1001.
The memory 1002 is a non-transitory computer readable storage medium provided herein. The memory stores instructions executable by at least one processor, so that the at least one processor executes the processing method of data corresponding to any execution subject provided by the application. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to perform the methods provided herein.
The memory 1002, which is a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as corresponding program instructions/modules (e.g., storage modules shown in fig. 9) in the processing method of data in the embodiments of the present application. The processor 1001 executes various functional applications of the server and data processing, that is, implements a processing method of data corresponding to any execution subject in the above method embodiments, by executing the non-transitory software program, instructions, and modules stored in the memory 1002.
The memory 1002 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function; the data storage area may store data, such as data provided by the parties and stored in the data processing platform, or third-party data in the secure isolation area, etc. Further, the memory 1002 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 1002 may optionally include memory located remotely from the processor 1001, which may be connected to the electronic device for data processing over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Further, the electronic device may further include: an input device 1003 and an output device 1004. The processor 1001, the memory 1002, the input device 1003, and the output device 1004 may be connected by a bus or other means, and the bus connection is exemplified in fig. 12.
The input device 1003 may receive input numeric or character information and generate key signal inputs related to user settings and function controls of the data processing electronic apparatus, such as an input device like a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, a joystick, etc. The output devices 1004 may include a display device, auxiliary lighting devices (e.g., LEDs), and tactile feedback devices (e.g., vibrating motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, ASICs (application-specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
Further, the present application also provides a non-transitory computer readable storage medium storing computer instructions, which are executed by a processor to implement the technical solution provided by any of the foregoing method embodiments.
Further, the present application also provides a data processing system, which includes an electronic device on the data processing platform side, an electronic device on the data provider side, and an electronic device on the data consumer side, which together implement the technical solutions provided in the foregoing method embodiments.
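The cooperation of the three parties can be illustrated with the following minimal Python sketch of the platform side. It is a simplified assumption-based example, not the implementation of the application: the `Platform` class, its field names, the access-count threshold used as the blocking condition, and the choice of a mean as the analysis result are all hypothetical. The sketch shows the platform checking the data user's identity against a blocking policy and access records, processing the data inside the isolation area, and returning only the processing result rather than the stored data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Union

Result = Dict[str, Union[bool, float, str]]


@dataclass
class Platform:
    # Data uploaded by data providers; never returned to data users directly.
    isolation_area: Dict[str, List[float]] = field(default_factory=dict)
    # Assumed blocking policy: explicit deny list plus an access-count limit.
    blocked_ids: Set[str] = field(default_factory=set)
    max_accesses: int = 3
    # Access records, keyed by the identity of the data user.
    access_log: Dict[str, int] = field(default_factory=dict)

    def store(self, dataset: str, values: List[float]) -> None:
        self.isolation_area[dataset] = list(values)

    def is_allowed(self, user_id: str) -> bool:
        if user_id in self.blocked_ids:
            return False
        return self.access_log.get(user_id, 0) < self.max_accesses

    def handle(self, user_id: str, dataset: str) -> Result:
        if not self.is_allowed(user_id):
            return {"ok": False, "reason": "access denied"}
        values = self.isolation_area.get(dataset)
        if not values:
            return {"ok": False, "reason": "dataset not found"}
        # Record the access, then analyze inside the isolation area.
        self.access_log[user_id] = self.access_log.get(user_id, 0) + 1
        return {"ok": True, "mean": sum(values) / len(values)}
```

A data user calling `handle` receives either a denial or an aggregate value; the raw lists held in `isolation_area` are not part of any response.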
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, and the present invention is not limited thereto as long as the desired results of the technical solutions disclosed in the present application can be achieved.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (32)

1. A method for processing data, comprising:
receiving an access request sent by a data user, wherein the access request is used for accessing data stored in a security isolation area, and the access request comprises an identity of the data user; data in the securely isolated region cannot be copied;
determining whether the data user has the authority to access the corresponding data or not according to the identity of the data user;
if the data user is determined to have the right of accessing the corresponding data, analyzing and processing the corresponding data in the safe isolation area according to the access request to obtain a processing result;
and returning the processing result to the data user.
2. The method of claim 1, further comprising:
recording access records of the data users to the data in the secure enclave.
3. The method of claim 1, wherein the access request comprises a predictive model training request;
analyzing and processing the corresponding data in the safety isolation area according to the access request to obtain a processing result, wherein the processing result comprises:
according to the prediction demand provided by the data user and the code developed in the preset on-line development environment, performing model training by adopting the data provided by the data user and the data in the safe isolation area, or performing model training by adopting the data in the safe isolation area to obtain a prediction model; wherein the processing result comprises the predictive model;
deploying the predictive model in an online predictive module.
4. The method of claim 1, wherein the access request comprises a user recommendation request;
analyzing and processing the corresponding data in the safety isolation area according to the access request to obtain a processing result, wherein the processing result comprises:
performing prediction processing according to corresponding data in the safety isolation area by adopting a prediction model in an online prediction module to obtain a prediction result; the processing result comprises the predicted result.
5. The method according to any one of claims 1 to 4, further comprising:
receiving data uploaded by a data provider; the data comprises common data and/or encrypted sensitive data;
storing the data in the securely isolated area.
6. The method of claim 5, further comprising:
receiving a blocking policy configured by the data provider, the blocking policy including a condition of a data consumer not allowed to access;
correspondingly, the determining whether the data user has the right to access the corresponding data according to the identity of the data user includes:
acquiring an access record of the data user according to the identity;
and determining whether the data user has the right to access the corresponding data or not according to the access record and the blocking strategy.
7. The method of claim 5, wherein the determining whether the data user has access right to the data to be accessed according to the identity of the data user comprises:
forwarding the identity of the data user to a corresponding data provider so that the data provider determines whether the data user has access right to the data needing to be accessed;
and receiving a determination result returned by the data provider, wherein the determination result is used for indicating that the data user has access right or does not have access right to the data needing to be accessed.
8. A method for processing data, comprising:
sending an access request to a data processing platform, wherein the access request is used for accessing data stored in a security isolation area in the data processing platform, and the access request comprises an identity of a data user; wherein data in the securely isolated region cannot be copied;
and receiving a processing result returned by the data processing platform, wherein, if the data user has the right to access the corresponding data, the data processing platform analyzes and processes the corresponding data in the security isolation area according to the access request to obtain the processing result.
9. The method of claim 8, wherein the access request comprises a predictive model training request, the method further comprising:
and providing prediction requirements and data to the data processing platform so that the data processing platform performs model training by adopting the data provided to the data processing platform and the data in the safety isolation area or performs model training by adopting the data in the safety isolation area according to the prediction requirements and codes developed in a preset on-line development environment to obtain a prediction model.
10. The method of claim 8, wherein the access request comprises a user recommendation request, and wherein the processing result comprises a prediction result predicted by a prediction model based on data in the secure enclave.
11. A method for processing data, comprising:
receiving an identity of a data user sent by a data processing platform;
determining whether the data user can access the data to be accessed according to the identity; wherein the data is stored in a security isolation area in the data processing platform, and the data in the security isolation area cannot be copied;
returning a determination result to the data processing platform, wherein the determination result is used for indicating that the data user has access right or does not have access right to the data needing to be accessed;
if the data user has access right to the data to be accessed, the data processing platform analyzes and processes the corresponding data in the safety isolation area according to the access request to obtain a processing result, and returns the processing result to the data user; the access request is used for accessing the data stored in the security isolation area, and the access request comprises the identity of a data user.
12. The method of claim 11, further comprising:
and uploading data to a security isolation area of the data processing platform, wherein the data comprises common data and/or encrypted sensitive data.
13. The method of claim 12, further comprising:
configuring a blocking policy to the data processing platform, the blocking policy including a condition of a data consumer that is not allowed access.
14. An apparatus for processing data, comprising:
the receiving module is used for receiving an access request sent by a data user, wherein the access request is used for accessing data stored in a security isolation area, and the access request comprises an identity of the data user; data in the securely isolated region cannot be copied;
the processing module is used for determining whether the data user has the authority of accessing the corresponding data according to the identity of the data user;
if the data user is determined to have the right to access the corresponding data, the processing module is further configured to analyze and process the corresponding data in the security isolation area according to the access request to obtain a processing result;
and the sending module is used for returning the processing result to the data user.
15. The apparatus of claim 14,
the processing module is further configured to record an access record of the data consumer to the data in the secure enclave.
16. The apparatus of claim 14, wherein the access request comprises a predictive model training request;
the processing module is specifically configured to:
according to the prediction demand provided by the data user and the code developed in the preset on-line development environment, performing model training by adopting the data provided by the data user and the data in the safe isolation area, or performing model training by adopting the data in the safe isolation area to obtain a prediction model; wherein the processing result comprises the predictive model;
deploying the predictive model in an online predictive module.
17. The apparatus of claim 14, wherein the access request comprises a user recommendation request;
the processing module is specifically configured to:
performing prediction processing according to corresponding data in the safety isolation area by adopting a prediction model in an online prediction module to obtain a prediction result; the processing result comprises the prediction result.
18. The apparatus of any one of claims 14 to 17, further comprising: a storage module;
the receiving module is also used for receiving data uploaded by a data provider; the data comprises common data and/or encrypted sensitive data;
the storage module is to store the data in the secure enclave.
19. The apparatus of claim 18, wherein the receiving module is further configured to:
receiving a blocking policy configured by the data provider, the blocking policy including a condition of a data consumer not allowed to access;
correspondingly, the processing module is further configured to:
obtaining an access record of the data user according to the identity;
and determining whether the data user has the right to access the corresponding data according to the access record and the blocking policy.
20. The apparatus of claim 18,
the sending module is further configured to forward the identity of the data consumer to a corresponding data provider, so that the data provider determines whether the data consumer has access right to the data to be accessed;
the receiving module is further configured to receive a determination result returned by the data provider, where the determination result is used to indicate that the data consumer has access right or does not have access right to the data that needs to be accessed.
21. An apparatus for processing data, comprising:
the sending module is used for sending an access request to a data processing platform, wherein the access request is used for accessing data stored in a security isolation area in the data processing platform, and the access request comprises an identity of a data user; wherein data in the security isolation area cannot be copied;
and the receiving module is used for receiving a processing result returned by the data processing platform, wherein, if the data user has the right to access the corresponding data, the data processing platform analyzes and processes the corresponding data in the security isolation area according to the access request to obtain the processing result.
22. The apparatus of claim 21, wherein the access request comprises a predictive model training request, and wherein the sending module is further configured to:
and providing prediction requirements and data for the data processing platform, so that the data processing platform performs model training by adopting the data provided for the data processing platform and the data in the safety isolation area or performs model training by adopting the data in the safety isolation area according to the prediction requirements and codes developed in a preset on-line development environment to obtain a prediction model.
23. The apparatus of claim 21, wherein the access request comprises a user recommendation request, and wherein the processing result comprises a prediction result predicted by using a prediction model according to data in the security isolation area.
24. An apparatus for processing data, comprising:
the receiving module is used for receiving the identity of the data user sent by the data processing platform;
the processing module is used for determining whether the data user can access the data to be accessed according to the identity; wherein the data is stored in a security isolation area in the data processing platform, and the data in the security isolation area cannot be copied;
the sending module is used for returning a determination result to the data processing platform, wherein the determination result is used for indicating that the data user has access authority or does not have access authority on the data needing to be accessed;
wherein, if the data user has an access right to the data to be accessed, the data processing platform analyzes and processes the corresponding data in the security isolation area according to the access request to obtain a processing result, and returns the processing result to the data user; the access request is used for accessing the data stored in the security isolation area, and the access request comprises the identity of a data user.
25. The apparatus of claim 24, wherein the sending module is further configured to:
and uploading data to a security isolation area of the data processing platform, wherein the data comprises common data and/or encrypted sensitive data.
26. The apparatus of claim 25, wherein the sending module is further configured to:
configuring a blocking policy to the data processing platform, the blocking policy including a condition of a data consumer that is not allowed access.
27. An electronic device, comprising:
at least one processor, a memory, and an interface to communicate with other electronic devices;
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method of processing data according to any one of claims 1 to 7.
28. An electronic device, comprising:
at least one processor, a memory, and an interface to communicate with other electronic devices;
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method of processing data according to any one of claims 8 to 10.
29. An electronic device, comprising:
at least one processor, a memory, and an interface to communicate with other electronic devices;
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method of processing data according to any one of claims 11 to 13.
30. A system for processing data, comprising: the electronic device of claim 27, the electronic device of claim 28, and the electronic device of claim 29.
31. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute a method of processing data according to any one of claims 1 to 13.
32. A method of processing data, comprising:
storing data uploaded by a data provider in a secure isolation area;
determining whether the data user can use the data in the secure isolation area or not according to the identity of the data user, wherein the data in the secure isolation area cannot be copied;
if the data user can use the data in the safe isolation area, the data processing platform analyzes and processes the corresponding data in the safe isolation area according to the access request to obtain a processing result, and returns the processing result to the data user; the access request is used for accessing the data stored in the security isolation area, and the access request comprises the identity of a data user.
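The predictive model training request and the user recommendation request recited in the claims above can be sketched, under stated assumptions, as follows. The example uses a one-variable least-squares line as a stand-in for the prediction model; the application does not specify a model type, and the names `fit_line`, `train_in_isolation`, and `OnlinePredictionModule` are hypothetical. Training uses the data in the isolation area, optionally combined with data provided by the data user, and the trained model is then deployed to an online prediction module.

```python
from typing import Dict, List, Optional, Tuple

Sample = Tuple[float, float]   # (feature, label)
Model = Tuple[float, float]    # (slope, intercept)


def fit_line(samples: List[Sample]) -> Model:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(samples)
    if n == 0:
        raise ValueError("no training samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples)
    if var_x == 0:
        raise ValueError("feature values must not all be equal")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def train_in_isolation(isolated: List[Sample],
                       user_samples: Optional[List[Sample]] = None) -> Model:
    """Train on isolation-area data, optionally with data from the data user."""
    return fit_line(list(isolated) + list(user_samples or []))


class OnlinePredictionModule:
    """Holds deployed models and answers prediction requests."""

    def __init__(self) -> None:
        self.models: Dict[str, Model] = {}

    def deploy(self, name: str, model: Model) -> None:
        self.models[name] = model

    def predict(self, name: str, x: float) -> float:
        slope, intercept = self.models[name]
        return slope * x + intercept
```

In this sketch, the training samples never leave `train_in_isolation`; the caller obtains only the fitted parameters and, later, individual prediction results.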
CN201911310768.8A 2019-12-18 2019-12-18 Data processing method, device, equipment and storage medium Active CN111079182B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911310768.8A CN111079182B (en) 2019-12-18 2019-12-18 Data processing method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111079182A CN111079182A (en) 2020-04-28
CN111079182B true CN111079182B (en) 2022-11-29

Family

ID=70315689

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911310768.8A Active CN111079182B (en) 2019-12-18 2019-12-18 Data processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111079182B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113569301A (en) * 2020-04-29 2021-10-29 杭州锘崴信息科技有限公司 Federal learning-based security computing system and method
CN112416912A (en) * 2020-10-14 2021-02-26 深圳前海微众银行股份有限公司 Method, device, terminal equipment and medium for removing duplicate of longitudinal federal data statistics
CN112232528B (en) * 2020-12-15 2021-03-09 之江实验室 Method and device for training federated learning model and federated learning system
CN112733152A (en) * 2021-01-22 2021-04-30 湖北宸威玺链信息技术有限公司 Sensitive data processing method, system and device
CN113011632B (en) * 2021-01-29 2023-04-07 招商银行股份有限公司 Enterprise risk assessment method, device, equipment and computer readable storage medium
CN113965346A (en) * 2021-08-31 2022-01-21 微神马科技(大连)有限公司 Design method for big data ecological unified security certification

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109543463A (en) * 2018-10-11 2019-03-29 平安科技(深圳)有限公司 Data Access Security method, apparatus, computer equipment and storage medium

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100095365A1 (en) * 2008-10-14 2010-04-15 Wei-Chiang Hsu Self-setting security system and method for guarding against unauthorized access to data and preventing malicious attacks
CN104657674B (en) * 2015-01-16 2018-02-23 北京邮电大学 The insulation blocking system and method for private data in a kind of mobile phone
US10019887B1 (en) * 2017-03-21 2018-07-10 Satellite Tracking Of People Llc System and method for tracking interaction between monitored population and unmonitored population
CN109308418B (en) * 2017-07-28 2021-09-24 创新先进技术有限公司 Model training method and device based on shared data
CN111756754B (en) * 2017-07-28 2023-04-07 创新先进技术有限公司 Method and device for training model
CN108416230B (en) * 2018-03-23 2019-12-20 重庆市科学技术研究院 Data access method based on data isolation model
CN108632253B (en) * 2018-04-04 2021-09-10 平安科技(深圳)有限公司 Client data security access method and device based on mobile terminal
CN109446830A (en) * 2018-11-13 2019-03-08 中链科技有限公司 Data center environment information processing method and device based on block chain
CN110197084B (en) * 2019-06-12 2021-07-30 上海联息生物科技有限公司 Medical data joint learning system and method based on trusted computing and privacy protection

Also Published As

Publication number Publication date
CN111079182A (en) 2020-04-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant