CN116992458A - Programmable data processing method and system based on trusted execution environment - Google Patents

Publication number
CN116992458A
Authority
CN
China
Prior art keywords
task
model
initiator
sample data
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311028439.0A
Other languages
Chinese (zh)
Other versions
CN116992458B (en)
Inventor
邦佩
陈超超
郑小林
朱明杰
鲍力成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Jinzhita Technology Co ltd
Original Assignee
Hangzhou Jinzhita Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Jinzhita Technology Co ltd
Priority to CN202311028439.0A
Publication of CN116992458A
Application granted
Publication of CN116992458B
Legal status: Active (current)
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/602Providing cryptographic facilities or services
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/45587Isolation or security of virtual machine instances

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Storage Device Security (AREA)

Abstract

Embodiments of the present disclosure provide a programmable data processing method and system based on a trusted execution environment. The method includes: in response to a task execution request sent by a task initiator, creating a virtual container corresponding to the task execution request, wherein the virtual container belongs to a trusted execution environment; determining at least one data provider according to the task execution request, acquiring a sample data set corresponding to the at least one data provider, and loading the sample data set into the virtual container; receiving model processing information sent by the task initiator, and determining, through the virtual container loaded with the sample data set, an initial task model and a model training strategy corresponding to the initial task model based on the model processing information; and training the initial task model in the virtual container loaded with the sample data set according to the model training strategy, and feeding back a task execution result to the task initiator according to the training result. Because the privacy computation of model training is carried out in the trusted execution environment of the task executor, the privacy and security of the sample data set are ensured.

Description

Programmable data processing method and system based on trusted execution environment
Technical Field
The embodiments of this specification relate to the technical field of privacy computing, and in particular to a programmable data processing method based on a trusted execution environment.
Background
With the development of the internet and artificial intelligence, machine learning has been applied in many fields, such as risk assessment, speech recognition, natural language processing, and the like. Training a model by machine learning requires the use of large amounts of sample data, which is typically from different enterprises or users, and which typically involves private data. Therefore, it is necessary to ensure that the private data is not compromised maliciously during the model training process, and when training is performed using the private data from multiple data providers, it is also necessary to ensure that the private data of any one party is not acquired by other parties. Thus, there is a need for a machine learning method that can ensure the security of multiparty private data.
Disclosure of Invention
In view of this, the present embodiments provide a programmable data processing method based on a trusted execution environment. One or more embodiments of the present specification relate to a programmable data processing apparatus based on a trusted execution environment, a programmable data processing system based on a trusted execution environment, a computing device, a computer-readable storage medium, and a computer program, to solve the technical drawbacks of the prior art.
According to a first aspect of embodiments of the present disclosure, there is provided a programmable data processing method based on a trusted execution environment, applied to a task executor, including:
responding to a task execution request sent by a task initiator, and creating a virtual container corresponding to the task execution request, wherein the virtual container belongs to a trusted execution environment;
determining at least one data provider according to the task execution request, acquiring a sample data set corresponding to the at least one data provider, and loading the sample data set into the virtual container;
receiving model processing information sent by the task initiator, and determining, through the virtual container loaded with the sample data set, an initial task model and a model training strategy corresponding to the initial task model based on the model processing information;
training the initial task model in a virtual container loaded with the sample data set according to the model training strategy, and feeding back a task execution result to the task initiator according to a training result.
According to a second aspect of embodiments of the present specification, there is provided a programmable data processing system based on a trusted execution environment, the system comprising a task initiator and a task executor;
the task initiator generates a task execution request and sends it to the task executor;
the task executor, in response to the task execution request, creates a virtual container corresponding to the task execution request, wherein the virtual container belongs to a trusted execution environment; determines at least one data provider according to the task execution request, acquires a sample data set corresponding to the at least one data provider, and loads the sample data set into the virtual container; receives model processing information sent by the task initiator, and determines, through the virtual container loaded with the sample data set, an initial task model and a model training strategy corresponding to the initial task model based on the model processing information; and trains the initial task model in the virtual container loaded with the sample data set according to the model training strategy, and feeds back a task execution result to the task initiator according to the training result;
and the task initiator generates a local task model according to the task execution result.
According to a third aspect of embodiments of the present specification, there is provided a programmable data processing apparatus based on a trusted execution environment, for application to a task executor, comprising:
The creation module is configured to respond to a task execution request sent by a task initiator and create a virtual container corresponding to the task execution request, wherein the virtual container belongs to a trusted execution environment;
the acquisition module is configured to determine at least one data provider according to the task execution request, acquire a sample data set corresponding to the at least one data provider and load the sample data set into the virtual container;
the determining module is configured to receive model processing information sent by the task initiator, and to determine, through the virtual container loaded with the sample data set, an initial task model and a model training strategy corresponding to the initial task model based on the model processing information;
and the training module is configured to train the initial task model in a virtual container loaded with the sample data set according to the model training strategy, and feed back a task execution result to the task initiator according to a training result.
According to a fourth aspect of embodiments of the present specification, there is provided a computing device comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions that, when executed by the processor, implement the steps of the above programmable data processing method based on a trusted execution environment.
According to a fifth aspect of embodiments of the present specification, there is provided a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the steps of the above programmable data processing method based on a trusted execution environment.
According to a sixth aspect of embodiments of the present specification, there is provided a computer program, wherein the computer program, when executed in a computer, causes the computer to perform the steps of the programmable data processing method based on a trusted execution environment as described above.
This specification provides a programmable data processing method based on a trusted execution environment, which is applied to a task executor and comprises the following steps: in response to a task execution request sent by a task initiator, creating a virtual container corresponding to the task execution request, wherein the virtual container belongs to a trusted execution environment; determining at least one data provider according to the task execution request, acquiring a sample data set corresponding to the at least one data provider, and loading the sample data set into the virtual container; receiving model processing information sent by the task initiator, and determining, through the virtual container loaded with the sample data set, an initial task model and a model training strategy corresponding to the initial task model based on the model processing information; and training the initial task model in the virtual container loaded with the sample data set according to the model training strategy, and feeding back a task execution result to the task initiator according to the training result.
In the embodiments of this specification, the task executor provides a trusted execution environment by creating a virtual container corresponding to the task execution request, and loads the sample data set required by the task execution request into the virtual container. The virtual container loaded with the sample data set determines, based on the model processing information, an initial task model and the model training strategy corresponding to it, and the initial task model is then trained in the virtual container according to that strategy. Because the computation of model training is carried out in the trusted execution environment of the task executor, the privacy and security of the sample data set are guaranteed.
Drawings
FIG. 1 is a schematic diagram of a programmable data processing method based on a trusted execution environment according to one embodiment of the present disclosure;
FIG. 2 is a flow chart of a programmable data processing method based on a trusted execution environment provided by one embodiment of the present description;
FIG. 3 is a process flow diagram of a programmable data processing method based on a trusted execution environment provided in one embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a programmable data processing apparatus based on a trusted execution environment according to one embodiment of the present disclosure;
FIG. 5 is a flow diagram of a programmable data processing system based on a trusted execution environment provided by one embodiment of the present description;
FIG. 6 is a block diagram of a computing device provided in one embodiment of the present description.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of this specification. However, this specification can be implemented in many ways other than those described herein, and those skilled in the art can make similar generalizations without departing from its spirit; this specification is therefore not limited by the specific implementations disclosed below.
The terminology used in the one or more embodiments of the specification is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the specification. As used in this specification, one or more embodiments and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that, although the terms first, second, etc. may be used in one or more embodiments of this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first may also be referred to as a second, and similarly, a second may also be referred to as a first, without departing from the scope of one or more embodiments of the present description. The word "if" as used herein may be interpreted as "at the time of ...", "when ...", or "in response to determining", depending on the context.
Furthermore, it should be noted that the user information (including, but not limited to, user equipment information, user personal information, etc.) and data (including, but not limited to, data for analysis, stored data, presented data, etc.) involved in one or more embodiments of the present disclosure are information and data authorized by the user or fully authorized by all parties. The collection, use, and processing of the relevant data must comply with the relevant laws, regulations, and standards of the relevant countries and regions, and corresponding operation entries are provided so that the user can choose to grant or refuse authorization.
First, terms related to one or more embodiments of the present specification will be explained.
Trusted execution environment (TEE): commonly used to protect user privacy. A TEE can be thought of, approximately, as a black box in hardware: neither the code executed in the TEE nor its data can be inspected from the operating system layer, and the code can be operated only through the interfaces predefined in it. The trusted execution environment in the embodiments of this specification provides a secure execution environment for software; it is based on security extensions of the CPU hardware and is completely isolated from the outside.
At present, privacy computing is applied mainly through federated learning and secure multi-party computation, but products of this type have the following problems in design and practical application:
1. Performance and security cannot both be achieved. This problem shows up mainly in application models based on secure multi-party computation and federated learning: secure multi-party computation offers high security but suffers from performance bottlenecks in computation and network I/O (input/output), whereas federated learning performs better but offers lower security.
2. The coordination cost is high. The interaction among the nodes of the parties in secure multi-party computation or federated learning resembles the P2P (peer-to-peer) network of a blockchain, and the nodes of the parties are independent of one another. When the system has a problem, the cost of coordination and debugging is therefore high, and an upgrade of any one party's system can make its protocol inconsistent with the other parties, causing the computation or learning task to fail.
3. The technology is difficult to put into practice. First, the parties' data protection demands are unequal: some enterprises set high requirements on the protection of their own data but hope that other parties will lower the threshold for data use. Understanding privacy computing technology also requires technicians to have a certain professional background, and a lack of technical understanding hinders negotiation and deployment among enterprises. Second, algorithm modules built on secure multi-party computation and federated learning offer relatively little self-service configurability. Most vendors tend to pre-package functions into scenario modules that are opened to users, and once the results of a model or component deviate greatly from the business, the cause is difficult to analyze.
Based on this, in the present specification, a programmable data processing method based on a trusted execution environment is provided for solving the above-mentioned problems, and the present specification relates to a programmable data processing apparatus based on a trusted execution environment, a programmable data processing system based on a trusted execution environment, a computing device, and a computer-readable storage medium, which are described in detail in the following embodiments one by one.
Referring to fig. 1, fig. 1 is a schematic view of a scenario of a programmable data processing method based on a trusted execution environment according to an embodiment of the present disclosure. The task initiator is a user side with a machine learning requirement in a privacy computing scenario, the task executor is an intermediate party that provides the trusted execution environment in the privacy computing scenario, and a data provider is a user side that provides machine learning training data in the privacy computing scenario. In fig. 1, when the task initiator wants to use the data of other data providers for machine learning, it may submit a task execution request to the task executor. After receiving the task execution request, the task executor creates a virtual container for executing the task; the virtual container belongs to a trusted execution environment, which ensures the data security and privacy of the task. The task executor determines the data providers participating in the task based on the task execution request, acquires the sample data provided by each data provider, and fuses it to generate a sample data set, which is used for model training in the machine learning task. The task executor then receives the model processing information submitted by the task initiator, which is modeling code written by the user for the task. Based on the model processing information, the virtual container determines an initial task model and a model training strategy corresponding to the initial task model, and then trains the initial task model on the sample data set according to the model training strategy. Training yields a model that meets the user's expectations, and a task execution result is returned to the task initiator.
With this programmable data processing method based on a trusted execution environment, computation such as model training is concentrated in the virtual container of the task executor, which reduces communication and computation overhead while guaranteeing the privacy and security of the task data. Because the user builds the model through interactive programming, the model is more visible to the user and the user participates more in the machine learning process.
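The scenario of fig. 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the names `VirtualContainer`, `execute_task`, and `mean_model` are hypothetical, the "container" is an ordinary object rather than a hardware-protected environment, and the "model" is a trivial mean so that the flow stays visible.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types: the source names the roles but defines no API.
Row = Dict[str, float]
ModelingCode = Callable[[List[Row]], Dict[str, float]]


@dataclass
class VirtualContainer:
    """Stands in for a confidential container inside the TEE."""
    task_id: str
    sample_data_set: List[Row] = field(default_factory=list)

    def load(self, rows: List[Row]) -> None:
        self.sample_data_set.extend(rows)

    def run(self, modeling_code: ModelingCode) -> Dict[str, float]:
        # The modeling code runs inside the container; only its
        # return value (the task execution result) leaves it.
        return modeling_code(self.sample_data_set)


def execute_task(task_id: str,
                 providers: Dict[str, List[Row]],
                 modeling_code: ModelingCode) -> Dict[str, float]:
    container = VirtualContainer(task_id)   # create the virtual container
    for rows in providers.values():         # fuse data from each provider
        container.load(rows)
    return container.run(modeling_code)     # train and return the result


def mean_model(rows: List[Row]) -> Dict[str, float]:
    """Toy 'model' submitted by the initiator: the mean loan amount."""
    return {"mean_amount": sum(r["amount"] for r in rows) / len(rows)}


providers = {
    "bank_a": [{"amount": 100.0}, {"amount": 300.0}],
    "bank_b": [{"amount": 200.0}],
}
result = execute_task("task-1", providers, mean_model)
print(result)
```

The point of the sketch is the data flow: the initiator supplies code, the providers supply rows, and only the aggregated result crosses the container boundary.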
Referring to fig. 2, fig. 2 shows a flowchart of a programmable data processing method based on a trusted execution environment according to an embodiment of the present disclosure, which specifically includes the following steps.
Step 202: in response to a task execution request sent by a task initiator, create a virtual container corresponding to the task execution request, wherein the virtual container belongs to a trusted execution environment.
The task initiator may be understood as a user side with a machine learning requirement in the privacy computing scenario, and the task executor may be understood as an intermediate party that provides a trusted execution environment in the privacy computing scenario. A task execution request may be understood as a request by which a task initiator with a machine learning requirement asks the task executor to execute a task. The virtual container can be understood as a virtual environment that the task executor constructs for the task execution request; the model training and machine learning tasks corresponding to the task execution request are completed by an application deployed in the virtual container.
In practical application, a privacy computing client is deployed at the task initiator, and the task initiator sends a task execution request to the task executor through this client in order to execute a model training task used in some service. A trusted execution environment management end is deployed at the task executor; it receives the task execution request and responds to it, so that the task executor cooperates with the task initiator to complete the model training task. In implementation, the virtual container is a confidential container based on secure virtualization technology: data is automatically encrypted by the CPU (Central Processing Unit) hardware when it is written to memory and automatically decrypted by the hardware when memory is read. Each virtual container also encrypts its data with a different key, which ensures resource isolation and key independence among virtual containers. Encrypted data in one virtual container therefore cannot be decrypted and obtained by another virtual container, which ensures the privacy and security of the data in the virtual container and allows it to serve as a trusted execution environment.
In a specific embodiment of the present disclosure, the task initiator is Bank A, which wants to train a recommendation model for predicting users' borrowing amounts. To improve the recommendation effect of the model, Bank A may, in addition to training on its own user data, seek cooperative data from other data providers to take part in training, for example by having the user data of other banks participate in joint training of the model. To ensure the privacy and security of both parties' user data, the joint learning task can be executed by a task executor that is able to provide a trusted execution environment. Bank A sends a task execution request for the recommendation model training task to the task executor. After receiving the task execution request, the task executor creates a virtual container for executing the task; the virtual container belongs to a trusted execution environment, which ensures the privacy and security of both parties' data.
Furthermore, for the task initiator to perform model training in the trusted execution environment of the task executor, the task initiator needs to register with the task executor in advance. Specifically, the method further includes: in response to a registration request sent by the task initiator, establishing a target transmission channel corresponding to the task initiator; and returning a registration result corresponding to the registration request through the target transmission channel.
The registration request can be understood as a request by which the task initiator registers with the task executor. Because the task initiator needs to perform privacy computation in the trusted execution environment of the task executor, it must register with the task executor in advance, and during registration it can publish its local cooperative data. The target transmission channel can be understood as the channel used for communication between the task initiator and the task executor. After the task initiator registers with the task executor, the target transmission channel is established between the two parties. In a specific implementation the target transmission channel is a private-line network channel, which raises the level of trust between the two parties, prevents third parties from obtaining the data communicated between them, and improves data privacy and security.
In practical application, the TEE management end acts as a centralized provider of the trusted execution environment and exposes a registration interface to the outside. A data provider that wants to publish a shared data set and a demander that wants to seek cooperative data can both register at the TEE management end. After registration is completed, the TEE management end returns the registration result through the corresponding private-line network channel.
In a specific embodiment of the present disclosure, continuing the above example, Bank A sends a registration request to the TEE management end (the task executor). The TEE management end responds to the registration request by establishing a private-line network transmission channel between itself and Bank A, and returns the registration result corresponding to the registration request through that channel.
In this way, by registering with the task executor, any data provider can use the trusted execution environment of the task executor for multiparty privacy computation, which improves the model learning effect while ensuring data privacy and security.
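The registration step can be sketched as follows. This is a hedged illustration only: `TeeManagementEnd` and its `register` method are hypothetical names, the "channel" is just an identifier standing in for a provisioned private-line network channel, and storing a public key at registration follows the Bank A example in this description rather than any defined interface.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class TeeManagementEnd:
    """Hypothetical registry kept by the task executor."""
    channels: Dict[str, str] = field(default_factory=dict)       # party -> channel id
    public_keys: Dict[str, bytes] = field(default_factory=dict)  # party -> public key

    def register(self, party: str, public_key: bytes) -> Dict[str, str]:
        if party not in self.channels:
            # Stand-in for provisioning a private-line network channel.
            self.channels[party] = "channel-" + secrets.token_hex(4)
            self.public_keys[party] = public_key
            status = "registered"
        else:
            status = "already_registered"
        # The registration result is returned over the party's own channel.
        return {"party": party, "status": status, "channel": self.channels[party]}


tee = TeeManagementEnd()
first = tee.register("bank_a", b"bank-a-public-key")
second = tee.register("bank_a", b"bank-a-public-key")
print(first["status"], second["status"])
```

Making registration idempotent, so that a repeated request returns the existing channel, is a design choice of this sketch and is not stated in the source.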
Further, to prevent the local data provided by the task initiator from being maliciously obtained and leaked, the data may be encrypted with an encryption algorithm. Specifically, the method further includes: in response to a task execution request sent by the task initiator, determining an encryption public key corresponding to the task initiator and generating a symmetric key corresponding to the task execution request.
The encryption public key can be understood as the public key of a key pair generated by the task initiator, and the symmetric key can be understood as a session key for a single task, generated by negotiation between the task initiator and the task executor. Specifically, the symmetric key is used to encrypt the data transmitted between the two parties, and the symmetric key itself is encrypted in advance with the encryption public key. The data exchanged between the task executor and the task initiator therefore includes at least the symmetric key encrypted with the encryption public key and the transmission data encrypted with the symmetric key.
In practical applications, the symmetric key for encrypting the transmission data is generated by key negotiation between the task initiator and the task executor before each task execution, for example a key of the length required by AES (Advanced Encryption Standard) in CBC (Cipher Block Chaining) mode with PKCS5 padding (PKCS: Public-Key Cryptography Standards).
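The padding scheme named in this example can be shown without any cryptography library. The sketch below implements PKCS-style padding only, not AES itself, and the function names `pkcs_pad` and `pkcs_unpad` are hypothetical. Strictly, PKCS#5 defines this padding for 8-byte blocks and PKCS#7 generalizes it to other block sizes; many libraries nevertheless use the PKCS5 name with the 16-byte blocks of AES, which is the assumption made here.

```python
def pkcs_pad(data: bytes, block_size: int = 16) -> bytes:
    """Append n bytes, each of value n, so the length is a multiple of block_size."""
    if not 1 <= block_size <= 255:
        raise ValueError("block_size must be between 1 and 255")
    n = block_size - len(data) % block_size  # always 1..block_size
    return data + bytes([n]) * n


def pkcs_unpad(padded: bytes, block_size: int = 16) -> bytes:
    """Strip the padding, rejecting malformed input."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("input length is not a positive multiple of block_size")
    n = padded[-1]
    if not 1 <= n <= block_size or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return padded[:-n]


message = b"sample row"
padded = pkcs_pad(message)
print(len(padded), pkcs_unpad(padded) == message)
```

Input that is already block-aligned still receives a full extra block of padding, which is what makes unpadding unambiguous.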
In a specific embodiment of the present disclosure, continuing the above example, Bank A generates a key pair while registering with the TEE management end and gives the encryption public key of the key pair to the TEE management end. After Bank A sends a task execution request, the two parties generate the symmetric key corresponding to the task through key negotiation.
In this way, the encryption public key and the symmetric key further ensure the privacy and security of the data during privacy computation.
Further, before the task executor responds to the task execution request sent by the task initiator, identity authentication is performed between the task executor and the task initiator, so that the situation of data leakage is avoided, and before the task executor responds to the task execution request sent by the task initiator, the method further comprises: receiving an identity verification request sent by the task initiator, and determining initiator identity information corresponding to the task initiator; transmitting identity information of an executive party to the task initiator, wherein the identity information of the executive party is used for the task initiator to carry out identity verification on the task executive party; and carrying out identity verification on the task initiator based on the identity information of the initiator, and executing a step of responding to a task execution request sent by the task initiator and creating a virtual container corresponding to the task execution request under the condition that the verification is passed.
The identity verification request can be understood as a request by the task initiator to verify the identity of the task executor. The initiator identity information can be understood as information used to verify the identity of the task initiator; it may be identification information of the privacy computing client deployed by the task initiator. The executor identity information can be understood as information used to verify the identity of the task executor; it may be authentication report information of the trusted execution environment of the task executor. After the task initiator verifies the identity of the task executor based on the executor identity information, it sends a task execution request to the task executor; after the task executor verifies the identity of the task initiator based on the initiator identity information, it can continue to execute the task according to the task execution request.
In practical applications, to prevent a third party from acquiring the local data of the task initiator, or the data of other data providers held in the task executor, the task initiator and the task executor must authenticate each other before each task execution. In a specific implementation, both the TEE management end and the data providers deploy the privacy computing application, and the privacy computing service provider can provide and regularly update the image file. The TEE management end can compare the registration information corresponding to the task initiator with the initiator identity information; if they are consistent, the identity verification passes. The task initiator can obtain the executor identity information sent by the TEE management end, which may be authentication report information of the virtual container. The authentication report information may contain information such as the metric value, version number, and owner of the virtual machine. The legitimacy of the virtual container's identity can be confirmed from information in the report such as the signature, and the legitimacy of the running environment can be confirmed by checking the metric value and similar fields. The task initiator can therefore check the authentication report to confirm that the running environment is legitimate, which completes the identity verification of the TEE management end, that is, the task executor.
Based on the method, the trust degree between the task initiator and the task executor is improved through mutual identity authentication between the task initiator and the task executor, and the safety of data involved in subsequent task execution can be ensured.
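The two verification directions described above can be sketched as follows. This is an illustrative sketch only: a real trusted execution environment signs its authentication report with a hardware-rooted key, whereas here an HMAC over the report fields stands in for that signature. The report layout, the shared demo key, and the function names are all assumptions and are not taken from the specification.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class AttestationReport:
    measurement: str  # metric value of the running environment
    version: str
    owner: str
    signature: str    # hex HMAC over the fields above (stand-in for a TEE signature)


def _payload(measurement: str, version: str, owner: str) -> bytes:
    return f"{measurement}|{version}|{owner}".encode()


def sign_report(key: bytes, measurement: str, version: str, owner: str) -> AttestationReport:
    """Executor side: produce a signed authentication report (hypothetical format)."""
    sig = hmac.new(key, _payload(measurement, version, owner), hashlib.sha256).hexdigest()
    return AttestationReport(measurement, version, owner, sig)


def initiator_verifies_executor(report: AttestationReport, key: bytes,
                                expected_measurement: str) -> bool:
    """Initiator side: check the signature, then the metric value."""
    expected_sig = hmac.new(
        key, _payload(report.measurement, report.version, report.owner), hashlib.sha256
    ).hexdigest()
    signature_ok = hmac.compare_digest(report.signature.encode(), expected_sig.encode())
    return signature_ok and report.measurement == expected_measurement


def executor_verifies_initiator(registry: dict, client_id: str, presented_identity: str) -> bool:
    """Executor side: compare presented identity info with the registration record."""
    registered = registry.get(client_id)
    if registered is None:
        return False
    return hmac.compare_digest(registered.encode(), presented_identity.encode())
```

The task would proceed only when both checks return `True`; a failure in either direction stops the flow before any virtual container is created.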
Further, to prevent the training data from being acquired by a third party and causing data leakage for the participants, the task needs to be executed in a trusted execution environment. Specifically, creating, in response to the task execution request sent by the task initiator, a virtual container corresponding to the task execution request includes: determining, in response to the task execution request sent by the task initiator, task resource information of the task execution request; and scheduling virtual resources according to the task resource information, and creating the virtual container corresponding to the task execution request by using the virtual resources.
The task resource information may be understood as the amount of virtual resources required to execute the task corresponding to the task execution request; for example, executing task A requires 1 CPU core and 4 GB of memory. The virtual resources can be understood as the virtualized resources of the device. Scheduling virtual resources to create a virtual container means allocating part of the virtual resources to the virtual container to execute the corresponding task; that part is released after the task has been executed.
In a specific embodiment of the present disclosure, referring to the above example, the TEE management end responds to the task execution request sent by bank A for the training task of a recommendation model for predicting a user's borrowing amount, determines that the task resource information corresponding to the task execution request is 2 CPU cores and 4 GB of memory, schedules that part of the resources, and creates a virtual container.
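The allocate-then-release behaviour described above can be sketched with a simple in-memory resource pool. This is a minimal illustration under stated assumptions: the class names `ResourcePool` and `VirtualContainer` are hypothetical, and the "container" here is only a bookkeeping record, not an actual TEE-backed container.

```python
from dataclasses import dataclass


@dataclass
class VirtualContainer:
    cores: int
    memory_gb: int
    released: bool = False


@dataclass
class ResourcePool:
    free_cores: int
    free_memory_gb: int

    def create_container(self, cores: int, memory_gb: int) -> VirtualContainer:
        """Reserve the resources named in the task resource information."""
        if cores > self.free_cores or memory_gb > self.free_memory_gb:
            raise RuntimeError("insufficient virtual resources for this task")
        self.free_cores -= cores
        self.free_memory_gb -= memory_gb
        return VirtualContainer(cores, memory_gb)

    def release(self, container: VirtualContainer) -> None:
        """Return the container's resources to the pool once the task is done."""
        if container.released:
            return  # guard against releasing the same container twice
        self.free_cores += container.cores
        self.free_memory_gb += container.memory_gb
        container.released = True
```

For the bank A example, `ResourcePool(8, 16).create_container(2, 4)` would reserve 2 cores and 4 GB, leaving 6 cores and 12 GB free until `release` is called.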
Step 204: determining at least one data provider according to the task execution request, acquiring a sample data set corresponding to the at least one data provider, and loading the sample data set into the virtual container.
The data provider can be understood as a party providing sample data in the model training task, in order to improve the model training effect, the model training can be performed by introducing data provided by multiple parties, so that a plurality of data providers can be determined according to a task execution request of a task initiator, sample data provided by each data provider are obtained and fused to generate a sample data set, and the model is trained by using the sample data set, so that a model with better prediction effect is obtained.
In practical applications, the task initiator may itself act as a data provider and add its local sample data to the sample data set. The data providers may therefore include the task initiator together with other data-providing parties, or only other data-providing parties, as determined by the actual training task.
In a specific embodiment of the present disclosure, referring to the above example, the TEE management end, that is, the task executor, determines according to the task execution request that the data providers are bank A, which is also the task initiator, and bank B, which provides data. It obtains the sample data corresponding to each data provider, combines the sample data into a sample data set, and loads the sample data set into the virtual container, so that the virtual container can subsequently perform the training task based on the sample data set.
Further, to avoid failing to train the expected model because a data provider was determined incorrectly, the data provider is determined based on the data source in the task execution request. Specifically, determining at least one data provider according to the task execution request includes: determining data source information carried in the task execution request; and determining at least one data provider according to the data source information.
The data source information can be understood as information on the source of the data; it may include identity information of the data provider, data identification information, and the like. The data providers and the sample data that need to participate in the current model training can be determined from the data source information.
In a specific embodiment of the present disclosure, referring to the above example, the TEE management end parses the task execution request to obtain the data source information carried in it. The data source information is "bank A: user bill data, bank B: user bill data", from which the data providers can be determined to be bank A and bank B.
Based on this, the data providers and the data they provide can be accurately determined from the data source information carried in the task execution request, so that the acquired data can be applied to subsequent model training and the model training effect is improved.
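Determining the data providers from the data source information can be sketched as follows. The specification does not define a wire format for the task execution request, so the dictionary layout used here (a `data_sources` list with `provider` and `dataset` fields) and the function name `determine_providers` are assumptions made purely for illustration.

```python
def determine_providers(task_request: dict) -> dict:
    """Map each provider named in the request to the data sets requested from it."""
    providers = {}
    for source in task_request.get("data_sources", []):
        providers.setdefault(source["provider"], []).append(source["dataset"])
    if not providers:
        # Without data source information no provider can be determined.
        raise ValueError("task execution request carries no data source information")
    return providers


# Hypothetical request mirroring the bank A / bank B example.
example_request = {
    "task": "borrowing-amount-model",
    "data_sources": [
        {"provider": "bank A", "dataset": "user bill data"},
        {"provider": "bank B", "dataset": "user bill data"},
    ],
}
example_providers = determine_providers(example_request)
```

Rejecting a request with no data source information up front matches the stated goal of avoiding a training run that fails later because a provider was determined incorrectly.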
Further, to ensure the model training effect, after the data provider is determined, the corresponding sample data needs to be pulled from the data provider and loaded into the virtual container. Specifically, acquiring the sample data set corresponding to the at least one data provider and loading the sample data set into the virtual container includes: sending a data acquisition request to the at least one data provider based on the data source information; receiving at least one sample data returned by the at least one data provider for the data acquisition request; and generating a sample data set according to the at least one sample data and loading the sample data set into the virtual container corresponding to the task execution request.
The data acquisition request may be understood as a request by the task executor to pull data from a data provider. The data acquisition request may include information such as a data identifier and a data amount, so that after the data provider receives the data acquisition request sent by the task executor, it can determine from the information carried in the request which data the task executor needs to pull, and return the sample data to the task executor. The sample data is the data provided by each data provider, and each data provider corresponds to one piece of sample data. The task executor may therefore receive a plurality of sample data, fuse them to generate a sample data set for model training, and load the sample data set into the virtual container, so that the virtual container can perform the training task by using the sample data set.
In practical applications, when the task executor sends a data acquisition request to a data provider based on the data source information, the data provider authenticates the identity of the task executor and judges whether the execution environment of the task executor is legitimate. If the authentication passes, the data provider returns the sample data corresponding to the data acquisition request to the task executor. After receiving the sample data returned by the plurality of data providers, the task executor fuses all the sample data to generate a sample data set and loads it into the virtual container, so that the virtual container can execute the model training task by using the sample data set.
In a specific embodiment of the present disclosure, referring to the above example, data acquisition requests are sent to bank A and bank B based on the data source information to acquire the user bill data of each bank. After the data returned by the two banks is received, data fusion is performed to generate a sample data set, and the generated sample data set is loaded into the virtual container.
Based on this, the task executor generates data acquisition requests for the different data providers from the data source information and pulls the sample data of each data provider based on those requests. This ensures the completeness of the data required by the task, avoids a later failure to execute the training task normally, and improves training efficiency.
Further, for model training to proceed normally, the plurality of sample data needs to be fused into a sample data set, and the model is trained with the fused sample data in that set. Specifically, generating the sample data set according to the at least one sample data includes: acquiring a data identifier of each sample sub-data in the at least one sample data; and fusing the at least one sample data according to each data identifier, and generating the sample data set according to the fusion result.
The sample sub-data may be understood as each individual record in the sample data. For example, if the sample data is 100 pieces of user bill data, each piece of user bill data is one sample sub-data of the sample data. The data identifier of a sample sub-data may be understood as the unique identification information corresponding to that sample sub-data. Because different training tasks involve different sample data, the data identifiers of the sample sub-data also differ between training tasks. Fusing the plurality of sample data according to the data identifiers generates a sample data set that combines the sample data of the plurality of data providers.
In a specific implementation, the task initiator expects to improve the model training effect with data provided by other data providers, so the attributes of the sample data provided by other data providers may differ from those of the data provided by the task initiator. For example, the task initiator expects to train a model capable of predicting user behavior habits; the data provided by the task initiator is the users' shopping record data, and the data provided by another data provider is information on commodities that users have added to their favorites. When the sample data provided by the two parties is fused, the data identifier of each sample sub-data needs to be determined. The data identifier may be a user identifier such as a user account number or a unique user name. The shopping record data and the favorites information of the same user are combined through the user identifier, so that the plurality of sample data is fused into a sample data set.
In a specific embodiment of the present disclosure, referring to the above example, after the sample data provided by bank A and bank B is obtained, the data identifier of each sample sub-data in each sample data, that is, the identity account information of each user, is determined. Data fusion is then performed according to the identity account information: the data provided by the two banks is merged so that records belonging to the same user are combined, which generates a sample data set formed by combining the two sample data.
Based on this, once the sample data set has been generated by fusing data according to the data identifiers of the sample sub-data, the virtual container can perform model training with the sample data set, which improves the model training effect and yields a prediction model that better meets the user's expectations.
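The identifier-based fusion described above can be sketched as a keyed merge. The following is a minimal sketch under several assumptions that the specification does not state: records are plain dictionaries, the data identifier is a field named `user_id`, each attribute is prefixed with its provider name to avoid collisions, and identifiers seen by only one provider are kept rather than dropped. The function name `fuse_samples` is hypothetical.

```python
def fuse_samples(samples_by_provider: dict, id_field: str = "user_id") -> list:
    """Merge the sample data of several providers into one record per data identifier."""
    fused = {}
    for provider, records in samples_by_provider.items():
        for record in records:
            identifier = record[id_field]
            row = fused.setdefault(identifier, {id_field: identifier})
            for field, value in record.items():
                if field != id_field:
                    # Prefix with the provider name so equal field names do not collide.
                    row[f"{provider}.{field}"] = value
    # Sort by identifier so the resulting sample data set has a stable order.
    return [fused[identifier] for identifier in sorted(fused)]


# Hypothetical sample data mirroring the bank A / bank B example.
example_samples = {
    "bank_a": [{"user_id": "u1", "bill": 100}, {"user_id": "u2", "bill": 50}],
    "bank_b": [{"user_id": "u1", "bill": 70}],
}
example_data_set = fuse_samples(example_samples)
```

Whether records without a counterpart at every provider should be kept is a design choice; a training task that needs complete feature rows could instead filter the result to identifiers present at all providers.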
Step 206: receiving model processing information sent by the task initiator, and determining, by the virtual container loaded with the sample data set, an initial task model and a model training strategy corresponding to the initial task model based on the model processing information.
The model processing information may be understood as information submitted by the task initiator for model training processing in the model training task; it may include model construction information, model training information, and the like. An initial task model and a model training strategy corresponding to the initial task model can be constructed from the model processing information. The initial task model may be understood as a model constructed according to the instructions of the task initiator, that is, the model before training. The model training strategy may be understood as the training strategy formulated for the initial task model; it may include information such as model training target parameters and the number of training rounds. Training the initial task model according to the model training strategy yields a prediction model that meets the user's expectations.
In the implementation, the model processing information is executable code blocks submitted by a user through an interactive development window, and the virtual container can analyze and execute the executable codes after receiving the executable codes submitted by the user, so that an initial task model is constructed and a model training strategy corresponding to the initial task model is determined.
In a specific embodiment of the present disclosure, referring to the above example, the model processing information submitted by bank A, that is, the model code, is received. The model code is parsed and executed by the virtual container to construct the initial task model of the current model training task and to determine the model training strategy used to train the initial task model in this task.
Further, to give the user interactive model programming capability and to improve the visibility of the model to the user and the user's autonomy over it, a model building address needs to be provided to the user. Specifically, receiving the model processing information sent by the task initiator includes: sending a model building address corresponding to the virtual container to the task initiator, wherein the model building address is used by the task initiator to determine a processing interface for the model processing information; and receiving the model processing information returned by the task initiator based on the model building address.
The model building address can be understood as the URL (Uniform Resource Locator) address of the machine learning server in the virtual container. The model building address is called back to the privacy computing client of the task initiator; the privacy computing client can then redirect to the URL address and open an interactive development window, through which the task initiator can program the model. The model building address thus provides the task initiator with a processing interface for model programming, through which the task initiator can submit the written model code to the task executor.
In specific implementation, a machine learning server is deployed in the virtual container and is used for executing model codes submitted by a task initiator, so that training tasks are realized. The task initiator can independently construct an initial training model through the online interactive programming window, and a corresponding training strategy is formulated, so that the participation of a user in model training is improved.
In a specific embodiment of the present disclosure, the TEE management end sends the model building address, that is, the URL address of the machine learning server in the virtual container, to bank A. Bank A redirects to the URL through its privacy computing client, writes the code in the interactive development window, and returns the code to the virtual container.
Based on the method, the visibility and autonomy of the model to the user can be effectively improved by providing the online programming processing interface for the user, and the participation of the user to the model training is improved.
Furthermore, to prevent a user from submitting non-compliant code that exceeds its authority and thereby causing a data privacy and security problem, violation detection needs to be performed on the model code submitted by the user. Specifically, determining, by the virtual container loaded with the sample data set, the initial task model and the model training strategy corresponding to the initial task model based on the model processing information includes: detecting, by the virtual container loaded with the sample data set, the model processing information according to a preset information detection rule; and constructing the initial task model according to the detection result, and generating the model training strategy corresponding to the initial task model.
The preset information detection rule can be understood as a rule for detecting whether the model processing information has violation information, and whether the model processing information has violation information can be detected by detecting the model processing information according to the preset information detection rule, so that the task initiator is prevented from acquiring privacy data of other data providers in an unauthorized manner.
In practical applications, a detection component can be deployed in the virtual container. The detection component performs compliance detection on the model code submitted by the user, blocks code logic that endangers privacy, and submits the filtered model code to the machine learning server for execution. In a specific implementation, when the detection component detects non-compliant code in the model code submitted by the user, it can also return the violation information directly to the task initiator, so that the task initiator corrects the non-compliant code and resubmits the model code. The specific handling strategy after detection can be formulated according to actual requirements.
In a specific embodiment of the present disclosure, referring to the above example, the model code submitted by bank A is received and detected by the detection component in the virtual container according to the preset information detection rule. If the detection result is that no non-compliant code exists in the model code, the model code is forwarded to the machine learning server, which executes it to carry out the training task for the initial task model.
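One way to picture such a detection component is a static scan of the submitted model code before it reaches the machine learning server. The sketch below is an illustration under stated assumptions, not the detection rule of the specification: it assumes the model code is Python, and the deny-lists `FORBIDDEN_MODULES` and `FORBIDDEN_CALLS` as well as the function name `detect_violations` are hypothetical. A deny-list of this kind is not a complete sandbox.

```python
import ast

# Hypothetical preset information detection rule: modules and built-in calls
# that could be used to read or export data outside the training task.
FORBIDDEN_MODULES = {"os", "subprocess", "socket", "shutil"}
FORBIDDEN_CALLS = {"open", "exec", "eval", "__import__"}


def detect_violations(model_code: str) -> list:
    """Return human-readable violations found in the submitted model code."""
    found = []
    for node in ast.walk(ast.parse(model_code)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in FORBIDDEN_MODULES:
                    found.append((node.lineno, f"import of '{alias.name}'"))
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in FORBIDDEN_MODULES:
                found.append((node.lineno, f"import from '{node.module}'"))
        elif isinstance(node, ast.Call):
            if isinstance(node.func, ast.Name) and node.func.id in FORBIDDEN_CALLS:
                found.append((node.lineno, f"call to '{node.func.id}'"))
    # Report in source order so the task initiator can fix the code top to bottom.
    return [f"line {lineno}: {message}" for lineno, message in sorted(found)]
```

An empty result would let the code be forwarded for execution; a non-empty result corresponds to the alternative described above, in which the violation information is returned to the task initiator for correction and resubmission.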
Step 208: training the initial task model in a virtual container loaded with the sample data set according to the model training strategy, and feeding back a task execution result to the task initiator according to a training result.
The training result may be understood as the result obtained after training the initial task model; it may include the task model that satisfies the training conditions, evaluation parameters associated with the task model, and other data. The task execution result may be understood as the execution result of the training task returned to the task initiator; it may include the evaluation parameters, model parameters, and other data obtained for the task model that satisfies the training conditions. Feeding the task execution result back to the task initiator allows the task initiator to obtain the task model generated by the privacy computation, which achieves the goal of training the prediction model the user expects.
In a specific embodiment of the present disclosure, the machine learning server in the virtual container trains the initial task model according to the model training strategy. After training is completed, a task model for predicting a user's borrowing amount that satisfies the training conditions is obtained, and the relevant model parameters of that task model are returned to bank A.
Further, to prevent the private data of the task initiator from being maliciously acquired by a third party during model training, the interaction data between the task initiator and the task executor can be encrypted. Specifically, feeding back the task execution result to the task initiator according to the training result includes: encrypting the training result by using the symmetric key to obtain an encrypted training result, and encrypting the symmetric key by using the encryption public key to obtain an encrypted symmetric key; and generating the task execution result according to the encrypted training result and the encrypted symmetric key and feeding it back to the task initiator.
The symmetric key is the encryption key generated by negotiation between the two parties for the current training task. Encrypting the training result with the symmetric key yields the encrypted training result. So that the task initiator can decrypt the encrypted training result, the symmetric key also needs to be synchronized to the task initiator. To prevent the symmetric key from being obtained by other parties, which would leak the private data in the training result, the symmetric key is itself encrypted with the encryption public key to obtain the encrypted symmetric key. This ensures that the interaction data between the task initiator and the task executor is not acquired by a third party and protects the privacy and security of the data.
In a specific embodiment of the present disclosure, the virtual container encrypts the model training result with the symmetric key to obtain an encrypted training result, encrypts the symmetric key with the encryption public key to obtain an encrypted symmetric key, and combines the two to generate a task execution result, which is returned to the privacy computing client of bank A. Bank A can decrypt the encrypted symmetric key with the decryption private key it holds to obtain the symmetric key, and then decrypt the encrypted training result with the symmetric key to obtain the training result, thereby protecting the privacy of the interaction data between the two parties.
The programmable data processing method based on the trusted execution environment is applied to a task execution party and comprises the steps of responding to a task execution request sent by a task initiation party, and creating a virtual container corresponding to the task execution request, wherein the virtual container belongs to the trusted execution environment; determining at least one data provider according to the task execution request, acquiring a sample data set corresponding to the at least one data provider, and loading the sample data set into the virtual container; receiving model processing information sent by the task initiator, and determining an initial task model and a model training strategy corresponding to the initial task model based on the model processing information through loading a virtual container of the sample data set; training the initial task model in a virtual container loaded with the sample data set according to the model training strategy, and feeding back a task execution result to the task initiator according to a training result. The method comprises the steps that a task executive party provides a trusted execution environment by creating a virtual container corresponding to a task execution request, a sample data set required in the task execution request is loaded into the virtual container, an initial task model and a model training strategy corresponding to the initial task model are determined based on model processing information by the virtual container loaded with the sample data set, then the initial task model is trained in the virtual container according to the model training strategy, so that model training calculation is performed in the trusted execution environment of the task executive party, and privacy and safety of the sample data set are guaranteed.
The programmable data processing method based on a trusted execution environment is further described below with reference to FIG. 3, taking its application to the training task of a user preference prediction model as an example. FIG. 3 is a flowchart of a programmable data processing method based on a trusted execution environment according to one embodiment of the present disclosure, which specifically includes the following steps.
Step 302: in response to a registration request sent by the task initiator, establishing a target transmission channel corresponding to the task initiator, and returning a registration result corresponding to the registration request through the target transmission channel.
In one implementation, the task initiator is shopping platform A. Shopping platform A sends a registration request to the TEE management end; after responding to the registration request, the TEE management end establishes a target transmission channel between the two parties and returns a successful registration result to the privacy computing client of shopping platform A through the target transmission channel.
Step 304: responding to a task execution request sent by a task initiator, and determining task resource information of the task execution request; and scheduling the virtual resources according to the task resource information, and creating a virtual container corresponding to the task execution request by utilizing the virtual resources.
In one implementation, the task initiator intends to train a commodity recommendation model capable of predicting the commodities to recommend to a user. It builds a model training task for the commodity recommendation model and sends a task execution request for the task to the TEE management end. After receiving the task execution request, the TEE management end determines that the task resource information of the task is 2 CPU cores and 4 GB of memory, and schedules the corresponding virtual resources to create a virtual container, which is used to execute the model training task.
Step 306: determining data source information carried in the task execution request, and determining at least one data provider according to the data source information.
In one implementation, the data source information carried in the task execution request is determined. The data source information comprises the user historical purchasing behavior data of shopping platform A and the user historical purchasing behavior data of shopping platform B, from which the data providers are determined to be shopping platform A and shopping platform B.
Step 308: sending a data acquisition request to at least one data provider based on the data source information, and receiving at least one sample data returned by the at least one data provider for the data acquisition request.
In one implementation, the TEE management end sends data acquisition requests to shopping platform A and shopping platform B respectively through the virtual container. The sample data provided by shopping platform A is the historical purchasing behavior data of the users of shopping platform A, and the sample data provided by shopping platform B is the historical purchasing behavior data of the users of shopping platform B.
Step 310: generating a sample data set according to at least one sample data and loading the sample data set into a virtual container corresponding to the task execution request.
In one implementation, a data identifier of each sample sub-data in the at least one sample data is obtained, the at least one sample data is fused according to each data identifier, and a sample data set is generated according to the fusion result. Specifically, user identifiers are determined from the user historical purchasing behavior data obtained from shopping platform A and shopping platform B, and the historical purchasing behavior data of the same user on the two shopping platforms is combined based on the user identifier. The sample data provided by the two shopping platforms is thereby fused into a sample data set, which is loaded into the machine learning server of the virtual container.
Step 312: sending a model building address corresponding to the virtual container to the task initiator, wherein the model building address is used by the task initiator to determine a processing interface for model processing information, and receiving the model processing information returned by the task initiator based on the model building address.
In one implementation, the URL address of the machine learning server is returned to the privacy computing client of shopping platform A. The client can be redirected to the URL address, and the model code edited in the interactive development window is returned to the virtual container of the TEE management end.
Step 314: detecting, by the virtual container loaded with the sample data set, the model processing information according to a preset information detection rule, constructing an initial task model according to the detection result, and generating a model training strategy corresponding to the initial task model.
In one implementation, a detection component in the virtual container checks the model code for compliance according to the preset information detection rule. The filtered model code obtained after detection is submitted to the machine learning server for execution, thereby constructing an initial task model and a model training strategy corresponding to the initial task model.
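One possible form of such a compliance check is static inspection of the submitted model code before it is executed. The following is a minimal sketch under stated assumptions: the rule sets `FORBIDDEN_MODULES` and `FORBIDDEN_CALLS` and the function `check_model_code` are hypothetical, and the sketch reports violations rather than filtering the code, whereas the specification leaves the concrete detection rule open.

```python
import ast

# Hypothetical preset detection rules: modules and built-ins that could be
# used to read or export raw sample data from the container.
FORBIDDEN_MODULES = {"os", "subprocess", "socket", "shutil"}
FORBIDDEN_CALLS = {"eval", "exec", "open", "__import__"}


def check_model_code(source: str) -> list[str]:
    """Return a list of rule violations found in the submitted model code."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in FORBIDDEN_MODULES:
                    violations.append(f"line {node.lineno}: import of '{alias.name}'")
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in FORBIDDEN_MODULES:
                violations.append(f"line {node.lineno}: import from '{node.module}'")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in FORBIDDEN_CALLS:
                violations.append(f"line {node.lineno}: call to '{node.func.id}'")
    return violations


safe_code = "model = {'layers': [64, 32]}\nepochs = 10\n"
unsafe_code = "import os\nrows = open('/data/samples.csv').read()\n"

print(check_model_code(safe_code))
print(check_model_code(unsafe_code))
```

Static inspection of this kind is easy to bypass on its own (for example through attribute access or dynamic imports), so it would normally be combined with the isolation that the virtual container already provides.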
Step 316: training the initial task model in a virtual container loaded with a sample data set according to a model training strategy, and feeding back a task execution result to a task initiator according to a training result.
In one implementation, the machine learning server deployed in the virtual container trains the initial task model according to the model training strategy to obtain a commodity recommendation model that meets the training conditions. Information such as the evaluation parameters and model parameters of the commodity recommendation model is encrypted with a symmetric key, and the symmetric key is encrypted with the encryption public key. The encrypted training result and the encrypted symmetric key are combined to generate the task execution result, which is returned to the A shopping platform, completing the model training task for the commodity recommendation model.
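The combination of a symmetric key for the training result and a public key for the symmetric key can be sketched as follows. This is an illustration only: it assumes the third-party Python `cryptography` package, uses Fernet as the symmetric scheme and RSA-OAEP as the public-key scheme (the specification names neither algorithm), and the names `package_result` and `open_result` as well as the result values are made up. The sketch also generates the symmetric key at packaging time for brevity.

```python
import json

from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

# Padding parameters shared by the encrypting and decrypting side.
OAEP = padding.OAEP(
    mgf=padding.MGF1(algorithm=hashes.SHA256()),
    algorithm=hashes.SHA256(),
    label=None,
)


def package_result(training_result: dict, initiator_public_key) -> dict:
    """Executor side: encrypt the result, then encrypt the symmetric key."""
    symmetric_key = Fernet.generate_key()
    encrypted_result = Fernet(symmetric_key).encrypt(
        json.dumps(training_result).encode("utf-8")
    )
    encrypted_key = initiator_public_key.encrypt(symmetric_key, OAEP)
    return {"encrypted_result": encrypted_result, "encrypted_key": encrypted_key}


def open_result(task_result: dict, initiator_private_key) -> dict:
    """Initiator side: recover the symmetric key, then the training result."""
    symmetric_key = initiator_private_key.decrypt(task_result["encrypted_key"], OAEP)
    plaintext = Fernet(symmetric_key).decrypt(task_result["encrypted_result"])
    return json.loads(plaintext)


# Made-up key pair and training result for demonstration.
initiator_private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
training_result = {"auc": 0.91, "weights": [0.2, -0.5, 1.3]}

task_result = package_result(training_result, initiator_private_key.public_key())
recovered = open_result(task_result, initiator_private_key)
print(recovered == training_result)
```

Encrypting the bulk result symmetrically and only the short key asymmetrically keeps the public-key operation small regardless of how large the model parameters are.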
The programmable data processing method based on the trusted execution environment provided by the specification realizes that a task executor provides a trusted execution environment by creating a virtual container corresponding to a task execution request, and loads a sample data set required in the task execution request into the virtual container, the virtual container loaded with the sample data set determines an initial task model based on model processing information and a model training strategy corresponding to the initial task model, and then trains the initial task model in the virtual container according to the model training strategy, so that model training calculation is performed in the trusted execution environment of the task executor, and the privacy and safety of the sample data set are ensured.
Corresponding to the above method embodiments, the present disclosure further provides an embodiment of a programmable data processing apparatus based on a trusted execution environment, and fig. 4 is a schematic structural diagram of a programmable data processing apparatus based on a trusted execution environment according to one embodiment of the present disclosure. As shown in fig. 4, the apparatus includes:
a creating module 402, configured to respond to a task execution request sent by a task initiator, and create a virtual container corresponding to the task execution request, where the virtual container belongs to a trusted execution environment;
An obtaining module 404, configured to determine at least one data provider according to the task execution request, obtain a sample data set corresponding to the at least one data provider, and load the sample data set into the virtual container;
a determining module 406, configured to receive model processing information sent by the task initiator, and determine an initial task model and a model training strategy corresponding to the initial task model based on the model processing information by loading a virtual container of the sample dataset;
training module 408 is configured to train the initial task model in the virtual container loaded with the sample dataset according to the model training strategy, and to feed back the task execution result to the task initiator according to the training result.
Optionally, the creating module 402 is further configured to determine task resource information of the task execution request in response to the task execution request sent by the task initiator; and scheduling virtual resources according to the task resource information, and creating a virtual container corresponding to the task execution request by utilizing the virtual resources.
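The scheduling step above can be pictured with a small sketch. It models only the bookkeeping (checking and reserving resources, then recording a container for the task); the classes `ResourcePool` and `VirtualContainer`, the function `create_container`, and the request fields are hypothetical, and no real TEE or container runtime interface is invoked.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ResourcePool:
    """Virtual resources the task executor can still hand out."""
    cpu_cores: int
    memory_gb: int

    def allocate(self, cpu_cores: int, memory_gb: int) -> None:
        if cpu_cores > self.cpu_cores or memory_gb > self.memory_gb:
            raise RuntimeError("insufficient virtual resources")
        self.cpu_cores -= cpu_cores
        self.memory_gb -= memory_gb


@dataclass
class VirtualContainer:
    """Record of a container created for one task execution request."""
    task_id: str
    cpu_cores: int
    memory_gb: int
    container_id: str = field(default_factory=lambda: uuid.uuid4().hex)


def create_container(pool: ResourcePool, task_request: dict) -> VirtualContainer:
    # Task resource information is assumed to travel inside the request.
    resources = task_request["resources"]
    pool.allocate(resources["cpu_cores"], resources["memory_gb"])
    return VirtualContainer(
        task_request["task_id"], resources["cpu_cores"], resources["memory_gb"]
    )


pool = ResourcePool(cpu_cores=16, memory_gb=64)
request = {"task_id": "task-001", "resources": {"cpu_cores": 4, "memory_gb": 8}}
container = create_container(pool, request)
print(container.task_id, pool.cpu_cores, pool.memory_gb)
```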
Optionally, the obtaining module 404 is further configured to determine data source information carried in the task execution request; at least one data provider is determined from the data source information.
Optionally, the obtaining module 404 is further configured to send a data acquisition request to the at least one data provider based on the data source information; receiving at least one sample data returned by the at least one data provider for the data acquisition request; and generating a sample data set according to the at least one sample data and loading the sample data set into a virtual container corresponding to the task execution request.
Optionally, the obtaining module 404 is further configured to obtain a data identifier of each sample sub-data in the at least one sample data; and fusing the at least one sample data according to each data identifier, and generating a sample data set according to the fusion result.
Optionally, the determining module 406 is further configured to send a model building address corresponding to the virtual container to the task initiator, where the model building address is used to enable the task initiator to determine a processing interface of model processing information; and receiving the model processing information returned by the task initiator based on the model building address.
Optionally, the determining module 406 is further configured to detect the model processing information according to a preset information detection rule by loading a virtual container of the sample dataset; and constructing an initial task model according to the detection result, and generating a model training strategy corresponding to the initial task model.
Optionally, the device further includes a verification module configured to receive an authentication request sent by the task initiator, and determine initiator identity information corresponding to the task initiator; transmitting identity information of an executive party to the task initiator, wherein the identity information of the executive party is used for the task initiator to carry out identity verification on the task executive party; and carrying out identity verification on the task initiator based on the identity information of the initiator, and executing a step of responding to a task execution request sent by the task initiator and creating a virtual container corresponding to the task execution request under the condition that the verification is passed.
Optionally, the device further includes a registration module configured to establish a target transmission channel corresponding to the task initiator in response to a registration request sent by the task initiator; and returning a registration result corresponding to the registration request through the target transmission channel.
Optionally, the apparatus further includes an encryption module configured to feed back a task execution result to the task initiator according to a training result, including: encrypting the training result by using the symmetric key to obtain an encrypted training result, and encrypting the symmetric key by using the encryption public key to obtain an encrypted symmetric key; and generating a task execution result according to the encrypted training result and the encrypted symmetric key and feeding back to the task initiator.
The programmable data processing device based on the trusted execution environment provided by the specification is applied to a task executor and comprises: the creation module, configured to respond to a task execution request sent by a task initiator and create a virtual container corresponding to the task execution request, wherein the virtual container belongs to a trusted execution environment; the acquisition module, configured to determine at least one data provider according to the task execution request, acquire a sample data set corresponding to the at least one data provider and load the sample data set into the virtual container; the determining module, configured to receive model processing information sent by the task initiator, and determine an initial task model and a model training strategy corresponding to the initial task model based on the model processing information by means of the virtual container loaded with the sample data set; and the training module, configured to train the initial task model in the virtual container loaded with the sample data set according to the model training strategy, and feed back a task execution result to the task initiator according to a training result. In this way, the task executor provides a trusted execution environment by creating a virtual container corresponding to the task execution request and loads the sample data set required by the task execution request into the virtual container. The virtual container loaded with the sample data set determines the initial task model and its corresponding model training strategy based on the model processing information, and the initial task model is then trained in the virtual container according to the model training strategy. Model training is thus computed inside the trusted execution environment of the task executor, which ensures the privacy and security of the sample data set.
The foregoing is a schematic solution of a programmable data processing apparatus based on a trusted execution environment of this embodiment. It should be noted that, the technical solution of the programmable data processing apparatus based on the trusted execution environment and the technical solution of the programmable data processing method based on the trusted execution environment belong to the same concept, and details of the technical solution of the programmable data processing apparatus based on the trusted execution environment, which are not described in detail, can be referred to the description of the technical solution of the programmable data processing method based on the trusted execution environment.
Corresponding to the above method embodiments, the present disclosure further provides an embodiment of a programmable data processing system based on a trusted execution environment, and fig. 5 shows a schematic flow chart of a programmable data processing system based on a trusted execution environment according to an embodiment of the present disclosure. As shown in fig. 5, the system includes a task initiator 502 and a task executor 504;
the task initiator 502 generates and sends a task execution request to the task executor;
in one implementation, a task initiator generates a task execution request and sends the task execution request to a task executor.
The task executor 504, responding to the task execution request, and creating a virtual container corresponding to the task execution request, wherein the virtual container belongs to a trusted execution environment; determining at least one data provider according to the task execution request, acquiring a sample data set corresponding to the at least one data provider, and loading the sample data set into the virtual container; receiving model processing information sent by the task initiator, and determining an initial task model and a model training strategy corresponding to the initial task model based on the model processing information through loading a virtual container of the sample data set; training the initial task model in a virtual container loaded with the sample data set according to the model training strategy, and feeding back a task execution result to the task initiator according to a training result.
The task initiator 502 generates a local task model according to the task execution result.
In one implementation, after receiving the task execution result, the task initiator may locally generate a local task model based on content included in the task execution result, and put the local task model into the project for implementation.
Optionally, the system further comprises a task cooperative party, wherein the task cooperative party belongs to a data provider;
the task cooperative party responds to a data acquisition request sent by the task executing party, determines the identity information of the executing party corresponding to the task executing party, performs identity verification on the task executing party based on the identity information of the executing party, and sends sample data to the task executing party under the condition that verification is passed, wherein the sample data is used for generating the sample data set.
In one implementation, the task cooperative party is a data provider that supplies data for the current model training task. After determining the data provider, the task executor sends a data acquisition request to the task cooperative party so that the task cooperative party sends its local sample data to the task executor for model training. Before doing so, the task cooperative party authenticates the identity of the task executor, that is, verifies the legitimacy of the virtual container's environment: it determines the executor identity information of the task executor, namely the authentication report information of the virtual container, verifies the legitimacy of the virtual container based on that authentication report information, and sends the local sample data to the task executor if the verification passes.
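The check performed by the task cooperative party can be sketched in simplified form. The report layout (`nonce`, `measurement`), the allow-list `TRUSTED_MEASUREMENTS`, and both function names are assumptions for illustration. A real authentication report is signed by the TEE hardware vendor and its signature chain must also be verified; that step is deliberately omitted here.

```python
import hashlib
import hmac

# Hypothetical allow-list: hashes of container images the data provider trusts.
TRUSTED_MEASUREMENTS = {
    hashlib.sha256(b"approved-container-image-v1").hexdigest(),
}


def verify_attestation(report: dict, expected_nonce: str) -> bool:
    """Accept the report only if it is fresh and measures a trusted container."""
    # The nonce binds the report to this request and blocks replayed reports.
    if not hmac.compare_digest(report.get("nonce", ""), expected_nonce):
        return False
    measurement = report.get("measurement", "")
    return any(
        hmac.compare_digest(measurement, trusted) for trusted in TRUSTED_MEASUREMENTS
    )


def handle_data_request(report: dict, expected_nonce: str, local_samples: list):
    """Release local sample data only when verification passes."""
    if not verify_attestation(report, expected_nonce):
        return None
    return local_samples


good_report = {
    "nonce": "n-123",
    "measurement": hashlib.sha256(b"approved-container-image-v1").hexdigest(),
}
bad_report = {
    "nonce": "n-123",
    "measurement": hashlib.sha256(b"tampered-image").hexdigest(),
}

print(handle_data_request(good_report, "n-123", ["sample-1"]))
print(handle_data_request(bad_report, "n-123", ["sample-1"]))
```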
Optionally, encrypting the sample data by using a cooperative encryption private key to obtain encrypted sample data, and sending the encrypted sample data to the task executor; and the task executive party decrypts the encrypted sample data by utilizing the cooperative encryption public key corresponding to the task cooperative party to obtain the sample data.
In one implementation, to secure the data exchanged between the task cooperative party and the task executor, the data between the two parties may be encrypted. The task cooperative party may encrypt the sample data using the cooperative encryption private key, and after receiving the encrypted sample data, the task executor may decrypt it using the cooperative encryption public key to obtain the sample data.
The programmable data processing system based on the trusted execution environment comprises a task initiator and a task executor. The task initiator generates a task execution request and sends it to the task executor. The task executor responds to the task execution request by creating a virtual container corresponding to the task execution request, wherein the virtual container belongs to a trusted execution environment; determines at least one data provider according to the task execution request, acquires a sample data set corresponding to the at least one data provider, and loads the sample data set into the virtual container; receives model processing information sent by the task initiator, and determines an initial task model and a model training strategy corresponding to the initial task model based on the model processing information by means of the virtual container loaded with the sample data set; and trains the initial task model in the virtual container loaded with the sample data set according to the model training strategy, and feeds back a task execution result to the task initiator according to a training result. The task initiator generates a local task model according to the task execution result. In this way, the task executor provides a trusted execution environment by creating a virtual container corresponding to the task execution request and loads the sample data set required by the task execution request into the virtual container. The virtual container loaded with the sample data set determines the initial task model and its corresponding model training strategy based on the model processing information, and the initial task model is then trained in the virtual container according to the model training strategy. Model training is thus computed inside the trusted execution environment of the task executor, which ensures the privacy and security of the sample data set.
The foregoing is a schematic illustration of a programmable data processing system based on a trusted execution environment in accordance with the present embodiments. It should be noted that, the technical solution of the programmable data processing system based on the trusted execution environment and the technical solution of the programmable data processing method based on the trusted execution environment belong to the same concept, and details of the technical solution of the programmable data processing system based on the trusted execution environment, which are not described in detail, can be referred to the description of the technical solution of the programmable data processing method based on the trusted execution environment.
Fig. 6 illustrates a block diagram of a computing device 600 provided in accordance with one embodiment of the present description. The components of computing device 600 include, but are not limited to, memory 610 and processor 620. The processor 620 is coupled to the memory 610 via a bus 630 and a database 650 is used to hold data.
Computing device 600 also includes access device 640, which enables computing device 600 to communicate via one or more networks 660. Examples of such networks include public switched telephone networks (PSTN, Public Switched Telephone Network), local area networks (LAN, Local Area Network), wide area networks (WAN, Wide Area Network), personal area networks (PAN, Personal Area Network), or combinations of communication networks such as the internet. The access device 640 may include one or more of any type of network interface, wired or wireless, such as a network interface card (NIC, Network Interface Controller), an IEEE 802.11 wireless local area network (WLAN, Wireless Local Area Network) wireless interface, a worldwide interoperability for microwave access (Wi-MAX, Worldwide Interoperability for Microwave Access) interface, an Ethernet interface, a universal serial bus (USB, Universal Serial Bus) interface, a cellular network interface, a Bluetooth interface, or a near field communication (NFC, Near Field Communication) interface.
In one embodiment of the present description, the above-described components of computing device 600, as well as other components not shown in FIG. 6, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device shown in FIG. 6 is for exemplary purposes only and is not intended to limit the scope of the present description. Those skilled in the art may add or replace other components as desired.
Computing device 600 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smart phone), wearable computing device (e.g., smart watch, smart glasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or personal computer (PC, Personal Computer). Computing device 600 may also be a mobile or stationary server.
Wherein the processor 620 is configured to execute computer-executable instructions that, when executed by the processor, perform the steps of the programmable data processing method described above based on a trusted execution environment.
The foregoing is a schematic illustration of a computing device of this embodiment. It should be noted that, the technical solution of the computing device and the technical solution of the programmable data processing method based on the trusted execution environment belong to the same concept, and details of the technical solution of the computing device, which are not described in detail, can be referred to the description of the technical solution of the programmable data processing method based on the trusted execution environment.
An embodiment of the present disclosure also provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the programmable data processing method described above based on a trusted execution environment.
The above is an exemplary version of a computer-readable storage medium of the present embodiment. It should be noted that, the technical solution of the storage medium and the technical solution of the programmable data processing method based on the trusted execution environment belong to the same concept, and details of the technical solution of the storage medium which are not described in detail can be referred to the description of the technical solution of the programmable data processing method based on the trusted execution environment.
An embodiment of the present disclosure further provides a computer program, where the computer program, when executed in a computer, causes the computer to perform the steps of the programmable data processing method based on a trusted execution environment.
The above is an exemplary version of a computer program of the present embodiment. It should be noted that, the technical solution of the computer program and the technical solution of the programmable data processing method based on the trusted execution environment belong to the same concept, and details of the technical solution of the computer program, which are not described in detail, can be referred to the description of the technical solution of the programmable data processing method based on the trusted execution environment.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The computer instructions include computer program code, which may be in source code form, object code form, an executable file, some intermediate form, and so on. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so forth. It should be noted that the content of the computer readable medium can be increased or decreased appropriately according to the requirements of patent practice; for example, in some jurisdictions, according to patent practice, the computer readable medium does not include electrical carrier signals and telecommunication signals.
It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of combinations of actions, but it should be understood by those skilled in the art that the embodiments are not limited by the order of actions described, as some steps may be performed in other order or simultaneously according to the embodiments of the present disclosure. Further, those skilled in the art will appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily all required for the embodiments described in the specification.
In the foregoing embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
The preferred embodiments of the present specification disclosed above are merely used to help clarify the present specification. Alternative embodiments are not intended to be exhaustive or to limit the invention to the precise form disclosed. Obviously, many modifications and variations are possible in light of the teaching of the embodiments. The embodiments were chosen and described in order to best explain the principles of the embodiments and the practical application, to thereby enable others skilled in the art to best understand and utilize the invention. This specification is to be limited only by the claims and the full scope and equivalents thereof.

Claims (15)

1. A programmable data processing method based on a trusted execution environment, applied to a task execution party, comprising:
responding to a task execution request sent by a task initiator, and creating a virtual container corresponding to the task execution request, wherein the virtual container belongs to a trusted execution environment;
determining at least one data provider according to the task execution request, acquiring a sample data set corresponding to the at least one data provider, and loading the sample data set into the virtual container;
receiving model processing information sent by the task initiator, and determining an initial task model and a model training strategy corresponding to the initial task model based on the model processing information through loading a virtual container of the sample data set;
training the initial task model in a virtual container loaded with the sample data set according to the model training strategy, and feeding back a task execution result to the task initiator according to a training result.
2. The method of claim 1, wherein creating a virtual container corresponding to a task execution request in response to the task execution request sent by a task initiator comprises:
Responding to a task execution request sent by a task initiator, and determining task resource information of the task execution request;
and scheduling virtual resources according to the task resource information, and creating a virtual container corresponding to the task execution request by utilizing the virtual resources.
3. The method of claim 1, wherein determining at least one data provider based on the task execution request comprises:
determining data source information carried in the task execution request;
at least one data provider is determined from the data source information.
4. A method according to claim 3, wherein obtaining and loading a sample data set corresponding to the at least one data provider into the virtual container comprises:
transmitting a data acquisition request to the at least one data provider based on the data source information;
receiving at least one sample data returned by the at least one data provider for the data acquisition request;
and generating a sample data set according to the at least one sample data and loading the sample data set into a virtual container corresponding to the task execution request.
5. The method of claim 4, wherein generating a sample data set from the at least one sample data comprises:
Acquiring a data identifier of each sample sub-data in the at least one sample data;
and fusing the at least one sample data according to each data identifier, and generating a sample data set according to the fusion result.
6. The method of claim 1, wherein receiving model processing information sent by the task initiator comprises:
sending a model building address corresponding to the virtual container to the task initiator, wherein the model building address is used for enabling the task initiator to determine a processing interface of model processing information;
and receiving the model processing information returned by the task initiator based on the model building address.
7. The method of claim 1, wherein determining an initial task model and a model training strategy corresponding to the initial task model based on the model processing information by loading a virtual container of the sample dataset comprises:
detecting the model processing information according to a preset information detection rule by loading a virtual container of the sample data set;
and constructing an initial task model according to the detection result, and generating a model training strategy corresponding to the initial task model.
8. The method of claim 1, wherein prior to responding to the task execution request sent by the task initiator, the method further comprises:
receiving an identity verification request sent by the task initiator, and determining initiator identity information corresponding to the task initiator;
transmitting identity information of an executive party to the task initiator, wherein the identity information of the executive party is used for the task initiator to carry out identity verification on the task executive party;
and carrying out identity verification on the task initiator based on the identity information of the initiator, and executing a step of responding to a task execution request sent by the task initiator and creating a virtual container corresponding to the task execution request under the condition that the verification is passed.
9. The method according to claim 1, wherein the method further comprises:
responding to a registration request sent by the task initiator, and establishing a target transmission channel corresponding to the task initiator;
and returning a registration result corresponding to the registration request through the target transmission channel.
10. The method according to claim 1, wherein the method further comprises:
responding to a task execution request sent by a task initiator, determining an encryption public key corresponding to the task initiator and generating a symmetric key corresponding to the task execution request;
And feeding back a task execution result to the task initiator according to the training result, wherein the method comprises the following steps:
encrypting the training result by using the symmetric key to obtain an encrypted training result, and encrypting the symmetric key by using the encryption public key to obtain an encrypted symmetric key;
and generating a task execution result according to the encrypted training result and the encrypted symmetric key and feeding back to the task initiator.
11. A programmable data processing system based on a trusted execution environment, the system comprising a task initiator and a task executor;
the task initiator generates and sends a task execution request to the task executor;
the task executive party responds to the task execution request to create a virtual container corresponding to the task execution request, wherein the virtual container belongs to a trusted execution environment; determining at least one data provider according to the task execution request, acquiring a sample data set corresponding to the at least one data provider, and loading the sample data set into the virtual container; receiving model processing information sent by the task initiator, and determining an initial task model and a model training strategy corresponding to the initial task model based on the model processing information through loading a virtual container of the sample data set; training the initial task model in a virtual container loaded with the sample data set according to the model training strategy, and feeding back a task execution result to the task initiator according to a training result;
And the task initiator generates a local task model according to the task execution result.
12. The system of claim 11, further comprising a task cooperative party, wherein the task cooperative party belongs to a data provider;
the task orchestrator, in response to a data acquisition request sent by the task executor, determines executor identity information corresponding to the task executor, performs identity verification on the task executor based on the executor identity information, and sends sample data to the task executor if the verification passes, wherein the sample data is used for generating the sample data set.
13. The system of claim 12, wherein the task orchestrator encrypts the sample data using an orchestrator encryption private key to obtain encrypted sample data, and sends the encrypted sample data to the task executor;
and the task executor decrypts the encrypted sample data using the orchestrator encryption public key corresponding to the task orchestrator to obtain the sample data.
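The exchange in claims 12 and 13 can be sketched as below. The sketch assumes that identity verification is a lookup of a pre-registered token, which the patent does not specify, and it uses toy textbook RSA on integer sample values, which is not secure. `REGISTERED_EXECUTORS`, `verify_executor`, `handle_data_request`, and `recover_samples` are hypothetical names.

```python
import hmac

# Toy RSA key pair of the task orchestrator (textbook RSA, no padding, NOT secure).
P, Q = 2**107 - 1, 2**127 - 1         # two known Mersenne primes
N, E = P * Q, 65537                   # public key shared with the task executor
D = pow(E, -1, (P - 1) * (Q - 1))     # private exponent kept by the orchestrator

# Hypothetical registry mapping executor identities to expected tokens.
REGISTERED_EXECUTORS = {"executor-001": b"shared-identity-token"}


def verify_executor(executor_id, token):
    """Check the executor's identity information in constant time."""
    expected = REGISTERED_EXECUTORS.get(executor_id)
    return expected is not None and hmac.compare_digest(expected, token)


def handle_data_request(executor_id, token, sample_data):
    """Orchestrator side: release private-key-encrypted samples only after verification."""
    if not verify_executor(executor_id, token):
        return None
    return [pow(value, D, N) for value in sample_data]


def recover_samples(encrypted_samples, public_key):
    """Executor side: decrypt with the orchestrator's public key."""
    n, e = public_key
    return [pow(c, e, n) for c in encrypted_samples]
```

Because the public key is not secret, encrypting in this direction mainly lets the executor confirm that the data came from the holder of the private key; it does not by itself keep the samples confidential in transit.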
14. A computing device, comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions, which, when executed by the processor, implement the steps of the programmable data processing method based on a trusted execution environment according to any one of claims 1 to 10.
15. A computer-readable storage medium, characterized in that it stores computer-executable instructions which, when executed by a processor, implement the steps of the programmable data processing method based on a trusted execution environment as claimed in any one of claims 1 to 10.
CN202311028439.0A 2023-08-14 2023-08-14 Programmable data processing method and system based on trusted execution environment Active CN116992458B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311028439.0A CN116992458B (en) 2023-08-14 2023-08-14 Programmable data processing method and system based on trusted execution environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311028439.0A CN116992458B (en) 2023-08-14 2023-08-14 Programmable data processing method and system based on trusted execution environment

Publications (2)

Publication Number Publication Date
CN116992458A (en) 2023-11-03
CN116992458B (en) 2024-09-03

Family

ID=88533811

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311028439.0A Active CN116992458B (en) 2023-08-14 2023-08-14 Programmable data processing method and system based on trusted execution environment

Country Status (1)

Country Link
CN (1) CN116992458B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117390659A (en) * 2023-12-13 2024-01-12 江苏量界数据科技有限公司 Authority control method based on distributed data calculation
CN117473324A (en) * 2023-11-16 2024-01-30 北京熠智科技有限公司 Model training method, system and storage medium based on SGX and XGBoost

Citations (6)

Publication number Priority date Publication date Assignee Title
CN110826053A (en) * 2019-10-11 2020-02-21 北京市天元网络技术股份有限公司 Container-based data sandbox operation result safe output method and device
CN111310208A (en) * 2020-02-14 2020-06-19 云从科技集团股份有限公司 Data processing method, system, platform, equipment and machine readable medium
CN113505895A (en) * 2021-08-05 2021-10-15 上海高德威智能交通系统有限公司 Machine learning engine service system, model training method and configuration method
CN113569987A (en) * 2021-08-19 2021-10-29 北京沃东天骏信息技术有限公司 Model training method and device
CN114586048A (en) * 2019-09-14 2022-06-03 甲骨文国际公司 Machine Learning (ML) infrastructure techniques
WO2022174787A1 (en) * 2021-02-22 2022-08-25 支付宝(杭州)信息技术有限公司 Model training

Patent Citations (7)

Publication number Priority date Publication date Assignee Title
CN114586048A (en) * 2019-09-14 2022-06-03 甲骨文国际公司 Machine Learning (ML) infrastructure techniques
CN110826053A (en) * 2019-10-11 2020-02-21 北京市天元网络技术股份有限公司 Container-based data sandbox operation result safe output method and device
CN111310208A (en) * 2020-02-14 2020-06-19 云从科技集团股份有限公司 Data processing method, system, platform, equipment and machine readable medium
WO2021159684A1 (en) * 2020-02-14 2021-08-19 云从科技集团股份有限公司 Data processing method, system and platform, and device and machine-readable medium
WO2022174787A1 (en) * 2021-02-22 2022-08-25 支付宝(杭州)信息技术有限公司 Model training
CN113505895A (en) * 2021-08-05 2021-10-15 上海高德威智能交通系统有限公司 Machine learning engine service system, model training method and configuration method
CN113569987A (en) * 2021-08-19 2021-10-29 北京沃东天骏信息技术有限公司 Model training method and device

Cited By (3)

Publication number Priority date Publication date Assignee Title
CN117473324A (en) * 2023-11-16 2024-01-30 北京熠智科技有限公司 Model training method, system and storage medium based on SGX and XGBoost
CN117390659A (en) * 2023-12-13 2024-01-12 江苏量界数据科技有限公司 Authority control method based on distributed data calculation
CN117390659B (en) * 2023-12-13 2024-04-02 江苏量界数据科技有限公司 Authority control method based on distributed data calculation

Also Published As

Publication number Publication date
CN116992458B (en) 2024-09-03

Similar Documents

Publication Publication Date Title
CN110633805B (en) Longitudinal federated learning system optimization method, device, equipment and readable storage medium
CN109167695B (en) Federated learning-based alliance network construction method and device and readable storage medium
US10554420B2 (en) Wireless connections to a wireless access point
CN116992458B (en) Programmable data processing method and system based on trusted execution environment
CN109274652B (en) Identity information verification system, method and device and computer storage medium
CN110633806A (en) Longitudinal federated learning system optimization method, device, equipment and readable storage medium
CN113127916A (en) Data set processing method, data processing device and storage medium
CN112613956B (en) Bidding processing method and device
CN114329529A (en) Asset data management method and system based on block chain
CN107196919B (en) Data matching method and device
CN116502732B (en) Federated learning method and system based on trusted execution environment
CN112434334A (en) Data processing method, device, equipment and storage medium
CN115242553B (en) Data exchange method and system supporting safe multi-party calculation
CN112861102A (en) Block chain-based electronic file processing method and system
Chenli et al. Fairtrade: Efficient atomic exchange-based fair exchange protocol for digital data trading
CN116244725A (en) File processing method and device based on block chain, equipment and file contribution system
CN111125734B (en) Data processing method and system
CN111131227B (en) Data processing method and device
CN114418769A (en) Block chain transaction charging method and device and readable storage medium
CN113761513A (en) Data processing method, device, equipment and computer readable storage medium
CN116132185B (en) Data calling method, system, device, equipment and medium
Zhan et al. Multi-party Non-interactive Atomic Fair Data Exchange based on Blockchain
CN114428970A (en) Service calling method, terminal device, server and electronic device
CN117595996A (en) Electronic signature processing method and device, electronic equipment and storage medium
CN118101206A (en) Data processing method, apparatus, device and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant