CN115829751A - Data product circulation method, system, computing node and storage medium - Google Patents

Data product circulation method, system, computing node and storage medium

Publication number
CN115829751A
Authority
CN
China
Prior art keywords
data
federated learning
product
federated
learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211668014.1A
Other languages
Chinese (zh)
Inventor
杨一帆
刘汪根
张燕
夏正勋
唐剑飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Transwarp Technology Shanghai Co Ltd
Original Assignee
Transwarp Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Transwarp Technology Shanghai Co Ltd filed Critical Transwarp Technology Shanghai Co Ltd
Priority to CN202211668014.1A priority Critical patent/CN115829751A/en
Publication of CN115829751A publication Critical patent/CN115829751A/en

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a data product circulation method, a data product circulation system, a computing node and a storage medium. The method comprises: after receiving release information of a federated learning data product sent by a data transaction platform, applying to the data transaction platform for a data transaction certificate, so that the data transaction platform audits the participants based on configuration information of the federated learning data product and sends the data transaction certificate and a federated learning task start instruction to the participants that pass the audit; and, if the federated learning task start instruction issued by the data transaction platform is received, participating in the federated learning task of the federated learning data product to obtain output results corresponding to different federated learning scenarios, so that the data transaction platform sends the corresponding output results to a target participant. The method allows federated learning data products to be shared while circulating them securely and conveniently, avoids data leakage, and improves the data security and privacy of each participant.

Description

Data product circulation method, system, computing node and storage medium
Technical Field
Embodiments of the invention relate to the technical field of data circulation, and in particular to a data product circulation method, a data product circulation system, a computing node and a storage medium.
Background
In the era of the digital economy, data has become an important strategic resource and a key factor of production, and a core engine for the further development of the digital economy. Activating the potential of data elements and releasing their value has become a key measure for promoting that development, and establishing a sound data element market is an important guarantee for fully realizing the value of data.
Data exchange in the traditional data element market is mainly carried out by sharing data entities. However, sharing data entities carries a risk of leaking private data and cannot meet the actual requirements of the participants in the data element market.
Disclosure of Invention
The invention provides a data product circulation method, a data product circulation system, a computing node and a storage medium, which address the problems that the traditional data entity sharing mode carries a risk of leaking private data and cannot meet the actual requirements of the participants in a data element market.
According to an aspect of the present invention, there is provided a data product circulation method applied to a participant for data product circulation in a data product circulation system, including:
after receiving release information of a federated learning data product sent by a data transaction platform, applying to the data transaction platform for a data transaction certificate, so that the data transaction platform audits the participants based on configuration information of the federated learning data product and sends the data transaction certificate and a federated learning task start instruction to the participants that pass the audit;
and if the federated learning task start instruction issued by the data transaction platform is received, participating in the federated learning task of the federated learning data product to obtain output results corresponding to different federated learning scenarios, so that the data transaction platform sends the corresponding output results to a target participant.
According to another aspect of the present invention, there is provided a data product circulation method applied to a data transaction platform in a data product circulation system, including:
after receiving a data transaction certificate application sent by a participant, auditing the participant based on configuration information of a federated learning data product;
sending a data transaction certificate and a federated learning task start instruction to the participants that pass the audit, so that those participants participate in the federated learning task to obtain output results corresponding to different federated learning scenarios;
and sending the corresponding output result to a target participant.
According to another aspect of the invention, there is provided a data product circulation system, comprising a data transaction platform and participants for data circulation, wherein the participants comprise a participant serving as a federated learning task initiator;
the participant serving as the federated learning task initiator is configured to complete the configuration information of a federated learning data product locally and to send a federated learning data product registration request to the data transaction platform, the registration request comprising the configuration information of the federated learning data product;
the data transaction platform is configured to manage the federated learning data product and then send release information of the federated learning data product to each participant;
each participant is configured to apply to the data transaction platform for a data transaction certificate after receiving the release information of the federated learning data product sent by the data transaction platform;
the data transaction platform is further configured to audit the participants based on the configuration information of the federated learning data product after receiving a data transaction certificate application sent by a participant, and to send data transaction certificates and federated learning task start instructions to the participants that pass the audit;
the participants are further configured to participate in the federated learning task to obtain output results corresponding to different scenarios; and the data transaction platform is further configured to send the corresponding output result to a target participant.
According to another aspect of the present invention, there is provided a computing node comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor, the computer program being executed by the at least one processor to enable the at least one processor to perform the data product circulation method of any of the embodiments of the invention.
According to another aspect of the present invention, there is provided a computer-readable storage medium storing computer instructions which, when executed, cause a processor to perform the data product circulation method according to any one of the embodiments of the present invention.
According to the technical solution of the embodiments of the invention, after receiving release information of a federated learning data product sent by the data transaction platform, a participant applies to the data transaction platform for a data transaction certificate, so that the data transaction platform audits the participants based on the configuration information of the federated learning data product and sends the data transaction certificate and a federated learning task start instruction to the participants that pass the audit; if the federated learning task start instruction issued by the data transaction platform is received, the participant participates in the federated learning task of the federated learning data product to obtain output results corresponding to different federated learning scenarios, so that the data transaction platform sends the corresponding output results to target participants. This solves the problems that the traditional data entity sharing mode carries a risk of leaking private data and cannot meet the actual requirements of the participants in a data element market, and achieves the beneficial effects of sharing federated learning data products while circulating them securely and compliantly, avoiding data leakage, and improving the data security and privacy of each participant.
It should be understood that the statements in this section are not intended to identify key or critical features of the embodiments of the present invention, nor are they intended to limit the scope of the invention. Other features of the present invention will become apparent from the following description.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings required for describing the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of a data product circulation method according to a first embodiment of the present invention;
Fig. 2 is a schematic flow chart of a data product circulation method according to a second embodiment of the present invention;
Fig. 3 is a schematic flow chart of a data product circulation method according to a third embodiment of the present invention;
Fig. 4 is a schematic flow chart of a data product circulation method according to a fourth embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a data product circulation system according to a fifth embodiment of the present invention;
Fig. 6a is a schematic data flow chart of a learning scenario of a federated learning data product according to an exemplary embodiment of the present invention;
Fig. 6b is a schematic data flow chart of a learning scenario of a federated learning data product according to an exemplary embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a computing node implementing the data product circulation method according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, shall fall within the protection scope of the present invention. It should be understood that the various steps recited in the method embodiments of the present invention may be performed in a different order and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the invention is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It is noted that the modifiers "a", "an", and "the" used in the present invention are illustrative rather than limiting, and those skilled in the art will understand that they should be read as "one or more" unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present invention are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
Embodiment One
Fig. 1 is a flowchart of a data product circulation method according to a first embodiment of the present invention. The method is applicable to managing and circulating data products and may be executed by a computing node acting as a participant; the computing node may be implemented in software and/or hardware and is generally integrated in a data product circulation system. In the system, any participant can act as both a data provider and a data consumer, and any participant can complete a federated learning task using its own data together with data owned by other participants.
As shown in fig. 1, a data product circulation method according to a first embodiment of the present invention includes the following steps:
S110: after receiving release information of a federated learning data product sent by a data transaction platform, apply to the data transaction platform for a data transaction certificate, so that the data transaction platform audits the participants based on configuration information of the federated learning data product and sends the data transaction certificate and a federated learning task start instruction to the participants that pass the audit.
The data transaction platform can be a platform for data circulation and management, and can be used for registration, subscription, auditing, publishing, updating and status management of data products. The data transaction platform may also be responsible for managing the status information of the federated learning data product. A federated learning data product may be understood as a data product used for federated learning tasks. A data product is a product form produced from data; note that the data product does not contain the real original data. Federated learning is a technology in which multiple parties collaboratively optimize a model, with data privacy protected by encryption techniques, on the premise that the data held by each participant does not leave its original store.
The release information of the federated learning data product may be a subscription message carrying the basic information of the federated learning data product, issued by the data transaction platform to a participant acting as a data consumer; it may also be a message, issued by the data transaction platform to a participant acting as a data provider, indicating that the federated learning data product has been released successfully.
The data transaction certificate is a certificate issued by the data transaction platform to a participant, proving that the participant is qualified to trade data. The manner of applying to the data transaction platform for the data transaction certificate is not limited here.
The configuration information of the federated learning data product is defined differently from the configuration information of a traditional data product shared as a data entity, and may comprise essential information of the federated learning data product and change information of the federated learning data product. The essential information includes one or more of the following: basic information of the federated learning data product, participant data information, federated learning task information, and status information of the federated learning data product. The change information includes one or more of the following: a federated learning output result attribution strategy, a participant management strategy, a security strategy, and an online version management strategy.
The process of auditing the participants based on the configuration information of the federated learning data product is not specifically limited herein.
The federated learning tasks may include a federated learning model learning task and a federated learning inference task, among others. The model learning task is a task of training the federated learning model, and the inference task is a task of performing inference with the trained federated learning model. The federated learning task start instruction is an instruction to start a federated learning task.
In this embodiment, data product circulation may involve a plurality of participants, including a task initiator that initiates the federated learning task of a federated learning data product. The initiator completes the configuration of the federated learning data product locally, that is, defines the federated learning data product, and sends the configuration information to the data transaction platform, so that the data transaction platform can release the federated learning data product and audit whether a participant is qualified to trade according to the configuration information. Note that the federated learning data product released by the data transaction platform contains only logic and attribute configuration and no source data; during data circulation, the local data of each participant does not leave its domain, which avoids data leakage.
Specifically, during circulation of the federated learning data product, each participant can learn the basic information of the federated learning data product from the release information sent by the data transaction platform, such as the data source format and quantity requirements of the product and the corresponding federated learning task. A participant that wants to take part in the federated learning task must apply to the data transaction platform for a data transaction certificate. The data transaction platform audits each application according to the configuration information of the federated learning data product sent by the task initiator; a participant whose application passes the audit receives the data transaction certificate and the federated learning task start instruction issued by the data transaction platform.
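The exchange described above can be sketched as follows. This is only an illustration under stated assumptions: the names `ProductConfig`, `Platform`, `required_fields`, `max_participants` and the `START_FL_TASK` string are hypothetical, and the audit rule (required fields present, participant limit not exceeded) is an assumed example, since the embodiment does not limit how the audit is performed. Only field names are passed to the platform, never the data itself.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class ProductConfig:
    """Hypothetical subset of the configuration sent by the task initiator."""
    product_id: str
    required_fields: Set[str]
    max_participants: int


@dataclass
class Platform:
    """Hypothetical platform: holds configuration and certificates, no source data."""
    config: ProductConfig
    certificates: Dict[str, str] = field(default_factory=dict)

    def apply_for_certificate(self, participant_id: str, local_fields: Set[str]) -> Optional[str]:
        """Audit an application; return a certificate on approval, else None."""
        has_fields = self.config.required_fields <= local_fields
        has_room = len(self.certificates) < self.config.max_participants
        if not (has_fields and has_room):
            return None
        token = secrets.token_hex(16)  # opaque stand-in for a real certificate
        self.certificates[participant_id] = token
        return token

    def start_instructions(self) -> Dict[str, str]:
        """Start instructions go only to participants that passed the audit."""
        return {pid: "START_FL_TASK" for pid in self.certificates}
```

A participant whose local data lacks a required field receives no certificate and therefore no start instruction.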
S120: if the federated learning task start instruction issued by the data transaction platform is received, participate in the federated learning task of the federated learning data product to obtain output results corresponding to different federated learning scenarios, so that the data transaction platform sends the corresponding output results to a target participant.
A participant that receives the federated learning task start instruction issued by the data transaction platform can participate in the federated learning task of the federated learning data product. The federated learning task start instructions may include a federated learning model learning task instruction and a federated learning inference task instruction.
In this embodiment, the specific content of the federated learning scenario and the federated learning task may be determined according to the configuration information of the federated learning data product. The configuration information includes federated learning task information, and whether the current federated learning scenario is a learning scenario or an inference scenario can be determined from the federated learning scenario recorded in that task information.
Specifically, if the federated learning scenario is a learning scenario, the participant receives a federated learning model learning task instruction, the task to be executed is the federated learning model learning task, and the corresponding output result can be the final federated learning model, which is obtained after at least one participant jointly completes the standard federated learning model learning process. If the federated learning scenario is an inference scenario, the participant receives a federated learning inference task instruction, the task to be executed is the federated learning inference task, and the corresponding output result can be an inference result, which is obtained by at least one participant taking part in the federated learning inference process.
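The scenario-dependent branching above can be illustrated with a small dispatcher. It is a sketch only: `Scenario`, `run_federated_task`, and the output labels `final_model` and `inference_result` are hypothetical names, and the `train` and `infer` callables stand in for the actual federated procedures, which the embodiment leaves to prior art.

```python
from enum import Enum
from typing import Any, Callable, Tuple


class Scenario(Enum):
    LEARNING = "learning"
    INFERENCE = "inference"


def run_federated_task(
    scenario: Scenario,
    train: Callable[[], Any],
    infer: Callable[[], Any],
) -> Tuple[str, Any]:
    """Run the task matching the configured scenario and label its output."""
    if scenario is Scenario.LEARNING:
        # Learning scenario: the output result is the final model.
        return ("final_model", train())
    if scenario is Scenario.INFERENCE:
        # Inference scenario: the output result is the inference result.
        return ("inference_result", infer())
    raise ValueError(f"unsupported scenario: {scenario!r}")
```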
It should be noted that the federated learning types can include horizontal federated learning and vertical federated learning, and that inference with a vertical federated learning model requires the joint participation of multiple participants.
The target participant is a participant entitled to receive the output result and can be determined according to the federated learning output result attribution strategy in the configuration information; for example, all participants may be target participants, or only the data demand side may be the target participant.
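As a hedged illustration of how an attribution strategy might select target participants, the sketch below supports the two cases mentioned above. The strategy identifiers `all_participants` and `data_consumers_only` and the function name are hypothetical; the source does not define how a strategy is encoded.

```python
from typing import List


def resolve_targets(strategy: str, participants: List[str], consumers: List[str]) -> List[str]:
    """Pick the participants that should receive the output result."""
    if strategy == "all_participants":
        return list(participants)
    if strategy == "data_consumers_only":
        wanted = set(consumers)
        # Keep the original participant order; drop everyone who is not a consumer.
        return [p for p in participants if p in wanted]
    raise ValueError(f"unknown attribution strategy: {strategy!r}")
```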
In the data product circulation method provided by this embodiment of the invention, a participant first applies to the data transaction platform for a data transaction certificate after receiving the release information of a federated learning data product sent by the data transaction platform, so that the data transaction platform audits the participants based on the configuration information of the federated learning data product and sends the data transaction certificate and a federated learning task start instruction to the participants that pass the audit; then, if the federated learning task start instruction issued by the data transaction platform is received, the participant participates in the federated learning task of the federated learning data product to obtain output results corresponding to different federated learning scenarios, so that the data transaction platform sends the corresponding output results to a target participant. The method defines an openly shared federated learning data product through which data elements can be circulated securely and compliantly, avoiding data leakage and improving the data security and privacy of each participant.
On the basis of the above-described embodiment, a modified embodiment of the above-described embodiment is proposed, and it is to be noted herein that, in order to make the description brief, only the differences from the above-described embodiment are described in the modified embodiment.
In one embodiment, the federated learning task comprises a federated learning model learning task, and the different federated learning scenarios comprise a federated data product learning scenario. Correspondingly, participating in the federated learning task of the federated learning data product to obtain output results corresponding to different federated learning scenarios comprises: participating in the learning process of the federated learning model to obtain a final federated learning model; and taking the final federated learning model as the output result corresponding to the federated data product learning scenario.
In this embodiment, the federated learning scenario is a federated data product learning scenario, the federated learning task is a federated learning model learning task, and at least one participant participates in the federated learning model learning task of the federated learning data product to obtain the final federated learning model. The learning process of the federated learning model is prior art and is not described in detail here.
The learning process of the federated learning model can comprise: combining the product data of a plurality of participants to learn the parameters of the algorithm while protecting privacy.
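The embodiment treats the learning process as prior art and does not fix an algorithm. Purely for illustration, one widely used aggregation rule is federated averaging, in which each participant trains locally and only model parameters are combined, weighted by local sample count. The sketch below assumes plain parameter lists and omits the encryption a real deployment would add; the function name is hypothetical and this is not presented as the patent's method.

```python
from typing import List, Sequence, Tuple


def federated_average(updates: Sequence[Tuple[List[float], int]]) -> List[float]:
    """Weighted average of local parameters; each update is (params, n_samples)."""
    if not updates:
        raise ValueError("at least one participant update is required")
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(updates[0][0])
    if any(len(params) != dim for params, _ in updates):
        raise ValueError("all participants must report the same number of parameters")
    # Raw records never appear here: only parameters and sample counts are shared.
    return [sum(params[i] * n for params, n in updates) / total for i in range(dim)]
```

For example, a participant with three times as many samples pulls the average three times as strongly toward its local parameters.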
In one embodiment, the federated learning task includes a federated learning inference task, and the different federated learning scenarios include a federated data product inference scenario. Correspondingly, participating in the federated learning task of the federated learning data product to obtain output results corresponding to different federated learning scenarios comprises: performing federated learning inference based on the final federated learning model to obtain an inference result; and taking the inference result as the output result corresponding to the federated data product inference scenario.
In this embodiment, the federated learning scenario is a federated data product inference scenario, the federated learning task is a federated learning inference task, and at least one participant participates in the federated learning inference task of the federated learning data product to obtain an inference result.
The process of federated learning inference may include: combining the data of a plurality of participants and performing inference with the final federated learning model. The federated learning inference process is prior art and is not described in detail here.
It can be understood that there may be multiple federated learning inference tasks, each following the same process; after the current inference task is completed, the next one is executed, and so on until the inference tasks end. The condition for ending the federated learning inference tasks may be set manually.
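The repeated execution of inference tasks until a configurable end condition can be sketched as a simple loop. The names `run_inference_tasks`, `infer` and `should_stop` are hypothetical, and the end condition is supplied by the caller because the source leaves it open.

```python
from typing import Any, Callable, Iterable, List


def run_inference_tasks(
    tasks: Iterable[Any],
    infer: Callable[[Any], Any],
    should_stop: Callable[[List[Any]], bool],
) -> List[Any]:
    """Run inference tasks one after another until the end condition holds."""
    results: List[Any] = []
    for task in tasks:
        results.append(infer(task))
        # The end condition is checked after each completed task.
        if should_stop(results):
            break
    return results
```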
Embodiment Two
Fig. 2 is a schematic flow chart of a data product circulation method according to a second embodiment of the present invention, which is optimized on the basis of the foregoing embodiments. For details not described in this embodiment, refer to the first embodiment.
As shown in fig. 2, a data product circulation method provided by the second embodiment of the present invention includes the following steps:
S210: as the participant acting as the federated learning task initiator, complete the configuration information of the federated learning data product locally.
The participants may include federated learning task initiators and non-initiators, among others. The federated learning task initiator needs to complete the configuration information of the federated learning data product locally. The federated learning task initiator can serve as a data provider, and a non-initiator can serve as a data consumer.
Specifically, the configuration information mainly serves to: enable the data transaction platform to determine, based on the basic information of the federated learning data product and the participant data information in the configuration information, whether to release the federated learning data product to specified participants; enable a participant to determine, based on the basic information of the federated learning data product in the configuration information, whether to take part in the federated learning task of the product, that is, whether to apply to the data transaction platform for a data transaction certificate; enable the data transaction platform to audit, based on the participant data information, the federated learning task information and the participant management strategy in the configuration information, whether a participant is qualified to take part in the federated learning task, that is, whether to issue a data transaction certificate to the participant that submitted the application; and enable the data transaction platform to determine, based on the federated learning output result attribution strategy in the configuration information, which participant the output result is sent to, that is, to determine the target participant.
Optionally, the configuration information of the federated learning data product includes essential information of the federated learning data product and change information of the federated learning data product.
Optionally, the essential information of the federated learning data product includes one or more of the following: basic information of the federated learning data product, participant data information, federated learning task information, and status information of the federated learning data product. The basic information comprises a product name, a product introduction and the scenarios to which the product is suited; the participant data information includes a data source description and a data field description; the federated learning task information comprises algorithm and parameter information, a federated learning scenario and a federated learning type.
The basic information of the federated learning data product describes the name, introduction, model division, suitable scenarios and the like of the product, helping the demand side quickly find the data product and understand the federated learning scenarios to which it is suited.
The participant data information specifies the data sources and data fields of each participant, so as to make clear the data requirements that the federated learning model places on the participants. The data source description comprises a data source type, a data storage format, a data classification and the like; the data field description includes a column name, a Chinese name, a field type, a field description, and whether the field is of a sensitive type. It should be noted that, in the horizontal federal learning scenario, the data fields of all participants must be the same, and any participant who wants to join horizontal federal learning needs to ensure that its local data meets the requirements on the data fields in the logical model of the federal learning data product; in the vertical federal learning scenario, the data fields of the participants are different, and only one participant is allowed to hold the label field, so a vertical federal learning data product also needs to determine how the data fields are allocated among the participants according to the local data of each participant.
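The field constraints described above can be illustrated with a minimal Python sketch. The function names and the set-based representation of data fields are assumptions made for illustration only. The horizontal check assumes that local data may contain extra columns as long as the required fields are covered, and the vertical check assumes that exactly one label holder is required; the allocation of non-label fields is not checked here.

```python
from typing import Dict, List, Set


def audit_horizontal(required: Set[str], parties: Dict[str, Set[str]]) -> List[str]:
    """Return the (sorted) names of parties whose local fields do not
    cover every field required by the product's logical model."""
    return sorted(name for name, fields in parties.items() if not required <= fields)


def audit_vertical(label_field: str, parties: Dict[str, Set[str]]) -> bool:
    """Return True when exactly one party holds the label field."""
    holders = [name for name, fields in parties.items() if label_field in fields]
    return len(holders) == 1
```

In this sketch a horizontal product with required fields `{"age", "income"}` would reject a party that only holds `{"age"}`, and a vertical product would be rejected if two parties both held the label column.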
The federal learning task information describes the algorithm selected for the federal learning data product, including: algorithm name, algorithm type (classification/clustering/regression, etc.), federal learning type, federal learning scenario, input feature names and number, label name, label type, output type, etc. The federal learning algorithm parameters may include general configuration parameters, hyper-parameters, and model parameters. The federal learning types include horizontal federal learning and vertical federal learning; the federal learning scenarios are divided into a learning scenario and an inference scenario.
The state information of the federal learning data product includes: submitted, registered, published, running, finished, etc.
Optionally, the federal learning data product change information includes one or more of the following: a federal learning output result attribution strategy, a participant management strategy, a security strategy and an online version management strategy; the participant management strategy comprises a participant fixed quantity requirement, a participant quantity range and participant management information; the security strategy comprises a transaction credential application and authorization policy, a participant authentication policy and an encryption protocol.
The federal learning output result attribution strategy specifies which participants the output result belongs to. The main attribution strategies include ownership by all participants, ownership by the data demander, and the like.
The participant management policy may include participant information management, a participant fixed quantity requirement, and a participant quantity range, i.e., minimum and maximum limits on the number of participants. The participant information management may include storage of the initial participant information and maintenance of participant information, such as addition or deletion; the fixed number of participants refers to the number of participants specified for the federal learning data product; the participant quantity range refers to the minimum and maximum number of participants in the federal learning task, and a participant is not allowed to join or quit if doing so would take the number of participants outside this range.
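As a rough illustration of the participant quantity range, the following Python sketch checks whether a join or a quit would keep the number of participants inside the configured interval. The class and function names are hypothetical, and the fixed quantity requirement is deliberately left out of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParticipantRange:
    """Minimum and maximum number of participants (both inclusive)."""
    min_count: int
    max_count: int

    def allows(self, count: int) -> bool:
        return self.min_count <= count <= self.max_count


def can_join(limits: ParticipantRange, current_count: int) -> bool:
    """A new participant may join only if the resulting count stays in range."""
    return limits.allows(current_count + 1)


def can_quit(limits: ParticipantRange, current_count: int) -> bool:
    """A participant may quit only if the resulting count stays in range."""
    return limits.allows(current_count - 1)
```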
The security policy may include a participant transaction credential application and authorization policy, a participant authentication policy, an encryption protocol, and the like. The participant transaction credential application and authorization policy means that a participant who wants to join a federal learning task must first apply to the data transaction platform for a transaction credential; the data transaction platform checks whether the participant's data meets the data requirements in the logical model and whether admitting the participant would affect the current federal learning task or the services of other participants, and on that basis determines whether to issue the transaction credential; only a participant holding a transaction credential has the authority to join the federal learning task. The participant authentication policy means that, before the federal learning task starts, the data transaction platform needs to authenticate all participants by checking them against the participant information defined in the participant information management policy; a participant whose information does not match fails authentication. In addition, all participants need to hold transaction credentials. The encryption protocol refers to the description of the data and model encryption protocols and the transmission link encryption protocol used by the federal learning data product, such as homomorphic encryption, the DH algorithm, secret sharing and other encryption protocols.
The online version management strategy of the data product specifically comprises: version information management, version upgrade/change policies, etc. The version information management is responsible for recording and tracing the versions of the federal learning data product. A version upgrade/change means that when the content defined by the federal learning data product changes, for example, the product state is updated, the participant information is changed, the maximum or minimum number of participants is changed, or the federal learning output result attribution strategy is changed, the version of the data product needs to be changed or upgraded online accordingly. The version upgrade/change policies include: a version change/upgrade mode without service interruption, a version change/upgrade mode in which the services of the affected parties are suspended, a version change/upgrade mode with service interruption, and the like.
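For orientation, the two-part configuration described above (basic information plus change information) might be modelled roughly as follows. This is only a sketch under assumptions: all class names, field names and string values are hypothetical, and only a subset of the items listed above is represented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BasicInfo:
    product_name: str
    introduction: str
    adaptation_scenario: str
    fl_type: str                 # assumed values: "horizontal" or "vertical"
    fl_scenario: str             # assumed values: "learning" or "inference"
    algorithm_name: str
    required_fields: List[str] = field(default_factory=list)
    status: str = "submitted"    # submitted/registered/published/running/finished


@dataclass
class ChangeInfo:
    result_attribution: str      # assumed values: "all_participants" or "data_demander"
    min_participants: int
    max_participants: int
    encryption_protocols: List[str] = field(default_factory=list)
    version: str = "1.0"


@dataclass
class ProductConfig:
    """Configuration carried in a product registration request."""
    basic: BasicInfo
    change: ChangeInfo
```

Keeping the frequently changing items (attribution, participant limits, security, version) in a separate record reflects the split between basic information and change information in the description.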
S220, initiating a federal learning data product registration request to the data transaction platform so that the data transaction platform can manage the federal learning data product and then send information released by the federal learning data product to each participant; wherein the federated learning data product registration request includes configuration information for a federated learning data product.
In this embodiment, the federal learning data product registration request may be understood as a request for registering a federal learning data product on a data transaction platform. Managing the federated learning data product may include registering, auditing, releasing, etc. the federated learning data product.
In this embodiment, after the data provider, acting as the participant that initiates the federal learning task, locally completes the configuration information of the federal learning data product, the data provider may initiate a federal learning data product registration request to the data transaction platform. After receiving the registration request, the data transaction platform performs management such as registration, audit and release on the federal learning data product, sends a message that the federal learning data product has been released successfully to the participant serving as the federal learning task initiator, and sends a release message carrying the basic information of the federal learning data product to the participants that are not the federal learning task initiator, i.e., the data consumers. The participants that are not the federal learning task initiator may include data consumers specified by the federal learning task initiator, and may also include data consumers that have sent the data transaction platform a subscription request for the basic information of the federal learning data product. Optionally, the release message may be delivered by active push or in response to a subscription request.
It can be understood that, since the federal learning data product registration request includes the configuration information of the federal learning data product, the data transaction platform can register, audit and issue the configuration information of the federal learning data product after receiving the federal learning data product registration request, so that all the participating parties can obtain the configuration information of the federal learning data product.
S230, the participating party serving as the initiator of the federal learning task automatically applies for a data transaction certificate to the data transaction platform after receiving the information issued by the federal learning data product sent by the data transaction platform, so that the data transaction platform audits the participating party based on the configuration information of the federal learning data product and sends the data transaction certificate and a start instruction of the federal learning task to the approved participating party.
Specifically, after receiving the information issued by the federal learning data product and sent by the data transaction platform, the participating party serving as the initiator of the federal learning task can directly apply for the data transaction certificate from the data transaction platform and request the data transaction platform to issue the data transaction certificate, so that the participating party holding the data transaction certificate can participate in the federal learning task of the federal learning data product.
S240, the participants serving as non-federal learning task initiators passively trigger application of data transaction certificates to the data transaction platform after receiving federal learning data product release information sent by the data transaction platform, so that the data transaction platform audits the participants based on configuration information of the federal learning data product and sends the data transaction certificates and federal learning task starting instructions to the participants who are audited.
Specifically, after receiving the federal learning data product release information sent by the data transaction platform, the participant serving as the non-federal learning task initiator needs to be manually triggered to apply for a data transaction certificate, where the triggering condition is not specifically limited and may be set according to a specific business process.
It should be noted that S230 and S240 may be executed simultaneously.
S250, if a federal learning task starting instruction issued by the data transaction platform is received, participating in a federal learning task of the federal learning data product to obtain output results corresponding to different federal learning scenes, so that the data transaction platform sends the corresponding output results to a target participant.
On one hand, in the method, the definition of the configuration information of the federal learning data product is different from the definition mode of the traditional data product shared by data entities, and the data product definition based on the federal learning is adopted, so that the data leakage of the data product in the circulation process can be avoided; on the other hand, the method integrates the learning process of the federal learning model and the reasoning process of the federal learning model, can avoid data leakage and protect the privacy of each participant in the process, and provides guidance for the actual transaction of the federal learning data product.
EXAMPLE III
Fig. 3 is a schematic flow chart of a data product circulation method according to a third embodiment of the present invention, where the method may be applied to a situation where a data product is managed and circulated, and the method may be executed by a computing node as a data transaction platform, where the computing node may be implemented by software and/or hardware and is generally integrated on a data product circulation system. In the system, any party can be used as a data provider and a data consumer, and any party can complete the federal learning task according to own data and data owned by other parties.
As shown in fig. 3, a data product circulation method provided by the third embodiment of the present invention includes the following steps:
S310, after the data transaction voucher application sent by the participant is received, auditing the participant based on the configuration information of the federal learning product.
In this embodiment, after receiving a data transaction voucher application sent by a participant, the data transaction platform may audit the participant sending the data transaction voucher application according to configuration information of a federal learning data product configured by a federal learning task initiator, so as to determine whether the participant meets requirements.
In this embodiment, auditing the participants based on the configuration information of the federal learning product may include auditing the participants in a federal data product learning scenario; auditing of the participants may also be included in a federal data product reasoning scenario. The auditing modes corresponding to different scenes are different, and the auditing process is not specifically described here.
S320, sending a data transaction certificate and a federal learning task starting instruction to the approved participants so that the corresponding participants participate in the federal learning task to obtain output results corresponding to different federal learning scenes.
In this embodiment, if the application of the data transaction certificate sent by the participating party is approved, the data transaction platform may send the data transaction certificate to the participating party, and send a federal learning task start instruction to the participating party holding the data transaction certificate, so that the participating party holding the data transaction certificate can participate in the federal learning task of the federal learning data product to obtain a corresponding output result.
The federal learning task starting instruction can comprise a federal learning model learning task instruction and a federal learning reasoning task instruction; the federated learning tasks may include federated learning model learning tasks and federated learning inference tasks; different federated learning scenarios may include a federated data product learning scenario and a federated data product reasoning scenario.
S330, sending the corresponding output result to a target participant.
In this embodiment, the corresponding output result may include the final federal learning model and the inference result.
According to the technical scheme of the embodiment of the invention, after a data transaction certificate application sent by a participant is received, the participant is audited based on configuration information of a federal learning product; sending a data transaction certificate and a federal learning task starting instruction to the checked participants so that the corresponding participants participate in the federal learning task to obtain output results corresponding to different federal learning scenes; and sending the corresponding output result to the target participant. When realizing the sharing of the federal learning data product, the product can be circulated safely and compliantly, thereby avoiding data leakage and improving the data safety and privacy of each participant.
On the basis of the above-described embodiment, a modified embodiment of the above-described embodiment is proposed, and it is to be noted herein that, in order to make the description brief, only the differences from the above-described embodiment are described in the modified embodiment.
In one embodiment, the data transaction voucher application may be audited based on the configuration information of the federal learning data product in the following ways: checking whether the participant applying for the data transaction voucher satisfies the requirement on the number of participants, or whether it is a participant designated by the initiator of the federal learning task; or checking whether the data provided by the participant conforms to the data descriptions configured by the federal learning task initiator, such as the data source description and the data field description. Optionally, whether the quality of the data provided by the participant meets the requirement may also be checked.
Optionally, the configuration information of the federal learning data product includes basic information of the federal learning data product and change information of the federal learning data product.
Optionally, the federal learning data product essential information includes one or more of the following: federal learning data product basic information, participant data information, federal learning task information and state information of the federal learning data product; the basic information of the federal learning data product comprises a product name, a product introduction and a product adaptation scene; the participant data information includes a data source description and a data field description; the federal learning task information comprises algorithm and parameter information, a federal learning scene and a federal learning type.
Optionally, the federal learning data product change information includes one or more of the following: a federal learning output result attribution strategy, a participant management strategy, a security strategy and an online version management strategy; the participant management strategy comprises a participant fixed quantity requirement, a participant quantity range and participant management information; the security strategy comprises a transaction credential application and authorization policy, a participant authentication policy and an encryption protocol.
EXAMPLE IV
Fig. 4 is a schematic flow chart of a data product circulation method according to a fourth embodiment of the present invention, and the fourth embodiment is optimized based on the foregoing embodiments. For details not described in this embodiment, please refer to the third embodiment.
As shown in fig. 4, a data product circulation method provided by the fourth embodiment of the present invention includes the following steps:
S410, receiving a federal learning data product registration request initiated by a participant serving as a federal learning task initiator, wherein the federal learning data product registration request comprises configuration information of a federal learning data product.
In this embodiment, after the participating party serving as the federal learning task initiator locally completes the configuration information of the federal learning data product, a federal learning data product registration request including the configuration information may be sent to the data transaction platform.
The manner in which the data transaction platform receives the federal learning data product registration request is not particularly limited herein.
S420, managing the federated learning data product and then sending information released by the federated learning data product to each participant.
In this embodiment, after receiving the federal learning data product registration request, the data transaction platform may perform management such as registration, audit and release on configuration information included in the federal learning data product registration request.
It should be noted that the release information that the data transaction platform sends to the participant serving as the federal learning task initiator may differ from the release information sent to the other participants. Since the participant serving as the federal learning task initiator is the party that configured the federal learning data product, the data transaction platform only needs to inform it that the federal learning data product has been released; the release information sent to the participants that are not the federal learning task initiator needs to contain the basic information of the federal learning data product, so that these participants can learn about the federal learning data product. The participants that are not the federal learning task initiator may include data consumers specified by the federal learning task initiator, and may also include data consumers that have sent the data transaction platform a subscription request for the basic information of the federal learning data product.
S430, after receiving the data transaction voucher application sent by the participant, auditing the participant based on the configuration information of the federal learning product.
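The difference between the two kinds of release information can be sketched as follows. The message layout, the `type` values and the function name are assumptions for illustration; the description does not define a message format.

```python
from typing import Dict, Iterable


def build_release_messages(basic_info: Dict[str, str],
                           initiator: str,
                           other_participants: Iterable[str]) -> Dict[str, dict]:
    """Build one release message per recipient.

    The initiator only needs a success notice; every other participant
    receives the product's basic information so it can evaluate the product.
    """
    messages = {
        initiator: {"type": "release_success",
                    "product_name": basic_info["product_name"]},
    }
    for party in other_participants:
        messages[party] = {"type": "release_info",
                           "basic_info": dict(basic_info)}
    return messages
```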
In this embodiment, when the federal learning scenario is a federal data product learning scenario, the auditing of the participants may include auditing the participant's permissions, auditing whether the participant data meets the data field requirements, and initiating federal data quality assessment.
Specifically, when the different federal learning scenarios include a federal data product learning scenario, auditing the participants based on the configuration information of the federal learning product includes: checking, based on the participant management strategy in the configuration information of the federal learning product, whether the participant that sent the data transaction certificate application has the right to participate; if it has the right to participate, checking whether the local data of the participant conforms to the participant data information in the configuration information of the federal learning product; if this audit is passed, sending a federal data quality evaluation request to the participant, so that the participant receiving the request completes the federal data quality evaluation locally and then sends the generated federal data quality evaluation report to the data transaction platform; and receiving the federal data quality evaluation report sent by the participant, and auditing the report.
Whether the participant is a designated participant is checked according to the participant information management in the participant management strategy; whether the number of participants would exceed the limit is checked according to the participant quantity range and the participant fixed quantity in the participant management strategy; and whether the participant data meets the data field requirements is checked according to the data source description and the data field description in the participant data information. It should be noted that, in the horizontal federal learning scenario, the data fields of all participants must be the same, and any participant who wants to join horizontal federal learning needs to ensure that its local data meets the requirements on the data fields in the logical model of the federal learning data product; in the vertical federal learning scenario, the data fields of the participants are different, and only one party is allowed to hold the label field, so a vertical federal learning data product also needs to determine how the data fields are allocated among the participants according to the local data of each participant.
In a federal data product learning scene, after the data transaction platform finishes auditing the participants, a federal data quality evaluation request needs to be initiated to the participants who pass the auditing; and the data transaction platform can audit the received federal data quality evaluation report.
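The audit order for the learning scenario (participation right, then data fields, then federal data quality evaluation) can be sketched as a short pipeline. This is an illustrative sketch only: the function signature, the numeric quality score and the `min_score` threshold are assumptions, since the description does not specify the content of the quality evaluation report.

```python
from typing import Callable, Set


def audit_for_learning(party: str,
                       designated: Set[str],
                       current_count: int,
                       max_count: int,
                       required_fields: Set[str],
                       local_fields: Set[str],
                       request_quality_score: Callable[[str], float],
                       min_score: float = 0.8) -> str:
    """Run the three audit stages in order and stop at the first failure."""
    # Stage 1: participation right (designated party, count within the limit).
    if party not in designated or current_count + 1 > max_count:
        return "rejected: no participation right"
    # Stage 2: local data must cover the required data fields.
    if not required_fields <= local_fields:
        return "rejected: data fields"
    # Stage 3: the party evaluates data quality locally and reports back;
    # the callback stands in for the request/report round trip.
    if request_quality_score(party) < min_score:
        return "rejected: data quality"
    return "approved"
```

In the inference scenario described below, the same sketch would apply with the third stage omitted.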
In this embodiment, when the federal learning scenario is a federal data product reasoning scenario, the auditing of the participants may include auditing participant permissions and auditing whether participant data meets data field requirements.
Specifically, when the different federal learning scenarios include a federal learning data product reasoning scenario, auditing the participants based on the configuration information of the federal learning product includes: checking, based on the participant management strategy in the configuration information of the federal learning product, whether the participant that sent the data transaction certificate application has the right to participate; and if it has the right to participate, checking whether the local data of the participant conforms to the participant data information in the configuration information of the federal learning product.
Whether the participant is a designated participant is checked according to the participant information management in the participant management strategy; whether the number of participants would exceed the limit is checked according to the participant quantity range and the participant fixed quantity in the participant management strategy; and whether the participant data meets the data field requirements is checked according to the data source description and the data field description in the participant data information.
S440, sending a data transaction certificate and a federal learning task starting instruction to the checked participants so that the corresponding participants participate in the federal learning task to obtain output results corresponding to different federal learning scenes.
In this embodiment, when the data transaction platform issues the data transaction voucher to the approved party, it is further required to locally update the information of the party, that is, update the information of the party of the federal learning data product.
It should be noted that, when the different scenarios include a federal data product learning scenario, before the participant participates in the federal learning task of the federal learning data product to obtain a corresponding output result, the method further includes: determining that the current scene is a federal data product learning scene according to the product adaptation scene and the federal learning scene; determining that the federal learning type of the learning scene of the federal data product is horizontal federal learning or vertical federal learning according to the federal learning type; and determining a federal learning model according to the determined federal learning type and the algorithm and parameter information.
S450, determining a target participant according to the federal learning output result attribution strategy in the configuration information.
The federal learning output result attribution policy may be a policy for determining which participants the output result belongs to. The federal learning output result attribution strategy can be owned by all participants or by data demand parties.
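Determining the target participant from the attribution strategy can be illustrated with a small sketch. The policy strings and the function name are hypothetical; only the two strategies named above are covered.

```python
from typing import List, Sequence


def resolve_target_participants(policy: str,
                                participants: Sequence[str],
                                data_demander: str) -> List[str]:
    """Map an output result attribution strategy to the receiving parties."""
    if policy == "all_participants":
        return list(participants)
    if policy == "data_demander":
        return [data_demander]
    raise ValueError(f"unknown attribution policy: {policy}")
```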
According to the technical scheme of the embodiment of the invention, by applying for and managing data transaction vouchers, a participant can take part in the federal learning task only if it holds a data transaction voucher; by adding federal data quality evaluation and audit steps to the data transaction voucher application, the data quality of the participants in the federal learning task can be guaranteed to a certain extent, which can improve the accuracy of the federal learning model.
Optionally, auditing the participants based on the configuration information of the federal learning product further includes: and performing authorization, authentication and encryption based on the security policy.
The security policy comprises a participant transaction credential application and authorization policy, a participant authentication policy, an encryption protocol and the like. The participant transaction credential application and authorization policy means that a participant who wants to join a federal learning task must first apply to the data transaction platform for a transaction credential; the data transaction platform checks whether the participant's data meets the data requirements in the logical model and whether admitting the participant would affect the current federal learning task or the services of other participants, and on that basis determines whether to issue the transaction credential; only a participant holding a transaction credential has the authority to join the federal learning task. The participant authentication policy means that, before the federal learning task starts, the data transaction platform needs to authenticate all participants by checking them against the participant information defined in the participant information management policy; a participant whose information does not match fails authentication. The encryption protocol refers to the description of the data/model encryption protocols and the transmission link encryption protocol used by the federal learning data product, for example: homomorphic encryption, the DH algorithm, secret sharing, etc.
Further, during the circulation of the data product, the state of the federal learning data product is updated according to the state information of the federal learning data product in the configuration information; the state of the federal learning data product includes one or more of the following: submitted, registered, published, running and finished.
In one embodiment, the data transaction platform may also update the status of the federal learning data product.
Specifically, when the participant sends the configuration information of the federal learning data product, the data transaction platform can set the state of the federal learning data product to submitted; when the data transaction platform registers the federal learning data product, the state can be updated to registered; after the data transaction platform sends the release information of the federal learning data product to each participant, the state can be updated to published; when the participants execute the federal learning task, the data transaction platform can update the state to running; after the execution of the federal learning task is completed, the data transaction platform can update the state to finished.
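The state updates listed above can be sketched as a simple state machine. The event names are hypothetical, and the sketch assumes a strictly linear lifecycle in which each event moves the product to the next state; the description does not say whether other transitions (for example, running again after finishing) are permitted.

```python
from enum import Enum
from typing import Optional


class ProductState(Enum):
    SUBMITTED = 1
    REGISTERED = 2
    PUBLISHED = 3
    RUNNING = 4
    FINISHED = 5


# Hypothetical platform events and the state each one leads to.
EVENT_TARGET = {
    "config_received": ProductState.SUBMITTED,
    "product_registered": ProductState.REGISTERED,
    "release_sent": ProductState.PUBLISHED,
    "task_started": ProductState.RUNNING,
    "task_completed": ProductState.FINISHED,
}


def advance(current: Optional[ProductState], event: str) -> ProductState:
    """Return the next state, rejecting events that arrive out of order."""
    target = EVENT_TARGET[event]
    expected_value = 1 if current is None else current.value + 1
    if target.value != expected_value:
        raise ValueError(f"event {event!r} is not valid in state {current}")
    return target
```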
Further, the method further comprises: managing the versions of the federal learning data product online based on an online version management policy.
Specifically, managing the versions of the federal learning data product online based on the online version management policy includes: according to the online version management policy, where a change has no impact on the existing services of any participant, changing or upgrading the version without service interruption; where a change affects the existing services of some participants, changing or upgrading the version while pausing only the services of the affected parties; and where a change affects the existing services of all participants, changing or upgrading the version with service interruption.
In this embodiment, by evaluating the impact of a version change on the participants, different version change/upgrade policies may be selected, for example: version change/upgrade without service interruption, version change/upgrade with the affected parties' services paused, and version change/upgrade with service interruption.
Specifically, where a change has no impact on the existing services of any participant, such as a data product state update or a security policy change, a real-time version change without service interruption can be adopted; that is, the version can be changed at any time while the existing services keep running. Where a change affects the existing services of some participants, for example when a new participant joins or participant information changes while a horizontal federal learning task is running, most participants are unaffected but the federal learning coordinator is affected by the participant information; in this case, the affected party can suspend its service while the services of the other participants continue uninterrupted, and the affected party resumes service after the version change is complete. Where a change affects the existing services of all participants, such as a change to operator hyperparameters or a change to the definition of horizontal federal data fields, a version upgrade policy with service interruption can be adopted: all participants stop the running services within an agreed time and restart them after the version upgrade.
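The choice among the three modes can be illustrated with a short sketch. The function name `select_change_mode` and the idea of passing in a precomputed set of affected parties are assumptions made for illustration; how the impact of a change is actually evaluated is not specified in this embodiment.

```python
from enum import Enum


class ChangeMode(Enum):
    NO_INTERRUPTION = "change without service interruption"
    PAUSE_AFFECTED = "pause only the affected parties"
    FULL_INTERRUPTION = "stop all participants, upgrade, restart"


def select_change_mode(all_participants, affected_parties):
    """Pick a version change mode from the set of parties whose services are affected.

    Returns (mode, parties_to_pause). The affected set may contain a party such as
    the federal learning coordinator that is not an ordinary participant.
    """
    everyone = set(all_participants)
    affected = set(affected_parties)
    if not affected:
        # e.g. a product state update or a security policy change
        return ChangeMode.NO_INTERRUPTION, set()
    if affected >= everyone:
        # e.g. an operator hyperparameter change that touches every participant
        return ChangeMode.FULL_INTERRUPTION, affected
    # e.g. a new participant joins and only the coordinator must pause
    return ChangeMode.PAUSE_AFFECTED, affected
```

For instance, `select_change_mode(["A", "B", "C"], ["coordinator"])` selects the mode in which only the coordinator pauses while A, B and C keep running.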
EXAMPLE five
Fig. 5 is a schematic structural diagram of a data product circulation system according to a fifth embodiment of the present invention. As shown in fig. 5, the system includes a data transaction platform 120 and participants 110 that carry out data circulation, where the participants 110 include a participant 111 serving as the federal learning task initiator; wherein:
the participating party 111 serving as a federal learning task initiating party is used for locally completing configuration information of a federal learning data product and initiating a federal learning data product registration request to the data transaction platform 120, wherein the federal learning data product registration request comprises the configuration information of the federal learning data product;
the data transaction platform 120 is used for sending federal learning data product release information to each participant after managing the federal learning data product;
the participating party 110 is configured to apply for a data transaction certificate to the data transaction platform 120 after receiving the federal learning data product release information sent by the data transaction platform 120;
the data transaction platform 120 is further configured to, after receiving a data transaction credential application request sent by the participant 110, audit the participant 110 based on configuration information of the federal learning product, and send a data transaction credential and a federal learning task start instruction to the participant who has passed the audit;
the participating party 110 is further configured to participate in the federal learning task to obtain output results corresponding to different scenarios;
the data transaction platform 120 is further configured to send the corresponding output result to the target participant.
Specifically, for a federated learning scenario, the data circulation system may perform the following: the participant 111 serving as the federal learning task initiator is used for locally completing configuration information of a federal learning data product and initiating a federal learning data product registration request to the data transaction platform, wherein the federal learning data product registration request comprises the configuration information of the federal learning data product; the data transaction platform 120 is configured to send federal learning data product release information to each participant 110 after performing management such as registration, audit and release on the federal learning data product; the participant 110 is configured to apply for a data transaction certificate from the data transaction platform 120 after receiving the federal learning data product release information; the data transaction platform 120 is further configured to, after receiving a data transaction certificate application request sent by a participant, audit the participant based on the configuration information of the federal learning product, where the audit includes a participant permission audit and a federal data quality assessment, and send a data transaction certificate and a federal learning task start instruction to each participant who has passed the audit; the participant 110 is also used for participating in the federal learning model learning task of the federal learning product to obtain a final federal learning model; the data transaction platform 120 is also used to send the final federal learning model to the target participant.
Specifically, for the federated inference scenario, the data circulation system may perform the following: the participant 111 serving as the federal learning task initiator is used for locally completing configuration information of a federal learning data product and initiating a federal learning data product registration request to the data transaction platform, wherein the federal learning data product registration request comprises the configuration information of the federal learning data product; the data transaction platform 120 is configured to send federal learning data product release information to each participant 110 after performing management such as registration, audit and release on the federal learning data product; the participant 110 is configured to apply for a data transaction certificate from the data transaction platform 120 after receiving the federal learning data product release information; the data transaction platform 120 is further configured to, after receiving a data transaction certificate application request sent by a participant, audit the participant based on the configuration information of the federal learning product, where the audit includes a participant permission audit, and send a data transaction certificate and a federal learning task start instruction to each participant who has passed the audit; the participant 110 is also used for participating in the federal learning inference task of the federal learning product, performing inference on the federal learning data product based on the final federal learning model to obtain an inference result; the data transaction platform 120 is also used to send the inference result to the target participant.
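The two scenarios share the same registration and release steps and differ mainly in the audit performed and in the output delivered. A minimal sketch of that difference follows; the names `ScenarioProfile`, `SCENARIOS` and `audit_passes`, as well as the string keys for the checks, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioProfile:
    audit_checks: tuple  # checks the platform runs before issuing a certificate
    task: str            # task the audited participants take part in
    output: str          # result the platform sends to the target participant


SCENARIOS = {
    "learning": ScenarioProfile(
        audit_checks=("participant_permission", "federal_data_quality"),
        task="federal learning model learning task",
        output="final federal learning model",
    ),
    "inference": ScenarioProfile(
        audit_checks=("participant_permission",),
        task="federal learning inference task",
        output="inference result",
    ),
}


def audit_passes(scenario, check_results):
    """True only if every check required for the scenario is present and passed."""
    required = SCENARIOS[scenario].audit_checks
    return all(check_results.get(name, False) for name in required)
```

With this table, a participant that has passed only the permission audit qualifies for the inference scenario but not for the learning scenario, which additionally requires the federal data quality assessment.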
For details not described in this embodiment, reference may be made to the method embodiments; the description is not repeated here.
According to the circulation system of the data product provided by the embodiment of the invention, the system can safely and conveniently circulate the product while realizing the sharing of the federal learning data product, so that the data leakage is avoided, and the data safety and privacy of each participant are improved.
The embodiments of the present invention provide several specific implementation manners based on the technical solutions of the above embodiments.
Fig. 6a is a schematic data flow chart of a learning scenario of a federated learning data product according to an exemplary embodiment of the present invention, and as shown in fig. 6a, the data flow chart includes the following steps:
Step 1: Release the federal learning data product.
Step 101: the data provider, namely a participant serving as a federal learning task initiator, locally completes the definition of the federal learning data product, namely the configuration information of the federal learning data product, and initiates a register request of the federal learning data product to the data transaction platform.
Step 102: The data transaction platform is responsible for managing the federal learning data product. It receives the federal learning data product registration request submitted by the data provider, stores the federal learning data product, and sets its state to: registered. After registration is complete and the product has been reviewed and approved by the data transaction platform, the platform publishes a basic information subscription message for the federal learning data product to the data consumers designated by the data product, namely the participants that are not the federal learning task initiator, and updates the product state to: released. The data transaction platform then sends a release success message for the federal learning data product to the data provider, completing the release of the data product. The subscription message may be actively pushed or delivered in response to a subscription request.
Step 2: data transaction voucher management and data quality evaluation.
Step 201: The participant applies to the data transaction platform for a data transaction certificate. After receiving the release success message of the federal learning data product, the data provider immediately starts the transaction certificate application process and automatically sends a transaction certificate application to the data transaction platform. After receiving the subscription message of the federal learning data product, the data consumer needs to start the transaction certificate application process itself; the trigger condition can be set according to the specific business process, after which the data consumer sends the data transaction certificate application to the data transaction platform.
Step 202: The data transaction platform checks the authority of the participant, including whether the applying participant is a designated participant of the federal learning data product and whether the current number of participants exceeds the limit on the number of participants; if the participant fails the authority check, the participant's transaction certificate application is rejected.
Step 203: The data transaction platform reviews whether the participant's local data meets the data field requirements of the federal learning data product. In a horizontal federal learning scenario, the local data fields of all participants must be completely consistent with the defined data fields; in a vertical federal learning scenario, the local data fields of a participant must be consistent with the data fields assigned to that participant in the federal learning data product. If the participant does not meet the data requirements defined by the data product, the participant's transaction certificate application is rejected.
Step 204: The data transaction platform initiates a federal data quality evaluation with all participants. After the participants jointly complete the evaluation, an evaluation report is sent to the data transaction platform, which reviews the data quality report of each participant; if a participant's report does not meet the requirements, that participant's transaction certificate application is rejected.
Step 205: The data transaction platform sends data transaction certificates to all participants and updates the participant information of the federal learning data product.
Step 3: Data circulation and model attribution based on federal learning.
The data transaction platform sends a message to start the federal learning model learning task to all participants and updates the state of the federal learning data product to: running. All participants complete the model learning task of the federal learning data product, and the final federal learning output is delivered to the participants according to the federal learning result attribution policy.
Step 4: The data transaction platform sends a data circulation end message to the data provider and updates the state of the federal learning data product to: finished.
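The certificate audit in Steps 202 to 204 of this flow can be sketched as follows. The dictionary layout of `product`, the function names, and the use of exact set equality to represent "consistent" data fields are all assumptions made for illustration; the quality evaluation of Step 204 is reduced to a single boolean placeholder because its content is not detailed here.

```python
def check_permission(applicant, designated, current, max_participants):
    """Step 202: the applicant must be designated and the participant limit respected."""
    if applicant not in designated:
        return False
    return applicant in current or len(current) < max_participants


def check_fields(applicant, local_fields, fl_type, defined_fields):
    """Step 203: compare local data fields with the product definition."""
    if fl_type == "horizontal":
        # Every participant must hold exactly the defined field set.
        return set(local_fields) == set(defined_fields)
    if fl_type == "vertical":
        # defined_fields maps each participant to the fields assigned to it.
        return set(local_fields) == set(defined_fields.get(applicant, ()))
    raise ValueError(f"unknown federal learning type: {fl_type!r}")


def audit_application(applicant, local_fields, product, quality_ok=True):
    """Return (granted, reason); quality_ok stands in for the Step 204 report review."""
    if not check_permission(applicant, product["designated"],
                            product["current"], product["max_participants"]):
        return False, "permission check failed"
    if not check_fields(applicant, local_fields, product["fl_type"], product["fields"]):
        return False, "data fields do not match product definition"
    if not quality_ok:
        return False, "data quality report does not meet requirements"
    return True, "credential granted"
```

The checks run in the order of the steps, so an applicant that is not designated is rejected before any field comparison or quality review takes place.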
Fig. 6b is a schematic data flow chart of an inference scenario of a federated learning data product according to an exemplary embodiment of the present invention, and as shown in fig. 6b, the method includes the following steps:
Step 1: Release the federal learning data product.
Step 101: the data provider, namely a participant serving as a federal learning task initiator, locally completes the definition of the federal learning data product, namely the configuration information of the federal learning data product, and initiates a register request of the federal learning data product to the data transaction platform.
Step 102: The data transaction platform is responsible for managing the federal learning data product. It receives the federal learning data product registration request submitted by the data provider, stores the federal learning data product, and sets its state to: registered. After registration is complete and the product has been reviewed and approved by the data transaction platform, the platform publishes a basic information subscription message for the federal learning data product to the data consumers designated by the data product, namely the participants that are not the federal learning task initiator, and updates the product state to: released. The data transaction platform then sends a release success message for the federal learning data product to the data provider, completing the release of the data product. The subscription message may be actively pushed or delivered in response to a subscription request.
Step 2: Data transaction certificate management.
Step 201: The participant applies to the data transaction platform for a data transaction certificate. After receiving the release success message of the federal learning data product, the data provider immediately starts the transaction certificate application process and automatically sends a transaction certificate application to the data transaction platform. After receiving the subscription message of the federal learning data product, the data consumer needs to start the transaction certificate application process itself; the trigger condition can be set according to the specific business process, after which the data consumer sends the data transaction certificate application to the data transaction platform.
Step 202: The data transaction platform checks the authority of the participant, including whether the applying participant is a designated participant of the federal learning data product and whether the current number of participants exceeds the limit on the number of participants; if the participant fails the authority check, the participant's transaction certificate application is rejected.
Step 203: The data transaction platform reviews whether the participant's local data meets the data field requirements of the federal learning data product. In a horizontal federal learning scenario, the local data fields of all participants must be completely consistent with the defined data fields; in a vertical federal learning scenario, the local data fields of a participant must be consistent with the data fields assigned to that participant in the federal learning data product. If the participant does not meet the data requirements defined by the data product, the participant's transaction certificate application is rejected.
Step 3: The data transaction platform sends a message to start the federal learning inference task to all participants, and the participants complete the inference task of the federal learning data product. The final federal learning inference result is then output to the participants according to the federal learning result attribution policy.
Step 4: The data demander sends a federal learning inference task end message to the data transaction platform.
Step 5: Steps 3 and 4 are repeated until the inference service ends, where the end condition of the inference service can be set manually, for example as a service period. After the inference service ends, the data transaction platform updates the state of the federal learning data product to: finished.
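The repetition of Steps 3 and 4 until the inference service ends can be sketched as a simple loop. The function and parameter names are hypothetical, and modelling the end condition as a callable is just one way to represent a manually configured condition such as a service period.

```python
def run_inference_service(requests, run_inference, service_active):
    """Repeat Steps 3 and 4 until the end condition holds, then mark the product finished.

    requests       -- iterable of inference inputs (hypothetical)
    run_inference  -- callable performing one federal inference round (Step 3)
    service_active -- callable returning False once the service should end (Step 5)
    """
    results = []
    for request in requests:
        if not service_active():
            break
        # Step 3: one inference task; its result goes to the target participant.
        results.append(run_inference(request))
        # Step 4: the data demander reports that this inference task has ended.
    # Step 5: once the service is over, the platform marks the product as finished.
    return "finished", results
```

A service period could be expressed by passing a callable that compares the current time with a deadline, so that requests arriving after the deadline are no longer served.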
EXAMPLE six
FIG. 7 illustrates a schematic diagram of a computing node 10 that may be used to implement embodiments of the present invention. A compute node is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The computing nodes may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 7, the computing node 10 includes at least one processor 11, and a memory communicatively connected to the at least one processor 11, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, and the like, wherein the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various suitable actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from a storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data necessary for the operation of the computing node 10 can also be stored. The processor 11, the ROM 12, and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
A number of components in the computing node 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, or the like; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the computing node 10 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, or the like. The processor 11 executes the above-described respective methods and processes, such as a data product circulation method applied to a party who performs data product circulation in a data product circulation system, and a data product circulation method applied to a data trading platform in the data product circulation system.
In some embodiments, the data product circulation method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the computing node 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into RAM 13 and executed by processor 11, one or more steps of the data product circulation method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the data product circulation method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for implementing the methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be performed. A computer program can execute entirely on a machine, partly on a machine, as a stand-alone software package partly on a machine and partly on a remote machine or entirely on a remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here may be implemented on a computing node having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user may provide input to the computing node. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical host and VPS service are overcome.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present invention may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired result of the technical solution of the present invention can be achieved.
The above-described embodiments should not be construed as limiting the scope of the invention. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (22)

1. A data product circulation method, applied to a participant performing data product circulation in a data product circulation system, the method comprising:
after receiving the release information of the federal learning data product sent by a data transaction platform, applying for a data transaction certificate from the data transaction platform, so that the data transaction platform audits the participant based on the configuration information of the federal learning data product and sends the data transaction certificate and a federal learning task start instruction to the participants who have passed the audit;
and if a federal learning task starting instruction issued by the data transaction platform is received, participating in a federal learning task of the federal learning data product to obtain output results corresponding to different federal learning scenes, so that the data transaction platform sends the corresponding output results to a target participant.
2. The method of claim 1, further comprising, prior to receiving the federal learning data product release information sent by the data trading platform:
the configuration information of the federal learning data product is completed locally as a participant of the federal learning task initiator;
initiating a federal learning data product registration request to the data transaction platform so that the data transaction platform can manage the federal learning data product and then send information of issuing the federal learning data product to each participant; wherein the federated learning data product registration request includes configuration information for a federated learning data product.
3. The method according to claim 2, wherein the applying for data transaction credentials to the data transaction platform after receiving the federal learning data product release information sent by the data transaction platform comprises:
the participant serving as the federal learning task initiator automatically applies to the data transaction platform for a data transaction certificate after receiving the federal learning data product release information sent by the data transaction platform;
and the participant that is not the federal learning task initiator applies to the data transaction platform for a data transaction certificate upon being passively triggered, after receiving the federal learning data product release information sent by the data transaction platform.
4. The method of claim 1, wherein the federated learning task comprises a federated learning model learning task and the different federated learning scenarios comprise a federated data product learning scenario; correspondingly, the step of participating in the federal learning task of the federal learning data product to obtain output results corresponding to different federal learning scenes comprises the following steps:
participating in the learning process of the federal learning model to obtain a final federal learning model;
and taking the final federal learning model as an output result corresponding to the learning scene of the federal data product.
5. The method according to claim 1, wherein the federated learning task comprises a federated learning reasoning task, the different federated learning scenarios comprise federated data product reasoning scenarios, and accordingly, participating in the federated learning task of the federated learning data product to obtain output results corresponding to the different federated learning scenarios comprises:
carrying out federated learning reasoning based on the final federated learning model to obtain a reasoning result;
and taking the inference result as an output result corresponding to the inference scene of the federal data product.
6. A data product circulation method is applied to a data transaction platform in a data product circulation system, and the method comprises the following steps:
after a data transaction voucher application sent by a participant is received, auditing the participant based on configuration information of a federal learning product;
sending data transaction certificates and federal learning task start instructions to the participants who have passed the audit, so that the corresponding participants participate in the federal learning task to obtain output results corresponding to different federal learning scenes;
and sending the corresponding output result to the target participant.
7. The method of claim 6, further comprising, prior to receiving the request for application of data transaction credentials sent by the party:
receiving a federal learning data product registration request initiated by a participant as a federal learning task initiator, wherein the federal learning data product registration request comprises configuration information of a federal learning data product;
and managing the federal learning data product and then sending information released by the federal learning data product to each participant.
8. The method of claim 7, wherein sending federal learning data product release information to the participants after managing the federal learning data product comprises:
registering the federated learning data product;
auditing the registered federal learning data products;
and after the audit is passed, sending a release success message of the federal learning data product to the participant serving as the federal learning task initiator, and publishing a basic information subscription message of the federal learning data product to the participants that are not the federal learning task initiator.
9. The method of claim 6, wherein the configuration information for the federated learning data product includes federated learning data product base information and federated learning data product alteration information.
10. The method of claim 9, wherein the federated learning data product base information includes one or more of the following:
federated learning data product basic information, participant data information, federated learning task information, and state information of the federated learning data product;
wherein the federated learning data product basic information comprises a product name, a product introduction, and a product adaptation scenario; the participant data information comprises a data source description and a data field description; and the federated learning task information comprises algorithm and parameter information, a federated learning scenario, and a federated learning type.
11. The method of claim 9, wherein the federated learning data product change information includes one or more of the following:
a federated learning output result attribution policy, a participant management policy, a security policy, and an online version management policy;
wherein the participant management policy comprises a fixed participant quantity requirement, a participant quantity range, and participant information management; and the security policy comprises a transaction certificate application and authorization policy, a participant authentication policy, and an encryption protocol.
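The configuration information described in claims 9 to 11 is essentially a nested record. One possible way to model it, assuming Python dataclasses, is shown below. The class and field names are hypothetical renderings of the claim terms, and every sample value is invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class BasicInfo:
    product_name: str
    introduction: str
    adaptation_scenario: str


@dataclass
class ParticipantDataInfo:
    data_source_description: str
    data_field_description: str


@dataclass
class TaskInfo:
    algorithm_and_parameters: dict
    learning_scenario: str   # assumed encoding, e.g. "learning" or "inference"
    learning_type: str       # assumed encoding, e.g. "horizontal" or "vertical"


@dataclass
class BaseInfo:
    basic: BasicInfo
    participant_data: ParticipantDataInfo
    task: TaskInfo
    state: str = "submitted"


@dataclass
class ParticipantManagementPolicy:
    fixed_count: Optional[int] = None
    count_range: Optional[Tuple[int, int]] = None
    participant_info: dict = field(default_factory=dict)


@dataclass
class SecurityPolicy:
    certificate_application_and_authorization: str
    participant_authentication: str
    encryption_protocol: str


@dataclass
class ChangeInfo:
    output_attribution_policy: str
    participant_management: ParticipantManagementPolicy
    security: SecurityPolicy
    online_version_management: str


@dataclass
class ProductConfig:
    base: BaseInfo
    change: ChangeInfo


# Sample values are invented; none of them appear in the patent.
config = ProductConfig(
    base=BaseInfo(
        basic=BasicInfo("credit-score", "Joint credit scoring model", "learning"),
        participant_data=ParticipantDataInfo("loan records", "user_id, income, default_flag"),
        task=TaskInfo({"name": "logistic_regression", "epochs": 10}, "learning", "vertical"),
    ),
    change=ChangeInfo(
        output_attribution_policy="initiator_only",
        participant_management=ParticipantManagementPolicy(count_range=(2, 5)),
        security=SecurityPolicy("manual_review", "token", "TLS"),
        online_version_management="pause_affected",
    ),
)
```

Splitting the record into `BaseInfo` and `ChangeInfo` follows the two-part division in claim 9: the base part describes what the product is, while the change part holds the policies that govern how it is run.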
12. The method of claim 6, wherein, when the different federated learning scenarios include a federated data product learning scenario, auditing the participant based on the configuration information of the federated learning product comprises:
checking, based on a participant management policy in the configuration information of the federated learning product, whether the participant that sent the data transaction certificate application has the right to participate;
if the participant has the right to participate, checking whether the participant's local data conforms to the participant data information in the configuration information of the federated learning product;
if that check passes, sending a federated data quality evaluation request to the participant, so that the participant completes the federated data quality evaluation locally and then sends the generated federated data quality evaluation report to the data transaction platform;
and receiving the federated data quality evaluation report sent by the participant and auditing the report.
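The four checks of claim 12 form a short-circuiting pipeline, sketched below. The function `audit_for_learning`, its parameters, the result strings, and the single-score quality threshold are all assumptions; the patent does not say how the report is scored. Note that only the report crosses to the platform, never the participant's raw data.

```python
from typing import Callable


def audit_for_learning(
    participant: dict,
    allowed_ids: set,
    required_fields: set,
    request_report: Callable[[dict], dict],
    min_quality: float = 0.8,
) -> str:
    """Return "approved" or the name of the first failed check."""
    # Step 1: participation right under the participant management policy.
    if participant["id"] not in allowed_ids:
        return "no_participation_right"
    # Step 2: local data must cover the declared participant data information.
    if not required_fields <= set(participant["fields"]):
        return "data_mismatch"
    # Step 3: the participant evaluates data quality locally and returns a report.
    report = request_report(participant)
    # Step 4: the platform audits the report (assumption: a single score threshold).
    if report["quality_score"] < min_quality:
        return "quality_report_rejected"
    return "approved"


good = {"id": "bank_a", "fields": ["user_id", "income", "default_flag"]}
outsider = {"id": "bank_z", "fields": ["user_id", "income"]}
incomplete = {"id": "bank_b", "fields": ["user_id"]}
allowed = {"bank_a", "bank_b"}
required = {"user_id", "income"}

result_good = audit_for_learning(good, allowed, required, lambda p: {"quality_score": 0.93})
result_low = audit_for_learning(good, allowed, required, lambda p: {"quality_score": 0.41})
result_outsider = audit_for_learning(outsider, allowed, required, lambda p: {"quality_score": 1.0})
result_incomplete = audit_for_learning(incomplete, allowed, required, lambda p: {"quality_score": 1.0})
```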
13. The method of claim 6, wherein, when the different federated learning scenarios include a federated learning data product inference scenario, auditing the participant based on the configuration information of the federated learning product comprises:
checking, based on a participant management policy in the configuration information of the federated learning product, whether the participant that sent the data transaction certificate application has the right to participate;
and, if the participant has the right to participate, checking whether the participant's local data conforms to the participant data information in the configuration information of the federated learning product.
14. The method of claim 12 or 13, wherein auditing the participant based on the configuration information of the federated learning product further comprises: performing authorization, authentication, and encryption based on the security policy.
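Claim 14 does not say which mechanisms implement authorization and authentication. Purely as an assumed example, a platform could issue the data transaction certificate as an HMAC-signed token and verify it later, using Python's standard `hmac` and `hashlib` modules. The token layout and the function names are hypothetical, and encryption of the traffic itself is not covered here.

```python
import hashlib
import hmac


def issue_certificate(secret: bytes, participant_id: str, product_id: str) -> str:
    """Return "participant:product:tag", where tag is an HMAC-SHA256 over the payload."""
    payload = f"{participant_id}:{product_id}"
    tag = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}:{tag}"


def verify_certificate(secret: bytes, certificate: str) -> bool:
    """Recompute the tag from the payload and compare it in constant time."""
    payload, _, tag = certificate.rpartition(":")
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(tag, expected)


secret = b"platform-signing-key"          # illustrative key, never hard-code in practice
cert = issue_certificate(secret, "bank_a", "credit-score")
forged = cert.replace("bank_a", "bank_z", 1)
```

A certificate altered after issuance fails verification because the recomputed tag no longer matches, so a participant cannot reuse another party's certificate by editing the identifier.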
15. The method according to any one of claims 6-8, wherein, when the different scenarios include a federated learning scenario, before the participant participates in the federated learning task of the federated learning data product to obtain the corresponding output result, the method further comprises:
determining, according to the product adaptation scenario and the federated learning scenario, that the current scenario is a federated data product learning scenario;
determining, according to the federated learning type, whether the federated data product learning scenario uses horizontal federated learning or vertical federated learning;
and determining a federated learning model according to the determined federated learning type and the algorithm and parameter information.
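The three determinations in claim 15 can be read as a small planning function over the configuration. The sketch below assumes a flat dictionary with hypothetical keys (`adaptation_scenarios`, `learning_scenario`, `learning_type`, `algorithm`, `params`) and an assumed rule that both scenario fields must agree. In horizontal federated learning the participants share a feature space but hold different samples; in vertical federated learning they share samples but hold different features.

```python
def plan_learning_task(config: dict) -> dict:
    """Resolve scenario, federated learning type, and a model description from the config."""
    # Step 1: confirm a learning scenario (assumed rule: both fields must agree).
    is_learning = (
        config["learning_scenario"] == "learning"
        and "learning" in config["adaptation_scenarios"]
    )
    if not is_learning:
        raise ValueError("not a learning scenario")
    # Step 2: decide between horizontal and vertical federated learning.
    learning_type = config["learning_type"]
    if learning_type not in ("horizontal", "vertical"):
        raise ValueError(f"unknown federated learning type: {learning_type}")
    # Step 3: combine the type with algorithm and parameter information.
    return {
        "type": learning_type,
        "algorithm": config["algorithm"],
        "params": dict(config["params"]),
    }


cfg = {
    "adaptation_scenarios": ["learning", "inference"],
    "learning_scenario": "learning",
    "learning_type": "vertical",
    "algorithm": "logistic_regression",
    "params": {"epochs": 10},
}
plan = plan_learning_task(cfg)
```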
16. The method according to any one of claims 6-8, wherein the target participant is determined according to a federated learning output result attribution policy in the configuration information.
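Claim 16 leaves the content of the attribution policy open. As a hedged example, the policy could be a label that the platform maps to a list of recipients. The policy values `initiator_only` and `all_participants` and the function `resolve_targets` are invented here for illustration.

```python
def resolve_targets(policy: str, initiator: str, participants: list) -> list:
    """Map an output attribution policy (hypothetical values) to the target participants."""
    if policy == "initiator_only":
        return [initiator]
    if policy == "all_participants":
        return list(participants)
    raise ValueError(f"unknown attribution policy: {policy}")


members = ["bank_a", "bank_b", "bank_c"]
only_initiator = resolve_targets("initiator_only", "bank_a", members)
everyone = resolve_targets("all_participants", "bank_a", members)
```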
17. The method according to any one of claims 6-8 or 12-13, wherein, during data circulation, the state of the federated learning data product is updated according to the state information of the federated learning data product in the configuration information;
the state of the federated learning data product includes one or more of the following:
submitted, registered, released, running, and finished.
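The product states named in claim 17 suggest a simple lifecycle. The sketch below models them with Python's `enum` module. The strictly linear transition order is an assumption: the claim lists the states but does not define which transitions are allowed.

```python
from enum import Enum


class ProductState(Enum):
    SUBMITTED = "submitted"
    REGISTERED = "registered"
    RELEASED = "released"
    RUNNING = "running"
    FINISHED = "finished"


# Assumed linear lifecycle; the claim names the states but not the transitions.
NEXT_STATE = {
    ProductState.SUBMITTED: ProductState.REGISTERED,
    ProductState.REGISTERED: ProductState.RELEASED,
    ProductState.RELEASED: ProductState.RUNNING,
    ProductState.RUNNING: ProductState.FINISHED,
}


def advance(state: ProductState) -> ProductState:
    """Move a product to its next lifecycle state, or fail if it is already finished."""
    if state not in NEXT_STATE:
        raise ValueError(f"{state.value} is a terminal state")
    return NEXT_STATE[state]


state = ProductState.SUBMITTED
history = [state.value]
while state is not ProductState.FINISHED:
    state = advance(state)
    history.append(state.value)
```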
18. The method of any one of claims 6-8 or 12-13, further comprising: managing versions of the federated learning data product online based on an online version management policy.
19. The method of claim 18, wherein managing versions of the federated learning data product online based on the online version management policy comprises:
according to the online version management policy, when the existing services of none of the participants are affected, selecting a service-uninterrupted mode to change or upgrade the version;
according to the online version management policy, when the existing services of only some of the participants are affected, selecting a mode that pauses service for the affected participants to change or upgrade the version;
and, according to the online version management policy, when the existing services of all the participants are affected, selecting a service-interruption mode to change or upgrade the version.
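The three cases of claim 19 depend only on how many participants a version change affects. A minimal sketch, with hypothetical mode names and a hypothetical function `select_upgrade_mode`, returns the chosen mode together with the participants whose service is paused or interrupted:

```python
def select_upgrade_mode(affected: set, participants: set) -> tuple:
    """Pick an upgrade mode from the set of participants whose services are affected."""
    affected = affected & participants      # ignore identifiers that are not participants
    if not affected:
        return ("uninterrupted", set())     # nobody affected: upgrade without interruption
    if affected == participants:
        return ("interrupt_all", set(participants))   # everyone affected: interrupt service
    return ("pause_affected", affected)     # some affected: pause only those participants


parties = {"bank_a", "bank_b", "bank_c"}
mode_none = select_upgrade_mode(set(), parties)
mode_some = select_upgrade_mode({"bank_b"}, parties)
mode_all = select_upgrade_mode({"bank_a", "bank_b", "bank_c"}, parties)
```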
20. A data product circulation system, comprising: a data transaction platform and participants in data circulation, wherein the participants include a participant acting as a federated learning task initiator;
the participant acting as the federated learning task initiator is configured to complete configuration information of a federated learning data product locally and to initiate a federated learning data product registration request to the data transaction platform, wherein the registration request comprises the configuration information of the federated learning data product;
the data transaction platform is configured to send federated learning data product release information to each participant after managing the federated learning data product;
each participant is configured to apply to the data transaction platform for a data transaction certificate after receiving the federated learning data product release information sent by the data transaction platform;
the data transaction platform is further configured to audit the participants based on the configuration information of the federated learning product after receiving a data transaction certificate application sent by a participant, and to send a data transaction certificate and a federated learning task start instruction to each participant that passes the audit;
the participants are further configured to participate in the federated learning task to obtain output results corresponding to different scenarios;
and the data transaction platform is further configured to send the corresponding output result to a target participant.
21. A computing node, wherein the computing node comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor, and the computer program, when executed, enables the at least one processor to perform the data product circulation method of any one of claims 1-5 or 6-19.
22. A computer-readable storage medium storing computer instructions which, when executed, cause a processor to perform the data product circulation method of any one of claims 1-5 or 6-19.
CN202211668014.1A 2022-12-23 2022-12-23 Data product circulation method, system, computing node and storage medium Pending CN115829751A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211668014.1A CN115829751A (en) 2022-12-23 2022-12-23 Data product circulation method, system, computing node and storage medium

Publications (1)

Publication Number Publication Date
CN115829751A true CN115829751A (en) 2023-03-21

Family

ID=85518099

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211668014.1A Pending CN115829751A (en) 2022-12-23 2022-12-23 Data product circulation method, system, computing node and storage medium

Country Status (1)

Country Link
CN (1) CN115829751A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination