CN114547658A - Data processing method, device, equipment and computer readable storage medium


Info

Publication number
CN114547658A
Authority
CN
China
Prior art keywords
model
data
feature
trained
preset
Prior art date
Legal status
Pending
Application number
CN202210198357.XA
Other languages
Chinese (zh)
Inventor
何元钦
康焱
骆家焕
Current Assignee
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date
Filing date
Publication date
Application filed by WeBank Co Ltd
Priority to CN202210198357.XA
Publication of CN114547658A
Legal status: Pending

Classifications

    • G06F 21/602: Providing cryptographic facilities or services
    • G06F 18/214: Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/22: Matching criteria, e.g. proximity measures
    • G06F 18/24: Classification techniques
    • G06F 21/6245: Protecting personal data, e.g. for financial or medical purposes
    • G06N 3/084: Backpropagation, e.g. using gradient descent

Abstract

The application provides a data processing method, apparatus, device, and computer-readable storage medium, applied to a first participant device. The method includes: acquiring first feature data and second feature data held by the first participant device, and an encrypted feature sent by a second participant device, where the encrypted feature is obtained by encrypting third feature data held by the second participant device, and the second feature data and the third feature data are data of different features of the same users; training a preset first model based on the first feature data, the second feature data, and the encrypted feature to obtain a trained first model; sending the trained first model to a target device for aggregation into a global model; and acquiring attribute information of an object to be recommended and processing it with the global model sent back by the target device to obtain a recommendation result. Through model pre-training in this way, the effect of the jointly modeled model can be improved, the recommendation result is determined based on extracted features with high discriminative power, and the recommendation success rate can be improved.

Description

Data processing method, device, equipment and computer readable storage medium
Technical Field
The present application relates to artificial intelligence technology, and in particular, to a data processing method, an apparatus, an electronic device, a computer-readable storage medium, and a computer program product.
Background
With the trend toward stronger data privacy protection across industries, federated learning, a technology that enables multiple parties to jointly build machine learning models while protecting data privacy, has become one of the focal points of cooperation among enterprises and industries. Vertical federated learning applies when the participants' data features overlap little but their users overlap substantially: the participants take out the data of their shared users, whose data features differ across participants, for joint modeling, and by improving model performance the participants can provide better services to customers.
In a vertical federated scenario, the two participants perform supervised joint modeling based only on the labeled users they share. In practice, however, label data is difficult to obtain, so both parties hold only a small amount of labeled data. Moreover, because their business scenarios differ, the two parties may share few overlapping users, leaving little data available for joint modeling and degrading the model effect of joint modeling. For example, in cross-platform recommendation, company A and company B may have highly overlapping user groups but different products, so from the vertical-federation perspective their features do not overlap: one feature is a book ID, the other a movie ID. Each company holds a rating matrix of users over products, but rating data is sparse, and joint modeling on only a small amount of rating data yields a poor model and hurts the recommendation effect.
Disclosure of Invention
Embodiments of the present application provide a data processing method, an apparatus, an electronic device, a computer-readable storage medium, and a computer program product, which can improve the model effect of joint modeling, determine the recommendation result based on extracted features with high discriminative power, and improve the recommendation success rate.
The technical scheme of the embodiment of the application is realized as follows:
The embodiments of the present application provide a data processing method based on a federated learning system, where the federated learning system includes a first participant device and at least one second participant device; the method is applied to the first participant device and includes the following steps:
acquiring first feature data and second feature data held by the first participant device, and an encrypted feature sent by the second participant device, where the encrypted feature is obtained by encrypting third feature data held by the second participant device, and the second feature data and the third feature data are data of different features of the same users;
training a preset first model based on the first feature data, the second feature data, and the encrypted feature to obtain a trained first model;
sending the trained first model to a target device so that the target device aggregates the trained first model to obtain a global model, where the target device is a server device or a participant device in the federated learning system;
receiving the global model sent by the target device;
and acquiring attribute information of an object to be recommended, and processing the attribute information with the global model to obtain a recommendation result.
An embodiment of the present application provides a data processing apparatus, including:
a first acquisition module, configured to acquire first feature data and second feature data held by the first participant device, and an encrypted feature sent by a second participant device, where the encrypted feature is obtained by encrypting third feature data held by the second participant device, and the second feature data and the third feature data are data of different features of the same users;
a training module, configured to train a preset first model based on the first feature data, the second feature data, and the encrypted feature to obtain a trained first model;
a first sending module, configured to send the trained first model to a target device so that the target device aggregates the trained first model to obtain a global model, where the target device is a server device or a participant device in the federated learning system;
a receiving module, configured to receive the global model sent by the target device;
a second acquisition module, configured to acquire attribute information of an object to be recommended;
and a processing module, configured to process the attribute information with the global model to obtain a recommendation result.
In the foregoing solution, the training module is further configured to:
performing conversion processing on the first feature data and the second feature data respectively with a preset conversion model to obtain converted first feature data and converted second feature data;
performing feature extraction on the converted first feature data with a preset first model to obtain local features corresponding to the first feature data;
determining a first private feature corresponding to the first feature data based on the converted first feature data, the converted second feature data, and the encrypted feature;
processing the local features and the first private feature with a preset first loss function to determine a first loss value;
and back-propagating the first loss value to the preset first model to adjust the parameters of the preset first model and obtain the trained first model.
In the foregoing solution, the training module is further configured to:
training a preset second model based on the converted second feature data and the encrypted feature to obtain a trained second model;
and performing feature extraction on the converted first feature data with the trained second model to obtain a first private feature corresponding to the first feature data.
In the foregoing solution, the training module is further configured to:
performing projection processing on the converted second feature data with a preset second model to obtain a projection feature;
processing the projection feature and the encrypted feature with a preset second loss function to determine a second loss value;
and back-propagating the second loss value to the preset second model to adjust the parameters of the preset second model and obtain the trained second model.
In the foregoing solution, the training module is further configured to:
acquiring a preset second model, where the preset second model includes an initial first sub-model and an initial second sub-model;
performing feature extraction on the converted second feature data with the initial first sub-model to obtain a second private feature;
and performing projection processing on the second private feature with the initial second sub-model to obtain a projection feature.
In the foregoing solution, the training module is further configured to:
back-propagating the second loss value to the initial first sub-model to adjust its parameters and obtain a trained first sub-model;
back-propagating the second loss value to the initial second sub-model to adjust its parameters and obtain a trained second sub-model;
and determining the trained first sub-model and the trained second sub-model as the trained second model.
In the above solution, the apparatus further includes:
an adjusting module, configured to back-propagate the first loss value to the trained first sub-model to adjust its parameters and obtain an updated first sub-model;
correspondingly, the training module is further configured to:
perform feature extraction on the converted first feature data with the updated first sub-model to obtain a first private feature corresponding to the first feature data.
In the foregoing solution, when the first participant device holds label data, the processing module is further configured to:
acquire training data and label data corresponding to the training data;
construct an initial classification model based on the conversion model, a feature extraction model, and a preset classifier, where the feature extraction model includes the global model and/or the trained first sub-model;
train the initial classification model based on the training data and the label data to obtain a trained classification model;
and process the attribute information with the trained classification model to obtain a recommendation result.
In the foregoing solution, the processing module is further configured to:
process the training data with the initial classification model to obtain a training result;
process the label data and the training result with a preset third loss function to determine a third loss value;
and back-propagate the third loss value to the initial classification model to adjust its parameters and obtain the trained classification model.
An embodiment of the present application further provides a data processing method based on a federated learning system, where the federated learning system includes a first participant device and at least one second participant device; the method is applied to the second participant device and includes:
acquiring third feature data held by the second participant device and a preset third model, where the preset third model includes an initial third sub-model;
performing feature extraction on the third feature data with the initial third sub-model to obtain a third private feature;
encrypting the third private feature to obtain an encrypted feature;
and sending the encrypted feature to the first participant device so that the first participant device determines, based on the encrypted feature, a global model for processing attribute information of an object to be recommended.
An embodiment of the present application further provides a data processing apparatus, including:
a third acquisition module, configured to acquire third feature data held by a second participant device and a preset third model, where the preset third model includes an initial third sub-model;
a feature extraction module, configured to perform feature extraction on the third feature data with the initial third sub-model to obtain a third private feature;
an encryption module, configured to encrypt the third private feature to obtain an encrypted feature;
and a second sending module, configured to send the encrypted feature to the first participant device so that the first participant device determines, based on the encrypted feature, a global model for processing attribute information of an object to be recommended.
An embodiment of the present application provides an electronic device, including:
a memory for storing executable instructions;
and a processor, configured to implement the data processing method provided by the embodiments of the present application when executing the executable instructions stored in the memory.
An embodiment of the present application provides a computer-readable storage medium storing executable instructions which, when executed by a processor, implement the data processing method provided by the embodiments of the present application.
An embodiment of the present application provides a computer program product including a computer program which, when executed by a processor, implements the data processing method provided by the embodiments of the present application.
The embodiment of the application has the following beneficial effects:
The data processing method provided by the embodiments of the present application introduces an unlabeled joint-modeling process to address the problem that, in a vertical scenario, the users overlapping between the active party and the passive party have little label data, making it difficult to accurately evaluate the quality of the passive party's data source. Joint modeling is performed with the data of unlabeled aligned users and unlabeled non-aligned users, and the joint model of the feature extraction part is optimized; this improves the model effect of joint modeling, the recommendation result is determined based on extracted features with high discriminative power, and the recommendation success rate can be improved.
Drawings
FIG. 1 is a block diagram of the architecture of a data processing system provided in an embodiment of the present application;
FIGS. 2A-2B are schematic structural diagrams of an electronic device provided in an embodiment of the present application;
FIG. 3 is a schematic flowchart of a data processing method provided in an embodiment of the present application;
FIG. 4 is a schematic flowchart of training a preset first model provided in an embodiment of the present application;
FIG. 5 is a schematic flowchart of training a second model provided in an embodiment of the present application;
FIG. 6 is a schematic flowchart of performing projection processing on the converted second feature data provided in an embodiment of the present application;
FIG. 7 is a schematic flowchart of an implementation process of processing attribute information provided in an embodiment of the present application;
FIG. 8 is another schematic flowchart of a data processing method provided in an embodiment of the present application;
FIG. 9 is a schematic diagram of a vertical federated learning system architecture provided in an embodiment of the present application;
FIG. 10 is a schematic diagram of the overall structure of a vertical federated learning model pre-training method provided in an embodiment of the present application;
FIG. 11 is a schematic flowchart of private model training based on aligned data provided in an embodiment of the present application;
FIG. 12 is a schematic flowchart of local model training based on local data provided in an embodiment of the present application;
FIG. 13 is a schematic flowchart of local model aggregation and distribution provided in an embodiment of the present application;
FIG. 14 is a schematic flowchart of performing labeled supervised learning based on a pre-trained model provided in an embodiment of the present application.
Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings. The described embodiments should not be considered as limiting the present application, and all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of the present application.
In the following description, "some embodiments" describes a subset of all possible embodiments; "some embodiments" may refer to the same subset or different subsets of all possible embodiments, and the embodiments may be combined with one another where no conflict arises.
Where the terms "first/second" appear below, they are used merely to distinguish similar items and do not imply a particular ordering; it should be understood that "first/second/third" may be interchanged in a particular order or sequence where permissible, so that the embodiments of the application described herein can be implemented in an order other than that illustrated or described.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the application.
Before further detailed description of the embodiments of the present application, terms and expressions referred to in the embodiments of the present application will be described, and the terms and expressions referred to in the embodiments of the present application will be used for the following explanation.
1) Federated learning: a method of machine learning carried out jointly by different participants (also known as parties, data owners, or clients). In federated learning, participants do not need to expose their own data to other participants or to the coordinator (also called the parameter server or aggregation server), so federated learning can protect user privacy and safeguard data security well.
2) Horizontal federated learning: when the participants' data features overlap substantially but their users overlap little, the data whose features are the same but whose users are not identical is taken out for joint machine learning.
3) Vertical federated learning: when the participants' data features overlap little but their users overlap substantially, the data of the shared users, whose data features differ across participants, is taken out for joint machine learning training.
4) Contrastive learning: a method that trains a machine learning model on the task of describing which things are similar and which are dissimilar; with this approach, a model can be trained, for example, to distinguish similar and dissimilar images, as illustrated in the sketch below.
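As a concrete illustration, below is a minimal PyTorch sketch of a classic pairwise contrastive loss (margin-based, in the style of Hadsell et al.); this is a generic example of the technique, not the specific formulation used later in this application, and all names and dimensions are illustrative.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(a: torch.Tensor, b: torch.Tensor, same_user: bool,
                     margin: float = 1.0) -> torch.Tensor:
    """Pull features of the same item together; push different items apart."""
    d = F.pairwise_distance(a, b)
    if same_user:
        return (d ** 2).mean()                            # similar pair: minimize distance
    return (torch.clamp(margin - d, min=0) ** 2).mean()   # dissimilar pair: enforce a margin

# Toy usage: two batches of 8 feature vectors of dimension 16.
loss = contrastive_loss(torch.randn(8, 16), torch.randn(8, 16), same_user=True)
```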
Embodiments of the present application provide a data processing method, an apparatus, an electronic device, a computer-readable storage medium, and a computer program product, which can improve the model effect of joint modeling, determine the recommendation result based on extracted features with high discriminative power, and improve the recommendation success rate.
Based on the above explanation of the terms involved in the embodiments of the present application, the data processing system provided by the embodiments is described first. Referring to FIG. 1, FIG. 1 is a schematic diagram of the architecture of a data processing system provided in an embodiment of the present application. In the data processing system 100, a first participant device 400 and second participant devices 410 (two second participant devices are shown by way of example, denoted 410-1 and 410-2 for distinction) are connected to each other through a network 300 and to a server device through the network 300. The network 300 may be a wide area network, a local area network, or a combination of the two, and data transmission may be implemented over wireless links.
In some embodiments, the first participant device 400 and the second participant device 410 may be, but are not limited to, a laptop computer, a tablet computer, a desktop computer, a smart phone, a dedicated messaging device, a portable gaming device, a smart speaker, a smart watch, etc., and may also be client terminals of federal learning participants, such as participant devices storing user characteristic data at various banks or financial institutions, etc. The server device may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), a big data and artificial intelligence platform, and the like, and is used for assisting each participant device in performing federal learning to obtain a federal learning model. The network 300 may be a wide area network or a local area network, or a combination of both. The first participant device 400 and the second participant device 410 may be directly or indirectly connected through wired or wireless communication, and the embodiments of the present application are not limited thereto.
The first participant device 400 is configured to acquire the first feature data and second feature data it holds and to receive the encrypted feature sent by the second participant device, where the encrypted feature is obtained by encrypting third feature data held by the second participant device, and the second feature data and the third feature data are data of different features of the same users; then train a preset first model based on the first feature data, the second feature data, and the encrypted feature to obtain a trained first model; send the trained first model to a target device so that the target device aggregates the trained models sent by the participant devices into a global model, where the target device is a server device or any participant device in the federated learning system; receive the global model sent by the target device; and acquire attribute information of an object to be recommended and process it with the global model to obtain a recommendation result.
The second participant device 410 is configured to acquire the third feature data it holds and a preset third model, where the preset third model includes an initial third sub-model; perform feature extraction on the third feature data with the initial third sub-model to obtain a third private feature; encrypt the third private feature to obtain an encrypted feature; and finally send the encrypted feature to the first participant device so that the first participant device determines, based on the encrypted feature, a global model for processing the attribute information of an object to be recommended.
Referring to FIGS. 2A-2B, FIGS. 2A-2B are schematic structural diagrams of an electronic device provided in the embodiments of the present application. In practical applications, the electronic device 500 may be implemented as the first participant device 400 or the second participant device 410 in FIG. 1; it is used here to describe an electronic device implementing the data processing method of the embodiments of the present application. The electronic device 500 shown in FIGS. 2A-2B includes: at least one processor 510, a memory 550, at least one network interface 520, and a user interface 530. The components of the electronic device 500 are coupled together by a bus system 540. It will be appreciated that the bus system 540 is used to enable communication among these components. In addition to a data bus, the bus system 540 includes a power bus, a control bus, and a status signal bus. For clarity of illustration, however, the various buses are all labeled as the bus system 540 in FIGS. 2A-2B.
The processor 510 may be an integrated circuit chip with signal processing capabilities, such as a general-purpose processor, a digital signal processor (DSP), another programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, where the general-purpose processor may be a microprocessor or any conventional processor.
The user interface 530 includes one or more output devices 531 enabling presentation of media content, including one or more speakers and/or one or more visual display screens. The user interface 530 also includes one or more input devices 532, including user interface components to facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, other input buttons and controls.
The memory 550 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard disk drives, optical disk drives, and the like. Memory 550 optionally includes one or more storage devices physically located remote from processor 510.
The memory 550 may comprise volatile memory or nonvolatile memory, and may also comprise both volatile and nonvolatile memory. The nonvolatile Memory may be a Read Only Memory (ROM), and the volatile Memory may be a Random Access Memory (RAM). The memory 550 described in embodiments herein is intended to comprise any suitable type of memory.
In some embodiments, memory 550 can store data to support various operations, examples of which include programs, modules, and data structures, or subsets or supersets thereof, as exemplified below.
An operating system 551 including system programs for processing various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and processing hardware-based tasks;
a network communication module 552 for reaching other computing devices via one or more (wired or wireless) network interfaces 520, exemplary network interfaces 520 including: Bluetooth, Wireless Fidelity (WiFi), Universal Serial Bus (USB), etc.;
a presentation module 553 for enabling presentation of information (e.g., a user interface for operating peripheral devices and displaying content and information) via one or more output devices 531 (e.g., a display screen, speakers, etc.) associated with the user interface 530;
an input processing module 554 to detect one or more user inputs or interactions from one of the one or more input devices 532 and to translate the detected inputs or interactions.
In some embodiments, the data processing apparatus provided in this embodiment of the present application may be implemented in software, fig. 2A illustrates a schematic structural diagram of the electronic device provided in this embodiment of the present application as a first participant device 400, and the data processing apparatus 555 stored in the memory 550 may be software in the form of programs and plug-ins, and includes the following software modules: the first obtaining module 5551, the training module 5552, the first sending module 5553, the receiving module 5554, the second obtaining module 5555 and the processing module 5556 are logical, and thus any combination or further splitting may be performed according to the implemented functions. The functions of the respective modules will be explained below.
In some embodiments, as shown in fig. 2B, fig. 2B is a schematic structural diagram of the electronic device provided in the embodiment of the present application as the second participant device 410, and the software modules stored in the data processing apparatus 555 of the memory 550 may include: the third obtaining module 5557, the feature extracting module 5558, the encrypting module 5559, and the second sending module 5560 are logical modules, and thus may be arbitrarily combined or further separated according to the implemented functions. The functions of the respective modules will be explained below.
In other embodiments, the data processing apparatus provided in the embodiments of the present application may be implemented in hardware; for example, it may be a processor in the form of a hardware decoding processor programmed to execute the data processing method provided in the embodiments of the present application. For example, the processor in the form of a hardware decoding processor may employ one or more application-specific integrated circuits (ASICs), DSPs, programmable logic devices (PLDs), complex programmable logic devices (CPLDs), field-programmable gate arrays (FPGAs), or other electronic components.
The data processing method provided by the embodiments of the present application is described below in conjunction with exemplary applications and implementations of the first participant device provided by the embodiments. The method is based on a federated learning system including a first participant device and at least one second participant device. Referring to FIG. 3, FIG. 3 is a schematic flowchart of a data processing method provided in an embodiment of the present application, described with reference to the steps shown in FIG. 3.
Step S301, acquire the first feature data and the second feature data held by the first participant device, and the encrypted feature sent by the second participant device.
Here, the encrypted feature is obtained by the second participant device encrypting the third feature data it holds. The second feature data and the third feature data are data of different features of the same users; that is, the second feature data and the third feature data are aligned.
In practical implementation, a vertical federated model typically involves at least two participants: a first participant holding both feature data and label data (also referred to as the Guest or active party), and a second participant holding feature data (also referred to as the Host or passive party). The method provided by the embodiments of the present application is applicable to a vertical federated model in which one Guest party and at least one Host party participate. In this scenario, among the users shared by the first participant and the second participant, the first participant has no labeled data or only a small amount of labeled data, so conventional joint modeling based on labeled data is not possible. The embodiments of the present application provide joint modeling in an unlabeled-data scenario.
For example, the embodiments of the present application are described with an active party A and a passive party B: the first participant acts as the active party and the second participant as the passive party. In practical applications, the same scheme can be extended to train a model jointly with a plurality of passive parties at the same time. The first feature data of the first participant is denoted X_{A,unaligned}, the second feature data of the first participant device is denoted X_{A,aligned}, and the third feature data of the second participant is denoted X_{B,aligned}. The second participant device performs conversion, feature extraction, projection, encryption, and other processing on the third feature data to obtain a second encrypted feature f_{B,aligned}; to distinguish it from the encrypted feature obtained by the first participant device's own encryption processing, the feature obtained by the first participant device's encryption is referred to as the first encrypted feature, and the feature obtained by the second participant device's encryption is referred to as the second encrypted feature. The first participant device acquires the first feature data X_{A,unaligned} and the second feature data X_{A,aligned} from its own storage, and obtains the second encrypted feature f_{B,aligned} from the second participant device. The second feature data X_{A,aligned} and the third feature data X_{B,aligned} are the aligned data of the two parties.
Step S302, train a preset first model based on the first feature data, the second feature data, and the encrypted feature to obtain a trained first model.
The preset first model is trained with (X_{A,unaligned}, X_{A,aligned}, f_{B,aligned}) to obtain the trained first model. The preset first model here is the local feature extraction model initialized by the first participant device. In step S302, joint modeling is implemented through model pre-training based on unlabeled data.
Fig. 4 is a schematic flowchart of a process of training a preset first model according to an embodiment of the present application, and as shown in fig. 4, in an implementation manner, step S302 may be implemented by steps S3021 to S3025 shown in fig. 4:
step S3021, respectively performing conversion processing on the first feature data and the second feature data by using a preset conversion model to obtain converted first feature data and converted second feature data.
Each participant in the joint training has a conversion model C, with which it converts its raw data X into predefined features (matrices or vectors) of the same dimension, obtaining converted features h. For example, party A inputs its first feature data X_{A,unaligned} and second feature data X_{A,aligned} into the preset conversion model C_A for conversion, obtaining converted first feature data h_{A,unaligned} and converted second feature data h_{A,aligned}. Party B performs the same operation, inputting its third feature data X_{B,aligned} into the preset conversion model C_B to obtain converted third feature data h_{B,aligned}.
Step S3022, performing feature extraction processing on the converted first feature data by using a preset first model to obtain a local feature corresponding to the first feature data.
Each participant has its own local feature extraction model M_local (the first model), and all local feature extraction models share the same structure. Each local feature extraction model M_local acts on the output of the corresponding conversion model, that is, on the converted feature data h (the converted features), and extracts the local features corresponding to the raw data. For example, party A's preset first model is M_{A,local}; the converted first feature data h_{A,unaligned} obtained in step S3021 is input into M_{A,local} for feature extraction, yielding the local features f_{A,unaligned} corresponding to the first feature data X_{A,unaligned}.
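To make steps S3021 and S3022 concrete, the following is a minimal PyTorch sketch of a conversion model C and a local feature extraction model M_local. The linear/MLP architectures, dimensions, and variable names are illustrative assumptions; the application does not prescribe these structures.

```python
import torch
import torch.nn as nn

class ConversionModel(nn.Module):
    """C: maps a party's raw features X to a predefined shared dimension."""
    def __init__(self, raw_dim: int, shared_dim: int = 64):
        super().__init__()
        self.net = nn.Linear(raw_dim, shared_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)  # converted features h

class LocalFeatureExtractor(nn.Module):
    """M_local: identical structure on every party; acts on converted features h."""
    def __init__(self, shared_dim: int = 64, feat_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(shared_dim, shared_dim), nn.ReLU(),
            nn.Linear(shared_dim, feat_dim),
        )

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return self.net(h)  # local features f

# Party A: convert raw data, then extract local features.
C_A = ConversionModel(raw_dim=100)           # raw_dim is party-specific
M_A_local = LocalFeatureExtractor()
X_A_unaligned = torch.randn(256, 100)        # X_{A,unaligned}
h_A_unaligned = C_A(X_A_unaligned)           # converted first feature data
f_A_unaligned = M_A_local(h_A_unaligned)     # local features f_{A,unaligned}
```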
Step S3023, determining a first private feature corresponding to the first feature data based on the converted first feature data, the converted second feature data, and the encrypted feature.
In an embodiment of the present application, determining the first private characteristic may be implemented as: training a preset second model based on the converted second feature data and the encrypted features to obtain a trained second model; and performing feature extraction processing on the converted first feature data by using the trained second model to obtain a first private feature corresponding to the first feature data.
Each participant has a respective private feature extraction model, and the preset second model comprises the private feature extraction model. The structure of the private feature extraction model of each participant can be the same or different, and the structure and parameter weight of the private model are private data of each participant and are unknown to other participants.
The first participant device trains the preset second model based on the converted second feature data and the encrypted feature to obtain a trained second model. The second model includes a private feature extraction model that can extract the private features of the data; the private feature extraction model included in the trained second model is denoted M_{A,private}. Using the private feature extraction model M_{A,private} to process the converted first feature data h_{A,unaligned}, the converted second feature data h_{A,aligned}, and the second encrypted feature f_{B,aligned}, the first private feature f_{A,aligned} corresponding to the first feature data X_{A,unaligned} is extracted.
Here the second encrypted feature f_{B,aligned} is obtained by the second participant device performing private feature extraction on the third feature data X_{B,aligned} and then encrypting the resulting private feature.
Step S3024, processing the local feature and the first private feature by using a preset first loss function, and determining a first loss value.
This step is a model-updating process whose goal is to bring the features of the same user as close together as possible in the feature space while pushing the features of different users as far apart as possible. In the embodiments of the present application, this objective can be achieved through the preset first loss function. Specifically: party A computes the first loss function based on (f_{A,unaligned}, f_{A,aligned}). The preset first loss function may be a contrastive loss function or a similarity loss function; the embodiments of the present application take the similarity loss function as an example. Formula (1) gives the per-pair similarity loss, denoted L_sim(f_i, f_j) (the equation image for formula (1) in the original publication is not reproduced here), where i is a positive integer in the range [1, number of local features], the number of local features equals the number of first feature data, and f_i denotes the i-th local feature; j is a positive integer in the range [1, number of first private features], the number of first private features equals the number of second feature data, and f_j denotes the j-th first private feature. i and j each traverse the samples in party A's current training pool, and the first loss function is computed per pair as in formula (1). The similarity loss of each sample is computed and summed to obtain the final result, the first loss value, as in formula (2):

L_{A,sim} = Σ L_sim(f_i, f_j)    (2)
step S3025, reversely propagating the first loss value to the preset first model to adjust a parameter of the preset first model, so as to obtain the trained first model.
Party A computes the first loss value L_{A,sim} from (f_{A,unaligned}, f_{A,aligned}), back-propagates the first loss value to the first model, and adjusts the parameters of the first model by gradient descent to update it, obtaining the trained first model.
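A minimal sketch of steps S3024 and S3025 under stated assumptions: because the original equation images for formulas (1) and (2) are not reproduced, negative cosine similarity stands in for the per-pair similarity loss, and the traversal of i and j over the training pool is simplified to row-wise pairing of matched samples.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# M_{A,local}: the preset first model (illustrative MLP, as in the sketch above).
M_A_local = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 32))
optimizer = torch.optim.SGD(M_A_local.parameters(), lr=0.01)

h_A_unaligned = torch.randn(256, 64)   # converted first feature data
f_A_private = torch.randn(256, 32)     # first private features (computed elsewhere)

def pair_sim_loss(f_i: torch.Tensor, f_j: torch.Tensor) -> torch.Tensor:
    # Stand-in for formula (1): negative cosine similarity (an assumption).
    return -F.cosine_similarity(f_i, f_j, dim=-1)

f_A_unaligned = M_A_local(h_A_unaligned)                    # local features
L_A_sim = pair_sim_loss(f_A_unaligned, f_A_private).sum()   # formula (2): sum of losses

optimizer.zero_grad()
L_A_sim.backward()    # back-propagate the first loss value to the first model
optimizer.step()      # adjust parameters by gradient descent
```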
The same procedure applies to party B training its own local feature extraction model M_{B,local}.
Step S303, sending the trained first model to the target device, so that the target device aggregates the trained first model to obtain a global model.
The target device is a server device or a participant device in the federated learning system.
The first participant device sends the trained first model, namely the trained local feature extraction model, to the server device or the preset participant device, and the server device or the preset participant device aggregates the trained local feature extraction models of the plurality of participant devices to obtain a global model, and then distributes the global model back to each participant device.
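A minimal sketch of the aggregation step on the target device; the application does not fix the aggregation rule at this point, so FedAvg-style parameter averaging over structurally identical local models is assumed.

```python
import copy
import torch
import torch.nn as nn

def aggregate(local_models: list) -> nn.Module:
    """Average the parameters of structurally identical trained local models
    into one global model (FedAvg-style averaging is an assumption)."""
    global_model = copy.deepcopy(local_models[0])
    avg_state = global_model.state_dict()
    for key in avg_state:
        avg_state[key] = torch.stack(
            [m.state_dict()[key].float() for m in local_models]
        ).mean(dim=0)
    global_model.load_state_dict(avg_state)
    return global_model

# e.g. on the target device, with the trained models received from A and B:
make_local = lambda: nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 32))
global_model = aggregate([make_local(), make_local()])  # then sent back to each party
```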
Step S304, receive the global model sent by the target device.
In some embodiments, before processing the data to be processed, if the first participant device holds a small amount of label data, the global model may be adjusted according to the label data, further improving the effect of the joint model and making the processing result more accurate.
Step S305, acquiring the attribute information of the object to be recommended.
Step S306, process the attribute information with the global model to obtain a recommendation result.
After the global model is obtained, the first participant device can process the data to be processed with the global model to obtain a processing result. When the method is applied to information recommendation, the data to be processed is the attribute information of an object to be recommended; the global model is used to extract features from the attribute information, features with high discriminative power can be extracted, the recommendation result is determined based on those features, and the recommendation success rate can be improved.
The data processing method provided by the embodiments of the present application is based on a federated learning system including a first participant device and at least one second participant device, and is applied to the first participant device. The method includes: acquiring first feature data and second feature data held by the first participant device, and an encrypted feature sent by the second participant device, where the encrypted feature is obtained by encrypting third feature data held by the second participant device, and the second feature data and the third feature data are data of different features of the same users; training a preset first model based on the first feature data, the second feature data, and the encrypted feature to obtain a trained first model; sending the trained first model to a target device so that the target device aggregates it into a global model, where the target device is a server device or a participant device in the federated learning system; receiving the global model sent by the target device; and acquiring attribute information of an object to be recommended and processing it with the global model to obtain a recommendation result. Through model pre-training in this way, the effect of the jointly modeled model can be improved, the recommendation result is determined based on extracted features with high discriminative power, and the recommendation success rate can be improved.
Fig. 5 is a schematic flowchart of training a second model according to an embodiment of the present application. As shown in FIG. 5, in some embodiments, "training a preset second model based on the converted second feature data and the encrypted feature to obtain a trained second model" may be implemented as the following steps:
step S501, projection processing is carried out on the converted second characteristic data by using a preset second model, and projection characteristics are obtained.
Here, the preset second model includes an initial first sub-model and an initial second sub-model: the initial first sub-model is a private feature extraction model capable of extracting private features, denoted M_{A,private}; the initial second sub-model is a projection model for projecting features, denoted H_A.
Fig. 6 is a schematic flowchart of a process of performing projection processing on the converted second feature data according to an embodiment of the present application, and as shown in fig. 6, in some embodiments, the step S501 may be implemented as the following steps:
Step S5011, acquire a preset second model.
Step S5012, performing feature extraction processing on the converted second feature data by using the initial first submodel to obtain a second private feature.
The converted second feature data h_{A,aligned} is input into the initial first sub-model M_{A,private}, which performs private feature extraction on h_{A,aligned} to obtain the second private feature f_{A,aligned} corresponding to the second feature data X_{A,aligned}. In the embodiments of the present application, to ensure that data privacy is not leaked, the second private feature may be encrypted by an encryption method such as differential privacy.
Step S5013, performing projection processing on the second private feature by using the initial second sub-model to obtain a projection feature.
The second private feature f_{A,aligned} is input into the initial second sub-model H_A, which performs projection processing on f_{A,aligned} to obtain the projection feature corresponding to the second feature data X_{A,aligned}, denoted z_A.
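A minimal sketch of steps S5012 and S5013 under the same illustrative assumptions; the application does not prescribe the architectures of M_{A,private} or H_A.

```python
import torch
import torch.nn as nn

# Initial first sub-model: private feature extraction model M_{A,private}.
M_A_private = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 32))
# Initial second sub-model: projection head H_A.
H_A = nn.Linear(32, 32)

h_A_aligned = torch.randn(128, 64)       # converted second feature data
f_A_aligned = M_A_private(h_A_aligned)   # second private feature
z_A = H_A(f_A_aligned)                   # projection feature z_A
```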
Step S502, processing the projection feature and the encryption feature by using a preset second loss function, and determining a second loss value.
As in step S3024, this step is also a model-updating process aimed at bringing the features of the same user as close together as possible in the feature space and pushing the features of different users as far apart as possible. In the embodiments of the present application, this objective can be achieved through the preset second loss function. Specifically: party A computes the loss function based on (z_{A,aligned}, f_{B,aligned}). The preset second loss function may be a contrastive loss function or a similarity loss function; the embodiments of the present application take the similarity loss function as an example. Formula (3) gives the per-pair similarity loss, denoted L_sim(f_i, z_j) (the equation image for formula (3) in the original publication is not reproduced here), where i is a positive integer in the range [1, number of second encrypted features], the number of second encrypted features equals the number of third feature data, and f_i denotes the i-th second encrypted feature; j is a positive integer in the range [1, number of projection features], the number of projection features equals the number of second feature data, and z_j denotes the j-th projection feature. i and j traverse the samples in the current training pools of parties A and B respectively, and f_i and z_j are features or encrypted features of the same user, that is, party A's sample j and party B's sample i are aligned; the overall loss function is computed per pair as in formula (3). The similarity loss of each aligned sample is computed and then summed to obtain the final result, the second loss value, as in formula (4):

L_{A,sim2} = Σ L_sim(f_i, z_j)    (4)
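A short sketch of the second loss value under the same stand-in loss (negative cosine similarity for formula (3) is an assumption); z_A and the second encrypted features are taken to be row-aligned, i.e., row k on both sides belongs to the same user.

```python
import torch
import torch.nn.functional as F

def second_loss(z_A: torch.Tensor, f_B_encrypted: torch.Tensor) -> torch.Tensor:
    # Formula (4): sum the per-aligned-sample similarity losses between party A's
    # projection features and party B's second encrypted features.
    return (-F.cosine_similarity(z_A, f_B_encrypted, dim=-1)).sum()

L_A_sim2 = second_loss(torch.randn(128, 32), torch.randn(128, 32))
```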
step S503, reversely transmitting the second loss value to the preset second model to adjust the parameters of the preset second model, so as to obtain the trained second model.
Party A computes the second loss value L_{A,sim2} from (z_{A,aligned}, f_{B,aligned}), back-propagates it to the second model, and adjusts the parameters of the second model by gradient descent to update it, obtaining the trained second model.
The preset second model includes the initial first sub-model and the initial second sub-model, and during back-propagation the parameters of each sub-model can be adjusted separately to obtain the trained second model. In practical implementation, step S503 can be implemented as: back-propagating the second loss value to the initial first sub-model to adjust its parameters and obtain a trained first sub-model; back-propagating the second loss value to the initial second sub-model to adjust its parameters and obtain a trained second sub-model; and determining the trained first sub-model and the trained second sub-model as the trained second model.
In some embodiments, after the private feature extraction model (i.e., the first sub-model) is updated with the second loss value, the model may be further updated with the first loss value: the first loss value is back-propagated to the trained first sub-model to adjust its parameters, obtaining an updated first sub-model. The next time private features are extracted, the updated first sub-model can be used to perform feature extraction on the converted first feature data to obtain the first private feature corresponding to the first feature data.
In some embodiments, when the first participant device holds label data, the global model may be adjusted according to the label data after it is obtained, further improving the effect of the joint model and making the recommendation result more accurate. Fig. 7 is a schematic flowchart of an implementation process of processing attribute information according to an embodiment of the present application; as shown in FIG. 7, step S306, "process the attribute information with the global model to obtain a recommendation result", may include the following steps:
and S3051, acquiring training data and label data corresponding to the training data.
The obtained training data is denoted X_train, and the label data corresponding to the training data is denoted Y_train.
Step S3052, construct an initial classification model based on the conversion model, the feature extraction model, and a preset classifier.
The feature extraction model here includes the global model M_{A,local} and/or the trained first sub-model M_{A,private}. The initial classification model constructed from party A's conversion model C_A, feature extraction model M_A, and preset classifier P_A is denoted (C_A, M_A, P_A). In the embodiments of the present application, the feature extraction model M_A may include only the global model M_{A,local}, in which case the attribute information is processed with the trained local feature extraction model; M_A may include only the trained first sub-model M_{A,private}, in which case the attribute information is processed with the trained private feature extraction model; or M_A may include both the global model M_{A,local} and the trained first sub-model M_{A,private}, in which case the trained local and private feature extraction models process the attribute information together, and their outputs may be either summed or concatenated.
Step S3053, train the initial classification model based on the training data and the label data to obtain a trained classification model.
This step can be implemented as: processing the training data with the initial classification model to obtain a training result; processing the label data and the training result with a preset third loss function to determine a third loss value; and back-propagating the third loss value to the initial classification model to adjust its parameters and obtain the trained classification model.
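A minimal sketch of the labeled fine-tuning described in steps S3051 to S3053; the module sizes, the linear classifier head, and the use of cross-entropy as the preset third loss function are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ClassificationModel(nn.Module):
    """(C_A, M_A, P_A): conversion model, feature extraction model, classifier."""
    def __init__(self, conversion: nn.Module, extractor: nn.Module, n_classes: int = 2):
        super().__init__()
        self.conversion = conversion
        self.extractor = extractor
        self.classifier = nn.Linear(32, n_classes)   # preset classifier P_A

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.extractor(self.conversion(x)))

conversion = nn.Linear(100, 64)                                              # C_A
extractor = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 32))   # M_A
model = ClassificationModel(conversion, extractor)

X_train = torch.randn(512, 100)          # training data X_train
Y_train = torch.randint(0, 2, (512,))    # label data Y_train
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()          # stand-in for the preset third loss function

for _ in range(10):                      # a few fine-tuning epochs
    optimizer.zero_grad()
    third_loss = loss_fn(model(X_train), Y_train)   # third loss value
    third_loss.backward()                # back-propagate to the classification model
    optimizer.step()                     # adjust its parameters
```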
And S3054, processing the attribute information by using the trained classification model to obtain a recommendation result.
The attribute information of the object to be recommended is input into the trained classification model for feature extraction, and information is then recommended based on the extracted features to obtain a recommendation result.
In the embodiment of the present application, adjusting the model according to the tag data can further improve the effect of the combined model; processing the attribute information with the adjusted model then makes the recommendation result more accurate, further improving the recommendation success rate.
Next, the data processing method provided in the embodiment of the present application will be described in conjunction with an exemplary application and implementation of the second participant device provided in the embodiment of the present application. Referring to fig. 8, fig. 8 is another schematic flow chart of a data processing method provided in an embodiment of the present application, which will be described with reference to the steps shown in fig. 8.
Step S801, obtain third feature data and a preset third model held by the second participant device.
Here, the preset third model includes an initial third submodel for performing feature extraction on the third feature data.
Step S802, the third feature data is subjected to feature extraction processing by using the initial third sub-model, and third private features are obtained.
Step S803, perform encryption processing on the third private feature to obtain an encrypted feature.
The encryption feature is the second encryption feature in the above embodiment.
In this embodiment, in order to protect the data privacy of the second participant device, the third private feature may be encrypted to obtain the second encrypted feature.
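The embodiment does not name a specific encryption scheme; as one hedged illustration, the additively homomorphic Paillier scheme (here via the python-paillier package, `phe`) can encrypt a private feature vector element-wise. All values below are dummies:

```python
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

third_private_feature = [0.12, -0.53, 0.88]
encrypted_feature = [public_key.encrypt(v) for v in third_private_feature]

# The receiver can add ciphertexts or multiply them by plaintext scalars
# without seeing the values; only the key holder can decrypt.
decrypted = private_key.decrypt(encrypted_feature[0])
assert abs(decrypted - 0.12) < 1e-9
```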
Step S804, sending the encryption feature to the first party device, so that the first party device determines, based on the encryption feature, a global model for processing the attribute information of the object to be recommended.
In some embodiments, the preset third model may further include an initial fourth sub-model, which is used to perform projection processing on the second encryption feature to obtain a second projection feature. The second participant device executes the data processing method provided by the above embodiment, so that a joint model suitable for the second participant device can be obtained through training, and the attribute information of the object to be recommended of the second participant device is processed through joint modeling.
Next, an exemplary application of the embodiment of the present application in a practical application scenario will be described.
In longitudinal federated learning, when the participants' data features overlap little but their users overlap heavily, the portion of data belonging to common users, whose feature sets differ across participants, is taken out for joint machine learning training. For example, consider two participants A and B in the same region, where participant A is a bank and participant B is an e-commerce platform. A and B share many users in the area, but their businesses differ, so the user data features they record differ and may in fact be complementary. In such a scenario, vertical federated learning can help A and B build a joint machine learning prediction model and thereby provide better service to their customers.
Fig. 9 is a schematic diagram of a longitudinal federated learning system architecture provided in an embodiment of the present application; to assist the joint modeling of A and B, a coordinator C is required. First part: participants A and B perform encrypted sample alignment, as illustrated in fig. 9. Because the user groups of enterprises A and B do not completely overlap, the system uses an encryption-based user sample alignment technique to confirm the common users of A and B without either side disclosing its own data, and without exposing the users that do not overlap, so that modeling can be carried out by combining the features of these common users.
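A toy illustration of the alignment idea follows: both parties hash their user IDs with a jointly negotiated salt and intersect only the hashes, so neither side reveals non-overlapping users in the clear. Production systems use stronger private set intersection protocols (e.g. blind-RSA or Diffie-Hellman based PSI); the IDs and salt below are made up:

```python
import hashlib

def blind(ids, salt: bytes):
    # Map salted hash -> original ID so the owner can recover its own matches.
    return {hashlib.sha256(salt + i.encode()).hexdigest(): i for i in ids}

salt = b"jointly-negotiated-salt"
party_a = blind(["u1", "u2", "u3"], salt)   # bank's user IDs
party_b = blind(["u2", "u3", "u4"], salt)   # e-commerce platform's user IDs

common = party_a.keys() & party_b.keys()
aligned_users_a = sorted(party_a[h] for h in common)   # ['u2', 'u3']
```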
Second part: encrypted model training. After the common user population is determined, the machine learning model can be trained on these data. To keep the data confidential during training, the coordinator C conducts encrypted training. Taking a linear regression model as an example, the training process can be divided into the following four steps. Step one: coordinator C distributes the public key to A and B, which is used to encrypt the data that needs to be exchanged during training. Step two: participants A and B interact in encrypted form to compute intermediate results of the gradient. Step three: participants A and B each perform calculations based on the encrypted gradient values, while participant B also computes the loss function from its tag data; the results are summarized to coordinator C, which aggregates them to compute the total gradient value and decrypts it. Step four: coordinator C sends the decrypted gradients back to participants A and B respectively, and A and B update the parameters of their models according to the gradients.
The participants and the coordinator iterate the above steps until the loss function converges, the model parameters converge, or the maximum number of iterations or maximum training time is reached, which completes the model training process.
It should be noted that, in both the horizontal federal learning and the vertical federal learning, the encryption operation and the encryption transmission are optional and need to be determined according to specific application scenarios, and not all application scenarios need the encryption operation and the encryption transmission.
As described above, the active party in the longitudinal federated scenario owns tag data Y and feature data X_1, and improves the performance of its own model by jointly modeling with another party that holds feature data X_2 (where X_2 and X_1 are not identical). The data quality of the passive party (the holder of X_2) determines how much the active party's model improves in the joint modeling. Therefore, assessing the passive party's data quality is an important step in the federated modeling process. Data quality covers multiple evaluation indicators, e.g. user overlap ratio and the modeling improvement relative to reference data. The modeling improvement is usually evaluated after joint modeling between one or more groups of labeled data and the passive party; if the model's performance improves substantially, the passive party's data quality is considered good. However, in practice the active party sometimes has little tag data, and little of it can be matched with the passive party, so the data quality evaluation either cannot be performed or its result fails to reflect the real data quality. Therefore, if the passive party's data quality can be evaluated through joint modeling based on a large amount of unlabeled data, the range of accessible data sources can be expanded on the one hand, and the accuracy of data evaluation improved on the other.
The embodiment of the present application provides a longitudinal federated learning model pre-training method that can be used in scenarios where only part of the data is aligned. Fig. 10 is a schematic overall structure diagram of the longitudinal federated learning model pre-training method provided in an embodiment of the present application; as shown in fig. 10, the pre-training process consists of three steps: first, self-supervised training of the participants' respective private models based on aligned users; second, training of each participant's local model based on local data; and third, aggregating the local models from the second step at a server and distributing the aggregated model back to the participants.
Party A and party B hold data (X_{A,aligned}, X_{A,unaligned}) and (X_{B,aligned}, X_{B,unaligned}) respectively, where X_{A,aligned} and X_{B,aligned} are the aligned portion of the two parties' data. The two parties have conversion models C_A and C_B respectively, which convert their raw data into predefined features (matrices or vectors) of the same dimension, h_A and h_B. Each participant has exactly one conversion model C. The two parties have respective private feature models M_{A,private} and M_{B,private}, which act on the outputs of the respective conversion models and output features f_A and f_B; the private model structures may differ, and their structures and weights are the private data of each party, unknown to the other. The two parties also have respective local feature extraction models M_{A,local} and M_{B,local}, which share the same structure. The inputs and outputs of the above private and local feature extraction models have the same dimensions, respectively. In addition, party A has a projection model P_A and party B has a projection model P_B, acting on f_A and f_B respectively to generate projections z_A and z_B.
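A sketch of one participant's model stack under assumed dimensions (the real structures, especially M_private, are private to each party; everything here is illustrative):

```python
import torch
import torch.nn as nn

class Participant(nn.Module):
    def __init__(self, raw_dim: int, h_dim: int = 64, f_dim: int = 32, z_dim: int = 16):
        super().__init__()
        self.C = nn.Linear(raw_dim, h_dim)                        # conversion model
        self.M_private = nn.Sequential(nn.Linear(h_dim, f_dim),   # private extractor
                                       nn.ReLU())
        self.M_local = nn.Sequential(nn.Linear(h_dim, f_dim),     # local extractor
                                     nn.ReLU())
        self.P = nn.Linear(f_dim, z_dim)                          # projection model

    def forward(self, x: torch.Tensor):
        h = self.C(x)              # h_A (or h_B)
        f = self.M_private(h)      # f_A (or f_B)
        z = self.P(f)              # z_A (or z_B)
        return f, z
```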
Basic training process:
In the first step, the private models are trained based on aligned users.
Fig. 11 is a schematic flowchart of the private model training based on the alignment data according to the embodiment of the present application. During training, initialization is performed first: each participant initializes its respective models, M_{A,private} and M_{B,private}, P_A and P_B. Party A and party B achieve data alignment through encryption.
① Party A and party B obtain the data features f_A and f_B through their respective models M_{A,private} and M_{B,private}, and then obtain z_A and z_B through P_A and P_B respectively. Party B sends f_B to party A, and party A sends f_A to party B; f_A and f_B can be encrypted in multiple ways to further increase data security, such as by a differential privacy method.
② This step is the model updating process; its aim is to make the features of the same user as close as possible in the feature space, and the features of different users as far apart as possible. There are several ways to achieve this goal, such as by choosing a particular loss function; one typical way is as follows. Party A and party B compute a similarity loss function based on (z_A, f_B) and (z_B, f_A) respectively.
For two aligned samples f_i and z_j, the loss function takes the form of equation (5). The original equation images are not preserved in this text; a standard InfoNCE-style contrastive form consistent with the surrounding description (same-user features pulled together, different-user features pushed apart) is:

$$\ell(i,j) = -\log \frac{\exp\big(\mathrm{sim}(z_i, f_j)/\tau\big)}{\sum_{k} \exp\big(\mathrm{sim}(z_i, f_k)/\tau\big)} \qquad (5)$$

where sim(·,·) is a similarity measure (e.g. cosine similarity), τ is a temperature parameter, and i, j traverse the samples in party A's and party B's current training batch respectively. For a sample i of party A aligned with a sample j of party B, the loss is computed as above. The similarity loss of each aligned sample pair is computed and then summed to obtain the final result:

$$L_{\mathrm{sim}} = \sum_{(i,j)\ \text{aligned}} \ell(i,j)$$
③ Party A and party B compute their respective loss functions L_{A,sim} and L_{B,sim} based on (z_A, f_B) and (z_B, f_A), and update their respective models (M_{A,private}, P_A) and (M_{B,private}, P_B) by gradient descent.
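A hedged PyTorch sketch of the similarity loss described above, using cosine similarity and an assumed temperature τ (the patent does not fix either choice):

```python
import torch
import torch.nn.functional as F

def similarity_loss(z: torch.Tensor, f: torch.Tensor, tau: float = 0.1):
    """z: (N, d) projections of one party; f: (N, d) features received from
    the other party; row i of z is aligned with row i of f."""
    z = F.normalize(z, dim=-1)
    f = F.normalize(f, dim=-1)
    logits = z @ f.t() / tau                           # pairwise similarities
    labels = torch.arange(z.size(0), device=z.device)  # aligned pairs on the diagonal
    return F.cross_entropy(logits, labels)             # -log softmax at aligned index
```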
In the second step, the local models are trained based on local data.
Fig. 12 is a schematic flowchart of the local model training process based on local data according to an embodiment of the present application. The training process is initialized as follows: parties A and B have at this point completed the training of M_{A,private} and M_{B,private} in the first step. Here the training process is described only for party A by way of example; party B's process is the same.
① Party A, based on its local data (X_{A,aligned}, X_{A,unaligned}), obtains the corresponding features through M_{A,private} and M_{A,local} respectively, and then computes a contrastive loss function or a similarity loss function L over these features. There are various ways to bring the outputs of the two models close: for example, directly using a mean-squared-error loss function, or, if an additional projection model is added, pulling them together with the method of the first step.
② M_{A,private} and M_{A,local} are updated based on the loss function computed in the previous step; following the method of the previous step, whether to update M_{A,private} can be selected.
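A sketch of this local training step for party A under the simplest choice named above, an MSE loss pulling the local model's features toward the private model's features, with M_{A,private} kept frozen (updating it is the optional choice; all names are illustrative):

```python
import torch

def local_training_step(m_private, m_local, h_local, optimizer):
    with torch.no_grad():                    # freeze M_private (optional choice)
        f_private = m_private(h_local)
    f_local = m_local(h_local)
    loss = torch.nn.functional.mse_loss(f_local, f_private)
    optimizer.zero_grad()
    loss.backward()                          # updates M_local only
    optimizer.step()
    return loss.item()
```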
In the third step, the local models are aggregated and distributed.
Fig. 13 is a schematic flow chart of local model aggregation and distribution provided in the embodiment of the present application. During training, initialization is performed first: parties A and B have at this point completed the training of M_{A,local} and M_{B,local} in the second step.
First, each participant uploads its local model, M_{A,local} and M_{B,local}, to a server or a predetermined participant for model aggregation, where the aggregation is performed with a common model aggregation method such as FedAvg.
Second, the aggregated global model is distributed to each participant. One simple way is to directly replace the local model with the global model; alternatively, the method of the second step may be adopted, using the local data to update the local model from the global model.
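A minimal FedAvg sketch for this aggregation step (equal participant weights assumed; in practice weights are often proportional to local data size):

```python
import copy
import torch

def fed_avg(state_dicts):
    # Element-wise average of the participants' local model parameters.
    avg = copy.deepcopy(state_dicts[0])
    for key in avg:
        avg[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
    return avg

# Each participant then loads the global weights, e.g.:
#   m_local.load_state_dict(fed_avg([state_a, state_b]))
```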
Supervised training/tuning based on the pre-trained models. Fig. 14 is a schematic flow chart of labeled supervised learning based on a pre-trained model according to an embodiment of the present application. As shown in fig. 14, after pre-training is completed, the trained local model and the trained private model can be used jointly as the feature extraction model (M_{A,private}, M_{A,local}), and a classification model Q is added to form a complete model (C, M_{A,private}, M_{A,local}, Q). This model is used for local training based on limited labeled data, or for joint modeling together with the previously pre-trained participants. The outputs of M_{A,private} and M_{A,local} can be aggregated as needed for the scenario: for example, only M_{A,private} or only M_{A,local} may be used, or the sum of the two outputs, or the concatenation of the two outputs.
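A hedged sketch of the complete fine-tuning model (C, M_{A,private}, M_{A,local}, Q) with sum aggregation of the two extractors' outputs; concatenation would work too, with a correspondingly wider classifier input. All names and dimensions are illustrative:

```python
import torch
import torch.nn as nn

class FineTuneModel(nn.Module):
    def __init__(self, C, M_private, M_local, f_dim: int, n_classes: int):
        super().__init__()
        self.C, self.M_private, self.M_local = C, M_private, M_local
        self.Q = nn.Linear(f_dim, n_classes)     # classification model Q

    def forward(self, x: torch.Tensor):
        h = self.C(x)
        f = self.M_private(h) + self.M_local(h)  # sum aggregation of features
        return self.Q(f)
```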
Each step of the method can be extended to multiple parties, so the whole scheme also suits scenarios with more than two participants. By fully utilizing each participant's data, including unlabeled aligned data and unlabeled unaligned data, a feature extraction model with excellent performance can be obtained through pre-training, yielding highly discriminative features; determining the recommendation result based on these highly discriminative features can improve the recommendation success rate. Performing feature extraction based on this model can also solve the problem that federated modeling based on supervised learning performs poorly in scenarios with little label data.
The method provided by the embodiment of the present application combines self-supervised learning and model aggregation, making full use of unlabeled aligned data and unlabeled unaligned data; by improving data utilization efficiency, it improves the performance of the feature extraction model, and thereby the model's performance on the final task.
Continuing with the exemplary structure of the data processing apparatus 555 provided by the embodiment of the present application implemented as a software module, in some embodiments, as shown in fig. 2A, fig. 2A is a schematic structural diagram of the first participant device 400 provided by the embodiment of the present application, and the software module stored in the data processing apparatus 555 of the memory 540 may include:
a first obtaining module 5551, configured to obtain first feature data and second feature data that are held by a first party device, and an encryption feature that is sent by a second party device, where the encryption feature is obtained by performing encryption processing on third feature data that is held by the second party device, and the second feature data and the third feature data are data of different features of the same user;
a training module 5552, configured to train a preset first model based on the first feature data, the second feature data, and the encrypted feature, so as to obtain a trained first model;
a first sending module 5553, configured to send the trained first model to a target device, so that the target device aggregates the trained first model to obtain a global model; the target equipment is server equipment or participant equipment in the federal learning system;
a receiving module 5554, configured to receive the global model sent by the target device;
a second obtaining module 5555, configured to obtain attribute information of an object to be recommended;
the processing module 5556 is configured to process the attribute information by using the global model to obtain a recommendation result.
In some embodiments, the training module 5552 is further configured to:
respectively carrying out conversion processing on the first characteristic data and the second characteristic data by using a preset conversion model to obtain converted first characteristic data and converted second characteristic data;
performing feature extraction processing on the converted first feature data by using a preset first model to obtain a local feature corresponding to the first feature data;
determining a first private feature corresponding to the first feature data based on the converted first feature data, the converted second feature data and the encrypted feature;
processing the local feature and the first private feature by using a preset first loss function to determine a first loss value;
and back-propagating the first loss value to the preset first model so as to adjust the parameters of the preset first model and obtain the trained first model.
In some embodiments, the training module 5552 is further configured to:
training a preset second model based on the converted second feature data and the encrypted features to obtain a trained second model;
and performing feature extraction processing on the converted first feature data by using the trained second model to obtain a first private feature corresponding to the first feature data.
In some embodiments, the training module 5552 is further configured to:
performing projection processing on the converted second characteristic data by using a preset second model to obtain projection characteristics;
processing the projection characteristic and the encryption characteristic by using a preset second loss function to determine a second loss value;
and back-propagating the second loss value to the preset second model so as to adjust the parameters of the preset second model and obtain the trained second model.
In some embodiments, the training module 5552 is further configured to:
acquiring a preset second model, wherein the preset second model comprises an initial first sub-model and an initial second sub-model;
performing feature extraction processing on the converted second feature data by using the initial first sub-model to obtain a second private feature;
and performing projection processing on the second private features by using the initial second sub-model to obtain projection features.
In some embodiments, the training module 5552 is further configured to:
back-propagating the second loss value to the initial first sub-model to adjust the parameters of the initial first sub-model, obtaining a trained first sub-model;
back-propagating the second loss value to the initial second sub-model to adjust the parameters of the initial second sub-model, obtaining a trained second sub-model;
and determining the trained first sub-model and the trained second sub-model as a trained second model.
In some embodiments, the apparatus further comprises:
the adjusting module is used for back-propagating the first loss value to the trained first sub-model so as to adjust the parameters of the trained first sub-model to obtain an updated first sub-model;
accordingly, the training module 5552 is further configured to:
and performing feature extraction processing on the converted first feature data by using the updated first sub-model to obtain a first private feature corresponding to the first feature data.
In some embodiments, when the first participant device holds tag data, the processing module 5556 is further configured to:
acquiring training data and label data corresponding to the training data;
constructing an initial classification model based on the conversion model, the feature extraction model and a preset classifier, wherein the feature extraction model comprises the global model and/or the trained second sub-model;
training the initial classification model based on the training data and the label data to obtain a trained classification model;
and processing the attribute information by using the trained classification model to obtain a recommendation result.
In some embodiments, the processing module 5556 is further configured to:
processing the training data by using the initial classification model to obtain a training result;
processing the label data and the training result by using a preset third loss function to determine a third loss value;
and back-propagating the third loss value to the initial classification model so as to adjust the parameters of the initial classification model and obtain a trained classification model.
In some embodiments, as shown in fig. 2B, fig. 2B is a schematic structural diagram of the second participant device 410 provided in the embodiment of the present application, and the software modules stored in the data processing apparatus 555 of the memory 540 may include:
a third obtaining module 5557, configured to obtain third feature data held by the second participant device and a preset third model, where the preset third model includes an initial third sub-model;
the feature extraction module 5558 is configured to perform feature extraction processing on the third feature data by using the initial third sub-model to obtain a third private feature;
the encryption module 5559 is configured to encrypt the third private feature to obtain an encrypted feature;
a second sending module 5560, configured to send the encryption feature to the first participant device, so that the first participant device determines, based on the encryption feature, a global model for processing the attribute information of the object to be recommended.
It should be noted that the description of the apparatus in the embodiment of the present application is similar to the description of the method embodiment, and has similar beneficial effects to the method embodiment, and therefore, the description is not repeated.
The embodiment of the present application provides a computer program product, which includes a computer program, and the computer program, when executed by a processor, implements the data processing method provided by the embodiment of the present application.
Embodiments of the present application provide a computer-readable storage medium storing executable instructions, which when executed by a processor, will cause the processor to perform a method provided by embodiments of the present application, for example, a data processing method as shown in fig. 3.
In some embodiments, the computer-readable storage medium may be memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash, magnetic surface memory, optical disk, or CD-ROM; or may be various devices including one or any combination of the above memories.
In some embodiments, executable instructions may be written in any form of programming language (including compiled or interpreted languages), in the form of programs, software modules, scripts or code, and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
By way of example, executable instructions may correspond, but do not necessarily have to correspond, to files in a file system, and may be stored in a portion of a file that holds other programs or data, such as in one or more scripts in a HyperText Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
By way of example, executable instructions may be deployed to be executed on one computing device or on multiple computing devices at one site or distributed across multiple sites and interconnected by a communication network.
The above description is only an example of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, and improvement made within the spirit and scope of the present application are included in the protection scope of the present application.

Claims (13)

1. A data processing method, based on a federated learning system that includes a first participant device and at least one second participant device, the method being applied to the first participant device, the method comprising:
acquiring first feature data and second feature data held by the first participant device, and an encrypted feature sent by the second participant device, wherein the encrypted feature is obtained by encrypting third feature data held by the second participant device, and the second feature data and the third feature data are data of different features of the same user;
training a preset first model based on the first characteristic data, the second characteristic data and the encrypted characteristic to obtain a trained first model;
sending the trained first model to target equipment so that the target equipment aggregates the trained first model to obtain a global model; the target equipment is server equipment or participant equipment in the federal learning system;
receiving the global model sent by the target equipment;
and acquiring attribute information of an object to be recommended, and processing the attribute information by using the global model to obtain a recommendation result.
2. The method according to claim 1, wherein the training a preset first model based on the first feature data, the second feature data and the encrypted feature to obtain a trained first model comprises:
respectively carrying out conversion processing on the first characteristic data and the second characteristic data by using a preset conversion model to obtain converted first characteristic data and converted second characteristic data;
performing feature extraction processing on the converted first feature data by using a preset first model to obtain local features corresponding to the first feature data;
determining a first private feature corresponding to the first feature data based on the converted first feature data, the converted second feature data and the encrypted feature;
processing the local feature and the first private feature by using a preset first loss function to determine a first loss value;
and back-propagating the first loss value to the preset first model so as to adjust the parameters of the preset first model and obtain the trained first model.
3. The method according to claim 2, wherein the determining the first private feature corresponding to the first feature data based on the converted first feature data, the converted second feature data and the encrypted feature comprises:
training a preset second model based on the converted second feature data and the encrypted features to obtain a trained second model;
and performing feature extraction processing on the converted first feature data by using the trained second model to obtain a first private feature corresponding to the first feature data.
4. The method according to claim 3, wherein the training a preset second model based on the converted second feature data and the encrypted features to obtain a trained second model comprises:
performing projection processing on the converted second characteristic data by using a preset second model to obtain projection characteristics;
processing the projection characteristic and the encryption characteristic by using a preset second loss function to determine a second loss value;
and back-propagating the second loss value to the preset second model so as to adjust the parameters of the preset second model and obtain the trained second model.
5. The method according to claim 4, wherein the projecting the converted second feature data by using a preset second model to obtain a projection feature comprises:
acquiring a preset second model, wherein the preset second model comprises an initial first sub-model and an initial second sub-model;
performing feature extraction processing on the converted second feature data by using the initial first sub-model to obtain a second private feature;
and performing projection processing on the second private characteristic by using the initial second sub-model to obtain a projection characteristic.
6. The method of claim 5, wherein the back propagating the second loss value to the preset second model to adjust parameters of the preset second model to obtain a trained second model comprises:
back-propagating the second loss value to the initial first sub-model to adjust the parameters of the initial first sub-model, obtaining a trained first sub-model;
back-propagating the second loss value to the initial second sub-model to adjust the parameters of the initial second sub-model, obtaining a trained second sub-model;
and determining the trained first sub-model and the trained second sub-model as a trained second model.
7. The method of claim 6, further comprising:
back-propagating the first loss value to the trained first sub-model to adjust the parameters of the trained first sub-model, obtaining an updated first sub-model;
correspondingly, the performing feature extraction processing on the converted first feature data by using the trained second model to obtain a first private feature corresponding to the first feature data includes:
and performing feature extraction processing on the converted first feature data by using the updated first sub-model to obtain a first private feature corresponding to the first feature data.
8. The method of claim 6, wherein processing the attribute information using the global model to obtain a recommendation when the first participant device holds tag data comprises:
acquiring training data and label data corresponding to the training data;
constructing an initial classification model based on the conversion model, the feature extraction model and a preset classifier, wherein the feature extraction model comprises the global model and/or the trained second sub-model;
training the initial classification model based on the training data and the label data to obtain a trained classification model;
and processing the attribute information by using the trained classification model to obtain a recommendation result.
9. The method of claim 8, wherein training the initial classification model based on the training data and the label data to obtain a trained classification model comprises:
processing the training data by using the initial classification model to obtain a training result;
processing the label data and the training result by using a preset third loss function to determine a third loss value;
and back-propagating the third loss value to the initial classification model so as to adjust the parameters of the initial classification model and obtain a trained classification model.
10. A data processing apparatus, comprising:
the first obtaining module is used for obtaining first feature data and second feature data held by a first participant device, and an encrypted feature sent by a second participant device, wherein the encrypted feature is obtained by encrypting third feature data held by the second participant device, and the second feature data and the third feature data are data of different features of the same user;
the training module is used for training a preset first model based on the first characteristic data, the second characteristic data and the encrypted characteristic to obtain a trained first model;
the first sending module is used for sending the trained first model to target equipment so that the target equipment can aggregate the trained first model to obtain a global model; the target equipment is server equipment or participant equipment in the federal learning system;
the receiving module is used for receiving the global model sent by the target equipment;
the second acquisition module is used for acquiring the attribute information of the object to be recommended;
and the processing module is used for processing the attribute information by using the global model to obtain a recommendation result.
11. An electronic device, comprising:
a memory for storing executable instructions;
a processor for implementing the data processing method of any one of claims 1 to 9 when executing executable instructions stored in the memory.
12. A computer-readable storage medium storing executable instructions for implementing the data processing method of any one of claims 1 to 9 when executed by a processor.
13. A computer program product comprising a computer program, characterized in that the computer program realizes the data processing method of any one of claims 1 to 9 when executed by a processor.
CN202210198357.XA 2022-03-02 2022-03-02 Data processing method, device, equipment and computer readable storage medium Pending CN114547658A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210198357.XA CN114547658A (en) 2022-03-02 2022-03-02 Data processing method, device, equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210198357.XA CN114547658A (en) 2022-03-02 2022-03-02 Data processing method, device, equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN114547658A true CN114547658A (en) 2022-05-27

Family

ID=81662210

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210198357.XA Pending CN114547658A (en) 2022-03-02 2022-03-02 Data processing method, device, equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN114547658A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116863309A (en) * 2023-09-04 2023-10-10 中电科网络安全科技股份有限公司 Image recognition method, device, system, electronic equipment and storage medium
CN116863309B (en) * 2023-09-04 2024-01-09 中电科网络安全科技股份有限公司 Image recognition method, device, system, electronic equipment and storage medium
CN117853212A (en) * 2024-03-06 2024-04-09 之江实验室 Longitudinal federal financial wind control method based on knowledge migration and self-supervision learning

Similar Documents

Publication Publication Date Title
CN113159327B (en) Model training method and device based on federal learning system and electronic equipment
CN110189192B (en) Information recommendation model generation method and device
WO2021120676A1 (en) Model training method for federated learning network, and related device
US20220230071A1 (en) Method and device for constructing decision tree
US20230078061A1 (en) Model training method and apparatus for federated learning, device, and storage medium
CN111401558B (en) Data processing model training method, data processing device and electronic equipment
US20230023520A1 (en) Training Method, Apparatus, and Device for Federated Neural Network Model, Computer Program Product, and Computer-Readable Storage Medium
CN112085159B (en) User tag data prediction system, method and device and electronic equipment
JP7095140B2 (en) Multi-model training methods and equipment based on feature extraction, electronic devices and media
CN110797124A (en) Model multi-terminal collaborative training method, medical risk prediction method and device
CN111784001B (en) Model training method and device and computer readable storage medium
CN111081337B (en) Collaborative task prediction method and computer readable storage medium
CN112749749B (en) Classification decision tree model-based classification method and device and electronic equipment
CN114547658A (en) Data processing method, device, equipment and computer readable storage medium
CN112989399B (en) Data processing system and method
WO2023174036A1 (en) Federated learning model training method, electronic device and storage medium
CN111563267A (en) Method and device for processing federal characteristic engineering data
CN116186769A (en) Vertical federal XGBoost feature derivation method based on privacy calculation and related equipment
CN109614780B (en) Biological information authentication method and device, storage medium and electronic equipment
CN112949866A (en) Poisson regression model training method and device, electronic equipment and storage medium
CN113807157A (en) Method, device and system for training neural network model based on federal learning
CN116629379A (en) Federal learning aggregation method and device, storage medium and electronic equipment
CN114723012A (en) Computing method and device based on distributed training system
CN114996741A (en) Data interaction method, device, equipment and storage medium based on federal learning
CN113487423A (en) Personal credit risk prediction model training method and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination