CN117648702A - Data processing method, device, computer equipment and storage medium

Publication number: CN117648702A
Authority: CN (China)
Prior art keywords: sample, model, service, business, group
Legal status: Pending
Application number: CN202210981370.2A
Original language: Chinese (zh)
Inventors: 张弘, 黄东波
Applicant/Assignee: Tencent Technology Shenzhen Co Ltd

Classification: Management, Administration, Business Operations System, And Electronic Commerce
Abstract

The embodiment of the application discloses a data processing method, a device, a computer device and a storage medium, which can be applied to scenarios such as artificial intelligence and cloud technology, and which comprises the following steps: acquiring object platform features of a business object; acquiring, based on a first business model, business model parameters associated with N object groups and object group encoding features associated with the N object groups, the first business model and a second business model being obtained after federal learning is performed on sample objects; performing self-attention feature extraction processing on the business model parameters, the object group encoding features and the object platform features in the first business model to obtain object group attention features of the business object for the N object groups; and predicting, in the first business model, business index parameters of the business object for candidate business data based on candidate data features of the candidate business data, the object platform features and the object group attention features. By adopting the embodiment of the application, the data security of the business object can be improved.

Description

Data processing method, device, computer equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a data processing method, a data processing device, a computer device, and a storage medium.
Background
In an advertising scenario, a first business model deployed on a first device (e.g., the advertiser side) and a second business model deployed on a second device (e.g., the advertising platform side) may be trained in a federal learning manner. When predicting a business index parameter (e.g., a conversion rate) of a business object for candidate business data (e.g., certain advertisement data), the real-time request for the business object sent by the second device to the first device will include the object identifier of the business object, so that the first device predicts the business index parameter of the business object for the candidate advertisement data based on that object identifier. Since the first device and the second device need to transmit the object identifier of the business object during online application of the model, there is a risk that the object identifier of the business object is revealed, which reduces the data security of the business object.
Disclosure of Invention
The embodiment of the application provides a data processing method, a data processing device, computer equipment and a storage medium, which can improve the data security of a business object.
In one aspect, a method for processing data is provided, where the method is executed by a first device and includes:
acquiring object platform characteristics of a business object; the object platform features are obtained by extracting features of object information of the service object on the second device based on the second service model;
acquiring service model parameters associated with N object groups and object group coding features associated with the N object groups based on the first service model; n is a positive integer greater than 1; the first business model and the second business model are obtained after federal learning is carried out on the sample object;
performing self-attention feature extraction processing on service model parameters, object group coding features and object platform features in a first service model to obtain object group attention features of a service object for N object groups;
and predicting the service index parameters of the service object aiming at the candidate service data in the first service model based on the candidate data characteristics, the object platform characteristics and the object group attention characteristics of the candidate service data.
In one aspect, a method for processing data is provided, where the method is executed by a first device and includes:
acquiring, based on first sample information of the sample object on the first device and a first initial model, sample group coding features associated with N sample groups and initial model parameters associated with the N sample groups; N is the number of object groups configured for the first initial model, and N is a positive integer greater than 1; the sample label of the first sample information is used for representing the actual index parameter of the sample object for the sample service data;
in a first initial model, carrying out self-attention feature extraction processing on sample platform features, sample group coding features and initial model parameters of a sample object to obtain sample group attention features of the sample object for N sample groups; the sample platform features are obtained by extracting features of second sample information of the sample object on the second device based on a second initial model by the second device;
predicting a prediction index parameter of a sample object for sample service data in a first initial model based on sample data characteristics, sample platform characteristics and sample group attention characteristics of the sample service data;
determining a model total loss corresponding to the first initial model based on the prediction index parameter, the actual index parameter, the sample set coding feature and the sample set attention feature; the model total loss is used for indicating the second equipment to perform federal learning training on the second initial model to obtain a second service model;
performing federal learning training on the first initial model based on the total model loss and the sample group attention features to obtain a first business model; the first business model is used for predicting business index parameters of the business object.
An aspect of an embodiment of the present application provides a data processing apparatus, including:
the platform characteristic acquisition module is used for acquiring object platform characteristics of the business object; the object platform features are obtained by extracting features of object information of the service object on the second device based on the second service model;
the service parameter acquisition module is used for acquiring service model parameters associated with the N object groups and object group coding features associated with the N object groups based on the first service model; n is a positive integer greater than 1; the first business model and the second business model are obtained after federal learning is carried out on the sample object;
the object group feature determining module is used for carrying out self-attention feature extraction processing on the service model parameters, the object group coding features and the object platform features in the first service model to obtain object group attention features of the service object for N object groups;
and the index parameter prediction module is used for predicting the service index parameters of the service object aiming at the candidate service data in the first service model based on the candidate data characteristics, the object platform characteristics and the object group attention characteristics of the candidate service data.
The object platform features are obtained by the first device when obtaining a traffic request for the business object; the traffic request of the business object belongs to X traffic requests sent by the second device, where X is a positive integer. The X traffic requests are screened out, when the second device generates traffic requests corresponding to Y initial objects respectively, based on the initial platform features corresponding to the Y initial objects and a third service model deployed on the second device; Y is a positive integer greater than or equal to X. The initial platform features of an initial object are obtained by performing feature extraction on the object information of the initial object on the second device based on the second service model.
The service model parameters comprise object group pooling vectors corresponding to N object groups respectively; the object group coding feature comprises N object group coding vectors corresponding to the N object groups respectively; the N object groups comprise an object group i; i is a positive integer less than or equal to N;
the object group feature determination module includes:
a platform feature input unit for inputting object platform features to a self-attention layer in the first business model;
a vector acquisition unit, for acquiring an object group pooling vector K_i of the object group i from the N object group pooling vectors, and acquiring an object group encoding vector V_i of the object group i from the N object group encoding vectors;
a correlation processing unit, for performing correlation processing on the object platform features, the object group pooling vector K_i and the object group encoding vector V_i through the self-attention layer, to obtain an attention coefficient W_i corresponding to the object group i;
and a summation processing unit, for obtaining, when the attention coefficients corresponding to each of the N object groups are obtained, the object group attention features of the business object for the N object groups by performing summation processing on the N attention coefficients through the self-attention layer.
Wherein, this index parameter prediction module includes:
the splicing processing unit is used for splicing the object platform characteristics and the object group attention characteristics through the characteristic splicing layer in the first service model to obtain service splicing characteristics corresponding to the service objects;
the candidate data feature acquisition unit is used for acquiring candidate data features obtained after feature extraction of the candidate business data;
the index parameter prediction unit is used for inputting the service splicing features and the candidate data features into the multi-layer perceptron in the first service model, performing feature extraction processing on the service splicing features and the candidate data features through the multi-layer perceptron, and predicting the service index parameters of the service object for the candidate service data, as sketched below.
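For intuition, a minimal numpy sketch of this prediction step follows. The function name, layer sizes, weight layout and the sigmoid output head are illustrative assumptions and are not taken from the application itself.

```python
import numpy as np

def mlp_predict_index_parameter(platform_feat, group_attention_feat,
                                candidate_feat, weights):
    """Splice the object platform feature with the object group attention
    feature, append the candidate data feature, and run a two-layer
    perceptron; a sketch, assuming weights = ((w1, b1), (w2, b2))."""
    (w1, b1), (w2, b2) = weights
    splice = np.concatenate([platform_feat, group_attention_feat])  # service splicing feature
    x = np.concatenate([splice, candidate_feat])
    h = np.maximum(0.0, x @ w1 + b1)             # hidden layer (ReLU)
    logit = h @ w2 + b2
    return 1.0 / (1.0 + np.exp(-logit))          # e.g., a predicted conversion rate
```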
The number of the candidate business data is M, and M is a positive integer;
the apparatus further comprises:
the decision parameter determining module is used for taking the M business index parameters as M decision parameters when the business index parameters of the business object aiming at each candidate business data in the M candidate business data are acquired based on the first business model;
the decision parameter return module is used for returning the M decision parameters to the second device, so that the second device screens out, from the M candidate service data, the service data to be displayed that is to be sent to the service object; the service data to be displayed is the P top-ranked candidate service data obtained based on the data sorting result; the data sorting result is obtained by the second device after sorting the M candidate service data based on the sorting parameters corresponding to the M candidate service data respectively; the sorting parameter of a candidate service data is determined based on the decision parameter of that candidate service data; P is a positive integer less than or equal to M. A sketch of this screening step follows.
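A minimal sketch of the top-P screening described above, assuming for illustration that each decision parameter is used directly as its candidate's sorting parameter (the application only states that the latter is determined from the former):

```python
def screen_to_display(candidate_ids, decision_params, p):
    """Rank the M candidates by sorting parameter and keep the top P."""
    order = sorted(range(len(candidate_ids)),
                   key=lambda m: decision_params[m], reverse=True)
    return [candidate_ids[m] for m in order[:p]]
```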
An aspect of an embodiment of the present application provides a data processing apparatus, including:
an initial parameter acquisition module for acquiring sample group coding features associated with the N sample groups and initial model parameters associated with the N sample groups based on first sample information and a first initial model of the sample object on the first device; n is the number of object groups configured by the first initial model, and N is a positive integer greater than 1; the sample label of the first sample information is used for representing the actual index parameter of the sample object aiming at the sample service data;
The sample group feature determining module is used for carrying out self-attention feature extraction processing on sample platform features, sample group coding features and initial model parameters of the sample object in the first initial model to obtain sample group attention features of the sample object for N sample groups; the sample platform features are obtained by extracting features of second sample information of the sample object on the second device based on a second initial model by the second device;
the prediction module is used for predicting prediction index parameters of the sample object for the sample service data based on the sample data characteristics, the sample platform characteristics and the sample group attention characteristics of the sample service data in the first initial model;
the loss determination module is used for determining the model total loss corresponding to the first initial model based on the prediction index parameter, the actual index parameter, the sample group coding characteristic and the sample group attention characteristic; the model total loss is used for indicating the second equipment to perform federal learning training on the second initial model to obtain a second service model;
the training module is used for performing federal learning training on the first initial model based on the total model loss and the attention characteristics of the sample group to obtain a first service model; the first business model is used for predicting business index parameters of the business object.
Wherein the apparatus further comprises:
the second identification set acquisition module is used for acquiring a second object identification set associated with the second equipment based on the sample alignment request when the sample alignment request sent by the second equipment is acquired; the object identifications in the second set of object identifications are object identifications of historical objects recorded on the second device;
a first identification set acquisition module for acquiring a first object identification set associated with a first device based on object identifications of historical objects recorded on the first device;
the alignment processing module is used for performing alignment processing on the first object identification set and the second object identification set to obtain an identification intersection set, and taking a historical object corresponding to the identification object in the identification intersection set as a sample object with an intersection relation with the second device; the identification objects in the identification intersection set belong to a first object identification set and a second object identification set;
an identification intersection return module is used for returning the identification intersection set to the second device so that the second device can determine the sample platform characteristics of the sample objects in the identification intersection set.
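The alignment processing performed by the modules above boils down to a set intersection over object identifications; a minimal sketch follows (plaintext identifiers are assumed here purely for illustration, whereas the application exchanges them in encrypted form, as described next):

```python
def align_sample_objects(first_ids, second_ids):
    """Historical objects recorded on both devices become sample objects."""
    first_set = set(first_ids)        # first object identification set
    second_set = set(second_ids)      # second object identification set
    return first_set & second_set     # identification intersection set
```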
The second identifier set obtaining module includes:
the encryption information acquisition unit is used for acquiring the identification encryption information in the sample alignment request when the sample alignment request sent by the second equipment is acquired; the identification encryption information is obtained by the second equipment after encryption processing is carried out on the identification signature information and the platform object identification set based on the first public key of the first equipment; the identification signature information is obtained after the second equipment signs the platform object identification set based on a second private key of the second equipment;
The decryption processing unit is used for decrypting the identification encryption information based on the first private key of the first device to obtain identification signature information and a platform object identification set;
the signature verification processing unit is used for acquiring a second public key of the second equipment, and carrying out signature verification processing on the identification signature information based on the second public key to obtain a signature verification result;
and the identification set determining unit is used for taking the platform object identification set as a second object identification set associated with the second device when the signature verification result indicates that the signature verification is successful.
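A hedged sketch of the sign-then-encrypt exchange these units describe, using the Python cryptography library. The hybrid construction (a Fernet key sealed with RSA-OAEP, since RSA alone cannot encrypt a large identification set) and all names are this sketch's assumptions; the application itself only specifies signing with the second private key, encrypting with the first public key, and the reverse decrypt-and-verify path.

```python
from cryptography.exceptions import InvalidSignature
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

PSS = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                  salt_length=padding.PSS.MAX_LENGTH)
OAEP = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

def pack_alignment_request(id_set, second_private_key, first_public_key):
    """Second device: sign the platform object identification set with its own
    private key, then encrypt the signature and the set for the first device."""
    payload = ",".join(sorted(id_set)).encode()
    signature = second_private_key.sign(payload, PSS, hashes.SHA256())
    blob = len(signature).to_bytes(2, "big") + signature + payload
    sym_key = Fernet.generate_key()               # hybrid-encryption assumption
    return (first_public_key.encrypt(sym_key, OAEP),   # identification encryption info
            Fernet(sym_key).encrypt(blob))

def unpack_alignment_request(sealed_key, token, first_private_key, second_public_key):
    """First device: decrypt with its private key, then verify the signature
    with the second public key before trusting the identification set."""
    sym_key = first_private_key.decrypt(sealed_key, OAEP)
    blob = Fernet(sym_key).decrypt(token)
    sig_len = int.from_bytes(blob[:2], "big")
    signature, payload = blob[2:2 + sig_len], blob[2 + sig_len:]
    try:
        second_public_key.verify(signature, payload, PSS, hashes.SHA256())
    except InvalidSignature:
        return None                                # signature verification failed
    return set(payload.decode().split(","))        # platform object identification set

# Usage with freshly generated keys (for illustration only):
second_priv = rsa.generate_private_key(public_exponent=65537, key_size=2048)
first_priv = rsa.generate_private_key(public_exponent=65537, key_size=2048)
sealed, tok = pack_alignment_request({"u1", "u2"}, second_priv, first_priv.public_key())
ids = unpack_alignment_request(sealed, tok, first_priv, second_priv.public_key())
```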
Wherein the number of sample objects is H, H is a positive integer greater than or equal to N; the first sample information comprises first sample sub-information of H sample objects respectively on the first device;
the initial parameter acquisition module comprises:
the grouping unit is used for acquiring the number of object groups configured for the first initial model, and grouping H sample objects based on the number of the object groups to obtain N sample groups; the sample group includes sample group j; j is a positive integer less than or equal to N;
a sample group encoding vector determining unit, for acquiring the first sample sub-information associated with each sample object in the sample group j from the H pieces of first sample sub-information, and performing feature extraction processing on the acquired first sample sub-information to obtain a sample group encoding vector V_j corresponding to the sample group j;
The sample group coding feature determining unit is used for obtaining sample group coding features associated with the N sample groups when sample group coding vectors corresponding to each sample group in the N sample groups are obtained;
an initial parameter determination unit for determining initial model parameters associated with the N sample groups based on the sample group encoding features.
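The grouping and feature extraction performed by the units above can be sketched as follows. The application leaves the grouping method open (self-attention based grouping or another clustering algorithm), so plain k-means over per-object feature vectors is assumed here, with mean pooling producing each group's encoding vector:

```python
import numpy as np

def group_samples(sample_feats, n_groups, iters=10, seed=0):
    """Divide H sample objects into N sample groups; each group's encoding
    vector V_j is the mean of its members' features. A sketch, assuming
    k-means as the clustering algorithm."""
    h = sample_feats.shape[0]
    rng = np.random.default_rng(seed)
    centroids = sample_feats[rng.choice(h, size=n_groups, replace=False)].copy()
    for _ in range(iters):
        dists = ((sample_feats[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(axis=1)              # which sample group j each object joins
        for j in range(n_groups):
            members = sample_feats[assign == j]
            if len(members):                       # keep the old centroid if a group empties
                centroids[j] = members.mean(axis=0)
    return assign, centroids                       # assignments and V_1..V_N
```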
Wherein the loss determination module comprises:
a first model loss determination unit configured to determine a first model loss of the first initial model based on the sample group encoding feature and the sample group attention feature;
a second loss determination unit configured to determine a second model loss of the second initial model based on the prediction index parameter and the actual index parameter;
and the total loss determining unit is used for carrying out superposition processing on the first model loss and the second model loss to obtain the model total loss corresponding to the first initial model.
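A sketch of the superposition these units describe. The application does not give the functional forms of the two losses, so both are assumptions here: a grouping regularizer stands in for the first model loss, and binary cross-entropy for the second model loss.

```python
import numpy as np

def model_total_loss(pred, actual, group_encodings, group_attention):
    """Superpose an assumed first model loss (pull the sample group attention
    feature toward its nearest group encoding) and an assumed second model
    loss (cross-entropy between predicted and actual index parameters)."""
    nearest = group_encodings[
        ((group_encodings - group_attention) ** 2).sum(axis=1).argmin()]
    first_loss = float(((group_attention - nearest) ** 2).mean())
    eps = 1e-7
    second_loss = float(-(actual * np.log(pred + eps)
                          + (1 - actual) * np.log(1 - pred + eps)))
    return first_loss + second_loss                # superposition of the two losses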
Wherein, this training module includes:
the training result determining unit is used for performing federal learning training on the first initial model based on the total model loss to obtain a model training result;
the parameter adjusting unit is used for adjusting the initial model parameters based on the attention characteristics of the sample group and the total model loss if the model training result indicates that the trained first initial model does not meet the model convergence condition associated with the first initial model;
And the business model determining unit is used for taking the adjusted first initial model as a transition model, performing federal learning training on the transition model, and taking the transition model meeting the model convergence condition as the first business model until the transition model after federal learning training meets the model convergence condition.
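The train-check-adjust cycle of this module can be sketched as a control loop; the three hooks are hypothetical stand-ins for the first device's actual training code.

```python
def federated_training_loop(train_round, converged, adjust_parameters,
                            max_rounds=1000):
    """train_round() runs one round of federal learning training and returns
    the model total loss, converged(loss) checks the model convergence
    condition, and adjust_parameters(loss) adjusts the transition model's
    parameters. All three are assumed callables."""
    for _ in range(max_rounds):
        loss = train_round()
        if converged(loss):
            return loss      # the transition model now serves as the first business model
        adjust_parameters(loss)
    raise RuntimeError("model convergence condition not met within max_rounds")
```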
In one aspect, the present application provides a computer device comprising: a processor, a memory, a network interface;
the processor is connected with the memory and the network interface, wherein the network interface is used for providing a data communication function, the memory is used for storing a computer program, and the processor is used for calling the computer program to enable the computer device to execute the method provided by the embodiment of the application.
In one aspect, the present application provides a computer readable storage medium storing a computer program adapted to be loaded and executed by a processor, so that a computer device having the processor performs the method provided in the embodiments of the present application.
In one aspect, the present application provides a computer program product comprising a computer program stored on a computer readable storage medium; the processor of the computer device reads the computer program from the computer-readable storage medium, and the processor executes the computer program, so that the computer device performs the method in the embodiment of the present application.
In the embodiment of the application, when the first device predicts the business index parameter of the business object for the candidate business data, the first device does not need to obtain the object identifier of the business object; instead, it obtains the object platform features of the business object sent by the second device. The object platform features are obtained by directly performing feature extraction on the object information of the business object on the second device based on the second business model of the second device. Because the first business model obtained through federal learning training has an object grouping function, that is, the sample objects participating in the training can be divided into N object groups (N being a positive integer greater than 1), the first device can obtain, based on the first business model, business model parameters associated with the N object groups and object group encoding features associated with the N object groups, so that the first device can perform self-attention feature extraction processing on the business model parameters, the object group encoding features and the object platform features in the first business model to obtain the object group attention features of the business object for the N object groups. Since the object group attention features can be used to replace the object identifier of the business object, the risk that the object identifier of the business object is leaked can be greatly reduced when the business index parameter is predicted, in the first business model, based on the candidate data features of the candidate business data, the object platform features and the object group attention features, thereby improving the data security of the business object.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic structural diagram of a network architecture according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a federal learning-based model application provided in an embodiment of the present application;
FIG. 3 is a schematic flow chart of a data processing method according to an embodiment of the present application;
FIG. 4 is a diagram of data interaction in an advertising scenario provided by an embodiment of the present application;
FIG. 5 is a schematic flow chart of a data processing method according to an embodiment of the present application;
FIG. 6 is a schematic view of a scenario for determining a sample object according to an embodiment of the present application;
FIG. 7 is a block diagram of the nature of a self-attention mechanism provided by an embodiment of the present application;
FIG. 8 is a schematic diagram of model training based on federal learning according to an embodiment of the present application;
FIG. 9 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;
FIG. 10 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;
FIG. 11 is a schematic diagram of a computer device provided in an embodiment of the present application;
FIG. 12 is a schematic diagram of a data processing system according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
It should be appreciated that embodiments of the present application provide a federal learning method based on object grouping, which is applicable to the field of artificial intelligence. Artificial intelligence (AI) is a theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use the knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the study of the design principles and implementation methods of various intelligent machines, enabling the machines to have the functions of perception, reasoning and decision-making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level technologies and software-level technologies. Basic artificial intelligence technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include directions such as computer vision technology, speech processing technology, natural language processing technology, machine learning/deep learning, autonomous driving, and intelligent transportation.
Machine Learning (ML) is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other disciplines. It specializes in studying how a computer simulates or implements human learning behaviors to acquire new knowledge or skills, and how it reorganizes existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied throughout all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstrations.
Federal learning is a distributed machine learning paradigm involving two or more participants, who perform joint machine learning through a secure algorithm protocol. It can combine multi-party data sources for modeling and provide joint model inference services while each participant's raw data stays local and is not transmitted. For example, in the embodiment of the present application, the first service model deployed on the first device and the second service model deployed on the second device may be obtained after federal learning on sample objects, where a sample object belongs to the history objects having an intersection relationship between the records of the first device and the second device, that is, the sample object is recorded on both the first device and the second device.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a network architecture according to an embodiment of the present application. As shown in fig. 1, the network architecture may include a server 12F, a server that belongs to the same federal platform as the server 12F, and a cluster of terminal devices. It will be appreciated that the number of servers belonging to the same federal platform as the server 12F may be one or more, and for convenience of explanation, one server (e.g., the server 11F shown in fig. 1) is taken as an example in the embodiment of the present application, to describe a specific implementation of federal learning between the server 11F and the server 12F. The terminal device cluster shown in fig. 1 may include one or more terminal devices, and the terminal device cluster may specifically include a terminal device 100a, a terminal device 100b, terminal devices 100c, …, and a terminal device 100n.
As shown in fig. 1, the terminal devices 100a, 100b, 100c, …, 100n may each establish a network connection with the above-mentioned server 11F, so that each terminal device may perform data interaction with the server 11F through the network connection. In addition, the terminal devices 100a, 100b, 100c, …, and 100n may each also establish a network connection with the server 12F, so that each terminal device may perform data interaction with the server 12F through the network connection. It should be understood that the network connection is not limited to a specific connection manner: it may be a direct or indirect connection through wired communication, a direct or indirect connection through wireless communication, or another manner, which is not limited herein.
Wherein each terminal device in the terminal device cluster may include: smart terminals with data processing functions such as smart phones, tablet computers, notebook computers, desktop computers, smart speakers, smart watches, vehicle-mounted terminals, smart televisions and the like. It should be appreciated that each terminal device in the cluster of terminal devices shown in fig. 1 may be provided with an application client that, when running in each terminal device, may interact with the server (e.g., server 11F or server 12F) shown in fig. 1, respectively, as described above. The application clients may include, among other things, social clients, multimedia clients (e.g., video clients), entertainment clients (e.g., game clients), information flow clients, educational clients, live clients, and the like. The application client may be an independent client, or may be an embedded sub-client integrated in a client (for example, a social client, an educational client, and a multimedia client), which is not limited herein.
As shown in fig. 1, in the embodiment of the present application, the server 11F and the server 12F may be independent physical servers, may be a server cluster or a distributed system formed by a plurality of physical servers, or may be a cloud server that provides a cloud computing service. The number of terminal devices and servers is not limited in the embodiment of the application.
It is to be understood that the server 11F and the server 12F may have different object information of the history object recorded therein, respectively. In the specific embodiments of the present application, related data such as object information is related, and when the embodiments of the present application are applied to specific products or technologies, user permission or consent needs to be obtained, and the collection, use and processing of related data all comply with related laws and regulations and standards of related countries and regions.
In a data recommendation scenario (e.g., an advertisement delivery scenario), the server 11F may be the computer device corresponding to the advertiser side, mainly responsible for part of the training of the federal model (e.g., training of the first business model), part of the online reasoning, auditing of privacy compliance, and presentation of related reports. Advertisers are the sponsors of advertising campaigns, the merchants selling or advertising their products and services on the internet, and the providers of affiliate marketing advertisements. In other words, merchants who promote or sell their products or services can act as advertisers, also referred to as clients in the advertising system. The object information (i.e., first object information) of the history object recorded by the server 11F may include an object identification (e.g., an object ID) of the history object and first history data information. The first history data information here may include history object data information and history business data information. For example, the history object data information refers to the merchandise purchased by the history object from the advertiser, more detailed features of the merchandise (price, reviews, etc.), and labels indicating deep conversion events of the history object (e.g., whether the history object made a purchase or added items to a shopping cart, etc.). The history business data information here may include data information associated with advertisement data (such as advertisement pictures, advertisement titles, etc.).
In the data recommendation scenario (e.g., the advertisement delivery scenario), the server 12F may be the computer device corresponding to the advertisement platform side, mainly responsible for part of the training of the federal model (e.g., training of the second business model), accessing real-time requests, part of the online reasoning, requesting features, and issuing the object identifiers of the sample groups. An advertising platform refers to a platform for meeting advertisers' needs to place advertisements and for providing traffic distribution. The object information (i.e., second object information) of the history object recorded by the server 12F may include the object identification of the history object and second history data information. The second history data information here may include history object basic information and the data information shared with the server 11F (i.e., the above-described history business data information). The history object basic information here may include the age, gender, and contextual characteristics of the history object (e.g., which articles the history object viewed before and after triggering the sample business data, the time of triggering the sample business data, the geographic location of the history object when triggering the sample business data, etc.).
For ease of understanding, embodiments of the present application may select one terminal device from the plurality of terminal devices shown in fig. 1 as an object device of a service object (e.g., a user accessing an application client in real time). For example, the embodiment of the application may use the terminal device 100a shown in fig. 1 as an object device, where the object device may be integrated with an application client, and the object device may implement data interaction between a service data platform corresponding to the application client and the server 11F.
It should be appreciated that embodiments of the present application may refer to the computer device deployed with the first business model as a first device (e.g., the server 11F shown in fig. 1), where the first business model refers to a business model for predicting business index parameters (e.g., conversion rate, trigger rate, bid, etc.). The embodiments of the present application may also refer to the computer device deployed with the second service model as a second device (e.g., the server 12F shown in fig. 1), where the second service model is a service model for extracting features from object information on the second device. The first service model and the second service model may be obtained after federal learning is performed on the sample object. In addition, in order to reduce the occupation of transmission bandwidth, the second device in the embodiments of the present application may further be deployed with a third service model, where the third service model is a service model (also called a crowd screening model) for reducing the request magnitude when the number of traffic requests is large.
When predicting business index parameters of a business object (e.g., object a) for candidate business data (e.g., certain advertisement data), object platform features of the business object need to be obtained. The object platform feature is obtained by extracting the feature of the object information of the service object on the server 12F based on the second service model by the server 12F. It can be understood that, since the first service model completed by the federal learning training has an object grouping function, i.e. the sample objects involved in the training are divided into N object groups, where N is a positive integer greater than 1, the server 11F may obtain, based on the first service model, service model parameters associated with the N object groups and object group encoding features associated with the N object groups. Further, the server 11F may perform a self-attention feature extraction process on the service model parameters, the object group coding features and the object platform features in the first service model to obtain object group attention features of the service object for the N object groups, and then predict the service index parameters of the service object for the candidate service data based on the candidate data features, the object platform features and the object group attention features of the candidate service data in the first service model.
Therefore, the server 11F in the embodiment of the present application does not need to obtain the object identifier of the service object, but obtains the object platform feature of the service object through the server 12F, so that the server 11F predicts the service index parameter based on the object platform feature and the first service model, thereby greatly reducing the risk of leakage of the object information of the service object, and improving the data security of the service object.
For ease of understanding, further, please refer to fig. 2, which is a schematic diagram of a model application based on federal learning according to an embodiment of the present application. As shown in fig. 2, in the advertisement delivery scenario, the device 21F in the embodiment of the present application may be a first device corresponding to the advertiser side, that is, a computer device deployed with a service model 210m (that is, a first service model); for example, the device 21F may be the server 11F shown in fig. 1. The device 22F in the embodiment of the present application may be a second device corresponding to the advertisement platform side, that is, a computer device deployed with the service model 220m (that is, the second service model); for example, the device 22F may be the server 12F shown in fig. 1.
Here, the business model 210m and the business model 220m are obtained after federal learning is performed on the sample object. As shown in fig. 2, the business model 210m may include a network layer L_1 (Attention, i.e., a self-attention layer), a network layer L_2 (Concatenate, i.e., a feature concatenation layer) and a network layer L_3 (Multi-Layer Perceptron, MLP for short, i.e., a multi-layer perceptron). A multi-layer perceptron may also be included in the business model 220m. It should be appreciated that upon completion of federal learning training of the business model 210m by the device 21F, the device 21F may refer to the N sample groups into which the sample objects were divided as N object groups, where N is a positive integer greater than 1. For example, the N object groups may specifically include object group 1, object group 2, …, and object group N.
The database 2100K shown in fig. 2 may be a database corresponding to the device 21F, and the database 2100K may store the first object information of the history object; for example, the first object information here may include an object identification of the history object and first history data information of the history object. Further, upon completion of the first business model training, the database 2100K may store business model parameters associated with the N object groups (object group pooling features, such as feature 2K shown in fig. 2) and object group encoding features associated with the N object groups (such as feature 2V shown in fig. 2). The feature 2K may be composed of the object group pooling vector K_1 corresponding to object group 1, the object group pooling vector K_2 corresponding to object group 2, …, and the object group pooling vector K_N corresponding to object group N. The feature 2V may be composed of the object group encoding vector V_1 corresponding to object group 1, the object group encoding vector V_2 corresponding to object group 2, …, and the object group encoding vector V_N corresponding to object group N. The database 2200K shown in fig. 2 may be a database corresponding to the device 22F, and the database 2200K may store second object information of the historical object; for example, the second object information here may include an object identifier of the historical object and second historical data information of the historical object.
In the advertisement delivery scenario, the device 21F may obtain the object platform features of a business object sent by the device 22F when the business object accesses an application client associated with the advertisement platform corresponding to the device 22F. The object platform features are obtained by the device 22F performing feature extraction, based on the service model 220m shown in fig. 2, on the object information (for example, the object information 22X) of the business object obtained from the database 2200K. The object information 22X may include object information of various dimensions, such as the object identification of the business object, the age, gender and contextual characteristics of the business object, advertisement data information historically pushed to the business object, and the like.
It should be appreciated that in the specific embodiment of the present application, related data such as object information of a business object, and when the above embodiments of the present application are applied to specific products or technologies, it is required to obtain permission or consent of a user (i.e. the business object), and the collection, use and processing of related data all comply with related laws and regulations and standards of related countries and regions.
Further, the device 21F may obtain, from the database 2100K, the feature 2K associated with the N object groups and the feature 2V associated with the N object groups based on the business model 210m, and further perform, in the business model 210m, a self-attention feature extraction process on the feature 2K, the feature 2V, and the object platform feature (for example, the feature 2Q shown in fig. 2) to obtain an object group attention feature (for example, the feature 2W shown in fig. 2) of the business object for the N object groups.
Meanwhile, the device 21F may perform feature extraction processing on the service data 20S to obtain candidate data features (for example, feature 2S shown in fig. 2) corresponding to the service data 20S. At this time, the device 21F may predict the business index parameters of the business object for the service data 20S based on the feature 2S, the object platform feature, and the feature 2W in the business model 210m. In other words, the device 21F may, at the network layer L_2 of the business model 210m, perform feature stitching processing on the feature 2W and the object platform features to obtain service stitching features (for example, feature 2P) corresponding to the business object, and then, at the network layer L_3 of the business model 210m, perform feature extraction processing on the feature 2P and the feature 2S to predict the business index parameters of the business object for the service data 20S.
It can be seen that, when predicting the business index parameters of the business object for the service data 20S, the object identifier of the business object does not need to be transmitted between the device 21F and the device 22F; instead, the object platform feature obtained through the business model 220m deployed on the device 22F is transmitted between the two devices, and the object group attention feature used to replace the object identifier is obtained through the business model 210m deployed on the device 21F together with the object platform feature. As a result, when the business index parameters of the business object are predicted through the object group attention feature, the object platform feature and the feature 2S of the service data 20S, the risk that the object information of the business object is revealed can be greatly reduced, so that the user privacy of the business object is effectively protected and the data security of the business object is improved.
The specific implementation manner of predicting the business index parameters of the business object for the candidate business data by using the user group information (e.g., the object group attention feature) to replace the object identifier of the business object on the basis of performing federal learning by the first device deployed with the first business model and the second device deployed with the second business model can be seen in the embodiments corresponding to fig. 3-8 below.
Further, referring to fig. 3, fig. 3 is a flow chart of a data processing method according to an embodiment of the present application. As shown in fig. 3, the method may be performed by the first device, which may be a terminal device deployed with the first service model, or may be a server deployed with the first service model (for example, the server 11F shown in fig. 1) and is not limited herein. For easy understanding, the embodiment of the present application is described by taking the method performed by the server deployed with the first service model as an example, where the method may at least include the following steps S101 to S104:
step S101, obtaining object platform characteristics of a business object.
The object platform features are obtained by the second device performing feature extraction on the object information of the business object on the second device based on the second business model. Specifically, when the business object accesses an application client associated with the business platform (e.g., an advertising platform) corresponding to the second device, the second device may obtain the object information (i.e., the second object information) of the business object recorded on the second device. Further, the second device may perform feature extraction processing on the object information of the business object based on the second business model deployed on the second device to obtain the object platform features of the business object, and then generate a traffic request for the business object based on the object platform features. At this time, the second device may send the traffic request for the business object to the first device, so that the first device obtains the object platform features of the business object from the traffic request.
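For illustration, a minimal sketch of such a traffic request follows; the payload carries only the platform-side embedding, never the object identifier. The class, field and function names are hypothetical, not taken from the application.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class TrafficRequest:
    """Request sent from the second device to the first device: it carries
    the platform-side embedding of the business object, not its object ID."""
    object_platform_feature: List[float]

def build_traffic_request(extract_platform_feature, object_info):
    # extract_platform_feature stands in for the second business model's
    # feature extraction over the object information recorded on the second device
    return TrafficRequest(object_platform_feature=extract_platform_feature(object_info))
```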
The object platform features are obtained by the first device when it obtains the traffic request for the business object; the traffic request of the business object belongs to X traffic requests sent by the second device, where X is a positive integer. The X traffic requests are screened out, when the second device generates traffic requests corresponding to Y initial objects respectively, based on the initial platform features corresponding to the Y initial objects and a third service model deployed on the second device; Y is a positive integer greater than or equal to X. The initial platform features of an initial object are obtained by performing feature extraction on the object information of the initial object on the second device based on the second service model.
It can be understood that, at a certain moment, the second device in the embodiment of the present application may generate a large number of traffic requests for the multiple business objects accessing the service platform in real time, and may then send these traffic requests to the first device, so that the first device predicts the business index parameters of each business object for the candidate business data. Of course, since the computing resources of the first device are limited and a large number of traffic requests cannot be processed at the same time, in order to reduce the occupation of transmission bandwidth, a third service model, that is, a crowd screening model for screening the target crowd, may be deployed on the second device in the embodiment of the present application, so as to reduce the request magnitude when the number of traffic requests is large.
For example, if the number of initial objects accessing the service platform at a certain moment is 1000, the second device may perform feature extraction, based on the second service model, on the object information recorded on the second device for these 1000 initial objects, and the second service model outputs the initial platform features corresponding to the 1000 initial objects. At this time, the second device may generate a traffic request for each initial object based on its corresponding initial platform features, obtaining 1000 traffic requests.
Further, the second device may compare the total number of current traffic requests with the number of traffic requests the first device can process. If the total number of current traffic requests is greater than the processing number, the second device may filter the current traffic requests based on the third service model. Optionally, if the total number of current traffic requests is less than or equal to the processing number, the second device may directly send all current traffic requests to the first device.
For example, if the computing resources of the first device can process 10 traffic requests, then even if the second device sends 1000 traffic requests to the first device, the first device cannot process them all, which would result in 990 traffic requests being transmitted in vain between the two devices. In order to reduce the transmission bandwidth, the second device may input the initial platform features corresponding to the 1000 initial objects into the third service model, and determine through the third service model the transmission ordering parameters corresponding to the 1000 initial objects, so that the 1000 traffic requests may be ordered by the transmission ordering parameter of each initial object to obtain a request ordering result. Further, the second device may obtain, based on the processing number of the first device, the top 10 traffic requests from the 1000 traffic requests, and then issue these 10 traffic requests to the first device in an RTA/ADX scenario, where RTA (Real Time API) refers to a real-time API and ADX (ad exchange) refers to an advertisement exchange platform. At this time, the first device may take the traffic request of one initial object from the 10 traffic requests as the traffic request of the business object, and then obtain the object platform features of the business object from that traffic request.
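The screening just described can be sketched as follows; the scoring callable stands in for the crowd screening model (the third service model), and the names are illustrative assumptions.

```python
def screen_traffic_requests(requests, transmission_ordering_parameter, capacity):
    """When the Y generated requests exceed what the first device can process,
    rank them by the transmission ordering parameter and keep only the top X."""
    if len(requests) <= capacity:
        return list(requests)          # under capacity: send everything as-is
    ranked = sorted(requests, key=transmission_ordering_parameter, reverse=True)
    return ranked[:capacity]           # e.g., the top 10 of 1000 requests
```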
Step S102, based on the first service model, service model parameters associated with the N object groups and object group coding features associated with the N object groups are acquired.
The first business model and the second business model are obtained after federal learning is carried out on sample objects, and N object groups can be obtained after training is completed. The N object groups herein may be obtained by grouping the sample objects by the first device based on a self-attention mechanism, or may be obtained by grouping the sample objects by the first device based on other clustering algorithms, which will not be limited herein. For easy understanding, in this embodiment of the present application, sample objects may be grouped based on a self-attention mechanism, and then when training the first service model is completed, the pooling features of the object groups associated with the N object groups (i.e. the service model parameters of the first service model) and the object group coding features associated with the N object groups may be stored together. Wherein, N is a positive integer greater than 1.
Step S103, self-attention feature extraction processing is carried out on the service model parameters, the object group coding features and the object platform features in the first service model, and object group attention features of the service object for N object groups are obtained.
The service model parameters here refer to the object group pooling features associated with the N object groups, where the object group pooling features may include object group pooling vectors corresponding to the N object groups respectively. The object group encoding feature includes N object group encoding vectors corresponding to the N object groups. The N object groups include an object group i, where i is a positive integer less than or equal to N. Specifically, the first device may input the object platform feature to the self-attention layer in the first business model, and may further obtain an object group pooling vector K_i of the object group i from the N object group pooling vectors and an object group encoding vector V_i of the object group i from the N object group encoding vectors. At this time, the first device may perform correlation processing on the object platform feature, the object group pooling vector K_i and the object group encoding vector V_i through the self-attention layer to obtain an attention coefficient W_i corresponding to the object group i. When the attention coefficients corresponding to each of the N object groups are obtained, the first device may perform summation processing on the N attention coefficients through the self-attention layer to obtain the object group attention features of the business object for the N object groups.
The specific manner in which the first device determines the object group attention features in the embodiment of the present application can be seen in the following formulas (1)-(3):

S_i = F(Q, K_i) = Similarity(Q, K_i) (1)

A_i = softmax(S_i) (2)

W_i = A_i · V_i, Attention = Σ_{i=1}^{N} W_i (3)

Wherein, Q may be the object platform feature of the business object; K_i may be the object group pooling vector corresponding to the object group i; S_i may be used to represent the vector similarity or correlation between the object platform feature and the object group pooling vector of the object group i, where the most common computation manners include: a vector dot product of the two, a vector similarity of the two, or a score obtained by introducing an additional neural network; A_i may be used to represent the normalization result corresponding to the object group i; W_i may be used to represent the attention coefficient corresponding to the object group i; Attention may be used to represent the object group attention features of the business object for the N object groups.
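For ease of understanding, the following is a minimal NumPy sketch of formulas (1)-(3), assuming the dot product is chosen as the similarity function; the array shapes are illustrative assumptions:

```python
import numpy as np

def object_group_attention(q: np.ndarray, K: np.ndarray, V: np.ndarray) -> np.ndarray:
    """q: object platform feature, shape (d,);
    K: N object group pooling vectors, shape (N, d);
    V: N object group encoding vectors, shape (N, d)."""
    s = K @ q                              # (1) dot-product similarity S_i
    a = np.exp(s - s.max()); a /= a.sum()  # (2) softmax normalization A_i
    W = a[:, None] * V                     # (3) attention coefficients W_i = A_i * V_i
    return W.sum(axis=0)                   # Attention = sum of the N coefficients W_i

attention = object_group_attention(np.random.rand(8), np.random.rand(4, 8), np.random.rand(4, 8))
```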
As shown in FIG. 2, feature 2K may be composed of the object group pooling vector K_1 corresponding to object group 1, the object group pooling vector K_2 corresponding to object group 2, …, and the object group pooling vector K_N corresponding to object group N. Feature 2V may be composed of the object group encoding vector V_1 corresponding to object group 1, the object group encoding vector V_2 corresponding to object group 2, …, and the object group encoding vector V_N corresponding to object group N.

Upon acquiring the object platform feature of the business object, the device 21F may take the object platform feature of the business object (e.g., feature 2Q) as the query vector input to the network layer L_1 (i.e., the self-attention layer) shown in fig. 2. Further, the device 21F may obtain the object group pooling vector K_i of the object group i from the N object group pooling vectors included in the feature 2K, and obtain the object group encoding vector V_i of the object group i from the N object group encoding vectors included in the feature 2V. At this time, the device 21F may perform correlation processing on the feature 2Q, the object group pooling vector K_i, and the object group encoding vector V_i through the network layer L_1, so as to obtain the attention coefficient W_i corresponding to the object group i.
For example, for object group 1, the device 21F may obtain the object group pooling vector K_1 of object group 1 from the N object group pooling vectors included in the feature 2K, and obtain the object group encoding vector V_1 of object group 1 from the N object group encoding vectors included in the feature 2V. Further, in the network layer L_1, the device 21F may determine, based on the above formula (1), the vector similarity between the object platform feature and the object group pooling vector K_1 of object group 1; normalize the vector similarity of object group 1 based on the above formula (2) to obtain the normalization result A_1 corresponding to object group 1; and then, based on the above formula (3), perform dot multiplication processing on the normalization result A_1 and the object group encoding vector V_1 of object group 1, so that the attention coefficient W_1 corresponding to object group 1 can be obtained.
By analogy, the device 21F may obtain the attention coefficient corresponding to each of the N object groups, and further perform summation processing on the N attention coefficients, so as to obtain the object group attention feature of the service object for the N object groups (for example, feature 2W shown in fig. 2).
Step S104, predicting the business index parameters of the business object aiming at the candidate business data based on the candidate data characteristics, the object platform characteristics and the object group attention characteristics of the candidate business data in the first business model.
Specifically, the first device may perform splicing processing on the object platform feature and the object group attention feature through the feature concatenation layer in the first service model (e.g., network layer L_2 shown in fig. 2), so as to obtain the service splice feature corresponding to the service object (e.g., feature 2P shown in fig. 2). Meanwhile, the first device may further acquire the candidate data features (for example, feature 2S) obtained after feature extraction of the candidate service data. Further, the first device may input the service splice feature and the candidate data feature to the multi-layer perceptron in the first service model (e.g., network layer L_3 shown in fig. 2), and perform feature extraction processing on the service splice feature and the candidate data feature through the multi-layer perceptron, so as to predict the service index parameter of the service object for the candidate service data.
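The concatenation-plus-perceptron step can be sketched as follows, assuming PyTorch; the class name, layer widths, and the sigmoid output (for an index parameter such as a conversion rate in [0, 1]) are illustrative assumptions:

```python
import torch
import torch.nn as nn

class IndexPredictor(nn.Module):
    """Feature concatenation layer plus multi-layer perceptron
    (network layers L_2 and L_3 in fig. 2)."""
    def __init__(self, platform_dim=8, attn_dim=8, data_dim=16, hidden=32):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(platform_dim + attn_dim + data_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
            nn.Sigmoid(),  # e.g. a conversion rate in [0, 1]
        )

    def forward(self, platform_feat, group_attn_feat, candidate_feat):
        splice = torch.cat([platform_feat, group_attn_feat], dim=-1)  # feature 2P
        x = torch.cat([splice, candidate_feat], dim=-1)
        return self.mlp(x)  # predicted service index parameter
```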
It is understood that the number of candidate service data herein may be M, where M is a positive integer. When the first device obtains, based on the first service model, the service index parameter of the service object for each of the M candidate service data, the first device can take the M service index parameters as M decision parameters, and then return the M decision parameters to the second device, so that the second device screens out, from the M candidate service data, the service data to be displayed that are to be sent to the service object. Wherein, the service data to be displayed are the top-ranked P candidate service data acquired by the second device based on the data sorting result; the data sorting result is obtained by the second device after sorting the M candidate service data based on the sorting parameters corresponding to the M candidate service data respectively; the sorting parameter of a candidate service data is determined based on the decision parameter of that candidate service data; P is a positive integer less than or equal to M. The sorting parameters herein may include bid, conversion rate, trigger rate, quality, and the like. In this embodiment, parameters other than the decision parameters may be referred to as auxiliary parameters. For example, if the service index parameter predicted by the first service model is the conversion rate, the auxiliary parameters may include the trigger rate, the bid, the quality, and the like. The conversion rate (conversion rate) refers to the probability, predicted by the online advertising system under a certain condition, that the candidate service data is converted after being clicked; the trigger rate (click through rate) refers to the probability, predicted by the online advertising system under a certain condition, that the candidate service data is clicked after being delivered; the bid (bid) refers to the price an advertiser bids for an advertisement, i.e., the expected cost of obtaining a single conversion at the time of impression.
For ease of understanding, further, please refer to fig. 4, fig. 4 is a data interaction diagram in an advertisement delivery scenario provided in an embodiment of the present application. As shown in fig. 4, in the advertisement delivery scenario, the device 41F in the embodiment of the present application may be a first device corresponding to the advertisement primary side, that is, a computer device deployed with a first service model, and for example, the device 41F may be the server 11F shown in fig. 1. The device 42F in this embodiment of the present application may be a second device corresponding to the advertisement platform side, that is, a computer device deployed with a second service model, for example, the device 42F may be the server 12F shown in fig. 1.
As shown in fig. 4, the business index parameters predicted by the first business model deployed in device 41F may be used for advertisement placement decisions, including crowd selection, bidding, weighting, and the like. The first business model refers to a business model associated with a decision-making manner. For example, if the decision manner is a crowd selection manner, the first business model may include the model 410m shown in fig. 4 (e.g., a Lookalike model, an LTR model). The purpose of the Lookalike model is to find, based on a target crowd, other people similar to the target crowd from a large number of people; the LTR (learning to rank) model is a model used for ranking.
If the decision manner is a bid weighting manner, the first business model may include the model 420m shown in fig. 4 (e.g., a PKAM model, i.e., a key client model), and the business index parameter output by the PKAM model may be a weighted conversion coefficient for the candidate business data.
If the decision manner is a first bid manner (e.g., a CPA bid manner), the first business model may include the model 430m shown in FIG. 4 (e.g., a PLTV model), and the business index parameter output by the PLTV model may be a user lifetime value for the candidate business data. Here, CPA (Cost Per Action) refers to the cost per action, i.e., a pricing manner in which billing is based on the actual effect of the advertisement placement (e.g., a valid response such as a questionnaire or an order), without limiting the amount of advertisement placement.
If the decision manner is a second bid manner (e.g., a CPC bid manner), the first business model may include the model 440m shown in fig. 4 (e.g., a PCVR model and a PLTV model), and the business index parameter output by the PCVR model may indicate the predicted conversion rate for the candidate business data. Here, CPC (Cost Per Click) refers to the cost per click (i.e., per trigger).
If the decision manner is a third bid manner (e.g., a CPM bid manner), the first business model may include the models 450m shown in fig. 4 (e.g., a PCTR model, a PCVR model, and a PLTV model), where the PCTR model is used to predict the trigger rate for the candidate business data. Here, CPM (cost per mille) refers to the revenue per thousand impressions, that is, the advertising revenue that can be obtained for every thousand displays, where the unit of display may be a web page, an advertising unit, or even a single advertisement.
It should be appreciated that the device 42F shown in FIG. 4 may recall M candidate business data for a business object in an advertising scenario, where M is a positive integer. At this point, device 42F may obtain object platform features of the business object based on the second business model to generate a traffic request for the business object for transmission to device 41F. Upon receiving the traffic request of the service object, the device 41F may acquire the object platform feature of the service object to predict the service index parameters for the M candidate service data.
For example, the device 41F may obtain the object group attention feature for the N object groups based on the first service model deployed on the device 41F, and may further predict the service index parameter of the service object for each candidate service data based on the object group attention feature, the candidate data feature corresponding to the M candidate service data, and the object platform feature. Further, the device 41F may take the traffic index parameters of the M candidate traffic data as M decision parameters to return to the device 42F.
The specific manner of determining the ranking parameters of the candidate service data by the computer device can be seen in formulas (4) - (5):
CPM=bid×CTR×CVR (4)
Ranking_score=CPM+quality (5)
Wherein, CPM refers to the revenue per thousand impressions of the candidate business data; bid refers to the bid for the candidate business data; CTR refers to the trigger rate of the business object for the candidate business data; CVR refers to the conversion rate of the business object for the candidate business data; quality refers to the quality score of the candidate business data; Ranking_score refers to the ranking score used to rank the candidate business data.
For example, during the coarse ranking process, the device 42F may directly use the revenue per thousand impressions determined by formula (4) as the sorting parameter of the candidate business data. Alternatively, during the fine ranking process, the device 42F may use the ranking score obtained based on formula (4) and formula (5) as the sorting parameter of the candidate business data, which will not be limited herein. It will be appreciated that the higher the value of the sorting parameter, the earlier the candidate business data is ranked.
It may be appreciated that the device 42F may select one candidate service data from the M candidate service data as the data to be ranked, and may further determine the ranking parameter of the data to be ranked based on the service index parameter of the data to be ranked. The ranking parameters herein may be determined according to the above formulas (4) - (5), by the bid corresponding to the data to be ranked, the conversion rate of the service object for the data to be ranked, the trigger rate of the service object for the data to be ranked, and the quality score of the candidate service data.
When the sorting parameters corresponding to the M candidate service data respectively are obtained, the device 42F may perform sorting processing on the M candidate service data based on the M sorting parameters, so as to obtain a data sorting result. The data sorting result herein may be used to indicate a display priority order of the M candidate service data, that is, the higher the value corresponding to the sorting parameter, the earlier the display order of the corresponding candidate service data. Further, the device 42F may obtain P candidate service data ranked first based on the data ranking result, and use the P candidate service data as service data to be displayed for sending to the service object.
For example, if the number M of candidate service data is 200, the computer device may obtain the top 10 candidate service data from the data sorting result after sorting the 200 candidate service data, and use the 10 candidate service data as the service data to be displayed for sending to the service object (i.e. the service data recommended to the service object).
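A minimal sketch of this sorting step per formulas (4)-(5), assuming each candidate carries its bid, predicted trigger rate (CTR), predicted conversion rate (CVR), and quality score; the dictionary field names are illustrative assumptions:

```python
def rank_candidates(candidates: list[dict], top_p: int) -> list[dict]:
    """Return the top-P candidate service data to display."""
    for c in candidates:
        c["cpm"] = c["bid"] * c["ctr"] * c["cvr"]      # formula (4)
        c["ranking_score"] = c["cpm"] + c["quality"]   # formula (5), fine ranking
    ordered = sorted(candidates, key=lambda c: c["ranking_score"], reverse=True)
    return ordered[:top_p]

# Example: 200 recalled candidates, keep the top 10 for display.
candidates = [{"bid": 2.0, "ctr": 0.03, "cvr": 0.1, "quality": 0.5} for _ in range(200)]
to_display = rank_candidates(candidates, top_p=10)
```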
Therefore, when the device 42F in the embodiment of the present application issues a traffic request in the RTA/ADX scenario, the object platform feature of the service object obtained by online inference through the model is issued along with the request, so as to help the demand-side platform (DSP) on the advertisement main side participating in the RTA/ADX advertisement ranking competition to better estimate the service index parameter and improve the return rate of advertisement delivery. The goal of the joint modeling is to help optimize the first business model of the device 41F (i.e., the advertisement main side) so as to improve the accuracy of business index prediction and the advertising effect while protecting user privacy, thereby promoting advertisement consumption. At the same time, the decision results of the device 41F may be mined as features to help optimize the effect of the models deployed on the device 42F side.
In the embodiment of the application, when the first device predicts the service index parameter of the service object for the candidate service data, the first device does not need to acquire the object identifier of the service object, but acquires the object platform feature of the service object sent by the second device. The object platform feature is obtained by directly performing feature extraction, based on the second service model on the second device, on the object information of the service object on the second device. Because the first business model trained through federal learning has an object grouping function, that is, the sample objects participating in training can be divided into N object groups (N being a positive integer greater than 1), the first device can acquire the business model parameters associated with the N object groups and the object group coding features associated with the N object groups based on the first business model, so that the first device can perform self-attention feature extraction processing on the business model parameters, the object group coding features, and the object platform features in the first business model, thereby obtaining the object group attention features of the business object for the N object groups. Since the object group attention features can be used to replace the object identifier of the service object, the risk that the object identifier of the service object is leaked can be greatly reduced when the service index parameter is predicted in the first service model based on the candidate data features of the candidate service data, the object platform features, and the object group attention features, so that the data security of the service object is improved.
Further, referring to fig. 5, fig. 5 is a flow chart of a data processing method according to an embodiment of the present application. The method may be performed by the first device deployed with the first service model, where the first device may be any one terminal device in the terminal device cluster shown in fig. 1 (for example, the terminal device 100a), may be the server 11F shown in fig. 1, or the method may be performed interactively by a terminal device with a model application function and a server with a model training function, which is not limited herein. The method may include at least the following steps S201-S209:
Step S201, obtaining the object platform feature of the business object.
The object platform features are obtained by extracting features of object information of the service object on the second device based on the second service model. Specifically, when a business object accesses an application client associated with a business platform (e.g., an advertisement platform) corresponding to a second device, the second device may obtain object information (i.e., second object information) of the business object recorded on the second device. Further, the second device may perform feature extraction processing on object information of the service object based on a second service model deployed on the second device, so as to obtain an object platform feature of the service object, and further generate a flow request for the service object based on the object platform feature of the service object. At this time, the second device may send a traffic request for the service object to the first device, so that the first device obtains the object platform feature of the service object based on the traffic request.
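A minimal sketch of this step on the second device side, assuming the second service model is callable as a feature extractor; the payload field name is an illustrative assumption. The key point is that the request carries the model-derived object platform feature rather than any object identifier:

```python
def build_traffic_request(second_service_model, object_info) -> dict:
    # Feature extraction on the second device; only the resulting feature
    # vector leaves the device, never the object identifier.
    platform_feature = second_service_model(object_info)
    return {"platform_feature": list(platform_feature)}  # no object ID field
```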
Step S202, based on the first business model, obtaining business model parameters associated with N object groups and object group coding features associated with the N object groups.
The first business model and the second business model are obtained after federal learning is carried out on the sample object. It can be appreciated that, when the first device completes training the first service model, the first device may store the object group pooling feature associated with the N object groups (i.e. the service model parameters of the first service model) and the object group encoding feature associated with the N object groups together, so when the first device obtains the object platform feature of the service object, the first device may quickly obtain the service model parameters associated with the N object groups (for example, feature 2K shown in fig. 2 and described above) and the object group encoding feature associated with the N object groups (for example, feature 2V shown in fig. 2 and described above) directly based on the first service model deployed on the first device. Wherein, N is a positive integer greater than 1.
In step S203, the self-attention feature extraction process is performed on the service model parameters, the object group coding features and the object platform features in the first service model, so as to obtain the object group attention features of the service object for the N object groups.
The service model parameters herein refer to the object group pooling features associated with the N object groups, where the object group pooling features may include the object group pooling vectors corresponding to the N object groups respectively. The object group coding feature includes the N object group coding vectors corresponding to the N object groups. The N object groups include an object group i, where i is a positive integer less than or equal to N. Specifically, the first device may input the object platform feature to the self-attention layer in the first business model, and may further obtain the object group pooling vector K_i of the object group i from the N object group pooling vectors, and obtain the object group encoding vector V_i of the object group i from the N object group encoding vectors. At this time, the first device may perform correlation processing on the object platform feature, the object group pooling vector K_i, and the object group encoding vector V_i through the self-attention layer, so as to obtain the attention coefficient W_i corresponding to the object group i. When the attention coefficient corresponding to each of the N object groups is obtained, the first device may perform summation processing on the N attention coefficients through the self-attention layer, so as to obtain the object group attention features of the service object for the N object groups.
In step S204, the business index parameters of the business object for the candidate business data are predicted in the first business model based on the candidate data features, the object platform features and the object group attention features of the candidate business data.
Specifically, the first device may perform splicing processing on the object platform feature and the object group attention feature through the feature concatenation layer in the first service model (e.g., network layer L_2 shown in fig. 2), so as to obtain the service splice feature corresponding to the service object (such as feature 2P shown in fig. 2). Meanwhile, the first device may further acquire the candidate data features (for example, feature 2S) obtained after feature extraction of the candidate service data. Further, the first device may input the service splice feature and the candidate data feature to the multi-layer perceptron in the first service model (e.g., network layer L_3 shown in fig. 2), and perform feature extraction processing on the service splice feature and the candidate data feature through the multi-layer perceptron, so as to predict the service index parameter of the service object for the candidate service data.
The data processing method in the embodiment of the application may include a model training process (i.e., an offline stage) and a model application process (i.e., an online stage). The offline stage may include sample intersection, joint training, compliance auditing, and the like. The online stage may include deployment and launch of the model, and data link setup for online inference. It can be understood that the above steps S201 to S204 illustrate the model application process, and the detailed implementation of the model application process can refer to the description of steps S101 to S104 in the embodiment corresponding to fig. 3, which will not be repeated here.
The model training process may be specifically described in the following steps S205 to S209.
In step S205, sample group encoding features associated with the N sample groups and initial model parameters associated with the N sample groups are acquired based on the first sample information of the sample object on the first device and the first initial model.
Here, N refers to the number of object groups configured for the first initial model, and N is a positive integer greater than 1. The sample tag of the first sample information may be used to characterize the actual index parameter of the sample object for the sample service data. The number of sample objects is H, where H is a positive integer greater than or equal to N; the first sample information includes the first sample sub-information of the H sample objects respectively on the first device. Specifically, the first device may obtain the number of object groups configured for the first initial model, and may further group the H sample objects based on the number of object groups to obtain N sample groups. The sample groups herein may include a sample group j, where j is a positive integer less than or equal to N. Further, the first device may acquire, from the H pieces of first sample sub-information, the first sample sub-information associated with each sample object in the sample group j, and perform feature extraction processing on the acquired first sample sub-information to obtain the sample group encoding vector V_j corresponding to the sample group j. When the sample group encoding vector corresponding to each of the N sample groups is obtained, the first device may obtain the sample group encoding features associated with the N sample groups, and may further determine the initial model parameters associated with the N sample groups based on the sample group encoding features.
It should be appreciated that when the first device and the second device perform federal learning on the first initial model and the second initial model at the federal platform, the first device may obtain a sample alignment request sent by the second device, and may further obtain a second set of object identifiers associated with the second device based on the sample alignment request. Wherein the object identification in the second set of object identifications is an object identification of a history object recorded on the second device. In order to effectively ensure the security of the object identifier during data transmission, the second device in the embodiment of the application can process the object identifier set through an encryption algorithm and then transmit the object identifier set when sending the object identifier set recorded by the second device to the first device. Alternatively, the second device may encode the object identifier set based on the identifier encoding rule and then transmit the encoded object identifier set. It will not be limited here. The method for encoding the object identifier by the first device is the same as the method for encoding the object identifier by the second device.
For ease of understanding, embodiments of the present application may be described in terms of processing and then transmitting an object identifier set through an encryption algorithm. The first device may acquire the identifier encryption information in the sample alignment request when acquiring the sample alignment request sent by the second device, where the identifier encryption information is obtained by encrypting, by the second device, identifier signature information and a platform object identifier set based on a first public key of the first device, and the identifier signature information may be obtained by signing, by the second device, the platform object identifier set based on a second private key of the second device. Further, the first device may decrypt the identifier encryption information based on the first private key of the first device to obtain identifier signature information and a platform object identifier set, and then further obtain a second public key of the second device, and perform signature verification processing on the identifier signature information based on the second public key to obtain a signature verification result; when the signature verification result indicates that the signature verification is successful, the first device may treat the platform object identification set as a second object identification set associated with the second device.
At the same time, the first device may also obtain a first set of object identifiers associated with the first device based on object identifiers of historical objects recorded on the first device. Further, the first device may perform alignment processing on the first object identification set and the second object identification set to obtain an identification intersection set, so as to use a history object corresponding to the identification object in the identification intersection set as a sample object having an intersection relationship with the second device; the identification objects in the identification intersection set belong to a first object identification set and a second object identification set. At this point, the first device may return the identified intersection set to the second device to cause the second device to determine sample platform characteristics that identify sample objects in the intersection set.
For ease of understanding, further, please refer to fig. 6, fig. 6 is a schematic diagram of a scenario for determining a sample object according to an embodiment of the present application. As shown in fig. 6, in the advertisement delivery scenario, the device 61F in the embodiment of the present application may be a first device corresponding to the advertisement primary side, that is, a computer device deployed with a first service model, and for example, the device 61F may be the server 11F shown in fig. 1. The device 62F in this embodiment of the present application may be a second device corresponding to the advertisement platform side, that is, a computer device deployed with a second service model, for example, the device 62F may be the server 12F shown in fig. 1.
As shown in fig. 6, when federal learning is performed in the federal platform on the first initial model where the device 61F is located and the second initial model where the device 62F is located, the device 62F may acquire the object identifications of the history objects recorded on the device 62F from the database 6200K corresponding to the device 62F, and may further use the acquired object identifications as the object identification set associated with the device 62F (i.e., the platform object identification set). The platform object identification set may take 6 identifications as an example, and may specifically include object identification B_a, object identification B_b, object identification B_c, object identification B_d, object identification B_e, and object identification B_f.
To effectively ensure data security when object identifiers are transferred between the two devices, the device 62F may sign the platform object identifier set based on the private key of the device 62F (i.e., the second private key) to obtain the identification signature information. It will be appreciated that the device 62F may obtain a hash calculation rule for the platform object identifier set, which may be a digest algorithm agreed upon in advance by the device 62F with other devices in the federal platform (e.g., the device 61F). Thus, the device 62F may perform hash computation on the platform object identifier set based on the hash rule to obtain the digest information of the platform object identifier set (e.g., digest information h). The digest information of the platform object identifier set determined by the device 62F may be referred to as first digest information in the embodiments herein. Further, the device 62F may perform signature processing on the first digest information based on the private key of the device 62F, so that the identification signature information may be obtained. Further, the device 62F may obtain the public key of the device 61F (i.e., the first public key), and encrypt the identification signature information and the platform object identifier set, so as to obtain the identifier encryption information, and may then generate, by using the identifier encryption information, a sample alignment request for sending to the device 61F.
When the device 61F receives the sample alignment request, it may obtain the identifier encryption information in the sample alignment request, and further decrypt the identifier encryption information based on the private key of the device 61F (i.e., the first private key), so as to obtain the identification signature information and the platform object identifier set. At this time, the device 61F may acquire the public key of the device 62F (i.e., the second public key), and perform signature verification processing on the identification signature information based on the second public key, so as to obtain a signature verification result.
It may be appreciated that the device 61F may perform signature verification on the identification signature information based on the public key of the device 62F, to obtain the first digest information. Meanwhile, the device 61F may also acquire the same hash rule as the device 62F, and perform hash computation on the platform object identifier set, so that the digest information of the platform object identifier set (for example, digest information H) may be obtained. The digest information of the platform object identifier set determined by the device 61F may be referred to as second digest information in the embodiments herein. At this time, the device 61F may compare the first digest information with the second digest information to obtain the signature verification result, so as to determine whether the platform object identifier set has been tampered with. It will be appreciated that if the first digest information is not the same as the second digest information, the device 61F may determine that the signature verification result indicates that the signature verification failed. Alternatively, if the first digest information is the same as the second digest information, the device 61F may determine that the signature verification result indicates that the signature verification was successful, which means that the platform object identification set has not been tampered with and was indeed transmitted by the device 62F; at this time, the device 61F may treat the platform object identification set as the second object identification set associated with the device 62F.
Meanwhile, the first device may further acquire, from the database 6100K corresponding to the device 61F, the object identifications of the history objects recorded on the device 61F, and may further use the acquired object identifications as the object identifier set associated with the device 61F (i.e., the first object identifier set). The first object identifier set may take 5 identifications as an example, and may specifically include object identification B_a, object identification B_c, object identification B_d, object identification B_e, and object identification B_g.
At this time, the device 61F may perform alignment processing on the first object identification set and the second object identification set, so as to obtain an identification intersection set; that is, the device 61F may add, to the identification intersection set, the object identifications that exist in both the first object identification set and the second object identification set, and may further use the history objects corresponding to the identification objects in the identification intersection set as the sample objects having an intersection relationship with the second device. The identification intersection set shown in FIG. 6 may include object identification B_a, object identification B_c, object identification B_d, and object identification B_e; that is, the sample objects herein may include the object a corresponding to object identification B_a, the object c corresponding to object identification B_c, the object d corresponding to object identification B_d, and the object e corresponding to object identification B_e.
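A sketch of the sign-then-encrypt alignment exchange, assuming the Python `cryptography` package and RSA key pairs on both sides. Wrapping the payload with a symmetric (Fernet) key is an added assumption, since raw RSA-OAEP can only encrypt short messages; the embodiment itself only specifies signing, encrypting, verifying, and intersecting:

```python
import hashlib, json
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding

PSS = padding.PSS(mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH)
OAEP = padding.OAEP(mgf=padding.MGF1(hashes.SHA256()), algorithm=hashes.SHA256(), label=None)

def second_device_send(ids, second_priv, first_pub):
    payload = json.dumps(sorted(ids)).encode()
    digest = hashlib.sha256(payload).digest()              # agreed hash rule (first digest info)
    signature = second_priv.sign(digest, PSS, hashes.SHA256())
    sym_key = Fernet.generate_key()                        # hybrid encryption (assumption)
    token = Fernet(sym_key).encrypt(json.dumps(
        {"ids": sorted(ids), "sig": signature.hex()}).encode())
    return first_pub.encrypt(sym_key, OAEP), token         # identifier encryption information

def first_device_receive(enc_key, token, first_priv, second_pub, own_ids):
    sym_key = first_priv.decrypt(enc_key, OAEP)
    msg = json.loads(Fernet(sym_key).decrypt(token))
    digest = hashlib.sha256(json.dumps(msg["ids"]).encode()).digest()  # second digest info
    second_pub.verify(bytes.fromhex(msg["sig"]), digest, PSS, hashes.SHA256())  # raises if tampered
    return sorted(set(own_ids) & set(msg["ids"]))          # identification intersection set
```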
Further, the device 61F may return the identification intersection set to the device 62F, so that the second device inputs object information (i.e. second sample information) recorded by the sample objects on the device 62F into the second initial model based on the object identification of the sample objects in the identification intersection set, and the second initial model outputs the sample platform feature corresponding to each sample object, and further sends the sample platform feature corresponding to each sample object to the device 61F, so that the device 61F performs federal learning training on the first initial model.
It should be understood that when the first device acquires H sample objects, the first device may acquire the number of object groups configured for the first initial model in advance, where the number of object groups may be the number of clusters obtained by clustering, by using a clustering algorithm, the sample features corresponding to the object information (i.e., the first sample information) of the sample objects on the first device. For example, the number of object groups may be N, where N is a positive integer greater than 1, and H is a positive integer greater than or equal to N.
Further, the first device may group the H sample objects based on the number of object groups to obtain N sample groups. The grouping herein refers to a process of aggregating the objects according to object attributes after the objects are anonymized, so as to compute statistical attributes of the objects. The sample groups herein may include a sample group j, where j is a positive integer less than or equal to N. Further, the first device may acquire, from the H pieces of first sample sub-information, the first sample sub-information associated with each sample object in the sample group j, and perform feature extraction processing on the acquired first sample sub-information to obtain the sample group encoding vector V_j corresponding to the sample group j. When the sample group encoding vector corresponding to each of the N sample groups is acquired, the first device may obtain the sample group encoding features associated with the N sample groups (i.e., the Values in the self-attention mechanism), and may further determine the initial model parameters associated with the N sample groups (i.e., the Keys in the self-attention mechanism) based on the sample group encoding features. For example, the first device may directly pool the sample group encoding features to obtain the sample group pooling features associated with the N sample groups, i.e., the initial model parameters of the first initial model.
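A minimal sketch of the grouping and pooling step, assuming group assignments are already available and mean pooling is used to derive both the per-group encoding vectors and the pooled initial model parameters; both choices are illustrative assumptions (the embodiment only requires per-group feature extraction and some pooling of the encoding features):

```python
import numpy as np

def build_group_parameters(sample_features: np.ndarray, group_ids: np.ndarray, n_groups: int):
    """sample_features: (H, d) first-sample features of the H sample objects;
    group_ids: (H,) group index in [0, n_groups) per sample object (non-empty groups assumed)."""
    # Sample group encoding vectors V_j: aggregate each group's member features.
    V = np.stack([sample_features[group_ids == j].mean(axis=0) for j in range(n_groups)])
    # Initial model parameters K_j: pooled directly from the encoding features.
    K = V.copy()
    return K, V
```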
In step S206, in the first initial model, the sample platform features, the sample group coding features and the initial model parameters of the sample object are subjected to a self-attention feature extraction process, so as to obtain the sample group attention features of the sample object for the N sample groups.
The sample platform features are obtained by performing feature extraction, based on the second initial model, on the second sample information of the sample object on the second device. The initial model parameters include the sample group pooling vectors respectively corresponding to the N sample groups, and the sample group encoding features include the sample group encoding vectors respectively corresponding to the N sample groups. Specifically, the first device may input the sample platform features to the self-attention layer in the first initial model, and may obtain the sample group pooling vector K_j of sample group j from the N sample group pooling vectors, and obtain the sample group encoding vector V_j of sample group j from the N sample group encoding vectors. At this time, the first device may perform correlation processing on the sample platform features, the sample group pooling vector K_j, and the sample group encoding vector V_j through the self-attention layer of the first initial model, so as to obtain the attention coefficient W_j corresponding to the sample group j. When the attention coefficient corresponding to each of the N sample groups is obtained, the first device may perform summation processing on the N attention coefficients through the self-attention layer, so as to obtain the sample group attention features of the sample object for the N sample groups.
It should be appreciated that embodiments of the present application may use a self-attention mechanism instead of a clustering operation, so that the model implements end-to-end learning, and a single model training run completes the overall functional requirements. For ease of understanding, further, please refer to fig. 7, which is a schematic diagram of a self-attention mechanism provided in an embodiment of the present application. As shown in FIG. 7, embodiments of the present application may consider the constituent elements in the self-attention mechanism to be composed of a series of <K, V> data pairs corresponding to the sample groups. Here, the number of sample groups may be 4, specifically including sample group 1, sample group 2, sample group 3, and sample group 4. The initial model parameters (i.e., the sample group pooling features associated with the 4 sample groups) may be composed of the sample group pooling vector 7K_1 corresponding to sample group 1, the sample group pooling vector 7K_2 corresponding to sample group 2, …, and the sample group pooling vector 7K_4 corresponding to sample group 4. The sample group encoding features associated with the 4 sample groups may be composed of the sample group encoding vector 7V_1 corresponding to sample group 1, the sample group encoding vector 7V_2 corresponding to sample group 2, …, and the sample group encoding vector 7V_4 corresponding to sample group 4.
When the sample platform feature of the sample object is obtained, the first device may treat the sample platform feature as the Query in the self-attention mechanism. Further, the first device obtains the weight coefficient of each sample group by determining the correlation between the sample platform feature and the sample group pooling vector (Key) of each sample group, that is, the similarity or correlation between the two, and then performs weighted summation over the sample group encoding features (Value), so as to obtain the final Attention value (i.e., the sample group attention feature).
Step S207, predicting the prediction index parameter of the sample object for the sample service data in the first initial model based on the sample data features, the sample platform features, and the sample group attention features of the sample service data.
Specifically, the first device may perform a stitching process on the sample platform feature and the sample group attention feature through the feature stitching layer in the first initial model, so as to obtain a sample stitching feature corresponding to the sample object. Meanwhile, the first device can also acquire sample data features obtained after feature extraction of the sample service data. Further, the first device may input the sample splicing feature and the sample data feature to a multi-layer perceptron in the first initial model, and perform feature extraction processing on the sample splicing feature and the sample data feature through the multi-layer perceptron, so as to predict a prediction index parameter of the sample object for the sample service data.
Step S208, determining the model total loss corresponding to the first initial model based on the prediction index parameter, the actual index parameter, the sample group encoding features, and the sample group attention features.
The model total loss can be used for indicating the second equipment to perform federal learning training on the second initial model to obtain a second service model, and the second service model can be used for predicting object platform characteristics of the service object. The first device may determine a first model loss of the first initial model based on the sample set encoding features and the sample set attention features; and further, the second model loss of the second initial model can be determined based on the prediction index parameter and the actual index parameter, and the model total loss corresponding to the first initial model is obtained by performing superposition processing on the first model loss and the second model loss.
Step S209, based on the model total loss and the attention characteristics of the sample group, performing federal learning training on the first initial model to obtain a first business model.
Specifically, the first device may perform federal learning training on the first initial model based on the total model loss to obtain a model training result. If the model training result indicates that the trained first initial model meets the model convergence condition associated with the first initial model, the first device may directly use the first initial model as the first service model. Optionally, if the model training result indicates that the trained first initial model does not meet the model convergence condition associated with the first initial model, the first device may adjust the initial model parameter based on the attention feature of the sample group and the total model loss, further use the adjusted first initial model as a transition model, and perform federal learning training on the transition model until the transition model after federal learning training meets the model convergence condition, and use the transition model meeting the model convergence condition as the first service model. The first business model is used for predicting business index parameters of the business objects.
The specific way to adjust the model parameters of the first initial model can be seen in the following formula (6):
K_t = α*K_{t-1} + (1-α)*W (6)

Wherein, K_t refers to the model parameters of the first initial model at the current federal learning training time; K_{t-1} refers to the model parameters of the first initial model at the previous federal learning training time; α refers to a constant falling within the interval from a first threshold to a second threshold, where the first threshold may be 0.7 and the second threshold may be 0.9, and α may be dynamically adjusted according to business demands, which will not be limited herein; W refers to the sample group attention features of the sample object for the N sample groups.
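Formula (6) amounts to an exponential moving average update of the group pooling parameters; a one-line sketch, with α = 0.8 as an arbitrary value inside the stated [0.7, 0.9] interval:

```python
def update_model_parameters(K_prev, W, alpha=0.8):
    """Formula (6): K_t = alpha * K_{t-1} + (1 - alpha) * W,
    where W is the sample group attention feature from the current step."""
    return alpha * K_prev + (1.0 - alpha) * W
```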
For ease of understanding, further, please refer to fig. 8, fig. 8 is a schematic diagram of model training based on federal learning according to an embodiment of the present application. As shown in fig. 8, in the advertisement delivery scenario, the device 81F in the embodiment of the present application may be a first device corresponding to the advertisement main side, that is, a computer device deployed with a model 810m (that is, a first initial model), for example, the device 81F may be the server 11F shown in fig. 1. The device 82F in the embodiment of the present application may be a second device corresponding to the advertisement platform side, that is, a computer device deployed with a model 820m (i.e., a second initial model) and a model 830m (i.e., a third initial model), for example, the device 82F may be the server 12F shown in fig. 1.
The device 81F and the device 82F may perform secure sample alignment based on the federal platform to obtain an identification intersection set, so that a historical object corresponding to an object identification in the identification intersection set may be used as a sample object for performing federal learning training. The device 82F may perform feature extraction processing on second sample information (e.g., sample information 82X) recorded by the sample object on the device 82F based on the model 820m to obtain a sample platform feature corresponding to the sample object, and may then send the sample platform feature to the device 81F.
As shown in fig. 8, the model 810m may include a self-attention layer, a feature stitching layer, and a multi-layer perceptron. Upon acquisition of a sample object, device 81F may acquire sample group encoding features associated with N sample groups (e.g., feature 8V shown in fig. 8) and initial model parameters associated with the N sample groups (i.e., sample group pooling features, e.g., feature 8K shown in fig. 8) based on the first sample information of the H sample objects on device 81F (e.g., sample information 81X shown in fig. 8) and model 810 m. The feature 8K may be a feature obtained by pooling the sample group encoded features.
Further, the device 81F may perform a self-attention feature extraction process on the sample platform features, the features 8K, and the features 8V of the sample object in the self-attention layer in the model 810m to obtain sample group attention features (e.g., the features 8W) of the sample object for the N sample groups. Further, the device 81F may perform feature stitching processing on the sample platform features and the features 8W through the feature stitching layer in the model 810m, so as to obtain sample stitching features (for example, features 8P) corresponding to the sample objects, and further input the features 8P and sample data features (for example, features 8S) corresponding to the sample service data into the multi-layer perceptron in the model 810m together, so as to predict the prediction index parameters of the sample objects for the sample service data.
It should be understood that the model convergence condition acquired by the device 81F refers to a service condition for instructing the model to stop training. The model convergence condition may be that the number of model training iterations reaches a set training-count threshold (for example, 100), or that the current model total loss is smaller than a loss threshold in the model convergence condition, which will not be limited herein.
It will be appreciated that the device 81F may determine a first model loss (e.g., model loss 10L shown in fig. 8) for the model 810m based on the feature 8W and the sample group encoding features, determine a second model loss (e.g., model loss 20L shown in fig. 8) for the model 810m based on the prediction index parameter and the actual index parameter indicated by the sample tag, and further perform superposition processing on the model loss 10L and the model loss 20L to obtain the model total loss for the model 810m. Further, the device 81F may train the first initial model based on the model total loss to obtain a model training result. It will be appreciated that if the model training result indicates that the trained model 810m meets the model convergence condition, the device 81F may directly use the model 810m as the first service model. Optionally, if the model training result indicates that the trained model 810m does not meet the model convergence condition, the device 81F may calculate a gradient through back propagation based on the sample group attention features and the model total loss, so as to adjust the model parameters of the model 810m based on the gradient and the above formula (6), further use the adjusted model 810m as a transition model, and perform federal learning training on the transition model until the transition model after federal learning training meets the model convergence condition, and use the transition model meeting the model convergence condition as the first service model.
It will be appreciated that the device 81F may also determine return parameters for returning to the device 82F based on the model total loss. For example, the device 81F may directly take the model total loss as the return parameter, so that the device 82F performs federal learning training on the model 820m based on the model total loss. Alternatively, the device 81F may take the gradient determined based on the model total loss as the return parameter, so that the device 82F performs federal learning training on the model 820m directly based on the gradient. The manner in which the device 82F performs federal learning training on the model 820m may refer to the manner in which the device 81F performs federal learning training on the model 810m, and will not be described in detail herein. In addition, when training of the model 820m is completed, the device 82F may further train the model 830m based on the output result of the model 820m (e.g., the sample platform features of the sample objects), that is, train the model parameters of the model 830m based on the sample platform features of the sample objects and the sample tags of the sample objects for the model 830m (e.g., whether to send a traffic request), so as to obtain a third service model for reducing the request volume.
In the embodiment of the application, in the online inference process, the object identifier of the service object does not need to be transmitted between the first device and the second device; instead, the object platform feature determined by the second device based on the second service model is transmitted, and the first device subsequently acquires the anonymized object group information (i.e., the object group attention features of the service object for the N object groups) through the object platform feature and the first service model, so that the risk that the object identifier of the service object is leaked is greatly reduced, and the data security of the service object is improved; that is, object privacy is further protected while model precision is maintained. In the federal learning training process, the sample objects are grouped to obtain the N object groups associated with the first business model when training is completed, so that when a large number of service index parameters of service objects need to be predicted subsequently, both the prediction accuracy and the prediction efficiency can be improved.
Further, referring to fig. 9, fig. 9 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application. As shown in fig. 9, the data processing apparatus 1 may be a computer program (including program code) running in a computer device; for example, the data processing apparatus 1 is application software. The data processing apparatus 1 may be adapted to perform the respective steps of the method provided by the embodiments of the present application. As shown in fig. 9, the data processing apparatus 1 may run on a first device, which is a computer device deployed with the first service model; the first device may be the server 11F in the embodiment corresponding to fig. 1, or may be any one terminal device running the first service model in the terminal device cluster in the embodiment corresponding to fig. 1, for example, the terminal device 100a. The data processing apparatus 1 may include: a platform feature acquisition module 10, a business parameter acquisition module 20, an object group feature determination module 30, an index parameter prediction module 40, a decision parameter determination module 50, and a decision parameter return module 60.
The platform feature acquisition module 10 is configured to acquire object platform features of a service object; the object platform features are obtained by extracting features of object information of the service object on the second device based on the second service model;
The service parameter obtaining module 20 is configured to obtain service model parameters associated with N object groups and object group coding features associated with the N object groups based on the first service model; n is a positive integer greater than 1; the first business model and the second business model are obtained after federal learning is carried out on the sample object;
the object group feature determining module 30 is configured to perform self-attention feature extraction processing on the service model parameters, the object group coding features and the object platform features in the first service model, so as to obtain object group attention features of the service object for N object groups.
The service model parameters comprise object group pooling vectors corresponding to N object groups respectively; the object group coding feature comprises N object group coding vectors corresponding to the N object groups respectively; the N object groups comprise an object group i; i is a positive integer less than or equal to N;
the object group feature determination module 30 includes: a platform feature input unit 301, a vector acquisition unit 302, a correlation processing unit 303, and a summation processing unit 304.
The platform feature input unit 301 is configured to input the object platform feature to the self-attention layer in the first business model;
The vector obtaining unit 302 is configured to obtain the object group pooling vector K_i of the object group i from the N object group pooling vectors, and obtain the object group encoding vector V_i of the object group i from the N object group encoding vectors;
The correlation processing unit 303 is configured to pool the object platform feature and the object group into a vector K through the self-attention layer i Object group encoding vector V i Performing correlation processing to obtain an attention coefficient W corresponding to the object group i i
the summation processing unit 304 is configured to, when the attention coefficients respectively corresponding to the N object groups are obtained, perform summation processing on the N attention coefficients through the self-attention layer to obtain the object group attention features of the service object for the N object groups.
The specific implementation manner of the platform feature input unit 301, the vector obtaining unit 302, the correlation processing unit 303, and the summation processing unit 304 may be referred to the description of step S103 in the embodiment corresponding to fig. 3, and the description thereof will not be repeated here.
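By way of illustration only, the following minimal Python sketch shows one way the self-attention step described above could be realized. It is not part of the claimed subject matter: the function name, the use of dot-product correlation, and the softmax normalization of the attention coefficients are assumptions, since the embodiment only specifies that the platform feature is correlated with each group's pooling vector Kᵢ and encoding vector Vᵢ and that the results are summed.

```python
import numpy as np

def object_group_attention(platform_feat, group_pool_vectors, group_code_vectors):
    """Hypothetical sketch of the self-attention layer: correlate the object
    platform feature with each object group's pooling vector K_i, weight the
    group's encoding vector V_i by the resulting attention coefficient, and
    sum over the N object groups."""
    # Correlation of the platform feature with each K_i (dot product assumed)
    logits = group_pool_vectors @ platform_feat          # shape (N,)
    # Softmax normalization of the attention coefficients (an assumption)
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # Summation over the weighted encoding vectors V_i yields the
    # object group attention feature of the business object
    return weights @ group_code_vectors                  # shape (d,)

# Usage with N=8 object groups and 64-dimensional features
platform_feat = np.random.randn(64)
K = np.random.randn(8, 64)   # object group pooling vectors
V = np.random.randn(8, 64)   # object group encoding vectors
attn_feat = object_group_attention(platform_feat, K, V)
```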
The index parameter prediction module 40 is configured to predict, in the first service model, a service index parameter of the service object for the candidate service data based on the candidate data feature, the object platform feature, and the object group attention feature of the candidate service data.
Wherein, the index parameter prediction module 40 includes: the stitching processing unit 401, the candidate data feature acquisition unit 402, and the index parameter prediction unit 403.
The stitching processing unit 401 is configured to perform stitching processing on the object platform feature and the object group attention feature through a feature stitching layer in the first service model, so as to obtain a service stitching feature corresponding to the service object;
the candidate data feature obtaining unit 402 is configured to obtain candidate data features obtained after feature extraction is performed on candidate service data;
the index parameter prediction unit 403 is configured to input the service stitching feature and the candidate data feature to a multi-layer perceptron in the first service model, perform feature extraction processing on the service stitching feature and the candidate data feature through the multi-layer perceptron, and predict a service index parameter of the service object for the candidate service data.
The specific implementation manner of the stitching processing unit 401, the candidate data feature obtaining unit 402, and the index parameter predicting unit 403 may refer to the description of step S104 in the embodiment corresponding to fig. 3, and the detailed description will not be repeated here.
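Purely as an illustration of the splicing-plus-perceptron step, a minimal PyTorch sketch follows. The class name, the assumption that all three features share one dimensionality, and the sigmoid output (treating the business index parameter as a probability such as a conversion rate) are not specified by the embodiment.

```python
import torch
import torch.nn as nn

class IndexParameterHead(nn.Module):
    """Hypothetical feature splicing layer plus multi-layer perceptron that
    predicts the business index parameter for one candidate service data."""
    def __init__(self, feat_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 * feat_dim, hidden_dim),  # spliced feature goes in
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
            nn.Sigmoid(),  # assumes the index parameter lies in (0, 1)
        )

    def forward(self, platform_feat, attn_feat, candidate_feat):
        # Splicing = concatenation along the feature dimension
        spliced = torch.cat([platform_feat, attn_feat, candidate_feat], dim=-1)
        return self.mlp(spliced)
```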
The object platform features are acquired by the first device when it obtains a traffic request for the business object; the traffic request of the business object belongs to X traffic requests sent by the second device; X is a positive integer; the X traffic requests are obtained by screening, when the second device generates the traffic requests respectively corresponding to Y initial objects, based on the initial platform features corresponding to the Y initial objects and a third service model deployed on the second device; Y is a positive integer greater than or equal to X; the initial platform feature of an initial object is obtained by the second device by performing feature extraction on the object information of the initial object on the second device based on the second service model.
The number of the candidate business data is M, and M is a positive integer;
the decision parameter determining module 50 is configured to, when acquiring, based on the first service model, service index parameters of a service object for each candidate service data of the M candidate service data, take the M service index parameters as M decision parameters;
the decision parameter returning module 60 is configured to return the M decision parameters to the second device, so that the second device screens, from the M candidate service data, the service data to be displayed that is to be sent to the service object; the service data to be displayed are the P top-ranked candidate service data obtained based on the data sorting result; the data sorting result is obtained by the second device after sorting the M candidate service data based on the ranking parameters respectively corresponding to the M candidate service data; the ranking parameter of each candidate service data is determined based on the decision parameter of that candidate service data; P is a positive integer less than or equal to M.
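As a sketch of the screening performed on the second device, the ranking and top-P selection might look as follows; here the ranking parameter is taken to be the decision parameter itself, which is an assumption (a real platform could fold in bids or quality scores).

```python
def select_display_data(candidates, decision_params, p):
    """Hypothetical screening step: sort the M candidate service data by
    their ranking parameters and keep the top-P as data to be displayed."""
    ranked = sorted(zip(candidates, decision_params),
                    key=lambda pair: pair[1], reverse=True)
    return [cand for cand, _ in ranked[:p]]

# Usage: keep the top 2 of 4 candidates
top = select_display_data(["ad_a", "ad_b", "ad_c", "ad_d"],
                          [0.12, 0.48, 0.31, 0.05], p=2)
# -> ["ad_b", "ad_c"]
```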
The specific implementation manners of the platform feature obtaining module 10, the service parameter obtaining module 20, the object group feature determining module 30, the index parameter predicting module 40, the decision parameter determining module 50 and the decision parameter returning module 60 can be referred to the description of the step S101 to the step S104 in the embodiment corresponding to the above-mentioned fig. 3, and the detailed description thereof will not be repeated here. In addition, the description of the beneficial effects of the same method is omitted.
Further, referring to fig. 10, fig. 10 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application. The data processing apparatus 2 may be a computer program (including program code) running in a computer device; for example, the data processing apparatus 2 is application software. The data processing apparatus 2 may be used to perform the corresponding steps of the method provided by the embodiments of the present application. As shown in fig. 10, the data processing apparatus 2 may run on a first device, where the first device is a computer device deployed with a first service model; the first device may be the server 11F in the embodiment corresponding to fig. 1, or may be any terminal device running the first service model in the terminal device cluster of the embodiment corresponding to fig. 1, for example the terminal device 100a. The data processing apparatus 2 may include: an initial parameter acquisition module 100, a sample group feature determination module 200, a prediction module 300, a loss determination module 400, a training module 500, a second identification set acquisition module 600, a first identification set acquisition module 700, an alignment processing module 800, and an identification intersection return module 900.
The initial parameter obtaining module 100 is configured to obtain, based on first sample information of the sample object on the first device and a first initial model, sample group coding features associated with N sample groups and initial model parameters associated with the N sample groups; N is the number of object groups configured for the first initial model, and N is a positive integer greater than 1; the sample label of the first sample information is used to characterize the actual index parameters of the sample object for the sample service data.
Wherein the number of sample objects is H, H is a positive integer greater than or equal to N; the first sample information comprises first sample sub-information of H sample objects respectively on the first device;
the initial parameter acquisition module 100 includes: a grouping unit 1010, a sample group encoding vector determining unit 1020, a sample group encoding feature determining unit 1030, and an initial parameter determining unit 1040.
The grouping unit 1010 is configured to obtain the number of object groups configured for the first initial model, and group the H sample objects based on the number of object groups to obtain N sample groups; the sample group includes sample group j; j is a positive integer less than or equal to N;
the sample group encoding vector determining unit 1020 is configured to obtain the first sample sub-information associated with each sample object in the sample group j from the H pieces of first sample sub-information, and perform feature extraction processing on the obtained first sample sub-information to obtain a sample group encoding vector Vⱼ corresponding to the sample group j;
The sample group coding feature determining unit 1030 is configured to obtain sample group coding features associated with N sample groups when sample group coding vectors corresponding to each of the N sample groups are acquired;
The initial parameter determining unit 1040 is configured to determine initial model parameters associated with the N sample groups based on the sample group encoding features.
The specific implementation manners of the grouping unit 1010, the sample group coding vector determining unit 1020, the sample group coding feature determining unit 1030 and the initial parameter determining unit 1040 can be referred to the description of step S205 in the embodiment corresponding to fig. 5, and the detailed description will not be repeated here.
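For concreteness, a minimal sketch of the grouping and pooling steps is given below. The embodiment does not fix a grouping strategy, so the use of k-means clustering, and mean pooling to derive each sample group encoding vector Vⱼ, are assumptions; sample_sub_info is taken to be an (H, d) feature matrix built from the H pieces of first sample sub-information.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_sample_groups(sample_sub_info: np.ndarray, n_groups: int):
    """Hypothetical grouping of H sample objects into N sample groups,
    deriving one sample group encoding vector V_j per group by mean-pooling
    the first sample sub-information of its members."""
    labels = KMeans(n_clusters=n_groups, n_init=10).fit_predict(sample_sub_info)
    group_code_vectors = np.stack([
        sample_sub_info[labels == j].mean(axis=0)  # pool members of group j
        for j in range(n_groups)
    ])
    return labels, group_code_vectors  # shapes (H,) and (N, d)
```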
The sample group feature determining module 200 is configured to perform self-attention feature extraction processing on sample platform features, sample group coding features and initial model parameters of a sample object in a first initial model, so as to obtain sample group attention features of the sample object for N sample groups; the sample platform features are obtained by extracting features of second sample information of the sample object on the second device based on a second initial model by the second device;
the prediction module 300 is configured to predict, in a first initial model, a prediction index parameter of a sample object for sample service data based on a sample data feature, a sample platform feature, and a sample group attention feature of the sample service data;
the loss determination module 400 is configured to determine a model total loss corresponding to the first initial model based on the predictor parameter, the actual index parameter, the sample set encoding feature, and the sample set attention feature; the model total loss is used for indicating the second equipment to perform federal learning training on the second initial model to obtain a second business model.
Wherein the loss determination module 400 includes: a first loss determination unit 4010, a second loss determination unit 4020, and a total loss determination unit 4030.
The first loss determining unit 4010 is configured to determine a first model loss of the first initial model based on the sample group encoding feature and the sample group attention feature;
the second loss determining unit 4020 is configured to determine a second model loss of the second initial model based on the prediction index parameter and the actual index parameter;
the total loss determining unit 4030 is configured to perform superposition processing on the first model loss and the second model loss, to obtain a model total loss corresponding to the first initial model.
The specific implementation manner of the first loss determining unit 4010, the second loss determining unit 4020 and the total loss determining unit 4030 may be referred to the description of step S208 in the embodiment corresponding to fig. 5, and the detailed description thereof will not be repeated here.
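To make the superposition of the two losses concrete, a hedged PyTorch sketch follows. The embodiment does not name the individual loss functions, so binary cross-entropy for the second model loss, and a mean-squared term tying the sample group attention feature to the pooled sample group encoding features for the first model loss, are assumptions.

```python
import torch.nn.functional as F

def model_total_loss(pred, actual, group_code_feat, group_attn_feat):
    """Hypothetical model total loss: second model loss on predicted vs.
    actual index parameters, plus a first model loss relating the sample
    group attention feature to the sample group encoding features."""
    second_loss = F.binary_cross_entropy(pred, actual)
    first_loss = F.mse_loss(group_attn_feat, group_code_feat.mean(dim=0))
    return first_loss + second_loss  # superposition = simple sum (assumed)
```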
The training module 500 is configured to perform federal learning training on the first initial model based on the total model loss and the attention characteristics of the sample set, to obtain a first business model; the first business model is used for predicting business index parameters of the business object.
Wherein, this training module 500 includes: training result determining unit 5010, parameter adjusting unit 5020 and business model determining unit 5030.
The training result determining unit 5010 is configured to perform federal learning training on the first initial model based on the total model loss to obtain a model training result;
the parameter adjusting unit 5020 is configured to adjust initial model parameters based on the attention characteristics of the sample set and the total model loss if the model training result indicates that the trained first initial model does not meet the model convergence condition associated with the first initial model;
the service model determining unit 5030 is configured to take the adjusted first initial model as a transition model, perform federal learning training on the transition model, and, when the transition model after federal learning training satisfies the model convergence condition, take the transition model satisfying the model convergence condition as the first service model.
The specific implementation manner of the training result determining unit 5010, the parameter adjusting unit 5020, and the service model determining unit 5030 may be referred to the description of step S209 in the embodiment corresponding to fig. 5, and the detailed description will not be repeated here.
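A minimal sketch of the convergence-driven training loop is shown below. The convergence condition (a sufficiently small change in the epoch loss) and the helper model.total_loss are assumptions introduced for illustration; the federal learning exchange with the second device is abstracted away.

```python
def federated_train(model, optimizer, data_loader, max_rounds=100, tol=1e-4):
    """Hypothetical training loop: train the first initial model, test the
    model convergence condition, and keep training the adjusted model as a
    transition model until the condition is satisfied."""
    prev_loss = float("inf")
    for _ in range(max_rounds):
        epoch_loss = 0.0
        for batch in data_loader:
            optimizer.zero_grad()
            loss = model.total_loss(batch)  # hypothetical helper
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
        if abs(prev_loss - epoch_loss) < tol:  # model convergence condition
            return model  # transition model becomes the first business model
        prev_loss = epoch_loss  # otherwise keep training the adjusted model
    return model
```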
The second identifier set obtaining module 600 is configured to obtain, when a sample alignment request sent by the second device is obtained, a second object identifier set associated with the second device based on the sample alignment request; the object identifications in the second set of object identifications are object identifications of historical objects recorded on the second device.
Wherein the second identification set obtaining module 600 includes: an encrypted information acquisition unit 6010, a decryption processing unit 6020, a signature verification processing unit 6030, and an identification set determination unit 6040.
The encryption information acquiring unit 6010 is configured to acquire, when acquiring a sample alignment request sent by the second device, identification encryption information in the sample alignment request; the identification encryption information is obtained by the second equipment after encryption processing is carried out on the identification signature information and the platform object identification set based on the first public key of the first equipment; the identification signature information is obtained after the second equipment signs the platform object identification set based on a second private key of the second equipment;
the decryption processing unit 6020 is configured to decrypt the identifier encryption information based on the first private key of the first device, to obtain identifier signature information and a platform object identifier set;
the signature verification processing unit 6030 is configured to obtain a second public key of the second device, and perform signature verification processing on the identification signature information based on the second public key to obtain a signature verification result;
the identification set determining unit 6040 is configured to use the platform object identification set as a second object identification set associated with the second device when the signature verification result indicates that the signature verification is successful.
The specific implementation manner of the encryption information obtaining unit 6010, the decryption processing unit 6020, the signature verification processing unit 6030 and the identification set determining unit 6040 may refer to the description of the second object identification set in the embodiment corresponding to fig. 6, and the detailed description will not be repeated here.
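As an illustration of the signature verification step on the first device, a sketch using the Python cryptography package with RSA-PSS follows; the concrete algorithms are assumptions, since the embodiment only requires sign-then-encrypt semantics. Decryption of the identification encryption information with the first private key is abstracted away here, and note that encrypting a large identifier set directly with RSA would not fit, so a real system would presumably use hybrid encryption.

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding
from cryptography.exceptions import InvalidSignature

def verify_platform_id_set(payload: bytes, signature: bytes, second_public_key) -> bool:
    """Hypothetical signature verification: after decrypting the sample
    alignment request with the first device's private key, check the second
    device's signature over the serialized platform object identification set."""
    try:
        second_public_key.verify(
            signature,
            payload,
            padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                        salt_length=padding.PSS.MAX_LENGTH),
            hashes.SHA256(),
        )
        return True   # signature verification succeeded
    except InvalidSignature:
        return False  # reject the platform object identification set
```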
The first identifier set obtaining module 700 is configured to obtain a first object identifier set associated with a first device based on object identifiers of historical objects recorded on the first device;
the alignment processing module 800 is configured to perform alignment processing on the first object identification set and the second object identification set to obtain an identification intersection set, and use a history object corresponding to the identification object in the identification intersection set as a sample object having an intersection relationship with the second device; the identification objects in the identification intersection set belong to a first object identification set and a second object identification set;
the identification intersection return module 900 is configured to return the identification intersection set to the second device, so that the second device determines the sample platform features of the sample objects in the identification intersection set.
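The alignment itself reduces to a set intersection over object identifiers, sketched below in plain Python for clarity; in a deployment one would expect a private set intersection protocol rather than an exchange in the clear.

```python
def align_sample_ids(first_ids, second_ids):
    """Hypothetical alignment step: the identification intersection set is
    the set of object identifiers recorded on both devices; the matching
    historical objects become the sample objects."""
    return sorted(set(first_ids) & set(second_ids))

# Usage
intersection = align_sample_ids({"u1", "u2", "u3"}, {"u2", "u3", "u4"})
# -> ["u2", "u3"]
```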
The specific implementation manner of the initial parameter obtaining module 100, the sample group feature determining module 200, the predicting module 300, the loss determining module 400, the training module 500, the second identifier set obtaining module 600, the first identifier set obtaining module 700, the alignment processing module 800 and the identifier intersection returning module 900 may be referred to the description of step S201 to step S209 in the embodiment corresponding to fig. 5, and will not be further described herein. In addition, the description of the beneficial effects of the same method is omitted.
Further, referring to fig. 11, fig. 11 is a schematic diagram of a computer device according to an embodiment of the present application. As shown in fig. 11, the computer device 1000 may be a first device, where the first device is a computer device deployed with a first service model; the first device may be the server 11F in the embodiment corresponding to fig. 1, or may be any terminal device running the first service model in the terminal device cluster of the embodiment corresponding to fig. 1, for example the terminal device 100a. The computer device 1000 may include: at least one processor 1001 (e.g., a CPU), at least one network interface 1004, a memory 1005, and at least one communication bus 1002. The communication bus 1002 is used to enable connection communication among these components. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (non-volatile memory), such as at least one disk memory. The memory 1005 may also optionally be at least one storage device located remotely from the aforementioned processor 1001. As shown in fig. 11, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a device control application. In some embodiments, for example if the computer device is a terminal device deployed with the first service model as shown in fig. 1 (for example, the terminal device 100a), the computer device may further include the user interface 1003 shown in fig. 11, where the user interface 1003 may include a display screen (Display), a keyboard (Keyboard), and so on.
In the computer device 1000 shown in fig. 11, the network interface 1004 is mainly used for network communication; the user interface 1003 is mainly used to provide an input interface for the user; and the processor 1001 may be used to invoke the device control application stored in the memory 1005.
It should be understood that the computer device 1000 described in the embodiments of the present application may perform the description of the data processing method in the embodiments corresponding to fig. 3 and 5, and may also perform the description of the data processing apparatus 1 in the embodiments corresponding to fig. 9 or the description of the data processing apparatus 2 in the embodiments corresponding to fig. 10, which are not repeated herein. In addition, the description of the beneficial effects of the same method is omitted.
Furthermore, it should be noted here that: the embodiments of the present application further provide a computer-readable storage medium, in which the aforementioned computer program executed by the data processing apparatus 1 or the data processing apparatus 2 is stored, the computer program including program instructions which, when executed by a processor, can perform the data processing method described in the embodiments corresponding to fig. 3 or fig. 5; this will therefore not be repeated here. In addition, the description of the beneficial effects of the same method is omitted. For technical details not disclosed in the computer-readable storage medium embodiments of the present application, please refer to the description of the method embodiments of the present application. As an example, the program instructions may be deployed to be executed on one computing device, on multiple computing devices located at one site, or on multiple computing devices distributed across multiple sites and interconnected by a communication network, where the multiple computing devices distributed across multiple sites and interconnected by a communication network may constitute a blockchain system.
The embodiment of the present application further provides a computer-readable storage medium storing a computer program, where the computer program includes program instructions which, when executed by a processor, implement the data processing method provided by the steps in fig. 3 and fig. 5; for details, refer to the implementations provided by the steps in fig. 3 and fig. 5, which are not described herein again.
Further, referring to fig. 12, fig. 12 is a schematic structural diagram of a data processing system according to an embodiment of the present application. The data processing system 3 may comprise a data processing apparatus 1a and a data processing apparatus 2a. The data processing apparatus 1a may be the data processing apparatus 1 in the embodiment corresponding to fig. 9; it is to be understood that the data processing apparatus 1a may be integrated in the first device, where the first device is a computer device deployed with a first service model, and the first device may be the server 11F in the embodiment corresponding to fig. 1, or may be any terminal device running the first service model in the terminal device cluster of the embodiment corresponding to fig. 1, for example the terminal device 100a; this will not be repeated here. The data processing apparatus 2a may be the data processing apparatus 2 in the embodiment corresponding to fig. 10, and it is understood that the data processing apparatus 2a may also be integrated in the first device; a detailed description is therefore not provided here. In addition, the description of the beneficial effects of the same method is omitted. For technical details not disclosed in the data processing system embodiments of the present application, please refer to the description of the method embodiments of the present application.
Those skilled in the art will appreciate that implementing all or part of the above-described methods may be accomplished by way of computer programs, which may be stored on a computer-readable storage medium, and which, when executed, may comprise the steps of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), or the like.
The foregoing disclosure is merely illustrative of preferred embodiments of the present application and is not intended to limit the scope of the claims; equivalent variations made in accordance with the claims of the present application still fall within the scope covered by the present application.

Claims (16)

1. A method of data processing, the method performed by a first device, comprising:
acquiring object platform characteristics of a business object; the object platform features are obtained by extracting features of object information of the business object on the second device based on a second business model by the second device;
acquiring service model parameters associated with N object groups and object group coding features associated with the N object groups based on a first service model; n is a positive integer greater than 1; the first service model and the second service model are obtained after federal learning is carried out on a sample object;
performing self-attention feature extraction processing on the service model parameters, the object group coding features and the object platform features in the first service model to obtain object group attention features of the service object for the N object groups;
predicting a business index parameter of the business object for the candidate business data based on the candidate data characteristic, the object platform characteristic and the object group attention characteristic of the candidate business data in the first business model.
2. The method of claim 1, wherein the object platform feature is obtained by the first device upon obtaining a traffic request for the business object; the traffic request of the business object belongs to X traffic requests sent by the second device; X is a positive integer; the X traffic requests are obtained by screening, when the second device generates the traffic requests respectively corresponding to Y initial objects, based on the initial platform features corresponding to the Y initial objects and a third service model deployed on the second device; Y is a positive integer greater than or equal to X; the initial platform feature of an initial object is obtained by the second device by performing feature extraction on the object information of the initial object on the second device based on the second service model.
3. The method of claim 1, wherein the business model parameters comprise respective object group pooling vectors for the N object groups; the object group coding feature comprises object group coding vectors corresponding to the N object groups respectively; the N object groups comprise an object group i; i is a positive integer less than or equal to N;
the self-attention feature extraction processing is performed on the service model parameters, the object group coding features and the object platform features in the first service model to obtain object group attention features of the service object for the N object groups, including:
inputting the object platform features to a self-attention layer in the first business model;
obtaining an object group pooling vector Kᵢ of the object group i from the N object group pooling vectors, and obtaining an object group coding vector Vᵢ of the object group i from the N object group coding vectors;
performing, through the self-attention layer, correlation processing on the object platform feature, the object group pooling vector Kᵢ and the object group coding vector Vᵢ, to obtain an attention coefficient Wᵢ corresponding to the object group i;
and when the attention coefficients respectively corresponding to the N object groups are obtained, performing summation processing on the N attention coefficients through the self-attention layer to obtain the object group attention features of the business object for the N object groups.
4. The method of claim 1, wherein predicting, in the first business model, business metric parameters of the business object for candidate business data based on candidate data features of the candidate business data, the object platform characteristics, and the object group attention features, comprises:
splicing the object platform features and the object group attention features through a feature splicing layer in the first service model to obtain service splicing features corresponding to the service objects;
acquiring candidate data characteristics obtained after extracting the characteristics of the candidate service data;
and inputting the service splicing characteristics and the candidate data characteristics into a multi-layer perceptron in the first service model, performing characteristic extraction processing on the service splicing characteristics and the candidate data characteristics through the multi-layer perceptron, and predicting service index parameters of the service object aiming at the candidate service data.
5. The method of claim 1, wherein the number of candidate service data is M, M being a positive integer;
the method further comprises the steps of:
when the service index parameters of the service object aiming at each candidate service data in the M candidate service data are acquired based on the first service model, the M service index parameters are used as M decision parameters;
returning the M decision parameters to the second device, so that the second device screens, from the M candidate service data, the service data to be displayed that is to be sent to the service object; the service data to be displayed are the P top-ranked candidate service data obtained based on a data sorting result; the data sorting result is obtained by the second device after sorting the M candidate service data based on the ranking parameters respectively corresponding to the M candidate service data; the ranking parameter of each candidate service data is determined based on the decision parameter of that candidate service data; P is a positive integer less than or equal to M.
6. A method of data processing, the method performed by a first device, comprising:
based on first sample information of a sample object on the first device and a first initial model, acquiring sample group coding features associated with N sample groups and initial model parameters associated with the N sample groups; N is the number of object groups configured for the first initial model, and N is a positive integer greater than 1; the sample label of the first sample information is used for representing the actual index parameter of the sample object for the sample service data;
In the first initial model, performing self-attention feature extraction processing on sample platform features, sample group coding features and initial model parameters of the sample object to obtain sample group attention features of the sample object for N sample groups; the sample platform features are obtained by extracting features of second sample information of the sample object on the second device based on a second initial model by the second device;
predicting a predictor parameter of the sample object for sample business data in the first initial model based on sample data features of the sample business data, the sample platform features and the sample group attention features;
determining a model total loss corresponding to the first initial model based on the predictor parameter, the actual index parameter, the sample set coding feature and the sample set attention feature; the model total loss is used for indicating the second equipment to perform federal learning training on the second initial model to obtain a second service model;
based on the model total loss and the sample group attention characteristics, performing federal learning training on the first initial model to obtain a first business model; the first business model is used for predicting business index parameters of the business object.
7. The method of claim 6, wherein the method further comprises:
when a sample alignment request sent by the second device is obtained, obtaining a second object identification set associated with the second device based on the sample alignment request; the object identifiers in the second object identifier set are object identifiers of history objects recorded on the second device;
acquiring a first object identification set associated with the first device based on object identifications of historical objects recorded on the first device;
performing alignment processing on the first object identification set and the second object identification set to obtain an identification intersection set, and taking a historical object corresponding to an identification object in the identification intersection set as a sample object with an intersection relation with the second device; the identification objects in the identification intersection set belong to a first object identification set and a second object identification set;
returning the identification intersection set to the second device, so that the second device determines sample platform features of the sample objects in the identification intersection set.
8. The method of claim 7, wherein upon obtaining the sample alignment request sent by the second device, obtaining a second set of object identifiers associated with the second device based on the sample alignment request, comprises:
When a sample alignment request sent by the second equipment is obtained, obtaining identification encryption information in the sample alignment request; the identification encryption information is obtained by the second equipment after encryption processing of the identification signature information and the platform object identification set based on the first public key of the first equipment; the identification signature information is obtained after the second equipment signs the platform object identification set based on a second private key of the second equipment;
decrypting the identification encryption information based on a first private key of the first device to obtain the identification signature information and the platform object identification set;
acquiring a second public key of the second equipment, and performing signature verification processing on the identification signature information based on the second public key to obtain a signature verification result;
and when the signature verification result indicates that the signature verification is successful, the platform object identification set is used as a second object identification set associated with the second device.
9. The method of claim 6, wherein the number of sample objects is H, H being a positive integer greater than or equal to N; the first sample information comprises first sample sub-information of H sample objects on the first device respectively;
The obtaining sample group coding features associated with the N sample groups and initial model parameters associated with the N sample groups based on the first sample information and the first initial model of the sample object on the first device includes:
acquiring the number of object groups configured for a first initial model, and grouping H sample objects based on the number of object groups to obtain N sample groups; the sample group includes a sample group j; j is a positive integer less than or equal to N;
acquiring first sample sub-information associated with each sample object in the sample group j from the H pieces of first sample sub-information, and performing feature extraction processing on the acquired first sample sub-information to obtain a sample group coding vector Vⱼ corresponding to the sample group j;
When a sample group coding vector corresponding to each sample group in N sample groups is obtained, sample group coding characteristics associated with the N sample groups are obtained;
initial model parameters associated with the N sample groups are determined based on the sample group encoding features.
10. The method of claim 6, wherein the determining the model total loss corresponding to the first initial model based on the predictor parameter, the actual index parameter, the sample set encoding feature, and the sample set attention feature comprises:
Determining a first model loss for the first initial model based on the sample set encoding features and the sample set attention features;
determining a second model loss for the second initial model based on the predictor parameters and the actual index parameters;
and carrying out superposition processing on the first model loss and the second model loss to obtain a model total loss corresponding to the first initial model.
11. The method of claim 6, wherein the federally learning training the first initial model based on the model total loss and the sample set of attention features to obtain a first business model comprises:
performing federal learning training on the first initial model based on the model total loss to obtain a model training result;
if the model training result indicates that the trained first initial model does not meet the model convergence condition associated with the first initial model, adjusting the initial model parameters based on the sample group attention features and the model total loss;
and taking the adjusted first initial model as a transition model, performing federal learning training on the transition model, and taking the transition model meeting the model convergence condition as a first service model when the transition model after federal learning training meets the model convergence condition.
12. A data processing apparatus, comprising:
the platform characteristic acquisition module is used for acquiring object platform characteristics of the business object; the object platform features are obtained by extracting features of object information of the business object on the second device based on a second business model by the second device;
the service parameter acquisition module is used for acquiring service model parameters associated with N object groups and object group coding features associated with the N object groups based on the first service model; n is a positive integer greater than 1; the first service model and the second service model are obtained after federal learning is carried out on a sample object;
the object group feature determining module is used for carrying out self-attention feature extraction processing on the service model parameters, the object group coding features and the object platform features in the first service model to obtain object group attention features of the service object for the N object groups;
and the index parameter prediction module is used for predicting the service index parameters of the service object aiming at the candidate service data in the first service model based on the candidate data characteristics of the candidate service data, the object platform characteristics and the object group attention characteristics.
13. A data processing apparatus, comprising:
an initial parameter acquisition module for acquiring sample group coding features associated with N sample groups and initial model parameters associated with the N sample groups based on first sample information and a first initial model of a sample object on a first device; n is the number of object groups configured by the first initial model, and N is a positive integer greater than 1; the sample label of the first sample information is used for representing the actual index parameter of the sample object aiming at the sample service data;
the sample group feature determining module is used for carrying out self-attention feature extraction processing on the sample platform features, the sample group coding features and the initial model parameters of the sample object in the first initial model to obtain sample group attention features of the sample object for N sample groups; the sample platform features are obtained by extracting features of second sample information of the sample object on the second device based on a second initial model by the second device;
a prediction module, configured to predict, in the first initial model, a predictor parameter of the sample object for the sample service data based on a sample data feature of the sample service data, the sample platform feature, and the sample group attention feature;
The loss determination module is used for determining a model total loss corresponding to the first initial model based on the prediction index parameter, the actual index parameter, the sample group coding feature and the sample group attention feature; the model total loss is used for indicating the second equipment to perform federal learning training on the second initial model to obtain a second service model;
the training module is used for performing federal learning training on the first initial model based on the total model loss and the attention characteristics of the sample group to obtain a first business model; the first business model is used for predicting business index parameters of the business object.
14. A computer device, comprising: a processor and a memory and a network interface;
the processor is connected to the memory and the network interface, wherein the network interface is configured to provide a data communication function, the memory is configured to store a computer program, and the processor is configured to invoke the computer program to cause the computer device to perform the method of any of claims 1 to 11.
15. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program adapted to be loaded and executed by a processor to cause a computer device having the processor to perform the method of any of claims 1 to 11.
16. A computer program product, characterized in that it comprises a computer program stored in a computer readable storage medium, which computer program is adapted to be read and executed by a processor to cause a computer device with the processor to perform the method of any one of claims 1 to 11.
CN202210981370.2A 2022-08-16 2022-08-16 Data processing method, device, computer equipment and storage medium Pending CN117648702A (en)

Priority Applications (1)

Application Number: CN202210981370.2A; Priority Date: 2022-08-16; Filing Date: 2022-08-16; Title: Data processing method, device, computer equipment and storage medium

Publications (1)

Publication Number: CN117648702A; Publication Date: 2024-03-05

Family ID: 90046557

Country Status (1)

Country: CN (1); Link: CN117648702A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination