CN116089715A - Sequence recommendation method based on personalized federal technology


Info

Publication number
CN116089715A
CN116089715A
Authority
CN
China
Prior art keywords
sequence
client
local
data
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310023696.9A
Other languages
Chinese (zh)
Inventor
刘柏嵩
董倩
王冰源
邵晓雯
徐尔聪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ningbo University
Original Assignee
Ningbo University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ningbo University filed Critical Ningbo University
Priority to CN202310023696.9A
Publication of CN116089715A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules, to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The invention discloses a sequence recommendation method based on personalized federated learning, which comprises the following steps: S1, each client preprocesses its own interaction data and the attribute values of the interacted items on its local device; S2, the client builds a hash index on the local device from its own interaction data; S3, the client performs an enhancement operation on the training data on the local device based on a Bayesian training strategy; S4, combining the hash index with the enhanced data, the client constructs a multi-task local sequence recommendation framework on the local device and trains a local sequence model in cooperation with the central server until the local sequence model converges; S5, the client obtains the parameters of the user embedding network and combines them with the output of the converged local sequence model to obtain the preference prediction for the next moment, completing the recommendation. The invention can effectively model users' dynamic preferences on the premise of privacy protection, and offers scalability, portability and privacy protection.

Description

Sequence recommendation method based on personalized federated learning
Technical Field
The invention relates to the technical field of computers, and in particular to a sequence recommendation method based on personalized federated learning.
Background
The main task of a sequence recommendation system is to mine users' behavior patterns from behavior sequences that carry a temporal ordering, so as to model clients' dynamic preferences and predict which items a client will select at the next moment. Because collected user data may well be used for malicious transactions and the like, many users worry that their privacy will be compromised and are reluctant to share their data. This easily causes serious problems such as "data islands" and "recommendation barriers". Privacy-preserving sequence recommendation systems have therefore received great attention and considerable research in both industry and academia.
Current privacy-preserving sequence recommendation methods mainly protect data by introducing cryptographic techniques, such as homomorphic encryption algorithms and differential privacy. In recent research, some researchers have used federated learning to protect the privacy of sequence recommendation systems. Although a federated architecture can keep user data from leaving the device, such distributed data and architecture reduce the effectiveness of the sequence recommendation model. Assuming that clients sharing certain attributes, which can be grouped into one class, are referred to as a domain, the problem manifests itself in particular as: (1) inter-domain model imbalance: a single global model cannot accommodate the sequence features of all domains, because individual domains have different attributes, e.g. preference biases caused by geographical factors; (2) intra-domain model imbalance: the amount of data held by different clients varies; clients with less data cannot effectively support complex sequence models, and model drift easily arises during the aggregation operation on the central server. A new privacy-preserving sequence recommendation method therefore needs to be developed to solve these problems of current sequence recommendation methods.
Disclosure of Invention
The invention aims to provide a sequence recommendation method based on personalized federated learning. The invention can solve the inter-domain and intra-domain model imbalance caused by distributed data and clients, can effectively model users' dynamic preferences on the premise of privacy protection, and offers scalability, portability and privacy protection.
The technical scheme of the invention is as follows: a sequence recommendation method based on personalized federated learning comprises the following steps:
Step S1: each client keeps its held data on its local device and preprocesses its own interaction data and the attribute values of the interacted items, so as to remove noise items and outliers from the data; meanwhile, the data structures of all clients are aligned within the distributed framework.
Step S2: the client builds a hash index on the local device from its own interaction data, forming a hash storage table; after the hash index is constructed, the client uploads it to the central server.
Step S3: the client performs an enhancement operation on the training data on the local device based on a Bayesian training strategy to obtain enhanced data, which is used for self-supervised learning of the local sequence model to strengthen its representation capability.
Step S4: combining the hash index constructed in step S2 with the enhanced data obtained in step S3, the client constructs a multi-task local sequence recommendation framework on the local device (the framework can effectively integrate a general-purpose sequence encoder to form the local sequence model), and trains the local sequence model in a distributed manner in cooperation with the central server until the local sequence model converges.
First, the client initializes and pre-trains the local sequence model; second, the local sequence model is uploaded to the central server, which performs a personalized aggregation operation based on locality-sensitive hashing; then the central server sends a specific aggregated model to the client, and after receiving it the client carries out the next round of model training, until the local sequence model converges.
Step S5: the client obtains the parameters of the user embedding network and, combined with the output of the converged local sequence model, obtains the preference prediction for the next moment, completing the personalized recommendation for the client.
The prediction results exist only on the client device; the central server never touches any training data source or recommendation result, and no communication takes place between clients.
In the foregoing sequence recommendation method based on personalized federated learning, the overall preprocessing goal in step S1 is to sort and record the client's access to all items, with the access status represented by a vector matrix.
Preprocessing specifically comprises cleaning outliers and missing values from the interaction data (ensuring data usability), archiving the item attributes of the items accessed by the client (e.g. item categories), and archiving the client's own user attributes (e.g. region, age, occupation and hobbies); a data-structure alignment operation is performed on the item attributes and user attributes across all clients, and the aligned item attributes and user attributes are represented by a vector matrix.
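The aligned vector-matrix representation described above can be sketched as follows; this is a minimal illustration only, and `interaction_matrix` and its arguments are hypothetical names, not part of the patent:

```python
def interaction_matrix(access_log, n_items):
    # Step S1 sketch: record each client's access to all items as an
    # aligned 0/1 row vector, per the vector-matrix representation above.
    # access_log maps a client id to the set of item indices it accessed.
    return {u: [1 if i in items else 0 for i in range(n_items)]
            for u, items in access_log.items()}
```

Because every row has the same length `n_items`, the rows of different clients are structurally aligned, which is what the distributed framework requires before hashing and training.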
In the foregoing sequence recommendation method based on personalized federated learning, step S2 comprises the following sub-steps:
sub-step S2.1, the client converts its historical interaction data into a binary feature vector, downloads hash-related data from the central server, and constructs a family of hash functions on the local device from the downloaded data;
sub-step S2.2, the binary feature vector is combined with the hash function family to generate the client's specific hash index, which is uploaded to the central server;
sub-step S2.3, the central server receives the hash index of each client and constructs the hash storage table.
In the foregoing sequence recommendation method based on personalized federated learning, step S3 comprises the following sub-steps:
sub-step S3.1, constructing a subsequence set to generate training data, wherein each piece of training data comprises a positive and negative sample sequence pair, and the training data is used for Bayesian model optimization;
sub-step S3.2, constructing an enhanced positive sample based on the positive sample in the training data;
sub-step S3.3, constructing an enhanced negative sample based on the negative sample in the training data;
sub-step S3.4, pairing the enhanced positive samples and enhanced negative samples into the enhanced data used for self-supervised learning of the Bayesian model.
In the foregoing sequence recommendation method based on personalized federated learning, in sub-step S3.2 the enhanced positive sample is constructed according to the correlation between items and the length of the positive sample sequence, and the degree of correlation between items is determined by the rule for computing a triangle's area from its side lengths.
In the foregoing sequence recommendation method based on personalized federated learning, sub-step S3.3 constructs the enhanced negative sample according to the length of the negative sample sequence.
In the foregoing sequence recommendation method based on personalized federated learning, step S4 comprises the following sub-steps:
sub-step S4.1, constructing a sequence recommendation framework based on multi-task learning; the framework is scalable and portable, and consists of a user-attribute embedding network, a local contrastive learning mechanism, an item embedding network and a general-purpose sequence encoder; after a sequence encoder is selected as required, the local sequence model is formed;
sub-step S4.2, initializing and updating the local sequence model with the initialization parameters received from the central server, training the local sequence model locally with the training data and the enhanced data, and uploading the trained local sequence model to the central server to await delivery of the central server's personalized aggregated model;
sub-step S4.3, the central server receives the local sequence models from all clients, obtains each client's similar users by querying the global hash storage table, performs personalized aggregation over the local sequence models according to the query results so that each specific client corresponds to a specific aggregated model, and sends the aggregated model to the corresponding client;
sub-step S4.4, the client receives the aggregated model and keeps updating it through the next training rounds with the training data and the enhanced data until the local sequence model converges.
In the foregoing sequence recommendation method based on personalized federated learning, step S5 comprises the following sub-steps:
sub-step S5.1, extracting the parameters of the user-attribute embedding network as the feature representation of the user attributes, and performing an inner-product operation between this feature representation and the output vector of the local sequence model;
sub-step S5.2, taking the inner-product result as the recommendation prediction to predict the user's preference at the next moment; the prediction result is kept only on the client's local device and is not shared with the central server, protecting the user's privacy;
sub-step S5.3, the recommendation system maintains the global hash table online in real time; when a client exits the federated training framework, the recommendation system keeps no model delivery records about that client.
Compared with the prior art, the invention has the following beneficial effects: the personalization of the federation is embodied in the process by which the users and the central server cooperatively train the sequence encoder; the central server generates a specific model according to each client's data distribution characteristics, model training conditions, geographic location and so on, and common sequence encoders can be effectively integrated into the federated distributed framework without accessing users' private data. The federated sequence recommendation system based on the invention is therefore scalable, portable and privacy-preserving.
Specifically, the invention introduces locality-sensitive hashing to design a personalized federated aggregation strategy. This strategy alleviates the problem that a single global model cannot adapt to the sequence features of all domains. In addition, a data enhancement method based on Bayesian training is designed to improve the contrastive learning strategy, thereby strengthening the representation capability of the local sequence encoder. Moreover, clients with little training data can still effectively support complex sequence models and can effectively participate in the model training of other clients in the distributed scenario. By introducing this personalized, representation-enhanced federated sequence technique, the general sequence model can accommodate common encoders, users' dynamic preferences can be effectively modeled on the premise of privacy protection, predicting a user's item preference at the next moment does not consume large amounts of computing resources, and recommendation performance is improved.
Drawings
FIG. 1 is a flowchart of an implementation of an embodiment of the sequence recommendation method based on personalized federated learning;
FIG. 2 is a schematic diagram of the overall framework of the local multi-task sequence model and its distributed training;
FIG. 3 is a sample diagram of the data enhancement method involved in the sequence recommendation method based on personalized federated learning.
Detailed Description
The invention is further illustrated by the following figures and examples, which are not intended to be limiting.
Examples: in the invention, the data preprocessing operation performed by the client can be selected according to specific scene requirements. For example, in the POI recommendation scene, preprocessing may be selected for the distance attribute of the POI. An overall framework for personalized federal technology is shown in fig. 2. The framework diagram shows all details concerning the methods and the sequential relationship.
In this embodiment, a sequence recommendation method based on personalized federal technology includes steps S1 to S5 in fig. 1:
step S1: each client maintains the held data on the local equipment and preprocesses the self-interaction data and the attribute value of the interaction item;
preprocessing comprises the steps of cleaning abnormal values and missing values of interaction data, archiving item attributes of items accessed by a client, and archiving own user attributes; and performing data structure alignment operation on the item attributes and the user attributes on all clients, and representing the aligned item attributes and user attributes through a vector matrix.
Step S2: the client converts the historical interaction data into binary feature vectors, and constructs a group of hash function clusters according to the assistance instructions of the central server; and generating a specific hash index by combining the binary feature vector and the hash function family, and uploading the specific hash index to a central server to construct a global hash storage table. The implementation of the whole step S2 specifically comprises the following sub-steps S2.1-S2.3:
s2.1: assuming that there are n items, for a single user u in federal learning, all of its interaction data can be represented as an n-dimensional vector R u =(v 1 ,v 2 ,…,v i ,…,v n );v i Representing the interaction condition of the user u; if v i =0, representing that user u has no access to item i; conversely, v i =1 represents that user u accessed item i.
S2.2: client acquires a random vector q= (Q) 1 ,q 2 ,…q i ,…,q n ) Wherein q is i ∈[-1,1]The hash function defines the following equation (1):
Figure BDA0004043712520000071
wherein the symbols are
Figure BDA0004043712520000072
Representing the dot product operation between vectors.
Based on formula (1), a hash index is constructed as: dex u =[G 1 ,G 2 ,..,G i ,…,G K ]Each G i From { g 1 (R u ),g 2 (R u ),…,g r (R u ) Composition of G i Representing the hash value generated by a hash bucket, K such hash values constitute a hash index Dex u
S2.3: dex is to u Uploaded to a central server and become part of building a global hash table.
Step S3: the client generates an enhanced positive and negative sample pair through training the positive and negative sample pair, training data is used for Bayesian model optimization, and enhanced data is used for self-supervision learning of the Bayesian model; in the data enhancement process, correlation between items and the sequence length of training samples are fully considered, and an example of data enhancement is shown in fig. 3, in which the training sample length l is set to 5. The implementation of the whole step S3 specifically comprises the following sub-steps S3.1-S3.4:
s3.1: the client constructs a subsequence set to generate training samples for BPR optimization; e.g. when l=5, subsequence [ v ] 1 ,v 2 ,v 3 ,v 4 ,v 5 ]And [ v ] 1 ,v 2 ,v 3 ,v 4 ,v 6 ]Respectively, as a pair of positive and negative sequences, i.e. training samples, where v 5 And v 6 Respectively considered as a positive label and a negative label.
S3.2: enhancement of positive samples: with positive sample S p =[v 1 ,v 2 ,…,v i ,…,v lp ]For example, v lp Representing a positive label item; randomly select v i ∈S p Using related items
Figure BDA0004043712520000081
Replacement v i Generating an enhanced positive sample->
Figure BDA0004043712520000082
Wherein v is i ≠v lp
Figure BDA0004043712520000083
Reference v i Related items (associated items); v i And->
Figure BDA0004043712520000084
The degree of association between the two points of interest is determined by the geographic properties between the two points of interest; specifically, the present invention fully considers the user positive sample S p Medium v i Is a last access sequence v of (2) i-1 And the next access sequence v i+1 To calculate +.>
Figure BDA0004043712520000085
And v i Is a degree of association of (a); first, as shown in formula (2), v-based i A set of POIs within a geographic range is selected.
Figure BDA0004043712520000086
Wherein α represents a constant value representing a threshold value of a geographic distance; dis () represents the geographic distance size;
Figure BDA0004043712520000087
then the representation is equal to v i A set of all POIs within a range from each other.
Next, based on the idea of calculating the triangle area in terms of side length, we calculate v as in the following equation (3) -equation (7) i And
Figure BDA0004043712520000088
is a correlation of: />
a=dis(v i+1 ,v i-1 ) (3)
b 1 =dis(v i ,v i-1 ),c 1 =dis(v i ,v i+1 ) (4)
Figure BDA00040437125200000815
Figure BDA0004043712520000089
Figure BDA00040437125200000810
Wherein dis () represents a distance, which may be a cosine distance or an actual physical space distance; a represents v i+1 And v i-1 Distance between b 1 Representing v i And v i-1 Distance between c 1 Representing v i And v i+1 Distance between b 2 Representation of
Figure BDA00040437125200000811
And v i-1 Distance between c 2 Representation->
Figure BDA00040437125200000812
And v i+1 Distance between, p 1 And p 2 Represents the average length of three distances, +.>
Figure BDA00040437125200000813
Can be regarded as v i And->
Figure BDA00040437125200000814
Is a correlation of (a) and (b).
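The triangle-area rule of formulas (3)–(7) can be sketched as follows, under the assumption stated above that the association is measured by the absolute difference of the two Heron areas; function names are illustrative:

```python
from math import sqrt

def heron_area(a, b, c):
    # Heron's formula: p is the semiperimeter (half the sum of the three
    # side lengths, cf. formula (6)); clamp tiny negative products caused
    # by degenerate/collinear triangles to zero.
    p = (a + b + c) / 2.0
    s = p * (p - a) * (p - b) * (p - c)
    return sqrt(s) if s > 0 else 0.0

def relevance(dis, v_prev, v, v_next, v_cand):
    # Formulas (3)-(7) as reconstructed: compare the area of the triangle
    # (v_prev, v, v_next) with that of (v_prev, v_cand, v_next); a smaller
    # value means the candidate is more strongly associated with v.
    a = dis(v_next, v_prev)                                       # (3)
    s1 = heron_area(a, dis(v, v_prev), dis(v, v_next))            # (4), (7)
    s2 = heron_area(a, dis(v_cand, v_prev), dis(v_cand, v_next))  # (5), (7)
    return abs(s1 - s2)
```

Any symmetric `dis` works here, matching the text's remark that the distance may be cosine or physical.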
In addition, in the positive-sample enhancement operation, the number of substitutions generated is determined by the sequence length l, with the specific calculation given in formula (8).
S3.3: Enhancement of negative samples. Take a negative sample S_n = [v_1, v_2, …, v_i, …, v_ln] as an example, where v_ln is the negative label item. Select a replacement position v_i (v_i ≠ v_ln) and randomly choose a POI v′_i that the user has not interacted with as the replacement value, generating the sequence S′_n = [v_1, v_2, …, v′_i, …, v_ln]. The number of substitutions follows formula (9), where fre_pos denotes the number of substitutions in the enhanced positive sample and fre_neg denotes the number of substitutions in the enhanced negative sample.
S3.4: The enhanced positive samples and enhanced negative samples are paired to form the enhanced data used for self-supervised learning of the Bayesian model.
Step S4: the client builds a sequence recommendation framework based on multi-task learning, wherein the sequence recommendation framework is composed of a user attribute embedded network, a local contrast learning mechanism, a project embedded network and a universal sequence encoder, and after the sequence encoder is selected according to the requirement, a local sequence model is formed; then, the central server combines the global hash table and the combined client to complete the personalized training of the local sequence model. The implementation of the whole step S4 specifically comprises the following sub-steps S4.1-S4.4:
s4.1: the client U represents the attribute of the client U as a one-dimensional vector U through a neural embedded network u The client constructs a local multi-task learning model in conjunction with the desired sequence encoder. Multitasking includes recommending tasks and contrasting learning tasks; for recommendation tasks, a given user u, a sequence encoder f (·), a point of interest embedding V, a sequence Seq of the user at a time stamp t u,t Time embedding T and context characteristics I, a hidden layer h of the sequence encoder can be obtained t 。h t Can be represented by the following formula (10):
h t =f(V,T,I,Seq u,t ;θ s ) (10)
wherein θ s Representing the parameter set of the encoder.
Based on the user embedding U_u, the preference degree r̂_{u,j,t} of user u for item j at time stamp t is evaluated by formula (11):
r̂_{u,j,t} = (U_u ∘ h_t)ᵀ v_j   (11)
where ∘ denotes the element-wise product and v_j is the embedding of item j.
Next, pairwise Bayesian Personalized Ranking (BPR) is applied to learn the sequence encoder and the parameters θ_r of the embedding networks. The loss function of the recommendation task, L_rec, is given by formula (12):
L_rec = −Σ_{(j,k)} ln σ(r̂_{u,j,t} − r̂_{u,k,t})   (12)
where σ(·) is the sigmoid function and (j, k) is a pair of positive and negative labels in a training subsequence.
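Formulas (11) and (12) can be sketched as follows. The gated inner-product score is only one plausible reading of the reconstructed formula (11), so both helpers should be treated as illustrative assumptions rather than the patent's exact computation:

```python
import math

def score(u_emb, h_t, item_emb):
    # Formula (11) as reconstructed: gate the encoder output h_t with the
    # user-attribute embedding U_u, then take the inner product with the
    # item embedding v_j.
    return sum(u * h * v for u, h, v in zip(u_emb, h_t, item_emb))

def bpr_loss(pos_scores, neg_scores):
    # Formula (12): L_rec = -sum ln sigma(r_pos - r_neg) over (j, k) pairs.
    sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
    return -sum(math.log(sigmoid(p - n))
                for p, n in zip(pos_scores, neg_scores))
```

The loss shrinks as the margin between positive and negative scores grows, which is exactly the pairwise ranking pressure BPR is meant to apply.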
S4.2: for the contrast learning task, the enhanced samples are represented by equation (13) and equation (14):
Figure BDA0004043712520000105
Figure BDA0004043712520000106
wherein, seq aug-p Finger enhanced positive sample, seq aug-n Finger enhancement negative samples;
Figure BDA0004043712520000107
regarded as a pair of aligned samples, the remaining 2N +.>
Figure BDA0004043712520000108
Considered as its negative example; sequence pair->
Figure BDA0004043712520000109
The encoded characteristic is->
Figure BDA00040437125200001010
Its corresponding negative example->
Figure BDA00040437125200001011
The coded characteristic is h' i . Multi-class cross entropy loss function (NCE) learning was used to compare tasks as shown in equation (15) below:
Figure BDA0004043712520000111
wherein τ is a temperature coefficient, and an optimal constant value is obtained after a parameter adjustment experiment;
Figure BDA0004043712520000112
is a sequence pair
Figure BDA0004043712520000113
Coding features of->
Figure BDA0004043712520000114
Is a sequence pair->
Figure BDA0004043712520000115
Is a coding feature of (a); similarity is calculated according to sim () expression;
Figure BDA0004043712520000116
Representing the loss function of the comparison task.
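The NCE objective of formula (15) can be sketched as follows, assuming the standard InfoNCE form over scalar similarities (the function and argument names are illustrative):

```python
import math

def info_nce(sim_pos, sim_negs, tau):
    # Formula (15): -log( exp(s+/tau) / (exp(s+/tau) + sum_j exp(s_j/tau)) ),
    # where s+ is the similarity of the aligned pair and sim_negs holds the
    # similarities to the 2N negative examples.
    num = math.exp(sim_pos / tau)
    den = num + sum(math.exp(s / tau) for s in sim_negs)
    return -math.log(num / den)
```

Raising the aligned pair's similarity, or lowering the negatives', strictly decreases the loss; the temperature τ controls how sharply the softmax separates them.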
Further, the recommendation task and the contrastive learning task are combined to obtain the final multi-task sequence model, whose optimization loss is expressed by formula (16):
L = L_rec + λ · L_cl   (16)
where λ is a constant coefficient controlling the weight of the contrastive task, L_rec is the loss function of the recommendation task, L_cl is the loss function of the contrastive task, and L is the final multi-task loss; the client performs local model training according to this objective.
S4.3: client u will model Θ locally u ={W u Upload to central server, W u Representing Θ u The central server queries the global hash table to find other users in the same hash bucket as u
Figure BDA00040437125200001111
It is defined as
Figure BDA00040437125200001112
Next, the central server generates a specific personalized model for u, representing the following formula (17):
Figure BDA00040437125200001113
wherein Avg (·) refers to an averaging operation, α is a constant to control the degree of influence of similar and dissimilar users on the local model of user u, W u Refers to the local model of a similar user to user u, W z Referring to the local model of dissimilar users of user u, ln refers to the number of similar users to user u,
Figure BDA0004043712520000121
refers to the personalized local model generated for user u.
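The aggregation of formula (17) as reconstructed above can be sketched as follows; the α-weighted split between similar and dissimilar users is an assumption from the surrounding text, and all names are illustrative:

```python
def avg(models):
    # Element-wise average of a list of parameter vectors (Avg(.) above).
    return [sum(w) / len(w) for w in zip(*models)]

def personalized_aggregate(similar, dissimilar, alpha):
    # Formula (17): weight the averaged models of similar users by alpha
    # and those of dissimilar users by (1 - alpha), producing a model
    # specific to one client rather than a single global model.
    if not dissimilar:
        return avg(similar)
    s, d = avg(similar), avg(dissimilar)
    return [alpha * si + (1 - alpha) * di for si, di in zip(s, d)]
```

With α close to 1 the aggregate tracks the client's hash-bucket neighbours, which is how the scheme counters the inter-domain imbalance described in the background.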
S4.4: the central server will be specific to
Figure BDA0004043712520000122
Is sent to u, u pair->
Figure BDA0004043712520000123
A new training cycle is performed and the process is repeated until the model converges.
Step S5: according to U u And sequence model output h t A preference prediction for the next time is obtained.
The implementation of the whole step S5 specifically comprises the substeps S5.1-S5.3:
s5.1: for trained U u And h t And performing inner product operation.
S5.2: and predicting user preference at the next moment according to the inner product result, wherein all recommendation results are only reserved in the local equipment of the client and are not shared with the central server.
S5.3: the recommendation system maintains a global hash table online in real time, and when a certain client exits from training, the system will not keep any data record about that client.
The above is only a preferred embodiment of the present invention, and the scope of protection is not limited to this example; all technical solutions falling under the concept of the invention belong to its scope of protection. It should be noted that modifications and adaptations made by those of ordinary skill in the art without departing from the spirit of the invention are also within its scope.

Claims (8)

1. A sequence recommendation method based on personalized federated learning, characterized in that it comprises the following steps:
step S1, each client maintains the held data on the local equipment and preprocesses self-interaction data and attribute values of interaction items;
s2, the client builds a hash index on the local equipment through self-interaction data, and a hash storage table is formed;
step S3, the client performs enhancement operation on the training data on the local equipment based on a Bayesian training strategy to obtain enhancement data;
step S4, combining the hash index constructed in the step S2 with the enhancement data obtained in the step S3, constructing a local sequence recommendation frame based on multiple tasks on local equipment by a client, and carrying out distributed training on a local sequence model in cooperation with a central server until the local sequence model is converged;
and S5, the client acquires parameters of the embedded network of the user, and acquires a preference prediction result at the next moment by combining the output of the converged local sequence model to finish personalized recommendation of the client.
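The training loop recited in claim 1 can be sketched as a single federated round. The sketch below is a toy under stated assumptions: the scalar model, the `ToyClient` class, the learning rate, and the plain averaging are all hypothetical stand-ins (the claimed method uses personalized per-client aggregation in step S4.3 rather than one global average); it only illustrates the local-train / server-aggregate cycle.

```python
import copy

class ToyClient:
    """Hypothetical client: 'training' nudges the weight toward local data."""
    def __init__(self, target):
        self.target = target

    def local_train(self, model, lr=0.5):
        model["w"] += lr * (self.target - model["w"])
        return model

def average(models):
    """Plain FedAvg-style mean; the patent personalizes this step per client."""
    return {"w": sum(m["w"] for m in models) / len(models)}

def federated_round(server_model, clients):
    # Each client trains a private copy on its own device (steps S1-S4.2),
    # then the server aggregates the uploaded models (step S4.3).
    local_models = [c.local_train(copy.deepcopy(server_model)) for c in clients]
    return average(local_models)

model = {"w": 0.0}
for _ in range(10):
    model = federated_round(model, [ToyClient(1.0), ToyClient(3.0)])
# model["w"] converges toward the mean of the client targets (2.0)
```

Raw interaction data never appears in `federated_round`; only model parameters cross the client/server boundary, which is the privacy property the claim relies on.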
2. The sequence recommendation method based on personalized federated technology according to claim 1, wherein the preprocessing in step S1 comprises: cleaning abnormal values and missing values from the interaction data, archiving the item attributes of the items the client has accessed, and archiving the client's own user attributes; a data structure alignment operation is performed on the item attributes and user attributes across all clients, and the aligned item attributes and user attributes are represented by a vector matrix.
3. The sequence recommendation method based on personalized federated technology according to claim 1, wherein said step S2 comprises the following substeps:
Substep S2.1, the client converts its historical interaction data into a binary feature vector, downloads hash-related data from the central server, and constructs a group of hash function clusters on the local device from the downloaded data;
Substep S2.2, the binary feature vector is combined with the hash function clusters to generate a hash index specific to the client, which is uploaded to the central server;
Substep S2.3, the central server receives the hash index of each client and constructs the hash storage table.
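Substeps S2.1-S2.3 can be illustrated with a locality-sensitive hashing sketch. The claim does not specify the hash family, so the random-hyperplane (sign-pattern) scheme below, the dimensions, and all names are assumptions; the shared hyperplanes stand in for the "hash-related data" every client downloads from the server, and the sign tuple stands in for the client-specific hash index.

```python
import numpy as np

def hash_index(binary_vec, hyperplanes):
    """Illustrative client-side hash index (substeps S2.1-S2.2).

    The binary interaction vector is centred and projected onto a shared
    family of random hyperplanes (the 'hash function cluster'); the sign
    pattern is uploaded as this client's index. Clients with similar
    interaction histories tend to collide in the server's table.
    """
    signs = (np.asarray(binary_vec, dtype=float) - 0.5) @ hyperplanes.T > 0
    return tuple(int(s) for s in signs)

rng = np.random.default_rng(42)
planes = rng.normal(size=(8, 20))            # 8 hash functions over 20 items (assumed sizes)
a = np.zeros(20); a[:5] = 1                  # client A interacted with items 0-4
b = a.copy(); b[5] = 1                       # client B: nearly identical history
table = {}                                   # server-side hash storage table (substep S2.3)
for cid, vec in (("A", a), ("B", b)):
    table.setdefault(hash_index(vec, planes), []).append(cid)
```

Only the short sign tuple leaves the device, not the raw interaction vector, which matches the claim's division of labour between client and central server.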
4. The sequence recommendation method based on personalized federated technology according to claim 1, wherein said step S3 comprises the following substeps:
Substep S3.1, constructing a subsequence set to generate the training data, wherein each piece of training data comprises a positive-sample and negative-sample sequence pair used for Bayesian model optimization;
Substep S3.2, constructing an enhanced positive sample based on the positive sample in the training data;
Substep S3.3, constructing an enhanced negative sample based on the negative sample in the training data;
Substep S3.4, pairing the enhanced positive samples and enhanced negative samples into enhancement data for self-supervised learning of the Bayesian model.
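One way to picture substeps S3.2-S3.3 is with standard sequence-augmentation operators. The crop and random-replacement operators below are common choices in sequential recommendation but are assumptions here: the claims say only that the positive view depends on item correlation and sequence length (claim 5) and the negative view on sequence length (claim 6), and do not name these operators.

```python
import random

def augment_positive(seq, crop_ratio=0.8, rng=random.Random(7)):
    """Hypothetical positive enhancement (substep S3.2): a contiguous
    crop keeps correlated items together, so the view stays close to
    the original positive sequence. Crop length scales with sequence length."""
    n = max(1, int(len(seq) * crop_ratio))
    start = rng.randrange(len(seq) - n + 1)
    return seq[start:start + n]

def augment_negative(seq, item_pool, rng=random.Random(7)):
    """Hypothetical negative enhancement (substep S3.3): every position
    is replaced by a random item, pushing the view away from the history."""
    return [rng.choice(item_pool) for _ in seq]

pos = [3, 8, 1, 9, 4]                         # toy positive sequence
pair = (augment_positive(pos),
        augment_negative(pos, item_pool=list(range(50))))
```

Pairing the two views, as in substep S3.4, yields the (enhanced positive, enhanced negative) training pairs that a Bayesian pairwise objective such as BPR can consume.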
5. The sequence recommendation method based on personalized federated technology according to claim 4, wherein in said substep S3.2 the enhanced positive sample is constructed from the correlation between items and the length of the positive sample sequence, and the degree of correlation between items is determined according to the rule for calculating a triangle's area from its side lengths.
6. The sequence recommendation method based on personalized federated technology according to claim 4, wherein in said substep S3.3 the enhanced negative sample is constructed according to the length of the negative sample sequence.
7. The sequence recommendation method based on personalized federated technology according to claim 1, wherein said step S4 comprises the following substeps:
Substep S4.1, constructing a sequence recommendation framework based on multi-task learning, composed of a user attribute embedding network, a local contrastive learning mechanism, an item embedding network and a general sequence encoder; after the general sequence encoder is selected as required, the local sequence model is formed;
Substep S4.2, initializing and updating the local sequence model with initialization parameters received from the central server, training the local sequence model locally with the training data and enhancement data, and uploading the trained local sequence model to the central server;
Substep S4.3, the central server receives the local sequence models from all clients, obtains the similar users of each client by querying the global hash storage table, aggregates the local sequence models in a personalized manner according to the query result so that each client corresponds to a specific aggregation model, and sends each aggregation model to the corresponding client;
Substep S4.4, the client receives its aggregation model and continues updating it with the training data and enhancement data in the next round, until the local sequence model converges.
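The personalized aggregation of substep S4.3 can be sketched as a per-client neighbourhood average. Everything here is illustrative: the dict-of-scalars models, the neighbour lists (standing in for the global hash-table query), and uniform weighting are all assumptions; the claim fixes only that each client's aggregate is built from its similar users, so each client receives its own model rather than one global average.

```python
def personalized_aggregate(local_models, neighbors):
    """Illustrative substep S4.3: for each client, average only the
    models of that client's hash-table neighbours (itself included)."""
    out = {}
    for cid, nbrs in neighbors.items():
        group = [cid] + [n for n in nbrs if n in local_models]
        out[cid] = {
            k: sum(local_models[n][k] for n in group) / len(group)
            for k in local_models[cid]
        }
    return out

models = {"A": {"w": 1.0}, "B": {"w": 3.0}, "C": {"w": 9.0}}
neighbors = {"A": ["B"], "B": ["A"], "C": []}   # assumed hash-table query results
agg = personalized_aggregate(models, neighbors)
# A and B (similar users) share an aggregate; C, with no neighbours, keeps its own model
```

Because C has no similar users, its aggregate is just its own local model, showing how the scheme degrades gracefully for isolated clients.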
8. The sequence recommendation method based on personalized federated technology according to claim 1, wherein said step S5 comprises the following substeps:
Substep S5.1, extracting the parameters of the user attribute embedding network as the feature representation of the user attributes, and performing an inner product operation between this feature representation and the output vector of the local sequence model;
Substep S5.2, taking the inner product result as the recommendation prediction result to predict the user's preference at the next moment;
Substep S5.3, the recommendation system maintains the global hash table online in real time.
CN202310023696.9A 2023-01-09 2023-01-09 Sequence recommendation method based on personalized federal technology Pending CN116089715A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310023696.9A CN116089715A (en) 2023-01-09 2023-01-09 Sequence recommendation method based on personalized federal technology

Publications (1)

Publication Number Publication Date
CN116089715A true CN116089715A (en) 2023-05-09

Family

ID=86213415

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310023696.9A Pending CN116089715A (en) 2023-01-09 2023-01-09 Sequence recommendation method based on personalized federal technology

Country Status (1)

Country Link
CN (1) CN116089715A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116361561A (en) * 2023-05-30 2023-06-30 安徽省模式识别信息技术有限公司 Distributed cross-border service recommendation method and system based on variational reasoning
CN117494191A (en) * 2023-10-17 2024-02-02 南昌大学 Point-of-interest micro-service system and method for information physical security

Similar Documents

Publication Publication Date Title
Li et al. Rank-geofm: A ranking based geographical factorization method for point of interest recommendation
CN106940801B (en) A kind of deeply study recommender system and method for Wide Area Network
US20220391778A1 (en) Online Federated Learning of Embeddings
CN113011587B (en) Privacy protection model training method and system
CN116089715A (en) Sequence recommendation method based on personalized federal technology
Li et al. Exploiting explicit and implicit feedback for personalized ranking
Weston et al. Nonlinear latent factorization by embedding multiple user interests
Liu et al. Deep learning based recommendation: A survey
Lu et al. Fedclip: Fast generalization and personalization for clip in federated learning
CN108446964B (en) User recommendation method based on mobile traffic DPI data
Li et al. Dynamic structure embedded online multiple-output regression for streaming data
CN113609398A (en) Social recommendation method based on heterogeneous graph neural network
CN115631008B (en) Commodity recommendation method, device, equipment and medium
Ko et al. Mascot: A quantization framework for efficient matrix factorization in recommender systems
Song et al. Coupled variational recurrent collaborative filtering
Xu et al. Machine learning-driven apps recommendation for energy optimization in green communication and networking for connected and autonomous vehicles
Tang et al. Accurately predicting quality of services in ioT via using self-attention representation and deep factorization machines
Yuan et al. Optimizing factorization machines for top-n context-aware recommendations
CN115439770A (en) Content recall method, device, equipment and storage medium
CN116432039B (en) Collaborative training method and device, business prediction method and device
Ravi et al. Hybrid user clustering-based travel planning system for personalized point of interest recommendation
Shan et al. NASM: nonlinearly attentive similarity model for recommendation system via locally attentive embedding
Crankshaw et al. Scalable training and serving of personalized models
Ye et al. Robust clustered federated learning
Fushimi et al. Accelerating Greedy K-Medoids Clustering Algorithm with Distance by Pivot Generation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination