CN116089715A - Sequence recommendation method based on personalized federal technology - Google Patents
- Publication number
- CN116089715A (application number CN202310023696.9A)
- Authority
- CN
- China
- Prior art keywords
- sequence
- client
- local
- data
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a sequence recommendation method based on personalized federated learning, comprising the following steps: S1, each client preprocesses its own interaction data and the attribute values of the interacted items on the local device; S2, the client builds a hash index on the local device from its own interaction data; S3, the client performs an enhancement operation on the training data on the local device based on a Bayesian training strategy; S4, combining the hash index with the enhanced data, the client constructs a multi-task-based local sequence recommendation framework on the local device and trains a local sequence model in cooperation with the central server until the local sequence model converges; S5, the client obtains the parameters of the user's embedding network and combines them with the output of the converged local sequence model to obtain the preference prediction for the next moment, completing the recommendation. The invention can effectively model users' dynamic preferences under privacy protection, and offers scalability, portability and privacy protection.
Description
Technical Field
The invention relates to the technical field of computers, and in particular to a sequence recommendation method based on personalized federated learning.
Background
The main task of a sequence recommendation system is to mine a user's behavior patterns from a temporally ordered behavior sequence, so as to model the client's dynamic preferences and predict the client's item selection at the next moment. Because a sequence recommendation platform may well use the collected user data for malicious transactions and the like, many users worry that their privacy will be compromised and are reluctant to share their data. This easily causes serious problems such as "data islands" and "recommendation barriers". Privacy-preserving sequence recommendation systems have therefore received high attention and considerable research in both industry and academia.
Current privacy-preserving sequence recommendation methods mainly protect data by introducing cryptographic techniques, such as homomorphic encryption algorithms and differential privacy. In the latest research, some researchers use federated learning to protect the privacy of sequence recommendation systems. While a federated architecture can keep user data local, such distributed data and architecture reduce the effectiveness of the sequence recommendation model. Calling a class of clients that share certain properties a domain, the problem manifests in particular as: (1) inter-domain model imbalance: a single global model cannot accommodate the sequence features of all domains, which may be due to the differing properties of individual domains, such as habitual biases caused by geographic factors; (2) intra-domain model imbalance: the amount of data held by different clients varies, so clients with smaller data volumes cannot effectively handle complex sequence models, and model drift easily arises in the central server's aggregation operation. There is therefore a need to develop a new privacy-preserving sequence recommendation method that solves these problems.
Disclosure of Invention
The invention aims to provide a sequence recommendation method based on personalized federated learning. The invention can solve the inter-domain model imbalance and intra-domain model imbalance caused by distributed data and clients, can effectively model users' dynamic preferences under privacy protection, and offers scalability, portability and privacy protection.
The technical scheme of the invention is as follows: a sequence recommendation method based on personalized federated learning comprises the following steps:
step S1, each client maintains its held data on its local device and preprocesses its own interaction data and the attribute values of the interacted items, so as to remove interference items and outliers from the data; meanwhile, the data structures of all clients are aligned within the distributed framework;
step S2, the client builds a hash index on the local device from its own interaction data, forming a hash storage table; after the hash index is constructed, the client uploads it to the central server;
step S3, the client performs an enhancement operation on the training data on the local device based on a Bayesian training strategy to obtain enhanced data, which is used for self-supervised learning of the local sequence model so as to strengthen its representation capability;
step S4, combining the hash index constructed in step S2 with the enhanced data obtained in step S3, the client constructs a multi-task-based local sequence recommendation framework on the local device (the framework can effectively integrate a general-purpose sequence encoder to form a local sequence model), and performs distributed training of the local sequence model in cooperation with the central server until the local sequence model converges;
specifically, the client first initializes and pre-trains the local sequence model; second, the local sequence model is uploaded to the central server, which performs a personalized aggregation operation based on locality-sensitive hashing; the central server then sends a client-specific aggregation model back to the client, which, upon receiving it, carries out the next round of model training, until the local sequence model converges;
step S5, the client obtains the parameters of the user's embedding network and combines them with the output of the converged local sequence model to obtain the preference prediction for the next moment, completing the personalized recommendation for the client.
The prediction results exist only on the client device; the central server never touches any training data sources or recommendation results, and no communication takes place between clients.
In the foregoing sequence recommendation method based on personalized federated learning, the overall direction of the preprocessing in step S1 is to sort and record the client's access to all items, the access situation being represented by a vector matrix.
The preprocessing specifically comprises cleaning outliers and missing values from the interaction data (ensuring the availability of the data), archiving the item attributes of the items the client has accessed (such as item type), and archiving the client's user attributes (such as region, age, occupation, hobbies and the like); a data-structure alignment operation is performed on the item attributes and user attributes across all clients, and the aligned item attributes and user attributes are represented by a vector matrix.
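As a concrete illustration of the preprocessing in step S1, the following Python sketch cleans a client's local interaction log and builds an aligned access vector over the shared item space. The record layout (`Interaction`) and the specific validity checks are illustrative assumptions; the patent does not fix a concrete schema.

```python
from dataclasses import dataclass

# Hypothetical record layout; the patent does not specify a concrete schema.
@dataclass
class Interaction:
    item_id: int
    category: str      # item attribute (e.g. item type)
    timestamp: float

def preprocess(interactions, n_items, valid_categories):
    """Clean a client's local interaction log and build the aligned
    access vector described in step S1 (all on the local device)."""
    cleaned = [
        r for r in interactions
        if 0 <= r.item_id < n_items            # drop out-of-range ids (outliers)
        and r.category in valid_categories     # drop records with bad attributes
        and r.timestamp > 0                    # drop missing/invalid timestamps
    ]
    cleaned.sort(key=lambda r: r.timestamp)    # keep the temporal order
    # Aligned access vector over the shared item space: 1 = accessed.
    access = [0] * n_items
    for r in cleaned:
        access[r.item_id] = 1
    return cleaned, access
```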
In the foregoing sequence recommendation method based on personalized federated learning, step S2 comprises the following sub-steps:
sub-step S2.1, the client converts the historical interaction data into binary feature vectors, downloads the relevant hash data from the central server, and constructs a group of hash functions on the local device from the downloaded data;
sub-step S2.2, combining the binary feature vector with the hash function family to generate a client-specific hash index and uploading the hash index to the central server;
sub-step S2.3, the central server receives the hash index of each client to construct a hash storage table.
In the foregoing sequence recommendation method based on personalized federated learning, step S3 comprises the following sub-steps:
sub-step S3.1, constructing a sub-sequence set to generate training data, wherein each piece of training data comprises a positive sample and a negative sample sequence pair, and the training data is used for Bayesian model optimization;
s3.2, constructing an enhanced positive sample based on the positive sample in the training data;
s3.3, constructing an enhanced negative sample based on the negative sample in the training data;
and S3.4, combining the enhancement positive sample and the enhancement negative sample into enhancement data in pairs for self-supervision learning of the Bayesian model.
In the foregoing sequence recommendation method based on personalized federated learning, in sub-step S3.2 an enhanced positive sample is constructed according to the correlation between items and the length of the positive-sample sequence, the degree of correlation between items being determined by the rule of computing a triangle's area from its side lengths.
In the foregoing sequence recommendation method based on personalized federated learning, sub-step S3.3 constructs an enhanced negative sample according to the length of the negative-sample sequence.
In the foregoing sequence recommendation method based on personalized federated learning, step S4 comprises the following sub-steps:
sub-step S4.1, constructing a sequence recommendation framework based on multi-task learning, the framework having scalability and portability and consisting of a user-attribute embedding network, a local contrastive learning mechanism, an item embedding network and a general-purpose sequence encoder; after a sequence encoder is selected as required, a local sequence model is formed;
s4.2, initializing and updating a local sequence model by receiving initialization parameters of the central server, locally training the local sequence model by using training data and enhancement data, and uploading the trained local sequence model to the central server to wait for the transmission of the personalized aggregation model of the central server;
sub-step S4.3, the central server receives the local sequence models from all clients, obtains the similar users of each client by querying the global hash storage table, performs personalized aggregation over the local sequence models according to the query result so that each specific client corresponds to a specific aggregation model, and sends each aggregation model to its corresponding client;
and S4.4, the client receives the aggregation model, and continuously updates the aggregation model for the next round by using the training data and the enhancement data until the local sequence model converges.
In the foregoing sequence recommendation method based on personalized federated learning, step S5 comprises the following sub-steps:
s5.1, extracting parameters of the user attribute embedded network, which are used as characteristic representation of the user attribute, and performing inner product operation on the characteristic representation and an output vector of the local sequence model;
sub-step S5.2, taking the inner product result as a recommendation prediction result to predict the preference of the user at the next moment; the prediction result is only reserved in the local equipment of the client and is not shared with the central server, so that the privacy of the user is protected;
sub-step S5.3, the recommendation system maintains the global hash table online in real time; when a client exits the federated training framework, the recommendation system retains no model transmission records concerning that client.
Compared with the prior art, the invention has the following beneficial effects: the personalization of the federation is embodied in the process by which the users and the central server cooperatively train the sequence encoder; the central server generates a specific model according to each client's data-distribution characteristics, model-training situation, geographic environment and the like, and common sequence encoders can be effectively integrated into the federated distributed framework without acquiring users' private data. The federated sequence recommendation system based on the invention therefore offers scalability, portability and privacy protection.
Specifically, the invention introduces locality-sensitive hashing to design a personalized federated aggregation strategy, which alleviates the problem that a single global model cannot adapt to the sequence characteristics of all domains. In addition, a data enhancement method based on Bayesian training is designed to improve the contrastive learning strategy, strengthening the representation capability of the local sequence encoder. Furthermore, clients holding only a small amount of training data can still handle complex sequence models effectively and can effectively participate in the model training of other clients in the distributed scenario. By introducing personalization and representation enhancement into the federated sequence technique, general sequence models can serve as common encoders, users' dynamic preferences can be modeled effectively under privacy protection, predicting a user's item preference at the next moment does not consume large amounts of computing resources, and recommendation performance is improved.
Drawings
FIG. 1 is a flowchart illustrating an implementation of an embodiment of the sequence recommendation method based on personalized federated learning;
FIG. 2 is a schematic diagram of the overall framework of a local multitasking sequence model and its distributed training;
fig. 3 is a sample diagram of the data enhancement method involved in the sequence recommendation method based on personalized federated learning.
Detailed Description
The invention is further illustrated by the following figures and examples, which are not intended to be limiting.
Examples: in the invention, the data preprocessing operation performed by the client can be selected according to specific scene requirements. For example, in the POI recommendation scene, preprocessing may be selected for the distance attribute of the POI. An overall framework for personalized federal technology is shown in fig. 2. The framework diagram shows all details concerning the methods and the sequential relationship.
In this embodiment, a sequence recommendation method based on personalized federated learning comprises steps S1 to S5 in fig. 1:
step S1: each client maintains the held data on the local equipment and preprocesses the self-interaction data and the attribute value of the interaction item;
preprocessing comprises the steps of cleaning abnormal values and missing values of interaction data, archiving item attributes of items accessed by a client, and archiving own user attributes; and performing data structure alignment operation on the item attributes and the user attributes on all clients, and representing the aligned item attributes and user attributes through a vector matrix.
Step S2: the client converts the historical interaction data into binary feature vectors, and constructs a group of hash function clusters according to the assistance instructions of the central server; and generating a specific hash index by combining the binary feature vector and the hash function family, and uploading the specific hash index to a central server to construct a global hash storage table. The implementation of the whole step S2 specifically comprises the following sub-steps S2.1-S2.3:
s2.1: assuming that there are n items, for a single user u in federal learning, all of its interaction data can be represented as an n-dimensional vector R u =(v 1 ,v 2 ,…,v i ,…,v n );v i Representing the interaction condition of the user u; if v i =0, representing that user u has no access to item i; conversely, v i =1 represents that user u accessed item i.
S2.2: client acquires a random vector q= (Q) 1 ,q 2 ,…q i ,…,q n ) Wherein q is i ∈[-1,1]The hash function defines the following equation (1):
Based on formula (1), a hash index is constructed as: dex u =[G 1 ,G 2 ,..,G i ,…,G K ]Each G i From { g 1 (R u ),g 2 (R u ),…,g r (R u ) Composition of G i Representing the hash value generated by a hash bucket, K such hash values constitute a hash index Dex u 。
S2.3: dex is to u Uploaded to a central server and become part of building a global hash table.
Step S3: the client generates an enhanced positive and negative sample pair through training the positive and negative sample pair, training data is used for Bayesian model optimization, and enhanced data is used for self-supervision learning of the Bayesian model; in the data enhancement process, correlation between items and the sequence length of training samples are fully considered, and an example of data enhancement is shown in fig. 3, in which the training sample length l is set to 5. The implementation of the whole step S3 specifically comprises the following sub-steps S3.1-S3.4:
s3.1: the client constructs a subsequence set to generate training samples for BPR optimization; e.g. when l=5, subsequence [ v ] 1 ,v 2 ,v 3 ,v 4 ,v 5 ]And [ v ] 1 ,v 2 ,v 3 ,v 4 ,v 6 ]Respectively, as a pair of positive and negative sequences, i.e. training samples, where v 5 And v 6 Respectively considered as a positive label and a negative label.
S3.2: enhancement of positive samples: with positive sample S p =[v 1 ,v 2 ,…,v i ,…,v lp ]For example, v lp Representing a positive label item; randomly select v i ∈S p Using related itemsReplacement v i Generating an enhanced positive sample->Wherein v is i ≠v lp ,Reference v i Related items (associated items); v i And->The degree of association between the two points of interest is determined by the geographic properties between the two points of interest; specifically, the present invention fully considers the user positive sample S p Medium v i Is a last access sequence v of (2) i-1 And the next access sequence v i+1 To calculate +.>And v i Is a degree of association of (a); first, as shown in formula (2), v-based i A set of POIs within a geographic range is selected.
Wherein α represents a constant value representing a threshold value of a geographic distance; dis () represents the geographic distance size;then the representation is equal to v i A set of all POIs within a range from each other.
Next, based on the idea of computing a triangle's area from its side lengths, the association between v_i and ṽ_i is calculated as in equations (3)–(7):
a = dis(v_{i+1}, v_{i-1})    (3)
b_1 = dis(v_i, v_{i-1}),  c_1 = dis(v_i, v_{i+1})    (4)
b_2 = dis(ṽ_i, v_{i-1}),  c_2 = dis(ṽ_i, v_{i+1})    (5)
p_1 = (a + b_1 + c_1)/2,  p_2 = (a + b_2 + c_2)/2    (6)
where dis(·) denotes a distance, which may be a cosine distance or an actual physical distance; a is the distance between v_{i+1} and v_{i-1}; b_1 and c_1 are the distances from v_i to v_{i-1} and v_{i+1}; b_2 and c_2 are the distances from ṽ_i to v_{i-1} and v_{i+1}; and p_1 and p_2 are the half-perimeters of the two triangles. The areas of the two triangles computed from these quantities by Heron's formula yield equation (7), whose result can be regarded as the association between v_i and ṽ_i.
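A Python sketch of the triangle-area relatedness test follows. Since equation (7) is not reproduced in the text, taking the absolute difference of the two Heron areas as the score is an assumption; the surviving description only states that the two triangle areas determine the association.

```python
import math

def heron_area(a, b, c):
    """Triangle area from side lengths (Heron's formula);
    p is the semi-perimeter, the p_1 / p_2 of the text."""
    p = (a + b + c) / 2.0
    # max(..., 0) guards against tiny negatives for degenerate triangles
    return math.sqrt(max(p * (p - a) * (p - b) * (p - c), 0.0))

def relatedness(dis, v_prev, v_i, v_next, v_cand):
    """Compare the triangle (v_prev, v_i, v_next) with the triangle that
    substitutes candidate v_cand for v_i.  A smaller area gap means the
    candidate sits similarly between v_i's neighbours.  Using the
    absolute area difference as the score is an assumption, since
    equation (7) is not reproduced in the text."""
    a = dis(v_next, v_prev)                       # eq. (3)
    b1, c1 = dis(v_i, v_prev), dis(v_i, v_next)   # eq. (4)
    b2, c2 = dis(v_cand, v_prev), dis(v_cand, v_next)  # eq. (5)
    return abs(heron_area(a, b1, c1) - heron_area(a, b2, c2))
```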
In addition, in the positive-sample enhancement operation, the number of substitutions generated is determined by the sequence length l, as computed in equation (8).
S3.3: enhancement of negative samples: in negative sample S n =[v 1 ,v 2 ,…,v i ,…,v ln ]For example, v ln Representing a negative tag item; selecting a replacement term v i (≠v ln ) Randomly selecting the points of interest v 'which are not interacted by the user' i As a replacement value, a sequence S is generated n =[v 1 ,v 2 ,…,v′ i ,…,v ln ]. The number of substitutions follows the algorithm of equation (9):
wherein fre pos Representing the number of substitutions in the enhanced positive samples, fre neg Representing the number of substitutions in the enhanced negative example.
S3.4: the enhanced positive samples and the enhanced negative samples are paired to form enhanced data for self-supervised learning of the Bayesian model.
Step S4: the client builds a sequence recommendation framework based on multi-task learning, wherein the sequence recommendation framework is composed of a user attribute embedded network, a local contrast learning mechanism, a project embedded network and a universal sequence encoder, and after the sequence encoder is selected according to the requirement, a local sequence model is formed; then, the central server combines the global hash table and the combined client to complete the personalized training of the local sequence model. The implementation of the whole step S4 specifically comprises the following sub-steps S4.1-S4.4:
s4.1: the client U represents the attribute of the client U as a one-dimensional vector U through a neural embedded network u The client constructs a local multi-task learning model in conjunction with the desired sequence encoder. Multitasking includes recommending tasks and contrasting learning tasks; for recommendation tasks, a given user u, a sequence encoder f (·), a point of interest embedding V, a sequence Seq of the user at a time stamp t u,t Time embedding T and context characteristics I, a hidden layer h of the sequence encoder can be obtained t 。h t Can be represented by the following formula (10):
h t =f(V,T,I,Seq u,t ;θ s ) (10)
wherein θ s Representing the parameter set of the encoder.
Based on the user embedding U_u, the preference degree ŷ_{u,j,t} of user u for item j at time stamp t is evaluated as shown in equation (11).
Next, pairwise Bayesian Personalized Ranking (BPR) is applied to learn the parameters θ_r of the sequence encoder and the embedding networks; the recommendation-task loss function L_rec is given by equation (12):
L_rec = −Σ_{(j,k)} ln σ(ŷ_{u,j,t} − ŷ_{u,k,t})    (12)
where σ(·) denotes the sigmoid function and (j, k) is a pair of positive and negative labels in a training sub-sequence.
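A sketch of the pairwise BPR objective follows; the standard form −ln σ(ŷ_pos − ŷ_neg), summed over positive–negative pairs, is assumed here, since equation (12) is not reproduced in the text as a formula image.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bpr_loss(pairs):
    """Pairwise BPR objective: for each (positive score, negative score)
    pair, accumulate -ln sigmoid(y_pos - y_neg).  Assumed form of
    equation (12)."""
    return -sum(math.log(sigmoid(y_pos - y_neg)) for y_pos, y_neg in pairs)
```

Minimizing this loss pushes each positive label's score above its paired negative label's score, which is exactly the ranking goal of the recommendation task.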
S4.2: for the contrast learning task, the enhanced samples are represented by equation (13) and equation (14):
wherein, seq aug-p Finger enhanced positive sample, seq aug-n Finger enhancement negative samples;regarded as a pair of aligned samples, the remaining 2N +.>Considered as its negative example; sequence pair->The encoded characteristic is->Its corresponding negative example->The coded characteristic is h' i . Multi-class cross entropy loss function (NCE) learning was used to compare tasks as shown in equation (15) below:
wherein τ is a temperature coefficient, and an optimal constant value is obtained after a parameter adjustment experiment;is a sequence pairCoding features of->Is a sequence pair->Is a coding feature of (a); similarity is calculated according to sim () expression;Representing the loss function of the comparison task.
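The contrastive objective can be sketched for a single anchor as follows. The standard InfoNCE form with a temperature τ is assumed, since equation (15) is not reproduced in the text; cosine similarity is used as the sim(·) function, which is also an assumption.

```python
import math

def cosine(u, v):
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return sum(x * y for x, y in zip(u, v)) / (nu * nv)

def info_nce(h_i, h_pos, negatives, tau):
    """InfoNCE-style contrastive loss for one anchor: pull the paired
    augmented view h_pos toward h_i, push the negatives away.
    Assumed form of equation (15)."""
    pos = math.exp(cosine(h_i, h_pos) / tau)
    denom = pos + sum(math.exp(cosine(h_i, h_n) / tau) for h_n in negatives)
    return -math.log(pos / denom)
```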
Further, the recommendation task and the contrastive learning task are combined to obtain the final multi-task sequence model, whose optimization loss is expressed as equation (16):
L = L_rec + λ·L_cl    (16)
where λ is a constant coefficient controlling the weight of the contrastive task, L_rec is the loss function of the recommendation task, L_cl is the loss function of the contrastive task, and L is the final multi-task loss function; the client performs local model training according to this objective.
S4.3: client u will model Θ locally u ={W u Upload to central server, W u Representing Θ u The central server queries the global hash table to find other users in the same hash bucket as uIt is defined asNext, the central server generates a specific personalized model for u, representing the following formula (17):
wherein Avg (·) refers to an averaging operation, α is a constant to control the degree of influence of similar and dissimilar users on the local model of user u, W u Refers to the local model of a similar user to user u, W z Referring to the local model of dissimilar users of user u, ln refers to the number of similar users to user u,refers to the personalized local model generated for user u.
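The server-side personalized aggregation can be sketched as below. Since equation (17) is not reproduced in the text, the exact mixing rule is an assumption: an α-weighted combination of the averaged same-bucket (similar) models and the averaged remaining (dissimilar) models, following the surviving description of α and Avg(·).

```python
def avg(models):
    """Element-wise average of parameter vectors."""
    n = len(models)
    return [sum(col) / n for col in zip(*models)]

def personalized_aggregate(similar, dissimilar, alpha):
    """Server-side personalized aggregation for one client: mix the
    average model of same-hash-bucket (similar) users with the average
    of the remaining users.  The alpha-weighted form is an assumption;
    equation (17) is not reproduced in the text."""
    if not dissimilar:
        return avg(similar)
    return [alpha * s + (1.0 - alpha) * d
            for s, d in zip(avg(similar), avg(dissimilar))]
```

Because each client's aggregate is computed over its own hash-bucket neighbours, different clients receive different models, which is what distinguishes this step from plain federated averaging.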
S4.4: the central server will be specific toIs sent to u, u pair->A new training cycle is performed and the process is repeated until the model converges.
Step S5: according to U u And sequence model output h t A preference prediction for the next time is obtained.
The implementation of the whole step S5 specifically comprises the substeps S5.1-S5.3:
s5.1: for trained U u And h t And performing inner product operation.
S5.2: and predicting user preference at the next moment according to the inner product result, wherein all recommendation results are only reserved in the local equipment of the client and are not shared with the central server.
S5.3: the recommendation system maintains a global hash table online in real time, and when a certain client exits from training, the system will not keep any data record about that client.
The above is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above examples, and all technical solutions belonging to the concept of the present invention belong to the protection scope of the present invention. It should be noted that modifications and adaptations to the present invention can be made by one of ordinary skill in the art without departing from its spirit and are intended to be within the scope of the present invention.
Claims (8)
1. A sequence recommendation method based on personalized federated learning, characterized by comprising the following steps:
step S1, each client maintains the held data on the local equipment and preprocesses self-interaction data and attribute values of interaction items;
s2, the client builds a hash index on the local equipment through self-interaction data, and a hash storage table is formed;
step S3, the client performs enhancement operation on the training data on the local equipment based on a Bayesian training strategy to obtain enhancement data;
step S4, combining the hash index constructed in the step S2 with the enhancement data obtained in the step S3, constructing a local sequence recommendation frame based on multiple tasks on local equipment by a client, and carrying out distributed training on a local sequence model in cooperation with a central server until the local sequence model is converged;
step S5, the client acquires the parameters of the user embedding network and, combining them with the output of the converged local sequence model, obtains the preference prediction result for the next moment, completing personalized recommendation for the client.
2. A personalized federal technology based sequence recommendation method according to claim 1, wherein: the preprocessing in step S1 comprises cleaning abnormal and missing values in the interaction data, archiving the item attributes of items accessed by the client, and archiving the client's own user attributes; data structure alignment is performed on the item attributes and user attributes across all clients, and the aligned item attributes and user attributes are represented by a vector matrix.
3. A personalized federal technology based sequence recommendation method according to claim 1, wherein: said step S2 comprises the following sub-steps:
sub-step S2.1, the client converts the historical interaction data into a binary feature vector, downloads hash-related data from the central server, and constructs a group of hash function clusters on the local device from the data downloaded from the central server;
sub-step S2.2, the binary feature vector is combined with the hash function cluster to generate a hash index specific to the client, which is uploaded to the central server;
sub-step S2.3, the central server receives the hash index of each client to construct a hash storage table.
4. A personalized federal technology based sequence recommendation method according to claim 1, wherein: said step S3 comprises the following sub-steps:
sub-step S3.1, constructing a sub-sequence set to generate training data, wherein each piece of training data comprises a positive-sample and negative-sample sequence pair, and the training data is used for Bayesian model optimization;
sub-step S3.2, constructing an enhanced positive sample based on the positive sample in the training data;
sub-step S3.3, constructing an enhanced negative sample based on the negative sample in the training data;
sub-step S3.4, pairing the enhanced positive samples and enhanced negative samples into enhancement data for self-supervised learning of the Bayesian model.
5. The personalized federal technology based sequence recommendation method according to claim 4, wherein: in said sub-step S3.2, the enhanced positive sample is constructed from the correlation between items and the length of the positive-sample sequence, the degree of correlation between items being determined according to the rule for computing a triangle's area from its edge lengths.
6. The personalized federal technology based sequence recommendation method according to claim 4, wherein: in said sub-step S3.3, the enhanced negative sample is constructed according to the length of the negative-sample sequence.
7. A personalized federal technology based sequence recommendation method according to claim 1, wherein: said step S4 comprises the following sub-steps:
sub-step S4.1, constructing a sequence recommendation framework based on multi-task learning, wherein the framework is composed of a user-attribute embedding network, a local contrastive learning mechanism, an item embedding network and a general sequence encoder; after the general sequence encoder is selected as required, the local sequence model is formed;
sub-step S4.2, the client initializes and updates the local sequence model with initialization parameters received from the central server, trains the local sequence model locally using the training data and enhancement data, and uploads the trained local sequence model to the central server;
sub-step S4.3, the central server receives the local sequence models from all clients, obtains the similar users of each client by querying the global hash storage table, performs personalized aggregation of the local sequence models according to the query result so that each client corresponds to a specific aggregation model, and sends each aggregation model to its corresponding client;
sub-step S4.4, the client receives the aggregation model and continues updating it in the next round using the training data and enhancement data until the local sequence model converges.
8. A personalized federal technology based sequence recommendation method according to claim 1, wherein: said step S5 comprises the following sub-steps:
sub-step S5.1, extracting the parameters of the user-attribute embedding network as the feature representation of the user attributes, and performing an inner product operation between this feature representation and the output vector of the local sequence model;
sub-step S5.2, taking the inner product result as the recommendation prediction result to predict the user's preference at the next moment;
sub-step S5.3, the recommendation system maintains the global hash table online in real time.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310023696.9A CN116089715A (en) | 2023-01-09 | 2023-01-09 | Sequence recommendation method based on personalized federal technology |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116089715A true CN116089715A (en) | 2023-05-09 |
Family
ID=86213415
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310023696.9A Pending CN116089715A (en) | 2023-01-09 | 2023-01-09 | Sequence recommendation method based on personalized federal technology |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116089715A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116361561A (en) * | 2023-05-30 | 2023-06-30 | 安徽省模式识别信息技术有限公司 | Distributed cross-border service recommendation method and system based on variational reasoning |
CN117494191A (en) * | 2023-10-17 | 2024-02-02 | 南昌大学 | Point-of-interest micro-service system and method for information physical security |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Li et al. | Rank-geofm: A ranking based geographical factorization method for point of interest recommendation | |
CN106940801B (en) | A kind of deeply study recommender system and method for Wide Area Network | |
US20220391778A1 (en) | Online Federated Learning of Embeddings | |
CN113011587B (en) | Privacy protection model training method and system | |
CN116089715A (en) | Sequence recommendation method based on personalized federal technology | |
Li et al. | Exploiting explicit and implicit feedback for personalized ranking | |
Weston et al. | Nonlinear latent factorization by embedding multiple user interests | |
Liu et al. | Deep learning based recommendation: A survey | |
Lu et al. | Fedclip: Fast generalization and personalization for clip in federated learning | |
CN108446964B (en) | User recommendation method based on mobile traffic DPI data | |
Li et al. | Dynamic structure embedded online multiple-output regression for streaming data | |
CN113609398A (en) | Social recommendation method based on heterogeneous graph neural network | |
CN115631008B (en) | Commodity recommendation method, device, equipment and medium | |
Ko et al. | Mascot: A quantization framework for efficient matrix factorization in recommender systems | |
Song et al. | Coupled variational recurrent collaborative filtering | |
Xu et al. | Machine learning-driven apps recommendation for energy optimization in green communication and networking for connected and autonomous vehicles | |
Tang et al. | Accurately predicting quality of services in ioT via using self-attention representation and deep factorization machines | |
Yuan et al. | Optimizing factorization machines for top-n context-aware recommendations | |
CN115439770A (en) | Content recall method, device, equipment and storage medium | |
CN116432039B (en) | Collaborative training method and device, business prediction method and device | |
Ravi et al. | Hybrid user clustering-based travel planning system for personalized point of interest recommendation | |
Shan et al. | NASM: nonlinearly attentive similarity model for recommendation system via locally attentive embedding | |
Crankshaw et al. | Scalable training and serving of personalized models | |
Ye et al. | Robust clustered federated learning | |
Fushimi et al. | Accelerating Greedy K-Medoids Clustering Algorithm with Distance by Pivot Generation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||